Paris: As the World Wide Web Consortium hammers out specifications for recoding the world's databases so that natural-language queries can be intelligently answered online, Sony Corp. says it has found a better way.
Sony Computer Science Laboratory is positioning its "emergent semantics" as a self-organizing alternative to the W3C's Semantic Web that does not require any recoding of the data currently available online. Based on successful experiments with communities of robots, emergent-semantic technology is built on the principles of human learning, representatives of the Sony lab said at an open house here last month.
Much as these communities of "agents" extract meaning (semantics) from the character of their interactions, emergent semantics extracts the meaning of Web documents from the manner in which people use them, the researchers said. Building on the emergent-semantics principles behind its robots, for which patents were recently filed, the Sony scheme harnesses the human communication and social interaction among peer-to-peer file sharers, database searchers and content creators to append the semantic dimension to the Web automatically, instead of depending on the owner of each piece of data to tag it.
The latter methodology forms the basis of the W3C's Semantic Web. Conceived by Sir Tim Berners-Lee, the inventor of the World Wide Web, the Semantic Web uses Extensible Markup Language (XML) to assign "meaning" to elements of Web pages. A dedicated team at the World Wide Web Consortium (www.w3.org) is dutifully spinning out specs for database coding. At its open house, Sony argued that this approach resembles attempting artificial intelligence by writing if-then statements about everything in the world, the bane of traditional AI.
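The tagging model Sony is arguing against can be illustrated with a toy sketch. This is not the W3C's actual RDF syntax; the URI and the "dc:"/"mo:" property prefixes are merely illustrative stand-ins for the subject-predicate-object statements that each data owner would have to attach up front:

```python
# Toy sketch of the Semantic Web's explicit-tagging model: every publisher
# attaches machine-readable subject-predicate-object statements to each
# piece of data before any engine can reason over it. URIs and property
# names here are illustrative, not real vocabulary definitions.
triples = [
    ("http://example.org/track/42", "dc:title",   "Blue in Green"),
    ("http://example.org/track/42", "dc:creator", "Miles Davis"),
    ("http://example.org/track/42", "mo:genre",   "jazz"),
]

def query(predicate, obj):
    # A query engine can answer structured questions, but only over data
    # whose owners took the trouble to tag it.
    return [s for (s, p, o) in triples if p == predicate and o == obj]

print(query("mo:genre", "jazz"))
```

Sony's objection, as reported below, is precisely that the bulk of existing Web data carries no such statements and never will.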
"Our emergent-semantics technology is an alternative to the Semantic Web," said Luc Steels, director of Sony Computer Science Laboratory (CSL). Also at the open house, the lab showed off its latest research on the origins and evolution of language, as well as advances in computational neuroscience.
A previous research project at Sony CSL called Talking Heads, in which Steels played a principal role in 1999, became the foundation for the development of emergent semantics. In the Talking Heads project, Steels and his team demonstrated how agents could self-organize a shared lexicon as a side effect of their interactions. The experiment examined how agents might establish relations between a real-world object and a segmented image, followed by relating the segmented image to its conceptualization.
Further, the project studied how a conceptualization can be related to an utterance and how this can result in the self-organization of lexical and ontological constructs that explain meaning and relationships.
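The lexicon self-organization at the heart of Talking Heads can be sketched with a drastically simplified "naming game" in the spirit of Steels' work (this is a hypothetical minimal variant, not the actual experiment, which also involved perception and conceptualization): agents repeatedly pair up, a speaker names an object, and the hearer adopts the speaker's word on failure. No global coordination exists, yet the population converges on a shared word per object:

```python
import random

OBJECTS = ["red_ball", "blue_box"]

class Agent:
    def __init__(self):
        self.lexicon = {}  # object -> this agent's preferred word

    def word_for(self, obj):
        if obj not in self.lexicon:
            # invent a random word when the agent has none yet
            self.lexicon[obj] = "w%04d" % random.randrange(10000)
        return self.lexicon[obj]

def play_round(speaker, hearer, obj):
    word = speaker.word_for(obj)
    if hearer.lexicon.get(obj) == word:
        return True              # communication succeeded
    hearer.lexicon[obj] = word   # alignment: hearer adopts speaker's word
    return False

random.seed(0)
agents = [Agent() for _ in range(10)]
for _ in range(5000):
    speaker, hearer = random.sample(agents, 2)
    play_round(speaker, hearer, random.choice(OBJECTS))

# After many interactions, the population shares one word per object.
for obj in OBJECTS:
    print(obj, {a.lexicon[obj] for a in agents})
```

The shared lexicon is exactly the "side effect of their interactions" the article describes: no agent ever sees the whole population, yet a common vocabulary emerges.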
After Talking Heads, Steels' team began developing emergent semantics with an eye to solving interoperability problems in sharing data among peer-to-peer networks.
Emergent semantics will directly compete against the Semantic Web, which requires database vendors to give well-defined meanings to their information, thereby enabling a common framework for sharing and reusing data across application, enterprise and community boundaries. By comparison, Sony's mechanism harnesses the communication already ongoing between software agents that self-organize a shared lexicon and metadata descriptors, rather than depending on the data's owner to tag it.
"The Web has enormous amounts of information, and yet computers today can't communicate without conforming to specified fixed descriptors," said Peter Hanappe, associate researcher at Sony CSL. "The world has so far tried unsuccessfully to impose a top-down approach, such as the Semantic Web."
Hanappe said that in this model, only new data can easily get new descriptors attached to it. But there is already a vast amount of data online, he pointed out, and no guarantee that even new databases will adhere to W3C's Semantic Web specifications.
"We need to deal with legacy systems too," said Hanappe. "It's very hard to agree on how to describe certain things as it is, and what needs to be described continues to evolve."
The semantic interoperability problem is a big stumbling block, according to Sony, even for today's consumers using peer-to-peer file sharing of music, pictures or movies. Individuals, each speaking their own language and following personal styles of organizing and categorizing content, already have difficulty finding content they want to share or exchange. "Users should be able to keep the autonomy of their own conceptual organization," said Hanappe, "rather than imposing a fixed ontology and taxonomy on each item of content and each individual."
In emergent semantics, a user's agent bootstraps the description and categorization of content, such as the classification of music into genres. Through interactions among agents trading "favorite" songs, genres emerge that are common to sets of users. Emergent semantics such as self-organized genres are automatically tagged onto the content as an extra layer of information, rather than depending on people to do the tagging, Hanappe said.
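How a shared genre might emerge from trading favorites can be sketched as follows. This is a speculative simplification, not Sony's patented mechanism: each user keeps a private label per song, a trade carries the sender's label along with the song, and the recipient drifts toward the majority label it has observed. The community's labels accumulate as an extra metadata layer without anyone tagging content deliberately:

```python
import random
from collections import Counter

SONGS = ["song_a", "song_b", "song_c"]

class UserAgent:
    def __init__(self, labels):
        self.labels = dict(labels)                    # song -> private genre label
        self.observed = {s: Counter() for s in SONGS}  # labels seen in trades

    def trade(self, other, song):
        # sender shares the song together with its own label...
        label = self.labels[song]
        other.observed[song][label] += 1
        # ...and the recipient adopts the majority label it has seen so far
        other.labels[song] = other.observed[song].most_common(1)[0][0]

random.seed(1)
# three users start with idiosyncratic labels for the same songs
users = [
    UserAgent({"song_a": "techno",     "song_b": "jazz",  "song_c": "ambient"}),
    UserAgent({"song_a": "electronic", "song_b": "jazz",  "song_c": "chill"}),
    UserAgent({"song_a": "techno",     "song_b": "bebop", "song_c": "ambient"}),
]
for _ in range(300):
    sender, receiver = random.sample(users, 2)
    sender.trade(receiver, random.choice(SONGS))

# the emergent tag per song is whatever the community converged on
emergent = {s: Counter(u.labels[s] for u in users).most_common(1)[0][0]
            for s in SONGS}
print(emergent)
```

No label is imposed from outside; every tag in the final layer originated as some user's private categorization, which is the autonomy Hanappe emphasizes.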
Sony CSL filed patents in Europe for emergent semantics last month, according to Steels, who claimed that the technology building blocks were ready for integration. "The algorithms and mechanisms necessary for theoretical models on the interaction of agents have already been mastered and are well-understood," he said. "It's just a matter of putting this thing to work."
Beyond file sharing
A separate research project described at the open house, called Malleable Mobile Music, was an attempt to demonstrate group interaction principles using off-the-shelf mobile handsets, a sensor subsystem and GPS. Using a newly developed music recomposition engine sitting at a server, Atau Tanaka, researcher at Sony CSL Paris, developed a system that transforms music from a fixed entertainment medium into what the company terms a "social remix." The music engine is designed to reconcile data from the listener group and remix it into a networked music stream that a group of friends can share. The system allows structural reorganization of the music from the high level of song "form" to the lowest level of "rhythm and melody" variation.
All data is adjusted in accordance with the location of each listener captured by GPS augmented by the way the user responds to the music, as captured by a sensor subsystem attached to the listener's handheld device. "This is a peer-to-peer experience that goes beyond file sharing," said Tanaka.
He said the new system can create a sense of common purpose, even in an environment of anonymous peer-to-peer file sharing. Tanaka plans to put the system through a user study this winter among university students on several Paris campuses.
Several computational-neuroscience projects demonstrated at the open house covered advances in adaptation, generalization, continuous learning from experience and conceptualization. After investigating the processes used by the brain and its neural substrate, researchers built several prototypes based on how impulses from spiking neurons enable rapid decision-making and learning. The ultimate goal is real-time adaptive systems that can continuously learn from experience.
Separately, the lab has developed a brain-inspired software-hardware hybrid system for adaptive motor control, which forms a complete sensory-motor loop from perception to action. The team claims the system mimics the response of neurons in the cerebellum by computing their impulses, combining a neural hardware chip with software models.