Expert System Brings Semantic Smarts to Advertising
Expert System is perhaps a little-known “semantic intelligence” company, but it has 15 years of experience in the tech industry and 115+ employees, and it is bringing in a very solid $12 million a year in revenues from over 100 corporate and government clients (at 40% growth over the past two years). The Italian company’s core technology is the Cogito platform, a sophisticated system that searches, extracts and classifies unstructured information and turns it into structured data. Cogito (which means “I think” in Latin) is bringing semantic technologies to the mainstream commercial world, including online advertising.
We spoke to Brooke Aker, CEO of the US subsidiary of Expert System, to find out more about the underlying technology of Cogito and its commercial applications. In particular we talked about how Expert System is using semantic technologies to power a new type of advertising.
The basic premise of Cogito is that it transforms unstructured information into structured data. Out of this process comes a “semantic network”, which is much the same thing as what rival company Cognition called a “semantic map” in our September ‘08 interview with them. It’s important to point out that Cogito isn’t necessarily a ‘Semantic Web’ application, but it does use things like natural language processing and other semantic analysis. It can output RDF, but that isn’t a fundamental part of Cogito.
Brooke Aker described the system to us as being like an “electronic dictionary”. There are 350,000 words and 2.8 million “relationships” in Cogito. Cognition claimed 10 million “semantic connections” in its map back in September, but Aker suggested that it wasn’t quite an apples-to-apples comparison. According to Aker, Cogito’s semantic network is “richer” than Cognition’s.

Expert System’s new semantic advertising solution, Cogito Semantic Advertiser (CSA), came about because the company saw that traditional contextual keyword advertising is resulting in a lot of inaccuracies and mistakes when matching ads to page content. The classic example is the jaguar one, where a story about a jaguar (the animal) is accompanied by a ‘contextual’ advert for jaguar the car. Expert System told us that this kind of scenario won’t occur with their semantic advertising system.

Expert System claims to have come up with an advertising solution that understands “the real meaning” of words, based on theories of human comprehension. For example, their system analyzes the semantic meaning of words and their context. So in the jaguar example, Expert System would ‘understand’ that jaguar is an animal in the context of the story – and therefore it would not serve up ads about the jaguar car.
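To make the jaguar example concrete, here is a minimal sketch of context-based disambiguation. It is not how Cogito works internally (the article does not say); it is a toy overlap heuristic, and the sense inventory, cue words and function names are all invented for illustration.

```python
# Toy word-sense disambiguation by context overlap.
# The cue-word lists are hypothetical, not taken from Cogito.
SENSES = {
    "jaguar": {
        "animal": {"cat", "jungle", "prey", "habitat", "species", "wild", "rainforest"},
        "car": {"engine", "dealer", "sedan", "drive", "luxury", "model", "horsepower"},
    }
}

def disambiguate(word, text):
    """Return the sense whose cue words overlap most with the text, or None."""
    tokens = {t.strip(".,;:!?\"'()").lower() for t in text.split()}
    scores = {sense: len(cues & tokens) for sense, cues in SENSES[word].items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

def pick_ad(word, text, ads):
    """Serve an ad only when an ad exists for the sense used on the page."""
    return ads.get(disambiguate(word, text))

story = "The jaguar is a wild cat that stalks its prey in the rainforest."
ads = {"car": "Test drive the new sedan today"}

print(disambiguate("jaguar", story))  # the animal sense wins on this story
print(pick_ad("jaguar", story, ads))  # no car ad is served
```

A plain keyword matcher would see “jaguar” and serve the car ad; the sketch withholds it because the surrounding words point to the animal.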
One feature of Cogito Semantic Advertiser that stood out for us was the granular categorization, which allows for very fine grained targeting of ads. Brooke Aker told ReadWriteWeb that their product has already created around 2 million niches for advertisers to target, which is something that many Long Tail publishers will find appealing.

We’re impressed by the semantic software that Expert System is producing, not just with advertising but in other commercial and government applications. Let us know in the comments if you’ve come across this company before and if so, what your thoughts are.
Open Knowledge Sharing for the Dynamic Web
The EU-funded OpenKnowledge program is a smart toolkit designed to unlock the hidden resources of the web that can’t be accessed by web sites and browsers alone. With a small, downloadable piece of Java code, users can coordinate and share information with each other more directly than through traditional means. To highlight the potential of the OpenKnowledge system, researchers have put it to work in three different areas: healthcare services, emergency management, and proteomics research.
1) OpenKnowledge Healthcare
The first demonstration of the OpenKnowledge system aims to help those seeking health-related information on the web. Instead of solely relying on a doctor to prescribe a course of treatment, people today tend to seek out medical information on their own using the web. Unfortunately, that data is often inaccurate and misleading. What OpenKnowledge intends to do is provide patients with structured information that has been checked for accuracy. To test this system, OpenKnowledge is working with Cancer Research UK on a project related to treatment methods.
2) Emergency Response
When there’s an emergency situation, there is often a centralized point that disseminates critical information to people in need. But if that system itself breaks down, people are out of luck. OpenKnowledge aims to decentralize those systems so that a “backup” decentralized network of peers could be put into place. There, people could help each other out when the centralized system failed. This is currently being tested with emergency response authorities in Trentino, Italy.
3) Proteomics Research
Proteomics research (the study of the structure and function of proteins) can also benefit from the OpenKnowledge framework. In this area of science, many researchers worldwide rely on a small number of databases, creating a bottleneck of sorts which stresses the infrastructure of the databases themselves as well as those that maintain them. Researchers also find it hard to share data and results directly with other groups. In addition, the quality of the information in those databases is very mixed.
OpenKnowledge aims to solve all three problems by letting the researchers share data with each other directly, peer-to-peer style. This relieves the burden on the databases while the feedback will continually improve the quality of the data shared. This is currently being tested in an existing proteomics network in Spain called ProteoRed.

So…What Is It Exactly?
Understanding how a system like this works is difficult, and the OpenKnowledge web site doesn’t make it any easier. Despite the cute, Harry Potter-themed slideshow meant to describe the process, the actual details are hard to grasp. The site was obviously written by brainy researchers, who can’t even call the slideshow a “slideshow,” instead referring to it as a “simple pictorial introduction.”
Still, if you can wade through the academic speech on the site, what you may find is a creative idea for sharing information. Basically, through open source downloadable code, OpenKnowledge sets up a peer-to-peer network where users can trade in information and data similar to how BitTorrent users trade mp3s and video files.
In the OpenKnowledge system, anyone can easily become a peer or even create their own peer by sharing existing code or writing their own. In order to become an OpenKnowledge user, you simply need to download the OpenKnowledge kernel from here together with some additional components that you might want to use. In addition to users, services, such as WSDL services, can also be made into peers on the OpenKnowledge network.
OpenKnowledge is more of a framework for decentralizing the systems on the web. It’s not so much a consumer-friendly web app as it is a model for information sharing that can help advance areas of science and research. You may not ever use OpenKnowledge yourself on your home computer, but your life may very well be impacted one day by the innovations it made possible.
The Most Innovative Companies in Web 2.0
- Google: With Google Docs and the new Chrome browser, the search titan continues to take aim at Microsoft.
- Facebook: The social network that everyone loves to hate (but secretly loves anyway) overtook MySpace with 150 million active users.
- Digg: More than 35 million visitors per month submit 20,000 stories a day to the site. A recent $29 million financing round will keep the crowd thumbing through the downturn.
- Twitter: The microblogging phenom is adding up to 10,000 users a day, with the likes of The New York Times, JetBlue, and Lance Armstrong Tweeting.
- Meebo: More than 40 million users instant-message and chat across the Web through any format including the iPhone. A new partnership with Universal Music will connect chatters with artists, while a $25 million financing round in 2008 will fuel ‘09 expansion in Asia.
- FriendFeed: Relatively new to the life-streaming melee, FriendFeed aggregates every comment, posted picture, and status update from 59 Web services.
- Ning: More than 700,000 custom social networks have popped up on Ning, the platform that lets people make their own facebooks. The site now attracts 2,500 to 3,000 new networks a day.
- MySpace: Make fun of the chaos, but MySpace is the only social network that has made serious progress with targeted marketing. MyAds generates a reported $140,000 to $180,000 in daily revenue, making it a $50 million business in theory.
- Yelp: With 4 million user-submitted reviews of everything from corner cafés to dog groomers, Yelp can make or break local businesses nationwide.
- Kayak: Already operating in 11 markets including India and China, the travel search engine is aggressively expanding in Europe. The profitable site boosted revenue by 240% last year.
The iPhone Becomes a Web Server
When those Apple advertisements tout “there’s an app for just about anything,” they aren’t kidding. The latest example? A new iPhone application which just debuted in Japan’s App Store transforms the handheld into a full-blown web server. Called “ServersMan@iPhone”, the application allows your iPhone to appear just like any other web server on the internet.
The new application was developed by a Japanese operation called FreeBit, a Tokyo-based venture company known for providing its network platform to many VNO/ISPs (virtual network operator/Internet service providers).
Once the app is installed, PCs on the internet can access the iPhone to upload or download files through a browser or they can use the webDAV protocol. If the PC and the iPhone are on the same network, the PC can connect directly. If they are on separate networks, then FreeBit’s VPN software will engage the connection.
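For readers who have not used WebDAV: uploads are ordinary HTTP PUT requests and downloads are ordinary GETs, which is why a browser or any HTTP client can talk to such a server. The sketch below only builds a request and does not send it. The host name, port and file name are placeholders, not ServersMan's real addresses, which the article does not give.

```python
import urllib.request

def build_upload_request(base_url, remote_name, data):
    """Build (but do not send) an HTTP PUT request, the method WebDAV uses for uploads."""
    url = base_url.rstrip("/") + "/" + remote_name
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )

# "iphone.example:8080" is a hypothetical address used purely for illustration.
req = build_upload_request("http://iphone.example:8080/", "notes.txt", b"hello")
print(req.get_method(), req.full_url)

# To actually send it to a reachable WebDAV server:
# urllib.request.urlopen(req)
```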
The name “ServersMan” is said to be inspired by Sony’s “Walkman,” and it’s no coincidence that FreeBit has invited Sony’s former CEO Nobuyuki Idei to be a business advisor for the company.
At the moment, ServersMan@iPhone is only available in the Japanese App Store, but an English version is coming in March. A port for Windows Mobile devices is also under development.
Beyond Latitude: 4 Innovative Location-Based Apps

Google’s new geo-aware mobile application Latitude, which lets you share your location with friends, may have received all the hype, but that doesn’t necessarily mean it’s the best or the most innovative app out there. We’ve recently come across some smaller, lesser-known services that could give Google a run for its money – that is, if anyone knew they existed.
Google’s Latitude was not the first location-based service on the market by any means. Here at RWW, we’ve been fans of other mobile social networking applications like Loopt and Brightkite as well as location-aware Twitter clients like Twinkle, among others. So of course when we ran across some other smaller location-based services, we had to take a look. Each of the services listed below is doing something innovative that goes beyond Google’s current offering. We just wish more people knew about them.
Bliin
Bliin is a Dutch service that combines the location sharing of Google Latitude with the journey recording of Nokia’s viNe application (our coverage). With Bliin, you can explore the world around you, zooming in and out on the map while viewing geo-tagged photos, the last recorded locations of other Bliin users, and even the live movements of anyone sharing their journeys on the network. So why isn’t Bliin catching on? Blogger Martin Bryant theorizes it could be because the network doesn’t integrate with your address book the way viNe or Latitude does. It’s also a much smaller European company without the financial clout of either Google or Nokia.

Toaí
Toaí (Portuguese for “I’m Around”) is actually a new social network which, at first, appears to be just like any other: you sign up, create a profile, etc. However, Toaí also asks you for your mobile phone number, which it uses to text you about music, art, and stores that are near where you are. The recommendations it sends are based on your and your friends’ favorites. It also lets you know when your Toaí friends are nearby. Toaí is a great example of how a social network can add in location-based features to take networking beyond the virtual world and into the “real” world. Unfortunately, there is no English-language version of Toaí yet.

Parallel Kingdom
Who cares what your friends are doing when there are dragons to slay, enemies to fight, and weapons to upgrade? The new RPG (role-playing game) Parallel Kingdom for Android and iPhone uses location-based networking to let you see other nearby players on a Google map. Also, when finding monsters to battle, you have to physically move to a new location to do so. The end result is a mashup of traditional gaming and geocaching. (U.S. Only)

Radar
Outside.in’s new iPhone application Radar (not to be confused with the photo-sharing iPhone app also called Radar) delivers location-based content including news, blog posts, and tweets to your iPhone. Once installed, the app will geo-locate you upon being launched to deliver the news within 1000 feet of where you are standing, from your neighborhood, or from the entire city. You can also load up Radar with profiles for various locales, including your home, office, and more. Shake the app to switch from the news listings to the map view. (U.S. only)

10 Semantic Apps to Watch
One of the highlights of October’s Web 2.0 Summit in San Francisco was the emergence of ‘Semantic Apps’ as a force. Note that we’re not necessarily talking about the Semantic Web, which is the W3C initiative led by Tim Berners-Lee that touts technologies like RDF, OWL and other standards for metadata. Semantic Apps may use those technologies, but not necessarily. This was a point made by the founder of one of the Semantic Apps listed below, Danny Hillis of Freebase (who is as much a tech legend as Berners-Lee). The purpose of this post is to highlight 10 Semantic Apps. We’re not touting this as a ‘Top 10’, because there is no way to rank these apps at this point – many are still non-public apps, e.g. in private beta. It reflects the nascent status of this sector, even though people like Hillis and Spivack have been working on their apps for years now.
What is a Semantic App?
Firstly let’s define “Semantic App”. A key element is that the apps below all try to determine the meaning of text and other data, and then create connections for users. Another of the founders mentioned below, Nova Spivack of Twine, noted at the Summit that data portability and connectibility are keys to these new semantic apps – i.e. using the Web as platform.
In September Alex Iskold wrote a great primer on this topic, called Top-Down: A New Approach to the Semantic Web. In that post, Alex Iskold explained that there are two main approaches to Semantic Apps:
1) Bottom up – involves embedding semantic annotations (metadata) right into the data.
2) Top down – relies on analyzing existing information; the ultimate top-down solution would be a fully blown natural language processor, which is able to understand text like people do.
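The difference between the two approaches can be sketched in a few lines. In the bottom-up case the publisher has already labeled the data, so the reader only has to collect the labels; in the top-down case the software has to guess the structure from raw text. The markup, the class names and the regular expression below are all invented for illustration, and the regex is only a crude stand-in for real natural language processing.

```python
import re
from html.parser import HTMLParser

class AnnotationReader(HTMLParser):
    """Bottom-up: collect values the publisher already labeled via class attributes."""

    def __init__(self):
        super().__init__()
        self.found = {}
        self._label = None

    def handle_starttag(self, tag, attrs):
        self._label = dict(attrs).get("class")

    def handle_data(self, data):
        if self._label and data.strip():
            self.found[self._label] = data.strip()
            self._label = None

    def handle_endtag(self, tag):
        self._label = None

def bottom_up(html):
    reader = AnnotationReader()
    reader.feed(html)
    return reader.found

def top_down(text):
    """Top-down: infer the same fields from unlabeled text with a crude pattern."""
    match = re.search(r"(?P<title>[A-Z][\w ]+?) by (?P<author>[A-Z]\w+ [A-Z]\w+)", text)
    return match.groupdict() if match else {}

annotated = '<div><span class="title">Moby Dick</span> by <span class="author">Herman Melville</span></div>'
plain = "Moby Dick by Herman Melville"

print(bottom_up(annotated))
print(top_down(plain))
```

Both calls recover the same title and author here, but the top-down pattern breaks as soon as the sentence is phrased differently, which is why that camp leans on much heavier language analysis.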
Now that we know what Semantic Apps are, let’s take a look at some of the current leading (or promising) products…
Freebase
Freebase aims to “open up the silos of data and the connections between them”, according to founder Danny Hillis at the Web 2.0 Summit. Freebase is a database that has all kinds of data in it and an API. Because it’s an open database, anyone can enter new data in Freebase. An example page in the Freebase db looks pretty similar to a Wikipedia page. When you enter new data, the app can make suggestions about content. The topics in Freebase are organized by type, and you can connect pages with links and semantic tagging. So in summary, Freebase is all about shared data and what you can do with it.
Powerset
Powerset (see our coverage here and here) is a natural language search engine. The system relies on semantic technologies that have only become available in the last few years. It can make “semantic connections”, which help build the semantic database. The idea is that meaning and knowledge get extracted automatically by Powerset. The product isn’t yet public, but it has been riding a wave of publicity throughout 2007.

Twine
Twine claims to be the first mainstream Semantic Web app, although it is still in private beta. See our in-depth review. Twine automatically learns about you and your interests as you populate it with content – a “Semantic Graph”. When you put in new data, Twine picks out and tags certain content with semantic tags – e.g. the name of a person. An important point is that Twine creates new semantic and rich data. But it’s not all user-generated. They’ve also done machine learning against Wikipedia to ‘learn’ about new concepts. And they will eventually tie into services like Freebase. At the Web 2.0 Summit, founder Nova Spivack compared Twine to Google, saying it is a “bottom-up, user generated crawl of the Web”.
AdaptiveBlue
AdaptiveBlue are makers of the Firefox plugin, BlueOrganizer. They also recently launched a new version of their SmartLinks product, which allows web site publishers to add semantically charged links to their site. SmartLinks are browser ‘in-page overlays’ (similar to popups) that add additional contextual information to certain types of links, including links to books, movies, music, stocks, and wine. AdaptiveBlue supports a large list of top web sites, automatically recognizing and augmenting links to those properties.
SmartLinks works by understanding specific types of information (in this case links) and wrapping them with additional data. SmartLinks takes unstructured information and turns it into structured information by understanding a basic item on the web and adding semantics to it.
[Disclosure: AdaptiveBlue founder and CEO Alex Iskold is a regular RWW writer]
Hakia
Hakia is one of the more promising Alt Search Engines around, with a focus on natural language processing methods to try and deliver ‘meaningful’ search results. Hakia attempts to analyze the concept of a search query, in particular by doing sentence analysis. Most other major search engines, including Google, analyze keywords. The company told us in a March interview that the future of search engines will go beyond keyword analysis – search engines will talk back to you and in effect become your search assistant. One point worth noting here is that, currently, Hakia has limited post-editing/human interaction for the editing of hakia Galleries, but the rest of the engine is 100% computer powered.
Hakia has two main technologies:
1) QDEX Infrastructure (which stands for Query Detection and Extraction) – this does the heavy lifting of analyzing search queries at a sentence level.
2) SemanticRank Algorithm – this is essentially the science they use, made up of ontological semantics that relate concepts to each other.
Talis
Talis is a 40-year-old UK software company which has created a semantic web application platform. They are a bit different from the other 9 companies profiled here, as Talis has released a platform and not a single product. The Talis platform is kind of a mix between Web 2.0 and the Semantic Web, in that it enables developers to create apps that allow for sharing, remixing and re-using data. Talis believes that Open Data is a crucial component of the Web, yet there is also a need to license data in order to ensure its openness. Talis has developed its own content license, called the Talis Community License, and recently they funded some legal work around the Open Data Commons License.
According to Dr Paul Miller, Technology Evangelist at Talis, the company’s platform emphasizes “the importance of context, role, intention and attention in meaningfully tracking behaviour across the web.” To find out more about Talis, check out their regular podcasts – the most recent one features Kaila Colbin (an occasional AltSearchEngines correspondent) and Branton Kenton-Dau of VortexDNA.
UPDATE: Marshall Kirkpatrick published an interview with Dr Miller the day after this post. Check it out here.
TrueKnowledge
Venture-funded UK semantic search engine TrueKnowledge unveiled a demo of its private beta earlier this month. It reminded Marshall Kirkpatrick of the still-unlaunched Powerset, but it’s also reminiscent of the very real Ask.com “smart answers”. TrueKnowledge combines natural language analysis, an internal knowledge base and external databases to offer immediate answers to various questions. Instead of just pointing you to web pages where the search engine believes it can find your answer, it will offer you an explicit answer and explain the reasoning path by which that answer was arrived at. There’s also an interesting looking API at the center of the product. “Direct answers to humans and machine questions” is the company’s tagline.
Founder William Tunstall-Pedoe said he’s been working on the software for the past 10 years, really putting time into it since coming into initial funding in early 2005.
TripIt
TripIt is an app that manages your travel planning. Emre Sokullu reviewed it when it presented at TechCrunch40 in September. With TripIt, you forward incoming bookings to plans@tripit.com and the system manages the rest. Their patent pending “itinerator” technology is a baby step toward the semantic web – it extracts useful information from these mails and makes a well structured and organized presentation of your travel plan. It pulls in information from Wikipedia for the places that you visit. It uses microformats – the iCal format, which is well integrated into GCalendar and other calendar software.
The company claimed at TC40 that “instead of dealing with 20 pages of planning, you just print out 3 pages and everything is done for you”. Their future plans include a recommendation engine which will tell you where to go and who to meet.
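The email-to-itinerary idea described above can be sketched as follows. This is not TripIt's “itinerator”, whose internals are not public; the email layout, field names and helper functions are assumptions made up for the example, and the calendar output is deliberately stripped down (a fully valid iCalendar file would also need properties such as PRODID, UID and DTSTAMP).

```python
import re
from datetime import datetime

# Hypothetical confirmation email; real booking mails vary widely by vendor.
EMAIL = """\
Your booking is confirmed.
Flight: UA 123
From: San Francisco (SFO)
To: New York (JFK)
Departs: 2008-03-14 08:30
"""

def extract_booking(text):
    """Pull 'Key: value' lines out of the email into a dict."""
    fields = dict(re.findall(r"^(\w+):\s*(.+)$", text, flags=re.MULTILINE))
    fields["Departs"] = datetime.strptime(fields["Departs"], "%Y-%m-%d %H:%M")
    return fields

def to_ical(booking):
    """Render one booking as a minimal iCalendar event."""
    start = booking["Departs"].strftime("%Y%m%dT%H%M%S")
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "BEGIN:VEVENT",
        f"SUMMARY:Flight {booking['Flight']} {booking['From']} to {booking['To']}",
        f"DTSTART:{start}",
        "END:VEVENT",
        "END:VCALENDAR",
    ]
    return "\r\n".join(lines)

booking = extract_booking(EMAIL)
print(to_ical(booking))
```

The hard part in practice is the extraction step, since every airline and hotel formats its confirmations differently; the calendar export is the easy half.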
ClearForest
ClearForest is one of the companies in the top-down camp. We profiled the product in December ‘06 and at that point ClearForest was applying its core natural language processing technology to facilitate next generation semantic applications. In April 2007 the company was acquired by Reuters. The company has both a Web Service and a Firefox extension that leverages an API to deliver the end-user application.
The Firefox extension is called Gnosis and it enables you to “identify the people, companies, organizations, geographies and products on the page you are viewing.” With one click from the menu, a webpage you view via Gnosis is filled with various types of annotations. For example, it recognizes Companies, Countries, Industry Terms, Organizations, People, Products and Technologies. Each word that Gnosis recognizes gets colored according to its category.

Also, ClearForest’s Semantic Web Service offers a SOAP interface for analyzing text, documents and web pages.
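As a rough picture of what category annotation looks like, the sketch below tags known terms in a sentence with their category. It uses a tiny hand-made lookup table rather than natural language processing, so it illustrates only the output, not ClearForest's method; the table and function name are invented for the example.

```python
import re

# Tiny hand-made lookup table; a real system would not rely on a fixed list.
GAZETTEER = {
    "Reuters": "Company",
    "ClearForest": "Company",
    "Italy": "Country",
    "Firefox": "Product",
}

def annotate(text):
    """Wrap each known term in a [Category: term] marker."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, GAZETTEER)) + r")\b")
    return pattern.sub(lambda m: f"[{GAZETTEER[m.group(1)]}: {m.group(1)}]", text)

print(annotate("Reuters acquired ClearForest, maker of a Firefox extension."))
```

A browser extension would color the matched words instead of bracketing them, but the underlying step is the same: find a span, assign it a category.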
Spock
Spock is a people search engine that got a lot of buzz when it launched. Alex Iskold went so far as to call it “one of the best vertical semantic search engines built so far.” According to Alex there are four things that make their approach special:
- The person-centric perspective of a query
- Rich set of attributes that characterize people (geography, birthday, occupation, etc.)
- Usage of tags as links or relationships between people
- Self-correcting mechanism via user feedback loop
As a vertical engine, Spock knows important attributes that people have: name, gender, age, occupation and location just to name a few. Perhaps the most interesting aspect of Spock is its usage of tags – all frequent phrases that Spock extracts via its crawler become tags; and also users can add tags. So Spock leverages a combination of automated tags and people power for tagging.
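The mix of automated and user-supplied tags can be illustrated with a short sketch. Spock's actual extraction rules are not described in this post, so the approach here (two-word phrases that recur across crawled pages become tags, and user tags are merged in) and all names in the code are assumptions.

```python
from collections import Counter

def frequent_phrases(documents, min_count=2):
    """Return two-word phrases that appear in at least min_count documents."""
    counts = Counter()
    for doc in documents:
        words = [w.strip(".,").lower() for w in doc.split()]
        bigrams = {" ".join(pair) for pair in zip(words, words[1:])}
        counts.update(bigrams)  # each phrase counted once per document
    return {phrase for phrase, n in counts.items() if n >= min_count}

def tags_for(documents, user_tags=()):
    """Combine automatically extracted phrases with tags added by users."""
    return frequent_phrases(documents) | set(user_tags)

# Hypothetical crawled snippets about one person.
pages = [
    "Jane Doe is a software engineer in Seattle.",
    "Software engineer Jane Doe spoke at the conference.",
]

print(sorted(tags_for(pages, user_tags=["cyclist"])))
```

Raw phrase counting also surfaces the person's own name as a tag; a real engine would need to filter that out and rely on the user feedback loop to correct bad tags.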
