Search is radically changing to become more contextual, relevant and focused on producing the right answer for the user. The shift from a web of documents to a web of data is well under way. The impact can already be seen on search engine results pages across many product categories, which foreshadows the future of travel search.
Moderating the “Impact of Structured Data on Travel Search” panel discussion at the Open Travel Alliance 2010 Advisory Forum in Seattle last week provided an opportunity to highlight some of the dramatic steps forward in the area of travel search. Unfortunately, the process also uncovered some unintentionally ugly results as the industry and the major search engines struggle to cope with the geometrically expanding range of unstructured data.
The topic was exceptionally timely given the breaking news regarding Google’s rumored acquisition of ITA Software and Facebook’s launch of its Open Graph protocol (which leverages RDFa data sets). Even Apple Computer’s leaked iTravel patent filing incorporates search functionality for flights, hotels, rental cars, cruises, trains and buses. These three applications create new paradigms for travel search – all relying on access to the “Deep Web” – the source information that serves as the foundation for today’s web pages.
NOTE: The presentation navigation controls are in the lower right corner of the frame, including a full screen option. Please note that video is embedded, so set your volume control accordingly. Simply click the large right arrow to advance.
Impact of Structured Data on Travel Search – Open Travel Alliance 2010 Advisory Forum
If the presentation is not embedded above, try using this link to view it on the Prezi website: Impact of Structured Data on Travel Search on Prezi.com
For those Flash-challenged souls (staunch devotees of Steve Jobs, or purchasers of his iPhone or iPad products), here is a link to a .pdf of the presentation: Impact of Structured Data on Travel Search.
Be forewarned that the .pdf is very large – 20MB – and it only provides screenshots of the embedded videos. I will refrain from providing commentary on the irony of using a formerly proprietary Adobe file format (PDF) to view content that is forbidden to be viewed on Apple devices due to use of a proprietary Adobe file format (SWF)…
Finally, for those who are link-challenged, bandwidth-constrained or memory-deprived (plus perhaps extremely zealous Adobe haters), the following is a synopsis of the key points from the presentation:
Before starting, it is important to recognize the contribution of panelists Stephane Donze, vice president of technology at Exalead, and Dan Pritchett, chief platform architect at Rearden Commerce, who both provided exceptional insights during the panel discussion portion of the session. Their perspectives and commentary reinforced the growing importance of structured and linked data within travel and its central role in advancing the quality and personalization of search experiences.
The two organizations approach the challenge of sourcing and implementing structured data very differently. Exalead provides a SaaS platform that enables its customers to rapidly and intelligently integrate structured and unstructured data from a diverse range of sources using advanced semantic search technologies. This Exalead video shows how organizing unstructured data and pairing it with structured data can provide Richer Content, Better Search for Online Services.
Rearden, on the other hand, creates a highly curated proprietary environment containing purely structured data to provide travelers with a high quality end-to-end travel itinerary. This Rearden Commerce video shows how structured data can simplify planning and enhance a travel itinerary; in this case, the example is their Mobile Personal Assistant on iPhone.
It all starts with the evolution of the Internet from Web 1.0 to Web 2.0 to Web 3.0. Structured Data (also referred to as Linked Data) enables the development of the highly personalized, semantic and intelligent web experience by providing access to the “Deep Web”.
A brief technical interlude to summarize the building blocks of Structured Data:
- URIs identify things
- HTTP is used to look up (dereference) those URIs
- Metadata provides structured descriptions
- Links point to other related URIs
RDF – Resource Description Framework:
- Uses Subject | Property | Object triples (sometimes called triplets) to define things
- Triples describe things, capture knowledge & define relationships
- Uses metadata to define semantic web relationships
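To make the Subject | Property | Object idea concrete, here is a minimal sketch in plain Python. It is not a real RDF library; the `example.org` URIs, the `ex:` property names and the hotel itself are made up purely for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One Subject | Property | Object statement."""
    subject: str
    prop: str
    obj: str


# Hypothetical URIs: they identify things, they are not real pages.
HOTEL = "http://example.org/hotel/waterfront"
CITY = "http://example.org/city/seattle"

# A tiny "graph" is just a list of statements.
triples = [
    Triple(HOTEL, "rdf:type", "ex:Hotel"),
    Triple(HOTEL, "ex:name", "Example Waterfront Hotel"),
    Triple(HOTEL, "ex:locatedIn", CITY),  # a link to another URI
    Triple(CITY, "rdf:type", "ex:City"),
    Triple(CITY, "ex:name", "Seattle"),
]


def describe(subject, graph):
    """Collect every property/object pair recorded for one subject."""
    return {t.prop: t.obj for t in graph if t.subject == subject}


def follow(subject, prop, graph):
    """Follow a link from one resource to the resources it points at."""
    return [t.obj for t in graph if t.subject == subject and t.prop == prop]


# Hop from the hotel to its city, then read the city's name.
hotel_city = follow(HOTEL, "ex:locatedIn", triples)[0]
city_name = describe(hotel_city, triples)["ex:name"]
print(city_name)
```

The point of the sketch is the last two lines: because the hotel's location is a link to another identified thing rather than a text string, a program can hop from one description to the next.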
Other core technologies:
- RDFS (RDF Schema) Describes properties & classes of RDF resources
- SPARQL (Query language for RDF) Enables queries across diverse data sources
- OWL (Web Ontology Language) Describes characteristics & relationships of RDF properties & classes
- RDFa (RDF in attributes) Uses XHTML attribute extensions to embed metadata in web documents
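As a rough illustration of the RDFa idea, the sketch below pulls statements out of a small piece of markup using only Python's standard `html.parser` module. It handles just the `about`, `typeof`, `property` and `content` attributes and is nowhere near a complete RDFa processor; the markup, the URI and the `ex:` names are invented for the example.

```python
from html.parser import HTMLParser

# Hypothetical hotel markup with RDFa-style attributes.
SNIPPET = """
<div about="http://example.org/hotel/1" typeof="ex:Hotel">
  <span property="ex:name">Example Waterfront Hotel</span>
  <span property="ex:city">Seattle</span>
  <meta property="ex:starRating" content="4" />
</div>
"""


class TinyRDFaParser(HTMLParser):
    """Collects (subject, property, object) tuples from a small RDFa subset."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self.subject = None  # set by the most recent `about` attribute
        self.pending = None  # property still waiting for its text value

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "about" in a:
            self.subject = a["about"]
            if "typeof" in a:
                self.triples.append((self.subject, "rdf:type", a["typeof"]))
        if "property" in a:
            if "content" in a:
                # The value is carried in the attribute itself.
                self.triples.append((self.subject, a["property"], a["content"]))
            else:
                # The value is the element's text, which arrives later.
                self.pending = a["property"]

    def handle_data(self, data):
        text = data.strip()
        if self.pending and text:
            self.triples.append((self.subject, self.pending, text))
            self.pending = None


parser = TinyRDFaParser()
parser.feed(SNIPPET)
parser.close()
for triple in parser.triples:
    print(triple)
```

A human visitor sees an ordinary hotel description; a crawler that understands the attributes sees four machine-readable statements about the same hotel.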
The goal is more relevant web search. For example, is a Jaguar a Car, a Cat or a Team? Are you certain? “It depends” is exactly the kind of answer that RDF and the semantic web strive to sort out.
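A toy version of the Jaguar problem can be sketched with a handful of statements and a simple pattern match, loosely in the spirit of a SPARQL query but written in plain Python. The `ex:` identifiers and type names are hypothetical, and real disambiguation relies on far richer context than a single type.

```python
# Three different things share the same label.
GRAPH = [
    ("ex:JaguarCars", "rdfs:label", "Jaguar"),
    ("ex:JaguarCars", "rdf:type", "ex:CarMaker"),
    ("ex:PantheraOnca", "rdfs:label", "Jaguar"),
    ("ex:PantheraOnca", "rdf:type", "ex:Animal"),
    ("ex:JacksonvilleJaguars", "rdfs:label", "Jaguar"),
    ("ex:JacksonvilleJaguars", "rdf:type", "ex:SportsTeam"),
]


def match(graph, s=None, p=None, o=None):
    """Return the statements that fit a pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def resolve(label, wanted_type=None):
    """Find things with this label, optionally narrowed by their type."""
    candidates = [s for s, _, _ in match(GRAPH, p="rdfs:label", o=label)]
    if wanted_type is None:
        return candidates
    return [s for s in candidates
            if match(GRAPH, s=s, p="rdf:type", o=wanted_type)]


print(resolve("Jaguar"))               # keyword alone: three candidates
print(resolve("Jaguar", "ex:Animal"))  # keyword plus context: one answer
```

The keyword alone leaves three equally valid answers; adding one piece of context about what kind of thing the searcher means narrows it to one.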
Complicating matters is the travel process itself, which is extremely complex, inter-related and multi-step.
Additionally, personalization requires context, and travel decisions are driven by multiple traveler personas. For example, an individual representing a defined demographic will use significantly different criteria to plan a business trip, a romantic getaway or a family vacation.
Relevance is gained by providing context and personalization. Understanding the relationships between linked data is an excellent starting point for providing relevant search results. The linked data that matters is often found in the Deep Web.
Three types of Deep Web search:
- Vertical Search – Queries databases that sit behind web sites
- Semantic Search – Understands searcher intent & contextual meaning
- Product Search – Product features, attributes & relationships
Two types of search queries:
- Exploratory – Information about a topic, references a source to locate data
- Factual – Computational knowledge, provides actionable information
Google Places & Bing Price Predictor both ultimately rely on structured data sourced from the Deep Web to provide more relevant search results. In the case of Google Places, the information may be manually entered or curated to provide the necessary structure. In the case of Bing Price Predictor, the fare information is sourced from ITA Software and processed by proprietary analytics to create predictions.
With the integration of universal search, shopping, local, checkout and AdWords links into the organic search results, the pages for some vertical markets are getting crowded. Because search results pages can be monetized through PPC advertising, Google keeps creating new opportunities to capture revenue by adding more advertising-based links.
An excellent example of a product search page with comparatively high monetization potential for Google is a search for “Coffee”:
The search engine results page not only features traditional organic search results and the ubiquitous text-based “Sponsored Listings” generated by AdWords, but also:
- Google Shopping results with photos listing the best price for several coffee-related products
- Several AdWords Image Ads, each with price links to three suppliers of the product
- A map providing links to both the supplier site and a Google Places page
- AdWords listings with Google Checkout – some incorporating instant couponing
Hotel search results are not yet quite so cluttered with advertising options, but they are headed in the same direction. Google has been experimenting with adding sponsored links featuring date-sensitive hotel pricing in Google Maps. That is the same platform that serves the hotel maps as universal results on search results pages. The same functionality can provide deep links to travel booking sites from search engine results pages. Those booking sites might be the supplier or an Online Travel Agency – it depends on who makes the highest bid and converts more bookings.
Google Places pages for hotels provide a mashup of a daunting variety of information. The Marriott Seattle Waterfront, used as an example, featured information extracted from 28 independent sources including:
- Property descriptions
- Maps & Streetviews
- Facilities & amenities listings
- Guest reviews
- Supplier and user generated photos & videos
- and, of course, AdWords ads
There are still challenges with aggregating data from so many sources. Some of the information provided is contradictory. The hotel can submit content, but has limited control over editing external content. For example, one prominent link on the Seattle property's page was for the Jakarta Marriott, while videos of the hotel from VFM Leonardo were not highlighted. Despite the anomalies, there is a considerable amount of information available, particularly a large number of consumer generated reviews. In many cases, the information on Google Places pages is richer than the information provided on the hotel’s own website.
The presentation concluded with a video providing an example of a highly personalized Wolfram|Alpha search gone horribly wrong. Full Disclosure: The source is CollegeHumor.com. It’s not real…
With search becoming more relevant and personal, the question of privacy certainly arises, but that is an extremely complex area that deserves another blog post altogether.
The obvious conclusion for the travel industry is that search is changing dramatically. The burning question is whether the travel industry will help shape that change or sit back and see where the technological advances carry it. One must hope it will be the former rather than the latter…
Structured data sourced from the Deep Web is essential to provide a solid foundation for linked data to define semantic relationships between pieces of information. This is what transitions the web of documents to a web of data.
The resulting relevant, convenient and highly personalized search experiences create trust. Whoever earns the trust of the consumer wins. The endgame for Web 3.0 players has never been clearer.