With Google's support, Microformats could transform the web

The web is broken in lots of ways, one of which is that its content is mostly unstructured. Most web pages have a title and some content, and that is about all you can assume. Researchers and standards bodies have been trying to impose more semantic order on the web since its earliest days, by adding metadata so that web crawlers can parse the content accurately rather than relying on inference. These efforts have had little success outside academia, and the semantic content of typical pages has arguably got worse rather than better, thanks to the trend towards richer pages built with JavaScript and Flash rather than simple text marked up with HTML.

Microformats are a softer approach to adding metadata, using standard HTML but with conventions that identify certain types of content, such as name and address details (hCard), events (hCalendar), or even CVs (hResume). If you have web pages that include content of this type, you can usually mark it up according to the microformat specification without damaging the appearance of the page. The benefit is greater searchability; in fact, you could think of it as a kind of search engine optimization.
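To make the idea concrete, here is a rough sketch of how a crawler might pull hCard details out of ordinary HTML, using only Python's standard library. The class names (vcard, fn, org, adr, street-address, locality, tel) are taken from the hCard convention; the business, the address, the `HCardParser` class and the `extract_hcards` helper are invented for illustration, and the code assumes tidy, properly nested markup. A real parser would need to handle far more of the specification.

```python
from html.parser import HTMLParser

# Subset of hCard property class names handled by this sketch.
HCARD_PROPS = {"fn", "org", "street-address", "locality", "tel"}
# Elements with no closing tag; they are ignored when tracking nesting.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}

# Hypothetical page fragment: normal HTML, plus hCard class names.
SAMPLE = """
<div class="vcard">
  <span class="fn org">Example Books Ltd</span>
  <div class="adr">
    <span class="street-address">1 High Street</span>,
    <span class="locality">London</span>
  </div>
  <span class="tel">020 7946 0000</span>
</div>
"""


class HCardParser(HTMLParser):
    """Collect text from elements carrying hCard class names inside a vcard."""

    def __init__(self):
        super().__init__()
        self.cards = []        # one dict per completed vcard
        self._stack = []       # class lists of the currently open elements
        self._card = None      # card being filled, or None outside a vcard
        self._card_depth = 0   # stack depth at which the current vcard opened

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        self._stack.append(classes)
        if self._card is None and "vcard" in classes:
            self._card = {}
            self._card_depth = len(self._stack)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._stack:
            return
        # Closing the element that opened the vcard completes the card.
        if self._card is not None and len(self._stack) == self._card_depth:
            self.cards.append({k: " ".join(v) for k, v in self._card.items()})
            self._card = None
        self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if self._card is None or not text:
            return
        # Credit the text to every enclosing element that names a property.
        for classes in self._stack[self._card_depth - 1:]:
            for name in classes:
                if name in HCARD_PROPS:
                    self._card.setdefault(name, []).append(text)


def extract_hcards(html):
    """Return a list of dicts, one per vcard found in the HTML string."""
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return parser.cards


print(extract_hcards(SAMPLE)[0]["fn"])  # Example Books Ltd
```

The point is that the markup is still plain HTML that renders as before; the extra class names are the only thing the crawler needs in order to know which text is the name and which is the phone number.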

A good example is hReview, a draft specification for reviews of anything from products to events or places. If I'm considering a purchase, I often type in a search for "product x review", and then find myself sifting through lots of useless results, because all the ecommerce and affiliate sites know we do that kind of search, and pretend to have reviews when really they do not. If all the sites used hReview I could do more precise searches, perhaps specifying "only reviews in the last 12 months", and sorting by rating, and the search engine could structure the results nicely so I can see at a glance the author's name and an extract from the review itself.
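As a small sketch of why this matters, suppose a search engine has already extracted hReview data into simple records. The field names below (item, reviewer, dtreviewed, rating, summary) follow the hReview property names; the `Review` class, the `recent_reviews_by_rating` function and the sample reviews are made up for illustration, and this is not a description of how any real search engine works.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Review:
    """One extracted review; fields mirror hReview property names."""
    item: str
    reviewer: str
    dtreviewed: date
    rating: float
    summary: str


def recent_reviews_by_rating(reviews, today, max_age_days=365):
    """Keep reviews no older than max_age_days, highest rating first."""
    cutoff = today - timedelta(days=max_age_days)
    recent = [r for r in reviews if r.dtreviewed >= cutoff]
    return sorted(recent, key=lambda r: r.rating, reverse=True)


# Invented sample data.
SAMPLE_REVIEWS = [
    Review("Product X", "Ann", date(2009, 3, 1), 3.5, "Decent for the price"),
    Review("Product X", "Bob", date(2007, 1, 10), 5.0, "Glowing, but old"),
    Review("Product X", "Cat", date(2008, 11, 20), 4.5, "Very good"),
]

for r in recent_reviews_by_rating(SAMPLE_REVIEWS, today=date(2009, 5, 15)):
    print(f"{r.rating} {r.reviewer}: {r.summary}")
```

With unstructured pages none of this is possible, because the search engine has to guess which text is a review at all, let alone its date or rating.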

Simple standards like this can have a huge impact. The whole blogging revolution was driven by RSS, which itself is a kind of microformat for news.

The snag is, they have to be widely adopted to be useful. I first wrote about microformats in 2006, but despite my enthusiasm they have had little impact to date. That may be about to change. On Tuesday May 12th the mighty Google announced a new feature called Rich Snippets, which means it will be exposing microformat metadata in its search results, along with another more generic type of metadata called RDFa.

It all sounded familiar to me, as the previous weekend I attended Yahoo's Hack Day in London and heard about its SearchMonkey project, which also uses microformats and RDFa. Yahoo was there first, but it is Google that has the power to shake up the web.

Should you care about microformats? If Google is serious, Rich Snippets will have a wide impact. Business names and locations should be marked up with hCard, for example. Anyone designing an ecommerce site with user reviews should be looking at hReview. Although the list of microformats Google takes note of is small at the moment, that list will inevitably grow if the feature catches on.

The message that Google is giving an advantage to sites that use microformats and RDFa will be heard loud and clear by the SEO community, and given the commercial importance of effective search ranking, adoption could grow quickly.

Then again, I may be as over-optimistic now as I was in 2006 (and yes, I do think it is a good idea). Watch this space.

1 Comment

Google and XFN / FOAF

http://code.google.com/apis/socialgraph/

"Based on open standards
We currently index the public Web for XHTML Friends Network (XFN), Friend of a Friend (FOAF) markup and other publicly declared connections. By supporting open Web standards for describing connections between people, web sites can add to the social infrastructure of the web."
