Friday, June 18, 2010

Bibliographers Shall Inherit...Data Monopolies - Repost

I recently heard Fred Wilson speak and it reminded me of this post from February 5th, 2007:

Fred Wilson is a founder of Union Square Ventures, a venture capital firm located in NYC. He was also part of Flatiron Partners until he left to start Union Square. He was the keynote speaker at Monday's SIIA Previews meeting and spoke about content; specifically, that content "wants to be free."

He ended the session with a potentially more interesting theme relating to tagging and content description. In answer to a question about the potential power of social networks and the attendant tagging possibilities, he suggested that we shouldn't have to tag information at all; that is, content should be adequately described for us. The questioner asserted that publishers are good at describing their content. Wilson disagreed, confirming (to me) that publishers are definitely not good at tagging or classifying their content. His comments reinforce my belief that intermediaries that insert descriptors, subject classifications and other metadata to improve relevance and discovery will play an increasingly important role. Personally, I do not think the battle has yet been joined that will determine a single provider of standardized metadata within specific product or content categories. (Some players have clear positioning; take, for example, Snap-on Tools' purchase of ProQuest's Business Solutions unit, which opens many intriguing opportunities, if you like car parts.)

You may think that books are effectively categorized by Amazon and that Amazon is therefore the standard. This is untrue: in fact there are several bibliographic book databases, and none of them are compatible across the industry. Additionally, while Amazon allows great access to its data, it is not a good cataloguer of bibliographic information; its effort is only enough to serve its own purposes. As a seeker of books and book (e)content, I want to be able to search on a variety of data elements (publisher, format, subject, author) and find what I am looking for regardless of the tool I am using. In my view, a single source of quality bibliographic information, distributed at the element level, will solve this problem. Suppliers of content are beginning to understand that the description of the content (metadata) is as important as the content itself.
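The idea of element-level bibliographic data can be illustrated with a small sketch. The record fields and sample titles below are hypothetical; a real catalog would draw on a shared industry standard such as ONIX, but the point is the same: once records are broken into standardized elements, any tool can search on any combination of them.

```python
from dataclasses import dataclass

@dataclass
class BookRecord:
    """One standardized bibliographic record, broken into elements."""
    isbn: str
    title: str
    author: str
    publisher: str
    format: str   # e.g. "hardcover", "paperback", "ebook"
    subject: str

# A tiny, hypothetical catalog; real data would come from a shared source.
catalog = [
    BookRecord("978-0-00-000000-1", "Field Guide A", "Smith", "Acme Press", "ebook", "nature"),
    BookRecord("978-0-00-000000-2", "Field Guide B", "Jones", "Acme Press", "paperback", "nature"),
    BookRecord("978-0-00-000000-3", "City Atlas", "Smith", "Metro Books", "ebook", "travel"),
]

def search(records, **criteria):
    """Match records on any combination of elements (publisher, format, ...)."""
    return [r for r in records
            if all(getattr(r, field) == value for field, value in criteria.items())]

ebooks_by_smith = search(catalog, author="Smith", format="ebook")
print([r.title for r in ebooks_by_smith])  # ['Field Guide A', 'City Atlas']
```

Any vendor querying the same element-level records would get the same answer, which is exactly what today's incompatible databases prevent.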

It is really quite simple: a database provider needs to spend time standardizing its deep bibliographic content, distribute it to anyone who wants it, and then figure out how to make money doing so. Historically, a vendor had to create its own product catalog because either one didn't exist or the vendor preferred to build it in-house. Look at office products or mattresses: it is nearly impossible to compare items across vendors. Books and other media products are slightly easier, but the legacy of multiple databases continues to reduce efficiency. Management of a product database/catalog should never be a competitive advantage unless it is your business.

Fred Wilson asked: if information wants to be free, then where is the value in information? Unsurprisingly, it is in attention. To quote, "there is a scarcity of attention and narrowing users' data 'experience' to mitigate irrelevance is the future." Furthermore, the 'leverage points' in the attention-driven information model are Discovery, Navigation, Trust (ratings around content; PageRank is a good example), Governance, Values and Metadata (data about the data). The likes of Google, Yahoo and Microsoft have the first couple of these items well in hand, but they will all increasingly need good metadata describing the content they are serving up. This is where aggregators/intermediaries step in, whether the subject is tools, TV programs and movies, advertising or books.

He has provided a link on his web site to the presentation from this meeting.
