Thursday, August 06, 2009

Kenyan Birther

Well, I thought it was funny....



Build your own: http://kenyanbirthcertificategenerator.com/

Tuesday, August 04, 2009

The ISBN Is Dead

There are few greater supporters of the ISBN standard than I (and most of us are named "Michael" so we are easily identified); however, I am increasingly concerned about the future health of the ISBN. In its current form the ISBN is not yet dead but therein lies the problem: 'in its current form.' In order to gain entry to the supply chain, most small and medium-sized publishers will continue to buy their ISBNs from agencies around the world as they have since the 1970s. (In contrast, most large publishers have reservoirs of ISBNs sufficient to last almost forever and only occasionally buy new prefixes to establish new imprints.)

Five years ago, I participated in the once-a-decade ISO ISBN revision process that resulted in the current ISBN standard. (Michael Healy ran this two-year process on behalf of ISO.) That revision included the expansion from 10 to 13 digits, but this was tame compared to the contentious issue of separate ISBNs for every eBook format. I supported this position (although I did not have a vote in the revision) and agreed with others who viewed assigning separate ISBNs as consistent with the way ISBNs had historically been assigned to other title formats. Despite the passage of time, this issue continues to generate significant comment and has become (to me) one of several indications that the ISBN in its current form may not be sufficient to support the migration to a digital world.
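For anyone who hasn't handled the numbers directly, the 10-to-13 digit expansion is mechanical: an existing ISBN-10 keeps its first nine digits, gains the "978" prefix, and gets a recalculated check digit using alternating weights of 1 and 3. Here is a minimal sketch in Python; the function names are my own for illustration and it only handles the "978" prefix that applies to converted ISBN-10s.

```python
def isbn13_check_digit(first12: str) -> str:
    """Return the ISBN-13 check digit for the first 12 digits."""
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert an ISBN-10 (hyphens or spaces allowed) to its 978-prefixed ISBN-13."""
    digits = isbn10.replace("-", "").replace(" ", "")
    if len(digits) != 10:
        raise ValueError("an ISBN-10 must have exactly 10 characters")
    # Drop the old check digit (which may be 'X') and add the 978 prefix.
    core = "978" + digits[:9]
    if not core.isdigit():
        raise ValueError("the first nine characters must be digits")
    return core + isbn13_check_digit(core)


print(isbn10_to_isbn13("0-306-40615-2"))  # 9780306406157
```

Note that nothing in the 13-digit number says anything about format; a hardcover, a paperback and each eBook format simply get different numbers, which is exactly why the separate-ISBN question was the contentious one.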

A second problem the ISBN faces is driven by some down-stream suppliers who don't see the ISBN as relevant. The most prominent (egregious - pick your label) of these has been Amazon - and this is not just because no Kindle title carries an ISBN. Amazon has long been disdainful of the ISBN and, almost from the opening of the bookstore, they assigned "ASINs" to books. In his defining Web 2.0 article, Tim O'Reilly used the example of Amazon's ASIN as an indicator of Amazon's application of the principles of Web 2.0. At the time (while I was at Bowker in 2005), I took a more sanguine view in an email:
Amazon’s ASIN creation was built out of expediency. If they received a title from a publisher that (for whatever reason) had no ISBN, they assigned a number just so they could get it in their system. (Don’t laugh, we get frantic calls from publishers who are at their printer and don’t have a number). At first they were designating these as “ISBN”s which we had them change. There was never an intention to take ISBN and make something better and different. So while I would agree on your point about extending the bibliographic content, in the case of ASINs Amazon were not looking to create additional value or take the identifier to some other more valuable place: they needed 10 digits to identify a SKU. Now they have polluted the supply chain with these numbers. No other vendor has seen a requirement to create their own SKUs; there has never been a need, because the ISBN has been the most effective product identifier ever established.
Hence, at Amazon, the lack of ISBNs on Kindle titles isn't really new; although it was a fairly rare occurrence (albeit from a very large player). Others now new to the supply chain (including suppliers of print-on-demand titles) have decided not to use ISBNs. Some of these suppliers are using the Google Book settlement titles as their 'inventory' and thus, by definition, this issue becomes a significant challenge to the ubiquity of the ISBN.

A third issue concerns the rapid influx of new titles as a result of digitization programs. At this point, it's unknown whether any of these titles will be subsequently broken down into parts (although this seems inevitable), but that further compounds the issue of how ISBNs - or other identifiers - will identify this content.

Some may argue that, as the supply chain compacts, the connection between producer and supplier becomes tighter and a specific item identifier isn't required. Maybe that's true; however, I believe it's far too early in the transition to digital content to make this judgment. Unfortunately, if we shrug our collective shoulders to these issues, this non-action will set a precedent from which we as a publishing industry will be unable to recover.

The ISBN standard united the industry from author royalty statement to store shelf and, while I emphasize the ISBN is far from dead, there are sufficient warning signs to suggest that the ISBN may be unable to thrive in the 21st century as it has over the past 40 years. As a community, we need to recognize that the ISBN may not be meeting its intended market need and that the future may make this deficiency even more stark. From an international perspective, ISO could help by reconvening a partial (or full) revision of the standard; a 10-year revision cycle seems incompatible with the speed at which the whole industry is changing. In my view, ISBN could benefit from an accelerated revision cycle while the result of non-action could be increasing irrelevance.

Into this mix I would also add that ISBN can no longer stand generally independent of other identifiers, such as a work ID or party ID. For example, while assigning ISBNs to pre-1970 titles may make an ISBN agency's revenues bulge, it may not be the most effective proposal for the supply chain. A more appropriate approach may be a combination of work ID, party ID and ISBN and, for this, we require a cohesive methodology and possibly a 'merging' of these standards in a more formal way.



This commentary naturally leads into a discussion of the construction of bibliographic databases, which I hope to present in the future.

Sunday, August 02, 2009

MediaWeek (Vol 2, No 30): Amazon, Reed Elsevier, India, Bloomsbury

Nicholson Baker in The New Yorker takes a look at Amazon and the Kindle, taking into account how Amazon ranks as an electronics manufacturer (TNY):
Amazon, with its listmania lists and its sometimes inspired recommendations and its innumerable fascinating reviews, is very good at selling things. It isn’t so good, to date anyway, at making things. But, fortunately, if you want to read electronic books there’s another way to go. Here’s what you do. Buy an iPod Touch (it costs seventy dollars less than the Kindle 2, even after the Kindle’s price was recently cut), or buy an iPhone, and load the free “Kindle for iPod” application onto it. Then, when you wake up at 3 A.M. and you need big, sad, well-placed words to tumble slowly into the basin of your mind, and you don’t want to wake up the person who’s in bed with you, you can reach under the pillow and find Apple’s smooth machine and click it on. It’s completely silent. Hold it a few inches from your face, with the words enlarged and the screen’s brightness slider bar slid to its lowest setting, and read for ten or fifteen minutes. Each time you need to turn the page, just move your thumb over it, as if you were getting ready to deal a card; when you do, the page will slide out of the way, and a new one will appear. After a while, your thoughts will drift off to the unused siding where the old tall weeds are, and the string of curving words will toot a mournful toot and pull ahead. You will roll to a stop. A moment later, you’ll wake and discover that you’re still holding the machine but it has turned itself off. Slide it back under the pillow. Sleep.
Checking in on what's going on in Indian publishing (The Economic Times):
To most people, India is at the cusp of the publishing story and the action waiting to play out will be worth the wait. That belief is not without reason. The country has a large literate population and the reading habit is often inculcated early in life. Besides, the opportunity to write and translate books across languages is an opportunity that any marketer will give his right hand for. The key is to deliver a quality product at any time. “I’d like to believe that there will always be an audience and a market for truly original works of literature regardless of commercial fluctuations,” says a rather emphatic Altaf Tyrewala, author of the critically acclaimed No God in Sight. India has never had a paucity of quality writers and that is the best piece (of news) for the industry. Now how these creative artists come together with publishers will form the next round of the story.
Publishers Weekly announce the sale of themselves (PW):
Reed Business Information is putting Publishers Weekly and its affiliated publications, Library Journal and School Library Journal, up for sale. The sale of the group is part of RBI’s strategy to divest most of its trade magazines in the U.S. Last year, Reed Elsevier, parent company of RBI, tried to sell all of RBI but dropped the sale when it couldn’t get the price it wanted in a depressed market for media properties. In a related announcement, Tad Smith, CEO of RBI US, has resigned. John Poulin has been named acting CEO and he will head the sales process.
Also Reed Elsevier announced a rights issue to stem the debt. May raise $1 billion against $4 billion of debt. (The Bookseller)

Having seen one of the authors on The Daily Show I was intrigued about this book. Subsequently I see there is some controversy between the authors of this book and one on the same subject published a number of years ago. (NYT) Also Stewart interview.

On June 27 Ms. Bynum got a copy of the new book. The next day, in an e-mail message to academic friends and colleagues at universities across the country, she wrote: “I am appalled at the manner in which these authors have written what is touted as a scholarly work. I am also deeply hurt by the manner in which they have appropriated, then denigrated, my work.”

In a three-part review posted on the Renegade South blog, renegadesouth.wordpress.com, Ms. Bynum lit into the Doubleday book. She particularly objected to what she saw as the new book’s tendency to romanticize Mr. Knight and his love life, its insistence on the idea that Jones County actually seceded and its attempt to place Mr. Knight at the Battle of Vicksburg — touches that do not hurt the story’s cinematic potential.
Bloomsbury (UK) get some ink in the Evening Standard for their Bloomsbury Library Online which is 'powered' by Exact Editions (ES):

Bloomsbury currently offers several children's shelves, along with a "book club" shelf of titles designed to be read and discussed by groups.

"The service means you never have to worry about overdue books again," said Daryl Rayner of Exact Editions, the company behind the service.

"You can be on a beach in Greece, and simply log in using your library card to download new books. We have also set the system up on several terminals inside the library."

Although only Bloomsbury has signed up so far, Miss Rayner said her firm was in discussions with all of the major UK publishers. It has also signed up several other libraries across the UK.

Exact is planning to add online access to Wisden, the cricketing bible, to the service in the near future, along with access to the works of Shakespeare and other historical authors.

Bloomsbury executive director Richard Charkin said: "While never forgetting the importance of books themselves, libraries are being pressured to adapt to the demands of the 21st century."

Graphic of Amazon's acquisitions over the years. (Link)

Saturday, August 01, 2009

Bob Stein on the Future of Reading

In advance of the Melbourne Writers festival later this month where he is scheduled to speak, Bob Stein offers a perspective on the future of authorship and reading in The Age:

Traditionally, authors have made a commitment to engage with a subject matter on behalf of future readers, with whom they would have no particular contact. In the new paradigm, I think, an author's commitment will be to engage with readers in the context of a subject matter.

Essentially, authors are about to learn what musicians have grasped during the past 10 years - that they get paid to show up. For musicians, this means live performances account for an increasingly significant percentage of their income in contrast to ever-shrinking royalties from sales. With books, as we redefine content to include the conversation that grows up around the text, the author will increasingly be expected to be part of that ongoing conversation and, of course, expect to be paid for that effort.

For their part, readers will see the experience of reading expand to include a range of behaviours, all situated firmly within a social context. To illustrate, here's a mother in London describing her 10-year-old boy's reading behaviour: "He'll be reading a (printed) book. He'll put the book down and go to the book's website. Then he'll check what other readers are writing in the forums, and maybe leave a message himself, then return to the book. He'll put the book down again and Google a query that's occurred to him."

Tuesday, July 28, 2009

Google Book Settlement Video and Discussion

Harvard's Berkman Center for Internet and Society hosted a presentation and discussion about the proposed Google Book Settlement which included Alex Macgillivray and Dan Clancy from Google (both are introduced at the start of this video):



The video is over an hour long but in listening to it I took the following notes. If something is not clear, best watch the video. (Also, don't take my notes as gospel; watch the video).

Google's Alexander Macgillivray on the Google Book Search Settlement

AM: Google Book Search: Why did we do it? “To make books easier to find”
First lesson learned about book search: full text search is really powerful and harnessing this is really powerful.

Three places to go to build a full text database:
1. Born digital books
2. Books less new but owned by publishers: can find them
3. Not currently held by publishers or rights unclear and public domain books. Where rights were unknown recognized these are still useful and wanted to include them as full text searchable and also enable someone looking for them to know where to go to get them

AM referred to the following as “Books 1.0”:

Deals with libraries to scan books and index them.

10+ million books scanned
1.5mm in public domain
1.5mm in the Partner Program from 25K partners, and 40 libraries across a number of countries

Continued to scan ‘at pace’ and didn’t stop in the face of the lawsuits that came in 2005:
2 US Lawsuits:
Broad Class action: Authors Guild
Narrower: Publishers
1 French
1 German – subsequently withdrawn when it looked like they would lose

Conversation in settlement: Only time happened to me at Google where the other side was thinking bigger than we were. Started thinking about doing things with the class that would provide enormous benefit. “actually increasing access to the information” saw an opportunity that “once you found the book you could actually read it”. Wasn’t a lot of disagreement around the room. Also, how do we preserve the place of the library in this environment?

Opens up access in various ways:

1. Consumer Access:
Ability to get free full text search results, and to find the book for sale or in a library. Also, if it is out of print (essentially all out of print books), you can get 20% of the content to sample and determine if this is the book you are looking for. Shows which books are useful to you and expands options to access: Amazon, Alibris, etc. Buy online access to the book: lasts forever, no ‘1984 Amazon’ problem, and it sits forever on your bookshelf. Priced by the publisher or rightsholder. If none exists the price is set by an algorithm. (Simulates a market, pricing the book at what it would be if there was a market).

DC On pricing for Books: An algorithm has been built to determine the best/appropriate price for books where price not set by a rightsholder. Initial distribution is as follows but real experience will change these prices.

80% of prices are $15 or less
50% of prices are $5.99 or less

“Really think the prices will go down”

2. Institutional Access: Subscription based and pricing governed by the agreement which states pricing should offer a “fair return and broad access”

Comment: to users of an institutional license this may be akin to ‘free like water’ for all the users (or those who have access) to the institutional license.

Another comment on the institutional license:
For the entirety of the subscription no book can be removed from the collection. No 1984 problem. Once you have subscribed to a set of books these can’t be removed for the entirety of the subscription. Next year there could be a different set of books which changes the composition of the license.

3. Public access model: Can go to a public library and will have access to the entire ‘subscription for free’. All out of print books available at any library that wants it. Google would like it so that you “never have to worry that the amount of money you have will determine access” – either in Academic or Public setting. So you may not have the money to go to Harvard but would be able to gain access to this material and the content of all the other libraries.

One terminal in every library (hope over time to be able to provide more access points for public libraries)

Obviously in addition all public domain titles will be available via the internet

AM Also notes the ability of the agreement to expand access to those with disability – especially those with print disabilities (the blind).

Professors can now select from a much wider universe/set of books: moving from a relatively small set of titles to a much more inclusive set


Orphan Works: - Notes blog posts.

Google has been fighting for Orphan works legislation for years that would allow for mass digitization projects (including but not exclusive to books)

Still think this effort is important for a number of reasons:
Settlement includes Orphans and non-Orphans
No clear cut definition as to what an Orphan is
Constant problem in Washington and disagreement: ever competing definitions within groups even within cohesive groups

“Works where the rightsholder is very very hard to find.”

May be copyright holder out there but the connection between (me) and the holder is hard or can’t be made

Clancy: Books have some advantage over other intellectual works because authors name, publishers name (other info), is printed in the book. Many of these books have publication information.

Not just books: images, physical objects, other things but even harder to find copyright holder.

More scholarly books from libraries: Professors at the university at the time of publication.

“Can find them – little hard but could if you tried. These are not really Orphans”
For many casual uses finding them for class use (or for permissions) is not too difficult. Noted the Authors Guild research asking their authors about finding copyright holders for permissions: success 90% of the time. (PND Note: I think % is higher than actual but not by much). These books aren’t really Orphans; it is just a little hard to find the rightsholder.

Challenges: Books less of an issue but still an issue for some percentage of the titles:
1. Lots of books that aren’t Orphans but still a bit of a pain to go ahead and find who the rightsholder is.
2. Because of statutory risks in copyright titles may be ‘practically dead in the marketplace’ because the economic value is small versus the costs of getting hold of the rightsholder and getting the title authorized. Has to do with rightsholder indemnifying the seeker of the rights against a future claim. Money rightsholder receives in this transaction is much smaller than his/her economic risk of error if they don’t in fact retain the rights to the work.

AM: Addressing the twin problems of Orphan works
1. making it easier to find rightsholder
2. makes these things (cultural items) themselves accessible

AM: Make really clear (w/r/t Orphan works legislation) inserted clause that Orphan works legislation will trump the settlement.

DC: Important point that all information is freely and publicly available as to the disposition of the copyright:
Who claims what book is public information
Can also ask “Tell me which books have not been claimed”

AM: “Fact that this information is public is really an important part of the agreement” – compares this openness with other rights distribution agencies which are closed and keep private which content is part of their collection.

“BRR is unable to be obscure about rightsholder information”

Question of ‘fair use doctrine’: isn’t this the end of fair use?

AM Currently have more fair use cases than anyone else.
Continue to be subject to lawsuits with respect to photos and foreign works. Google is never on the plaintiff side in fair use cases. Always on the defendant side. “Understand it may be convenient to say we are abandoning fair use but it’s bullshit”

DC: Going in to the agreement we felt we would win the lawsuit: “felt pretty good”. In the agreement it was important that we did not erode fair use. We don’t believe the agreement erodes fair use and continue to conduct ourselves (scanning images, unregistered works, opted out works). All still believe in fair use. “if we felt the agreement was undermining our belief in fair use we would be adjusting our actions with respect to some of the things we are doing” (images etc.)

AM: Just to be clear: “Google built its whole business on fair use and we are not backing down from this at all”

Question about where the money goes (specifically what happens to uncollected funds): “Not clear why anyone would have a claim on the collected but unclaimed money”.

AM: Two streams:
For consumer purchases the money is held for 5yrs. If unclaimed the BRR can use the (5th year) money to operate the BRR, if money left over then can use the remaining money to ‘top up’ the payments to rightsholders from the 63% to 70%, if there is any money remaining after that it is disbursed to charities.

for institutional: after 5yrs registry operating costs, any remaining left over is divided across the rightsholders in the institutional license
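To make sure I had the order of the consumer stream straight, here is how I would sketch it in Python. This is only my reading of what was said in the video, not the settlement text: the function name, the parameters and the idea that the top-up is capped at the extra 7% of the sales on which claimed rightsholders were paid are all my assumptions.

```python
def split_unclaimed_consumer_funds(unclaimed, brr_operating_costs, claimed_sales):
    """Split unclaimed consumer-purchase money held for five years.

    All amounts are whole dollars. `claimed_sales` is a hypothetical input:
    the sales revenue on which claiming rightsholders were paid at 63%.
    """
    # Step 1: the registry (BRR) covers its operating costs first.
    to_brr = min(unclaimed, brr_operating_costs)
    remaining = unclaimed - to_brr

    # Step 2: top up rightsholder payments from 63% toward 70% of sales,
    # i.e. at most an extra 7% of the claimed sales (my assumption).
    top_up_cap = claimed_sales * (70 - 63) // 100
    to_rightsholders = min(remaining, top_up_cap)

    # Step 3: whatever is still left goes to charities.
    to_charities = remaining - to_rightsholders
    return {
        "brr": to_brr,
        "rightsholder_top_up": to_rightsholders,
        "charities": to_charities,
    }


print(split_unclaimed_consumer_funds(1000, 300, 5000))
# {'brr': 300, 'rightsholder_top_up': 350, 'charities': 350}
```

The made-up figures are only there to show the order in which the money flows; the institutional stream is simpler, with anything left after registry costs divided among the rightsholders in the license.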

Heard people say the money shouldn’t be divided this way because BRR etc have no right; however, there is no consistency on where the money should go. Different groups have different ideas as to where/how the dollars are divided. The way the settlement distributes it is similar to other rights organizations; however, the settlement also says that if there is Orphan works legislation this will trump the settlement. “You can easily get a resolution to the extent you can get all the other constituents to agree” on where the money could go.

Question about the research corpus: Largest collection of ‘parallel corpora’ with respect to translation. Who’s got access to it?

DC: Right now in the current world Google has access to the entire database. Because of the current copyright we can’t open it up to everyone to come in and do what they want. Secondly, each library only has access to their collection. Each partner has a subset. Google has the whole thing.

Creation of a research corpus for non-consumptive research allowing for computational research on the entire corpus. Word usage, Machine translation, OCR, New search technologies over large texts like books

Participating and fully cooperating libraries get to create up to 2 of these research corpora. Google is putting up $5mm to set up these research corpora.

Up to the libraries to use these research projects: has to be non-consumptive research. Libraries have the responsibility but can sponsor anyone they want. They have responsibility to secure the corpus. Can sponsor any university or person they want.

31 partners and most are expected to come on: could be another 50 or 100. Any of the libraries can sponsor others.

Michigan is the only one doing anything: Something with Hathi Trust
Looking to build one corpus on public domain stuff and working with them on this. Google want them to get going because once get it going ‘they will discover things’ which will make the research opportunities more tangible.

AM Absent the settlement this doesn’t happen. Once settlement approved we get to provide all the content

Question about Competition: Specifically most favored nation clause. A suggestion this removes any incentive for a competitor to enter this market because they can never ‘beat’ Google:

AM Stated the clause without the limitations:
Only for first 10 yrs. Deal is long and the first mover is taking on a lot of up front risk. First mover deal for the length of copyright of the last book in the database by definition is a long time. Scanning and the $125mm in the settlement are additional points.

Second limitation: Only to the extent that a deal with a third party impacts a significant number of unclaimed (other than registered rights holders) works (slightly bigger than Orphans) will the clause be relevant.

AM: This clause is regarded by anti-trust as a ‘good thing’: Very easy for a second entrant with the blueprint (via BRR) of a deal already done. Wanted to ensure that for the first 10 yrs that Google could compete with any entrant be they Amazon, MS, or other. Anti-trust views this as a good thing because it encourages the type of innovation we have with this settlement.

Monday, July 27, 2009

Boston University Discuss Open Access

In the University's alumni magazine this quarter, Boston University discuss their recently launched open access research repository under a title "Research Wants to be Free":
While most published scholarly work is copyrighted and distributed by subscription-based journals, an open-access system allows an article or data to be shared as widely and easily as possible with both the public and potential collaborators who might build on one another’s work. The movement began a few years ago among university librarians unsettled by ever-rising subscription costs and emboldened by the promise of the Internet. It quickly spread to university faculty and has since spawned a burgeoning library of open-access journals and institutional repositories. In February, Boston University moved to the forefront of the movement when the faculty unanimously voted to establish the nation’s first university-wide open-access archive.

The archive will be a free, searchable Web catalogue of BU scholarship ranging from neuroscience research to folk dance videos. Faculty who opt to use the archive can submit a journal article, a dissertation, or any other piece of scholarship, and material that is submitted will be made available to anyone for noncommercial use.

Pearson Reports Interim Results: Better than Anticipated

Pearson reported their interim financial results this morning. From the press release:
* Strong profit growth: Adjusted operating profit up 25%* and adjusted EPS up 41% in headline terms.
* Good competitive performances: FT Group and Penguin performing well in challenging markets and trading in line with expectations; Education trading ahead and gaining share.
* Healthy outlook: Strong positions in growth markets combined with accelerating digital and services businesses underpin confidence for 2009 and beyond.
* Dividend growth sustained: Interim dividend raised 3.4% to 12.2p.
* Trading ahead of expectations: Stronger business performance offsets negative currency impact, providing an effective upgrade of 3p to adjusted EPS guidance for 2009. So, full-year adjusted EPS still expected to be at or above the 2008 level of 57.7p per share.

Marjorie Scardino, chief executive, said: "The transformation we've been pursuing for a decade - from 'publishing' company to content, technology and services company - is paying off. Over the past six years, Pearson has delivered substantial growth; this year is about proving our resilience and competitive edge. So far, we've passed the test. Market conditions are tough and may stay that way; but we are confident that we will perform well this year and next."
Some other points from the release:

Education appears to be outpacing the company's expectations with underlying performance in the period better than expected and a traditionally better second half of the year yet to come. (To some degree this performance could be expected given the 'promises' made to investors with respect to the integration of their recent acquisitions and the performance gains from cost efficiencies and selling).

Penguin, whose performance the company described as 'in-line with expectations', lost significant margin during the period versus last year. The company expects the second half to be stronger but has also put in place some 'organizational changes'. From the release some other information about Penguin's Digital Innovation:
  • Significant expansion of eBook publishing and sales. In the US and UK, Penguin has almost 10,000 eBooks available to date and expects to have almost 14,000 by year end including eSpecials and Enriched eBook Classics.
  • In the US, Penguin launched an online network with three channels featuring nine series of book-related programming for adults, young adults and children. Titled "From the Publishers Office", the site aims to build on Penguin's 2.0 initiatives to engage new audiences and to enhance the dialogue between authors and readers.
  • In the UK, Penguin and Puffin launch We Make Stories, a unique set of digital tools for children to create, print and share a variety of innovative story forms including pop-up books, customised audio books, comics and interactive treasure maps. The site is designed to encourage literacy, creativity and storytelling skills and is Penguin's first move into providing services. We launched iPhone applications for the Top 10 DK Eyewitness travel guides retailing at £4.99.
  • Penguin China is the first major international publisher to sell English books directly under its own brand on Taobao the leading direct-to-consumer online auction site in China.