Personanondata: Time for A Publisher ID?

Tuesday, January 30, 2024

Time for A Publisher ID?

In the 1870s R.R. Bowker began publishing The American Catalog, which collected publishers' titles into one compendium. The first edition was surprisingly large, but its most useful aspect was that it organized publishers' books into a usable format. The concept was not sophisticated: the Bowker team gathered publisher catalogs, then bound and reprinted them so that they were more or less uniform. In subsequent years Books In Print came to have three primary components: the Subject Guide, the Author Guide and the Publisher Index (or PID). Each was a distinct part, but it was the PID which held everything together.

When a user found a title in the author or subject index, they would be referred to the PID for specific information about the publisher including (the obvious) how to order the book. At some point, Bowker began applying an alphanumeric “Bowker Id” to publisher names so that the database could be organized around the publisher information.

In the late 1970s and early 1980s, the ISBN was introduced to the US retail market, and Bowker was (and still is) the only agency able to assign ISBNs in the US. The ISBN syntax included a “publisher” prefix so that a block of numbers could be assigned specifically to one publisher. The idea, while good in concept, did not work well in practice. For example, in an effort to encourage adoption of ISBNs, the agencies assigned some large publishers a short two-digit publisher prefix, which resulted in a very large block of individual ISBNs (six title digits plus the check digit, or one million numbers). Even after 50 years, many of these blocks are only partially used, and the remainder is wasted, because the publishers' output was far less than anticipated. A second problem was that publishers, imprints and lists were bought and sold, which made a mess of the whole idea. (In the above image the prefix is 4 digits).
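The block-size arithmetic can be sketched in a few lines of Python. This is an illustration only, assuming the ten-digit ISBN layout (group identifier, publisher prefix, title number, check digit) and a one-digit group identifier such as the English-language group 0; the function names are invented for this example. Under those assumptions a two-digit prefix leaves six title digits, and a four-digit prefix leaves four.

```python
def isbn10_block_size(group_len: int, prefix_len: int) -> int:
    """Count the title numbers available under one publisher prefix.

    Assumes the ten-digit layout: group + prefix + title + one check digit.
    """
    title_len = 10 - 1 - group_len - prefix_len  # the final digit is the check digit
    if group_len < 1 or prefix_len < 1 or title_len < 1:
        raise ValueError("group, prefix and title parts each need at least one digit")
    return 10 ** title_len


def isbn10_check_digit(first_nine: str) -> str:
    """Compute the ISBN-10 check digit (weights 10 down to 2, modulo 11)."""
    if len(first_nine) != 9 or not first_nine.isdigit():
        raise ValueError("expected exactly nine digits")
    total = sum(weight * int(digit)
                for weight, digit in zip(range(10, 1, -1), first_nine))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


if __name__ == "__main__":
    # One-digit group identifier with a two-digit and a four-digit prefix.
    print(isbn10_block_size(group_len=1, prefix_len=2))
    print(isbn10_block_size(group_len=1, prefix_len=4))
    print(isbn10_check_digit("030640615"))
```

The shorter the prefix, the larger the block: every digit removed from the prefix multiplies the publisher's allocation by ten, which is why a handful of two-digit prefixes tied up so much of the number space.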

At Bowker, we recognized that our Publisher Information Database was a crown jewel and a key component of our Books In Print database. Despite many requests, we never licensed this data separately, and this was a significant reason why retailers such as Barnes & Noble, Borders, Follett and others licensed Books In Print. Because the information was so important, we spent a lot of time maintaining the accuracy and the structure of the data.

Publishers who acquired ISBNs from the Bowker agency were a key input to this database, beginning in the 1980s and continuing to the present. Not all new ISBNs go to small independent publishers, and there remains consistent demand from established publishers for new numbers even today. To be useful, this publisher information needs to be structured and organized accurately, which is only possible with continued application of good practice. During my time at Bowker, the editorial team met regularly with publishers both to improve the timeliness and accuracy of their book metadata and to confirm their corporate structure. We wanted to ensure that all individual ISBNs rolled up to the correct imprint, business unit and corporate owner. This effort was continuous, frequently detailed and time consuming, and sometimes engaged the corporation’s office of general counsel.
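The roll-up described above can be pictured as a chain of parent links. The Python sketch below is a hypothetical illustration, not a description of Bowker's database: the `Entity` class, the `roll_up` function, the prefix `0-1234` and every company name are invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entity:
    """One node in a publisher hierarchy (all names here are hypothetical)."""
    name: str
    level: str  # "imprint", "business unit" or "corporate owner"
    parent: Optional["Entity"] = None


def roll_up(entity: Entity) -> List[str]:
    """Walk parent links from an imprint up to its ultimate owner."""
    chain: List[str] = []
    seen = set()
    node: Optional[Entity] = entity
    while node is not None:
        if id(node) in seen:  # guard against a bad edit creating a loop
            raise ValueError("cycle in ownership data")
        seen.add(id(node))
        chain.append(node.name)
        node = node.parent
    return chain


# Invented example data: a prefix maps to an imprint, which rolls up further.
owner = Entity("Example Holdings", "corporate owner")
unit = Entity("Example Trade Group", "business unit", owner)
imprint = Entity("Example Press", "imprint", unit)
prefix_to_imprint = {"0-1234": imprint}

if __name__ == "__main__":
    print(roll_up(prefix_to_imprint["0-1234"]))

    # When an imprint is sold, only its parent link changes; the prefix stays.
    new_owner = Entity("Another Publishing Co", "corporate owner")
    imprint.parent = new_owner
    print(roll_up(prefix_to_imprint["0-1234"]))
```

The structure itself is trivial. The hard part is keeping each parent link correct as imprints and lists change hands, and that is exactly the information that had to be confirmed with the publishers themselves.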

A few years after I left Bowker, one of my consulting clients presented me with a proof of concept to programmatically create a publisher id database. In concept it looked possible; however, I pointed out all the reasons why it would be difficult to complete and then to maintain. They went ahead anyway, but after a year or so abandoned the work because they could neither accurately disambiguate publisher information nor confirm corporate reporting structures.

Today there is no industry-wide standard publisher id code, but the idea comes up frequently as one the industry should pursue. As with many new standards efforts, it is the rollout and adoption of the standard which will prove difficult. Using data which is already available, or available for license, could provide a promising initial leap forward.

Bowker, like all ISBN agencies worldwide, is required to publish all new publisher prefixes each year, and this information could also be a useful starting point. Bowker is not the only aggregator with publisher data (we were just the best by a significant margin), and another supply chain partner might be willing to contribute their publisher data as a starting point. This could establish a solid foundation to build on, but realistically any effort will fail if the maintenance burden is not understood and recognized, and if a strong market imperative isn’t widely agreed and supported.

When (I)SBN was launched in the UK in the late 1960s, it succeeded because the largest retailer (W.H. Smith) enforced the strong business case for its adoption. Globally, ISBN has gone on to become one of the most successful supply chain initiatives in (retail) history, and the entire industry is dependent on this standard. (It has even survived Amazon’s cynical ASIN). If there is a business case for the publisher id, it needs to be powerful, obvious and universally beneficial. Mutual interest and money can be powerful motivators, but having a policeman like W.H. Smith will help as well.

The ISBN is Dead

ChatGPT "thoughts" on the history of identifiers.

Note: I ran R.R. Bowker for a while and was also Chairman of ISBN International.
