
Wednesday, June 27, 2012

Making Your Metadata Better: AAUP Panel Presentation - M is the New P

(Video of this presentation here)

The last time I was asked to speak at an AAUP meeting was in Denver in 1999 and naturally the topic was metadata. As I told the audience last week in Chicago, I don’t know what I said at that meeting but I was never asked back!  I am fairly confident most of what I did say in Denver still has relevance today, and as I thought about what I was going to say this time, it was the length of time since my last presentation that prompted me to introduce the topic from an historical perspective.

When ISBN was established in the early 1970s, the disconnect between book metadata and the ISBN was embedded into business practice.  As a result, several businesses like Books In Print were successful because they aggregated publisher information, added some of their own expertise and married all this information with the ISBN identifier.  These businesses were never particularly efficient, but things only became problematic when three big interrelated market changes occurred.  First, the launch of Amazon.com caused book metadata to be viewed as a commodity; second, Amazon (and the internet generally) exposed a none-too-flattering view of our industry’s metadata; and lastly, the sheer explosion of data supporting the publishing business required many companies (including the company I was running at the time, RR Bowker) to radically change how they managed product metadata.

The ONIX standard initiative was the single most important program implemented to improve metadata, providing a common metadata framework for publishing companies.  As a standards implementation, ONIX has been very successful, but its advent has not changed the fact that metadata problems continue to reside with the data owners.
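To make “metadata framework” concrete, here is a minimal sketch, in Python, of the kind of product record ONIX structures. This is illustrative only, not a conformant ONIX 3.0 message: the element names follow ONIX 3.0 conventions and the code-list values (ISBN-13 = type 15, author = role A01) are quoted from memory, so check them against the published specification before relying on them; the reference, ISBN and names are dummy values.

import xml.etree.ElementTree as ET

# Build a skeletal ONIX-3.0-style product record for a single title.
# Element names and code values are illustrative and unverified.
message = ET.Element("ONIXMessage")
product = ET.SubElement(message, "Product")
ET.SubElement(product, "RecordReference").text = "com.example.0001"  # hypothetical reference

identifier = ET.SubElement(product, "ProductIdentifier")
ET.SubElement(identifier, "ProductIDType").text = "15"               # 15 = ISBN-13 (from memory)
ET.SubElement(identifier, "IDValue").text = "9780000000002"          # dummy ISBN

detail = ET.SubElement(product, "DescriptiveDetail")
title_detail = ET.SubElement(detail, "TitleDetail")
ET.SubElement(title_detail, "TitleType").text = "01"                 # 01 = distinctive title (from memory)
title_element = ET.SubElement(title_detail, "TitleElement")
ET.SubElement(title_element, "TitleText").text = "An Example Monograph"

contributor = ET.SubElement(detail, "Contributor")
ET.SubElement(contributor, "ContributorRole").text = "A01"           # A01 = author (from memory)
ET.SubElement(contributor, "PersonName").text = "Jane Example"

print(ET.tostring(message, encoding="unicode"))

A real ONIX feed carries far more than this (subjects, prices, supply details, descriptive text), which is exactly why keeping it accurate is an ongoing operational job rather than a one-time export.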

More recently, when Google launched their book project a number of years ago, it quickly became apparent that the metadata they aggregated and used was often atrocious, proving that little had changed since Amazon.com had launched ten years earlier.  When I listened to Brian O’Leary provide a preview of his BISG report on the Uses of Metadata at the Making Information Pay conference in May, I recognized that little progress had been made in the way publishers are managing metadata today.  When I pulled my presentation together for AAUP, I chose some slides from my 2010 BISG report on eBook metadata as well as some of Brian’s slides.  Despite the 2-3 year interval, the similarities are glaring.

Regrettably, the similarities are an old story, yet our market environment continues to evolve in ever more complex ways.  If simple metadata management is a challenge now, it will become more so as ‘metadata’ replaces ‘place’ in the four Ps marketing framework.   In traditional marketing, ‘place’ is associated with something physical: a shelf, distribution center, or store.  But ‘place’ is increasingly less a physical location; even when a good is only available ‘physically’, such as a car, a buyer may never actually see the item until it is delivered to their driveway.  The entire transaction from marketing, to research, to comparison shopping, to purchase is done online and is thus dependent on accurate and deep metadata.  “Metadata” is the new “Place” (M is the new P), and place is no longer physical.

This has profound implications for the managers of metadata.  As I wrote last year, having a corporate data strategy is increasingly vital to ensuring the viability of any company.  In a ‘non-physical’ world, the components of your metadata are also likely to change, and without a coherent strategy to accommodate this complexity your top line will underperform.   As if that weren’t enough, we are moving towards a unit-of-one retail environment where the product I buy is created just for me.

As I noted in the presentation last week, I work for a company where our entire focus is on creating a unique product specific to a professor’s requirements.  Today, I can go on the Nike shoe site and build my own running shoes, and each week there are many more similar examples.   All such applications require good, clean metadata.  How is yours?

As with Product and Place (metadata), the other two components of marketing’s four Ps are equally dependent on accurate metadata.  Promotion needs to direct a customer to the right product and give them relevant options when they get there.  Similarly, with Price, we now operate on a presumption of frequent change rather than in an environment where prices change infrequently.  Obviously, in this environment metadata must be beyond question, yet it rarely is.  As Brian O’Leary found in his study this year, things continue to be inconsistent, incorrect and incomplete in the world of metadata.  The opposites of these adjectives are, of course, the descriptors of good data management.

Regrettably, the metadata story is consistently the same year after year, yet there are companies that do consistently well with respect to metadata.  These companies assign specific staff and resources to the metadata effort, build strong internal processes to ensure that data is managed consistently across the organization, and proactively engage the users of their data in frequent reviews and discussions about how the data is being used and where the provider (publisher) can improve what they do.

The slides incorporated in this deck from both studies fit nicely together, and I have included some of Brian’s recommendations, of which I expect you will hear more over the coming months.  Thanks to Brian for providing these to me, and note that the full BISG report is available from their web site (here).

Wednesday, February 15, 2012

File Under "Bleedin' Obvious": Good Data Drives Sales

Nielsen BookData recently released a white paper/sales sheet on metadata enhancement which presents some real data on the direct link between deep, accurate metadata and increased sales and long-term revenue.  Unsurprisingly, the document finishes by noting that BookData provides enhanced metadata services for a fee, and suggesting that publishers who don't have the wherewithal to handle this very basic activity themselves would be well advised to contract with Nielsen (or someone similar).

It occurs to me that there's a somewhat circular logic to working with a third-party data-enhancement provider: if, as a publisher, I don't have the means to provide this deep information in the first place, how will I be able to know that the deep metadata services provided by a third party are accurate and optimal?  Nielsen will say "increased sales" and they'd be correct based on their own analysis, yet it is always going to be the author, editor and marketing person at the publisher who is best placed to define and optimize their metadata.  Contracting this function out is not only likely to be sub-optimal but might also result in staff whose experience becomes removed from the realities of market dynamics.  This is not to suggest that the third party will do a bad job, but that the benefits to the publisher of doing it themselves far outweigh those of outsourcing, in both the short term and the long term.

And what are the results of better metadata?  Nielsen's report is quite specific, based on this sample:
White Paper: The Link Between Metadata and Sales

Looking at the top selling 100,000 titles from 2011 we analysed the volume sales for titles where both the BIC Basic and image flags were missing, and compared these with titles where one of the flags was missing and titles where both the BIC Basic and image flags were present, indicating that the BIC Basic standard was met. Figure 1.1 shows the average sales per title for these four different sets of records.

The positive impact of supplying complete BIC Basic data and an image is clear. Records without complete BIC Basic data or an image sell on average 385 copies. Adding an image sees sales per ISBN increase to 1,416, a 268% boost. Records with complete BIC Basic data but no image have average sales of under 437 copies, but when we look at records with all of the necessary data and image requirements, average sales reach 2,205. This represents an increase of 473% in comparison to those records which have neither the complete BIC Basic data elements nor an image. Figure 1.2 shows a direct comparison between all records with insufficient data to meet the BIC Basic standard, and those that meet the requirements.

The average sales across all records with incomplete BIC Basic elements are 1,113 copies per title, with the complete records seeing a 98% increase in average sales.

Titles which hold all four enhanced metadata elements sell on average over 1,000 more copies than those that don’t hold any enhanced metadata, and almost 700 more copies than those that hold three out of the four enhanced metadata elements. In percentage terms, titles with three metadata elements see an average sales boost of 18%, and those with all four data elements 55% when compared to titles with no enhanced metadata elements.
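For readers who want to see where the quoted uplift figures come from, the percentages follow directly from the average sales per title in the excerpt. Here is a quick sketch of the arithmetic (the underlying title counts per group are not given, so this only reproduces the percentage calculations; the helper name is mine):

# Average sales per title for the record groups quoted in the Nielsen excerpt.
averages = {
    "neither BIC Basic nor image": 385,
    "image only": 1416,
    "BIC Basic only": 437,
    "BIC Basic and image": 2205,
}
all_incomplete = 1113  # average across all records failing the BIC Basic standard

def uplift(new, base):
    """Percentage increase of new over base, rounded to whole percent."""
    return round((new / base - 1) * 100)

base = averages["neither BIC Basic nor image"]
print(uplift(averages["image only"], base))                      # 268 (%)
print(uplift(averages["BIC Basic and image"], base))             # 473 (%)
print(uplift(averages["BIC Basic and image"], all_incomplete))   # 98 (%)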
In the still early days of Amazon, we were always throwing out the anecdotal data point that a book with a cover image was 8x more likely to sell than one without.  Sadly, we are still discussing much the same issue.

Tuesday, September 29, 2020

OCLC's Vision for the Next Generation of Metadata

From the OCLC report summary: 

Transitioning to the Next Generation of Metadata synthesizes six years (2015-2020) of OCLC Research Library Partners Metadata Managers Focus Group discussions and what they may foretell for the “next generation of metadata.”
The firm belief that metadata underlies all discovery regardless of format, now and in the future, permeates all Focus Group discussions. Yet metadata is changing. Innovations in librarianship are exerting pressure on metadata management practices to evolve as librarians are required to provide metadata for far more resources of various types and to collaborate on institutional or multi-institutional projects with fewer staff.
This report considers: Why is metadata changing? How is the creation process changing? How is the metadata itself changing? What impact will these changes have on future staffing requirements, and how can libraries prepare? This report proposes that transitioning to the next generation of metadata is an evolving process, intertwined with changing standards, infrastructures, and tools. Together, Focus Group members came to a common understanding of the challenges, shared possible approaches to address them, and inoculated these ideas into other communities that they interact with. 
Download pdf

Wednesday, September 14, 2011

Corporate Data Program: Where to Start?


The following post represents the third installment of my data strategy presentation.

There are likely to be many approaches to initiating a corporate data strategy but, for my money, I would start with product metadata. Theoretically, this is data you maintain the most control over (you will find out if you actually do during your initial review). In my opinion, product metadata should be considered the basic foundation upon which to build a corporate data strategy.  

On this foundation, a “data value” can be assigned to all the data a business produces. Product data might also be seen as the data asset from which all other corporate data flows. For example, as businesses begin to generate more user/transaction data, this information will be far more valuable if it is tied back to robust product metadata. Establishing your corporate data strategy around product metadata also has another advantage in that the company is always best placed to manage the information that describes its own products.

How companies describe and interrelate information about their products is increasingly viewed as the most proactive activity a company can undertake with supply-chain partners (or directly with consumers) in order to positively impact sales. The deeper, more descriptive and interconnected the metadata is, the better all products will perform in a sales environment that is increasingly congested. Unfortunately, only a small number of companies manage their data in a uniform manner and, typically, descriptive metadata continues to be poorly managed and locked in silos deep within organizations. Ironically, if the logic of a “corporation” as a collection of assets makes sense from a financial point of view, isn’t that logic undermined by the disaggregation of information about the collective assets of the organization?

For many companies, this describes their product data management ‘philosophy’. As such, their data management is more reflective of their own internal structures rather than a pragmatic understanding of the mechanics of their markets. Just as consumers seek products and services – often via search – in the broadest sense and not in accordance with artificial corporate hierarchies, the smart approach to product metadata management would be to centralize it at a broad or corporate level. This approach would facilitate the most effective integration of all products so that the best combination of product options can be presented to a customer.

In choosing product data as the first practical implementation for your data-strategy effort, your team will also benefit from an existing set of methodologies, policies and procedures. (This wouldn’t necessarily be the case if you were to choose customer data or web traffic data, for example.) In launching this first initiative, your internal communications will want to explain to the individual business units and the business as a whole what benefits they will realize as the project becomes operational. All participants in the metadata initiative will be striving for a ‘future state’ where each business and constituency will be able to spend more time analyzing and leveraging better and more complete data. Thus, the future state will be materially different than the “legacy environment,” where staff spend their time chasing and remediating data rather than benefiting from value-added tasks supporting their business units.

In my next post, I will spell out in more detail what benefits may accrue from this initiative but, overall, they include the application of scale economies to the management of data, the attribution of control mechanisms (such as thesauri) and a greater ability to merge and mingle metadata to improve revenues.

Tuesday, September 20, 2011

Strategically Managing Data for Long Term Benefit

In this series of articles (1,2,3), I have attempted to describe the organization and development of a corporate-wide approach to data management. In doing so, I have suggested that product metadata is a good place to initiate a phase-one approach to centralizing management of a business’s data. The underlying premise of this series is that data represents a core asset of any organization and, if effectively managed, can produce incremental benefits to the organization as a whole.

In my view, the efforts detailed in the prior articles will have a material impact on the business in three ways: (i) The application of scale economies to the management of data; (ii) the attribution of control mechanisms (such as thesauri) and (iii) a greater ability to merge and mingle metadata to improve revenues. Below, I suggest how some of these benefits might be actualized:
Scale:

Centralizing metadata management allows the organization to take advantage of scale economies in factor costs, technology and expertise. Not every business unit can afford to acquire state-of-the-art technology or the market’s best metadata expert, but these kinds of investments become much easier to justify if their benefits can be spread across the enterprise. The financial benefits of better data management can also be most appreciated and captured at the corporate level, thereby providing greater financial justification for the adoption of technology and staffing to support the data strategy.

Collective Dictionaries (“You say tomato, I say tomato”): Thesauri, ontologies and the attribution of consistent cataloging rules:

Business units don’t speak to one another nearly enough, and this is absolutely the case in the way they manage information about the products they sell. The manner and method one business unit may use to describe a product could be vastly different from that which a sister unit applies to a similar or compatible product. Take, for example, a large legal publisher publishing journals and educational materials: it would make logical and strategic sense that the metadata used to describe these complementary products be produced using the same metadata language and dictionary, yet that is rarely the case. (Think of this as a ‘chart of accounts’ for data).
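To illustrate the ‘chart of accounts for data’ idea, here is a small hypothetical sketch: the unit names, subject labels and titles below are invented, but the pattern is the point — every business unit’s local vocabulary is mapped to one preferred corporate term.

# Hypothetical shared thesaurus mapping each unit's local subject labels
# to a single preferred corporate term (all names here are invented).
SHARED_THESAURUS = {
    "antitrust": "Competition law",
    "competition law": "Competition law",
    "anti-trust & trade regulation": "Competition law",
}

def corporate_subject(local_term: str) -> str:
    """Return the preferred corporate term for a unit's local subject label."""
    return SHARED_THESAURUS.get(local_term.strip().lower(), local_term)

# A journals unit and an education unit describe comparable products differently;
# normalizing both against the shared thesaurus lets the products be related.
records = [
    {"unit": "journals",  "title": "Antitrust Law Review",        "subject": "Antitrust"},
    {"unit": "education", "title": "Competition Law: A Casebook", "subject": "Anti-Trust & Trade Regulation"},
]
for r in records:
    print(r["title"], "->", corporate_subject(r["subject"]))

A real implementation would use an established scheme (BISAC, Thema, or a legal taxonomy) rather than an ad hoc dictionary, but the governance question is the same: one controlled vocabulary, maintained centrally, applied everywhere.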

Additionally, the manner and method by which authors and contributors are required to compile (write) their authored materials are unlikely to take account of the potential for compatibility and consistency across content type. As is readily apparent, traditional content silos are breaking down as users are accorded more power in finding specific content and an organization will be significantly hampered if it cannot provide relevant material to customers irrespective of its format.

Inter-relationships and cross selling:

Companies frequently leave it up to channel partners to aggregate compatible or complementary products. Naturally, at the retail level this activity happens across publishers; however, not providing the supply chain with an integrated metadata file that represents the best complete presentation of all the company’s products contradicts the wider corporate business strategy of acquiring or developing products that ‘fit’ with corporate objectives. In other words, why doesn’t the company’s data strategy support the corporate business strategy in managing a collection of related and complementary products and services? To do this, data strategy should be a component of business strategy planning.

The opportunity inherent in managing data in this manner will be a real ability to sell more products and services to customers. Providing relevant “packages” of content that add related and complementary products to the item originally sought will generate cross- and up-sell opportunities. Why rely on a hit-or-miss approach provided by your channel partners? (Or worse, an association with a competing product.) This activity is only possible with good data, yet, if done effectively, it can become a significant competitive advantage delivering incremental sales. Return on investment would be seen in metrics such as average revenue per customer and/or average shopping cart revenues. Importantly, selling more products to a customer who is already interested in buying from you is always easier and more profitable than finding new customers.

Market, promotions and branding:

Combining a company’s products in a logical manner reinforces branding and messaging. If product information is disaggregated and disorganized, it is likely that the branding and messaging in the mind of consumers similarly lacks effectiveness.

Channel Partner Relationships:

A company may be able to exert more leverage with a channel partner if it is in a position to represent all of its products in a coordinated and managed manner than if this interaction is dispersed across the organization.

Matching and marrying data with partners will be less problematic and more effective if data models can be aligned – time to market will be significantly reduced and the planned benefits of the relationship should accrue in shorter time frames.

Additionally, providing a well-managed metadata file that supplies the type of product descriptive cohesion described above is going to benefit your channel partners as well, not only making their lives easier but also making them money by selling more of your products.

Acquisitions – integration of new data:

Historically, the integration of companies and new products with respect to product metadata might have been a haphazard affair. At a consolidated level this task becomes much easier, with the added benefit that connections between products and the adoption of standard dictionaries and standards may have an immediate financial impact. As noted before, it is likely the justification for acquiring the company in the first place was its compatibility or consistency with a strategy, and it is logical that this be reflected in the manner in which the products are managed.

It is likely we will see that companies with ‘best in class’ approaches to data asset management will be valued more than those without. Increasingly, companies will be asked about their data policies and management practices and those which ‘under-manage’ their data will be seen as less attractive – for acquisitions, partnerships and other relationships.

Those are some of the general benefits of better corporate data management and of developing a corporate data strategy. The effort required to implement a data strategy program isn’t inconsequential and planning should be rational and realistic; however, as data management across a business becomes more and more ‘strategic’, the faster you adopt an approach, the faster your business will benefit. If you believe the tasks involved are difficult (or near impossible) now, they are only likely to get more so; therefore, it would be best to get started now.

Wednesday, March 28, 2007

Metadata, identifiers and a challenge ahead ….

Another rehash from March 28, 2007, this time a post written by Michael Healy, who at the time was the Executive Director of the Book Industry Study Group. Michael has since moved on to the Copyright Clearance Center, but all of the issues he spoke about in 2007 remain relevant.

I am (unsurprisingly) in complete agreement with Michael’s comment in his thought-provoking piece on metadata that publishing businesses “must continue to focus on product information”. No one would seriously argue with his assertion that the quality of metadata has risen in recent years.

Several factors have influenced the improvements we have seen. International standards, notably ONIX, have been helpful to this process and many publishers, booksellers and data aggregators have adopted it to organize and communicate information in a standardized way. Practical guidance has also been made available. The Book Industry Study Group has prepared Product Metadata Best Practices, a set of voluntary guidelines that aims to help publishers improve the quality of their product information throughout the supply chain and speed the delivery of that information to the vendors’ trading partners. Innovative services from companies like Quality Solutions and Netread have also played their part.

I think also the general level of awareness in the book industry of the role product information plays in selling books has risen substantially. This has been helped by leaders like David Young at Hachette, Joe Gonnella at Barnes & Noble, and many others evangelizing on the subject for many years.

Under normal circumstances, when improvements like those we have seen are made, there is a danger of complacency setting in, but I see encouraging signs that this is being avoided. In many of the larger publishing houses, where investment in quality metadata has already been significant, I find abundant evidence of a commitment to raise standards even further. Many examples of high-quality data can be found outside these large houses, but I think it remains true that many smaller companies, working with fewer resources, have a lot to do to raise their game. Organizations like BISG must face the challenge of how to reach these companies with clear, straightforward advice and with tools to help them deliver good metadata. We will be announcing some initiatives in this area shortly.

More work is certainly needed in the standards area and much of this is underway. A new release of ONIX is expected later this year which, among other things, will improve its handling of digital publications. An entirely new standard now under development, the International Standard Party Identifier (ISPI), will in time establish a unique identifier for authors, composers, performers and others in the creative supply chain. We are all aware of how unreliable personal names are as a means of identifying individuals, especially when we consider how many people share the same name and how many authors use pseudonyms. The adoption of a standard ID for personal and corporate names will be a big step in eliminating ambiguity when searching and in facilitating transactions such as the remittance of royalties.

RFID also appears to offer interesting opportunities. As the price of tags continues to fall we are beginning to see some large-scale adoptions in libraries, notwithstanding well-documented concerns about privacy issues. In bookselling, at least so far, the response has been more cautious. The adoption of RFID by the leading Dutch bookshop chain, BGN, has certainly stimulated interest among American booksellers but at the moment most of them appear to be waiting for more compelling cost benefits to emerge.

As we look further ahead into a future in which more fragmented content is sold, distributed and traded digitally, whether it’s cookery recipes or individual chapters from textbooks, one key question is how the industry will cope with the metadata challenge. If publishers are finding it demanding today to provide comprehensive, accurate and timely product information to support a universe of more than 3.0 million US titles and 200,000 new books a year, what happens in a market where available product is set to grow exponentially?

Michael can be reached directly at CCC.


Links: Metadata: What does it all mean

Friday, June 18, 2010

Metadata Everywhere

An interesting article in OCLC's NextSpace publication about the increasing importance of metadata. Music to bibliographers' and catalogers' ears. (OCLC):
“Metadata has become a stand-in for place.”

So says Richard Amelung, Associate Director at the Saint Louis University Law Library. When asked to expand on that idea he explains, “Law is almost entirely jurisdictional. You need to know where a decision occurred or a law was changed to understand if it has any relevance to your subject.

“In the old days, you would walk the stacks in the law library and look at the sections for U.S. law, international law, various state law publications, etc. Online? Without metadata, you may have no idea where something is from. Good cataloging isn’t just a ‘nice-to-have’ for legal reference online. It’s a requirement.”

Richard’s point is one example of a trend that is being felt across all aspects of information services, both on and off the Web: the increasing importance and ubiquity of metadata. In a world where more and more people, systems, places and even objects are digitally connected, the ability to differentiate “signal from noise” is fast becoming a core competency for many businesses and institutions.

Librarians—and catalogers more specifically—are deeply familiar with the role good metadata creation plays in any information system. As part of this revolution, industries are increasing the value they place on talents and the ways in which librarians work, extending the ever-growing sphere of interested players.

Whether we are tracing connections on LinkedIn, getting recommendations from Netflix, trying to find the right medical specialist in a particular city or monitoring a shipment online, metadata has become the structure on which we’re building information services. And no one has more experience with those structures than catalogers.

Concluding:

“It is clear that metadata is ubiquitous,” Jane continues. “Education, the arts, science, industry, government and the many humanistic, scientific and social pursuits that comprise our world have rallied to develop, implement and adhere to some form of metadata practice.

“What is important is that librarians are the experts in developing information standards, and we have the most sophisticated skills and experience in knowledge representation.”

Those skills are being put to good use not only in the library, but in nearly every discipline and societal sector coming into contact with information.

Monday, May 05, 2014

MediaWeek (Vol 7, No 18): Metadata Harvesting, Death of the Novel, Ed Innovations Conference + more

This week's selection on FlipBoard

A presentation on slideshare.net that describes how to take metadata from HathiTrust and Pubmed:
This presentation will describe Cornell University Library efforts to provide an "afterlife" to The Cornell Veterinarian by leveraging a number of disparate initiatives and metadata sources. While attempting to build article level linking to full-text in HathiTrust (functionality currently unavailable), limitations in the metadata captured during the scanning process were uncovered. The speaker will delineate these metadata findings and provide strategies (some scalable, others highly labor intensive) for gathering the necessary metadata for creating direct links to articles found in HathiTrust. 



A dispatch in Inside HigherEd from the Education Innovations Summit where impatience may be brewing:
“At a national level, there is no evidence that educational technology has reduced the cost of education yet or improved the efficacy of education,” said Brandon Busteed, executive director of Gallup Education. “And that’s just as true as it gets. Maybe there will be some day, but that’s the question: How much longer do we think it will take before we can detect movement on the national needle?”
During the summit’s first two days, speakers identified well-known issues such as the rising cost of higher education, stagnant graduation and retention rates, and stubborn levels of unemployment among recent graduates. The proffered solution, in many cases, was a renewed promise of the disruptive powers of technology -- often wrapped in a sales pitch.
“Every one of these companies has -- at least most of them -- some story of a school or a classroom or a student or whatever that they’ve made some kind of impact on, either a qualitative story or some real data on learning improvement,” Busteed said. “You would think that with hundreds of millions of dollars, maybe billions now, that’s been plowed into ed-tech investments ... and all the years and all the efforts of all these companies to really move the needle, we ought to see some national-level movement in those indicators.”

Will Self thinks the novel is dead and it's not coming back to life. (Guardian)
My canary is a perceptive songbird – he immediately ceased his own cheeping, except to chirrup: I see what you mean. The literary novel as an art work and a narrative art form central to our culture is indeed dying before our eyes. Let me refine my terms: I do not mean narrative prose fiction tout court is dying – the kidult boywizardsroman and the soft sadomasochistic porn fantasy are clearly in rude good health. And nor do I mean that serious novels will either cease to be written or read. But what is already no longer the case is the situation that obtained when I was a young man. In the early 1980s, and I would argue throughout the second half of the last century, the literary novel was perceived to be the prince of art forms, the cultural capstone and the apogee of creative endeavour. The capability words have when arranged sequentially to both mimic the free flow of human thought and investigate the physical expressions and interactions of thinking subjects; the way they may be shaped into a believable simulacrum of either the commonsensical world, or any number of invented ones; and the capability of the extended prose form itself, which, unlike any other art form, is able to enact self-analysis, to describe other aesthetic modes and even mimic them. All this led to a general acknowledgment: the novel was the true Wagnerian Gesamtkunstwerk.
From twitter this week:
News Corp to buy Torstar's romance publisher Harlequin. Amazing this deal hadn't been done yrs ago.
With free web courses, Wharton seeks edge in traditional programs
Sad Ending to Ladies’ Home Journal’s Era

Monday, January 17, 2011

BISG eBook ISBN Study Findings Released

BISG held a meeting last Thursday to review the findings from the eBook ISBN study which I conducted for the group. BISG intends to use this study as a first step in defining what the industry should do to identify eBooks and eContent for the future.

Here is a link to the summary presentation. BISG plans to distribute the full report in some form within the next few weeks.

By way of introduction, here is the executive summary from the detailed report:
All publishing supply-chain participants want clarity and consistency in applying ISBNs to eBooks and all would like the solution to be defined and agreed by the relevant parties. The ISBN agency is virtually irrelevant to participants, and most interviewees – including sophisticated players – do not understand or acknowledge important aspects of the ISBN standard. These aspects include the international community of ISBN countries, the ratification of the standard by ISO and important definitions contained in the standard. Many interviewees referred to the ISBN policies and procedures as “recommendations” or “best practices” and, without correction, each of these issues encourages misinterpretation of the ISBN standard’s policies.

“Bad practice” is common and enabled at all levels within the supply chain. For example, retailers have the power to reject improperly applied title-level ISBNs but pragmatically create ‘work-arounds’ in order to make the products available for sale in the shortest possible time.

One major retailer has been ‘allowed’ to reject the ISBN almost entirely (although this pre-dates the issues with respect to eBooks). It is our view that many instances of these ‘bad practices’ are so embedded they will be difficult to dislodge.

Supply chain participants self-define important terms such as ‘product’ and ‘format’ and an industry thesaurus is suggested to alleviate this practice. Without a generally accepted thesaurus, participants are able to use terms as they please to support their arguments. All participants in the supply chain would benefit from better messaging and communication that addresses standards generally and the ISBN issues specifically. Interviewees – particularly medium and small players – repeatedly requested more information and education about standards (and related) issues. In particular, all participants would like an unambiguous eBook policy that is consistently and uniformly adopted.

With particular reference to the above, most of the interviewees failed to understand or recognize the ‘business case’ for applying ISBNs to the ultimate or purchased manifestation of the product.

Arguments regarding metadata control, data analysis or ‘discovery’ have failed to make any impact in convincing participants that the ISBN policy is one they should adopt. To publishers, these arguments sound ‘theoretical’ without any practical relevance.

While the definition of a ‘product’ is problematic (as noted above) there is a more pragmatic challenge faced by ISBN. Not only are publishers combining different content elements (in addition to text) into ‘books,’ they are beginning to redefine how books are created. Publishers contemplate gathering disaggregated content into collections that are ‘published’ specific to a customer’s requirements. As a consequence, some publishers openly question the need for an ISBN as their future publishing programs develop.

While the publisher > distributor > retailer supply chain has adequately accommodated eBooks, the library market faces some unique challenges. In particular, titles available from multiple vendors and in multiple pricing packages create significant challenges to vendors operating in this segment. As eBooks become more prevalent in the library community, these issues will continue to exacerbate what is an incomplete solution to eBook identification.

The quality of metadata provided by publishers was universally derided by all downstream supply partners. In particular, very few publishers are making an effort to combine print and electronic metadata in the first instance, or to ensure over time that the metadata attributed to print and electronic versions of the same title remains in sync. Repeatedly, supply chain partners referred to incomplete and inconsistent eBook metadata files and to data rot in electronic metadata files over time.

Metadata quality remains an important issue and, setting aside a revision of the ISBN policies and procedures, improving metadata would be the single most important and beneficial activity publishers could undertake to improve the effectiveness of the print and electronic book supply chain.

Conclusion: There is wide interpretation and varying implementations of the ISBN eBook standard; however, all participants agree a normalized approach supported by all key participants would create significant benefits and should be a goal of all parties.

Achieving that goal will require closer and more active communication among all concerned parties and potential changes in ISBN policies and procedures. Enforcement of any eventual agreed policy will require commitment from all parties; otherwise, no solution will be effective and, to that end, it would be practical to gain this commitment in advance of defining solutions.

Any activity will ultimately prove irrelevant if the larger question regarding the identification of electronic (book) content in an online-dominated supply chain (where traditional processes and procedures mutate, fracture and are replaced) is not addressed. In short, the current inconsistency in applying standards policy to the use of ISBNs will ultimately be subsumed as books lose structure, vendors proliferate and content is atomized.
This was a fun engagement and I enjoyed the full cooperation of the participants. There remains a lot more work to be done.

Sunday, October 14, 2012

MediaWeek (Vol 5, No 43): McGraw Hill, Springer, Hilary Mantel, Amazon, Metadata +More

The tires are being kicked at McGraw Hill Education by a predictable group of private equity players (Cengage is backed by Apax) but the asking price looks steep (Reuters):
McGraw-Hill Companies Inc's (MHP.N) education unit is expected to draw final bids from private equity firms Bain Capital and Apollo Global Management (APO.N) as well as rival Cengage Learning Inc, in a deal that could fetch around $3 billion, several people familiar with the matter said.

Cengage, the No. 2 U.S. college textbook publisher, and the two private equity firms are working on final offers for McGraw-Hill Education, the world's second-largest education company by sales, with the bids due later in October, the people said.

McGraw-Hill, which is running the auction as an alternative to its planned spin-off of the business, wants to get more than $3 billion and could still decide against a sale if the bids fail to meet its price expectations, the people said this week.
Note - If you read my post this week about Pearson, you'll recall that the Apollo Group owns the for-profit University of Phoenix, making me some kind of fortune teller.

Journals publisher Springer is up for a recapitalization based on reports from Reuters:
The company has performed well and earnings before interest, tax, depreciation and amortisation (EBITDA) have risen to around 330 million euros, bankers said, from 310 million in 2011, which was quoted on EQT's website.
Although there is no urgency for the company to do anything as its debt does not mature until between 2015 and 2017, conditions in Europe's leveraged loan market are such that it could be good time to do an opportunistic deal.
There have been a number of such deals recently as banks and private equity firms seek to make money and take advantage of stronger market conditions, after a lack of deal activity over the summer, including dividend recapitalisations by the RAC and Formula One.
Hilary Mantel interviewed in The New Statesman:
Mantel wondered if she was being too demanding. But then she thought that to adjust her style in any way would be not only a loss, but patronising (“You simply cannot run remedial classes for people on the page”). Some will be lost along the way, but she doesn’t mind. “It makes me think that some readers read a book as if it were an instruction manual, expecting to understand everything first time, but of course when you write, you put into every sentence an overflow of meaning, and you create in every sentence as many resonances and double meanings and ambiguities as you can possibly pack in there, so that people can read it again and get something new each time.”

She can sound arrogant, Mantel, assured of her abilities and candid about them in a way that seems peculiarly un-English. But even the arrogance is purposeful. It is one of her pieces of advice to young authors: cultivate confidence, have no shame in being bullish about your ideas and your abilities. She was patronised for years by male critics who deemed her work domestic and provincial (one, writing about A Place of Greater Safety – the French 800-pager – dwelt on a brief mention of wallpaper). So she makes no apologies for her self-belief.
...
After all the research, the reading, the note-taking, the indexing, the filing and refiling, it is a question of tuning in. Alison, she says, is how she would have turned out if she hadn’t had an education – not necessarily a medium, but not far off, someone whose brain hadn’t been trained, and so whose only (but considerable) powers were those of instinct, of sensing, of awareness. Mantel describes herself as “skinless”. She feels everything: presences, ghosts, memories. Cromwell is researched, constructed and written, but he is also channelled. Occupying his mind is pleasurable. He is cool, all-seeing, almost super-heroic in his powers to anticipate and manipulate. (Craig thinks Mantel made the mistake of falling in love with her leading man and that her version of Cromwell is psychologically implausible for a man we know tortured people.) Mantel relishes his low heart rate, the nerveless approach to life, a mental state unbogged by rumination. She says that when she began writing Wolf Hall, first entering this mind, she felt physically robust in a way she hadn’t for years.
Amazon chief Jeff Bezos was on a promo tour in the UK this week and was interviewed in The Telegraph:
He says the business quickly realised that if they wanted to make ebooks work, they needed to make hardware. Eight years later, the Kindle is into its fifth generation. The latest, film and music playing, multimedia tablet takes on Apple’s iPad and is, on pre-orders alone, the site's number one best seller.

Bezos, though, doesn’t want to take on Apple at their own game. “Proud as I am of the hardware we don’t want to build gadgets, we want to build services,” he says. “I think of it as a service and one of the key elements of the service is the quality of the hardware. But we’re not trying to make money on the hardware – the hardware is basically sold at breakeven and then we have a continuing relationship with the customer. We hope to make money on the services they buy afterwards.”

And make money they do, but Amazon is still not Apple’s size. Would Bezos like it to be? “Even though this device is only £159, in some ways it's better than a £329 iPad – way better wifi, the iPad only has mono sound and the Kindle bookstore is by far the best electronic bookstore in the world.”

Colin Robinson writing in the Guardian suggests ten ways publishing can help itself. Extra points if you can find anything new in this list (Guardian):
This year, on the face of things, it's been business as usual at the Frankfurt book fair, with some 7,500 exhibitors setting up shop in the gleaming white Messe. But scratch beneath the surface and a tangible unease about the future of the industry is evident: book sales are stagnating, profit margins are being squeezed by higher discounts and falling prices, and the distribution of book buyers is ever more polarised between record-shattering bestsellers and an ocean of titles with tiny readerships. The mid-list, where the unknown writer or new idea can spring to prominence, is progressively being hollowed out. This is bad news not just for publishing but for the culture at large.
Three magazine publishers' experiences with Apple's Newsstand (Journalism UK)
When Goldsmith delivered presentations on Newsstand at publishing conferences a year ago, he said he would be asked a common question.  "The first question from the audience would be 'aren't you cannibalising your own sales?' And that question would come from our editors as well."  "But 80 per cent of sales are overseas, 90 per cent of customers are new to the brand." And 40 per cent of all of sales are for subscriptions. "That's brilliant, because it is offsetting that sad decline in print.
It is a similar story for Conde Nast. "We are reaching a new audience, we are able to target them in new ways, we are able to market to them in new ways, it's a pretty exciting new development for us," Read said. "It means that the overall circulations of our magazines in these particular instances are growing very healthily so that we are seeing very big increases in circulation with titles such as Wired and GQ." Overseas sales vary from title to title, Read added, "A magazine like Vanity Fair will see quite a big proportion of its iPad sales coming from overseas, something like 60 to 70 per cent will be international, but that applies to print as well.
Metadata on Stage at Frankfurt, reported by Publishers Weekly:
Indeed this is the thrust of their exchange—the ever-increasing numbers of books and the faulty metadata being circulated about them—over the next half hour. The transition from print to digital has made metadata—which can mean anything from an ISBN to customer ranking on Amazon—not just simply useful, both Dawson and O’Leary emphasized, it is now critical to the ability to find and sell a book. The rise of digital publishing, and the lowering of barriers to entry for just about anyone—from professional publisher to newest self-publisher—has resulted in an explosion of metadata of all kinds. And apparently a sizeable chunk of it is either inaccurate or missing outright, compounding the problem of book discoverability. 
“When it was only the print bookstore, BISAC was a luxury,” O’Leary said, “but with all the digital products, we need accurate and granular metadata. It’s what we need to make book discovery possible.” The explosion in the amount of digital book content, “puts pressure on the metadata,” said Dawson, who pointed out that once inaccurate metadata is published online, “it’s there forever. If you’ve ever tried to correct a mistake in the metadata you know it’s a game of Whack-A-Mole.”
In fact in the olden days of print, O’Leary said, “It used to be that once you shipped the book, that was the end, the metadata was done. But with digital it never stops, there are constant updates and changes.” And as more consumers around the world go online they encounter information on all kinds of books, many of which they will want—but will be unable to buy. “Today, every book you publish is visible everywhere, even if you can’t buy it [because of territorial rights],” O’Leary said, “This encourages piracy, because if people do try to buy it, they find out they can’t.”
From Twitter this week:

From the fashion and style section of the NYTimes (??): The Education of Tony Marx, head of NYPL.  (NYTimes)

Monday, December 03, 2012

MediaWeek (Vol 5, No 49) Library World Overview, OCLC

A catch-up on what's going on in library land. I didn't intend this to be an OCLC catalog of achievement, yet that's what seems to have happened; most everyone else (vendors, content suppliers, etc.) seems to have been quiet over the past six months.  Especially interesting, however, is the LJ overview of the market, which is their annual review from March.  If you haven't kept up to date on what's going on specifically with vendors in the library world, give this a read.


Highlights:
  • NEXT SPACE: OCLC WorldShare: Sharing at Webscale (LINK)
  • More Libraries Join Worldshare Platform (LINK)
  • OCLC Improves Worldshare Metadata Program (LINK)
  • WorldShare Interlibrary Loan (LINK)
  • From March 2012 a Library Journal review of the library automation business (LJ):
Other News:
  • GoodReads and OCLC to work together (LINK)
  • OCLC Continues to Add Publisher Content (LINK)
Presentations and Research:
  • A joint OHIOLINK/OCLC project to determine how library resources can be used more effectively (LINK)
  • Libraries in 2020 – Pew Report (LINK)
  • Richard Walis Presentation on Linked Data to OCLC Members Committee Meeting (LINK)
  • The OCLC Global Council meeting was webcast live.
  • From Charleston Conference: The Digital Public Library of America (LINK)
Highlights:

NEXT SPACE: OCLC WorldShare: Sharing at Webscale (LINK)
Libraries are built on a foundation of sharing. They are the places where communities bring together important, unique and valuable resources for the benefit of all. OCLC WorldShare extends those values to allow all members to benefit from the shared data, services and applications contributed by each individual institution.

OCLC WorldShare is more than a new set of services and applications. It is the philosophy and strategy that will guide the cooperative in its efforts to help member libraries operate, innovate, connect, collaborate and succeed at Webscale. WorldCat data provides the foundation for WorldShare services. And WorldCat discovery and delivery applications help connect information seekers to library resources.
While the philosophy is broad, it also includes two very real, very specific sets of resources that can help libraries make the move to Webscale today: the OCLC WorldShare Platform and OCLC WorldShare Management Services.
More Libraries Join Worldshare Platform  (LINK)
OCLC WorldShare Management Services enable libraries to share infrastructure costs and resources, as well as collaborate in ways that free them from the restrictions of local hardware and software. Libraries using WorldShare Management Services find that they are able to reduce the time needed for traditional tasks and free staff time for higher-priority services.

"We selected WorldShare Management Services because we really wanted to get away from managing servers and back-office infrastructure and focus more of our time on working with student- and faculty-specific projects," said Stanley J. Wilder, University Librarian, The University of North Carolina at Charlotte, one of the newest members of the WorldShare Management Services community. "Plus, we wanted the ability to manage all of our various library services under one platform—using true multi-tenancy architecture that also would allow UNCC to benefit from cloud-based collaboration among our library peers."

UNC Charlotte is North Carolina’s urban research university. It is the fourth largest campus among the 17 institutions of The University of North Carolina system and the largest institution of higher education in the Charlotte region.
Among the new subscribers to OCLC WorldShare Management Services:
•    College of the Siskiyous (Weed, California)
•    De Anza College (Cupertino, California)
•    Glendale Community College (Glendale, California)
•    Indiana Institute of Technology (Fort Wayne, Indiana)
•    Iona College (New Rochelle, New York)
•    Lake Tahoe Community College (South Lake Tahoe, California)
•    Mt. San Antonio College (Walnut, California)
•    Nashotah House (Nashotah, Wisconsin)
•    North Central University (Minneapolis, Minnesota)
•    Northwestern Oklahoma State University (Alva, Oklahoma)
•    Saint Leo University (St. Leo, Florida)
•    San Bernardino Valley College (San Bernardino, California)
•    The Scripps Research Institute (La Jolla, California)
•    Tyndale University College & Seminary (Toronto, Ontario, Canada)
•    The University of North Carolina at Charlotte
•    Westminster College (New Wilmington, Pennsylvania)

OCLC WorldShare Management Services were released for general availability in the United States 16 months ago. Today, a total of 148 libraries have signed agreements to use the new services and 52 sites are already live.

WorldShare Metadata collection management automatically delivers WorldCat MARC records for electronic materials and ensures the metadata and access URLs for these collections are continually updated, providing library users better access to these materials, and library staff more time for other priorities.
OCLC Improves Worldshare Metadata Program (LINK)
OCLC worked with libraries in North America to beta test the new functionality as part of OCLC WorldShare Metadata services. Pilots of the new functionality are planned in different regions around the world.

"The WorldShare Metadata collection management service is a step forward because we can now use the records in the WorldCat database to provide access to our electronic collections in a way that incorporates access changes quickly and easily," said Sarah Haight Sanabria, Electronic Resources Cataloger, Central University Libraries, Southern Methodist University, who participated in the beta test.

Libraries use the collection management functionality to define and configure e-book and other electronic collections in the WorldCat knowledge base. They then automatically receive initial and updated, customized WorldCat MARC records for all e-titles from one source. With the combination of WorldCat knowledge base holdings, WorldCat holdings and WorldCat MARC records, library users gain access to the same set of titles and content in WorldCat Local, WorldCat.org, the local library catalog or other discovery interfaces.

OCLC WorldShare Metadata collection management services are available to all libraries with an OCLC cataloging subscription and work with other components of OCLC WorldShare Management Services as well as other library systems.
WorldShare Interlibrary Loan (LINK)
The release of WorldShare Interlibrary Loan represents the first large migration of OCLC member libraries to the OCLC WorldShare Platform, where they will benefit from expanded integration across a growing number of services. The platform will enable library staff and others to develop applications that will help them connect the service with other services in use within their libraries. They may also use the new service in conjunction with other components of OCLC WorldShare Management Services.

The phased rollout of the service has begun and will continue through December 2013. Open migration for all WorldCat Resource Sharing users will begin in February 2013 and continue until the end of access to WorldCat Resource Sharing on December 31, 2013.

OCLC has invited a small group of libraries with a low volume of borrowing-only interlibrary loan activity to participate in the initial 90-day managed migration currently in progress. Participation in the next managed migration group, scheduled to begin in October 2012, will be open to interested WorldCat Resource Sharing librarians whose normal interlibrary loan activities can be supported by available functionality in the service before its full release in February.
From March 2012 a Library Journal review of the library automation business (LJ):
In 2011, the library automation economy—the total revenues (including international) of all companies with a significant presence in the United States and Canada—was $750 million. This estimate does not necessarily compare directly to 2010’s $630 million, as this year’s estimate includes a higher proportion of revenues from OCLC, EBSCO, and other sources previously unidentified. (Using the same formula, 2010 industry revenues would be estimated at $715 million.)

As OCLC becomes ever more involved as competition in the library automation industry, we have performed a more detailed analysis of what proportion of its revenues derive from products and services comparable to other companies considered in this report. Of OCLC’s FY11 revenue of $205.6 million, we calculate that $57.7 million falls within that scope.

A broader view of the global library automation industry that aggregates revenues of all companies offering library automation products and services across the globe totals $1.76 billion, including those involved with radio-frequency identification (RFID), automated handling equipment, and self-check, or $1.45 billion excluding them. Library automation revenues limited to the United States total around $450 million.

The overall library economy continues to suffer major cutbacks that may never be fully restored, so library automation vendors are facing enormous challenges to find growth opportunities. Libraries may only be able to justify investments for tools that enable them to operate with fewer resources. Software-as-a-service (SaaS) deployments, for example, result in revenue gains through subscription fees commensurate with delivering a more complete package of services, including hosting; libraries see overall savings as they eliminate local servers and their associated costs. Stronger companies can increase their slice by taking on competitors with weaker products, especially those in international regions.

The ongoing trend of open source integrated library systems (ILSs) cannot be discounted. Open source ILS implementations shift revenues from one set of companies to another, often at lower contract values relative to proprietary software. Scenarios vary, so it’s difficult to determine whether these implementations result in true savings in total ownership costs and to what extent costs shift back to the libraries or their consortial or regional support offices.

The above comes from the management summary; the full report includes more detailed sections, as follows:
•    Three-Year Sales Trends by Category
•    2011 Personnel Trends
•    2011 Sales by Category
•    Discovery Trends
•    Company Profiles
OTHER NEWS:

Goodreads and OCLC to work together (LINK)
The new agreement pledges to improve Goodreads members’ experience of finding fresh, new things to read through libraries. It will also provide libraries with a way to reach this key group of dedicated readers through social media. As a WorldCat.org traffic partner since 2007, Goodreads has sent more than 5 million Web referrals to WorldCat.org.

“We are always looking to give the Goodreads community even more ways to connect with their favorite titles and authors,” explains Patrick Brown, Community Manager for Goodreads. “Linking to libraries through WorldCat and OCLC has always been important to Goodreads, and this agreement helps ensure that our more than 12 million members find their local library and that their local library finds them.”

The expanded partnership includes several components:
•    A joint marketing effort to get libraries to join the Goodreads site and create a library “group” page, which will now be listed at the top of the groups page.
•    Engagement reports from Goodreads that show how many libraries have joined and created group pages and how fast membership is growing for individual libraries on Goodreads.
•    An upcoming webinar held specifically for librarians and library staff members to learn more about Goodreads and how to optimize the library’s presence.
•    Library-specific promotional materials to encourage patron participation in the Goodreads Choice Awards 2012 during the month of November.
•    A discussion session planned for ALA Midwinter 2013 to hear library feedback and solicit ideas for additional visibility and collaboration.
OCLC and Amazon (LINK)
The OCLC WorldShare platform has an Amazon app that takes information about orders from the OCLC acquisitions web service and combines it with pricing and availability information from Amazon.  You can then see pricing and availability for titles and choose to purchase them from Amazon via a cart created on the fly.  (see p. 12)
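Conceptually the app is a join between two feeds: titles on order from the acquisitions web service and price/availability from Amazon. A toy sketch of that join, with both feeds stubbed out as local data because the real calls require OCLC and Amazon credentials (the field names and values are illustrative, not actual API responses):

# Toy illustration of the order/price join described above; both "feeds" are
# stubbed with made-up data rather than real OCLC or Amazon API responses.
orders = [  # what the acquisitions service might report as on order
    {"isbn": "9780262033848", "title": "Introduction to Algorithms"},
    {"isbn": "9780131103627", "title": "The C Programming Language"},
]
amazon_offers = {  # hypothetical price/availability lookup keyed by ISBN
    "9780262033848": {"price": 94.00, "in_stock": True},
    "9780131103627": {"price": 57.99, "in_stock": False},
}

cart = []
for order in orders:
    offer = amazon_offers.get(order["isbn"])
    if offer and offer["in_stock"]:
        cart.append({**order, **offer})

for line in cart:
    print(f'{line["title"]} ({line["isbn"]}): ${line["price"]:.2f}')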

Authority Control for Researchers: ORCID is another attempt at author/contributor authority (LINK)
Wouldn’t it be great if we had authority control for every researcher?  Of course, we do spend lots of time on authority work already but efforts are underway “to solve the author name ambiguity problem in scholarly communication.”  The ORCID project (http://about.orcid.org/) aims to resolve this ambiguity by issuing unique identifiers to authors.  The next stages of this project will focus on three areas:
•    “Allowing researchers to claim their profiles in an open environment that transcends geographic and national boundaries, discipline, and institutional constraints
•    Allowing researchers to delegate control of the ongoing management of their profile to their institution
•    Providing an interoperable platform for federated exchange of profile information with systems supplied by publishers, grant managers, research assessment tools, and other organizations in the scholarly community”
What is ORCID?
ORCID is an open, non-profit, community-based effort to create and maintain a registry of unique researcher identifiers and a transparent method of linking research activities and outputs to these identifiers.  ORCID is unique in its ability to reach across disciplines, research sectors, and national boundaries and in its cooperation with other identifier systems.  ORCID works with the research community to identify opportunities for integrating ORCID identifiers in key workflows, such as research profile maintenance, manuscript submissions, grant applications, and patent applications. 

ORCID provides two core functions:  (1) a registry to obtain a unique identifier and manage a record of activities, and (2) APIs that support system-to-system communication and authentication.  ORCID makes its code available under an open source license, and will post an annual public data file under a CC0 waiver for free download. 
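To give a feel for the API half, here is a minimal sketch of fetching a public record. It uses the present-day public endpoint (pub.orcid.org), which postdates this announcement, so treat the URL, version, and JSON paths as assumptions rather than a description of what ORCID offered at launch:

# Sketch: fetch a public ORCID record. The v3.0 endpoint and JSON layout are
# assumptions relative to the 2012 launch; requires the requests package.
import requests

orcid_id = "0000-0002-1825-0097"  # ORCID's long-standing example identifier
resp = requests.get(
    f"https://pub.orcid.org/v3.0/{orcid_id}/record",
    headers={"Accept": "application/json"},
    timeout=30,
)
resp.raise_for_status()
record = resp.json()

name = record["person"]["name"]
print(name["given-names"]["value"], name["family-name"]["value"])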

The ORCID Registry is available free of charge to individuals, who may obtain an ORCID, manage their record of activities, and search for others in the Registry.  Organizations may become members to link their records to ORCID identifiers, to update ORCID records, to receive updates from ORCID, and to register their employees and students for ORCID identifiers.
OCLC Continues to Add Publisher Content (LINK)
OCLC has signed new agreements with leading publishers around the world and has added important new content and collections to WorldCat Local, the OCLC discovery and delivery service that offers users integrated access to more than 922 million items.

WorldCat Local offers access to books, journals and databases from a variety of publishers and content providers from around the world; the digital collections of groups like HathiTrust and Google Books; open access materials, such as the OAIster collection; and the collective resources of libraries worldwide through WorldCat.

WorldCat Local is available as a stand-alone discovery and delivery service, and as part of OCLC WorldShare Management Services. Through WorldCat Local, users have access to more than 1,700 databases and collections, and more than 650 million articles.

OCLC recently signed agreements with the following content providers to add important new collections—including some searchable full text—to WorldCat Local, WorldCat.org and OCLC WorldShare Management Services:
June Announcement of earlier publisher additions (LINK)
Presentations and Research:

A joint OhioLINK/OCLC project to determine how library resources can be used more effectively
(LINK) via (Ohio Library Director)
This OCLC report by Julia Gammon (Akron) and Ed O’Neill (OCLC) was conducted to “gain a better understanding of how the resources of OhioLINK libraries are being used and to identify how the limited resources of OhioLINK member libraries can be utilized more effectively.”  The study collected and analyzed circulation data for books (30 million items in the final set used for analysis) in the OhioLINK union catalog using FRBR (Functional Requirements for Bibliographic Records) analysis.  It would take me pages to explain what FRBR does, but put most simply, it lets you look at items at the title level (all formats and types of holdings) rather than treating each format of the same content as a separate item.  Check out page 14 for a better explanation of FRBR.
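A crude way to picture what that rollup does to circulation data: collapse the separate format records for the same work onto one key and sum their circulation. A simplified sketch with made-up records, where a title/author pair stands in for real FRBR work-set matching (which is considerably more involved):

# Simplified illustration of FRBR-style rollup: circulation counted per work
# rather than per format. The records below are made up, and the title/author
# key is a stand-in for real work-set matching.
from collections import defaultdict

items = [
    {"title": "Beloved", "author": "Morrison", "format": "hardcover", "circs": 42},
    {"title": "Beloved", "author": "Morrison", "format": "paperback", "circs": 118},
    {"title": "Beloved", "author": "Morrison", "format": "audiobook", "circs": 9},
    {"title": "Middlemarch", "author": "Eliot", "format": "hardcover", "circs": 7},
]

circs_per_work = defaultdict(int)
for item in items:
    circs_per_work[(item["title"], item["author"])] += item["circs"]

for (title, author), total in circs_per_work.items():
    print(f"{title} / {author}: {total} circulations")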

For those of you looking for new research projects, the full data set for individual institutions is available from the project website at http://www.oclc.org/research/activities/ohiolink/circulation.htm.  Figure 3 in the report shows the spreadsheets for OSU.

Here are a few conclusions that the authors draw (a quick sketch of the 80/20 computation follows the list):
•    “The academic richness and histories of the OhioLINK member institutions are reflected in the uniqueness of their library collections. Unique items are not limited to a few large institutions but are widely distributed across many different types of member institutions. The membership should avoid collection practices that homogenize the state-wide collection through unnecessary duplication.
•    Individual institution members commented with surprise on the low use of their non-English language collections. Further study is needed to discover potential causes and trends of these collections’ usage patterns.
•    The most fascinating result of the study was a test of the “80/20” rule. Librarians have long espoused the belief that 80% of a library’s circulation is driven by approximately 20% of the collection. The analysis of a year’s statewide circulation statistics would indicate that 80% of the circulation is driven by just 6% of the collection.”
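For anyone who wants to rerun that last test on local circulation data, the computation is straightforward: sort items by circulation, accumulate until you reach 80% of total circulation, and see what fraction of the collection you used. A minimal sketch with dummy counts:

# Minimal sketch of the "80/20" test: what share of the collection accounts
# for 80% of circulation? The circulation counts below are dummy data.
def share_driving(circulations, target=0.80):
    total = sum(circulations)
    running, items_used = 0, 0
    for count in sorted(circulations, reverse=True):
        running += count
        items_used += 1
        if running >= target * total:
            break
    return items_used / len(circulations)

sample = [120, 95, 60, 40, 12, 8, 5, 3, 2, 1, 1, 0, 0, 0, 0]
print(f"{share_driving(sample):.0%} of items drive 80% of circulation")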
Libraries in 2020 – Pew Report (LINK) 
Richard Wallis Presentation on Linked Data to OCLC Members Committee Meeting (LINK)
The OCLC Global Council meeting was webcast live. 
From Charleston Conference: The Digital Public Library of America (LINK)