Wednesday, July 31, 2013

White House Sponsored DataJam Promotes Open Data Initiatives

A great deal of machine-readable data is coming from the federal government as a result of the Obama Administration’s open-data initiatives. Open data was a platform plank of the President’s first campaign and, in his second term, he has reinvigorated the policy with a new spate of initiatives, executive orders and community outreach.

The result of the outreach program was on display at a recent DataJam event I attended at the White House Conference Center (near, not at, the White House). Sponsored by the White House Office of Science and Technology Policy and CENDI, the event invited technologists, researchers, publishers and data owners to weigh up ways of using the data and content rapidly being made accessible by almost every federal agency. As a group, we were challenged by White House Chief Technology Officer Todd Park to engage our inner entrepreneurial spirits and think up new and innovative uses for government data. In a lively opening speech, Park promised that he would “make [him or her] famous” if any DataJam audience member came up with a viable idea or product within 60 days. He pointed to several examples of health-data products that had come out of similar meetings recently, and was enthusiastic about this group’s ability to produce some interesting ideas.

Government information and data are a by-product of the taxes we pay to maintain the federal agencies and, Park confirmed, “the administration is looking to maximize tax payer return of government produced data by opening it up so many people can access and use it.” Citing ‘Joy’s law’ as a guiding principle (that the smartest people in the world will always be working for someone else, so collaboration is imperative if real progress is to be made), Park suggested that bringing people together in groups like ours is important both for promoting open data and for actually devising worthwhile uses for it. Furthermore, he made the point that “open data by itself is useless and only useful if it gets applied to something and produces value.” He encouraged us to “use ‘our data’ to produce awesomeness - where stuff can actually happen.”

Earlier this year, President Obama signed an executive order requiring all federal agencies to open access to all government-sponsored published content by the end of 2013. This has produced a frenzy of activity at some agencies to determine what they have and how to make the materials accessible. Some agencies are more mature in this respect than others but, CTO Park confirmed, the President is passionate about open data and has made specific commitments to fulfilling this policy.

In a speech in Austin, TX, in May, the President cited several examples of start-up companies working with open data: “StormPulse uses government-produced weather data to help businesses anticipate disruptions in service. Another company based in Virginia called OPower uses government trend data to save consumers $200m on their energy bills.” Obama also mentioned an app called iTriage, founded by a pair of doctors, that uses data from the Department of Health and Human Services to help define symptoms and find appropriate health care for the patient. In the same speech, the President announced that his administration is making even more government data available and he expected that this and his other open-data initiatives would help launch even more start-ups similar to the ones he mentioned. 

The newly available data, he said, would help “more entrepreneurs come up with products and services that we haven’t even imagined yet.”

More recently, in a speech just two weeks ago, President Obama suggested that we are part of a process to build a better, more open-data America. Hyperbole aside, there will be no going back once these policies are in place, and this push by the executive branch and the federal agencies is likely to change profoundly how we, as citizens, interact with the government. Discussions of open data frequently cite the satellite imagery and data from the National Oceanic and Atmospheric Administration (NOAA), which have produced the now ubiquitous mapping and weather apps; however, given the “fire hose” nature of the data and information on offer, these early cases are likely to represent only a small fraction of the opportunities the government’s open-data initiatives will create.

Commercial publishers of government-funded and/or government-produced content and data are spooked by some of these moves. During our meeting it was mentioned that one only needs to search for “aspirin” on Google to see how government-produced content accessed via API can yield a result that looks very much like a drug handbook entry from Elsevier, Wolters Kluwer or some other commercial medical publisher. The government often makes reference-like content a requirement of various approval processes, and thus we may be about to see professional reference content undergo some profound changes. And that is just one small example of what could happen to commercial publishing.

As a direct result of the open-data initiatives in both the US and Europe, the Association of American Publishers and the Society for Scholarly Publishing (in collaboration with partner CrossRef) have established an initiative named CHORUS. (The EU is said to be about to press for greater open-data requirements than we have in the US.) Through CHORUS, publishers aim to manage the open-data and open-access content requirements themselves and so avoid a PubMed situation: publishers whose content is also available on PubMed see significant decreases in traffic once PubMed opens access to the same content. Publishers believe that, by setting up their own open-access service, they will be able to fulfill the government’s open-access requirement and mitigate the impact on their own business models.

Regardless of the risks to current publishers and their business models, it appears that the government produces a lot more content and data than is currently being commercialized by publishers. The sheer amount of data is overwhelming and, as long as the President continues to promote open data, we’ll see hundreds of new products and services develop in the short term as entrepreneurs take CTO Park up on his promise to make them famous.
