Open Data Doesn’t Equal Open Government (But Learning to Code Can Bridge the Gap)

Two trends have recently been converging in an exciting way: the push for open data and the “Learn to Code” movement. Together, they show great promise for advancing the goals of open government, but that promise has yet to be fully realized.

The Open Data Movement

Open data has increasingly become a way for governments to demonstrate their commitment to transparency and accountability. Calls to release government data have been heeded to varying degrees by local, provincial, and national governments in many countries. And open data is, in itself, no small feat. Releasing data in any capacity is often an immense hurdle, and one for which governments should be recognized.

But, as anyone who has ever downloaded spreadsheet upon spreadsheet of government data (or pored over printed table upon table in a government publication) can tell you, open data alone does not automatically equate to open government. Open government requires citizens and governments to interact with open data and transform it into something that can drive debate, advocacy, and accountability.

Two weeks ago, I had a chance to see the challenges of converting open data into open government firsthand in Indonesia. The Indonesian government has been pushing to increase the adoption of “e-procurement” nationwide as part of its open government strategy. In this system, companies that want to win government contracts must submit their bids through an online process, which facilitates monitoring. The government then publishes data on the open calls and winning contracts.

But even governments whose processes are largely computerized typically store data in formats that serve their own purposes, not the needs of the citizen user. For example, when the government of Indonesia first began releasing data, it had to be downloaded one procurement package at a time. More importantly, the data in its raw format is not immediately meaningful to most citizens.

This is where the Jakarta-based Indonesia Corruption Watch (ICW) saw an opportunity. ICW works with the Indonesian procurement agency, LKPP, to consolidate its e-procurement data on a website. Visitors to the site can visualize the data and search for contracts with specific characteristics. That would be valuable in itself, but ICW went a step further by developing a tool called Potential Fraud Analysis, which applies a scoring algorithm to procurement contracts in order to identify those with a higher likelihood of fraud. Armed with this data, civil society groups, journalists, and citizens can then undertake further (analog) investigation in order to hold government units accountable for their use of resources.
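To make the idea of a scoring algorithm concrete, here is a minimal sketch in Python of how red-flag scoring might work in principle. It is not ICW's actual method: the three indicators (a single bidder, a winning bid very close to the budget ceiling, and a short bidding window), the thresholds, and all field names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Contract:
    # All field names are hypothetical; real procurement data will differ.
    contract_id: str
    num_bidders: int
    budget_ceiling: float
    winning_bid: float
    bidding_days: int


def risk_score(c: Contract) -> int:
    """Count how many illustrative red flags a contract raises (0 to 3)."""
    score = 0
    # Red flag 1 (assumed): little or no competition.
    if c.num_bidders <= 1:
        score += 1
    # Red flag 2 (assumed): winning bid within 2% of the budget ceiling.
    if c.budget_ceiling > 0 and c.winning_bid / c.budget_ceiling >= 0.98:
        score += 1
    # Red flag 3 (assumed): bidding window shorter than one week.
    if c.bidding_days < 7:
        score += 1
    return score


def rank_by_risk(contracts):
    """Return contracts ordered from most to fewest red flags."""
    return sorted(contracts, key=risk_score, reverse=True)


if __name__ == "__main__":
    # Made-up example records, not real contracts.
    sample = [
        Contract("A", num_bidders=5, budget_ceiling=1000.0,
                 winning_bid=800.0, bidding_days=30),
        Contract("B", num_bidders=1, budget_ceiling=1000.0,
                 winning_bid=995.0, bidding_days=3),
    ]
    for c in rank_by_risk(sample):
        print(c.contract_id, risk_score(c))
```

A high score here is only a prompt for the kind of follow-up investigation described above, not evidence of wrongdoing.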

The implementation of open data initiatives is often midwifed by civically minded programmers who write the code to display government data online. Efforts like those of ICW’s staff demonstrate the importance of computer programming skills in ensuring data accessibility, but these skills are still uncommon. The languages that underpin the ubiquitous websites and applications that have become a part of everyday life for many people around the globe remain a complete mystery to most of us. Fortunately, there are efforts underway to change that.

The “Learn to Code” Movement

The Learn to Code movement is seeking to demystify computer programming in an effort to put technological tools in the hands of more people. There are various “learn to code” online courses like Codecademy, while groups like Girls Who Code are working to build up communities around tech skills, with a focus on groups that are underrepresented in the tech industry. This fall, students in UK schools are being taught from a new computing curriculum that includes ambitious computer programming coursework for students starting in primary school. These courses focus on teaching languages and frameworks like HTML, Python, JavaScript, PHP, and Ruby on Rails (to name a few), in the service of web design and the skills needed to work in the tech industry.

Bringing the Two Together

This is where I want to make a pitch for another kind of coding that I think is at least as important for citizens around the globe: coding for data analysis. Learning to code websites and applications is an essential skillset for visualizing and publishing open data online. But statistical analysis is what allows us to interrogate, test, and extract meaning from data, and many powerful data analysis applications (including open source options like “R” and other popular programs like Stata) rely on the use of command lines or formulas.

Learning a few key lines of code in one of these applications (and, more importantly, how to interpret the results they produce) opens the door for anyone—not just academics or researchers—to identify statistical trends and relationships related to the issues faced by their communities. It puts real power and flexibility in the hands of citizens to test the claims they hear from those in power and to back up advocacy with hard facts. Data has the potential to inspire powerful stories, but these stories must be unlocked through analysis.
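As a rough illustration of what "a few key lines of code" can look like, the sketch below uses Python (rather than R or Stata) to load a small table and measure how strongly two columns move together. The dataset, the column names, and the question being asked are all made up for this example; a real analysis would start from a published government file.

```python
import csv
import io
from math import sqrt
from statistics import mean

# Made-up illustration data, not real figures.
RAW = """district,budget_per_student,pass_rate
A,100,60
B,150,65
C,200,72
D,250,80
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


rows = load_rows(RAW)
budget = [float(r["budget_per_student"]) for r in rows]
pass_rate = [float(r["pass_rate"]) for r in rows]

r = pearson(budget, pass_rate)
print(f"Correlation between budget and pass rate: {r:.2f}")
```

A coefficient near 1 or -1 means the two columns tend to rise and fall together (or in opposite directions); it does not by itself show that one causes the other, which is why interpreting the result matters as much as producing it.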

This week, I’m in Mexico City for Condatos, the Latin America Regional Open Data Conference. The conference is an exciting example of burgeoning efforts to integrate programming, data analysis, and communication of open data. The agenda includes speakers and discussions with policymakers, entrepreneurs, researchers, data scientists, and data journalists, all of whom will be talking about the future of the open data ecosystem in the region. On the day before Condatos, the AbreLatam “unconference” will offer workshops and experiences such as a Data Bootcamp that will bring together 20 journalists, 20 programmers, and 20 designers to learn how to analyze and visualize open data. I’m looking forward to seeing firsthand some of the latest efforts to turn open data into truly open government.

