Over the past several years, cities like San Francisco have come a long way in putting data online. However, departments with budgets in the tens of millions of dollars - including the very agency tasked with policing government ethics - still have miles to go. I'm organizing an open data standards campaign in San Francisco and California because putting data online in formats that cannot be independently analyzed and reused by the public is simply poor government.
For the upcoming CityCampSF Hackathon, an event where volunteers will bring meaning to and extend the usefulness of public data, the muckraking newspaper CitiReport is sponsoring a challenge to cross-reference all kinds of open data to make sense of public ethics in San Francisco. Sadly, of the four departments that could help bring transparency to public ethics, only one is using open standards on its key data.
By way of background, it's been more than a year since SF passed a groundbreaking law urging departments to post their data on DataSF.org.
SF Planning: The Planning Department has several data sets on DataSF, but the project information on key developments active right now does not seem to be there. Instead, we get this nicely designed "Complete List of Projects" with no structured data (such as a simple CSV file with headers that include the project description, location, lead planner and developers) in sight.
SF Ethics: The Ethics Commission has the legal duty of collecting huge amounts of campaign finance and good government data. But their datasets on OpenSF are horribly out of date, and their online list of lobbyists would take a master coder or many, many man-hours to parse into an open format for analysis (again, simple CSV files with headers would do - name of lobbyist, who they work for, how much they got paid, who they contacted, when, what project they were trying to influence). This lack of functional transparency is glossed over with a great data display, but none of the underlying data is easily available on the site. This means you get the results that the display tells you are important, while independent analysis is nearly impossible.
SF Controller: The Controller's Office is responsible for keeping track of city salaries and city contracts. They don't have a fancy data display (which I bet saved them a lot of money), but they do allow you to search their contractor database and download data in structured open formats. Thank you!
SF Board of Supervisors: The Clerk of the Board publishes huge volumes of board agendas, minutes and votes. However, they are PDFs with no structure. Many of the attachments for Board items end up as PDF-formatted scans of documents without optical character recognition, which makes them unsearchable by people viewing them on the web and keeps them out of open web search results. This lack of functional transparency makes it very difficult to know, for example, who's voting on which planning projects.
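To make the "simple CSV file with headers" idea concrete, here is a minimal sketch in Python of how little code it takes to analyze lobbyist records once they are published that way. The column names are my own assumption, modeled on the fields suggested above, and the rows are invented placeholders, not real records.

```python
import csv
import io

# Hypothetical sample file: the column names are an assumption based on the
# fields suggested above, and every row is an invented placeholder.
SAMPLE_CSV = """lobbyist,client,amount_paid,official_contacted,contact_date,project
Lobbyist A,Developer X,5000,Supervisor 1,2011-05-02,Project Alpha
Lobbyist B,Developer Y,12000,Supervisor 2,2011-05-09,Project Beta
Lobbyist A,Developer X,2500,Supervisor 2,2011-05-16,Project Alpha
"""


def total_paid_by_client(csv_text):
    """Sum the amount paid to lobbyists, grouped by the client who hired them."""
    totals = {}
    # DictReader uses the header row to name each field in every record.
    for row in csv.DictReader(io.StringIO(csv_text)):
        client = row["client"]
        totals[client] = totals.get(client, 0) + int(row["amount_paid"])
    return totals


if __name__ == "__main__":
    for client, total in total_paid_by_client(SAMPLE_CSV).items():
        print(client, total)
```

With a real file, the only change would be to open it from disk instead of reading the sample string. Doing the same from a scanned PDF or a display-only web page would take far more work.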
In summary, if we want to make connections between who is lobbying for a contract, which developers are giving money to politicians, and who's voting on what, it is very, very difficult. Lack of functional transparency means that if you want to evaluate your government in San Francisco, you have to know almost exactly what piece of straw in a data haystack you're looking for before you even start.
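If all of these records were available as structured data, the cross-referencing described above could be a few lines of code. The sketch below is only an illustration: the three record layouts, the field names and every value are hypothetical placeholders, and a match shows overlap in the records, nothing more.

```python
# Hypothetical records: layouts, field names and values are all invented
# placeholders standing in for lobbying, campaign finance and vote data.
LOBBYING = [
    {"client": "Developer X", "official": "Supervisor 1", "project": "Project Alpha"},
    {"client": "Developer Y", "official": "Supervisor 2", "project": "Project Beta"},
]
CONTRIBUTIONS = [
    {"donor": "Developer X", "recipient": "Supervisor 1", "amount": 500},
    {"donor": "Developer Y", "recipient": "Supervisor 3", "amount": 250},
]
VOTES = [
    {"official": "Supervisor 1", "project": "Project Alpha", "vote": "aye"},
    {"official": "Supervisor 2", "project": "Project Beta", "vote": "no"},
]


def cross_reference(lobbying, contributions, votes):
    """Find officials who were lobbied on a project, voted on it, and
    also received a contribution from the lobbyist's client."""
    donated = {(c["donor"], c["recipient"]) for c in contributions}
    vote_by = {(v["official"], v["project"]): v["vote"] for v in votes}
    matches = []
    for contact in lobbying:
        key = (contact["official"], contact["project"])
        gave_money = (contact["client"], contact["official"]) in donated
        if key in vote_by and gave_money:
            matches.append({
                "official": contact["official"],
                "project": contact["project"],
                "client": contact["client"],
                "vote": vote_by[key],
            })
    return matches


if __name__ == "__main__":
    for match in cross_reference(LOBBYING, CONTRIBUTIONS, VOTES):
        print(match)
```

The hard part today is not this logic. It is getting the three lists out of PDFs and display-only pages in the first place.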
Consider that SF has an annual budget of more than $6 billion. We can do better. We will do better.
That's why I'm working on a campaign to bring structured open formats to SF and California law.
And please, if you can help research, write open records requests, code or have any other skills you'd like to bring to bear on behalf of tech-enabled open government, join us this weekend at the CityCampSF Hackathon. It's time to make transparency work for the people.