I was witness to an interesting cultural moment today that exemplifies how fast technology is moving. Above, Phil Ma, owner of Mavelous SF, checks out his 23andMe results, fresh from the day's mail. Getting online access to your genetic code now costs just a couple hundred bucks for a year.
Over the past several years, cities like San Francisco have come a long way in putting data online. However, departments with budgets in the tens of millions of dollars - including the very agency tasked with policing government ethics - still have miles to go. I'm organizing an open data standards campaign in San Francisco and California because putting data online in formats that cannot be independently analyzed and reused by the public is simply poor government.
For the upcoming CityCampSF Hackathon, an event where volunteers will bring meaning to and extend the usefulness of public data, the muckraking newspaper CitiReport is sponsoring a challenge to cross-reference all kinds of open data to make sense of public ethics in San Francisco. Sadly, of four departments that could help bring transparency to public ethics, only one is using open standards on its key data.
By way of background, it's been more than a year since SF passed a groundbreaking law urging departments to post their data on DataSF.org.
SF Planning: The Planning Department has several data sets on DataSF, but the project information on key developments active right now does not seem to be there. Instead, we get this nicely designed "Complete List of Projects" with no structured data (such as a simple CSV file with headers that include the project description, location, lead planner and developers) in sight.
SF Ethics: The Ethics Commission has the legal duty of collecting huge amounts of campaign finance and good government data. But their datasets on OpenSF are horribly out of date, and their online list of lobbyists would take a master coder or many, many man hours to parse into an open format for analysis (again, simple CSV files with headers would do - name of lobbyist, who they work for, how much they got paid, who they contacted, when, what project they were trying to influence). This lack of functional transparency is glossed over with a great data display, but none of the underlying data is easily available on the site. This means you get the results that the display tells you are important, while independent analysis is nearly impossible.
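To show what "simple CSV files with headers" would make possible, here is a minimal sketch in Python. The column names and the sample rows are invented for illustration; they are not the Ethics Commission's actual schema or data. The point is only that once the headers exist, an independent question (here, how much each client paid lobbyists in total) takes a few lines to answer.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample: the column names and every row are made up
# for illustration, not taken from any real city dataset.
SAMPLE_CSV = """lobbyist,client,amount_paid,official_contacted,contact_date,project
Lobbyist A,Client X,5000,Official 1,2011-10-03,Project Alpha
Lobbyist A,Client Y,2500,Official 2,2011-10-17,Project Beta
Lobbyist B,Client X,4000,Official 1,2011-11-02,Project Alpha
"""


def total_paid_by_client(csv_text):
    """Sum the amount_paid column for each client in the CSV text."""
    totals = defaultdict(float)
    # DictReader uses the header row as keys, so no position counting is needed.
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["client"]] += float(row["amount_paid"])
    return dict(totals)


if __name__ == "__main__":
    for client, total in sorted(total_paid_by_client(SAMPLE_CSV).items()):
        print(f"{client}: ${total:,.2f}")
```

The same few lines would work on a real file by swapping the sample string for an open file handle, assuming the published file kept stable headers.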
SF Controller: The Controller's Office is responsible for keeping track of city salaries and city contracts. They don't have a fancy data display (which I bet saved them a lot of money), but they do allow you to search their contractor database and download data in structured open formats. Thank you!
SF Board of Supervisors: The Clerk of the Board publishes huge volumes of minutes of board agendas and votes. However, they are PDFs with no structure. Many of the attachments for Board items end up as PDF-formatted scans of documents without optical character recognition, rendering them unsearchable by people viewing them on the web, and keeping them out of open web search results. This lack of functional transparency makes it very difficult to know who's voting on which planning projects, for example.
In summary, if we want to make connections between who is lobbying for a contract, which developers are giving money to politicians, and who's voting on what, it is very, very difficult. Lack of functional transparency means that if you want to evaluate your government in San Francisco, you have to know almost exactly what piece of straw in a data haystack you're looking for before you even start.
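If all three kinds of records were published in structured form, the cross-referencing described above could be sketched roughly as follows. This is an assumption-heavy illustration: the record types, field names and sample entries are hypothetical placeholders, and real data would need name matching and cleanup that this sketch ignores.

```python
from collections import namedtuple

# Hypothetical record shapes; real departments would define their own fields.
Contact = namedtuple("Contact", "client official project")
Contribution = namedtuple("Contribution", "donor recipient amount")
Vote = namedtuple("Vote", "official project vote")

# Invented placeholder records, not real people, projects, or amounts.
contacts = [
    Contact("Developer X", "Supervisor 1", "Project Alpha"),
    Contact("Developer Y", "Supervisor 2", "Project Beta"),
]
contributions = [
    Contribution("Developer X", "Supervisor 1", 500),
    Contribution("Developer Z", "Supervisor 2", 250),
]
votes = [
    Vote("Supervisor 1", "Project Alpha", "aye"),
    Vote("Supervisor 2", "Project Beta", "no"),
]


def find_overlaps(contacts, contributions, votes):
    """Return cases where a lobbying client also donated to the official
    who was contacted about, and then voted on, the same project."""
    # Total contributions per (donor, recipient) pair.
    donated = {}
    for c in contributions:
        key = (c.donor, c.recipient)
        donated[key] = donated.get(key, 0) + c.amount

    # Recorded vote per (official, project) pair.
    vote_by = {(v.official, v.project): v.vote for v in votes}

    overlaps = []
    for c in contacts:
        amount = donated.get((c.client, c.official))
        vote = vote_by.get((c.official, c.project))
        if amount is not None and vote is not None:
            overlaps.append((c.official, c.project, c.client, amount, vote))
    return overlaps


if __name__ == "__main__":
    for row in find_overlaps(contacts, contributions, votes):
        print(row)
```

An overlap found this way is a lead for further reporting, not evidence of wrongdoing; the sketch only shows that the haystack becomes searchable once the records share common keys.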
Consider that SF has an annual budget of more than $6 billion. We can do better. We will do better.
That's why I'm working on a campaign to bring structured open formats to SF and California law.
And please, if you can help research, write open records requests, code or have any other skills you'd like to bring to bear on behalf of tech-enabled open government, join us this weekend at the CityCampSF Hackathon. It's time to make transparency work for the people.
Who's signing on early to the campaign for open data standards in SF and California? Here's a look at how they shape up on a map:
If you'd like to learn more or get involved, check out the San Francisco-California Open Data standards campaign home page.
Whether your expertise is copywriting, marketing, design, research or coding, you're welcome at the first CityCampSF Hackathon, this Saturday and Sunday at 600 Harrison St., Suite 120 in San Francisco. At the 24-hour event we'll be working on projects around government transparency at the state and local level and using application interfaces from Granicus and Tropo. There are $4,000 in sponsored cash prizes.
Check out the list of projects and prizes and RSVP.
If you're already signed up, please invite friends who have expertise to contribute. Events like this are incredibly important for influencing policy. I wrote here about how bad data may be costing California millions in wasted environmental funds.
Last Saturday, civic hackers at Open Data Day Bay Area kicked off some great open data projects, including 99atms.com, an HTML5 app for finding fee-free credit union ATMs in the Bay Area. Sunday on Gov 2.0 Radio we talked about the event and the impact open data policies have on fighting urban poverty. Listen here.
Let me know if you have any questions about the CityCampSF Hackathon.
RSVP here - http://www.wiredtoshare.com/citycampsf_hackathon
Last month, I opened a new account with my local credit union as part of the Move Our Money campaign. Today, I'm excited to use the winning app from Open Data Day Bay Area - an ATM finder for credit union-linked cash machines - to easily make fee-free cash withdrawals.
Check it out at 99atms.com.
I hope New Bottom Line and others active in campaigns to get money out of big banks and into local credit unions will help further development of the new app, which works on both desktop and mobile browsers. The easy visualization of ATM locations will hopefully spur installation of new machines in under-served communities.
I was listening to Jeanne Holm, Data.gov's data evangelist, via Skype today at Open Data Day Bay Area. Jeanne's presentation made me think about my first real meaningful interaction with federal government data. As an investigator with the San Francisco City Attorney's Office back in 2007, I was trying to find out if a City utility crew leader was using a private business to do private jobs with his crew during his city work shifts.
His business didn't show up anywhere, not in the Yellow Pages or local business registration listings. But it did show up on FedSpending.org, a precursor to Data.gov and a product of legislation by then-Sen. Barack Obama. Donnie Thomas' utility company had taken federal contracts with the Coast Guard, and there they were on FedSpending.org. That was my first big break in the case.
Don't let anybody sell short the value of open government data.
In the open data and open government communities, we like to talk up the benefits of open innovation and private sector application development when governments open up data in structured formats. The cost of not doing so is huge, too. When data cannot be easily analyzed, lost dollars in public policy mistakes can easily reach tens of millions at a time.
In California, lengthy timber harvest plans are published by the Department of Forestry and Fire Protection to an obscure FTP server as several unsearchable PDFs for each report. This means the extensive information contained in these logging industry reports is useless in terms of data analysis.
The impact here is huge. Right now, the California and U.S. governments are spending $128 million to restore salmon in Battle Creek, which collects runoff from the west slope of Lassen Peak (Battle Creek watershed map).
Here's an overhead view of Lassen Peak from Google Maps:
And here's what happens when you zoom in on the west side of the Peak. The quilted pattern is timber clear cuts.
Is California allowing massive deforestation and increased runoff of industrial pollutants and loose soils into Battle Creek at the same time it is spending $128 million to restore salmon habitats there? Without good data, it's almost impossible to know. (For more background, see "Clear-cutting, death and madness" in the Redding Record Searchlight.)
Next weekend at CityCampSF, we'll be working on a project to imagine how applying open data principles to timber harvest plans would allow for clear analysis of logging impacts to habitats in California. Forests Forever, a conservation group based in San Francisco, will use our work as part of a campaign to legislate an open structured format for these reports.
Is bad data leading to the waste of tens of millions of dollars by California and the feds?
Join us at the CityCampSF Hackathon next weekend and help us find out. The event is open to physical and remote participation.
Update (12/05/11): Paul Hughes, executive director of Forests Forever, points out that XML online Timber Harvest Plans will expose data re: all clearcutting now going on in California, including in the Cascades, Shasta/Modoc Plateau region, and North Coast and Central Coast, in addition to the Sierra Nevada. Also, here's the Sacramento Bee's story on Battle Creek restoration efforts and the possible effects of nearby logging.
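As a rough picture of what XML timber harvest plans could enable, here is a minimal Python sketch. No such format exists yet, so the element names, plan IDs, watershed names and acreages below are all hypothetical placeholders, not real Department of Forestry and Fire Protection data. It simply totals clearcut acreage per watershed, the kind of question that is nearly impossible to answer from unsearchable PDFs.

```python
import xml.etree.ElementTree as ET

# Hypothetical structure and values, invented purely for illustration.
SAMPLE_XML = """<timber_harvest_plans>
  <plan id="EXAMPLE-1">
    <watershed>Example Creek</watershed>
    <method>clearcut</method>
    <acres>120.5</acres>
  </plan>
  <plan id="EXAMPLE-2">
    <watershed>Example Creek</watershed>
    <method>selection</method>
    <acres>80</acres>
  </plan>
  <plan id="EXAMPLE-3">
    <watershed>Other Creek</watershed>
    <method>clearcut</method>
    <acres>40</acres>
  </plan>
</timber_harvest_plans>"""


def clearcut_acres_by_watershed(xml_text):
    """Total the acres of plans whose method is 'clearcut', per watershed."""
    totals = {}
    root = ET.fromstring(xml_text)
    for plan in root.findall("plan"):
        if plan.findtext("method") == "clearcut":
            watershed = plan.findtext("watershed")
            acres = float(plan.findtext("acres"))
            totals[watershed] = totals.get(watershed, 0.0) + acres
    return totals


if __name__ == "__main__":
    for watershed, acres in sorted(clearcut_acres_by_watershed(SAMPLE_XML).items()):
        print(f"{watershed}: {acres} acres of clearcut")
```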
Today I actually shut down one of my brands instead of opening a new one. Third Thursdays SF, the civic tech meetup series conjured up by Luke Fretwell and me over coffee late last year, is no more. There really isn't anything negative to say; Third Thursdays SF is going away so I can focus more fully on SF Tech Dems, a project I started just before joining NationBuilder in mid-May.
SF Tech Dems is meant to be a grassroots advocacy group weighing in on - and sometimes lobbying for - tech-related legislation in San Francisco and at the state and federal levels. We aim to organize tech enthusiasts and experts into a serious political force in the SF Bay Area. And I don't mean we're going to be about tax breaks for Big Tech or big name out-of-town endorsements for local politicians - I mean we're going to be a group of regular folks who get tech and how it should serve people and democratic institutions. We'll weigh in on everything from civil liberties issues to data center best practices and provide advice to local and state politicians on issues within our areas of expertise. SFTechDems.com is hosted on the NationBuilder platform, and, with its great advisory board and growing membership, I'm convinced it can be a significant organizing force for good.
If you want to get involved, you can become a dues-paying SF Tech Dems member here, or sign up for email updates here. Our current campaigns are for legislative standards for structured open data in SF and California, and for reform of the Electronic Communications Privacy Act to ensure that Americans' private online information is not accessed without warrants.
Hope to see some of you at a Tech Dems meeting one of these days soon - who knows, maybe even on a Thursday.