D.C. code meets open government on GitHub
When Joshua Tauberer found a typo in the District of Columbia’s legal code, he corrected it with a GitHub pull request – an action that’s possible only because D.C. publishes its code on the online hosting service.
Joshua Tauberer isn’t an editor, but when he found a typo in the District of Columbia’s legal code recently, he corrected it with a GitHub pull request. That was possible because the nation’s capital is the only jurisdiction to publish its code on the online hosting service, he said.
The founder of GovTrack.us and a public member of the District’s Open Government Advisory Group, Tauberer noticed that a link to a subchapter of the code pointed to the wrong part of the law. Because the code is published on GitHub, he was able to edit the file and submit a pull request that resulted in a correction.
We spoke with Tauberer to find out more about the process.
This interview was lightly edited for clarity.
GCN: Why is it notable that you were able to edit the D.C. Code on GitHub?
Tauberer: Legal publishing has been controlled by a very small number of companies that are doing the vast majority of work to publish the legal codes of the states. This is a big deal because the District’s government, the [D.C.] Council, has moved to a new contractor for legal publishing. It’s a nonprofit [called the Open Law Library]. It’s interested in making things better -- not just for people who are going to pay high-priced subscriptions to read the law, but for everyone, and it’s using open, modern technologies and practices to do it.
GCN: Other jurisdictions and the federal government digitally publish laws, but this is -- to your knowledge -- the only government that opens the law to editing by the public. Is that right?
Tauberer: In every jurisdiction, the public is allowed to suggest an edit, and that usually happens in the form of proposing a bill or being an advocate and going to the council. The difference here was that the process occurred entirely in the open and online using a platform that’s built for that purpose.
GCN: So, what happened?
Tauberer: There’s a repository on GitHub where the council is publishing XML data for the laws that are enacted in the D.C. Code. With a little help from the Open Law Library, I was able to find the file in that repository that corresponded to the law that the council had recently enacted to update the responsibilities of the Office of Open Government, which is what I was interested in. The typo was actually in the XML metadata that was added by the codification lawyer in the council after the enactment of the act. The council passes an act and they turn it into XML data, and it was that process of turning it into XML data that had the typo.
I spotted that in the file and made the change in GitHub. I pushed that change to my own copy of the D.C. Code repository -- it was sort of a scratch space for putting my correction somewhere -- and then I opened a pull request. A pull request is basically a request to the maintainer of the original repository to merge a change that I put in my repository. The original maintainer can see exactly the change that I’ve made and adopt it or not. Ben Bryant, who’s the codification lawyer at the council, got an email through GitHub’s normal process of sending emails when pull requests are opened. He saw my pull request, reviewed it and merged it, which moved my correction of the typo to the authoritative repository that the council runs.
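Editor's illustration: the kind of error Tauberer describes, a cross-reference in XML metadata that points somewhere it should not, is something a short script can partly screen for. The sketch below is a minimal example under stated assumptions: the element names (`section`, `cite`), the attributes (`id`, `ref`) and the sample identifiers are hypothetical and are not the schema the council actually uses. It only catches references that resolve to nothing; a link that points to a valid but wrong section, as in this case, still needs a human reader.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the real D.C. Code XML uses its own schema.
SAMPLE = """
<code>
  <section id="sec-101"><heading>Definitions</heading></section>
  <section id="sec-102">
    <heading>Office established</heading>
    <text>See <cite ref="sec-101"/> and <cite ref="sec-109"/>.</text>
  </section>
</code>
"""


def find_dangling_refs(xml_text):
    """Return every cite/@ref value that matches no element's id."""
    root = ET.fromstring(xml_text)
    # Collect every id declared anywhere in the document.
    known_ids = {el.get("id") for el in root.iter() if el.get("id")}
    # Keep the references whose target is not among the declared ids.
    return [
        cite.get("ref")
        for cite in root.iter("cite")
        if cite.get("ref") not in known_ids
    ]


print(find_dangling_refs(SAMPLE))
```

A check like this could run before a pull request is reviewed, so the maintainer sees unresolved references alongside the proposed change.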
GCN: Would this look different if you’d needed to fix something more substantive?
Tauberer: If we’re talking about substantive changes, this may be a completely different ballgame. The tools that we’re creating certainly can help that, but the exact story probably shouldn’t look the same. We don’t want thousands of people submitting pull requests for the District’s codification lawyer to review when substantive changes really should be going to your representative on the council. Things have to go through a democratic process.
GCN: This isn’t unfamiliar territory for you. You were part of the initial work in 2013 to turn the D.C. Code into XML.
Tauberer: At that time, the council would tell [legal research firm] Westlaw what changes to make, which meant that the council didn’t have a complete digital copy of the code. Westlaw had it and maintained it. [The District] changed to Lexis[Nexis] after I did this work. First, we needed a complete digital copy of the code, so the council asked Westlaw for a copy, and Westlaw emailed over a very large number of Word documents containing the complete content of the D.C. Code.
The second step was to convert all the Word documents into an XML format. The challenge, once you have the data, is keeping it up to date, which is really the biggest challenge in most data issues. You’ve got to keep it going. That’s the main work that the Open Law Library has done since then. It’s been in giving the council the tools to keep those files up to date as fast as possible.
GCN: How does it do that?
Tauberer: The way it used to work is the council would pass a law, and then the lawyers inside the council would decide where in the D.C. Code to put it, and they would work with their past contractor, Lexis, to tell them where to put it. In the new version, the laws are first turned into XML data by the council’s own lawyers so they don’t have to go to an outside publisher to tell them how to update the code. The Open Law Library’s tools are automatically compiling the law into the code. It replaces an old manual process of copying and pasting text.
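Editor's illustration: to make "compiling the law into the code" concrete, here is a minimal sketch of the general idea, not the Open Law Library's actual tooling. It assumes a hypothetical format in which an enacted law lists `amend` elements that replace the text of an existing section and `add` elements that carry new sections; all element names, attributes and sample text are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical codified law before the new act is applied.
CODE = """<code>
  <section id="sec-101"><text>The office shall meet annually.</text></section>
  <section id="sec-102"><text>The director serves a 5-year term.</text></section>
</code>"""

# Hypothetical enacted law expressed as machine-readable changes.
LAW = """<law id="law-1">
  <amend target="sec-101"><text>The office shall meet quarterly.</text></amend>
  <add><section id="sec-103"><text>The office shall publish a report.</text></section></add>
</law>"""


def compile_law(code_xml, law_xml):
    """Apply a law's amendments and additions to the code; return the new tree."""
    code = ET.fromstring(code_xml)
    law = ET.fromstring(law_xml)
    sections = {s.get("id"): s for s in code.iter("section")}

    # Replace the text of each amended section; an unknown target raises
    # KeyError so a bad reference fails loudly instead of being skipped.
    for amend in law.iter("amend"):
        target = sections[amend.get("target")]
        target.find("text").text = amend.find("text").text

    # Append each newly enacted section to the end of the code.
    for new_section in law.findall("add/section"):
        code.append(new_section)

    return code


compiled = compile_law(CODE, LAW)
for section in compiled.iter("section"):
    print(section.get("id"), "->", section.find("text").text)
```

Because the changes are data rather than instructions to a publisher, the same law file can be re-applied or reviewed line by line, which is what replaces the manual copying and pasting described above.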
GCN: How difficult was it for the District to get this up and running?
Tauberer: A lot of effort. The work ranges from technology to keep the code up to date to embedding that work within the workflow that the council goes through. Councilmembers will draft bills in Word. We’re back to Word again, and you can’t escape it and you can’t tell councilmembers to draft in XML -- the councilmembers aren’t going to draft legislation in XML. So a lot of the work has been in creating that bridge between how the laws are actually drafted inside the council and what needs to be done to get that into data and then published as the code.
On top of it, the council is now doing this without the help of Lexis or Westlaw, so the work that Lexis and Westlaw were doing has now been internalized by the council, and that’s a big shift also. It’s both technology as well as process.
GCN: If other municipalities are interested in pursuing this track, they should realize it’s a commitment, right?
Tauberer: For sure -- and not even just across municipalities, but even within the same ones. The D.C. Code is now being done in a modern way, but the code is only one aspect of law in the District. Regulation is another very large aspect of D.C. law, and that’s run by the executive branch of the D.C. government and hasn’t been modernized yet in this way. That would be the next logical step, but it’s a whole different agency that does it, so it would be starting over even within D.C. to do that.
GCN: What pointers can you offer governments that might consider this approach?
Tauberer: The first issue is realizing how modernization and open data can help. It’s always easier to get done when the council or whoever sees this as a win internally. No. 2 is having an interest in improving access to the law for everyone, not just the lawyers who pay for subscription services [such as LexisNexis]. The third thing would be to start talking to folks who are working on this.
The Open Law Library, I’m sure, is happy to talk to any jurisdiction that would want to do that. There are other organizations also working on a similar path. Another is Xcential, and I’ve done contracting work for them. They’re also working on similar tooling for jurisdictions, and they can help figure out whether a jurisdiction is going to be able to do it.