Last week, after Dr. James Billington announced his retirement as the Director of the Library of Congress (LC), I argued that LC is better positioned to digitize books than Google.
Morally and culturally better positioned, that is. Unlike Google, which has an inescapable profit motive behind its book scanning program, LC exists to serve the public interest.
In addition to serving the public interest, a massive monograph scanning project spearheaded by LC would honor Dr. Billington's legacy of spurring digitization efforts. As his official biography notes, "Dr. Billington has championed the Library of Congress’s National Digital Library program, which makes freely available online millions of American historical and cultural documents from the vast collections of the Library and other research institutions on the Library's website at www.loc.gov. These one-of-a-kind American Memory materials and the Library’s other Internet services -- including the Congress.gov congressional database and information from the U.S. Copyright Office -- are widely used in K-12 education."
This is an outstanding legacy, although it must be said that what's been digitized so far is the "low-hanging fruit" of public domain or historic documents. That was the right place to start, and now we can go further. In the Washington Post, Philip Kennicott offers this suggestion for the next Librarian of Congress: "Every word on every page of every public-domain book sitting in the Library of Congress should be available online. That’s a lot to ask, but why ask for less?"
I agree with Kennicott, and would take it even further. The reason he stops at public domain materials, of course, is that digitizing anything still in copyright would lead to protracted legal battles of the kind that Google endured. (I always supported the legality of Google's efforts, if not their cultural legitimacy.) But we must remember that, in the United States, the only items that are squarely in the public domain were published before 1923. Some later materials are in the public domain too, but through exceptions and technicalities rather than as a general rule.
So Kennicott's proposal would not account for a good deal of material published in the last 92 years. This is especially true for items published in recent decades.
The crux of the problem is the nation's antiquated copyright law, which was last substantially revised in 1976. This may have been a bicentennial year, but it was also well before the widespread use of the Web. Our codified notions about how to incentivize authors, and protect their rights, do not jibe with how knowledge and information flow today.
In a nutshell, then, we need to fix copyright. (That's much easier typed than achieved, of course. The current system benefits incumbents such as publishers, so any sweeping changes would be mightily resisted.) And who manages the US Copyright Office? Why, lo and behold, it's a division of the Library of Congress. Ultimately, copyright is the responsibility of the Librarian of Congress.
So we have our mission for whoever President Obama appoints and the Senate confirms as the 14th United States Librarian of Congress: digitize our books, and modernize our copyright law.