Much of the publicity around recent mass-digitization projects focuses on the millions of books they promise to make freely readable online. Because of copyright, though, most of the books provided in full will be of mainly historical interest. Yet much of the richest historical text content is not in books at all, but in the newspapers, magazines, newsletters, and scholarly journals where events are reported firsthand, stories and essays make their debut, research findings are announced and critiqued, and issues of the day are debated. Back runs of many of these serials are available in major research institutions but often in few other places. They have the potential for much more intensive use, by a much wider community, if they are digitized and made generally accessible.
In this talk, we will discuss an inventory of periodical copyright renewals that we have conducted at Penn. We found that the copyrights of the vast majority of mid-20th-century American serials of historical interest were not renewed to their fullest possible extent. The inventory reveals a rich trove of copyright-free, digitizable serial content from major periodicals as late as the 1960s. Drawing on our experience in producing this inventory and in previous registry development, we will also show how low-cost, scalable knowledge bases could be built from it to help libraries more easily identify freely digitizable serial content and collaborate in making it digitally available to the world. Our initial raw inventory can be found at http://onlinebooks.library.upenn.edu/cce/firstperiod.html