The
Library of Congress has just debuted Newspaper Navigator, a
new tool for searching over 1.5 million newspaper images in their collections.
The Library of Congress and their partners have digitized over 16 million pages
of newspapers from across American history as part of their Chronicling America
database. Newspaper Navigator uses machine learning to let users search the
collection both by keyword and through a new "similar
image" search. Newspaper Navigator currently covers roughly 90% of
newspaper images from 1900-1963. Everything accessible through the application
is in the public domain, so all of the images can be used for free.
Researchers
can search for images in the collection in multiple ways. A search can be
limited to a particular time range by year or by the state where a paper was
located. Users can then enter keywords to search for images; Newspaper
Navigator matches them against keywords extracted from the newspaper to identify images. Users can
also run a similar image search with the newly designed tool, which uses
machine learning to identify images that resemble the initial image. Newspaper
Navigator uses machine learning to continually improve its search capabilities. More information
about these search processes can be found here.
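For readers curious how the search options described above fit together, here is a minimal sketch in Python. It is an illustration under stated assumptions, not Newspaper Navigator's actual code: the record fields, the sample data, and the use of cosine similarity over made-up feature vectors are all hypothetical stand-ins for whatever the real tool does internally.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Optional


@dataclass
class NewspaperImage:
    """Hypothetical record; field names are illustrative, not the tool's schema."""
    image_id: str
    year: int
    state: str
    text: str               # words extracted from the page around the image
    embedding: List[float]  # made-up feature vector describing the image


def filter_images(images: List[NewspaperImage], start_year: int, end_year: int,
                  state: Optional[str] = None,
                  keyword: Optional[str] = None) -> List[NewspaperImage]:
    """Keep images in the year range, optionally narrowing by state and keyword."""
    results = []
    for img in images:
        if not (start_year <= img.year <= end_year):
            continue
        if state is not None and img.state != state:
            continue
        if keyword is not None and keyword.lower() not in img.text.lower():
            continue
        results.append(img)
    return results


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_images(query: NewspaperImage, images: List[NewspaperImage],
                   top_k: int = 3) -> List[NewspaperImage]:
    """Rank every other image by similarity to the query image, best first."""
    others = [img for img in images if img.image_id != query.image_id]
    others.sort(key=lambda img: cosine_similarity(query.embedding, img.embedding),
                reverse=True)
    return others[:top_k]


# Invented sample data, for illustration only.
COLLECTION = [
    NewspaperImage("a", 1912, "New York", "Ocean liner leaves harbor", [1.0, 0.0, 0.0]),
    NewspaperImage("b", 1913, "New York", "Steamship at the harbor dock", [0.9, 0.1, 0.0]),
    NewspaperImage("c", 1925, "Ohio", "Baseball team portrait", [0.0, 1.0, 0.0]),
    NewspaperImage("d", 1950, "Ohio", "Political cartoon about taxes", [0.0, 0.0, 1.0]),
]

harbor_hits = filter_images(COLLECTION, 1910, 1920, state="New York", keyword="harbor")
closest = similar_images(COLLECTION[0], COLLECTION, top_k=1)
```

In this toy collection, the filter narrows the results to the two New York harbor images, and the similar image search starting from image "a" ranks image "b" first because their feature vectors point in nearly the same direction.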
Newspaper
Navigator was designed by Ben Lee as part of the Innovator
in Residence Program at the Library of Congress. The program supports “innovative
and creative uses” of the Library’s collections. Past residents also created
applications utilizing the Library’s sound and art collections. Lee was inspired
to apply by an earlier crowd-sourced project with the Library called Beyond Words, in which members of
the public identified keywords for images in the newspaper collections. Newspaper
Navigator builds on that project, using the keyword metadata and adding the
similar image search.
Lee also studied the problem of algorithmic bias and
explored its effects on this system. He used four photographs of W.E.B. Du Bois
as a case study and published a paper discussing the results
here.
Newspaper
Navigator and the code used to build it are all in the public domain. The
developers have also published a white
paper on the website, as well as data sets and other materials
related to the tool, all for public use.
Newspaper
Navigator gives researchers new ways to explore images from across millions of
newspapers and further develops search technologies for all libraries and
databases. Whether you're a legal scholar looking for a relevant political
cartoon, a lecturer looking for that perfect image to illustrate a point in a
slide show, or a student trying to understand a moment in American history,
Newspaper Navigator provides new ways of accessing some of the Library of
Congress's core digital collections.
Posted by Ellie Campbell on Thu. September 24, 2020 3:00 PM