NIST Corner: The National Software Reference Library
Written by Barbara Guttman   

HUNDREDS OF MILLIONS of us, lawbreakers included, are leaving digital footprints in our wake. Computers, smartphones, GPS, gaming, and other devices record a nearly continuous stream of our daily activities, whether a coffee purchase, a text message, or a document download. Over the past two decades, forensic investigators have developed the expertise to use these footprints to solve crimes. And, thanks to NIST, the investigation of cybercrimes is much faster than would otherwise be possible.

In 1999, the NIST Information Technology Laboratory began a project that has grown into the National Software Reference Library (NSRL). Computer forensic examiners use the NSRL to identify software on a seized computer, smartphone, or other digital device.

“We know that the NSRL is used daily in every computer forensic lab in the United States,” said Douglas White, project leader for the NSRL in the Software and Systems Division of the NIST Information Technology Lab. “About 20,000 copies of each NSRL quarterly update release are downloaded from our website. We encourage users to redistribute it. The FBI, for example, redistributes it to every field office with digital forensics capability.”

Special Agent Edward Labarge with the US Army Computer Crimes Investigative Unit at Quantico, Va., conducts investigative research into a suspected network of computer hackers intent on accessing a restricted Army network. Photo: US Army

Investigators rely on the NSRL either to exclude known files from examination or to look for files of interest. In a child exploitation case, an examiner can exclude the thousands of images that come with Microsoft Office and instead concentrate on unknown files that may contain illicit material. In a hacking case, the examiner might be interested in hacking software found on the drive. Either way, the NSRL saves investigators from having to examine every file.

The NSRL has four parts: a library of software; a database of information about that software; the Reference Data Set, which is extracted from the database and is the product that law enforcement uses; and a research environment that helps the community develop new and better ways of identifying software.

The NSRL currently contains more than 19,000 software packages and 100 million file fingerprints. The fingerprints are based on an advanced cryptographic technique (cryptographic hashing) and work much like real fingerprints: each one uniquely identifies a file, but the file cannot be re-created from the fingerprint.
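As a rough illustration of how such fingerprints support the data reduction described earlier, the sketch below hashes each file and sorts it into "known" or "unknown" by checking the result against a set of reference fingerprints. It is a minimal sketch under stated assumptions, not NIST's tooling: the article does not name the hash algorithm, so SHA-1 is assumed here, and the function names `file_fingerprint` and `partition_by_known` are hypothetical.

```python
import hashlib


def file_fingerprint(path, algorithm="sha1", chunk_size=1 << 20):
    """Return the uppercase hex digest of a file.

    The algorithm defaults to SHA-1 as an assumption; the file is read
    in chunks so that large files do not have to fit in memory.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def partition_by_known(paths, known_fingerprints):
    """Split paths into (known, unknown) lists.

    A file counts as known when its fingerprint appears in
    known_fingerprints, a set of uppercase hex digests standing in
    for a reference data set (hypothetical input format).
    """
    known, unknown = [], []
    for path in paths:
        if file_fingerprint(path) in known_fingerprints:
            known.append(path)
        else:
            unknown.append(path)
    return known, unknown
```

An examiner-style workflow would pass every file path on a drive image to `partition_by_known` and then review only the second list. Note that the digest reveals nothing about the file's contents, which matches the one-way property described above.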

NIST has published the NSRL Reference Data Set each quarter since the fall of 2001 as NIST Special Database 28. The Reference Data Set is available via subscription and download. It can be freely redistributed. Users can verify that their copies are correct via the NSRL website.

A range of uses

Although the NSRL Reference Data Set was designed primarily to serve the computer forensics community, computer security experts and cultural heritage preservationists find it useful as well.

  • Computer forensics spans criminal cases involving computers, such as child pornography and hacking, as well as intelligence work and civil and corporate disputes. All of these users need efficient data reduction and the ability to determine what software is, or has been, run on a computer system. For example, in a civil case, it could be important to know whether a disk wiper had been run on a system, possibly violating an order to preserve evidence.
  • The computer security community uses the NSRL to find at-risk software installed on a computer, to validate files that are known to be safe, and for other aspects of software package management.
  • The third major user of the NSRL is the cultural heritage community, consisting mainly of libraries and archives. NSRL preservation strategies—originally employed to meet the need to preserve evidence—are being adopted by the cultural heritage community to preserve software. NIST and Stanford University Library formed a partnership earlier this year to catalog the data contained in about 15,000 software releases from the early days of microcomputing, many of which are game titles. The project will help give software its place in culture and will expand the NSRL.

For More Information

Visit the NSRL website
Sign up for a subscription

About the Author

Barbara Guttman is the Manager of the Component Software Group in NIST’s Information Technology Lab (ITL). Her areas of responsibility include software assurance and computer forensics. In computer forensics, her group runs the National Software Reference Library, the Computer Forensics Tool Testing Project, and the Computer Forensics Reference Data Sets. In software assurance, her group runs the Software Assurance Metrics and Tool Evaluation (SAMATE) project, including the Static Analysis Tool Exposition and the SAMATE Reference Data Set. Prior to joining the Software Components Group, she served as Associate Director of ITL and as Senior Program Analyst to the NIST Director, and she worked in computer security and federal information policy.
