NIST Publishes Sequence-Based Gene Frequencies for Forensic DNA Markers
Written by Rich Press   

NIST HAS PUBLISHED STATISTICAL DATA that will pave the way for crime laboratories to use next generation sequencing (NGS) to create forensic DNA profiles. The data includes genetic sequences that occur at each of 27 genetic markers and estimates of how frequently each sequence occurs in the population. This research, which was jointly funded by NIST and the FBI, was published in Forensic Science International: Genetics.

“The data we’ve published will make it possible for labs that use NGS to generate match statistics when analyzing DNA profiles,” said Katherine Gettings, the NIST biologist who led the study. Those statistics describe how closely two profiles match and allow a jury to take the strength of that match into account when deciding guilt or innocence.

To generate a DNA profile, forensic labs analyze genetic markers where a short series of base pairs repeats. Those segments are called short tandem repeats, or STRs, and the number of repeats at each marker varies from person to person. Currently, the analyst doesn’t sequence those markers, but determines the number of repeats at each one, producing a series of numbers that can be used to identify an individual.

When STR-based profiling was developed in the 1990s, genetic sequencing was expensive and impractical. Today, NGS makes sequencing cost-effective for biomedical research and other applications. NGS can also be used to create forensic DNA profiles that, unlike traditional STR profiles, include the actual genetic sequence inside the markers. That provides a lot more data.
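
To see why the sequence inside a marker can carry more information than a repeat count, consider the rough Python sketch below. The two allele sequences, the four-base repeat unit, and the helper name `length_based_call` are made up for illustration and do not come from the NIST data. Both alleles contain ten four-base units, so a count-only method would report the same number for each, yet their sequences differ.

```python
# Assumption: a four-base repeat unit, chosen only for illustration.
MOTIF_LENGTH = 4


def length_based_call(allele: str) -> int:
    """Report an allele the traditional way: as a count of repeat units."""
    return len(allele) // MOTIF_LENGTH


# Hypothetical alleles: same overall length, different internal sequence.
allele_a = "TCTA" * 10
allele_b = "TCTA" * 4 + "TCTG" + "TCTA" * 5

same_length_call = length_based_call(allele_a) == length_based_call(allele_b)
same_sequence = allele_a == allele_b

print(length_based_call(allele_a), length_based_call(allele_b))  # 10 10
print(same_length_call, same_sequence)  # True False
```

In this toy case, a count-based profile cannot tell the two alleles apart, while a sequence-based profile can.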

For most cases, STR-based profiles contain more than enough information to reliably identify a suspect. However, in cases that involve degraded DNA or trace amounts of DNA, STR-based profiles might not be enough to establish an identity. In those cases, the extra data in an NGS-based profile might help solve the case.

In addition, evidence that contains a mixture of DNA from several people can be difficult to interpret. NGS-based profiles might help analysts deconvolute those mixtures.

NIST scientists have published statistical data needed for forensic DNA profiling based on a technology called Next Generation Sequencing. To do that, they sequenced forensic DNA markers for a sample population. The letters A, G, T, and C represent the building blocks of DNA. Illustration: K. Irvine/NIST

DNA analysts are able to calculate match statistics for STR-based profiles because scientists have measured how frequently different versions of the markers occur in the population. With those frequencies, you can calculate the chances of randomly encountering a particular DNA profile, just as you can calculate the chances of picking all the right numbers in a lottery.
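
The lottery comparison can be sketched in a few lines of Python. This is a simplified textbook model, not necessarily the calculation used in the study or in casework: it assumes the two copies of each marker are inherited independently (Hardy-Weinberg proportions) and that the markers are independent of one another. The allele frequencies and the function name `genotype_frequency` are hypothetical.

```python
from math import prod


def genotype_frequency(p: float, q: float, homozygous: bool) -> float:
    """Expected genotype frequency at one marker under Hardy-Weinberg.

    p and q are the population frequencies of the two alleles observed.
    A homozygous genotype (same allele twice) has frequency p * p;
    a heterozygous genotype has frequency 2 * p * q.
    """
    return p * p if homozygous else 2 * p * q


# Hypothetical three-marker profile: (frequency 1, frequency 2, homozygous?)
profile = [
    (0.10, 0.20, False),
    (0.15, 0.15, True),
    (0.05, 0.30, False),
]

# Multiply the per-marker genotype frequencies, as with lottery numbers.
random_match_probability = prod(
    genotype_frequency(p, q, homozygous) for p, q, homozygous in profile
)

print(f"{random_match_probability:.2e}")  # 2.70e-05
```

Each added marker multiplies in another small number, which is why profiles built from many markers are so rare in the population.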

NIST measured those STR gene frequencies years ago using a library of DNA samples from 1,036 individuals. To calculate gene frequencies for NGS-based profiles, Gettings and her co-authors sequenced the markers in those same samples, which were anonymized and donated by people who consented to their DNA being used for research. The researchers sequenced 27 autosomal markers—the core set of 20 included in most DNA profiles in the U.S. plus seven others. They then calculated the frequencies for the various genetic sequences found at each marker.
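
In its simplest form, estimating a frequency means counting how often each sequence appears among all the copies of a marker that were examined. The sketch below shows that counting step only. The bracketed allele labels and the tiny four-observation sample are invented for illustration, and the study itself may have used additional statistical steps not described here.

```python
from collections import Counter


def allele_frequencies(observed: list[str]) -> dict[str, float]:
    """Estimate each allele's frequency as its share of all observed copies."""
    counts = Counter(observed)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


# Hypothetical observations at one marker: two people, two copies each.
observed = [
    "[TCTA]10",
    "[TCTA]10",
    "[TCTA]4 TCTG [TCTA]5",
    "[TCTA]11",
]

freqs = allele_frequencies(observed)
for allele, freq in freqs.items():
    print(allele, freq)
```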

It might be surprising that scientists can estimate gene frequencies from such a small library of samples. However, the NIST team was measuring frequencies not for the full profiles, but for the individual markers. Since they sequenced 27 markers, with each marker occurring twice per sample, the number of markers tested wasn’t 1,036, but more than 55,000.
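
The arithmetic behind that figure is easy to check. The short Python sketch below uses only the numbers given in the article; the variable names are our own.

```python
MARKERS = 27            # autosomal markers sequenced
COPIES_PER_PERSON = 2   # each marker occurs twice per sample
INDIVIDUALS = 1036      # people in the NIST sample library

# Total marker copies sequenced across the whole library.
observations = MARKERS * COPIES_PER_PERSON * INDIVIDUALS

# Copies of any single marker available for estimating its frequencies.
per_marker = COPIES_PER_PERSON * INDIVIDUALS

print(observations)  # 55944
print(per_marker)    # 2072
```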

Although NIST has now published the data needed to generate match statistics for NGS-based profiles, other hurdles must still be cleared before the new technology sees widespread use in forensics. For instance, labs will have to develop systems for managing the greater amounts of data produced by NGS. They will also have to implement operating procedures and quality controls for the new technology. Still, while much work remains, said Peter Vallone, the research chemist who leads NIST’s forensic genetics research, “We’re laying the foundation for the future.”

About the Author

Rich Press is a writer with NIST.


K. B. Gettings, L. A. Borsuk, C. R. Steffen, K. M. Kiesler, P. M. Vallone. U.S. Population Sequence Data for 27 Autosomal STR Loci. Forensic Science International: Genetics. Published online 19 July 2018. DOI: 10.1016/j.fsigen.2018.07.013

This article appeared in the Winter 2018 issue of Evidence Technology Magazine.