INTRODUCTION — Technologies for sequencing DNA have improved dramatically, to the point that it has become practical to sequence an individual's entire genome. Next-generation sequencing (NGS) is a type of DNA sequencing technology that uses parallel sequencing of multiple small fragments of DNA to determine sequence. This "high-throughput" technology has allowed a dramatic increase in the speed (and a decrease in the cost) at which an individual's genome can be sequenced.
The ability to sequence an entire genome raises several challenging questions for the clinician, including the following:
●When should NGS be considered clinically?
●What is the best choice among several types of genetic testing available?
●What is the clinical significance of findings from sequencing of an entire genome?
●Which findings should be acted upon and/or conveyed to the patient?
This topic will review basic concepts and clinical uses of next-generation DNA sequencing (NGS). Genetic testing, counseling, and reporting of incidental findings from genome sequencing are discussed separately. (See "Genetic testing" and "Secondary findings from genetic testing".)
Alternative methods for evaluating genetic and genomic disorders such as conventional sequencing, polymerase chain reaction (PCR), cytogenetics and fluorescence in situ hybridization (FISH), and gene expression profiling are also presented separately. (See "Overview of pharmacogenomics" and "Tools for genetics and genomics: Cytogenetics and molecular genetics" and "Tools for genetics and genomics: Gene expression profiling" and "Personalized medicine".)
TERMINOLOGY AND EVOLUTION OF TECHNOLOGIES — A large investment has been made in improving DNA sequencing technologies, to make them cheaper, faster, and more accurate. The following terms are used to distinguish sequencing methodologies:
●Sanger sequencing – Manual or automated sequencing using the methods developed by Sanger, Maxam, and Gilbert is considered the "first" generation of DNA sequencing methods. These types of sequencing are also commonly referred to as "conventional" or "traditional" sequencing. Sanger sequencing determines the sequence of large DNA fragments (up to approximately 500 to 900 bases), by collecting and aligning a series of different length products polymerized along the DNA template. The original Sanger method used radioactive markers for each nucleotide, and later adaptations have used fluorescently tagged versions.
Sanger sequencing is used clinically when the sequence of a specific gene is being tested. As an example, conventional sequencing could be used to identify a mutation in factor IX in a patient with suspected hemophilia B, without examining the rest of the individual's genome. Sanger sequencing is preferred in this setting because mutations in the gene encoding factor IX are already known to cause the disease; gene mutations can be correlated with laboratory markers of the disease (eg, activated partial thromboplastin time [aPTT]); and mutations in other genes are exceedingly unlikely to cause hemophilia B. (See "Genetics of hemophilia A and B".)
In contrast, Sanger sequencing often cannot provide information about large portions of the genome (eg, multiple genes) at a practical cost and within a reasonable timeframe. One estimate predicted that sequencing an entire human genome using Sanger sequencing would take 60 years.
●Next-generation sequencing (NGS) – Next-generation sequencing (next-gen sequencing; NGS) uses sequencing of multiple DNA fragments, performed in parallel. This technology is also referred to by other terms (eg, short-read sequencing, high-throughput sequencing, deep sequencing, second-generation sequencing). In contrast to Sanger sequencing, the speed of sequencing and amounts of DNA sequence data generated with NGS are exponentially greater, and are produced at significantly reduced costs.
Several "platforms" (ie, sequencing instruments and associated reagents) for NGS have been developed. Across NGS platforms, there is typically a sample preparation or "library preparation" step in which the patient's DNA, which serves as the template, is purified, amplified, and fragmented, followed by physical isolation of DNA fragments by attachment to solid surfaces or small beads. Sequence data are generated on these small fragments, and the electronic results are computationally aligned against a "reference" genome or sequence (ie, a previously sequenced genome designated as a "normal" reference). An updated version of a commonly used human reference genome, GRCh38, was released by the National Center for Biotechnology Information (NCBI) on December 24, 2013.
NGS is reserved for clinical scenarios in which it is considered useful to determine the sequences of multiple genes. (See 'Clinical use of next-generation sequencing' below.)
●Third-generation sequencing – Third-generation sequencing (also known as long-read sequencing) uses parallel sequencing similar to NGS; however, third-generation sequencing uses single DNA molecules rather than amplified DNA as a template. Thus, third-generation sequencing potentially eliminates errors in DNA sequence introduced in the laboratory during the DNA amplification process.
Third-generation methods are under development and generally are not clinically available. The technology of this approach is nearing the accuracy of short-read technology, which is used in NGS and is considered the clinical NGS gold standard. Third-generation sequencing allows for sequencing some of the most challenging areas of the genome (challenging due to the high percentage of guanine and cytosine bases, referred to as "GC content") and for providing better analysis of structural variation in the genome.
Additional terminology used to describe a variety of genetic and genomic concepts is presented separately. (See "Genetics: Glossary of terms".)
In addition to evolving technical innovations, methods to optimize data interpretation continue to be studied, as large amounts of sequence data are becoming available before the clinical implications are known. (See 'Interpretation' below.)
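The alignment step described above for NGS platforms, in which short fragment sequences ("reads") are placed against a reference sequence, can be illustrated with a deliberately simplified sketch. This is a toy example only: it assumes exact string matching, whereas clinical aligners tolerate mismatches and gaps and use indexed data structures. The function name `align_reads` and the sequences shown are hypothetical and are not taken from any sequencing platform or software package.

```python
def align_reads(reference: str, reads: list[str]) -> dict[str, list[int]]:
    """Return, for each read, every 0-based reference position where it matches exactly.

    Toy illustration only: real aligners allow mismatches, insertions,
    and deletions, and do not scan the reference linearly.
    """
    positions: dict[str, list[int]] = {}
    for read in reads:
        hits: list[int] = []
        start = reference.find(read)
        while start != -1:
            hits.append(start)
            # Continue searching one base further along to find overlapping hits.
            start = reference.find(read, start + 1)
        positions[read] = hits
    return positions


# Hypothetical reference and reads, chosen only for illustration.
reference = "ACGTACGTTAGC"
reads = ["ACGT", "TTAG", "GGGG"]
alignments = align_reads(reference, reads)
```

In this sketch, a read that matches more than one position (as a sequence from a repetitive region might) returns several candidate locations, and a read with no exact match returns an empty list. Real fragments can present the same two situations, ambiguous placement and failure to place, although production aligners handle them very differently.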
Source of DNA — The starting material that provides a template for clinical NGS is double-stranded nuclear DNA. This can be obtained from a variety of cell types. In some cases, the DNA is further modified in the laboratory (eg, to remove non-coding regions when exome sequencing is performed). (See 'Whole genome, exome, or gene panel' below.)
DNA extracted from leukocytes in whole blood is a sterile source of DNA used for most clinical testing. When DNA is derived from non-sterile sources (eg, buccal swab, saliva sample), there is a potential risk for DNA sequence from bacterial flora to be confused for host sequence. However, non-sterile sources of DNA have the advantage of not requiring a blood draw, and less-invasive means for collecting DNA are increasingly used in large-scale genomic research studies. If needed, DNA can also be extracted from fixed tissues (eg, from formalin-fixed, paraffin-embedded pathology samples).
For patients who have undergone hematopoietic cell transplantation, leukocyte DNA from a blood sample is likely to contain donor rather than recipient genetic sequences. In this setting, DNA from a buccal swab, saliva, or hair follicles can be used [4,5]. Hair follicles are likely to be the most reliable source, because buccal and saliva samples have been reported to show chimerism for donor and recipient sequences.
Starting material DNA must be amplified prior to NGS to run the sequencing reactions. Amplification involves exponential in vitro replication of patient DNA using PCR-based methods. While the DNA polymerase rarely makes mistakes (eg, substituting the wrong nucleotide), the amplification step has the potential to introduce errors in sequence not present in the original DNA from the patient.
NGS methods can also be used to sequence RNA and mitochondrial DNA; however, the majority of known human disease genes reside in nuclear DNA; hence these sources of material are generally reserved for research studies or diagnosing specific rare disorders.
For research purposes, it may be informative to compare genome sequences using normal and diseased tissue (eg, a tumor) from the same individual. Such comparisons can help determine the timing during which mutations are acquired during cancer development, allowing an estimation of their relative contributions to carcinogenesis.
Whole genome, exome, or gene panel — NGS can be used to sequence every nucleotide in an individual's DNA (ie, the whole genome), or limited to smaller portions of the genome such as the exome or a preselected subset of genes.
●Whole genome sequencing – Whole genome sequencing is costlier than more limited sequencing, because the whole genome is equivalent to approximately 3.3 x 10^9 bases (3.3 gigabases [Gb]). Whole genome sequencing may become preferable to exome sequencing as cost decreases and more information about the role of non-coding DNA in human disease becomes available. Further, whole genome sequencing can be used to detect deletions and duplications typically detected only by array comparative genomic hybridization (aCGH) and thus is an opportunity to reduce the need for other supplemental testing.
●Exome sequencing – The exome contains the portions of genes that encode proteins; it represents only 1.5 to 2.0 percent of the genome (ie, about 30 megabases [Mb]). The remaining (non-exomic) DNA consists of introns and regulatory regions that control other aspects of gene function such as splicing and gene expression levels.
Exome sequencing is a reasonable approach for some clinical situations, because over 85 percent of known disease-causing mutations are found in exons. This approach substantially reduces cost and data storage requirements compared with whole genome sequencing. Exome sequencing also simplifies clinical reporting, because the significance of variants in exons is easier to interpret in most cases.
The main disadvantage of exome sequencing compared with whole genome sequencing is that exome sequencing could potentially miss a pathogenic variant(s) in a non-coding region of the genome. Thus, whole genome sequencing may be used in selected cases if initial exome sequencing is not diagnostic.
●Targeted gene panels – Gene panels provide sequence data for a limited subset of genes (typically 10 to 200 genes). Targeted gene panels are used in settings in which it would be appropriate to sequence many genes to make a diagnosis (eg, a Mendelian disorder for which the number of candidate genes is too large for traditional Sanger sequencing). As an example, targeted gene panels can be used to determine the genetic basis for hereditary early-onset and/or familial nonsyndromic hearing loss. Potential etiologies include pathogenic variants in over 60 genes; and screening for all of these individually would be impractical using traditional sequencing methods. Gene panels are becoming increasingly large (eg, >1000 genes), particularly for neurologic indications such as ataxia and autism spectrum disorders.
Targeted gene panels are also increasingly being used to evaluate potential hereditary cancer syndromes such as hereditary breast and ovarian cancer (HBOC) or hereditary colon cancer. In many cases, when testing of a single gene or a small number of genes is ordered, the laboratory will test the entire panel but only report on the genes that were ordered. In a series of 360 unselected patients with ovarian or fallopian tube cancer who were tested for abnormalities in a panel of tumor suppressor genes, 82 of the patients (23 percent) carried germline loss-of-function mutations in 1 of the 21 genes tested. Of these 82 women, 25 (31 percent) had no family history of breast or ovarian cancer. In our experience, strength of the family history was not a reliable predictor for finding a pathogenic variant in a gene on further testing after initial BRCA1 and BRCA2 testing was negative. Specific recommendations for HBOC are presented in detail separately. (See "Genetic testing and management of individuals at risk of hereditary breast and ovarian cancer syndromes".)
Targeted gene panels may be preferable to exome sequencing due to the considerable cost advantage, the lower likelihood of identifying variants of unknown significance that are unrelated to the disease being evaluated, and a higher depth of coverage (number of independent "reads" of a region of DNA sequence) compared with whole exome sequencing. In practical terms, depth of coverage indicates how many times the DNA sequence is "proofread." Depth of coverage is important in helping to distinguish true SNPs or variation in the genome from errors introduced from the sequencing process. The utility of targeted gene panels over Sanger sequencing continues to increase as costs decline and as these gene panels offer a wide enough net of genes to assay for many conditions.
The number of available panels also continues to increase, with validated panels available for hereditary cardiomyopathies, inherited cancer syndromes, lung diseases, ciliopathies, and other disorders in which molecular diagnosis is better facilitated by sequencing multiple genes known to be causative in the majority of cases. Additionally, the content of many disease-specific gene panels has been expanding to include genes that account for a smaller proportion of positive cases, and genes that have less available information on their mutation spectrum such as clinical outcomes and population-based data.
A review of expanded gene panels in the testing for HBOC susceptibility noted that these panels may include genes for which the disease association and the degree of increased risk is unclear. Consequently, some expanded gene panels may reveal variants that create dilemmas for medical management (eg, the appropriateness of prophylactic mastectomy may be substantially lower for a variant in a gene other than in BRCA1 or BRCA2). The use of expanded panels may also increase the number of variants of unknown significance (VUS), for which data are very limited regarding the impact of mastectomy on clinical outcomes. Expanded gene panels are likely to be least beneficial in low-risk groups such as females in the general population, for whom some experts suggest offering BRCA1 and BRCA2 screening.
For these reasons, caution should be exercised in choosing and interpreting results from expanded gene panels. The same expanded gene panel issues seen in HBOC testing (ie, inclusion of genes with limited data on the pathogenicity spectrum and/or questionable disease association) can be seen in other commonly-offered expanded gene panels, such as those for reproductive carrier screening, epilepsy, cardiomyopathy, and autism.
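The concept of depth of coverage discussed above (the number of independent reads covering a given base) can be made concrete with a short sketch. The first function counts how many reads overlap each position of a small region; the second applies the simple relationship that mean depth equals total sequenced bases divided by target size. The function names, read coordinates, and the 3.3 Gb sequencing output are illustrative assumptions. The genome and exome sizes are the approximate figures given in this topic, and the calculation ignores capture efficiency, duplicate reads, and uneven coverage.

```python
def per_base_depth(region_length: int, read_alignments: list[tuple[int, int]]) -> list[int]:
    """Count reads covering each position of a region.

    read_alignments holds (start, length) pairs with 0-based, non-negative
    starts; bases extending beyond the region are ignored.
    """
    depth = [0] * region_length
    for start, length in read_alignments:
        for pos in range(start, min(start + length, region_length)):
            depth[pos] += 1
    return depth


def mean_depth(total_sequenced_bases: float, target_size_bases: float) -> float:
    """Idealized mean depth: total bases sequenced divided by target size."""
    return total_sequenced_bases / target_size_bases


# Three hypothetical reads aligned to a 10-base region.
depth = per_base_depth(10, [(0, 4), (2, 4), (3, 5)])

# Same hypothetical sequencing output (3.3 Gb) spread over different targets.
GENOME_BASES = 3.3e9   # approximate whole genome size cited in this topic
EXOME_BASES = 30e6     # approximate exome size cited in this topic
genome_mean = mean_depth(3.3e9, GENOME_BASES)
exome_mean = mean_depth(3.3e9, EXOME_BASES)
```

Under these idealized assumptions, the same amount of sequencing that reads each base of the genome about once would read each base of the exome roughly 100 times, which is one way to see why smaller targets such as gene panels can achieve higher depth of coverage.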
Accuracy — In general, one can expect usable data (adequate depth of coverage to provide confidence in the results) for over 92 to 95 percent of NGS sequence for whole genome and exome sequencing. Accuracy for targeted NGS gene panels is higher, since sequencing a smaller region of the genome allows for a greater degree of probe-template overlap (also called "probe tiling"). Sanger sequencing remains the "gold standard" for diagnosis based on gene sequencing, with >99.99 percent accuracy reported for most genes sequenced.
Clinical laboratories (ie, those with Clinical Laboratory Improvement Amendments [CLIA] certification) generally perform Sanger sequencing to confirm any variant reported back to the ordering clinician as pathogenic, because of the greater accuracy of Sanger sequencing [12,13]. However, given the continued improvement of NGS-based technology, the necessity of performing secondary validation with Sanger sequencing is being challenged. (See 'Interpretation' below.)
Additional methods are used to limit the number of variants prioritized for validation and reporting, including the use of stringent bioinformatic filters and targeted sequencing in affected family members.
As noted below, re-analysis of prior sequencing data sometimes reveals a previously unrecognized diagnosis, such as reclassification of a variant of uncertain significance as either benign or pathogenic (see 'Adults' below). Many clinical laboratories offer a one-time "re-analysis of data" at a later time when the initial analysis of whole exome sequencing does not yield a diagnosis. Re-analysis of a whole exome or a genome sequence based on new insights into the human genome has the potential to impact care if it can be scaled.
Interpretation — Clinical interpretation of data from NGS is more challenging than interpretation of traditional sequencing, because NGS provides data regarding "variants" in multiple genes, many of which are unexpected, outside the gene(s) of interest, and of unknown prognostic significance .
Results of NGS are generally reported as one of the following:
●"Pathogenic" – Pathogenic variants are variants previously reported in patients with disease and/or are strongly suspected of being pathogenic based on preclinical studies.
●"Likely pathogenic" – Likely pathogenic variants are those with sequence features that are likely to be implicated in disease pathogenesis but for which conclusive evidence of pathogenicity is not available.
●"Likely benign" – Likely benign variants are those for which weak data in the medical literature supporting pathogenicity may be available, but for which the majority of evidence suggests the effect of the variant is benign.
●"Benign" – Benign findings are genetic variants not predicted to alter gene expression or function.
●"Uncertain clinical significance" – Variants of uncertain significance (VUS; also called "variant of unknown significance" or "finding of unknown clinical significance") are variants that have some features suggestive of possible functional consequence, but for which there is insufficient evidence for either a pathogenic or benign role.
The clinician and patient must be aware that the likelihood of receiving a result of VUS is reasonably high because the clinical implications of many variants are unknown. As an example, a review on exome sequencing noted that an average sequence could identify over three million variants, of which the clinical implications might be uncharacterized for approximately 15 to 20 percent. Thus, clinicians ordering NGS testing and providing results of NGS to patients should receive proper training in how to discuss these issues. (See "Secondary findings from genetic testing" and "Genetic counseling: Family history interpretation and risk assessment".)
Clinical NGS laboratories generally will be in agreement on their interpretation of a variant as pathogenic or benign. In contrast, significant laboratory-to-laboratory variability exists for the other categories. Consistency across laboratories continues to improve as more data (especially clinical outcomes data) become available for any given variant.
The challenges in interpreting NGS data are illustrated by the following reports:
●A whole genome sequence analysis of an otherwise healthy individual with Charcot-Marie-Tooth (CMT) disease, in addition to facilitating the identification of the CMT-causing mutations, identified 159 mutations with known disease (or trait) associations, including 33 variants associated with cancers, 48 associated with complex diseases, and 21 associated with Mendelian (monogenic) diseases. In the absence of significant family history for most of these conditions, the interpretation and management of these variants are unclear. However, longitudinal follow-up of the patient may be helpful as new data become available.
●A whole genome sequence analysis in a healthy 40-year-old man with a family history of coronary artery disease and sudden death in one of 27 relatives identified numerous variants associated with both rare and common conditions, including three mutations previously associated with sudden cardiac death. Predictive models were applied for 32 conditions, with probability estimates suggesting substantially increased risk for the development of obesity, myocardial infarction (probability increased from 2.0 percent pre-testing to 8.9 percent post-testing), type 2 diabetes (probability increased from 27 to 54 percent), prostate cancer (probability increased from 16 to 23 percent), and depression, and a reduced risk for Alzheimer's disease (probability reduced from 9 to 1 percent). Based upon these results, in conjunction with the patient's strong family history of atherosclerosis, the patient's physician advised initiating lipid-lowering therapies despite a lipid profile for which treatment would not otherwise be recommended by established guidelines. Given the anecdotal nature of this report and the lack of prospective data, it is unclear whether any gains were achieved from the availability of the sequence data.
These cases illustrate the importance of discussing the likelihood of identifying unsuspected variants or variants of unknown significance prior to ordering testing, and the importance of having genetic counselors and other experienced subspecialists available to guide the patient once such variants are discovered. (See "Secondary findings from genetic testing" and "Genetic testing", section on 'Obtaining informed consent'.)
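The pre-test and post-test probabilities quoted in the second report above can be related using the odds form of Bayes' theorem, in which post-test odds equal pre-test odds multiplied by a likelihood ratio. The sketch below back-calculates the likelihood ratio implied by each pair of probabilities. This is an illustration only: the predictive models actually used in that report are not described here, and the function names are hypothetical.

```python
def odds(probability: float) -> float:
    """Convert a probability (0 < p < 1) to odds."""
    return probability / (1.0 - probability)


def implied_likelihood_ratio(pre_test: float, post_test: float) -> float:
    """Likelihood ratio that would move pre_test to post_test (odds form of Bayes' theorem)."""
    return odds(post_test) / odds(pre_test)


def post_test_probability(pre_test: float, likelihood_ratio: float) -> float:
    """Apply a likelihood ratio to a pre-test probability."""
    post_odds = odds(pre_test) * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Probability pairs quoted in the report (pre-test, post-test).
reported = {
    "myocardial infarction": (0.020, 0.089),
    "type 2 diabetes": (0.27, 0.54),
    "prostate cancer": (0.16, 0.23),
    "Alzheimer's disease": (0.09, 0.01),
}

implied = {
    condition: implied_likelihood_ratio(pre, post)
    for condition, (pre, post) in reported.items()
}
```

A ratio above 1 corresponds to increased estimated risk and a ratio below 1 to reduced risk. Expressed this way, the relative shift for myocardial infarction is larger than that for type 2 diabetes, even though the absolute post-test probability of diabetes is far higher, which illustrates why both relative and absolute risk need to be considered when counseling patients.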
Variant databases — The initial interpretation of the significance of a gene variant is performed by the genetics laboratory, and discussions of the implications generally are conducted by individuals with expertise in genetic counseling. However, it may be helpful (and important) for the ordering clinician to understand how pathogenicity is assessed, and in some cases to critically re-assess the interpretation. This is especially true in settings in which new information becomes available or when unique patient information (eg, personal or family history of disease) or values and preferences may alter clinical decision-making.
Interpretation of individual gene variants or mutations requires bioinformatics expertise. The process starts by checking the quality of the sequencing data. Sequence must be aligned to a normal reference genome. The interpretation of pathogenicity of gene variants incorporates several components:
●The degree of evolutionary conservation at the amino acid position, indicating a residue critical for protein function.
●The charge and hydrophobicity of the amino acid, indicating whether it is likely to be exposed on the protein surface or buried inside the protein.
●Information on clinical and disease manifestations associated with that specific variant from various databases.
A number of databases are available, including the National Institutes of Health (NIH) National Center for Biotechnology Information (NCBI) ClinVar effort (http://www.ncbi.nlm.nih.gov/clinvar/).
ClinVar's mission is to foster communication and access to genomic information to establish evidence-based relationships between human disease and genomic variation. It aggregates data from multiple sources (eg, research and clinical laboratories, published literature, expert panels) to help resolve conflicting interpretations and provide the transparency and consistency necessary for variant calling and interpretation. Of note, however, none of the databases, including ClinVar, are completely comprehensive in gene and variant content.
Furthermore, variant assessment regarding pathogenicity is submitter-dependent, and conflicting assessments among multiple submitters are not uncommon. Another challenge in variant interpretation is the use of inconsistent nomenclature for variants across databases and published literature, resulting from the use of different gene transcripts (which are not always specified). Furthermore, variant assessments are not consistently shared among genetic testing laboratories; as a result, important information that could benefit patients can become siloed (ie, retained within the confines of individual laboratories rather than shared).
Ongoing efforts in ClinVar, ClinGen, Genome Connect, and other groups are actively addressing these issues. Somatic variant databases in oncology are discussed below. (See 'Cancer screening and management' below.)
Even with the creation of large, publicly available genomic databases and guidelines for assigning pathogenicity, interpreting genomic data and determining whether a change in an individual's DNA is pathogenic or normal human variation remains a challenge. Several studies have shown that concordance of variant calling by laboratories can be low, even among highly respected laboratories with significant expertise in the area being examined. There are multidisciplinary efforts underway to help bring more consistency to variant calling [20,21]. This subject is discussed in more detail separately. (See "Secondary findings from genetic testing".)
CLINICAL USE OF NEXT-GENERATION SEQUENCING
Risks and benefits — NGS is not always the most appropriate clinical genetic test. It is expensive, time-consuming, and often unnecessary for diagnosing genetic conditions for which the clinical evaluation has limited potential candidates to one or a few genes amenable to Sanger sequencing or other more traditional methods of detecting genetic defects.
However, it is appropriate to consider exome sequencing or targeted NGS gene panels when a large number of pathogenic genes need to be screened [22,23]. Similarly, exome sequencing or whole genome sequencing should be considered when a condition demonstrates high heritability in a family or is suspected to have a genetic basis, but the number of potential candidate genes is large, or responsible gene(s) are unknown.
Potential benefits of NGS were demonstrated in the 100,000 Genomes Project, in which whole genomes were sequenced in 4660 individuals with undiagnosed conditions (2183 probands plus 2477 affected and unaffected family members). The most common indications included neurologic disorders, ophthalmologic conditions, and tumor syndromes. Approximately three-fourths of the probands were adults. A genetic diagnosis was made in 25 percent of probands, and a variant of uncertain significance was identified in 10 percent. The diagnostic yield was higher for monogenic than for complex disorders (35 versus 11 percent).
In this study, benefits of genome sequencing included ending a diagnostic odyssey (often prolonged for years or decades), identifying new targets for research and drug development, and allowing testing of first-degree relatives, often children. Of the 533 genetic diagnoses, 134 (25 percent) were considered to be of immediate clinical actionability, with several remarkable examples described. Only 11 (0.2 percent) were considered to be of no benefit. The clinical impact is significant, as over 80 percent of rare diseases are thought to have a genetic component, and these disorders affect as much as 6 percent of the population.
When used in diagnostic testing, gene sequencing or genome sequencing requires assessment of identified variants in the context of all of a patient's clinical features and the full spectrum of clinical manifestations that occur in the disease being considered.
Indications for NGS
Diagnosis of complex diseases — Consideration of NGS as a clinical tool (eg, for genetic diagnosis) is appropriate in individuals for whom sequencing of a single gene is unlikely to provide a diagnosis.
Examples include suspected genetic disorders in the following settings:
●One of many potential genes may be responsible, and/or the clinician does not know which gene(s) to test because many different genes cause the same phenotype (eg, due to genetic heterogeneity).
●Obvious candidate genes have been tested and were found to be normal. This is especially applicable when the percentage of disease attributed to these candidate genes is low, and other potentially causative genes for the disorder are thought to exist but have not yet been identified. Such analyses are often aided by comparison of NGS results from affected and unaffected family members.
●It would be less costly and more efficient to sequence the entire genome, exome, or gene panel than to sequence individual candidate genes sequentially.
Children — One of the most common medical indications for whole genome sequencing or whole exome sequencing is evaluation of severe intellectual disability or developmental delay believed to have a genetic etiology in a child with a negative initial evaluation. In some cases, evaluation of an affected child and both parents ("trio sequencing") is performed – especially when the inheritance pattern is dominant and a de novo mutation is suspected [25,26]. The value of NGS in this setting has been illustrated in several studies, in which the likelihood of reaching a molecular diagnosis is on the order of 25 percent [25-29]. NGS may thus be appropriate when an extensive evaluation including chromosomal microarray is negative for developmental delay with a suspected genetic etiology.
●Exome sequencing was performed for 2000 individuals born with severe neurologic deficits unexplained by prior clinical evaluation; slightly fewer than half were younger than five years of age [27,28]. Molecular diagnosis was established in 504 of the patients (25 percent). Diagnosis was more likely in those with neurologic findings rather than anomalies of other organ systems; and the implicated genetic variant was more likely to be previously undescribed rather than a known variant. Almost one-third were due to genetic causes of disease discovered within the prior one to two years.
●Exome sequencing was performed in 814 patients (half younger than five years of age) with a variety of unexplained syndromes, the most common of which were developmental delay in children and ataxia in adults. Molecular diagnosis was established in 213 (26 percent). Diagnosis was more likely in patients who had trio sequencing rather than proband sequencing alone; patients younger than five years of age; and patients with retinal disorders (for which a larger fraction of disease genes may be known). Examples were presented in which identification of a variant previously not associated with disease coincided with publication of a case report demonstrating the association.
●Whole genome sequencing was used to evaluate 50 children with severe intellectual disability who did not have a diagnosis after extensive genetic testing that included exome sequencing; a genetic diagnosis was made in 20 of these children (40 percent). This cohort was likely to have been enriched for de novo cases, since none of the children had a positive family history for intellectual disability. Rapid whole genome sequencing (rWGS) for critically ill newborns in neonatal and pediatric intensive care units when an underlying genetic etiology is suspected has been shown to improve clinical outcomes and to be cost effective.
Advances in methods and expansion of the information in databases that can be queried may improve diagnostic yield. This was demonstrated in a study involving a cohort of children with severe undiagnosed neurodevelopmental disorders and/or congenital anomalies, in which re-analysis of exome sequencing data four years after the initial analysis identified previously unrecognized genetic defects in an additional 182 individuals, raising the overall diagnostic yield from 27 to 40 percent. (See "Birth defects: Approach to evaluation", section on 'Whole-exome and whole-genome sequencing'.)
Diagnostic yield may be lower or higher in other clinical settings. In a study that used exome sequencing in 41 individuals with intellectual developmental disorder and metabolic abnormalities for which an underlying defect was not identified by previous extensive metabolic or genetic testing, whole exome sequencing helped to suggest nongenetic etiologies (eg, toxic or infectious exposures, autoimmune disorders) in four individuals and to identify a genetic diagnosis in 28 (68 percent). The high yield for a genetic diagnosis, relative to other whole exome sequencing studies, was likely due to the focus on patients with a high likelihood of having an enzyme defect that could be the result of a single gene defect, given the prevalence of autosomal recessive disorders with metabolic consequences. This highlights how the clinical scenario can have a significant impact on diagnostic yield and thus can drive decisions on how to incorporate NGS tests into clinical care.
Adults — NGS is also being incorporated into the National Institutes of Health "Undiagnosed Diseases Program," which evaluates patients who have a longstanding medical condition that eludes diagnosis. In some cases, molecular diagnoses may have been challenging prior to the availability of NGS if the phenotype was nonspecific (eg, intellectual disability without distinguishing syndromic features) and the candidate gene list too large for traditional Sanger sequencing to be practical.
This application of NGS was illustrated in a series of 1519 individuals with a variety of unexplained symptoms (neurologic, musculoskeletal, immunologic, rheumatologic, and others) who were referred to the Undiagnosed Diseases Network (UDN, established in 2014). A total of 601 of these were accepted for evaluation, including 251 adults (median age of children, 8 years; median age of adults, 29 years). Approximately half had whole exome sequencing and half had whole genome sequencing; many of the latter group had previously undergone whole exome sequencing. Of the 382 individuals who had completed their evaluation at the time of publication, 132 (35 percent) received a diagnosis, the majority based on their sequencing results. Of these 132, the diagnosis for 11 (8 percent) was established based on the re-analysis of previously obtained sequencing data. Most of the diagnoses were recognized presentations of a known syndrome, but several were unusual presentations of known syndromes, or new syndromes related to either previously known or previously unknown genetic variants. Many of these diagnoses resulted in changes in medical care (eg, new treatments, revised monitoring, genetic counseling). This study highlights the discovery aspect of genome sequencing and the value of sequencing in adults.
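The percentages quoted for this series follow directly from the reported counts. As a minimal illustrative sketch (the function name `diagnostic_yield` is introduced here for illustration and is not part of the cited study), the arithmetic can be reproduced as follows:

```python
def diagnostic_yield(diagnosed: int, evaluated: int) -> float:
    """Return the percentage of evaluated individuals who received a diagnosis."""
    if evaluated <= 0:
        raise ValueError("evaluated must be a positive count")
    if not 0 <= diagnosed <= evaluated:
        raise ValueError("diagnosed must be between 0 and evaluated")
    return 100.0 * diagnosed / evaluated


# Counts quoted in the text for the UDN series.
udn_yield = diagnostic_yield(132, 382)        # diagnosed / completed evaluations
reanalysis_share = diagnostic_yield(11, 132)  # diagnoses from re-analysis / all diagnoses

print(f"Diagnostic yield: {udn_yield:.0f} percent")
print(f"Diagnoses from re-analysis: {reanalysis_share:.0f} percent")
```

The same helper applies to any of the cohort figures cited in this topic; it only restates proportions and says nothing about why yield differs between clinical settings.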
NGS can also be used to diagnose genetic diseases if the mode of Mendelian inheritance is known and familial samples are also sequenced for comparison. The clinical availability of whole exome sequencing has increased, and it is a valuable tool for trying to end so-called "diagnostic odyssey" cases.
Cancer screening and management — For many types of cancer, the choices of screening, diagnostic testing, and therapy incorporate genomic information about the tumor (somatic changes), germline changes in inherited cancer genes (eg, BRCA1 and BRCA2), and germline changes in genetic modifiers. Targeted gene panels have shown expanded usefulness across many cancer types, especially those for which more than one genetic variant may be responsible.
Multigene panels for certain inherited cancer syndromes based on National Comprehensive Cancer Network (NCCN) recommendations are becoming increasingly popular options for certain patients as the field moves away from single-gene testing to the panel approach. This approach may be more efficient and cost-effective, given the decreasing cost associated with NGS technology and increasing indications for genetic testing.
In February of 2019, the American Society of Breast Surgeons recommended that all patients with a diagnosis of breast cancer undergo germline genetic testing via a multigene panel that includes BRCA1, BRCA2, and PALB2. This guideline was based on an article that cited the limitations of NCCN guidelines for identifying patients who may harbor a pathogenic mutation in a clinically relevant breast cancer gene. Criteria for genetic testing for hereditary breast and ovarian cancer gene variants are discussed separately. (See "Clinical features, diagnosis, and staging of newly diagnosed breast cancer", section on 'Postdiagnosis evaluation' and "Genetic testing and management of individuals at risk of hereditary breast and ovarian cancer syndromes", section on 'Criteria for genetic risk evaluation'.)
Examples of cancer gene panels used clinically to guide screening, preventive options, and cancer treatment include the following:
●Using a cancer gene panel that includes BRCA1 and BRCA2 if there is personal or family history of prostate and/or pancreatic cancer, even in the absence of breast or ovarian cancer. Other genes for which pathogenic variants may be associated with increased breast cancer risk include ATM, CDH1, CHEK2, NF1, PALB2, and TP53. (See "Genetic testing and management of individuals at risk of hereditary breast and ovarian cancer syndromes", section on 'Criteria for genetic risk evaluation'.)
●Screening for inherited causes of gastrointestinal cancers (eg, panels that include APC, BMPR1A, EPCAM, MLH1, MSH2, MSH6, MUTYH, PMS2, PTEN, SMAD4, STK11, TP53, BLM, CHEK2, GALNT12, GREM1, POLD1, and/or POLE). (See "Screening for colorectal cancer in patients with a family history of colorectal cancer or advanced polyp".)
●Identification of familial acute leukemia syndromes. (See "Familial disorders of acute leukemia and myelodysplastic syndromes", section on 'Diagnostic genetic testing'.)
●Categorization of prognostic groups in acute myeloid leukemia. (See "Prognosis of acute myeloid leukemia", section on 'Gene mutations'.)
●Classification of certain inherited bone marrow failure syndromes. (See "Dyskeratosis congenita and other telomere biology disorders", section on 'Laboratory testing and bone marrow'.)
●Analysis of tumor tissue or non-tumor tissue to identify genetic abnormalities that may be present in the germline and/or the tumor that could potentially match molecularly targeted therapies. As an example, a poly(ADP-ribose) polymerase (PARP) inhibitor could be considered in a patient with an identified BRCA1 or BRCA2 mutation. (See "ER/PR negative, HER2-negative (triple-negative) breast cancer", section on 'Patients with previous exposure to chemotherapy'.)
In 2017, the US Food and Drug Administration (FDA) approved two gene panel tests (MSK-IMPACT and F1CDx) for analyzing pathogenic variants in solid tumors; these tests can be used on formalin-fixed, paraffin-embedded (FFPE) tissue regardless of the primary organ from which the tumor arose [38-40]. These tests detect variations in the coding regions of over 400 genes (MSK-IMPACT) and over 300 genes (F1CDx), and can provide information about differences between tumor and adjacent noncancerous tissue and about genomic signatures such as microsatellite instability and tumor mutational burden. The indications for F1CDx include metastatic and recurrent cancer for a number of solid tumor types (eg, breast, colon, endometrial, lung, melanoma, pancreatic) as well as other solid tumors and tumors of unknown primary, for which genomic information would inform patient management.
The indications for MSK-IMPACT are not specified; they may include diagnostic or prognostic classification or therapy decisions (eg, use of a clinically approved therapy or for identification of candidates for participation in clinical trials).
The MSK-IMPACT study prospectively sequenced 10,000 tumors and matched normal tissue (in most cases, peripheral blood was the DNA source). This allowed the investigators to examine acquired genetic changes in tumors. The average depth of coverage (number of independent reads; see 'Accuracy' above) was over 700-fold, and many of the alterations fell outside of the coverage regions of traditional "hot-spot" panels that focus on specific variants. Strong concordance was established with data from the Cancer Genome Atlas. However, MSK-IMPACT identified more variants of potential clinical significance than would be expected from whole exome sequencing.
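Average depth of coverage is often approximated as the total number of sequenced bases divided by the size of the targeted region. The sketch below illustrates that relationship only; the function name and all input values are hypothetical and are not taken from the MSK-IMPACT study, and the calculation assumes that every read maps to the target with uniform coverage, which real assays do not achieve.

```python
def mean_depth(num_reads: int, read_length_bases: int, target_size_bases: int) -> float:
    """Approximate mean depth of coverage as total sequenced bases / target size.

    Simplifying assumption: all reads map on-target and are spread uniformly.
    """
    if target_size_bases <= 0:
        raise ValueError("target_size_bases must be positive")
    return num_reads * read_length_bases / target_size_bases


# Hypothetical panel run (illustrative values, not from any cited study):
# 7 million reads of 150 bases each over a 1.5 megabase target region.
depth = mean_depth(num_reads=7_000_000, read_length_bases=150, target_size_bases=1_500_000)
print(f"Approximate mean depth: {depth:.0f}-fold")
```

In practice, duplicate reads, off-target reads, and uneven capture reduce the usable depth below this simple estimate, which is one reason laboratories report measured rather than theoretical coverage.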
Examples of important findings from MSK-IMPACT and their clinical significance included the following:
●Actionable findings were identified in 3792 individuals (37 percent of the cohort).
●TP53 mutations accounted for the highest percentage of pathogenic variants (over 10 percent of cases). MSK-IMPACT identified TP53 mutations in 29 percent of patients with prostate cancer; the Cancer Genome Atlas only identified TP53 mutations in 7 percent.
●The tumor types with the highest yields of actionable findings included gastrointestinal stromal tumors (in 76 percent); thyroid cancers (in 60 percent); breast cancers (in 57 percent); and melanomas (in 56 percent).
●Mismatch repair (MMR) abnormalities and microsatellite instability (MSI) were documented in approximately 100 individuals who had not been tested for MMR defects; some of these individuals received directed therapies that they may not have otherwise received.
Data from this study were used to enroll patients in therapeutic trials and to inform treatment decisions, highlighting the usefulness of the information.
This approach of determining therapy based on genetic abnormalities rather than tissue of origin is increasingly important as studies such as NCI-MATCH attempt to advance precision medicine cancer treatment. In 2017, the monoclonal antibody pembrolizumab was approved by the FDA for the treatment of adult and pediatric patients with unresectable or metastatic solid tumors that have been identified as being MSI-high (MSI-H) or MMR deficient (dMMR). This is significant because it was the first time an anticancer agent was approved independent of the anatomical site of origin of a cancer.
Other available genetic tests in oncology use gene expression rather than gene sequencing to identify molecular signatures in tumors (eg, Oncotype Dx panels for breast and prostate cancer). Gene expression profiling determines the level to which a gene is transcribed, as opposed to variations in gene sequence. (See "Molecular prognostic tests for prostate cancer", section on 'Tests based on molecular characteristics' and "Prognostic and predictive factors in early, non-metastatic breast cancer", section on 'Receptor status' and "Deciding when to use adjuvant chemotherapy for hormone receptor-positive, HER2-negative breast cancer".)
Given the investment in precision medicine and personalized medicine initiatives in the treatment of cancer, tumor variant repositories and knowledge bases are becoming increasingly important. The Catalogue of Somatic Mutations in Cancer (COSMIC) brings together data reviewed by expert curators from a number of sources, including literature curation from genomic data uploaded for publication purposes, genomic profiles of cell lines used in cancer research, and data from other well-established databases such as The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC). This combination of sources creates a rich reference resource that can aid both clinical applications and research in cancer genomics.
Diagnosis of infections — In addition to diagnosing genetic disorders, NGS might be helpful in identifying an infectious pathogen when usual microbial or serologic testing is unrevealing or when faster diagnosis could improve outcomes [45,46]. This subject is discussed in detail in separate topic reviews. (See "Molecular diagnosis of central nervous system infections" and "Detection of bacteremia: Blood cultures and other diagnostic tests", section on 'Organism identification'.)
The role of NGS in identifying an elusive infection was demonstrated in a case report in which a 14-year-old boy developed unexplained fever and meningoencephalitis following travel to Puerto Rico; the boy had an underlying immunodeficiency syndrome. An extensive infectious disease evaluation was unrevealing, and the patient's clinical status deteriorated. NGS identified a species of Leptospira, a pathogenic spirochetal organism acquired by exposure to contaminated water or soil, in the patient's cerebrospinal fluid (CSF) but not serum. Antibiotic therapy was initiated based on the NGS results, with clinical improvement. Subsequent testing confirmed the diagnosis. Additional studies have illustrated the advantages and disadvantages of genetic testing for infections, as discussed separately. (See "Molecular diagnosis of central nervous system infections", section on 'Types of tests'.)
NGS technology has played a critical role in several steps in vaccine development during the coronavirus disease 2019 (COVID-19) pandemic:
●Identification of viral sequence [48,49]
●Characterization of viral structure 
●Vaccine design 
Resulting vaccines are discussed separately. (See "COVID-19: Vaccines".)
Healthy people — NGS technology is opening the door for proactive genetic screening and is being offered to healthy individuals to determine increased disease risks, pharmacogenomic variants, and nonmedical information (eg, ancestry). The Centers for Disease Control and Prevention (CDC) has designated hereditary breast and ovarian cancer syndrome, Lynch syndrome, and familial hypercholesterolemia as "Tier 1" conditions, meaning that testing for them has significant potential for positive impact on public health. However, studies demonstrating improved outcomes from testing healthy people for these syndromes are lacking.
Clinical laboratories have started to offer genetic health screens in which a number of medically actionable genes are sequenced. The content of these panels typically includes expanded versions of the conditions and genes listed in the American College of Medical Genetics and Genomics (ACMG) recommendations (table 1) for the reporting of secondary findings in clinical exome and genome sequencing. These genes and the rationale for their inclusion are discussed separately. (See "Secondary findings from genetic testing", section on 'Decisions made by the laboratory'.)
Research efforts are underway to determine the clinical value of screening healthy people with genomic sequencing. (See 'Research' below.)
It is important to distinguish whole genome- and whole exome-based sequencing from more targeted approaches such as SNP-based panels or targeted disease-specific NGS panels, which may have fewer secondary findings and may be associated with lower costs. The costs associated with NGS panels continue to decrease, and this decrease has prompted discussion of universal screening for certain risk factors such as inherited cancer risk. As an example, the cost-effectiveness of universal BRCA testing of all women older than 30 years was modeled in a 2015 analysis. The analysis concluded that if the price of a BRCA test dropped below USD $250, the cost would be USD $53,000 per quality-adjusted life year (QALY) gained by preventing breast cancers, a cost on par with that of routine screening colonoscopy every 10 years beginning at age 50 years for colon cancer prevention. NGS panels that include the BRCA1 and BRCA2 genes in addition to other inherited cancer genes (eg, MSH and MLH genes associated with Lynch syndrome) can be ordered clinically with self-pay prices in the range of USD $250 to $500. The cost-effectiveness of population screening using BRCA1/2-based NGS cancer panels continues to gain support. Further explanation about SNPs and additional information about SNP-based panels are presented separately. (See "Tools for genetics and genomics: Cytogenetics and molecular genetics", section on 'Detecting known mutations' and "Basic genetics concepts: DNA regulation and gene expression", section on 'Sequence variants'.)
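A cost per QALY figure of this kind is a ratio of the additional cost of a screening strategy to the additional quality-adjusted life years it produces. The following is a minimal sketch of that ratio; the function name, the cohort inputs, and the comparison threshold are hypothetical values chosen only to illustrate the magnitude discussed above, and they are not the inputs or methods of the cited 2015 analysis.

```python
def cost_per_qaly(incremental_cost_usd: float, qalys_gained: float) -> float:
    """Return the incremental cost per quality-adjusted life year (QALY) gained."""
    if qalys_gained <= 0:
        raise ValueError("qalys_gained must be positive to compute a ratio")
    return incremental_cost_usd / qalys_gained


# Hypothetical cohort-level inputs (illustrative only, not from the cited model):
extra_cost_usd = 5_300_000   # added cost of screening versus no screening
extra_qalys = 100            # QALYs gained across the same cohort

ratio = cost_per_qaly(extra_cost_usd, extra_qalys)

# Hypothetical willingness-to-pay threshold; actual thresholds vary by payer and country.
threshold_usd_per_qaly = 100_000
below_threshold = ratio <= threshold_usd_per_qaly

print(f"USD ${ratio:,.0f} per QALY gained; below threshold: {below_threshold}")
```

Published models derive the numerator and denominator from many assumptions (test price, variant prevalence, uptake of preventive interventions, discounting), so the ratio is only as reliable as those inputs.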
Limitations — NGS may not be as accurate as other methods for detecting specific types of mutations. As an example, detection of chromosomal copy number changes and/or large gains, losses, or translocations by NGS is problematic due to the short DNA sequence read lengths. This may result in failure to detect chromosomal deletions or insertions. Traditional Sanger sequencing shares some of the same limitations but, theoretically, to a lesser degree because of its longer read lengths (up to approximately 1000 bases for Sanger sequencing versus approximately 100 to 200 bases for standard NGS, although some platforms are reporting longer read lengths). Research efforts to increase the sequencing read length in NGS platforms continue.
In general, when large chromosomal aberrations are suspected, alternative platforms such as comparative genomic hybridization (CGH) microarray, multiplex ligation-dependent probe amplification (MLPA), fluorescence in situ hybridization (FISH), or cytogenetics are usually preferred over NGS, with the ideal choice often dependent on the specific clinical condition being evaluated. These alternative methods are presented separately. (See "Genomic disorders: An overview".)
Additional potential limitations, such as cost, long turnaround time, and lack of insurance reimbursement, are gradually becoming less of a concern as clinical use of NGS increases. (See 'Cost and turnaround time' below and 'Insurance reimbursement' below.)
Genetic discrimination — A common concern about genetic testing is the potential for inadequate protection of privacy of genetic information and associated impacts on employment and insurance coverage. This issue is discussed in detail separately. (See "Genetic testing", section on 'Genetic discrimination'.)
Disclosure of findings from genome sequencing — Genome sequencing may lead to incidental findings with potential clinical importance. Issues of whether and how to disclose such findings, and which findings to report to the individual being tested, as well as recommendations from the American College of Medical Genetics and Genomics (ACMG), revised in 2021, regarding which results should be disclosed, are discussed in detail separately. (See "Secondary findings from genetic testing".)
Historically, individual genomic reports provided directly from companies to consumers often included many variants unlikely to be clinically pathogenic. Because a genotyping chip rather than NGS platform was generally used, most variants were located in non-coding regions of DNA and occurred at high frequency in the general population; these variants were unlikely to have any significant predictive value. Often such variants were reported at higher frequencies than the lifetime disease risk observed in the population, further discounting their likelihood of clinical significance. Issues related to direct-to-consumer genetic testing are discussed in detail separately. (See "Personalized medicine", section on 'Direct-to-consumer testing'.)
RESEARCH — In contrast to clinical indications, potential uses of NGS as a research tool are extensive. Examples include identification of new genes involved in unexplained syndromes (which may be reported in the clinical setting if they could possibly explain a patient's disease-associated findings) and determination of genetic changes during the development of acute myeloid leukemia (AML) and other cancers. These findings may ultimately lead to breakthroughs in understanding disease pathogenesis, and to the development of additional diagnostic testing and/or management strategies for these diseases. (See "Molecular genetics of acute myeloid leukemia".)
Comparisons of genome sequencing across populations of individuals may inform research into disease mechanisms or help to identify specific variants that might be of value for targeted testing or screening healthy people. (See 'Healthy people' above.)
●The All of Us Research Program (https://allofus.nih.gov/) is a major effort under the National Institutes of Health (NIH) Precision Medicine Initiative (PMI) to establish a national research cohort of over a million individuals that combines genomic data and patient surveys. The goal is to better understand how biology, environment, and lifestyle intersect to influence health. Genomic data from this cohort will be obtained using NGS technology. Enrollment began in mid-2017.
●In the United Kingdom, the Newborn Genomes Programme is evaluating whole genome sequencing in up to 200,000 newborn babies. The focus will be on gene variants associated with rare diseases that can present in childhood. Genes such as BRCA1 and BRCA2, which are associated with adult-onset conditions, will not be evaluated. Concerns will need to be addressed related to the inability of newborns to provide informed consent and the risks and harms associated with false-positive or false-negative results. Rapid whole genome sequencing in acutely ill children when a genetic etiology is possible improves outcomes and is cost efficient, but whether whole genome sequencing (rapid or routine) is cost effective and improves outcomes when implemented for all newborns is uncertain.
Research is also evaluating the best way to provide genetic information to healthy people. As an example, in a 2017 trial, 100 healthy adults were randomly assigned to receive a family history report alone or in combination with a whole genome sequencing report. Approximately 1 in 5 assigned to genome sequencing were found to have a monogenic disease risk finding, and approximately 1 in 25 had a new clinical diagnosis. All individuals assigned to genome sequencing also received findings related to carrier status, pharmacogenomic information related to five drugs, and risk predictions for eight cardiometabolic traits. Outcomes in both arms included new advice from the primary care physician, changes in health behaviors (typically in diet or exercise), and financial costs. Genome sequencing did not appear to increase overall anxiety, and an expert panel judged the majority of the primary care physicians' responses to be appropriate. These findings do not support the routine use of whole genome sequencing in healthy people. However, they do demonstrate how providers who are not geneticists may be able to manage results from genomics-based testing, and they challenge the idea that genetic information is too complex to handle at the primary care level. This will be important as genetic information becomes more ubiquitous in medical care and initial management decisions occur in the primary care setting.
Where to order — Considerations in selecting an NGS service include cost, turnaround time, quality of data interpretation, and method of results reporting. Many university and commercial laboratories in the United States offer Clinical Laboratory Improvement Amendments (CLIA)-certified NGS. The increased availability of CLIA-certified NGS tests has been spurred in part by continued decreases in cost and the US Supreme Court's decision to overturn some of the patents held on BRCA1 and BRCA2 genetic testing. Specific resources are generally available through the local academic institution, but direct access between commercial testing laboratories and healthcare providers through prepackaged test kits is common. Most clinical NGS testing uses targeted gene panels and exome sequencing. An exception is clinical whole genome sequencing offered through Illumina.
Cost and turnaround time — The typical cost of clinical NGS ranges from several hundred to thousands of dollars. Factors that affect cost include the type of genome analyzed (eg, somatic tumor plus germline); whether a whole genome, exome, or gene panel is requested; the requested turnaround time (standard versus expedited); whether a single affected proband or additional family member(s) are tested; and the extent of sequence interpretation. The price barrier continues to drop, particularly where competition between laboratories is intensifying. Inherited cancer panels are now available for under USD $250 if a patient selects an option to pay the company directly, in advance. There is an ongoing shift towards more transparent pricing and out-of-pocket costs by the larger commercial NGS testing laboratories. (See 'Insurance reimbursement' below.)
Standard turnaround time for NGS is generally 3 to 12 weeks, but may be longer for more involved, complex cases.
Insurance reimbursement — Insurance reimbursement for NGS is a complex issue. Many commercial laboratories offer services to patients and providers to investigate insurance coverage if a patient prefers to have the laboratory bill the insurance company directly (rather than pay out-of-pocket). A study that reported on findings of whole exome sequencing for 250 patients (mostly children) with neurologic phenotypes found that insurance coverage was similar to that for other genetic testing (eg, reimbursement for the majority of tests). Of note, the individuals in this study had relatively major neurologic phenotypes, and all had undergone prior genetic testing, often extensive.
It is also common for insurance companies to accept appeals with letters of medical necessity justifying the need for NGS. These letters should provide an explanation for how the results of NGS will alter the clinical evaluation, diagnosis, treatment plan, or prognosis. For some genetic tests such as inherited cancer panels, national guidelines provide advice on the indications for genetic testing, which is typically covered by insurance when indicated. Many insurance companies and Medicare use the National Comprehensive Cancer Network (NCCN) guidelines as a basis for coverage criteria and decisions.
●Definitions – Next-generation sequencing (NGS) is a DNA sequencing technology that uses parallel sequencing of multiple small DNA fragments to determine sequence. This technology has allowed a dramatic increase in the speed (and a decrease in the cost) at which an individual's genome can be sequenced. (See 'Terminology and evolution of technologies' above.)
●Starting material – NGS can be performed on double stranded DNA from a variety of sources. DNA extracted from leukocytes in whole blood is used for most clinical testing. (See 'Source of DNA' above.)
●Genome, exome, or panel – NGS can be used to sequence an individual's whole genome or smaller portions of the genome such as the exome or a preselected subset of genes. The general trend is towards increased use of gene panels; there is also greater use of whole exome testing. Studies are examining the clinical usefulness and value to primary care providers. (See 'Whole genome, exome, or gene panel' above.)
●Data quality – Sanger sequencing is used to confirm the presence of specific variants identified by NGS in clinical settings, because accuracy of traditional sequencing methods is greater, although this practice is being challenged. The interpretation of pathogenicity (eg, pathogenic versus variant of unknown significance [VUS]) can differ by laboratory. It may be helpful (and important) for the ordering clinician to understand how pathogenicity is assessed, and in some cases to critically reassess the interpretation. (See 'Accuracy' above and 'Interpretation' above and 'Variant databases' above.)
●Uses – NGS may be appropriate for diagnosing suspected genetic disorders when sequencing of a single gene is unlikely to provide a diagnosis. Examples include the following settings (see 'Indications for NGS' above):
•One of several potential genes may be responsible
•Obvious candidate genes have been tested and found to be normal
•The cost of NGS would be less than that of sequencing individual candidate genes sequentially
NGS-based gene panels are used clinically in certain hematologic malignancies, and the first gene panel tests for solid tumors were approved in 2017. Other uses such as diagnosis of infections and screening of healthy people remain investigational.
There is growing uptake of screening healthy individuals with NGS-based gene panels; however, evidence to support this practice is lacking. (See 'Healthy people' above.)
●Risks – Concerns regarding consent for NGS testing, genetic discrimination, and disclosure of incidental findings from NGS are discussed separately. (See "Genetic testing" and "Secondary findings from genetic testing".)
●Cost and turnaround time – Many university and commercial laboratories in the United States offer NGS-based tests, the most common application being targeted gene panels for a specific clinical indication (eg, breast and ovarian cancer risk, cardiomyopathy, developmental delay). The cost ranges from approximately USD $250 to several thousand dollars, depending on the type and amount of the genome analyzed as well as the condition being tested. Standard turnaround time is 3 to 12 weeks. Insurance reimbursement may be similar to that for other genetic testing in individuals with major clinical phenotypes for whom previous testing was unrevealing. (See 'Practical issues' above.)
ACKNOWLEDGMENT — The UpToDate editorial staff acknowledges Joseph V Thakuria, MD, MMSc, who contributed to earlier versions of this topic review.