INTRODUCTION — One of the greatest obstacles clinicians experience in reading about and understanding genetics is the extensive use of technical language and jargon. Genetic terms are frequently used imprecisely in published clinical literature.
This topic defines some of the most important technical terms.
A more extensive discussion of terms can be accessed in standard genetics reference texts [1].
A guide to the naming conventions for human genes and alleles is provided on a website from the HUGO Gene Nomenclature Committee (www.genenames.org/about/guidelines/). It is worth noting that preferred usage as established by professional societies is unevenly applied, even in carefully edited and reviewed documents. Furthermore, usage evolves over time, so it is important to consider shifts in terminology that reflect evolving standards as new technologies and findings alter the scientific and medical communities' communication needs. Readers should therefore expect that actual usage of terms may vary from the definitions provided here.
Glossaries of epidemiological terms and terms that apply to systematic reviews and meta-analyses are presented separately in UpToDate. (See "Glossary of common biostatistical and epidemiological terms" and "Systematic review and meta-analysis", section on 'Glossary of terms'.)
DEFINITIONS
Allele — An allele is one of a series of alternative forms (genotypes) at a locus, which is a specific region of a chromosome. At the DNA level, different alleles have different base sequences.
Allelic fraction — The allelic fraction can be defined as the number of times a variant base is observed, divided by the total number of times any base is observed at the locus [2]. Allelic fraction is generally applied to a single variant in a tumor, and thus is distinct from allele frequency, which examines the frequency of an allele in a population. (See 'Allele frequency' below.)
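As a minimal sketch of this definition, the calculation can be written in a few lines of Python. The function name and the read counts are hypothetical and are used only for illustration; they are not taken from any specific sequencing pipeline.

```python
def allelic_fraction(variant_reads: int, total_reads: int) -> float:
    """Fraction of observations at a locus that show the variant base."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= variant_reads <= total_reads:
        raise ValueError("variant_reads must be between 0 and total_reads")
    return variant_reads / total_reads


# Hypothetical example: the variant base is seen in 30 of 120 reads.
print(allelic_fraction(30, 120))  # 0.25
```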
Allele frequency — The proportion of chromosomes, loci, or genes in a population harboring a specific allele. "Minor allele frequency" typically refers to the less common variant at a biallelic locus and is usually used to refer to the frequency of a single nucleotide polymorphism (SNP) (see 'Single nucleotide polymorphism (SNP)' below). This population frequency is distinguished from allelic ratio, which applies to a single person (eg, with a malignancy).
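The proportion described above can be sketched for a biallelic autosomal locus by counting alleles across genotypes. This example assumes diploid individuals with genotype counts labeled AA, Aa, and aa; the function names and the counts are hypothetical.

```python
from typing import Tuple


def allele_frequencies(n_AA: int, n_Aa: int, n_aa: int) -> Tuple[float, float]:
    """Return (frequency of A, frequency of a) from genotype counts."""
    total_chromosomes = 2 * (n_AA + n_Aa + n_aa)
    if total_chromosomes == 0:
        raise ValueError("at least one individual is required")
    # Each AA individual carries two A alleles; each Aa individual carries one.
    freq_A = (2 * n_AA + n_Aa) / total_chromosomes
    return freq_A, 1 - freq_A


def minor_allele_frequency(n_AA: int, n_Aa: int, n_aa: int) -> float:
    """Frequency of the less common allele at a biallelic locus."""
    return min(allele_frequencies(n_AA, n_Aa, n_aa))


# Hypothetical population: 60 AA, 30 Aa, and 10 aa individuals.
print(allele_frequencies(60, 30, 10))      # (0.75, 0.25)
print(minor_allele_frequency(60, 30, 10))  # 0.25
```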
Allelic heterogeneity — Allelic heterogeneity refers to the common occurrence of multiple pathogenic variants in one gene that all result in the same disease or syndrome. As an example, more than 1500 variants in the cystic fibrosis transmembrane conductance regulator (CFTR) gene cause cystic fibrosis. Conversely, for some disorders, different pathogenic variants in a single gene can produce very different phenotypes.
Allelic heterogeneity differs from genetic heterogeneity, in which variants in multiple genes can cause the same disease phenotype. (See 'Genetic heterogeneity' below.)
Allelic ratio — Allelic ratio measures the relative abundance of variant to wildtype alleles within a tumor. Higher allelic ratios (ie, a greater fraction of variant alleles) have been reported to be associated with poorer prognosis. Unlike allele frequency, which is a characteristic of a population (see 'Allele frequency' above), allelic ratio is a property of cells within a tumor in a single individual. Allelic ratio is of necessity an inexact concept because it is rare (for solid tumors at least) to avoid substantial contamination by non-tumor cells from blood, stroma, and vasculature. Amplification of variant sequences in a tumor can also have a large impact on allelic ratios.
Aneuploidy — The state of having an abnormal number of chromosomes. A euploid human karyotype has 46 chromosomes (figure 1). Aneuploidy can affect the entire somatic cell population, as in trisomy 21, or it can affect a subset of cells, as in a tumor.
Anticipation — A phenomenon whereby the symptoms of a genetically-based condition appear at an earlier age, or with greater severity, in successive generations. Expansion of trinucleotide repeats is a known molecular cause for specific diseases (such as myotonic dystrophy, fragile X syndrome, Huntington’s chorea) that manifest anticipation.
Association — Genetic association is a property of alleles. It refers to the non-random relationship between an allele and a phenotype in a population. Genetic association between a marker allele and a phenotype can result because the allele is a direct causal variant, because the allele is in linkage disequilibrium or segregating with a causal variant in close proximity, or because of stratification of the population. Association may be determined in a genome-wide association study. (See 'Genome-wide association study (GWAS)' below.)
Autosome — A chromosome other than X or Y. The human genome has 44 autosomes (22 pairs of autosomes) (figure 1).
Autosomal — A gene is autosomal if it is located on an autosome rather than a sex chromosome. A gene's inheritance pattern is also referred to as autosomal if the pattern corresponds to that of known autosomal genes (rather than sex-linked). (See 'Sex-linked' below.)
Benign variant — (See 'Variant' below.)
Biome — Humans are colonized by a multitude of microorganisms, which vary by age and location in the body. The biome (or microbiome) is the totality of colonizing microorganisms in a specific environmental milieu.
Biomes may be studied genetically using metagenomics. (See 'Metagenomics' below.)
Carrier — An individual who is heterozygous for a risk or disease allele. The term is typically used to describe someone who is heterozygous for a gene variant that causes autosomal recessive or X-linked recessive disease, but in clinical discussions, it is also used to describe heterozygotes for risk alleles or deleterious alleles that predispose to disease, regardless of inheritance type.
Carrier rate — The frequency of carriers in a population.
Carrier testing — Clinical method of genotyping at-risk populations or relatives to identify individuals, usually asymptomatic, who have a pathogenic or likely pathogenic variant for an autosomal recessive or X-linked disorder. One example is prenatal screening for Tay-Sachs disease-associated variants in people of Ashkenazi Jewish ancestry. (See "Genetic testing" and "Preconception and prenatal carrier screening for genetic disorders more common in people of Ashkenazi Jewish descent and others with a family history of these disorders".)
Cascade testing — Refers to a testing approach in which at-risk first-degree relatives are tested for a familial genetic variant; if these individuals test positive, then their first-degree relatives are tested. Cascade testing allows testing to be focused on a specific variant and reduces unnecessary testing of relatives who are not at risk.
Centromere — The condensed region of a chromosome that mediates attachment of chromosomes to the microtubules of the mitotic or meiotic spindle. The centromere is important in preserving normal chromosome number.
Centrosome — The specialized structure adjacent to the nucleus that nucleates microtubules. During cell division, replicated centrosomes move to opposite ends of the cell to form the poles of the mitotic spindle. (See 'Mitosis' below and "Basic genetics concepts: Chromosomes and cell division", section on 'Mitosis'.)
Chimerism — When referring to an individual, a state in which two or more populations of genetically distinct cells are present that arose from the fusion of two or more fertilized eggs. Contrasted with mosaicism. (See 'Mosaicism' below.)
Also used in patients post-allogeneic hematopoietic stem cell transplant to refer to a state of two genetically distinct populations of hematopoietic cells (one from the donor and one from the recipient). (See "Preparative regimens for hematopoietic cell transplantation", section on 'Chimerism'.)
Chromatid — One of two replications, or copies, of a chromosome formed prior to cell division and joined together at their centromeres. The centromere is the last portion of a chromosome to replicate during cell division. Sister chromatids are a pair of chromatids attached at the centromere. (See 'Centromere' above.)
Chromatin — A complex structure composed of DNA, RNA, and proteins that facilitates efficient packaging of DNA in cells. The primary structure of chromatin is the nucleosome, consisting of double-stranded DNA coiled around a core of histone proteins. Nucleosomes packed tightly together form a "bead-on-string" configuration, which in turn assembles in hierarchical looping structures to create densely-packaged chromatin. The regulation of gene transcription is governed by the uncoiling of packed chromatin (heterochromatin) into exposed DNA (euchromatin). (See "Principles of epigenetics".)
Clonal — Arising from a single clone, or cell. Examples include clonal selection of lymphocytes during immune development and clonal origin of leukemia cells or other tumor cells. (See "Immunoglobulin genetics" and "Acute myeloid leukemia: Pathogenesis".)
Cloning — Production of a genetically identical copy. Can refer to a single gene or to an entire organism.
Coding region — Portion of a gene that encodes a protein.
Coding mutation, variant, or polymorphism — A genetic variation in the open reading frame (protein-encoding region) of a gene.
●Missense – Coding variants that alter amino acid composition of a protein are called non-synonymous or missense variants (figure 2).
●Synonymous – Variants that do not alter amino acid composition are called synonymous variants. These are sometimes called silent variants, although not all synonymous variants are truly silent, as they may affect other aspects of gene function besides protein coding.
●Nonsense – Nonsense variants are coding variants that result in the introduction of a stop codon (figure 3).
●Frameshift – A frameshift mutation results from an insertion or deletion of a number of bases not divisible by three, resulting in shifting of the reading frame (figure 4).
Variants are also classified according to their pathogenicity. (See 'Variant' below.)
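The distinctions among missense, synonymous, and nonsense variants can be illustrated with a short sketch that compares the amino acid encoded by a reference codon with that encoded by an altered codon. It assumes the standard genetic code and DNA codons written on the coding strand; the function name is hypothetical, and reference stop codons are deliberately left out of scope.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Classify a codon substitution as synonymous, nonsense, or missense."""
    try:
        ref_aa = CODON_TABLE[ref_codon.upper()]
        alt_aa = CODON_TABLE[alt_codon.upper()]
    except KeyError:
        raise ValueError("codons must be three bases drawn from A, C, G, T")
    if ref_aa == "*":
        raise ValueError("reference stop codons are not covered by this sketch")
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "missense"


print(classify_substitution("GAG", "GTG"))  # missense (Glu to Val)
print(classify_substitution("CTT", "CTC"))  # synonymous (Leu to Leu)
print(classify_substitution("TGG", "TGA"))  # nonsense (Trp to stop)
```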
Codon — A three-nucleotide sequence that codes either for a specific amino acid or for chain initiation or termination during protein synthesis (translation of RNA to protein). The codon sequences are illustrated as a chart (figure 5) or a wheel (figure 6). (See "Basic genetics concepts: DNA regulation and gene expression", section on 'Translation'.)
Complementation — The restoration of normal phenotype by gene replacement. The replaced gene can either be an intact copy of a defective gene (direct replacement), or an alternate gene with function that can compensate for the defective gene's aberrant function.
Complex trait/complex disease — Trait or disease for which interactions between more than one gene and/or environmental factors play a role in the phenotype. (See "Principles of complex trait genetics".)
Compound heterozygous — Harboring two different pathogenic variants in the same gene that together are sufficient to manifest an autosomal recessive phenotype. This differs from "homozygous," which refers to harboring two copies of the same pathogenic variant, and from "double heterozygous," which refers to harboring two pathogenic variants at two separate genetic loci, which together manifest disease. (See 'Digenic inheritance' below and 'Double heterozygous' below.)
Consanguinity — Reproduction between two individuals from the same bloodline (eg, first cousin, second cousin). Consanguineous parentage increases the probability of a rare recessive disease, resulting from higher probability of both parents sharing the same rare deleterious sequence variant.
Copy number variation (CNV) — The most prevalent type of chromosomal structural variation, in which the number of copies of a large chromosomal or DNA segment (usually measuring thousands to millions of bases) varies between individuals. (See "Genomic disorders: An overview", section on 'Copy number variations'.)
Coupling (in cis) — The presence of two specified alleles at two linked loci on the same homologous chromosome (ie, "in cis"), and the two alternative alleles on the other chromosome. For illustration, in the case of dominant and recessive alleles, the coupling gametes formed are AB and ab (figure 7).
In contrast, repulsion refers to the presence of the specified alleles at two linked loci on different chromosomes (ie, in trans). (See 'Repulsion (in trans)' below.)
CRISPR — CRISPR (clustered regularly interspaced short palindromic repeats; pronounced "crisper"; sometimes referred to as CRISPR-Cas9) is a method used for gene or genome editing. It was adapted from a component of a bacterial defense system in combination with an endonuclease such as Cas9 or Cpf1. (See 'Genome editing' below and "Overview of gene therapy, gene editing, and gene silencing", section on 'Gene editing'.)
Crossing-over — The exchange of chromosome segments through the process of recombination that occurs between two homologous chromosomes during meiosis. The site on the chromosome where the exchange occurs is called a crossover.
De novo mutation — A novel genetic sequence variant that is present in the proband's DNA and absent from the parents' DNA. Often used to distinguish familial from sporadic cases of genetic disease.
Digenic inheritance — Diseases caused by co-inheritance of variants at two distinct genetic loci (ie, in two different genes). Individuals with digenic inheritance may also be called "double heterozygous," which is distinct from compound heterozygous (having two different pathogenic variants in the same gene). (See 'Compound heterozygous' above.)
Diploid — Possessing two copies of each autosomal chromosome and two sex chromosomes. Most human cells are diploid. Hepatocytes are frequently polyploid (tetraploid or greater). Gametes are haploid (one copy of each autosome and one sex chromosome) (figure 1). (See 'Haploid' below and 'Ploidy' below.)
DNA — DNA (deoxyribonucleic acid) is the primary molecular constituent of chromosomes that stores the genetic information of most living organisms, including humans.
The genetic information in DNA is encoded by the sequence of the four bases adenine, guanine, thymine, and cytosine.
●Adenine and guanine are purines.
●Thymine and cytosine are pyrimidines.
DNA is usually present as a double-stranded antiparallel polymer composed of an outer phosphodeoxyribose backbone with central nucleotide side chains (figure 8). (See "Basic genetics concepts: DNA regulation and gene expression", section on 'DNA and RNA'.)
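Because the two strands are antiparallel, the sequence of one strand can be derived from the other by complementing each base and reversing the order. The sketch below assumes standard Watson-Crick pairing (adenine with thymine, guanine with cytosine); the function name is hypothetical.

```python
# Translation table mapping each base to its pairing partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    sequence = sequence.upper()
    if set(sequence) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G, and T")
    # Complement each base, then reverse to account for antiparallel strands.
    return sequence.translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))  # GCAT
```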
DNA barcoding — A collection of methods developed to facilitate the analysis of complex mixtures of pooled samples, whereby short, unique DNA sequences (referred to as tags or barcodes) are added to each of the DNA samples (each isolated from a distinct individual) in the pool. Barcoding is used routinely in next-generation sequencing applications, including single-cell RNA sequencing and exome sequencing. (See "Next-generation DNA sequencing (NGS): Principles and clinical applications" and 'Exome sequencing' below.)
Barcoding also refers to methods for determining the species of origin of a DNA sample on the basis of the DNA sequence itself. A clinical example is identification of the ingredients in an herbal preparation.
Dominant negative — Dominant negative alleles cause an abnormal phenotype or disease by a mechanism in which the abnormal gene product of the pathogenic allele interferes with the function of the normal gene product from the normal allele.
In contrast with most loss-of-function variants that confer phenotype only when both alleles carry a pathogenic variant (recessive inheritance), dominant-negative variants act dominantly, meaning that only a single allele with the variant is sufficient to cause the disease phenotype.
Double heterozygous — Heterozygous for two pathogenic variants at two separate genetic loci that together are sufficient to manifest a phenotype. Differs from compound heterozygous. (See 'Compound heterozygous' above.)
Embryonic stem cell (ESC) — A pluripotential (pluripotent) cell derived from the inner cell mass of an early-stage embryo that is capable of differentiating into cells derived from all three germ layers. (See "Overview of stem cells", section on 'What defines a stem cell?'.)
Enhancer — A region of DNA, upstream (5') or downstream (3') of a gene, that regulates gene expression. Enhancer function relies on binding of specific regulatory proteins (transcription factors). (See "Basic genetics concepts: DNA regulation and gene expression", section on 'Gene expression'.)
Enhancer hijacking — Use by one gene of another gene's enhancer. Often due to changes in three-dimensional genome structure that place one region of DNA adjacent to another region. Can explain the mechanism of certain diseases involving aberrant gene expression.
Epigenetic change — A modification of chromatin, often tissue-specific, that alters the expression of a gene without changing the nucleotide base sequence. Epigenetic changes may be stable in an individual but may be reversed during gametogenesis, development, certain environmental stresses such as starvation, or by certain medications such as histone deacetylase (HDAC) inhibitors. DNA methylation and histone acetylation are common epigenetic changes. Epigenetic changes form the mechanistic basis of imprinting. (See "Principles of epigenetics".)
Epistasis — The process by which variations at two or more genetic loci interact to produce phenotypes different from the individual effects of each variant. This process is often referred to as either a gene-gene interaction or a genetic modifier effect.
Exome — The portion of the genome that consists of exons. (See 'Exon' below.)
Exome sequencing — A sequencing strategy that provides the DNA sequence corresponding to all exons (which represent approximately 1 to 2 percent of the genome), excluding introns and other untranscribed genomic sequences. Though the complete exome includes noncoding 5' and 3' untranslated regions (UTRs), most exome sequencing assays are enriched for the coding exons and largely exclude the noncoding regions.
Exon — A segment of DNA that is transcribed and present in mature messenger RNA (mRNA). Many exons encode a portion of a protein, but noncoding exons also exist. This is in contrast with an intron, the DNA sequence between exons that does not become part of mature mRNA. Exons constitute only a small percent of the genome (approximately 1 to 2 percent).
Expressivity — A parameter used in genetic models that quantifies the degree to which an inherited characteristic is expressed in an organism.
Expression quantitative trait locus (eQTL) — A polymorphic genetic region that influences population-level variability of a target gene's transcript (RNA) abundance. The implicated regulatory variants can influence transcript abundance through a variety of mechanisms that either alter the timing or rate of transcription (RNA production) or transcript stability (RNA degradation). (See 'Quantitative traits and quantitative trait loci (QTL)' below.)
Analogous terms that refer to loci that influence abundances of other molecule types include protein quantitative trait loci (pQTL), metabolomic quantitative trait loci (mQTL), and methylation quantitative trait loci (meQTL).
Frameshift — A frameshift is a DNA sequence variant that results from an insertion or deletion of a number of bases that is not divisible by three, resulting in a shift of the reading frame (figure 4) and thus altering the entire downstream protein sequence.
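The effect on the reading frame can be seen by splitting a sequence into codons before and after a deletion. The sketch below uses a hypothetical 15-base sequence and hypothetical helper names; it only illustrates the divisible-by-three rule and does not model translation.

```python
def split_codons(sequence: str) -> list:
    """Split a sequence into complete three-base codons from position 0."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


def delete_bases(sequence: str, start: int, length: int) -> str:
    """Remove `length` bases beginning at zero-based index `start`."""
    return sequence[:start] + sequence[start + length:]


def is_frameshift(indel_length: int) -> bool:
    """True if an insertion or deletion of this size shifts the reading frame."""
    if indel_length == 0:
        raise ValueError("indel_length must be nonzero")
    return abs(indel_length) % 3 != 0


reference = "ATGGCCTTTAAAGGG"  # hypothetical coding sequence
print(split_codons(reference))                      # ['ATG', 'GCC', 'TTT', 'AAA', 'GGG']
print(split_codons(delete_bases(reference, 3, 1)))  # ['ATG', 'CCT', 'TTA', 'AAG']
print(split_codons(delete_bases(reference, 3, 3)))  # ['ATG', 'TTT', 'AAA', 'GGG']
```

Deleting one base changes every downstream codon, whereas deleting three bases removes one codon and leaves the remaining codons intact.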
Fusion gene — A fusion gene is a functional gene product that results from the fusion of DNA segments from two physically distinct genes. The fusion occurs as a consequence of chromosomal rearrangements such as translocations, inversions, segmental deletions, or duplications. Examples include the BCR-ABL and the FIP1L1-PDGFRA oncogenes.
Gene — A gene is a unit of DNA sequence that encodes specific function. Classical definitions limit genes to those elements that code for proteins. However, non-protein coding genes (such as noncoding RNAs or pseudogenes) are also genes.
Gene editing — (See 'Genome editing' below.)
Genetic heterogeneity — Genetic heterogeneity refers to a phenomenon in which variants in different genes result in the same phenotype or disease. Examples include the multiple genetic causes of sensorineural deafness. This differs from allelic heterogeneity, in which multiple variants in the same gene can lead to the same phenotype. (See 'Allelic heterogeneity' above.)
Genetic polymorphism — A genetic polymorphism is a DNA segment for which two or more alternate forms can be found above a minimum threshold frequency (usually >1 percent) in a population. The common types of polymorphisms include single nucleotide variants (single base pair changes, also called single nucleotide polymorphisms [SNPs]), indels (insertion/deletion polymorphisms), or larger structural changes like copy number variants. Most commonly, genetic polymorphism refers to a common single base-pair change or single nucleotide polymorphism (SNP). (See 'Polymorphism' below and 'Single nucleotide polymorphism (SNP)' below.)
Genetic risk score — An estimate of an individual's genetic risk for a specific polygenic phenotype [3]. Genetic risk scores are calculated using the cumulative contribution of all known risk alleles carried by the individual. This is in contrast to polygenic risk scores, which model genetic risk using a larger number of loci, including many that do not meet genome-wide significance criteria in association studies. (See "Principles of complex trait genetics" and 'Polygenic risk (PGR) score' below.)
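One common way to express a cumulative contribution is a sum of risk allele counts, optionally weighted by a per-allele effect size. The sketch below assumes that approach; the variant labels, allele counts, and weights are hypothetical and do not come from any published score.

```python
def genetic_risk_score(risk_allele_counts: dict, weights: dict = None) -> float:
    """Sum risk allele counts (0, 1, or 2 per variant), optionally weighted."""
    score = 0.0
    for variant, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{variant}: count must be 0, 1, or 2")
        # Assumption: an unweighted score treats every risk allele equally.
        weight = 1.0 if weights is None else weights[variant]
        score += weight * count
    return score


# Hypothetical individual genotyped at three risk variants.
counts = {"variant_1": 2, "variant_2": 1, "variant_3": 0}
effect_sizes = {"variant_1": 0.5, "variant_2": 0.25, "variant_3": 1.0}

print(genetic_risk_score(counts))                # 3.0
print(genetic_risk_score(counts, effect_sizes))  # 1.25
```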
Genome editing — Genome editing (or gene editing) refers to the use of nucleases to introduce specific changes at defined sites in a gene or genome. Genome editing is used as a tool for genetic perturbation in research.
Therapeutic applications for inherited genetic disorders are in early use or under investigation. (See "Overview of gene therapy, gene editing, and gene silencing", section on 'Gene editing'.)
Genome-wide association study (GWAS) — A GWAS (pronounced "gee-wass") is a type of genetic mapping study that assesses for evidence of association between genetic variants and heritable traits across the entire genome. Typical studies consist of genotyping hundreds of thousands of common SNPs, using DNA microarrays or other methodologies, in large case-control populations, with the goal of identifying specific risk alleles that are more prevalent in cases than in controls.
Genotype — A genotype is the combination of two alleles at one genomic location (locus) or base pair in an individual (figure 7).
Germline — Germline refers to the gametes (ova and spermatozoa and their precursors) that have the capacity to give rise to offspring.
In the context of pathogenic variants, germline refers to variation that arose in germline cells as opposed to somatic variants that were acquired in a specific tissue.
Haploid — Cells or organisms possessing one copy of each autosomal chromosome and one sex chromosome (and therefore effectively one copy of each gene). Gametes (ova and spermatozoa) are haploid. Fertilization of a haploid ovum by a haploid sperm results in formation of a diploid embryo. Many microorganisms are haploid. (See 'Diploid' above.)
Haploinsufficiency — Having an abnormal phenotype due to inactivation of one allele by a pathogenic (deleterious) variant. In a diploid cell, the single remaining functional copy of the gene does not produce sufficient protein, resulting in disease.
Haplotype — The physical combination or sequence of alleles present on a single chromosome. By definition, alleles on one haplotype are in "cis" (figure 7).
Hemizygous — The state of carrying only one copy of a genomic region due to deletion or altered function of the corresponding region on the other chromosome. Carriers of large-scale deletions are hemizygous. Hemizygosity can confer disease if having one normally functioning copy is insufficient for normal cellular function (haploinsufficiency), but if a single functional copy of the gene is sufficient for normal cellular function, the phenotype may not be abnormal. Hemizygosity can also confer disease if a pathogenic variant is present within the hemizygous region. (See 'Haploinsufficiency' above.)
Heritability — The proportion of phenotypic variation that is explained by genetic (or in some cases, epigenetic) factors.
Heteroplasmy — The occurrence in a single cell of more than one population of mitochondrial genomes.
Identity by descent — Alleles are identical by descent if they can be traced back to a common ancestor. Identity by descent is a more stringent classification than identity by state (see 'Identity by state' below). Identity by descent is the basis for establishing linkage.
Identity by state — Alleles are identical by state if the assay being used to distinguish alleles determines that they are identical.
Imprinting — Gamete-specific gene silencing, in which only the allele from the mother or only the allele from the father is expressed, leading to observed parent-of-origin effects in offspring. Examples include the genetic locus responsible for Prader-Willi syndrome and Angelman syndrome, and the GNAS gene, encoding the guanine nucleotide binding protein alpha stimulating subunit involved in pseudohypoparathyroidism. (See "Prader-Willi syndrome: Clinical features and diagnosis" and "Congenital cytogenetic abnormalities".)
Indel — A class of variants displaying an extra copy(ies) or a missing copy(ies) of a short genetic or chromosomal sequence (figure 9). (See "Chromosomal translocations, deletions, and inversions".)
Induced pluripotent stem cell (iPSC) — A pluripotent cell derived by in vitro reprogramming of a somatic cell. iPSCs are capable of both self-renewal and differentiation to mature lineages. (See "Overview of stem cells", section on 'Induced pluripotent stem (iPS) cells'.)
Intron — A segment of DNA between two exons that is transcribed to pre-mRNA but is removed through the process of splicing and therefore is not part of mature mRNA. Introns may contain regulatory DNA or serve other functions.
Inversion — A chromosomal rearrangement characterized by rotation and reintegration of a DNA segment, resulting in an inverted orientation of the segment relative to its typical state.
Karyotype — Karyotype refers to the complete set of chromosomes in an organism or tumor. Karyotype is determined by visual examination and counting of condensed chromosomes from several representative cells to determine the number of copies of each chromosome as well as any translocations, sub-chromosomal deletions, or duplications. Determination of the karyotype of a tumor is also called "cytogenetic analysis." (See "Tools for genetics and genomics: Cytogenetics and molecular genetics".)
Likely benign variant — (See 'Variant' below.)
Likely pathogenic variant — (See 'Variant' below.)
Linkage — Genetic linkage refers to the relationship that exists between two loci that violate the Mendelian law of independent assortment and therefore segregate in kindreds in a non-random fashion. Non-independent assortment results because linked loci reside together on the same chromosome (ie, they are syntenic). However, many pairs of syntenic loci are not linked. Linkage therefore implies the linked loci are in close physical proximity to each other. The genetic linkage distance is derived from the recombination fraction and is expressed in centiMorgans (cM). This distance is not necessarily proportional to the physical distance (base pairs) separating the loci.
Linkage analysis — Method of gene mapping that tests for the non-random segregation of a phenotype (trait, disease) with discrete chromosomal segments. Identification of linked regions implies the existence of genetic variants within the linked region. The process of disease-gene or trait-gene identification within this region is termed positional cloning.
Linkage disequilibrium — The non-random association of alleles at two or more loci in a population. Linkage disequilibrium is present when the observed haplotype distribution of two or more markers in a population is significantly different from the expected haplotype distribution (which can be derived from the cross-product of observed allele frequencies) (figure 10).
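The comparison of observed and expected haplotype frequencies can be sketched with the coefficient D (the observed haplotype frequency minus the product of the two allele frequencies) and the derived measure r-squared. The sketch assumes two biallelic loci with known frequencies; it quantifies the deviation but does not test its statistical significance. Function and parameter names are hypothetical.

```python
from typing import Tuple


def linkage_disequilibrium(p_AB: float, p_A: float, p_B: float) -> Tuple[float, float]:
    """Return (D, r_squared) for alleles A and B at two biallelic loci.

    p_AB is the observed frequency of the haplotype carrying both A and B;
    p_A and p_B are the observed allele frequencies.
    """
    if not (0 < p_A < 1 and 0 < p_B < 1):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    expected = p_A * p_B  # haplotype frequency expected under independence
    d = p_AB - expected
    r_squared = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d, r_squared


# Hypothetical case 1: A always travels with B (complete disequilibrium).
print(linkage_disequilibrium(p_AB=0.5, p_A=0.5, p_B=0.5))   # (0.25, 1.0)
# Hypothetical case 2: observed equals expected (equilibrium).
print(linkage_disequilibrium(p_AB=0.25, p_A=0.5, p_B=0.5))  # (0.0, 0.0)
```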
Locus — A locus (plural, loci) is a specific chromosomal or genomic location.
LOD score — The "logarithm of the odds" (LOD) score is a quantitative measure of the statistical evidence of linkage between two genes. The LOD score depends on both the probability of co-segregation of the two genes during meiosis and the size and structure of the population in which the linkage analysis is performed.
By convention, LOD scores >3 are considered to be evidence of linkage in many human studies.
In some studies, the threshold LOD scores for linkage at various statistical significance levels (typically genome-wide alpha ≤0.05 and ≤0.01) can be established empirically via permutation testing.
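A LOD score can be sketched for the simplest textbook situation, in which each meiosis can be scored unambiguously as recombinant or non-recombinant. The score is then the base-10 logarithm of the likelihood at a proposed recombination fraction (theta) divided by the likelihood under no linkage (theta of 0.5). This phase-known assumption, the function name, and the counts are illustrative only; real pedigree analyses are considerably more involved.

```python
import math


def lod_score(recombinants: int, nonrecombinants: int, theta: float) -> float:
    """log10 of L(theta) / L(0.5) for phase-known, fully informative meioses."""
    if recombinants < 0 or nonrecombinants < 0:
        raise ValueError("counts must be non-negative")
    if not 0 <= theta <= 0.5:
        raise ValueError("theta must be between 0 and 0.5")

    def log10_power(probability: float, count: int) -> float:
        # log10(probability ** count), with 0 ** 0 treated as 1.
        if count == 0:
            return 0.0
        if probability == 0:
            return float("-inf")
        return count * math.log10(probability)

    total = recombinants + nonrecombinants
    log_linked = log10_power(theta, recombinants) + log10_power(1 - theta, nonrecombinants)
    log_unlinked = total * math.log10(0.5)
    return log_linked - log_unlinked


# Hypothetical family data: 10 informative meioses with no recombinants.
print(round(lod_score(0, 10, theta=0.0), 2))  # 3.01
```

In this hypothetical example, ten informative meioses without a recombinant are just enough to exceed the conventional threshold of 3.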
Lyonization — (See 'X-inactivation' below.)
Manhattan plot — A type of plot used to display results of a GWAS (see 'Genome-wide association study (GWAS)' above). Genomic coordinates are shown on the X-axis and the negative logarithm of the P-value for each SNP on the Y-axis. SNPs with the strongest association will have the lowest P-values, and hence the tallest profiles. Named for its resemblance to the skyline of Manhattan in the United States (figure 11).
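The Y-axis transformation can be sketched as follows, assuming the base-10 logarithm that is conventionally used. The SNP labels and P-values are hypothetical, and the plotting step itself is omitted.

```python
import math


def neg_log10_p(p_value: float) -> float:
    """Height of a SNP on the Y-axis: the negative base-10 log of its P-value."""
    if not 0 < p_value <= 1:
        raise ValueError("p_value must be greater than 0 and at most 1")
    return -math.log10(p_value)


# Hypothetical association results for three SNPs.
p_values = {"snp_1": 0.05, "snp_2": 1e-8, "snp_3": 0.5}
heights = {snp: neg_log10_p(p) for snp, p in p_values.items()}

# The SNP with the lowest P-value has the tallest profile.
tallest = max(heights, key=heights.get)
print(tallest)  # snp_2
```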
Marker — A locus with alternative alleles that can be used in genetic mapping experiments.
Meiosis — The cell division process in germline cells by which the chromosomal complement is reduced from the diploid to the haploid number (figure 12). (See "Basic genetics concepts: Chromosomes and cell division", section on 'Meiosis'.)
Mendelian inheritance — A trait is said to have Mendelian inheritance if its genetic transmission can be explained by a Mendelian model of inheritance, such as autosomal dominant, autosomal recessive, or X-linked recessive or dominant inheritance. This is in contrast to non-Mendelian inheritance patterns such as digenic inheritance, or quantitative traits. (See 'Digenic inheritance' above.)
Mendelian randomization — A study design in which genotypes serve as a proxy for an epidemiologic exposure(s). The rationale for undertaking such investigations is that alleles segregate independently and are therefore immune to biases that cannot be overcome in observational studies.
The alleles included in the study must be associated with the exposure being tested, and several additional stringent assumptions also must be satisfied, as discussed separately. (See "Mendelian randomization".)
Metagenomics — The study of complex microbial populations (biomes) using genomic approaches. Human tissues such as the skin and gut have multiple heterogeneous populations of microorganisms that differ from each other with respect to phyla composition and abundance in a tissue- and site-specific manner. These abundances can be estimated by sequencing the mixed population of microorganisms, either through targeted sequencing of 16S ribosomal RNA (rRNA) genes (for bacterial characterization) or whole-genome approaches (for bacteria, viruses, fungi, and other organisms).
Methylation — The addition of methyl groups to cytosine bases in DNA or to lysine residues in the tails of histones. Methylation followed by deamination is a major pathway for mutation of cytosine to thymine. Methylation is a form of epigenetic regulation that correlates with reduced gene transcription and is an important mechanism for gene imprinting and X-inactivation. (See 'Epigenetic change' above and 'Imprinting' above and 'X-inactivation' below.)
Micro-RNA (miR) — A small noncoding RNA that regulates the stability or translation of a set of messenger RNAs (mRNAs).
Microsatellite — A tandem array of short sequences of DNA (typically two to four bases). Microsatellites are numerous and widely distributed in the genome. There is often polymorphism in their length, making them useful markers in genetic studies, including genome mapping and pedigree-based linkage analysis. Microsatellites are also known as short tandem repeat markers (STRs) or short tandem repeat polymorphisms (STRPs). They constitute a prominent class of indel variants. (See 'Indel' above.)
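As a rough illustration of the length polymorphism described above, the following Python sketch counts the longest uninterrupted run of a repeat unit in a DNA string. The function name `longest_repeat_run` and both example sequences are hypothetical and chosen only for illustration; laboratory STR genotyping relies on fragment sizing or dedicated analysis software rather than simple string matching.

```python
import re


def longest_repeat_run(sequence: str, unit: str) -> int:
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(unit.upper())})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(sequence.upper())]
    return max(runs, default=0)


# Two hypothetical alleles that differ only in the number of CA repeats.
allele_1 = "GGCACACACATT"      # 4 copies of CA
allele_2 = "GGCACACACACACATT"  # 6 copies of CA

print(longest_repeat_run(allele_1, "CA"), longest_repeat_run(allele_2, "CA"))
```

Two alleles with different repeat counts at the same locus are what make the marker informative in linkage or mapping studies.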
Mitochondrial genome — The genetic material carried within mitochondria, known as mitochondrial DNA (mtDNA). At fertilization, all the mitochondria are derived from the egg, so mitochondrial genes display maternal inheritance. (See "Mitochondrial regulation and functions", section on 'Mitochondrial genetics'.)
Mitosis — The process of cell division occurring in somatic cells, in which each daughter cell receives a full chromosome complement. (See "Basic genetics concepts: Chromosomes and cell division", section on 'Somatic cell division'.)
Monogenic trait/monogenic disease — Trait or disease with inheritance that can be explained by a single gene. (See "Inheritance patterns of monogenic disorders (Mendelian and non-Mendelian)".)
Monogenic traits are contrasted with polygenic and complex diseases. (See 'Polygenic trait/polygenic disease' below.)
Mosaicism — When referring to an individual, a state in which two or more populations of genetically distinct cells are present that arose from a single fertilized egg. Mosaicism can arise through a variety of mechanisms including chromosome nondisjunction, anaphase lag, endoreplication, and post-fertilization mutation. (See "Inheritance patterns of monogenic disorders (Mendelian and non-Mendelian)", section on 'Mosaicism'.)
A common instance occurs in Klinefelter syndrome, in which post-fertilization nondisjunction causes some but not all cells to harbor an XXY karyotype. (See 'Karyotype' above.)
Contrasted with chimerism. (See 'Chimerism' above.)
Mutation, mutant — An alteration in a gene, or the altered version of a gene, typically (but not always) in a manner that affects function (eg, a "silent" mutation changes the DNA sequence but not the protein sequence). These terms are used in several different senses, depending on context:
●In human genetics, a mutation is a genetic variant of low population frequency, in contrast with a polymorphism (often a single nucleotide polymorphism [SNP]) with an allele frequency of 1 percent or greater. (See 'Single nucleotide polymorphism (SNP)' below.)
Types of gene mutations include:
•Nonsense mutation – Creates premature stop codon (figure 3).
•Missense mutation – Creates amino acid change (figure 2).
•Synonymous mutation – Does not change protein sequence.
•Frameshift – Shifts the reading frame of the DNA, in turn altering the triplet codons for protein translation, creating an entirely new protein sequence downstream of the mutation (figure 4).
●For human traits and diseases, there has been a shift in terminology to use the term variant rather than mutation; variants are further classified according to their pathogenicity. (See 'Variant' below.)
This revision in terminology was based on a 2015 guideline [4]; prior to this terminology change, mutation was commonly used to imply a change associated with abnormal function (eg, sickle cell mutation of the beta globin gene). Mutation remains appropriate in certain contexts such as when referring to a pathophysiologic process or to specific changes in a region of DNA (or less commonly, a protein). The rationale for preferring "variant" over "mutation" when discussing human genetic variation is discussed separately. (See "Secondary findings from genetic testing", section on 'Definitions and classification of variants'.)
●When used in the context of inheritance, a de novo mutation implies a recent sequence change (either germline or somatic), in contrast with inheritance from a carrier parent. (See 'De novo mutation' above.)
●When used to refer to a non-human organism or population of non-human organisms, a mutant refers to a population that harbors a specific, atypical variant (eg, antibiotic-resistant mutants). This term should not be used for people.
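The single-codon mutation types listed in this entry (nonsense, missense, synonymous) can be told apart by translating the reference and altered codons. The Python sketch below does this with the standard genetic code; the function name `classify_substitution` is hypothetical, the "stop-loss" label is an addition not defined in this entry, and frameshifts are not handled because they change codon boundaries rather than a single codon.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Classify a change within one DNA codon by its effect on the protein."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if alt_aa == ref_aa:
        return "synonymous"   # protein sequence unchanged
    if alt_aa == "*":
        return "nonsense"     # premature stop codon created
    if ref_aa == "*":
        return "stop-loss"    # assumption: label not defined in this glossary
    return "missense"         # one amino acid replaced by another


print(classify_substitution("GAG", "GTG"))  # glutamate to valine
print(classify_substitution("GAG", "GAA"))  # glutamate to glutamate
print(classify_substitution("GAG", "TAG"))  # glutamate to stop
```

The first call mirrors the glutamate-to-valine substitution of the sickle cell variant mentioned above.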
Mutation fraction — Synonymous with allelic fraction or allelic ratio. (See 'Allelic fraction' above and 'Allelic ratio' above.)
Next-generation sequencing — Any of several high-throughput DNA sequencing methods that rely on parallel analysis of multiple DNA fragments. Common approaches include whole genome sequencing and exome sequencing. These methods have resulted in dramatic decreases in the cost and time needed for sequencing projects and are used in many clinical and research settings. (See "Next-generation DNA sequencing (NGS): Principles and clinical applications".)
Noncoding variant — Genetic variation that does not map to gene regions that code for protein. These variants can be functional if they reside in and disrupt functional elements, such as noncoding RNA sequences or regulatory sites (promoters, enhancers, suppressors, or splice-sites).
Nucleic acid vaccine — Refers to a synthetic nucleic acid sequence (either DNA or RNA) packaged in a lipid nanoparticle that transfects human cells to produce antigenic viral proteins to induce an antiviral immune response.
Examples include FDA-approved RNA vaccines for coronavirus disease 2019 (COVID-19). The use of nucleic acids in the vaccines facilitates modifications when new viral variants arise. (See "COVID-19: Vaccines".)
Oncogene — Gene that contributes to the production of cancer. Oncogenes typically act in a dominant manner (an oncogenic mutation at one allele is sufficient to promote tumorigenesis). In contrast, tumor suppressor genes typically act in a recessive manner. (See 'Tumor suppressor gene' below.)
Pathogenic variant — Genetic change associated with disease or strongly suspected of being associated with disease. (See 'Variant' below.)
Pedigree — A diagram or other graphic representation of a kindred that shows the relationships among relatives, sex of each individual, and presence or absence of one or more genetic conditions in each individual (figure 13).
Penetrance — The probability that an individual harboring a pathogenic variant will develop the associated disease or condition. Incomplete (or variable) penetrance occurs when an individual with a pathogenic variant does not manifest features of the disorder. There are many causes of incomplete penetrance, including absence of environmental or genetic co-factors, epigenetic effects such as imprinting, sex-specific effects, or age-related expression differences. (See "Inheritance patterns of monogenic disorders (Mendelian and non-Mendelian)", section on 'Penetrance and expressivity'.)
Phenotype — A characteristic of an organism (as opposed to the organism's genotype). Phenotypes are sensitive to the assays used to assign or measure them. They may be categorical, such as presence or absence of a disease; or quantitative, such as systolic blood pressure. Further complexities in phenotypic description involve the physiological state of the organism at the time of measurement, age, or use of provocative stimuli. Most phenotypes are variable, and understanding sources of variability, both instrumental and biologic, is important for interpretation.
Pleiotropy — The association of variant(s) in a single gene with multiple phenotypic effects, often in different tissues or organs. An example is Marfan syndrome, in which mutations in the fibrillin 1 (FBN1) gene can cause cardiac, ocular, and connective tissue findings.
Ploidy — The number of sets of chromosomes present in an organism or cell. Ploidy varies among different organisms, including those that are always haploid (eg, bacteria), either haploid or diploid (eg, Saccharomyces species [yeast]), consistently diploid (eg, mammals) (see 'Diploid' above), or polyploid (eg, hexaploid wheat). Different tissues in multicellular organisms may have different ploidies (eg, mammalian hepatocytes are predominantly tetraploid). The gametes (ova and sperm) are haploid (see 'Haploid' above). Ploidy may change in illness (eg, cardiomyocytes display polyploidy in the setting of heart failure). The designation of ploidy is based on the predominant ploidy of cells in the organism.
Polygenic risk (PGR) score — An estimate of an individual's genetic risk for a specific polygenic phenotype that is derived from contributions of alleles at multiple loci, up to thousands. Allele-specific contributions are estimated using specialized linear regression methods. The scores are typically generated in a model-building population, then validated in additional independent test populations.
PGR is synonymous with polygenic score and contrasts with genetic risk score, which calculates the contribution of the known risk alleles carried by an individual. (See 'Genetic risk score' above and "Principles of complex trait genetics", section on 'Polygenic risk scores'.)
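In its simplest form, a polygenic score is a weighted sum: each risk allele count (0, 1, or 2 copies) is multiplied by a per-allele weight and the products are added. The Python sketch below shows that arithmetic only. The variant identifiers, weights, and genotype are invented for illustration, and real scores involve steps not shown here (eg, weight estimation, validation in independent populations).

```python
def polygenic_score(weights: dict, dosages: dict) -> float:
    """Sum of (per-allele weight x number of copies carried) across loci.

    Loci missing from `dosages` are treated as 0 copies, which is a
    simplifying assumption of this sketch.
    """
    return sum(weight * dosages.get(variant, 0) for variant, weight in weights.items())


# Hypothetical per-allele weights from a model-building population.
weights = {"variant_A": 0.20, "variant_B": -0.10, "variant_C": 0.05}
# Hypothetical individual: copies of the scored allele carried at each locus.
dosages = {"variant_A": 2, "variant_B": 1, "variant_C": 0}

print(round(polygenic_score(weights, dosages), 3))
```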
Polygenic trait/polygenic disease — In contrast to monogenic diseases, polygenic diseases are those for which the inherited trait(s) is explained by more than one gene. (See 'Monogenic trait/monogenic disease' above.)
Polymerase chain reaction (PCR) — A method of specifically amplifying a unique target sequence (DNA or RNA) in the laboratory. PCR uses specific primers and repeated cycles of heating and cooling with a heat-stable DNA polymerase to replicate the template material exponentially. (See "Polymerase chain reaction (PCR)".)
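The exponential replication described above can be expressed as copies = initial copies × (1 + efficiency)^cycles, where an efficiency of 1.0 corresponds to perfect doubling in every cycle. The Python sketch below applies this formula; the function name and the `efficiency` parameter are illustrative assumptions, and the model ignores the plateau that real reactions reach as reagents are consumed.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized template count after a given number of PCR cycles."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


# One template molecule, 30 cycles, perfect doubling: 2**30 copies.
print(pcr_copies(1, 30))
# Same reaction at an assumed 90 percent per-cycle efficiency.
print(pcr_copies(1, 30, efficiency=0.9))
```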
Polymorphism — Polymorphism can refer to a genetic polymorphism. (See 'Genetic polymorphism' above.)
It can also refer to any biologic marker (DNA, RNA, or protein) with two or more states. Protein polymorphisms (varying amino acid sequence) can result from DNA polymorphisms or from differential RNA splicing (different isoforms), which in turn can result from sequence variation, epigenetic phenomena, or temporal/spatial/environmental differences.
Quantitative traits and quantitative trait loci (QTL) — A quantitative trait locus (QTL) is a genomic region linked or associated with a "quantitative" trait.
In contrast to traits that have discrete values (such as male or female), quantitative traits refer to quantifiable phenotypes that vary along a continuum of values rather than falling into distinct categories.
Examples include clinical phenotypes (blood pressure or anthropometric measures such as height), intermediate disease phenotypes (serum immunoglobulin E levels), and molecular traits (serum biomarker levels, gene or protein expression levels). Each QTL explains a portion of the variance of a trait.
Another example is the level of fetal hemoglobin (Hb F). (See "Fetal hemoglobin (Hb F) in health and disease", section on 'Quantitative trait loci associated with HBG expression'.)
Quantitative traits are sometimes referred to as "complex" traits, reflecting the fact that multiple genes, the environment, and gene-environment interactions all contribute to an individual's trait value. (See "Principles of complex trait genetics".)
Read depth — In genomic or gene sequencing, the number of independent times each base in a targeted region has been sequenced. Typically expressed as an average X coverage (for example, 20X = an average of 20 sequence reads per base). A minimum read depth of 30X is often required for clinical-grade sequencing. (See "Next-generation DNA sequencing (NGS): Principles and clinical applications".)
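Average coverage can be approximated as (number of reads × read length) ÷ size of the targeted region. The Python sketch below applies that arithmetic under the simplifying assumptions that every read maps within the target and that reads are evenly distributed; the function name and the example numbers are illustrative only.

```python
def average_coverage(num_reads: int, read_length: int, target_size: int) -> float:
    """Approximate mean read depth (the 'X' coverage) over a targeted region."""
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    return num_reads * read_length / target_size


# Hypothetical run: 1,000,000 reads of 150 bases over a 5,000,000-base target.
depth = average_coverage(num_reads=1_000_000, read_length=150, target_size=5_000_000)
print(f"{depth:.0f}X")
```

Because this is an average, individual bases may be covered well above or below the reported figure.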
Reading frame — The starting point in translating the DNA sequence to protein. Since each codon includes three nucleotides, the reading frame can be initiated at one of three nucleotides. Offsetting the reading frame (such as by a frameshift) changes the amino acid composition of the encoded protein (figure 4). (See 'Expression quantitative trait locus (eQTL)' above.)
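The three possible reading frames, and the effect of a frameshift, can be seen by splitting a short sequence into codons from each starting position. The Python sketch below uses a made-up nine-base sequence and a hypothetical helper named `codons_in_frame`; incomplete codons at the end of the sequence are dropped.

```python
def codons_in_frame(sequence: str, frame: int) -> list:
    """Split `sequence` into complete codons, starting at offset 0, 1, or 2."""
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1, or 2")
    trimmed = sequence[frame:]
    return [trimmed[i:i + 3] for i in range(0, len(trimmed) - 2, 3)]


sequence = "ATGGCCTAA"  # hypothetical example sequence
for frame in (0, 1, 2):
    print(frame, codons_in_frame(sequence, frame))

# Deleting a single base (here the fourth) shifts every downstream codon.
shifted = sequence[:3] + sequence[4:]
print(codons_in_frame(shifted, 0))
```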
Recombinant — Recombinant has different meanings in different contexts. For inheritance patterns, recombinant refers to offspring whose genotype and phenotype combinations differ from their parents, implying genetic recombination between the loci under study.
For laboratory techniques, recombinant technologies (also called genetic engineering) are molecular genetic approaches that use the process of homologous recombination to manipulate genotypes for experimental purposes. Examples include transgenic models where specific genetic loci are either knocked out (removed) or knocked in (introduced) to enable study of the locus; recombinant inbred mouse strains; and recombinant viral transfection for synthesis of protein.
Recombination — The process of exchanging DNA sequence between two homologous chromosome regions. Mandatory recombination occurs at least once per aligned chromosome pair during meiosis. The exchange results in the creation of novel haplotypes that are combinations of the parental haplotypes present in a diploid cell. (See 'Meiosis' above.)
Exchange of unequal sequence content (non-homologous recombination) can introduce DNA gains and losses of thousands or millions of bases. These gains and losses result in structural genetic variation and copy number variants (CNVs). (See 'Copy number variation (CNV)' above.)
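The homologous exchange described at the start of this entry can be pictured as swapping the segments of two parental haplotypes beyond a breakpoint. The Python sketch below models a single crossover between two equal-length haplotypes written as strings of allele symbols; the function name and the two-locus example are illustrative assumptions, and unequal (non-homologous) exchange is not modeled.

```python
def crossover(hap_1: str, hap_2: str, breakpoint: int) -> tuple:
    """Return the two recombinant haplotypes from one crossover at `breakpoint`."""
    if len(hap_1) != len(hap_2):
        raise ValueError("homologous haplotypes must have the same length")
    if not 0 <= breakpoint <= len(hap_1):
        raise ValueError("breakpoint must lie within the haplotype")
    return (
        hap_1[:breakpoint] + hap_2[breakpoint:],
        hap_2[:breakpoint] + hap_1[breakpoint:],
    )


# Parental haplotypes AB and ab, with a crossover between the two loci.
print(crossover("AB", "ab", 1))
```

A crossover that falls outside the interval between the two loci (breakpoint 0 or 2 in this example) leaves the parental allele combinations intact.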
Repulsion (in trans) — The state in which alleles at two distinct loci are on physically distinct chromosomal strands (ie, in trans). By definition, these variants are not part of the same haplotype (figure 7). In the example of dominant and recessive alleles, repulsion gametes formed are Ab and aB.
The opposite relationship is coupling. (See 'Coupling (in cis)' above.)
Risk allele — An allele associated with a disease phenotype that typically acts in combination with other genetic or environmental factors. Though a risk allele is often that which is least common (ie, the minor allele), risk alleles associated with some complex traits may be the more common allele.
RNA — RNA (ribonucleic acid) is a polymer consisting of a phosphoribose backbone and the bases adenine, guanine, uracil, and cytosine as side chains. Many viruses use RNA rather than DNA as the principal form of genetic information (RNA viruses).
There are several different types of RNAs that have diverse structures and functions.
●mRNA – Messenger RNA (mRNA) is transcribed from the coding strand of DNA and transmits the genetic information to the protein synthesis machinery, serving as an intermediary between a gene's DNA sequence and its encoded protein.
●rRNA – Ribosomal RNA (rRNA) is an integral component of ribosomes, the organelles responsible for protein synthesis.
●tRNA – Transfer RNAs (tRNAs) carry specific amino acids and recognize the corresponding codons of the mRNA during protein synthesis.
●Regulatory RNAs – There are several types of regulatory RNAs such as micro RNAs (miRs), long noncoding RNAs (lncRNAs), small nuclear RNAs (snRNAs, components of the ribonucleoproteins involved in mRNA splicing), and PIWI-interacting RNAs (piRNAs). PIWI (P-element-induced wimpy testis) designates a class of proteins that may regulate stem cells and appear to be aberrantly expressed in some cancers [5]. (See "Basic genetics concepts: DNA regulation and gene expression", section on 'Transcription'.)
RNA interference (RNAi) — A ubiquitous intracellular process mediated by small RNA species, whereby specific mRNAs are targeted for editing, degradation, or clearance. RNAi has important roles in the regulation of gene expression, developmental processes, cellular defense, and epigenetic effects.
RNAi technology (also called antisense technology) has been used in the laboratory to test the function of a gene by preventing its expression. It has been used clinically as a means of posttranscriptional gene silencing to reduce the expression of viral or cancer genes, or to lower cholesterol. The specific therapies are sometimes referred to as antisense oligonucleotides (ASOs; AS-ODNs). Other therapeutic applications are in the fields of hematology, oncology, and neurodegenerative disease. (See "Overview of gene therapy, gene editing, and gene silencing", section on 'RNA interference'.)
Sequencing — Determination of the nucleotide base sequence of a gene or collection of genes that determines the amino acid sequence of a protein. (See "Next-generation DNA sequencing (NGS): Principles and clinical applications".)
Sex chromosomes — The X and Y chromosomes, which determine the sex of the individual (females are XX; males are XY).
Sex-linked — A gene is sex-linked if it is located on a sex chromosome rather than on an autosome. A gene's inheritance pattern is also referred to as sex-linked if the pattern corresponds to that of known sex-linked genes (rather than autosomal genes). (See 'Autosomal' above.)
Silencing — Regulation that prevents the expression of a gene. Mechanisms of silencing include gene methylation (see 'Methylation' above), destruction of messenger RNA, or prevention of protein translation. (See 'RNA interference (RNAi)' above.)
Single nucleotide polymorphism (SNP) — A single nucleotide polymorphism (SNP; pronounced "snip") is a polymorphism (difference in base sequence) that affects a single base pair. This terminology was previously used to refer to variation that had a population frequency of at least 1 percent. The term SNP is commonly used in research, such as in genome-wide association studies (see 'Genome-wide association study (GWAS)' above). In clinical diagnostic testing, the term "variant" with a qualifier about pathogenicity is preferred (although use is inconsistent).
SNP may also be used to refer to polymorphisms in a testing platform such as a SNP array. (See "Genetic association and GWAS studies: Principles and applications", section on 'Single nucleotide polymorphisms' and "Tools for genetics and genomics: Cytogenetics and molecular genetics", section on 'Allele-specific oligonucleotide hybridization' and "Tools for genetics and genomics: Cytogenetics and molecular genetics", section on 'Array comparative genomic hybridization'.)
Somatic — Referring to tissues that are not within the germline. Somatic variants arise in somatic tissues and are therefore not passed from parent to offspring. Somatic variation is common in cancer.
Structural genetic variation — A term that encompasses a variety of large-scale genomic aberrations, including segmental rearrangements, translocations, or inversions and copy-number variants (CNVs). (See 'Copy number variation (CNV)' above.)
Large rearrangements or deletions can be visualized through karyotyping. Smaller variants, particularly CNVs, segmental duplications, and interchromosomal interstitial rearrangements, are assessed by array comparative genomic hybridization (array CGH) or SNP arrays. (See "Tools for genetics and genomics: Cytogenetics and molecular genetics", section on 'Genotyping microarrays'.)
Syntenic — Describing genetic loci that reside on the same chromosome. As an example, the genes causing Birt-Hogg-Dubé syndrome (Folliculin [FLCN], at chromosome 17p11) and early-onset breast cancer (BRCA1, at chromosome 17q21) are syntenic to each other on chromosome 17. However, because they are far apart from each other, they are not genetically linked. (See 'Linkage' above.)
Telomere — Region at the ends of a chromosome that prevents the loss of genetic material or the accidental fusion of two chromosomes together during cell division. Telomeres of chromosomes in most cells shorten as an individual ages. Telomere length is maintained by the enzyme telomerase. (See 'Telomerase' below.)
Telomerase — Multicomponent enzyme that extends the length of telomeres. Telomerase variants are seen in some inherited telomere syndromes. (See "Dyskeratosis congenita and other telomere biology disorders".)
Translocation — A translocation is a structural chromosomal abnormality whereby chromosome segments are exchanged (swapped) between two non-homologous chromosomes.
This form of rearrangement can be balanced, when the translocation does not result in any significant loss or gain of genetic material in the resultant gamete or cell; or unbalanced, when there is a gain or loss of genetic material in the resultant gamete or cell. (See "Chromosomal translocations, deletions, and inversions", section on 'Translocations'.)
Tumor suppressor gene — A tumor suppressor gene is a gene that protects against the development or growth of tumors. Tumor suppressor genes typically act in a recessive manner (both normal copies must be lost for a tumor to develop). In contrast, oncogenes typically act in a dominant manner. (See 'Oncogene' above.)
Uniparental disomy — The inheritance of two copies of a chromosome (or part of a chromosome) from one parent, and no copy from the other parent, due to nondisjunction errors during the first or second phase of meiosis, or to chromosomal alterations in early fetal development.
Nondisjunction during the first phase of meiosis (meiosis I) will result in inheritance of each of the grandparental chromosomes from one parent, termed "heterodisomy." In contrast, nondisjunction during meiosis II results in inheritance of two identical copies of one grandparental chromosome, termed "isodisomy."
Variant — Used to refer to a specific change in either DNA or protein sequence.
In microbiology, a variant refers to an organismal isolate whose genetic sequence varies from that of its reference organism. (See 'Viral variant' below.)
For germline variants, the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology have recommended use of a five-tier terminology system for the clinical classification of genetic variants (table 1), consisting of the following designations [4]:
●Pathogenic variant (PV) – A disease-causing variant, as determined by very strong genetic and experimental evidence, including consistent familial co-segregation with disease and definitive functional studies.
●Likely pathogenic variant (LPV) – A variant with strong, but not definitive, evidence of pathogenicity based on its similarity to known pathogenic variants, co-segregation with disease in families or populations, and functional evidence.
●Variant of uncertain significance (VUS) – A variant for which the criteria for the other four categories are not met, or for which contradictory lines of evidence support both benign and pathogenic classifications. Also called variant of unknown significance.
●Likely benign variant – A variant with multiple supporting (but not conclusive) lines of evidence to suggest it is not disease-causing.
●Benign variant – A variant with conclusive evidence that it is not disease-causing, as determined typically (but not only) by a high prevalence of the variant in the general (healthy) population, at a prevalence that exceeds that of the suspected disease.
For clinical testing and management, these terms are preferred over "mutation." Mutation remains appropriate in certain contexts such as when referring to a pathophysiologic process or to specific changes in a region of DNA (or less commonly, a protein). (See 'Mutation, mutant' above.)
Additional information about this classification and its application to genomic testing is presented separately. (See "Secondary findings from genetic testing", section on 'Definitions and classification of variants'.)
Variant allele frequency — See allele frequency. (See 'Allele frequency' above.)
Variant of uncertain significance (VUS) — A classification term used in clinical DNA sequencing reports to signify genetic polymorphisms (variants) for which the pathogenicity (likelihood of causing disease) cannot be determined easily and that cannot be readily classified as "pathogenic," "likely pathogenic," "benign," or "likely benign." Also called variant of unknown significance. (See 'Variant' above.)
Viral variant — A viral isolate with a genome sequence that differs from that of the reference virus, regardless of whether the sequence variant alters the virus's phenotype. A viral strain is a viral variant with a sequence change that confers a unique viral phenotype (eg, altered replication rate, infectivity, or lethality). In contrast, variants with an impact limited to antigenicity are referred to as having different serotypes rather than as different strains. (See "COVID-19: Epidemiology, virology, and prevention", section on 'Viral evolution and variants of concern'.)
Whole genome sequencing — A sequencing strategy that provides the DNA sequence for the entire genome, including exons, introns, and other noncoding sequence. In contrast, exome sequencing only determines the sequence of gene-coding regions. (See 'Exome sequencing' above.)
X-inactivation — An epigenetic process (see 'Epigenetic change' above) that occurs in all female mammalian cells, whereby one of the two X chromosomes is randomly rendered inactive, such that all subsequent gene expression is derived from the other (active) X chromosome. This is sometimes called lyonization, after Mary Lyon, who did important early work on this phenomenon. (See "Inheritance patterns of monogenic disorders (Mendelian and non-Mendelian)", section on 'Sex-linked patterns' and "Principles of epigenetics", section on 'Types of processes that are regulated'.)
SUMMARY
●Definitions – Commonly used genetics terms are defined above. (See 'Definitions' above.)
●Genetics concepts – Basic genetics concepts are discussed in separate topic reviews. (See "Basic genetics concepts: DNA regulation and gene expression" and "Basic genetics concepts: Chromosomes and cell division" and "Inheritance patterns of monogenic disorders (Mendelian and non-Mendelian)" and "Genomic disorders: An overview" and "Principles of complex trait genetics" and "Principles of epigenetics".)
●Clinical applications – Use of genetic information in clinical care is also discussed in separate topic reviews. (See "Genetic testing" and "Genetic counseling: Family history interpretation and risk assessment" and "Personalized medicine" and "Secondary findings from genetic testing".)
●Genetics tools
•DNA sequencing – (See "Next-generation DNA sequencing (NGS): Principles and clinical applications".)
•PCR – (See "PCR testing for the diagnosis of herpes simplex virus in patients with encephalitis or meningitis".)
•Cytogenetics – (See "Tools for genetics and genomics: Cytogenetics and molecular genetics".)
•Gene expression profiling/genome-wide association studies (GWAS) – (See "Tools for genetics and genomics: Gene expression profiling" and "Genetic association and GWAS studies: Principles and applications".)
•Animal models – (See "Tools for genetics and genomics: Specially bred and genetically engineered mice" and "Tools for genetics and genomics: Model systems".)