Gene terminology - is one gene a concrete, single physical sequence?

Suppose you have two identical copies of the same, coding nucleotide sequence (e.g. two copies of BCL2 - a random gene I found on Wikipedia).

Could you say that these are two genes (i.e. the name "gene" refers to a specific, physical sequence, and where there are two of them, you have two genes, even if identical)? Or just two molecules of one gene (in the same way that when you have two molecules of carbon dioxide, you can't say that they are two carbon dioxides)?

'just two molecules of one gene'.

They are the same gene. Gene is more of an abstract concept. We say for example that humans have ~ 20,000 protein-coding genes. It does not mean that we have only 20,000 molecules, one for each gene. We have many more molecules of course. What we mean is that we have 20,000 different entities, each coding for a specific type of protein. If you have two molecules of a sequence in a tube, it becomes questionable whether these sequences can be thought of as genes. I would say that those sequences lose their 'genicity' or their property of being genes because they are not coding for anything in a tube. They lack the cellular machinery that makes them code for something. I would just say you have 2 nucleotide sequences of the gene BCL2 but not that you have 2 copies of the gene BCL2. It's maybe just a personal difference I see though.

A gene is more a functional unit in the genome. As such if you have, say, 5 sequences of the same gene, you will say that you have 5 copies of this gene but you won't say that you have 5 genes. If you say that you have 5 genes, then we would think that you refer to 5 different loci (locus = position in the genome).

If you are not familiar with the term allele, it is worth looking up. An allele is a variant of a gene. Again, an allele is more the information in the sequence than the sequence itself (a gene is typically a coding region). Therefore, if you say that there are 4 alleles of the gene BCL2 in the population, that means that there are 4 distinct variants in the population, not that the whole population is made of only 4 physical sequences of this gene.
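The difference between physical copies and alleles can be sketched in a few lines of Python. The sequences below are made up for illustration and are not real BCL2 data; each string stands for one physical copy of the same locus sampled from a population.

```python
from collections import Counter

# Hypothetical toy data: six physical copies of one locus.
# The four-letter sequences are invented for illustration only.
copies = ["ACGT", "ACGT", "ACGA", "ACGT", "ACGA", "TCGT"]

# Group identical sequences: each distinct sequence is one allele.
allele_counts = Counter(copies)

print(len(copies))          # number of physical copies
print(len(allele_counts))   # number of distinct alleles (variants)
print(allele_counts)        # how many copies carry each allele
```

Here there are six copies but only three alleles, which mirrors the point that counting alleles means counting variants, not molecules.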

DNA, Genes and Chromosomes

DNA (or deoxyribonucleic acid) is the molecule that carries the genetic information in all cellular forms of life and some viruses. It belongs to a class of molecules called the nucleic acids, which are polynucleotides - that is, long chains of nucleotides.

Each nucleotide consists of three components:

  • a nitrogenous base: cytosine (C), guanine (G), adenine (A) or thymine (T)
  • a five-carbon sugar molecule (deoxyribose in the case of DNA)
  • a phosphate molecule

The backbone of the polynucleotide is a chain of sugar and phosphate molecules. Each of the sugar groups in this sugar-phosphate backbone is linked to one of the four nitrogenous bases.

Strand of polynucleotides

DNA's ability to store - and transmit - information lies in the fact that it consists of two polynucleotide strands that twist around each other to form a double-stranded helix. The bases link across the two strands in a specific manner using hydrogen bonds: cytosine (C) pairs with guanine (G), and adenine (A) pairs with thymine (T).
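The pairing rule can be written down as a small lookup table. The sketch below is only an illustration: the function names are chosen here, and the reverse-complement step relies on the standard fact (not stated in the text above) that the two strands run in opposite directions.

```python
# Pairing rules from the text: C pairs with G, A pairs with T.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Return the base that pairs with each position, in the same order."""
    return "".join(PAIRS[base] for base in strand.upper())

def reverse_complement(strand: str) -> str:
    """Return the partner strand read in its own direction.

    Assumes the two strands are antiparallel, so the partner is the
    complement read backwards.
    """
    return complement(strand)[::-1]

top = "ATGCCG"
print(complement(top))          # TACGGC
print(reverse_complement(top))  # CGGCAT
```

Because the pairing is one-to-one, either strand fully determines the other.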

Double strand of polynucleotides

The double helix of the complete DNA molecule resembles a spiral staircase, with two sugar phosphate backbones and the paired bases in the centre of the helix. This structure explains two of the most important properties of the molecule. First, it can be copied or 'replicated', as each strand can act as a template for the generation of the complementary strand. Second, it can store information in the linear sequence of the nucleotides along each strand.

DNA helix showing nitrogenous bases

It is the order of the bases along a single strand that constitutes the genetic code. The four-letter 'alphabet' of A, T, G and C forms 'words' of three letters called codons. Individual codons code for specific amino acids. A gene is a sequence of nucleotides along a DNA strand - with 'start' and 'stop' codons and other regulatory elements - that specifies a sequence of amino acids that are linked together to form a protein.

So, for example, the codon AGC codes for the amino acid serine, and the codon ACC codes for the amino acid threonine.

There are two points to note about the genetic code:

  • It is universal. All life on Earth uses the same code (with a few minor exceptions).
  • It is degenerate. Most amino acids can be coded for by more than one codon. For example, AGA and AGG both code for the amino acid arginine.
    A codon table sets out how the triplet codons code for specific amino acids.
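Reading codons can be illustrated with a deliberately tiny codon table. This is a sketch, not a full implementation: the table holds only the codons mentioned above plus the usual start codon ATG and the three stop codons, and the `translate` function and the example sequence are made up for illustration.

```python
# Partial codon table (DNA triplets on the coding strand).
# Only a handful of the 64 codons are listed here.
CODON_TABLE = {
    "ATG": "Met",  # also the usual start codon
    "AGC": "Ser",
    "ACC": "Thr",
    "AGA": "Arg",
    "AGG": "Arg",  # second codon for arginine: the code is degenerate
    "TAA": "Stop",
    "TAG": "Stop",
    "TGA": "Stop",
}

def translate(dna: str) -> list[str]:
    """Read three bases at a time from the first base until a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3].upper()
        amino_acid = CODON_TABLE[codon]  # KeyError if the codon is not in this partial table
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein

print(translate("ATGAGCACCAGAAGGTAA"))  # ['Met', 'Ser', 'Thr', 'Arg', 'Arg']
```

The two different codons AGA and AGG both yield Arg, which is what degeneracy means in practice.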

DNA replication

The enzyme helicase breaks the hydrogen bonds holding the two strands together, and both strands can then act as templates for the production of the opposite strand. The process is catalysed by the enzyme DNA polymerase, and includes a proofreading mechanism.
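The template idea can be sketched as follows. This is a toy model under stated assumptions: strands are plain strings written so that position i of one pairs with position i of the other, strand direction is ignored, and proofreading is not modelled. The function names are hypothetical.

```python
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def synthesize(template: str) -> str:
    """Build the strand that pairs with `template`, base by base."""
    return "".join(PAIRS[base] for base in template)

def replicate(duplex: tuple[str, str]) -> list[tuple[str, str]]:
    """Separate the two strands and use each as a template for a new partner."""
    strand_a, strand_b = duplex  # the strands come apart (helicase step)
    return [
        (strand_a, synthesize(strand_a)),  # old strand a + new partner
        (synthesize(strand_b), strand_b),  # new partner + old strand b
    ]

parent = ("ATGC", "TACG")
daughters = replicate(parent)
print(daughters)  # [('ATGC', 'TACG'), ('ATGC', 'TACG')]
```

Each daughter duplex keeps one strand from the parent and gains one new strand, and both daughters carry the same sequence as the parent.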


Genetic engineering is a process that alters the genetic structure of an organism by either removing or introducing DNA. Unlike traditional animal and plant breeding, which involves doing multiple crosses and then selecting for the organism with the desired phenotype, genetic engineering takes the gene directly from one organism and delivers it to the other. This is much faster, can be used to insert any genes from any organism (even ones from different domains) and prevents other undesirable genes from also being added. [4]

Genetic engineering could potentially fix severe genetic disorders in humans by replacing the defective gene with a functioning one. [5] It is an important tool in research that allows the function of specific genes to be studied. [6] Drugs, vaccines and other products have been harvested from organisms engineered to produce them. [7] Crops have been developed that aid food security by increasing yield, nutritional value and tolerance to environmental stresses. [8]

The DNA can be introduced directly into the host organism or into a cell that is then fused or hybridised with the host. [9] This relies on recombinant nucleic acid techniques to form new combinations of heritable genetic material followed by the incorporation of that material either indirectly through a vector system or directly through micro-injection, macro-injection or micro-encapsulation. [10]

Genetic engineering does not normally include traditional breeding, in vitro fertilisation, induction of polyploidy, mutagenesis and cell fusion techniques that do not use recombinant nucleic acids or a genetically modified organism in the process. [9] However, some broad definitions of genetic engineering include selective breeding. [10] Cloning and stem cell research, although not considered genetic engineering, [11] are closely related and genetic engineering can be used within them. [12] Synthetic biology is an emerging discipline that takes genetic engineering a step further by introducing artificially synthesised material into an organism. [13] Synthetic DNA such as the Artificially Expanded Genetic Information System and Hachimoji DNA has been made in this new field.

Plants, animals or microorganisms that have been changed through genetic engineering are termed genetically modified organisms or GMOs. [14] If genetic material from another species is added to the host, the resulting organism is called transgenic. If genetic material from the same species or a species that can naturally breed with the host is used the resulting organism is called cisgenic. [15] If genetic engineering is used to remove genetic material from the target organism the resulting organism is termed a knockout organism. [16] In Europe genetic modification is synonymous with genetic engineering while within the United States of America and Canada genetic modification can also be used to refer to more conventional breeding methods. [17] [18] [19]

Humans have altered the genomes of species for thousands of years through selective breeding, or artificial selection, [20] [21] as contrasted with natural selection. More recently, mutation breeding has used exposure to chemicals or radiation to produce a high frequency of random mutations, for selective breeding purposes. Genetic engineering as the direct manipulation of DNA by humans outside breeding and mutations has only existed since the 1970s. The term "genetic engineering" was coined by Jack Williamson in his science fiction novel Dragon's Island, published in 1951, [22] one year before DNA's role in heredity was confirmed by Alfred Hershey and Martha Chase, [23] and two years before James Watson and Francis Crick showed that the DNA molecule has a double-helix structure. The general concept of direct genetic manipulation had, however, already been explored in rudimentary form in Stanley G. Weinbaum's 1936 science fiction story Proteus Island. [24] [25]

In 1972, Paul Berg created the first recombinant DNA molecules by combining DNA from the monkey virus SV40 with that of the lambda virus. [26] In 1973, Herbert Boyer and Stanley Cohen created the first transgenic organism by inserting antibiotic resistance genes into the plasmid of an Escherichia coli bacterium. [27] [28] A year later, Rudolf Jaenisch created a transgenic mouse by introducing foreign DNA into its embryo, making it the world's first transgenic animal. [29] These achievements led to concerns in the scientific community about potential risks from genetic engineering, which were first discussed in depth at the Asilomar Conference in 1975. One of the main recommendations from this meeting was that government oversight of recombinant DNA research should be established until the technology was deemed safe. [30] [31]

In 1976, Genentech, the first genetic engineering company, was founded by Herbert Boyer and Robert Swanson, and a year later the company produced a human protein (somatostatin) in E. coli. Genentech announced the production of genetically engineered human insulin in 1978. [32] In 1980, the U.S. Supreme Court in the Diamond v. Chakrabarty case ruled that genetically altered life could be patented. [33] The insulin produced by bacteria was approved for release by the Food and Drug Administration (FDA) in 1982. [34]

In 1983, a biotech company, Advanced Genetic Sciences (AGS) applied for U.S. government authorisation to perform field tests with the ice-minus strain of Pseudomonas syringae to protect crops from frost, but environmental groups and protestors delayed the field tests for four years with legal challenges. [35] In 1987, the ice-minus strain of P. syringae became the first genetically modified organism (GMO) to be released into the environment [36] when a strawberry field and a potato field in California were sprayed with it. [37] Both test fields were attacked by activist groups the night before the tests occurred: "The world's first trial site attracted the world's first field trasher". [36]

The first field trials of genetically engineered plants occurred in France and the US in 1986, when tobacco plants were engineered to be resistant to herbicides. [38] The People's Republic of China was the first country to commercialise transgenic plants, introducing a virus-resistant tobacco in 1992. [39] In 1994, Calgene attained approval to commercially release the first genetically modified food, the Flavr Savr, a tomato engineered to have a longer shelf life. [40] In 1994, the European Union approved tobacco engineered to be resistant to the herbicide bromoxynil, making it the first genetically engineered crop commercialised in Europe. [41] In 1995, Bt Potato was approved safe by the Environmental Protection Agency, after having been approved by the FDA, making it the first pesticide-producing crop to be approved in the US. [42] In 2009, 11 transgenic crops were grown commercially in 25 countries, the largest of which by area grown were the US, Brazil, Argentina, India, Canada, China, Paraguay and South Africa. [43]

In 2010, scientists at the J. Craig Venter Institute created the first synthetic genome and inserted it into an empty bacterial cell. The resulting bacterium, named Mycoplasma laboratorium, could replicate and produce proteins. [44] [45] Four years later this was taken a step further when a bacterium was developed that replicated a plasmid containing a unique base pair, creating the first organism engineered to use an expanded genetic alphabet. [46] [47] In 2012, Jennifer Doudna and Emmanuelle Charpentier collaborated to develop the CRISPR/Cas9 system, [48] [49] a technique which can be used to easily and specifically alter the genome of almost any organism. [50]

Creating a GMO is a multi-step process. Genetic engineers must first choose what gene they wish to insert into the organism. This is driven by what the aim is for the resultant organism and is built on earlier research. Genetic screens can be carried out to determine potential genes and further tests then used to identify the best candidates. The development of microarrays, transcriptomics and genome sequencing has made it much easier to find suitable genes. [51] Luck also plays its part: the Roundup Ready gene was discovered after scientists noticed a bacterium thriving in the presence of the herbicide. [52]

Gene isolation and cloning

The next step is to isolate the candidate gene. The cell containing the gene is opened and the DNA is purified. [53] The gene is separated by using restriction enzymes to cut the DNA into fragments [54] or polymerase chain reaction (PCR) to amplify up the gene segment. [55] These segments can then be extracted through gel electrophoresis. If the chosen gene or the donor organism's genome has been well studied it may already be accessible from a genetic library. If the DNA sequence is known, but no copies of the gene are available, it can also be artificially synthesised. [56] Once isolated the gene is ligated into a plasmid that is then inserted into a bacterium. The plasmid is replicated when the bacteria divide, ensuring unlimited copies of the gene are available. [57]
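Cutting DNA with a restriction enzyme can be sketched as string splitting. The example assumes the enzyme EcoRI, which recognises GAATTC and cuts after the first G; that enzyme is chosen here for illustration and is not named in the text. The `digest` function and the input sequence are hypothetical, and only one strand of a linear molecule is modelled.

```python
def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut a linear DNA string at every occurrence of `site`.

    `cut_offset` is the position inside the site where the strand is cut;
    for EcoRI (G^AATTC) that is 1. Only one strand is modelled.
    """
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])  # whatever remains after the last cut
    return fragments

dna = "TTGAATTCAAGGCCGAATTCTT"  # made-up sequence with two EcoRI sites
fragments = digest(dna)
print(fragments)                    # ['TTG', 'AATTCAAGGCCG', 'AATTCTT']
print(sorted(map(len, fragments)))  # fragment sizes, as a gel would separate them
```

Sorting fragments by length is a rough stand-in for what gel electrophoresis does physically: it separates fragments by size so the one carrying the gene can be extracted.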

Before the gene is inserted into the target organism it must be combined with other genetic elements. These include a promoter and terminator region, which initiate and end transcription. A selectable marker gene is added, which in most cases confers antibiotic resistance, so researchers can easily determine which cells have been successfully transformed. The gene can also be modified at this stage for better expression or effectiveness. These manipulations are carried out using recombinant DNA techniques, such as restriction digests, ligations and molecular cloning. [58]

Inserting DNA into the host genome

There are a number of techniques used to insert genetic material into the host genome. Some bacteria can naturally take up foreign DNA. This ability can be induced in other bacteria via stress (e.g. thermal or electric shock), which increases the cell membrane's permeability to DNA. DNA that has been taken up can either integrate with the genome or exist as extrachromosomal DNA. DNA is generally inserted into animal cells using microinjection, where it can be injected through the cell's nuclear envelope directly into the nucleus, or through the use of viral vectors. [59]

Plant genomes can be engineered by physical methods or by use of Agrobacterium for the delivery of sequences hosted in T-DNA binary vectors. In plants the DNA is often inserted using Agrobacterium-mediated transformation, [60] taking advantage of the Agrobacterium's T-DNA sequence, which allows natural insertion of genetic material into plant cells. [61] Other methods include biolistics, where particles of gold or tungsten are coated with DNA and then shot into young plant cells, [62] and electroporation, which involves using an electric shock to make the cell membrane permeable to plasmid DNA.

As only a single cell is transformed with genetic material, the organism must be regenerated from that single cell. In plants this is accomplished through the use of tissue culture. [63] [64] In animals it is necessary to ensure that the inserted DNA is present in the embryonic stem cells. [65] Bacteria consist of a single cell and reproduce clonally so regeneration is not necessary. Selectable markers are used to easily differentiate transformed from untransformed cells. These markers are usually present in the transgenic organism, although a number of strategies have been developed that can remove the selectable marker from the mature transgenic plant. [66]

Further testing using PCR, Southern hybridization, and DNA sequencing is conducted to confirm that an organism contains the new gene. [67] These tests can also confirm the chromosomal location and copy number of the inserted gene. The presence of the gene does not guarantee it will be expressed at appropriate levels in the target tissue so methods that look for and measure the gene products (RNA and protein) are also used. These include northern hybridisation, quantitative RT-PCR, Western blot, immunofluorescence, ELISA and phenotypic analysis. [68]

The new genetic material can be inserted randomly within the host genome or targeted to a specific location. The technique of gene targeting uses homologous recombination to make desired changes to a specific endogenous gene. This tends to occur at a relatively low frequency in plants and animals and generally requires the use of selectable markers. The frequency of gene targeting can be greatly enhanced through genome editing. Genome editing uses artificially engineered nucleases that create specific double-stranded breaks at desired locations in the genome, and use the cell's endogenous mechanisms to repair the induced break by the natural processes of homologous recombination and nonhomologous end-joining. There are four families of engineered nucleases: meganucleases, [69] [70] zinc finger nucleases, [71] [72] transcription activator-like effector nucleases (TALENs), [73] [74] and the Cas9-guideRNA system (adapted from CRISPR). [75] [76] TALEN and CRISPR are the two most commonly used and each has its own advantages. [77] TALENs have greater target specificity, while CRISPR is easier to design and more efficient. [77] In addition to enhancing gene targeting, engineered nucleases can be used to introduce mutations at endogenous genes that generate a gene knockout. [78] [79]
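Choosing where a nuclease can cut is largely a pattern search. The sketch below assumes the commonly used Cas9 from Streptococcus pyogenes, which needs a 20-nucleotide target followed immediately by an NGG motif (the PAM); those specifics are standard but are not stated in the text above. The `find_cas9_sites` function is hypothetical, scans only the given strand, and does no off-target scoring.

```python
import re

def find_cas9_sites(dna: str, guide_len: int = 20) -> list[tuple[int, str, str]]:
    """Return (start, target, PAM) for every candidate site on the given strand.

    A candidate is `guide_len` bases followed directly by N-G-G.
    """
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    return [
        (match.start(), match.group(1), match.group(2))
        for match in pattern.finditer(dna.upper())
    ]

sequence = "A" * 20 + "TGGCC"  # made-up sequence with exactly one candidate site
for start, target, pam in find_cas9_sites(sequence):
    print(start, target, pam)
```

A real design tool would also scan the opposite strand and rank candidates by how unique they are in the genome, which is where the specificity differences between nuclease families matter.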

Genetic engineering has applications in medicine, research, industry and agriculture and can be used on a wide range of plants, animals and microorganisms. Bacteria, the first organisms to be genetically modified, can have plasmid DNA inserted containing new genes that code for medicines or enzymes that process food and other substrates. [80] [81] Plants have been modified for insect protection, herbicide resistance, virus resistance, enhanced nutrition, tolerance to environmental pressures and the production of edible vaccines. [82] Most commercialised GMOs are insect resistant or herbicide tolerant crop plants. [83] Genetically modified animals have been used for research, model animals and the production of agricultural or pharmaceutical products. The genetically modified animals include animals with genes knocked out, increased susceptibility to disease, hormones for extra growth and the ability to express proteins in their milk. [84]

Medicine

Genetic engineering has many applications to medicine that include the manufacturing of drugs, creation of model animals that mimic human conditions and gene therapy. One of the earliest uses of genetic engineering was to mass-produce human insulin in bacteria. [32] This application has now been applied to human growth hormones, follicle stimulating hormones (for treating infertility), human albumin, monoclonal antibodies, antihemophilic factors, vaccines and many other drugs. [85] [86] Mouse hybridomas, cells fused together to create monoclonal antibodies, have been adapted through genetic engineering to create human monoclonal antibodies. [87] In 2017, genetic engineering of chimeric antigen receptors on a patient's own T-cells was approved by the U.S. FDA as a treatment for the cancer acute lymphoblastic leukemia. Genetically engineered viruses are being developed that can still confer immunity, but lack the infectious sequences. [88]

Genetic engineering is also used to create animal models of human diseases. Genetically modified mice are the most common genetically engineered animal model. [89] They have been used to study and model cancer (the oncomouse), obesity, heart disease, diabetes, arthritis, substance abuse, anxiety, aging and Parkinson disease. [90] Potential cures can be tested against these mouse models. Also genetically modified pigs have been bred with the aim of increasing the success of pig to human organ transplantation. [91]

Gene therapy is the genetic engineering of humans, generally by replacing defective genes with effective ones. Clinical research using somatic gene therapy has been conducted with several diseases, including X-linked SCID, [92] chronic lymphocytic leukemia (CLL), [93] [94] and Parkinson's disease. [95] In 2012, Alipogene tiparvovec became the first gene therapy treatment to be approved for clinical use. [96] [97] In 2015 a virus was used to insert a healthy gene into the skin cells of a boy suffering from a rare skin disease, epidermolysis bullosa, in order to grow, and then graft healthy skin onto 80 percent of the boy's body which was affected by the illness. [98]

Germline gene therapy would result in any change being inheritable, which has raised concerns within the scientific community. [99] [100] In 2015, CRISPR was used to edit the DNA of non-viable human embryos, [101] [102] leading scientists of major world academies to call for a moratorium on inheritable human genome edits. [103] There are also concerns that the technology could be used not just for treatment, but for enhancement, modification or alteration of a human being's appearance, adaptability, intelligence, character or behavior. [104] The distinction between cure and enhancement can also be difficult to establish. [105] In November 2018, He Jiankui announced that he had edited the genomes of two human embryos, to attempt to disable the CCR5 gene, which codes for a receptor that HIV uses to enter cells. He said that twin girls, Lulu and Nana, had been born a few weeks earlier. He said that the girls still carried functional copies of CCR5 along with disabled CCR5 (mosaicism) and were still vulnerable to HIV. The work was widely condemned as unethical, dangerous, and premature. [106] Currently, germline modification is banned in 40 countries. Scientists who do this type of research will often let embryos grow for a few days without allowing them to develop into a baby. [107]

Researchers are altering the genome of pigs to induce the growth of human organs to be used in transplants. Scientists are creating "gene drives", changing the genomes of mosquitoes to make them immune to malaria, and then looking to spread the genetically altered mosquitoes throughout the mosquito population in the hopes of eliminating the disease. [108]

Research

Genetic engineering is an important tool for natural scientists, with the creation of transgenic organisms one of the most important tools for analysis of gene function. [109] Genes and other genetic information from a wide range of organisms can be inserted into bacteria for storage and modification, creating genetically modified bacteria in the process. Bacteria are cheap, easy to grow, clonal, multiply quickly, relatively easy to transform and can be stored at -80 °C almost indefinitely. Once a gene is isolated it can be stored inside the bacteria providing an unlimited supply for research. [110] Organisms are genetically engineered to discover the functions of certain genes. This could be the effect on the phenotype of the organism, where the gene is expressed or what other genes it interacts with. These experiments generally involve loss of function, gain of function, tracking and expression.

  • Loss of function experiments, such as in a gene knockout experiment, in which an organism is engineered to lack the activity of one or more genes. In a simple knockout a copy of the desired gene has been altered to make it non-functional. Embryonic stem cells incorporate the altered gene, which replaces the already present functional copy. These stem cells are injected into blastocysts, which are implanted into surrogate mothers. This allows the experimenter to analyse the defects caused by this mutation and thereby determine the role of particular genes. It is used especially frequently in developmental biology. [111] When this is done by creating a library of genes with point mutations at every position in the area of interest, or even every position in the whole gene, this is called "scanning mutagenesis". The simplest method, and the first to be used, is "alanine scanning", where every position in turn is mutated to the unreactive amino acid alanine. [112]
  • Gain of function experiments, the logical counterpart of knockouts. These are sometimes performed in conjunction with knockout experiments to more finely establish the function of the desired gene. The process is much the same as that in knockout engineering, except that the construct is designed to increase the function of the gene, usually by providing extra copies of the gene or inducing synthesis of the protein more frequently. Gain of function is used to tell whether or not a protein is sufficient for a function, but does not always mean it's required, especially when dealing with genetic or functional redundancy. [111]
  • Tracking experiments, which seek to gain information about the localisation and interaction of the desired protein. One way to do this is to replace the wild-type gene with a 'fusion' gene, which is a juxtaposition of the wild-type gene with a reporting element such as green fluorescent protein (GFP) that will allow easy visualisation of the products of the genetic modification. While this is a useful technique, the manipulation can destroy the function of the gene, creating secondary effects and possibly calling into question the results of the experiment. More sophisticated techniques are now in development that can track protein products without mitigating their function, such as the addition of small sequences that will serve as binding motifs to monoclonal antibodies. [111]
  • Expression studies aim to discover where and when specific proteins are produced. In these experiments, the DNA sequence before the DNA that codes for a protein, known as a gene's promoter, is reintroduced into an organism with the protein coding region replaced by a reporter gene such as GFP or an enzyme that catalyses the production of a dye. Thus the time and place where a particular protein is produced can be observed. Expression studies can be taken a step further by altering the promoter to find which pieces are crucial for the proper expression of the gene and are actually bound by transcription factor proteins; this process is known as promoter bashing. [113]
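The alanine scanning approach mentioned under loss of function experiments can be sketched as generating one variant per position. This is an illustration only: it works on a protein written in one-letter amino acid codes, the `alanine_scan` function and the four-residue input are made up, and a real experiment would mutate the underlying DNA codons instead.

```python
def alanine_scan(protein: str) -> dict[str, str]:
    """Map mutation names (e.g. 'K2A') to the mutated protein sequence.

    Positions are numbered from 1. Residues that are already alanine
    are skipped because there is nothing to change.
    """
    variants = {}
    for i, residue in enumerate(protein):
        if residue == "A":
            continue
        name = f"{residue}{i + 1}A"
        variants[name] = protein[:i] + "A" + protein[i + 1:]
    return variants

for name, sequence in alanine_scan("MKAL").items():
    print(name, sequence)
# M1A AKAL
# K2A MAAL
# L4A MKAA
```

Each variant differs from the original at exactly one position, so a change in function can be attributed to that residue.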

Industrial

Organisms can have their cells transformed with a gene coding for a useful protein, such as an enzyme, so that they will overexpress the desired protein. Mass quantities of the protein can then be manufactured by growing the transformed organism in bioreactor equipment using industrial fermentation, and then purifying the protein. [114] Some genes do not work well in bacteria, so yeast, insect cells or mammalian cells can also be used. [115] These techniques are used to produce medicines such as insulin, human growth hormone, and vaccines, supplements such as tryptophan, aids to food production (chymosin in cheese making) and fuels. [116] Other applications with genetically engineered bacteria could involve making them perform tasks outside their natural cycle, such as making biofuels, [117] cleaning up oil spills, carbon and other toxic waste [118] and detecting arsenic in drinking water. [119] Certain genetically modified microbes can also be used in biomining and bioremediation, due to their ability to extract heavy metals from their environment and incorporate them into compounds that are more easily recoverable. [120]

In materials science, a genetically modified virus has been used in a research laboratory as a scaffold for assembling a more environmentally friendly lithium-ion battery. [121] [122] Bacteria have also been engineered to function as sensors by expressing a fluorescent protein under certain environmental conditions. [123]

Agriculture

One of the best-known and most controversial applications of genetic engineering is the creation and use of genetically modified crops or genetically modified livestock to produce genetically modified food. Crops have been developed to increase production, increase tolerance to abiotic stresses, alter the composition of the food, or to produce novel products. [125]

The first crops to be released commercially on a large scale provided protection from insect pests or tolerance to herbicides. Fungal and virus resistant crops have also been developed or are in development. [126] [127] This makes the insect and weed management of crops easier and can indirectly increase crop yield. [128] [129] GM crops that directly improve yield by accelerating growth or making the plant more hardy (by improving salt, cold or drought tolerance) are also under development. [130] By 2016, salmon had been genetically modified with growth hormones to reach normal adult size much faster. [131]

GMOs have been developed that modify the quality of produce by increasing the nutritional value or providing more industrially useful qualities or quantities. [130] The Amflora potato produces a more industrially useful blend of starches. Soybeans and canola have been genetically modified to produce more healthy oils. [132] [133] The first commercialised GM food was a tomato that had delayed ripening, increasing its shelf life. [134]

Plants and animals have been engineered to produce materials they do not normally make. Pharming uses crops and animals as bioreactors to produce vaccines, drug intermediates, or the drugs themselves; the useful product is purified from the harvest and then used in the standard pharmaceutical production process. [135] Cows and goats have been engineered to express drugs and other proteins in their milk, and in 2009 the FDA approved a drug produced in goat milk. [136] [137]

Other applications

Genetic engineering has potential applications in conservation and natural area management. Gene transfer through viral vectors has been proposed as a means of controlling invasive species as well as vaccinating threatened fauna against disease. [138] Transgenic trees have been suggested as a way to confer resistance to pathogens in wild populations. [139] With the increasing risks of maladaptation in organisms as a result of climate change and other perturbations, facilitated adaptation through gene tweaking could be one way to reduce extinction risks. [140] Applications of genetic engineering in conservation are thus far mostly theoretical and have yet to be put into practice.

Genetic engineering is also being used to create microbial art. [141] Some bacteria have been genetically engineered to create black and white photographs. [142] Novelty items such as lavender-colored carnations, [143] blue roses, [144] and glowing fish [145] [146] have also been produced through genetic engineering.

The regulation of genetic engineering concerns the approaches taken by governments to assess and manage the risks associated with the development and release of GMOs. The development of a regulatory framework began in 1975, at Asilomar, California. [147] The Asilomar meeting recommended a set of voluntary guidelines regarding the use of recombinant technology. [30] As the technology improved the US established a committee at the Office of Science and Technology, [148] which assigned regulatory approval of GM food to the USDA, FDA and EPA. [149] The Cartagena Protocol on Biosafety, an international treaty that governs the transfer, handling, and use of GMOs, [150] was adopted on 29 January 2000. [151] One hundred and fifty-seven countries are members of the Protocol and many use it as a reference point for their own regulations. [152]

The legal and regulatory status of GM foods varies by country, with some nations banning or restricting them, and others permitting them with widely differing degrees of regulation. [153] [154] [155] [156] Some countries allow the import of GM food with authorisation, but either do not allow its cultivation (Russia, Norway, Israel) or have provisions for cultivation even though no GM products are yet produced (Japan, South Korea). Most countries that do not allow GMO cultivation do permit research. [157] Some of the most marked differences occur between the US and Europe. The US policy focuses on the product (not the process), only looks at verifiable scientific risks and uses the concept of substantial equivalence. [158] The European Union by contrast has possibly the most stringent GMO regulations in the world. [159] All GMOs, along with irradiated food, are considered "new food" and subject to extensive, case-by-case, science-based food evaluation by the European Food Safety Authority. The criteria for authorisation fall into four broad categories: "safety", "freedom of choice", "labelling", and "traceability". [160] The level of regulation in other countries that cultivate GMOs lies between that of Europe and the United States.

Regulatory agencies by geographical region
Region | Regulators | Notes
US | USDA, FDA and EPA [149] |
Europe | European Food Safety Authority [160] |
Canada | Health Canada and the Canadian Food Inspection Agency [161] [162] | Regulated products with novel features regardless of method of origin [163] [164]
Africa | Common Market for Eastern and Southern Africa [165] | Final decision lies with each individual country. [165]
China | Office of Agricultural Genetic Engineering Biosafety Administration [166] |
India | Institutional Biosafety Committee, Review Committee on Genetic Manipulation and Genetic Engineering Approval Committee [167] |
Argentina | National Agricultural Biotechnology Advisory Committee (environmental impact), the National Service of Health and Agrifood Quality (food safety) and the National Agribusiness Direction (effect on trade) [168] | Final decision made by the Secretariat of Agriculture, Livestock, Fishery and Food. [168]
Brazil | National Biosafety Technical Commission (environmental and food safety) and the Council of Ministers (commercial and economical issues) [168] |
Australia | Office of the Gene Technology Regulator (oversees all GM products), Therapeutic Goods Administration (GM medicines) and Food Standards Australia New Zealand (GM food). [169] [170] | The individual state governments can then assess the impact of release on markets and trade and apply further legislation to control approved genetically modified products. [170]

One of the key issues concerning regulators is whether GM products should be labeled. The European Commission says that mandatory labeling and traceability are needed to allow for informed choice, avoid potential false advertising [171] and facilitate the withdrawal of products if adverse effects on health or the environment are discovered. [172] The American Medical Association [173] and the American Association for the Advancement of Science [174] say that absent scientific evidence of harm even voluntary labeling is misleading and will falsely alarm consumers. Labeling of GMO products in the marketplace is required in 64 countries. [175] Labeling can be mandatory up to a threshold GM content level (which varies between countries) or voluntary. In Canada and the US labeling of GM food is voluntary, [176] while in Europe all food (including processed food) or feed which contains greater than 0.9% of approved GMOs must be labelled. [159]

Critics have objected to the use of genetic engineering on several grounds, including ethical, ecological and economic concerns. Many of these concerns involve GM crops and whether food produced from them is safe and what impact growing them will have on the environment. These controversies have led to litigation, international trade disputes, and protests, and to restrictive regulation of commercial products in some countries. [177]

Accusations that scientists are "playing God" and other religious issues have been ascribed to the technology from the beginning. [178] Other ethical issues raised include the patenting of life, [179] the use of intellectual property rights, [180] the level of labeling on products, [181] [182] control of the food supply [183] and the objectivity of the regulatory process. [184] Although doubts have been raised, [185] economically most studies have found growing GM crops to be beneficial to farmers. [186] [187] [188]

Gene flow between GM crops and compatible plants, along with increased use of selective herbicides, can increase the risk of "superweeds" developing. [189] Other environmental concerns involve potential impacts on non-target organisms, including soil microbes, [190] and an increase in secondary and resistant insect pests. [191] [192] Many of the environmental impacts regarding GM crops may take many years to be understood and are also evident in conventional agriculture practices. [190] [193] With the commercialisation of genetically modified fish there are concerns over what the environmental consequences will be if they escape. [194]

There are three main concerns over the safety of genetically modified food: whether they may provoke an allergic reaction; whether the genes could transfer from the food into human cells; and whether the genes not approved for human consumption could outcross to other crops. [195] There is a scientific consensus [196] [197] [198] [199] that currently available food derived from GM crops poses no greater risk to human health than conventional food, [200] [201] [202] [203] [204] but that each GM food needs to be tested on a case-by-case basis before introduction. [205] [206] [207] Nonetheless, members of the public are less likely than scientists to perceive GM foods as safe. [208] [209] [210] [211]

Genetic engineering features in many science fiction stories. [212] Frank Herbert's novel The White Plague described the deliberate use of genetic engineering to create a pathogen which specifically killed women. [212] Another of Herbert's creations, the Dune series of novels, uses genetic engineering to create the powerful but despised Tleilaxu. [213] Films such as The Island and Blade Runner bring the engineered creature to confront the person who created it or the being it was cloned from. Few films have informed audiences about genetic engineering, with the exception of the 1978 The Boys from Brazil and the 1993 Jurassic Park, both of which made use of a lesson, a demonstration, and a clip of scientific film. [214] [215] Genetic engineering methods are weakly represented in film; Michael Clark, writing for The Wellcome Trust, calls the portrayal of genetic engineering and biotechnology "seriously distorted" [215] in films such as The 6th Day. In Clark's view, the biotechnology is typically "given fantastic but visually arresting forms" while the science is either relegated to the background or fictionalised to suit a young audience. [215]

Types of Mutagens: Chemical and Physical | Genetics

In this article we will discuss about the chemical and physical types of mutagens.

1. Chemical Mutagens:

Singer and Kusmierek (1982) have published an excellent review on chemical mutagenesis.

Some of the chemical mutagens and mutagenesis are given in Table 9.3, and described below:

I. Base Analogues:

A base analogue is a chemical compound similar to one of the four bases of DNA. It can be incorporated into a growing polynucleotide chain during the normal process of replication. These compounds have base pairing properties different from the normal bases. They replace the bases and cause stable mutations.

A very common and widely used base analogue is 5-bromouracil (5-BU) which is an analogue of thymine. The 5-BU functions like thymine and pairs with adenine (Fig. 9.6A).

The 5-BU undergoes a tautomeric shift from the keto form to the enol form, promoted by the bromine atom. The enol form persists for a longer time in 5-BU than in thymine (Fig. 9.6B). If 5-BU replaces a thymine, its enol form pairs with guanine during replication, which in turn specifies cytosine, producing a G:C pair (Fig. 9.6A).

During the replication, keto form of 5-BU substitutes for T and the replication of an initial AT pair becomes an A: BU pair (Fig. 9.7A). The rare enol form of 5-BU that pairs with G is the first mutagenic step of replication. In the next round of replication G pairs with C. Thus, the transition is completed from AT→GC pair.

The 5-BU can also induce the conversion of GC to AT. The enol form infrequently acts as an analogue of cytosine rather than thymine. Due to error, GC pair is converted into a G: BU pair which in turn becomes an AT pair (Fig. 9.7B). Due to such pairing properties 5-BU is used in chemotherapy of viruses and cancer. Because of pairing with guanine it disturbs the normal replication process in microorganisms.

The 5-bromodeoxyuridine (5-BDU) can replace thymidine in DNA molecule. The 2-amino-purine (2-AP) and 2, 6-di-amino-purine (2, 6-DAP) are the purine analogues. The 2-AP normally pairs with thymine but it is able to form a single hydrogen bond with cytosine resulting in transition of AT to GC. The 2-AP and 2, 6-DAP are not as effective as 5-BU and 5-BDU.

II. Chemicals Changing the Specificity of Hydrogen Bonding:

There are many chemicals that, after incorporation into DNA, change the specificity of hydrogen bonding. Those which are used as mutagens are nitrous acid (HNO2), hydroxylamine (HA) and ethyl-methane-sulphonate (EMS).

Nitrous acid converts the amino group of bases into a keto group through oxidative deamination. The order of frequency of deamination (removal of the amino group) is adenine > cytosine > guanine.

(b) Deamination of Adenine:

Deamination of adenine results in formation of hypoxanthine, the pairing behaviour of which is like guanine. Hence, it pairs with cytosine instead of thymine replacing AT pairing by GC pairing (Fig. 9.8A).

(c) Deamination of Cytosine:

Deamination of cytosine results in formation of uracil by replacing the -NH2 group with an -OH group. The hydrogen-bonding affinity of uracil is like that of thymine; therefore, C-G pairing is replaced by U-A pairing (Fig. 9.8B).

(d) Deamination of Guanine:

Deamination of guanine results in formation of xanthine; the latter is not mutagenic. Xanthine behaves like guanine because there is no change in pairing behaviour: it pairs with cytosine. Therefore, G-C pairing is replaced by X-C pairing.

(e) Hydroxylamine (NH2OH):

It hydroxylates the C4 nitrogen of cytosine and converts it into a modified base that pairs like thymine. Therefore, GC pairs are changed into AT pairs.
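The deamination outcomes described above can be traced through one extra round of replication with a small lookup table. The following is only an illustrative sketch: the dictionary names and the helper `pair_after_deamination` are hypothetical, and the pairing partners are taken directly from the text (hypoxanthine pairs like guanine, uracil like thymine, xanthine like guanine).

```python
# Normal Watson-Crick partner of each base.
NORMAL_PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Deamination product of each base and the base that product pairs with,
# as stated in the text above (illustrative table, not exhaustive chemistry).
DEAMINATION = {
    "A": ("hypoxanthine", "C"),
    "C": ("uracil", "A"),
    "G": ("xanthine", "C"),
}


def pair_after_deamination(base):
    """Return (original pair, deamination product, pair after the next replication)."""
    product, new_partner = DEAMINATION[base]
    original_pair = base + ":" + NORMAL_PARTNER[base]
    # In the following round, the newly inserted partner acts as a template
    # and specifies its own normal complement.
    final_pair = NORMAL_PARTNER[new_partner] + ":" + new_partner
    return original_pair, product, final_pair


if __name__ == "__main__":
    for base in ("A", "C", "G"):
        original, product, final = pair_after_deamination(base)
        changed = "mutagenic" if original != final else "not mutagenic"
        print(f"{original} -> {product} -> {final} ({changed})")
```

Under these assumptions, adenine and cytosine deamination each end in a changed base pair, while guanine deamination returns the original G:C pair, which matches the statement that xanthine is not mutagenic.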

III. Alkylating Agents:

Alkylating agents add an alkyl group to the hydrogen-bonding positions of bases in DNA, such as guanine (at the N7 position) and adenine (at the N3 position). Alkylation increases the likelihood of ionization, which introduces pairing errors. The base-sugar linkage may also be hydrolysed, leaving a gap in one chain.

This phenomenon of loss of alkylated base from the DNA molecule (by breakage of bond joining the nitrogen of purine and deoxyribose) is called depurination. Depurination is not always mutagenic. The gap created by loss of a purine can effectively be repaired.

Following are some of the important widely used alkylating agents:

EMS has the specificity to remove guanine and cytosine from the chain, which results in gap formation. Any base (A, T, G, C) may be inserted in the gap. During replication, the chain without the gap produces normal DNA. In the second round of replication, the gap is filled by a suitable base.

If the correct base is inserted, the normal DNA sequence will be produced. Insertion of an incorrect base results in a transversion or transition mutation. Another example is methyl nitrosoguanidine, which adds a methyl group to guanine, causing it to mispair with thymine. After subsequent replication, the GC pair is converted into an AT pair (a transition).

IV. Intercalating Agents:

There are certain dyes such as acridine orange, proflavine and acriflavin which are three ringed molecules of similar dimensions as those of purine pyrimidine pairs (Fig. 9.9). In aqueous solution these dyes can insert themselves in DNA (i.e. intercalate the DNA) between the bases in adjacent pairs by a process called intercalation.

Therefore, the dyes are called intercalating agents. The acridines are planar (flat) molecules that intercalate between the base pairs of DNA and distort it, which results in deletions or insertions after replication of the DNA molecule. These deletions or insertions give rise to frameshift mutations (Fig. 9.10).
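To see why a single inserted or deleted base is so disruptive, it helps to read a sequence in triplets before and after the change. The sketch below is purely illustrative: the sequence is an arbitrary made-up example, and the helper names (`codons`, `insert_base`, `delete_base`) are hypothetical.

```python
def codons(seq):
    """Split a sequence into complete triplets, reading from the first base."""
    usable = len(seq) - len(seq) % 3  # ignore any incomplete trailing triplet
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def insert_base(seq, position, base):
    """Return the sequence with one base inserted before the given index."""
    return seq[:position] + base + seq[position:]


def delete_base(seq, position):
    """Return the sequence with the base at the given index removed."""
    return seq[:position] + seq[position + 1:]


if __name__ == "__main__":
    original = "ATGGCCTTTAAAGGG"  # arbitrary illustrative sequence
    print("original :", codons(original))
    print("insertion:", codons(insert_base(original, 4, "C")))
    print("deletion :", codons(delete_base(original, 4)))
```

Every triplet downstream of the change is read in a shifted frame, which is why such mutations are called frameshifts.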

2. Physical Mutagens:

I. Radiations as Mutagens:

Radiation is the most important among the physical mutagens. Radiations damaging the DNA molecules fall in the wavelength range below 340 nm and photon energy above 1 electron-volt (eV). The destructive radiation consists of ultraviolet (UV) rays, X-rays, γ-rays, alpha (α) rays, beta (β) rays, cosmic rays, neutrons, etc. (Fig. 9.11).

Radiation induced damage can be categorized into the three broad types: lethal damage (killing the organisms), potentially lethal damage (can be lethal under certain ordinary conditions) and sub-lethal damage (cells do not die unless radiation reaches to a certain threshold value). The effect of damage is at molecular level.

In a live cell, radiation damage to proteins, lipoproteins, DNA, carbohydrates, etc. is caused directly by ionization/excitation, or indirectly through highly reactive free radicals produced by radiolysis of cellular water. DNA stores genetic information, so damage to it is especially serious. It can perpetuate genetic effects and, therefore, the cellular repair system is largely devoted to its welfare.

When the bacteria are exposed to radiation they gradually lose the ability to develop colonies. This gradual loss of viability can be expressed graphically by plotting the surviving colonies against the gradually increasing exposure time. This dose-response graph is called survival curve. The survival curve of bacteria is given in Fig. 9.12. The survival curve is analysed by a simple mathematical theory called hit theory.

Each organism possesses at least one sensitive site which is known as target site. Radiation photons (particles of light) damage or hit the target site and inactivate the organisms. One can derive the equation based on this theory.

The equations describe the survival curve for a population of N identical organisms exposed to a dose D of radiation. The number dN inactivated by a small dose increment dD is proportional to the number of organisms that are still undamaged, hence

$dN = -KN\,dD$

K is the constant which measures the effectiveness of the dose.

Integrating this equation from $N = N_0$ at $D = 0$, we get

$N = N_0 e^{-KD}$ …(1)

The surviving fraction $S = N/N_0$ is

$S = e^{-KD}$ …(2)

A plot of ln S versus D gives a straight line with a slope of -K (Fig. 9.12). Curves of this type are called exponential or single-hit curves. The exponential curve is obtained when phages are irradiated with X-rays.

If each organism in the population contains n sites, and every site must be hit to inactivate the organism, then each organism must be hit n times. The probability of one site being hit by a dose D is $P = 1 - e^{-KD}$, so the probability that all n sites are hit is $P^n = (1 - e^{-KD})^n$.

The surviving fraction S of the population is $1 - P^n$, or

$S = 1 - (1 - e^{-KD})^n$ …(3)

This equation can be expanded as:

$S = n e^{-KD} - \frac{n(n-1)}{2} e^{-2KD} + \dots$

At large values of D, the higher-order terms become negligible compared to $e^{-KD}$. Therefore, at high dose D,

$S \approx n e^{-KD}$, or $\ln S = \ln n - KD$ …(4)

When equation 3 is plotted for K = 1 and various values of n, ln S changes only gradually at small values of D (Fig. 9.13). At large values of D, equation 4 dominates and the curve becomes linear.
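The behaviour of the single-hit and multi-site survival expressions can be checked numerically. The following is a minimal sketch, assuming K = 1 and a few arbitrary values of n and D chosen only for illustration; the function names are hypothetical and not taken from the source.

```python
import math


def single_hit_survival(dose, k):
    """Surviving fraction S = exp(-K*D) for organisms with one target site."""
    return math.exp(-k * dose)


def multi_site_survival(dose, k, n):
    """Surviving fraction S = 1 - (1 - exp(-K*D))**n when all n sites must be hit."""
    return 1.0 - (1.0 - math.exp(-k * dose)) ** n


def high_dose_approximation(dose, k, n):
    """Leading term n*exp(-K*D) of the expansion, meaningful only at large D."""
    return n * math.exp(-k * dose)


if __name__ == "__main__":
    k = 1.0  # assumed value, matching the K = 1 case discussed for the plot
    for n in (1, 2, 5):
        for dose in (0.5, 2.0, 8.0):
            exact = multi_site_survival(dose, k, n)
            approx = high_dose_approximation(dose, k, n)
            print(f"n={n} D={dose}: ln S={math.log(exact):.4f}, "
                  f"ln(n) - K*D={math.log(approx):.4f}")
```

For n = 1 the two expressions coincide at every dose; for larger n they differ at small D (the shoulder of the curve) and converge as D grows, which is the linear region on a semi-log plot.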

II. Ultraviolet (UV) Radiation:

UV radiation causes damage in the DNA duplex of bacteria and phages. The UV rays are absorbed and cause excitation of macromolecules. The absorption maxima of nucleic acid (260 nm) and protein (280 nm) are more or less similar. The DNA molecule is the target molecule for UV rays but not the proteins. However, the absorption spectrum of RNA is quite similar to that of DNA.

The excited DNA leads to cross-linking, single strand breaks and base damage as minor lesions and generation of nucleotide dimers as the major one. Purines are generally more radio-resistant than pyrimidines; of the latter, thymine is more reactive than cytosine.

Hence, the ratio of thymine-thymine (TT), thymine-cytosine (TC), cytosine-cytosine (CC) dimer (Fig. 9.14) is 10:3:3, respectively. A few dimers of TU and UU also appear. The initial step in pyrimidine dimerization is known to be hydration of their 4: 5 bonds.

Formation of thymine-thymine (TT) dimer causes distortion of DNA helix because the thymines are pulled towards one another. The distortion results in weakening of hydrogen-bonding to adenines in the opposing strand. This structural distortion inhibits the advance of replication fork.

III. The X-Rays:

The X-rays cause breaking of phosphate ester linkages in the DNA. This breakage occurs at one or more points. Consequently, a large number of bases are deleted or rearranged in the DNA molecule.

The X-rays may break the DNA either in one or both strands. If breaks occur in both strands, it becomes lethal. The DNA segment between the two breaks is removed resulting in deletion. Since both the X-rays and UV rays bring about damage in DNA molecule, they are used in sterilization of bacteria and viruses.

The basics of your genetic code, explained

We’ve been trying to figure out how DNA works since the dawn of time. We can be reasonably certain that people have always looked at one another, or at entire families, and wondered things like, “How is it we all have the exact same gloriously curly hair?” or, “Why do all the members of the family next door run like gazelles when we’re more turtle-like?” When people were thinking these kinds of things, they were thinking about DNA. They just didn’t know it yet.

And it wasn’t just in people. Farmers observed characteristics in livestock, for example, and selectively bred sheep to have gloriously curly and fine wool. They picked and cultivated the sweetest red apples found growing wild, rather than the tart, woody green ones — all decisions rooted in DNA.

We could sense there was something unseen governing inheritance in every living thing around us. A lot of theories were put forward, held up as fact, and then abandoned as the centuries and millennia ticked along (Preformationism, anyone?).

Eventually, these theories were refined into what we call genetics. Coincidentally, the theory of genetic inheritance and DNA as a compound were both described in the mid-1800’s, but they weren’t directly linked to each other for almost another 100 years. Experiments showing that DNA is the molecule responsible for inheritance were done in the 1940’s and 50’s, shortly before the now-famous double helix structure was discovered.

What is DNA?

So what is DNA? And what does it have to do with sprinting and curly hair? Put simply, DNA is the molecule that holds the genetic information that every parent passes on to their biological children, whether that parent is human, a blue whale or a rhesus monkey. It plays a role in physical features, disease, behavioral traits, and even dietary considerations like sensitivity to lactose, the sugar found in milk. Sadly, we are not all created equal when it comes to ice cream.

Think of DNA as the most intricately detailed blueprint imaginable, and one we’re all born with. It tells our bodies how to grow and function over time, and that includes detailed instructions for things like faster firing muscle fibers and gloriously curly hair.

The composition of your genetic code

To understand DNA a little better, we need to look at its building blocks — the nucleotides.

In straightforward terms, a nucleotide is a sugar (deoxyribose), with a phosphate group and a nitrogenous base. The sugar and phosphate join up with those of other nucleotides above and below to form the scaffold of a single strand of DNA.

Remember the image of DNA in the shape of a ladder? It’s a good way to picture the basics of DNA construction. Two single strands of DNA join together, with the sugar-phosphate scaffolds on each side and the nitrogenous bases in the middle, as the rungs of the ladder. The bases have a unique characteristic in that they’re sort of co-dependent, and they really like to only hang out with one other kind of base, and together they form an inseparable pair — the peanut butter and jelly of the nucleotide world.

There are four kinds of bases: Adenine, Cytosine, Guanine and Thymine (A,C,G, and T). Adenine will only be seen with Thymine, and Cytosine with Guanine. If you see Adenine and Cytosine together, that’s trouble, and a whole other topic. The bases reach across from one side of the ladder to find their best friend on the other side to form base pairs, and the irony is that despite their choosiness about who they bond with, they actually bring the whole strand together. With the rungs in place, we have double-stranded DNA.
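If you like to see rules written down as code, the pairing described above fits in a few lines. This is just an illustrative sketch; the names `PAIR`, `complement`, and `reverse_complement` are our own, and the example strand is made up.

```python
# Each base and its only allowed partner on the opposite strand.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Return the partner base for every base in the strand, in the same order."""
    return "".join(PAIR[base] for base in strand)


def reverse_complement(strand):
    """Return the opposite strand as it would be read in its own direction."""
    return complement(strand)[::-1]


if __name__ == "__main__":
    strand = "ATCG"  # made-up example strand
    print(strand, "pairs with", complement(strand))
```

Knowing one side of the ladder is enough to work out the other, because every base has exactly one partner.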

To complete the picture, you need to imagine that the ladder spirals as it rises. Another word for spiral is helix, and that’s the one we use for DNA. Because there are two spirals, one from each side of the ladder, we call the shape a double helix.

Genes are what make you, you

But how do we get from double helixes to the genes that give us our unique traits, like sprinter’s legs or a complex relationship with cheese? Before we get there, we need to talk about chromosomes, and where our DNA comes from.

A chromosome is a single continuous double-strand of DNA. Humans have 46 of them, or 23 pairs of chromosomes. The number of chromosomes we have is somewhat arbitrary. If you were a pigeon, you’d have 80; a potato, 48. A plains viscacha rat has an incredible 112. In the case of humans, we receive one half of each pair from each of our parents: 23 from our mother, and 23 from our father. Twenty-two of these pairs are common between men and women; the twenty-third pair, the X and Y chromosomes, determines whether we’re male or female.

Back to genes. Your genes are specific sections within a DNA strand that code for something, something that can be inherited from your parents. When we say “code for something,” we mean that they carry a specific set of instructions, and those instructions are expressed as a sequence of bases: Adenine, Thymine, Cytosine, or Guanine. If DNA is the cookbook, then a gene is a recipe for something very specific: the instructions for a certain kind of muscle fiber that allows some of us to run like the wind.

But genes don’t do the building themselves. They are responsible for the end result, but they have more of a supervisory role. The real workers are amino acids, and what genes do is instruct how amino acids are arranged into complicated groupings that we know as proteins. The shape of those groupings, and the kinds of amino acids that make them up, determines the function of that protein and what kind of cells it will help create.

In plain language: If a gene’s code says, “Hey amino acids, build me a protein for a muscle fiber, but not just any old muscle fiber, one that’s thicker and quicker to contract, but tires faster. And make a lot of them,” then that’s what the amino acids assemble into (with the help of RNA, but that’s another story), and you’re more likely to be a sprinter than a marathoner. There are approximately 20,000 human protein-coding genes instructing amino acid arrangements for proteins that create traits, and the end result of this incredibly complex undertaking is what makes you, you.

All of your genes on all of your chromosomes are collectively called your genome. Surprisingly, the protein-coding genes make up only about 2% of your genome. This small, but critically important part of your genome is called the exome. It contains all of the genes for all of the things like your sprinter gene, curly hair, lactose sensitivity or predisposition toward certain diseases. There’s another roughly 7-8% that is also functional, but doesn’t code for protein, and then 90% that is, well, the subject of a lot of debate (to put it mildly). Remember, the human genome was first sequenced as recently as 2003, so we’re still learning new things every day. DNA is far from a closed book.

Exome+ Next Generation Sequencing

At Helix, we sequence what we call Exome+. That means we collect all of the data from your protein-coding 2%, as well as areas of the other 98% that we currently know have important things to say about you. It’s up to 100 times more data than other consumer genetics companies.

We also sequence your DNA as opposed to genotyping it, which means you get a more complete view. If sequencing is reading every word on every page of a book, genotyping is scanning a few words per page to get the gist of the action. The advantage of being the fastidious bookworm of DNA, aka sequencing, is that within all of that information are unknowns that will become known one day, so as the field of genetics makes new discoveries about DNA, you’ll make new discoveries about yourself.

Because of the newness of a lot of this knowledge, the lessons you may have learned in science class would not even come close to what we know today, especially the longer ago you sat in that class. Remember, we only completed the first human genome sequencing in 2003! We’ve learned a lot since then, and have a long way to go. Today we can sequence your DNA and tell you concrete things that go well beyond ancestry and physical traits — though those are good things too.

DNA has a lot to say about your health, fitness, fertility and even your taste preferences, and though some of the DNA details from this post may slip your mind again over time, the insights you’ll gain about yourself from sequencing your DNA can be pretty memorable.

Expression data are derived from records contained in the Gene Expression Omnibus (GEO), and are first log2 transformed and normalized. Referenced datasets may contain one or more condition(s), and as a result there may be a greater number of conditions than datasets represented in a single clickable histogram bar. The histogram division at 0.0 separates the down-regulated (green) conditions and datasets from those that are up-regulated (red). Click "Expression Details" to view all expression annotations and details for this locus, including a visualization of genes that share a similar expression pattern.

All manually curated literature for the specified gene, organized into topics according to their relevance to the gene (Primary Literature, Additional Literature, or Review). Click "Literature Details" to view all literature information for this locus, including shared literature between genes.

MGD is updated on a weekly basis by biologists on our curatorial staff who scan the current scientific literature, extract relevant data, and enter it in MGD. Increasingly, MGD acquires data through large scale electronic transfer. Such data include sequence data from GenBank; gene models from NCBI, Ensembl, and VEGA; and mutant alleles from ENU-mutagenesis groups and the International Knockout Mouse Consortium (IKMC). The data interface is intended to be flexible and comprehensive so that each view of particular records in MGD provides links to any related data throughout MGD and, where possible, to other databases on the Internet.

  • Gene, DNA marker, QTL and Cytogenetic marker descriptions
  • Mouse genetic phenotypes, genetic interrelationships, and polymorphic loci
  • Human disease ontology data (DO)
  • Polymorphic loci related to specified strains
  • SNPs and other sequence polymorphisms
  • Vertebrate homology data
  • Sequence data
  • Molecular probes and clones (probes, clones, primers and YACs)
  • Genetic and physical mapping data
  • Information on inbred strains (M. Festing's listing)
  • References supporting all data in MGD

Data Links to External Databases

MGD provides links to relevant information in external databases wherever possible.

Through | MGD links to
Markers | EC, Ensembl, Entrez, InterPro, NCBI, PDB, UniGene, VEGA
Phenotypes | On-line Mendelian Inheritance in Man (OMIM) for human disease data
SNPs and sequence polymorphisms | dbSNP
Homologies | Entrez, HGNC, HomoloGene, NCBI, Ensembl Gene Tree, Uniprot, VEGA, VISTA
Sequences | GenBank, RefSeq, Uni-PROT, and TrEMBL mouse gene indices
Molecular probes and clones | GenBank, EMBL, DDBJ, IMAGE and RIKEN

Genes and Markers

MGD contains information on mouse genes, DNA segments, cytogenetic markers and QTLs (see Genes and Markers). Each record may include the marker symbol, name, other names or symbols and synonyms, nomenclature history, alleles, STSs, chromosomal assignment, centimorgan location, cytogenetic band, EC number (for enzymes), phenotypic classifications, human disease data, Gene Ontology (GO) terms, MGD accession IDs and supporting references. See Interpreting a Genes and Markers Summary and Interpreting Gene Details for more information about the content of the display of a marker record as it appears in the query results.

Information on alleles, formerly embedded in phenotype descriptions, is stored as a separate data set (see Phenotypic Alleles). Links to alleles are provided in gene detail records. In addition, there is a Phenotypes, Alleles, and Disease Models Query Form for direct queries against the allele data set. See Details for the content of an allele record as displayed in query results.

Phenotypic Alleles

MGD contains information on mutant alleles, transgenes, QTLs, strain characteristics, phenotype vocabularies, human disease models, and comparative phenotypes. Integrated access to phenotype and disease model data is available via four query forms (Genes and Markers, Phenotypes and Alleles, Human-Mouse: Disease Connection, and Batch Query). These forms provide genetic, phenotypic, and computational approaches to displaying phenotypic variation sources (single-gene, genetic mutations, QTLs, strains), as well as data on human disease correlation, and mouse models. The Human Disease Ontology (DO) Browser enables you to browse and search diseases, conditions, and syndromes directly. Phenotypic allele summary and detail reports provide detailed information about the content of phenotype records including observed phenotypes in mouse and genetic background. The Human Disease and Mouse Model Detail page lists homologous mouse and human markers where mutations in one or both species have been associated with phenotypes characteristic of this disease as well as any mouse models.

Sequence Data

  • Vast amounts of sequence data are integrated with the biological information in MGD. These include mouse sequences from GenBank, RefSeq, and .
  • MGD contains sequence attributes such as length and provider data about the clones the sequences were derived from and the genes the sequences have been associated to. Because of our curated associations between mouse markers and sequences, you can search using nomenclature, map position, function (GO annotation, InterPro domain), expression (tissue and developmental stage), and phenotypes of mutant alleles.
  • Source information about the clones that the sequences are derived from, such as strain, tissue, or library, is carefully translated into controlled vocabularies (see Vocabulary Browsers). This adds enormous power to sequence queries, since authors often use multiple terms to specify a strain or tissue.

Vocabulary Browsers

  • The MGD Vocabulary Browsers provide access to restricted sets of defined terms representing complex information.
  • These vocabularies (known as DAGs or directed acyclic graphs) have a tree (or hierarchical) structure: terms are organized primarily by their relationship to other terms.
  • The MGD Vocabulary Browsers currently available are:

Browser name, and what to use the browser to search for:

  • GO (Gene Ontology) Browser: GO term details and relationships; links to genes associated with your term or with any sub terms.
  • Mouse Developmental Anatomy Browser: anatomical structures; links to associated expression results.
  • Disease Ontology (DO) Browser: human disease terms; links to detail pages containing genotypes annotated with these terms; links to Disease Model web pages.
  • Mammalian Phenotype Browser: mammalian phenotype terms; term details and relationships among terms; links to genotypes annotated with each term or any sub terms.
  • Human Phenotype Browser: human phenotype terms; term details and relationships among terms; links to human diseases and the high-level human phenotype terms associated with the term.
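
The "term or any sub terms" behaviour described for these browsers can be illustrated with a small graph traversal. The following is only a minimal sketch, not MGD's implementation: the vocabulary, the gene names (GeneA to GeneD), and the function names are all hypothetical and chosen for illustration.

```python
from collections import deque

# Hypothetical toy vocabulary: each term maps to its direct child terms.
# "programmed cell death" has two parents, so this is a DAG, not a strict tree.
CHILDREN = {
    "biological_process": ["cell death", "developmental process"],
    "cell death": ["programmed cell death"],
    "developmental process": ["programmed cell death"],
    "programmed cell death": ["apoptotic process"],
    "apoptotic process": [],
}

# Hypothetical annotations: term -> genes annotated directly to that term.
ANNOTATIONS = {
    "cell death": {"GeneA"},
    "programmed cell death": {"GeneB"},
    "apoptotic process": {"GeneC", "GeneD"},
}


def descendants(term, children):
    """Return the term itself plus every sub term reachable from it."""
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:  # a shared child is visited only once
                seen.add(child)
                queue.append(child)
    return seen


def genes_for_term(term, children, annotations):
    """Collect genes annotated to the term or to any of its sub terms."""
    genes = set()
    for t in descendants(term, children):
        genes |= annotations.get(t, set())
    return genes
```

Calling `genes_for_term("cell death", CHILDREN, ANNOTATIONS)` gathers the genes attached to "cell death" and to everything beneath it, which is the kind of roll-up a query on a parent term would need.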

SNP Data

  • MGD provides comprehensive information about reference SNPs including the reference flanking sequence, assays that comprise the SNP, gene/marker associations with their corresponding function class annotations, and links to popular gene browsers including Mouse Genome Browser and its transcript, gene model, and MGD-curated phenotype and allele tracks.
  • The Mouse SNP Query Form lets you search for RefSNPs by strains, strain comparisons, RefSNP attributes, map position, marker range, or associated genes.

Molecular Probes and Clones

Probes, clones, primers, antibodies, etc. associated with MGI data for a gene or genome feature are available via a Molecular reagents link on Gene (or genome feature) Detail pages or from a link on References -- Query Results Detail pages to Molecular Probes and Clones.

Information on genetic polymorphisms is extracted from probe/clone records in MGD.

Vertebrate Homology

MGD contains homology information for mouse, human, rat, cattle and other vertebrate organisms.

MGI provides a curated set of vertebrate homologs for the research community. MGI focuses on integration of homology sets from sequenced vertebrate genomes (e.g., human, rat, dog, chimp). MGI loads sequence based vertebrate homology assertions from NCBI HomoloGene. HomoloGene programmatically detects homologs among the genome features of several completely sequenced eukaryotic genomes. In addition, we continue to work with the research community to carefully curate gene family sets, usually at the instigation of the research community.

Homologous genes associated with a mouse gene or genome feature are available via links in the Vertebrate Homology section of the Detail page for that gene/genome feature. Performing a Quick Search using a non-mouse gene or sequence accession ID returns a link to the Vertebrate Homology Class page.

Mapping Data

MGD contains genetic mapping and linkage data, including haplotype data for linkage crosses, in situ hybridization data, deletion mapping information, translocation breakpoint mapping, somatic cell hybrids, concordance tables, congenic strains information, and physical mapping information.

Centimorgan positions for genes and markers in MGI are based on linear interpolation using the standard genetic map described in Cox et al. (2009) (PMID).
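
As a rough illustration of linear interpolation between mapped markers, the sketch below estimates a cM position from a base-pair coordinate. The anchor values are invented for the example and are not taken from the standard genetic map; clamping at the ends of the map is also an assumption of this sketch, not a documented MGI behaviour.

```python
from bisect import bisect_right

# Hypothetical anchor markers on one chromosome: (bp position, cM position).
# These numbers are illustrative only.
ANCHORS = [(3_000_000, 1.0), (10_000_000, 4.0), (20_000_000, 10.0)]


def interpolate_cm(bp, anchors):
    """Estimate a cM position for `bp` by linear interpolation.

    `anchors` must be sorted by bp position. Positions outside the anchored
    range are clamped to the nearest anchor (an assumption of this sketch).
    """
    positions = [a[0] for a in anchors]
    if bp <= positions[0]:
        return anchors[0][1]
    if bp >= positions[-1]:
        return anchors[-1][1]
    i = bisect_right(positions, bp)  # index of first anchor beyond bp
    (bp0, cm0), (bp1, cm1) = anchors[i - 1], anchors[i]
    fraction = (bp - bp0) / (bp1 - bp0)
    return cm0 + fraction * (cm1 - cm0)
```

With these made-up anchors, a marker halfway between the first two anchors (6,500,000 bp) is assigned 2.5 cM.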


DNA Mapping Panel Data Sets

    Copeland-Jenkins: (C57BL/6J x M. spretus)F1 x C57BL/6J
    JAX Mouse Mutant Resource BCB: (C57BL/6J x CAST/Ei)F1 x C57BL/6J
    JAX Mouse Mutant Resource BSS: (C57BL/6J x SPRET/Ei)F1 x SPRET/Ei
    Kozak FvC58: (NFS/N x M. spretus)F1 x C58/J
    Kozak FvSpr: (NFS/N x M. spretus)F1 x M. spretus
    Kozak Skive: (NFS/N or C58/J x M. m. musculus)F1 x M. m. musculus
    Seldin: (C3H/HeJ-Fasl<gld> x M. spretus)F1 x C3H/HeJ-Fasl<gld>
    UCLA (BSB): (C57BL/6J x M. spretus)F1 x C57BL/6J

DNA Mapping Panel data may appear in tabular format, where each column represents a single offspring of the cross, and each row indicates, for each locus, which allele is present in each of the offspring. The order of rows is determined by linkage on the chromosome, and the locus nearest the centromere appears at the top of the display. Centimorgan locations for loci in the cross are determined by the provider of the cross.
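
To make the tabular layout concrete, here is a minimal sketch that treats each row as a locus and each column as one offspring, then counts the offspring whose allele differs between two loci. The panel contents, locus names, and allele codes ("B", "S") are hypothetical, and this simple ratio is only an illustration, not the method any cross provider uses to assign centimorgan locations.

```python
# Hypothetical panel: one string per locus, one character per offspring.
# "B" and "S" stand for the two parental alleles of an illustrative backcross.
PANEL = {
    "LocusA": "BBSSBSBB",
    "LocusB": "BBSSBSSB",
    "LocusC": "BSSSBSSB",
}


def recombination_fraction(row1, row2):
    """Fraction of offspring whose allele differs between two loci."""
    if len(row1) != len(row2) or not row1:
        raise ValueError("rows must be non-empty and of equal length")
    recombinants = sum(a != b for a, b in zip(row1, row2))
    return recombinants / len(row1)
```

In this toy panel, LocusA and LocusB differ in one of eight offspring, and LocusA and LocusC differ in two, which is consistent with the row order reflecting linkage along the chromosome.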

Graphical Map Displays

Genetic Maps

Where available, gene/genome feature detail pages provide a link to a Detailed Genetic Map that shows all markers within one cM of the marker.

Pleiotropy Definition

In pleiotropy, one gene controls the expression of several phenotypic traits. Phenotypes are traits that are physically expressed, such as color, body shape, and height. It is often difficult to detect which traits may be the result of pleiotropy unless a mutation occurs in a gene. Because pleiotropic genes control multiple traits, a mutation in a pleiotropic gene will impact more than one trait.

Typically, traits are determined by two alleles (variant forms of a gene). Specific allele combinations determine the production of proteins, which drive the processes for the development of phenotypic traits. A mutation occurring in a gene alters the DNA sequence of the gene. Changing gene segment sequences most often results in non-functioning proteins. In a pleiotropic gene, all of the traits associated with the gene will be altered by the mutation.

Gene pleiotropy, also referred to as molecular-gene pleiotropy, focuses on the number of functions of a particular gene. The functions are determined by the number of traits and biochemical factors impacted by a gene. Biochemical factors include the number of enzyme reactions catalyzed by the protein products of the gene.

Developmental pleiotropy focuses on mutations and their influence on multiple traits. The mutation of a single gene manifests in the alteration of several different traits. Diseases involving mutational pleiotropy are characterized by deficiencies in multiple organs that impact several body systems.

Selectional pleiotropy focuses on the number of separate fitness components affected by a gene mutation. The term fitness relates to how successful a particular organism is at transferring its genes to the next generation through sexual reproduction. This type of pleiotropy is concerned only with the impact of natural selection on traits.


Identification of homologous ACE1 clusters in other filamentous fungi

The ACE1 secondary metabolism gene cluster of M. grisea comprises 15 genes: ACE1 and SYN2 are PKS-NRPS hybrid genes; RAP1 and RAP2 code for enoyl reductases; CYP1-CYP4 for cytochrome P450 monooxygenases; ORFZ for an α/β-hydrolase; OXR1 and OXR2 for oxidoreductases; MFS1 codes for a transporter in the MFS superfamily; BC2 codes for a binuclear zinc finger transcription factor; OME1 codes for an O-methyl transferase; and ORF3 has no homology to known proteins (Collemare et al, unpublished results). To find gene clusters homologous to the ACE1 cluster in other fungal species, we used an algorithm that searched 26 fungal genomes for loci where at least three likely orthologs of genes from the ACE1 cluster were linked (see Materials and methods). This search identified nine similar clusters in seven fungal species from the subphylum Pezizomycotina: three Sordariomycetes (Chaetomium globosum, Fusarium oxysporum and F. verticillioides), one Dothideomycete (Stagonospora nodorum) and three Eurotiomycetes (Aspergillus clavatus, Coccidioides immitis and Uncinocarpus reesii) (Figure 1).
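
The search criterion (at least three likely orthologs of ACE1 cluster genes linked at one locus) can be sketched as follows. This is not the authors' algorithm, which is described in their Materials and methods; the input format, the `max_gap` linkage threshold, and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def find_candidate_clusters(hits, min_genes=3, max_gap=5):
    """Find loci where several distinct cluster genes have linked orthologs.

    `hits` is a list of (contig, gene_index, query_gene) tuples, where
    gene_index is the ordinal position of the hit gene along the contig and
    query_gene is the ACE1 cluster gene it is a likely ortholog of.
    Hits on the same contig are chained while consecutive hits lie within
    `max_gap` genes of each other (an assumed definition of "linked").
    """
    by_contig = defaultdict(list)
    for contig, index, query in hits:
        by_contig[contig].append((index, query))

    clusters = []
    for contig, items in by_contig.items():
        items.sort()
        runs = [[items[0]]]
        for item in items[1:]:
            if item[0] - runs[-1][-1][0] <= max_gap:
                runs[-1].append(item)  # still linked to the current run
            else:
                runs.append([item])    # too far away: start a new run
        for run in runs:
            genes = sorted({query for _, query in run})
            if len(genes) >= min_genes:  # count distinct query genes only
                clusters.append((contig, run[0][0], run[-1][0], genes))
    return clusters


# Hypothetical hits: three linked orthologs on contig "c1", one distant hit,
# and two hits to the same query gene on "c2" (which should not qualify).
EXAMPLE_HITS = [
    ("c1", 10, "ACE1"), ("c1", 11, "RAP1"), ("c1", 13, "ORF3"),
    ("c1", 40, "CYP1"),
    ("c2", 5, "ACE1"), ("c2", 6, "ACE1"),
]
```

Counting distinct query genes rather than raw hits is a deliberate choice in this sketch: two paralogous hits to the same ACE1 cluster gene should not by themselves make a locus look like a cluster.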

Figure 1. ACE1 and ACE1-like gene clusters in filamentous fungi. Colors indicate gene orthology in different species and paralogs in the same species. Horizontal lines indicate genes that are adjacent in the genome, with gene orientations as shown. Genomic regions are not drawn to scale. Parts A and B of the M. grisea cluster as identified in the text are marked. The core set of three genes inferred to have been present in the ancestral cluster are boxed. Vertical lines indicate the closest relatives of genes in the M. grisea cluster and one of the A. clavatus clusters, based on phylogenetic analyses (Figure 2 and Additional data file 1). The species phylogeny is based on the whole-genome supertree analysis of Fitzpatrick et al [27]; in that study the placement of Dothideomycetes relative to Sordariomycetes and Eurotiomycetes varied depending on the method of analysis, so we have shown it as a trichotomy. The analysis of Hane et al of the complete S. nodorum genome placed Dothideomycetes and Sordariomycetes in a clade with Eurotiomycetes outside [47]. Species-specific gene nomenclature is shown, except for M. grisea (Collemare et al, unpublished results). Red, green and blue coloring of species names corresponds to the labelling of individual genes from the clusters in Figure 2 and Additional data file 1.

Two types of clusters related to the ACE1 cluster were identified: large clusters with eight or more genes are found in M. grisea, C. globosum and S. nodorum, whereas smaller clusters with three to six genes are found in the three Eurotiomycetes and in Fusarium species (Figure 1). C. globosum is unusual as its genome contains two large ACE1-like clusters, which we refer to as clusters 1 and 2. Similarly, the A. clavatus genome has two clusters as discussed below. Interestingly, a core set of three genes (homologs of ACE1, RAP1 and ORF3 boxed in Figure 1) is present in all eight species. The presence of this core suggests that the physical linkage between these three genes is ancient and can be inferred to have existed in the common ancestor of all the genomes considered in Figure 1. As well as the genes in the eight clusters shown in Figure 1, we also identified a small number of single homologs of genes from the M. grisea ACE1 cluster that are located at dispersed genomic locations in other species.

Phylogenetic analysis of the ACE1 cluster in filamentous fungi

Gene-by-gene phylogenetic analyses were carried out to decipher the evolutionary history of the loci using homologs (even at dispersed locations) of genes from M. grisea ACE1 cluster (Figure 2 and Additional data file 1). The first trend evident from this phylogenetic analysis is that genes from clusters in Eurotiomycetes and Fusarium spp. are distant from those of the M. grisea, C. globosum and S. nodorum clusters. Indeed, genes in clusters from these last three species define clades supported by high bootstrap values (> 91%), to the exclusion of genes from Eurotiomycetes and Fusarium species (Figure 2a,b,e,f). Interestingly, genes from one of the two clusters in A. clavatus are more closely related to genes in the M. grisea ACE1 cluster than to those in ACE1-like clusters from other Eurotiomycetes (see below). In view of the gene contents of the clusters and their phylogenetic relationships, we refer to the large clusters in M. grisea, C. globosum, S. nodorum and the larger of the two clusters in A. clavatus as "ACE1 clusters", and to the smaller clusters in Eurotiomycetes and Fusarium spp. as "ACE1-like clusters". These two types of cluster have probably had a long history of independent evolution, although they certainly share a common ancestor.

Figure 2. Maximum likelihood trees for ACE1 cluster genes and their homologs. (a) ACE1 and SYN2; (b) RAP1 and RAP2; (c) CYP1 and CYP4; (d) CYP2 and CYP3; (e) ORF3; (f) ORFZ. In each tree, genes that appear in Figure 1 are named in color or bold black. Yellow highlighting shows the five genes in the A. clavatus ACE1 cluster whose closest relatives are genes from part B of the M. grisea cluster. Bootstrap percentages are shown for all nodes. Trees were constructed from amino acid sequences as described in Methods using PHYML after alignment with ClustalW and Gblocks filtering. Trees for the other five genes in the ACE1 cluster are shown in Additional data file 1. The values of the shape parameter (α) for the gamma distribution were estimated from the data as 1.329, 1.441, 2.476, 2.615, 2.536 and 0.961 for panels a-f, respectively. The proportions of invariant sites are 0.028, 0.035, 0.030, 0.068, 0.000 and 0.000, respectively. The M. grisea SYN2 gene corresponds to parts of the automatically-annotated gene models MGG_12452.5 and MGG_12451.5.

We then focused on the origins of the duplicated genes in the M. grisea cluster. Phylogenetic trees show clearly that in M. grisea RAP2 is a paralog of RAP1, CYP3 is a paralog of CYP2, CYP4 is a paralog of CYP1, and SYN2 is a paralog of ACE1 (Figure 2a-d). Notably, in each of these pairs, one gene is located on the left-hand side of the M. grisea cluster and the other is on the right-hand side. Thus the M. grisea cluster appears to have undergone partial tandem duplication at some stage during its evolution, although the gene order is not conserved between the two parts. The presence of two ACE1 clusters in C. globosum is suggestive of a second block-duplication event in this species. However, for most genes present in both C. globosum ACE1 clusters, the copy from cluster 1 forms a clade with their M. grisea homologs. This close phylogenetic relationship is observed for ACE1, RAP1, ORFZ, OXR1, CYP1, and OXR2. The only exception to this pattern is M. grisea ORF3, which is marginally closer to the C. globosum cluster 2 gene, but with low bootstrap support (Figure 2 and Additional data file 1). This observation suggests that the duplication that gave rise to the current C. globosum clusters 1 and 2 occurred in a common ancestor of C. globosum and M. grisea, and that the corresponding cluster 2 in M. grisea was lost.

On the basis of this analysis, we divided the M. grisea cluster into two parts, A and B, so that each of the duplicated genes in M. grisea has one copy in part A and one in part B (Figure 1). Part A in M. grisea consists of nine genes, all of which have orthologs in one or both of the clusters in its closest relative C. globosum. The clusters in other species consist of homologs of genes from M. grisea part A, plus one gene from part B (ORF3; see Discussion). The order of the part A genes is not conserved among M. grisea, C. globosum and S. nodorum.

Surprisingly, this phylogenetic analysis shows that five of the six genes from part B of the M. grisea ACE1 cluster group with genes from the larger of the two clusters in A. clavatus, rather than with the genes in the more closely related (Sordariomycete) species C. globosum, or with their part A paralogs in M. grisea. Bootstrap values for grouping the M. grisea part B genes SYN2, RAP2, CYP4, CYP3 and ORF3 with their A. clavatus homologs are 98-100% (Figure 2a-e). The only gene from part B of the M. grisea cluster that does not group with A. clavatus is OME1 (panel e of Additional data file 1), but this is also the only gene whose detected homolog in A. clavatus (ACLA_002520) is not physically clustered with the others, which calls its orthology into question. The consistency of this phylogenetic result for part B genes, and its disagreement with the expected species relationships, are indicative of HGT between A. clavatus and part B of the M. grisea cluster. In contrast, seven of the nine genes from part A of the M. grisea cluster, including ACE1 itself, lie at the expected phylogenetic position, forming a clade with C. globosum (Figure 2 and Additional data file 1; the two exceptions are CYP2, which is discordant but has a low bootstrap value of 66%, and MFS1, which cannot be analyzed because there is no homolog in the C. globosum clusters).

For the four panels in Figure 2 that include sequences from other Eurotiomycetes (C. immitis and U. reesii) as well as A. clavatus, we used the likelihood ratio test (LRT) to test whether the topologies shown (Figure 2a,b,e,f) have significantly higher likelihoods than alternative trees where the Eurotiomycetes were constrained to form a monophyletic group. In all four cases the topology shown in Figure 2 is significantly more likely than the tree expected if genes were inherited vertically (p < 0.001 for each).

Identifying the direction of gene transfer

To determine whether part B of the cluster was transferred from an M. grisea-like donor to an ancestor of A. clavatus, or vice versa, we examined phylogenetic trees constructed from those genes that have orthologs both in species that are close relatives of M. grisea and in species that are closer to A. clavatus. We would predict that if an ancestor of A. clavatus was the recipient of HGT, then the genes in its ACE1 cluster would not show the expected close relationship to other Eurotiomycete species such as C. immitis and U. reesii (Figure 1), and would instead form a clade with the donor lineage (represented by M. grisea). Conversely, if the direction of transfer was from an A. clavatus-like donor into the M. grisea lineage, we would expect the M. grisea part B genes not to form a monophyletic clade with the other Sordariomycete species C. globosum, and instead to group with A. clavatus.

In the phylogenetic tree of ORF3 sequences, the shared A. clavatus-M. grisea branch lies within a clade that contains homologs from the two clusters in C. globosum, as well as the Dothideomycete S. nodorum (Figure 2e). The ORF3 orthologs from C. immitis and U. reesii clearly lie outside this clade with 95% bootstrap support. Similarly, the phylogenetic tree of RAP1 and RAP2 orthologs (Figure 2b) shows that the shared branch containing the A. clavatus gene and the part B M. grisea gene (RAP2) lies within a larger clade that includes the C. globosum and M. grisea part A (RAP1) orthologs. The homologs from C. immitis and U. reesii lie outside (91% bootstrap support). Likewise, the phylogenetic tree of the ACE1-SYN2 pair (Figure 2a) places the A. clavatus sequence within a Sordariomycete/Dothideomycete clade, distant from the other Eurotiomycetes (C. immitis and U. reesii). These topologies all indicate that an ancestor of M. grisea was the donor of the transferred part B genes, and an ancestor of A. clavatus was the recipient.

ORFZ is the only gene in the A. clavatus ACE1 cluster that does not have a homolog in part B of the M. grisea cluster. The origin of this gene in A. clavatus is not clear. Phylogenetic analysis (Figure 2f) indicates that A. clavatus ORFZ does not group with the C. immitis and U. reesii genes, and this conclusion is supported by the LRT. This result suggests a foreign origin for A. clavatus ORFZ, but the absence of a homolog in M. grisea part B makes it impossible to test whether this gene has a similar origin to its five neighboring genes in A. clavatus.

We conclude that there is phylogenetic support for the hypothesis that at least five of the six genes in the ACE1 cluster of A. clavatus originated by HGT, and that the most probable single donor is a Sordariomycete ancestor related to M. grisea.


Talking Glossary of Genetic Terms from National Human Genome Research Institute.

Activator (Ac) An autonomous transposable element in maize that also controls the movement of another transposon, Dissociator (Ds).
adenine One of the four bases that make up DNA. Abbreviated with an 'A.'
adenovirus Adenoviruses contain double-stranded DNA and are unusually stable, allowing them to survive for prolonged periods outside of the body. Adenoviruses infect membranes of the respiratory tract, eyes, intestines, and urinary tract, and can cause respiratory infections and gastrointestinal upsets.
agarose gel electrophoresis A matrix composed of a highly purified form of agar is used to separate larger DNA and RNA molecules ranging from 100 to 20,000 nucleotides.
alkaptonuria A rare inherited condition in which a person's urine turns a dark brownish-black color when exposed to air. A mutation in the HGD gene causes the condition.
allele An alternate version of a gene, e.g., Gregor Mendel's pea plants have flowers with two colors: white and reddish-purple. The flower color gene in this case has two alleles, one for white and the other for reddish-purple.
amino acid A class of molecules that are the building blocks of proteins. There are 20 different amino acids used to make up proteins.
apoptosis (cell death) The natural process of programmed cell death as part of normal growth and development.
Antennapedia A homeotic mutation of Drosophila, where a pair of legs replaces the antenna.
autosome A chromosome that is not involved in sex determination.
bacterium/bacteria A single-cell prokaryotic organism. (see Escherichia coli)
bacteriophage (phage or phage particle) A virus that infects bacteria. Altered forms are used as vectors for cloning DNA.
base pairing Discovery made by James Watson that was key in solving the structure of DNA. The bases that make up DNA pair with each other: A to T and G to C. The pairs are held together by hydrogen bonds.
BRCA1/BRCA2 Mutations in the BRCA1 and BRCA2 tumor suppressor genes are linked to inherited breast and ovarian cancer.
cDNA library A library composed of complementary copies of cellular mRNAs.
cell cycle The process of cell division and replication. In eukaryotic cells, the cell cycle has two periods: interphase, when the cell grows and duplicates its DNA (G1, S and G2 phases), and mitosis (M phase), when the cell splits into two daughter cells.
Central Dogma Theory developed by James Watson and Francis Crick to describe the process of protein production: DNA to RNA to protein.
centromeres Regions of chromosomes made up of non-coding, highly repetitive DNA. During mitosis this region connects the sister chromatids and attaches the spindle fibers that draw the separated chromosomes to opposite poles.
chromatid Each of the two daughter strands of a duplicated chromosome joined at the centromere during mitosis and meiosis.
chromatin Refers to the combined DNA and protein material that coils up to form chromosomes.
chromosome Chromosomes are packages of DNA found in the nucleus of cells. Humans have 46 chromosomes.
clone Refers to a replica. DNA molecules can be cloned using bacteria or viruses as hosts. A genetic clone can also refer to an organism that is a genetic copy of the original, produced using various in vitro techniques.
codon Three bases in a DNA sequence that encodes the type of amino acid to be placed in the protein. For example, the codon G-T-G signifies the amino acid valine.
complementary DNA (cDNA) The matching strand of a DNA molecule to which its bases pair.
conjugation The transfer of genetic material between bacterial cells. DNA is passed from the donor cell to the recipient cell via direct cell-to-cell contact or a connection between two bacterial cells.
crossing-over The exchange of DNA sequences between chromatids of homologous chromosomes during meiosis.
cytosine One of the four bases that make up DNA. Abbreviated with a 'C.'
DNA (Deoxyribonucleic acid) An organic acid and polymer composed of four nitrogenous bases (adenine, thymine, cytosine, and guanine) linked via intervening units of phosphate and the pentose sugar deoxyribose. DNA is the genetic material of most organisms and usually exists as a double-stranded molecule in which two antiparallel strands are held together by hydrogen bonds between adenine-thymine and cytosine-guanine.
DNA array DNA arrays (also known as microarrays or gene chips) can analyze patterns of gene expression and show how gene expression responds to external factors.
DNA sequencing Procedures for determining the nucleotide sequence of a DNA fragment.
deletion A mutation in DNA where a section of DNA is deleted. Some deletions have no effect while others can have a drastic effect, depending on how much DNA was deleted and whether the deletion was in a region of DNA that encoded a protein.
density gradient centrifugation A technique where cells, organelles or molecules can be separated using centrifugal force, depending on their size, shape and density.
deoxyribonucleic acid (See DNA)
dideoxynucleotide A deoxynucleotide that lacks a 3' hydroxyl group, and is thus unable to form a 3'-5' phosphodiester bond necessary for chain elongation. Dideoxynucleotides are used in DNA sequencing and the treatment of viral diseases.
diploid The condition when the genome of an organism consists of two copies of each chromosome.
Dissociator (Ds) A transposable element in maize, whose mobility is dependent on another element, Activator (Ac).
dominant A genetic trait or disorder is dominant when only one copy of the gene is necessary for the trait to develop. A recessive trait or disorder develops when two copies of the gene are inherited.
double helix Describes the coiling of the antiparallel strands of the DNA molecule, resembling a spiral staircase in which the paired bases form the steps and the sugar-phosphate backbones form the rails.
Drosophila melanogaster (fruit fly) An animal model system used to study genetics. Fruit flies are easy to maintain, have large numbers of offspring, and grow quickly, and their genome is relatively straightforward to disrupt and to modify by introducing foreign genes.
electrophoresis The technique of separating charged molecules in a matrix to which is applied an electrical field.
Escherichia coli (bacteria) A type of bacteria that inhabits the human colon. E. coli is one of the most important organisms in molecular genetics research, particularly in the area of recombinant DNA, as it serves as a host for plasmid and other cloning vectors.
eugenics An effort to breed better human beings by encouraging the reproduction of people with "good" genes and discouraging those with "bad" genes. The eugenics movement of the 19th century started in the United States and was eventually discredited.
evolution The long-term process through which a population of organisms accumulates genetic changes that enable its members to successfully adapt to environmental conditions and to better exploit food resources.
exon A section of a gene that contains the instructions for making a protein.
gene A portion of DNA that contains instructions for making a protein.
gene expression The process of producing a protein from its DNA- and mRNA-coding sequences.
genetic (linkage) map A linear map of the relative positions of genes along a chromosome. Distances are established by linkage analysis, which determines the frequency at which two gene loci become separated during chromosomal recombination.
GeneChips® A process of making and embedding specific DNA sequences onto a surface to be used for large-scale DNA analysis (see also DNA array).
genetic code The three-letter code that translates nucleic acid sequence into protein sequence (see also codon).
genome Refers to all the DNA of an organism: the entire genetic component.
genotype The structure of DNA that determines the expression of a trait (phenotype).
germ line The sequence of cells that have genetic material that may be passed on to offspring, e.g., zygote to gametocyte to sperm and egg cells. Germ line cells can reproduce indefinitely.
guanine One of the four bases that make up DNA. Abbreviated with a 'G.'
haploid The chromosome number equal to one complete set of the genetic endowment of a eukaryotic organism.
heterozygous Possessing two different forms (alleles) of a gene, one inherited from each parent. Also known as a "gene carrier".
histone Any of five related proteins, composed primarily of basic amino acids, which are the scaffold around which DNA is wound to form the chromatin structure of eukaryotic chromosomes.
homeotic gene Genes that control developmental patterns, determining where, when, and how body segments and structures develop. Alterations in these genes cause changes in patterns of body parts in animals and plants.
homolog Any member of a set of genes or DNA sequences from different organisms whose nucleotide sequences show a high degree of one-to-one correspondence.
homozygous Possessing two identical forms (alleles) of a gene, one inherited from each parent.
Human artificial chromosome (HAC) An artificial vector used to transfer or express large fragments of human DNA, similar to a yeast artificial chromosome (YAC) and bacterial artificial chromosome (BAC).
Human Genome Project An international collaboration to sequence and annotate the entire human genome. The public part of the project was undertaken by various sequencing centers around the world with contributions from thousands of scientists. The private part of the project was undertaken by Celera Genomics.
hybrid The offspring of two parents differing in at least one genetic characteristic (trait). Also, heteroduplex DNA or DNA-RNA molecule.
hybridization The hydrogen bonding of complementary DNA and/or RNA sequences to form a duplex molecule.
hydrogen bond A relatively weak bond formed between a hydrogen atom (which is covalently bound to a nitrogen or oxygen atom) and a nitrogen or oxygen with an unshared electron pair.
insertion A mutation in DNA where a section of DNA is inserted. Some insertions have no effect while others can have a drastic effect, depending on how much DNA was inserted and whether the insertion was in a region of DNA that encoded a protein.
insulin A peptide hormone secreted by the islets of Langerhans of the pancreas that regulates the level of sugar in the blood.
interferon A family of small proteins that stimulate viral resistance in cells.
intron A section of a gene that does not contain any instructions for making a protein. Introns separate exons (the coding sections of a gene) from each other.
knockout A gene that is prevented from functioning in an organism. "Knockout genes" can be used to study the unknown effects of a particular gene. A "knockout mouse" is a mouse that has had one or more genes "knocked out".
lac operon An "operon" is the close arrangement of related genes and their common regulation. The lac operon controls the transport and metabolism of lactose in bacteria, and involves both positive and negative regulation.
library A collection of cells, usually bacteria or yeast, that have been transformed with recombinant vectors carrying DNA inserts from a single species.
linkage The frequency of coinheritance of a pair of genes and/or genetic markers, which provides a measure of their physical proximity to one another on a chromosome.
mapping Determining the physical location of a gene or genetic marker on a chromosome.
marker A gene or DNA sequence with a known chromosomal location, used to identify cells, individuals, or species.
meiosis The reduction division process by which haploid gametes and spores are formed, consisting of a single duplication of the genetic material followed by two successive cell divisions (meiosis I and meiosis II).
Mendelian inheritance Rules describing the inheritance of characteristics, based around dominant and recessive patterns of inheritance. Mendel's First Law states that each gamete receives only one of the two alleles a parent carries for a gene. Mendel's Second Law states that alleles of different genes sort independently of one another during gamete formation.
messenger RNA (mRNA) A type of RNA involved in protein production. DNA is transcribed into mRNA, which is then translated into a chain of amino acids to form a protein.
metaphase A phase during the eukaryotic cell cycle when condensed chromosomes align in the middle of a dividing cell before the cell is separated into two daughter cells.
microarray (see DNA array)
mitosis The process of cell division during which somatic cells are made. In mitosis, one cell divides evenly to produce two daughter cells that have the same chromosome number.
mitochondrion An organelle that generates adenosine triphosphate (ATP), the main source of cellular energy. Mitochondria are also involved in cell signaling, differentiation, and cell cycle control and have their own independent genome, separate from the cell's nuclear DNA.
mitochondrial DNA (mtDNA) The independent genome of the mitochondria, separate from the cell's nuclear DNA. The genome is usually circular, replicates independently, and consists mainly of genes involved with production of adenosine triphosphate (ATP), the main source of cellular energy.
model organism A non-human organism, such as yeast, fruit fly or mouse, used to study biological phenomena, such as genetics. Model organisms share many genes, and biochemical and physiological processes with humans.
mutation A change in the DNA sequence (the A's, C's, G's and T's) of a gene.
Neurospora crassa (bread mold) A haploid model organism, most famously used by Beadle and Tatum to study metabolic pathways and propose the "one gene, one enzyme" hypothesis.
nitrocellulose A membrane used to immobilize DNA, RNA, or protein, which can then be probed with a labeled sequence or antibody.
nitrogenous bases The purines (adenine and guanine) and pyrimidines (thymine, cytosine, and uracil) that comprise DNA and RNA molecules.
nuclease A class of enzymes that degrades DNA and/or RNA molecules by cleaving the phosphodiester bonds that link adjacent nucleotides.
nuclein The term used by Friedrich Miescher to describe the nuclear material he discovered in 1869, which today is known as DNA.
nucleotide A building block of DNA and RNA, consisting of a nitrogenous base, a five-carbon sugar, and a phosphate group.
nucleus The membrane-bound region of a eukaryotic cell that contains the chromosomes.
oncogene A gene that contributes to cancer formation when mutated or inappropriately expressed.
operator A prokaryotic regulatory element that interacts with a repressor to control the transcription of adjacent structural genes.
P53 A protein that acts as a "checkpoint" in cells, inducing either growth arrest, DNA repair, or cell death when the cell's DNA is damaged. Many cancer cells carry mutations in the gene encoding p53.
pedigree A diagram mapping the genetic history of a particular family.
phage (see bacteriophage)
phenotype The expression of a trait based on the genetic makeup or genotype.
phosphate One of the components of DNA. Phosphate groups are joined to sugars by phosphodiester bonds to form the sugar-phosphate backbone. Phosphate is also a component of adenosine triphosphate (ATP), the main source of cellular energy.
physical map A map showing physical locations on a DNA molecule, such as restriction sites, RFLPs, and sequence-tagged sites.
plasmid Circular pieces of DNA, found mainly in bacteria, that exist outside the bacterial chromosome and replicate independently of it. There can be multiple plasmid copies in one bacterial cell. Foreign DNA can be added into plasmids using enzymes to "cut" and "paste" the DNA.
polylinker A short DNA sequence containing several restriction enzyme recognition sites that is contained in cloning vectors.
polymerase An enzyme that catalyzes the addition of multiple subunits to a substrate molecule.
polymerase chain reaction (PCR) A procedure that enzymatically amplifies a target DNA sequence of up to several thousand base pairs through repeated replication by DNA polymerase.
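The effect of "repeated replication" can be illustrated with a little arithmetic. The following is a minimal sketch, assuming an idealized reaction in which every cycle multiplies the number of target copies by (1 + efficiency); the function name `pcr_copies` and the `efficiency` parameter are illustrative assumptions, not part of any standard library or of the glossary.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a given number of PCR cycles.

    Assumes each cycle multiplies the target by (1 + efficiency), where
    efficiency = 1.0 means perfect doubling. Real reactions plateau as
    reagents run out, which this sketch deliberately ignores.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # One template molecule, 30 cycles of perfect doubling.
    print(pcr_copies(1, 30))
```

With perfect doubling, a single template yields 2^30 (roughly a billion) copies after 30 cycles, which is why PCR can start from very small samples.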
polynucleotide A DNA or RNA polymer composed of multiple nucleotides.
polypeptide (protein) A polymer composed of multiple amino acid units linked by peptide bonds.
primer A short DNA or RNA fragment annealed to single-stranded DNA, from which DNA polymerase extends a new DNA strand to produce a duplex molecule.
proinsulin A polypeptide precursor of insulin. Proinsulin is itself produced from preproinsulin, whose 24-amino-acid signal peptide sequence is responsible for extracellular secretion.
prokaryote An organism whose cell(s) lacks a nucleus and other membrane-bound organelles. Prokaryotes include all members of the Kingdom Monera.
promoter A region of DNA extending 150-300 bp upstream from the transcription start site that contains binding sites for RNA polymerase and a number of proteins that regulate the rate of transcription of the adjacent gene.
protease An enzyme that cleaves peptide bonds that link amino acids in protein molecules.
protein A polymer of amino acids linked via peptide bonds and which may be composed of two or more polypeptide chains.
RNA (ribonucleic acid) An organic acid composed of repeating nucleotide units of adenine, guanine, cytosine, and uracil, whose ribose components are linked by phosphodiester bonds.
radioactive probe/tag A fragment of DNA or RNA used to detect specific nucleotide sequences using the principle of DNA complementarity. The probe is "tagged" with a radioactive or fluorescent molecular marker to allow visualization.
reading frame A series of triplet codons beginning from a specific nucleotide. Depending on where one begins, each DNA strand contains three different reading frames.
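The idea that the starting nucleotide determines the codons can be sketched in a few lines. This is a minimal illustration, assuming the strand is given as a plain string of bases read left to right; the function name `reading_frames` is a hypothetical helper, not taken from the glossary or from any library.

```python
def reading_frames(strand: str) -> list[list[str]]:
    """Split one strand into its three reading frames.

    Frame k (k = 0, 1, 2) starts at the k-th nucleotide and reads
    non-overlapping triplets; trailing bases that do not fill a
    complete codon are dropped.
    """
    strand = strand.upper()
    frames = []
    for offset in range(3):
        codons = [strand[i:i + 3] for i in range(offset, len(strand) - 2, 3)]
        frames.append(codons)
    return frames


if __name__ == "__main__":
    for offset, codons in enumerate(reading_frames("ATGGCCTAA")):
        print(offset, codons)
```

For the sequence ATGGCCTAA, the three frames read ATG GCC TAA, TGG CCT, and GGC CTA. The complementary strand has three further frames of its own, which this sketch does not compute.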
receptor A protein molecule that binds one or more specific kinds of signaling molecules (ligands), which causes the receptor to change shape (conformation). This can then cause a loss or gain of protein activity and resultant cellular responses.
recessive A recessive genetic trait or disorder presents only when two copies of the gene variant are inherited. For a dominant trait or disorder, one copy is sufficient for the trait to develop.
recognition sequence (site) A nucleotide sequence, composed typically of 4, 6, or 8 nucleotides, that is recognized by a restriction endonuclease. Type II enzymes cut (and their corresponding modification enzymes methylate) within or very near the recognition sequence.
recombination A process whereby a section of DNA (or sometimes RNA) is broken and joined to another section of DNA, either naturally or artificially, allowing genetic variation. A common natural example is chromosomal crossover.
repressor A DNA-binding protein in prokaryotes that blocks gene transcription by binding to the operator.
restriction endonuclease (enzyme) A class of endonucleases that cleaves DNA after recognizing a specific sequence.
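Sequence-specific cleavage can be sketched as a string search followed by a cut. The example below is a rough illustration, assuming the well-known enzyme EcoRI, which recognizes GAATTC and cuts after the first G on each strand. The function names `find_sites` and `digest` are hypothetical, and only the top strand of a linear molecule is modeled.

```python
def find_sites(dna: str, site: str = "GAATTC") -> list[int]:
    """Return the 0-based start positions of every occurrence of `site`."""
    dna, site = dna.upper(), site.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut a linear top strand `cut_offset` bases into each recognition site.

    The defaults model EcoRI (G^AATTC). The bottom strand is not modeled,
    so the single-stranded overhangs ("sticky ends") are not represented.
    """
    dna = dna.upper()
    fragments = []
    previous = 0
    for position in find_sites(dna, site):
        cut = position + cut_offset
        fragments.append(dna[previous:cut])
        previous = cut
    fragments.append(dna[previous:])
    return fragments


if __name__ == "__main__":
    print(digest("TTGAATTCAAGAATTCGG"))
```

Because the cut falls off center within the recognition sequence, each fragment produced after the first begins with AATTC, the top-strand half of a sticky end.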
restriction fragment length polymorphism (RFLP) Differences in nucleotide sequence between alleles at a chromosomal locus result in restriction fragments of varying lengths detected by Southern analysis.
restriction map A physical map showing the locations of restriction enzyme recognition sites.
retrovirus A member of a class of RNA viruses that utilizes the enzyme reverse transcriptase to reverse copy its genome into a DNA intermediate, which integrates into the host-cell chromosome. Many naturally occurring cancers of vertebrate animals are caused by retroviruses.
reverse transcriptase (RNA-dependent DNA polymerase) An enzyme isolated from retrovirus-infected cells that synthesizes a complementary (c)DNA strand from an RNA template.
ribonucleic acid (RNA) An organic acid composed of repeating nucleotide units of adenine, guanine, cytosine, and uracil, whose ribose components are linked by phosphodiester bonds.
ribosomes Structures made up of proteins and RNA that are the sites of protein production in the cell. Ribosomes decode messenger RNA (mRNA) and assemble amino acids into proteins based on the mRNA script.
Saccharomyces cerevisiae Brewer's yeast.
selectable marker A gene whose expression allows one to identify cells that have been transformed or transfected with a vector containing the marker gene.
semiconservative replication During DNA duplication, each strand of a parent DNA molecule is a template for the synthesis of its new complementary strand. Thus, one half of a preexisting DNA molecule is conserved during each round of replication.
sex chromosome Sex chromosomes in the germ cells of most animals and some plants combine to determine the sex and sex-linked characteristics of an individual. Sex chromosomes are usually designated X and Y, with females being XX and males being XY in mammals.
sex linked A pattern of inheritance when an allele is located on a sex chromosome and the phenotype of the allele is therefore related to the sex of the organism. As humans have more genes on the X chromosome than the Y chromosome, there are more human X-linked traits and diseases than Y-linked.
signal transduction The biochemical events that conduct the signal of a hormone or growth factor from the cell exterior, through the cell membrane, and into the cytoplasm. This involves a number of molecules, including receptors, G proteins, and second messengers.
splicing Processing messenger RNA (mRNA) after transcription to remove introns and join exons to allow translation of mRNA into proteins.
stem cells Stem cells can differentiate into a range of specialized cells (they are "pluripotent"). Embryonic and adult stem cells can be grown artificially and have the potential to be used for medical therapies.
sticky ends A protruding, single-stranded nucleotide sequence produced when a restriction endonuclease cleaves off center in its recognition sequence.
stop codon Any of three mRNA sequences (UGA, UAG, UAA) that do not code for an amino acid and thus signal the end of protein synthesis.
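How a stop codon ends a coding region can be shown by scanning an mRNA in triplets. This is a minimal sketch, assuming the mRNA is a plain string of A, C, G, and U read in a caller-chosen frame; the function name `first_stop_index` is a hypothetical helper.

```python
STOP_CODONS = {"UGA", "UAG", "UAA"}


def first_stop_index(mrna: str, start: int = 0):
    """Return the 0-based index of the first in-frame stop codon, or None.

    Scanning begins at `start` and moves three nucleotides at a time,
    so only codons in that reading frame are considered.
    """
    mrna = mrna.upper()
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return i
    return None


if __name__ == "__main__":
    print(first_stop_index("AUGGCCUAAGGG"))
```

In AUGGCCUAAGGG the in-frame codons are AUG, GCC, and UAA, so protein synthesis would end at the third codon. Shifting the start by one nucleotide reads UGG, CCU, AAG, and no stop is found.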
subtractive hybridization A powerful technique to study gene expression in specific tissues or cell types at a specific stage of development.
suicide gene Genes that cause a cell to undergo cell death ("apoptosis"). These genes may be activated through a variety of signaling processes, with the p53 protein being a common "switch."
supercoiling DNA is tightly packaged within the nucleus by adding twists to create structural tension that is then relieved by supercoil structures. The DNA first wraps around histones to form a 10nm fiber, then this fiber is coiled into a 30nm fiber which is then coiled upon itself again.
supernatant The soluble liquid fraction of a sample after centrifugation or precipitation of insoluble solids.
template An RNA or single-stranded DNA molecule upon which a complementary nucleotide strand is synthesized.
termination codon (see stop codon)
thymine One of the four bases that make up DNA, abbreviated with a "T." Thymine is replaced by uracil in RNA.
trait (see phenotype)
transcription The process through which DNA is copied to messenger RNA (mRNA) for the production of protein.
transfer RNA (tRNA) A specific class of RNA molecules that carry amino acids to the ribosomes for assembly into protein. There is at least one specific tRNA for each of the 20 amino acids.
transformation In prokaryotes, the natural or induced uptake and expression of a foreign DNA sequence (typically a recombinant plasmid in experimental systems). In higher eukaryotes, the conversion of cultured cells to a malignant phenotype (typically through infection by a tumor virus or transfection with an oncogene).
transforming principle The hypothesis, now proven, that bacteria are capable of transferring genetic information through transformation.
transgenic A vertebrate organism in which a foreign DNA gene (a transgene) is stably incorporated into its genome early in embryonic development. The transgene is present in both somatic and germ cells, is expressed in one or more tissues, and is inherited by offspring in a Mendelian fashion.
translation The process through which messenger RNA (mRNA) is used and decoded to produce protein. This happens at the protein-making factories of the cell called ribosomes.
translocation Breakage of a large segment of DNA from one chromosome, followed by the segment's attachment to a different chromosome.
transposon (transposable or movable genetic element) A relatively small DNA segment that has the ability to move from one chromosomal position to another.
trypsin A proteolytic enzyme that hydrolyzes peptide bonds on the carboxyl side of the amino acids arginine and lysine.
Ultrabithorax (Ubx) A homeotic gene that acts on the development of thoracic segments T2 and T3 in Drosophila.
uracil Found in RNA, this pyrimidine derivative pairs with adenine and replaces thymine during DNA transcription.
vector An autonomously replicating DNA molecule into which foreign DNA fragments are inserted and then propagated in a host cell.
virus (see also retrovirus) An infectious particle composed of a protein coat (capsid) and a nucleic acid core, which is dependent on a host organism for replication.