Amino acid coding table. Genetic code: description, characteristics, research history

23.09.2019

The leading scientific journal Nature has announced the discovery of a second genetic code, a kind of "code within a code", which was recently cracked by molecular biologists and computer scientists. Moreover, to reveal it they used not evolutionary theory but information technology.

The new code is called the splicing code, and it resides within the DNA. It controls the underlying genetic code in a very complex yet predictable way, governing how and when genes and regulatory elements are assembled. Revealing this code within a code helps shed light on some long-standing mysteries of genetics that have surfaced since the Human Genome Project. Why are there only about 20,000 genes in an organism as complex as a human being? (Scientists expected to find many more.) Why are genes broken into segments (exons) that are separated by non-coding elements (introns) and then joined together (i.e., spliced) after transcription? And why are genes turned on in some cells and tissues and not in others? For two decades, molecular biologists have tried to elucidate the mechanisms of genetic regulation. The new paper marks a very important step in understanding what is really going on. It doesn't answer every question, but it does demonstrate that the internal code exists. This code is a communication system that can be deciphered so clearly that scientists can predict, with remarkable accuracy, how a genome will behave in certain situations.

Imagine that you hear an orchestra in the next room. You open the door, look inside and see only three or four musicians playing in the room. This is what Brendan Frey, who helped break the code, says the human genome looks like. He says: “We were only able to detect 20,000 genes, but we knew that they form a huge number of protein products and regulatory elements. How? One of the methods is called alternative splicing.” Different exons (parts of genes) can be assembled in different ways. “For example, three genes for the neurexin protein can create over 3,000 genetic messages that help control the brain’s wiring system,” Frey says. The same article says that scientists know that 95% of our genes undergo alternative splicing, and that in most cases the transcripts (RNA molecules resulting from transcription) are expressed differently in different types of cells and tissues. There must be something that controls how these thousands of combinations are assembled and expressed. This is the task of the splicing code.
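How a small number of genes can yield many messages can be illustrated with a toy model. The sketch below is only an illustration under strong assumptions: the exon names are hypothetical, every optional ("cassette") exon is treated as independently included or skipped, and nothing about real tissue-specific regulation is modeled.

```python
from itertools import product

def splice_variants(exons, optional):
    """List the transcripts obtained by keeping or skipping each optional exon.

    `exons` is the ordered list of exon names in the gene (hypothetical names);
    `optional` names the exons that may be skipped. Exon order is preserved.
    """
    optional = set(optional)
    # Constitutive exons are always kept; optional ones may be kept or skipped.
    choices = [(True, False) if name in optional else (True,) for name in exons]
    variants = []
    for keep in product(*choices):
        variants.append("-".join(name for name, k in zip(exons, keep) if k))
    return variants

exons = ["E1", "E2", "E3", "E4", "E5"]
variants = splice_variants(exons, optional=["E2", "E3", "E4"])
print(len(variants))  # number of distinct messages from one toy gene
for v in variants:
    print(v)
```

In this simplified picture the number of possible messages doubles with each optional exon, which is why even a modest gene count can correspond to a very large number of transcripts.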

Readers who want a quick overview of the discovery can read the Science Daily article entitled "Researchers who cracked the 'Splicing Code' unravel the mystery behind biological complexity". The article says: “Scientists at the University of Toronto have gained a fundamental new understanding of how living cells use a limited number of genes to form incredibly complex organs like the brain.” Nature's own coverage begins with Heidi Ledford's "Code Within Code". This was followed by a paper by Tejedor and Valcarcel titled “Gene Regulation: Breaking the Second Genetic Code”. Finally came the decisive paper, "Deciphering the Splicing Code", by a group of researchers from the University of Toronto led by Benjamin J. Blencowe and Brendan J. Frey.

This paper is a victory for information science that brings to mind the codebreakers of World War II. The methods used included algebra, geometry, probability theory, vector calculus, information theory, program code optimization, and other advanced techniques. What the researchers didn't need was evolutionary theory, which is never mentioned in these scientific articles. Reading the abstract of the paper, you can sense how much tension the authors are under:

“We describe a ‘splicing code’ scheme that uses combinations of hundreds of RNA properties to predict tissue-mediated changes in alternative splicing of thousands of exons. The code establishes new classes of splicing patterns, recognizes different regulatory programs in different tissues, and establishes mutation-controlled regulatory sequences. We have uncovered widely used regulatory strategies, including: using unexpectedly large property pools; detection of low levels of exon inclusion, which are attenuated by the properties of specific tissues; the manifestation of properties in introns is deeper than previously thought; and modulation of the levels of the splice variant by the structural characteristics of the transcript. The code helped establish a class of exons whose inclusion mutes expression in adult tissues, activating mRNA degradation, and whose exclusion promotes expression during embryogenesis. The code facilitates the disclosure and detailed description of genome-wide regulated events of alternative splicing.”

The team that cracked the code included specialists from the Department of Electrical and Computer Engineering as well as from the Department of Molecular Genetics. (Frey himself also works for Microsoft Research, a division of Microsoft Corporation.) Like the codebreakers of the past, Frey and Barash developed "a new computer-assisted biological analysis that detects 'code words' hidden within the genome". Using a huge amount of data generated by molecular geneticists, the researchers reverse-engineered the splicing code until they could predict how it would act. Once they got the hang of it, they tested the code against mutations and saw how exons were inserted or removed. They found that the code could even cause tissue-specific changes, or act differently depending on whether the tissue came from an adult mouse or an embryo. One gene, Xpo4, is associated with cancer. The researchers noted: “These data support the conclusion that Xpo4 gene expression must be tightly controlled to avoid potential detrimental effects, including oncogenesis (cancer), since it is active during embryogenesis but is reduced in adult tissues.” It turns out that they were thoroughly surprised by the level of control they saw. Intentionally or not, Frey did not use random variation and selection as a clue, but the language of intelligent design. He noted: "Understanding a complex biological system is like understanding a complex electronic circuit."

Heidi Ledford said that the apparent simplicity of the Watson-Crick genetic code, with its four bases, triplet codons, 20 amino acids, and 64 DNA "characters", hides a whole world of complexity. The splicing code, encapsulated within this simpler code, is much more complex.

But between DNA and proteins lies RNA, a separate world of complexity. RNA is a transformer that sometimes carries genetic messages and sometimes controls them, using many structures that can influence its function. In a paper published in the same issue, a team of researchers led by Benjamin J. Blencowe and Brendan J. Frey at the University of Toronto in Ontario, Canada, report their attempt to unravel a second genetic code, one that can predict how messenger RNA segments transcribed from a particular gene can mix and match to form a variety of products in different tissues. This process is known as alternative splicing. This time there is no simple table; instead there are algorithms that combine more than 200 different properties of DNA with determinations of RNA structure.

The work of these researchers indicates the rapid progress that computational methods have made in modeling RNA. In addition to understanding alternative splicing, computer science is helping scientists predict RNA structures and identify small regulatory fragments of RNA that do not code for proteins. "It's a wonderful time," says Christopher Burge, a computational biologist at the Massachusetts Institute of Technology in Cambridge. “In the future, we will have a huge success.”

Computer science, computational biology, algorithms, and codes were not part of Darwin's vocabulary when he developed his theory. Mendel had a very simplified model of how traits are distributed during inheritance. Moreover, the idea that traits are encoded was only introduced in 1953. We now see that the original genetic code is regulated by an even more complex code embedded in it. These are revolutionary ideas. Moreover, there are all indications that this level of control is not the last. Ledford reminds us that, for example, RNA and proteins have a three-dimensional structure. The function of molecules can change when their shape changes. There must be something that controls folding so that the three-dimensional structure does what the function requires. In addition, access to genes appears to be controlled by another code, the histone code. This code is encoded by molecular markers or "tails" on histone proteins that serve as centers for DNA coiling and supercoiling. Describing our time, Ledford speaks of a "permanent renaissance in RNA informatics".

Tejedor and Valcarcel agree that complexity lies behind simplicity. “In theory, everything looks very simple: DNA forms RNA, which then creates a protein,” they begin their article. “But the reality is much more complicated.” In the 1950s, we learned that all living organisms, from bacteria to humans, have a basic genetic code. But we soon realized that complex organisms (eukaryotes) have a strange and hard-to-understand property: their genomes contain peculiar sections, introns, that must be removed so that exons can join together. Why? Today the fog is clearing. “The main advantage of this mechanism is that it allows different cells to choose alternative ways of splicing the messenger RNA precursor (pre-mRNA) and thus one gene generates different messages,” they explain, "and then different mRNAs can code for different proteins with different functions". From less code you get more information, as long as there is this other code inside the code that knows how to do it.

What makes cracking the splicing code so difficult is that the factors that control exon assembly are set by many other factors: sequences near exon boundaries, intron sequences, and regulatory factors that either aid or inhibit the splicing machinery. Besides, "the effects of a certain sequence or factor may vary depending on its location relative to the boundaries of the intron-exon or other regulatory motifs", Tejedor and Valcarcel explain. “Therefore, the most difficult task in predicting tissue-specific splicing is to compute the algebra of the myriad of motifs and the relationships between the regulatory factors that recognize them.”

To solve this problem, the researchers fed a computer a huge amount of data about the RNA sequences and the conditions under which they were formed. "The computer was then given the task of identifying the combination of properties that would best explain the experimentally established tissue-specific exon selection." In other words, the researchers reverse-engineered the code. Like the World War II codebreakers, once scientists know the algorithm they can make predictions: "It correctly and accurately identified alternative exons and predicted their differential regulation between pairs of tissue types." And just like any good scientific theory, the discovery provided new insights: “This allowed us to re-explain previously established regulatory motifs and pointed to previously unknown properties of known regulators, as well as unexpected functional relationships between them,” the researchers noted. “For example, the code implies that the inclusion of exons leading to processed proteins is a general mechanism for controlling the process of gene expression during the transition from embryonic tissue to adult tissue.”

Tejedor and Valcarcel consider the publication of this paper an important first step: "The work ... is better seen as the discovery of the first fragment of the much larger Rosetta Stone needed to decipher the alternative messages of our genome." According to these scientists, future research will undoubtedly improve our knowledge of this new code. At the end of their article, they mention evolution in passing, and they do it in a very unusual way. They do not say that evolution created these codes; they say that progress will require an understanding of how the codes interact. Another surprise is that the degree of conservation observed to date raises the question of the possible existence of "species-specific codes".

The code probably works in every single cell, and therefore must be responsible for more than 200 types of mammalian cells. It also has to handle a huge variety of alternative splicing patterns, not to mention simple decisions to include or skip a single exon. The limited evolutionary retention of regulation of alternative splicing (estimated to be about 20% between humans and mice) raises the question of the existence of species-specific codes. Moreover, the relationship between DNA processing and gene transcription influences alternative splicing, and recent evidence points to the packaging of DNA by histone proteins and histone covalent modifications (the so-called epigenetic code) in the regulation of splicing. Therefore, future methods will have to establish the exact interaction between the histone code and the splicing code. The same applies to the still little understood influence of complex RNA structures on alternative splicing.

Codes, codes and more codes. The fact that scientists say almost nothing about Darwinism in these papers indicates that evolutionary theorists, adherents of old ideas and traditions, have a lot to think about after they read these papers. But those who are enthusiastic about the biology of codes will be at the forefront. They have a great opportunity to take advantage of the exciting web application that the codebreakers have created to encourage further exploration. It can be found on the University of Toronto website under the name "Alternative Splicing Prediction Website". Visitors will look in vain for any mention of evolution there, despite the old axiom that nothing in biology makes sense without it. The new 2010 version of this expression might sound like this: "Nothing in biology makes sense unless viewed in the light of computer science".

Links and notes

We're glad we were able to tell you about this story on the day it was published. Perhaps this is one of the most significant scientific articles of the year. (Of course, every big discovery made by other groups of scientists, like the discovery of Watson and Crick, is significant.) The only thing we can say to this is: “Wow!” This discovery is a remarkable confirmation of Designed Creation and a huge challenge to the Darwinian empire. It is interesting how evolutionists will try to correct their simplified history of random mutations and natural selection, which was invented back in the 19th century, in the light of these new data.

Do you understand what Tejedor and Valcarcel are talking about? Species may have their own codes, specific to those species. “Therefore, future methods will have to establish the exact interaction between the histone [epigenetic] code and the splicing code,” they note. In translation, this means: “Darwinists have nothing to do with it. They just can't handle it." If the simple genetic code of Watson and Crick was a problem for the Darwinists, then what do they say now about the splicing code, which creates thousands of transcripts from the same genes? And how will they deal with the epigenetic code that controls gene expression? And who knows, maybe this incredible “interaction” that we are just beginning to learn about involves other codes, reminiscent of the Rosetta Stone, that are just beginning to emerge from the sand?

Now that we're thinking about codes and computer science, we're starting to think about different paradigms for new research. What if the genome partially acts as a storage network? What if cryptography takes place in it or compression algorithms occur? We should remember about modern information systems and information storage technologies. Maybe we will even find elements of steganography. Undoubtedly, there are additional resistance mechanisms, such as duplications and corrections, that may help explain the existence of pseudogenes. Whole genome copying may be a response to stress. Some of these phenomena may prove to be useful indicators of historical events that have nothing to do with a universal common ancestor, but help explore comparative genomics within informatics and resistance design, and help understand the cause of a disease.

Evolutionists find themselves in a major quandary. The researchers tried to modify the code, but got only cancer and mutations. How are they going to navigate the fitness landscape when it is mined with catastrophes waiting in the wings as soon as someone starts tampering with these inextricably linked codes? We know there is some built-in resilience and portability, but the whole picture is an incredibly complex, designed, optimized information system, not a jumble of pieces that can be played around with endlessly. The whole idea of code is a concept of intelligent design.

A.E. Wilder-Smith emphasized this. A code presupposes a convention between two parties, and a convention is an agreement made in advance. It implies planning and purpose. The SOS symbol, as Wilder-Smith would say, we use by convention as a distress signal. SOS does not look like a disaster. It doesn't smell like a disaster. It doesn't feel like a disaster. People would not understand that these letters stand for disaster if they did not understand the convention itself. Similarly, an alanine codon, GCC, does not look, smell, or feel like alanine. A codon would have nothing to do with alanine unless there was a pre-established agreement between the two coding systems (protein code and DNA code) that "GCC should stand for alanine." To convey this agreement, a family of transducers, the aminoacyl-tRNA synthetases, is used, which translate one code into the other.

This was to strengthen the theory of design in the 1950s, and many creationists preached it effectively. But evolutionists are like eloquent salesmen. They made up their tales about the Tinker Bell fairy, who deciphers the code and creates new species through mutation and selection, and convinced many people that miracles can still happen today. Well, well, today is the 21st century outside the window and we know the epigenetic code and the splicing code - two codes that are much more complex and dynamic than the simple code of DNA. We know about codes within codes, about codes above codes and below codes - we know a whole hierarchy of codes. This time around, the evolutionists can't just put their finger in the gun and bluff us with their beautiful speeches when guns are placed on both sides - a whole arsenal aimed at their main structural elements. All this is a game. A whole era of computer science has grown around them, they have long gone out of fashion and look like the Greeks, who are trying to climb modern tanks and helicopters with spears.

Sad to admit, evolutionists don't understand this, or even if they do, they're not going to give up. Incidentally, this week, just as the article on the Splicing Code was published, the most vicious and hateful anti-creation and intelligent design rhetoric in recent memory has been pouring from the pages of pro-Darwinian magazines and newspapers. We are yet to hear of many more such examples. And as long as they hold the microphones in their hands and control the institutions, many people will fall for them, thinking that science continues to give them a good reason. We are telling you all this so that you will read this material, study it, understand it, and stock up on the information you need in order to combat this fanatical, misleading nonsense with the truth. Now, go ahead!

Nucleotides line up in chains, and in this way sequences of genetic letters are obtained.

Genetic code

The proteins of almost all living organisms are built from only 20 types of amino acids. These amino acids are called canonical. Each protein is a chain or several chains of amino acids connected in a strictly defined sequence. This sequence determines the structure of the protein, and therefore all its biological properties.

C

CUU (Leu/L) Leucine
CUC (Leu/L) Leucine
CUA (Leu/L) Leucine
CUG (Leu/L) Leucine

In some proteins, non-standard amino acids such as selenocysteine and pyrrolysine are inserted by the ribosome when it reads a stop codon, depending on the sequences in the mRNA. Selenocysteine is now considered the 21st, and pyrrolysine the 22nd, amino acid that makes up proteins.

Despite these exceptions, the genetic code of all living organisms has common features: a codon consists of three nucleotides, where the first two are defining, codons are translated by tRNA and ribosomes into a sequence of amino acids.

Deviations from the standard genetic code

Example | Codon | Usual meaning | Reads as
Some yeasts of the genus Candida | CUG | Leucine | Serine
Mitochondria, in particular in Saccharomyces cerevisiae | CU(U, C, A, G) | Leucine | Serine
Mitochondria of higher plants | CGG | Arginine | Tryptophan
Mitochondria (in all studied organisms without exception) | UGA | Stop | Tryptophan
Mitochondria of mammals, Drosophila, S. cerevisiae and many protozoa | AUA | Isoleucine | Methionine = Start
Prokaryotes | GUG | Valine | Start
Eukaryotes (rare) | CUG | Leucine | Start
Eukaryotes (rare) | GUG | Valine | Start
Prokaryotes (rare) | UUG | Leucine | Start
Eukaryotes (rare) | ACG | Threonine | Start
Mammalian mitochondria | AGC, AGU | Serine | Stop
Drosophila mitochondria | AGA | Arginine | Stop
Mammalian mitochondria | AG(A, G) | Arginine | Stop

The history of ideas about the genetic code

Nevertheless, in the early 1960s, new data revealed the failure of the "comma-free code" hypothesis. Then experiments showed that codons, considered by Crick to be meaningless, can provoke protein synthesis in a test tube, and by 1965 the meaning of all 64 triplets was established. It turned out that some codons are simply redundant, that is, a number of amino acids are encoded by two, four or even six triplets.

Wikimedia Foundation. 2010 .

Lecture 5 Genetic code

Concept definition

The genetic code is a system for recording information about the sequence of amino acids in proteins using the sequence of nucleotides in DNA.

Since DNA is not directly involved in protein synthesis, the code is written in the language of RNA. RNA contains uracil instead of thymine.

Properties of the genetic code

1. Triplet nature.

Each amino acid is encoded by a sequence of 3 nucleotides.

Definition: A triplet or codon is a sequence of three nucleotides that codes for one amino acid.

The code cannot be singlet, since 4 (the number of different nucleotides in DNA) is less than 20. The code cannot be doublet, because 16 (the number of ordered pairs of 4 nucleotides, 4^2) is less than 20. The code can be triplet, because 64 (the number of ordered triples of 4 nucleotides, 4^3) is greater than 20.
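The counting argument above can be checked by brute force. The following is a minimal sketch; the names `BASES`, `AMINO_ACIDS` and `codeword_count` are chosen here purely for illustration.

```python
from itertools import product

BASES = "ACGU"      # the four RNA nucleotides
AMINO_ACIDS = 20    # number of canonical amino acids

def codeword_count(length):
    """Count all ordered code words of the given length over the four bases."""
    return len(list(product(BASES, repeat=length)))

for length in (1, 2, 3):
    n = codeword_count(length)
    verdict = "enough" if n >= AMINO_ACIDS else "too few"
    print(f"word length {length}: {n} code words ({verdict} for {AMINO_ACIDS} amino acids)")
```

Three is therefore the shortest word length that leaves room for all 20 amino acids plus stop signals.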

2. Degeneracy.

All amino acids, with the exception of methionine and tryptophan, are encoded by more than one triplet:

2 amino acids x 1 triplet = 2.

9 amino acids x 2 triplets = 18.

1 amino acid x 3 triplets = 3.

5 amino acids x 4 triplets = 20.

3 amino acids x 6 triplets = 18.

In total, 61 triplets code for 20 amino acids.
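This tally can be reproduced from the standard codon table. In the sketch below the table is written compactly as a 64-letter string of one-letter amino acid symbols in U, C, A, G order, with "*" marking stop codons; the variable names are illustrative only.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols for UUU, UUC, UUA, UUG, UCU, ... GGG ("*" = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# How many codons does each amino acid have (stop codons excluded)?
codons_per_aa = Counter(aa for aa in CODE.values() if aa != "*")
# How many amino acids fall into each degeneracy class?
class_sizes = Counter(codons_per_aa.values())

for n_codons in sorted(class_sizes):
    n_aa = class_sizes[n_codons]
    print(f"{n_aa} amino acids x {n_codons} triplets = {n_aa * n_codons}")
print("sense codons in total:", sum(codons_per_aa.values()))
```

The remaining 3 of the 64 triplets are the stop codons.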

3. The presence of intergenic punctuation marks.

Definition:

A gene is a segment of DNA that codes for one polypeptide chain or for one molecule of tRNA, rRNA or sRNA.

The genes for tRNA, rRNA and sRNA do not code for proteins.

At the end of each gene encoding a polypeptide there is at least one of the 3 stop codons, or stop signals. In mRNA they look like this: UAA, UAG, UGA. They terminate (end) translation.

By convention, the codon AUG, the first one after the leader sequence, is also counted among the punctuation marks. (See lecture 8.) It plays the role of a capital letter. In this position it codes for formylmethionine (in prokaryotes).

4. Unambiguity.

Each triplet encodes only one amino acid or is a translation terminator.

The exception is the codon AUG. In prokaryotes, in the first position (capital letter), it codes for formylmethionine, and in any other position it codes for methionine.
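Because each triplet has a single meaning, the code can be represented as a simple lookup table. The sketch below is a simplified illustration rather than a model of the ribosome: it starts at the first AUG it finds, reads non-overlapping triplets, and stops at the first stop codon. The example mRNA is made up, and the initiating AUG is written as plain "M" without distinguishing formylmethionine.

```python
from itertools import product

BASES = "UCAG"
# Standard code as one-letter symbols for UUU, UUC, UUA, ... GGG ("*" = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(mrna):
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")          # the "capital letter" that opens the message
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODE[mrna[i:i + 3]]
        if amino_acid == "*":         # UAA, UAG or UGA: the closing punctuation mark
            break
        protein.append(amino_acid)
    return "".join(protein)

# Made-up message: leader "GG", then AUG GCC CUG UGG UAA, then trailer "AC".
print(translate("GGAUGGCCCUGUGGUAAAC"))  # prints MALW
```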

5. Compactness, or the absence of intragenic punctuation marks.
Within a gene, each nucleotide is part of a significant codon.

In 1961, Francis Crick, Sydney Brenner and their colleagues, working with the phage genetic system developed by Seymour Benzer, experimentally proved that the code is triplet and compact.

The essence of the experiment: a "+" mutation is the insertion of one nucleotide, and a "-" mutation is the loss of one nucleotide. A single "+" or "-" mutation at the beginning of a gene spoils the entire gene. A double "+" or "-" mutation also spoils the entire gene.

A triple "+" or "-" mutation at the beginning of the gene spoils only part of it. A quadruple "+" or "-" mutation again spoils the entire gene.

The experiment proves that the code is triplet and there are no punctuation marks inside the gene. The experiment was carried out on two adjacent phage genes and showed, in addition, the presence of punctuation marks between genes.
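The logic of the experiment can be imitated on a short made-up sequence. The sketch below inserts one to four extra nucleotides just after the first codon and checks whether the codons at the far end of the gene are still read as before. The gene, the inserted base and the function names are all illustrative assumptions; deletions behave in the same way.

```python
GENE = "AUGGCCCUGUGGGAUCACAAGUUCGGAUAA"  # made-up gene of 10 codons

def codons(seq):
    """Split a sequence into consecutive triplets, dropping an incomplete tail."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

def insert_near_start(seq, count, base="A"):
    """Insert `count` single bases at consecutive positions after the first codon."""
    for pos in range(3 + count - 1, 2, -1):  # rightmost first, so positions stay valid
        seq = seq[:pos] + base + seq[pos:]
    return seq

def downstream_intact(seq, count, tail=5):
    """True if the last `tail` codons are read exactly as in the unmutated gene."""
    return codons(insert_near_start(seq, count))[-tail:] == codons(seq)[-tail:]

for count in (1, 2, 3, 4):
    state = "restored downstream" if downstream_intact(GENE, count) else "shifted to the end"
    print(f'{count} "+" mutation(s): reading frame {state}')
```

Only the triple insertion leaves the downstream codons unchanged, which is what one expects if the message is read in threes with no punctuation inside the gene.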

6. Universality.

The genetic code is the same for all creatures living on Earth.

In 1979, Barrell discovered the "ideal" code of human mitochondria.

Definition:

A genetic code is called "ideal" if it obeys the degeneracy rule of a quasi-doublet code: if the first two nucleotides of two triplets coincide and the third nucleotides belong to the same class (both are purines or both are pyrimidines), then these triplets encode the same amino acid.

There are two exceptions to this rule in the universal code. Both deviations from the ideal code concern fundamental points, the beginning and the end of protein synthesis:

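[Table: codon meanings in the universal code compared with the mitochondrial codes of vertebrates, invertebrates, yeast and plants. Only fragments of the table survive: the codons CUA and AGA and several STOP entries.]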
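The rule of the "ideal" code stated above can be tested mechanically. The sketch below checks every pair of codons that share their first two nucleotides and whose third nucleotides are both purines or both pyrimidines. The standard table is written as a 64-letter string ("*" = stop); the vertebrate mitochondrial variant is built from it using the reassignments listed in NCBI translation table 2, which is a source outside this text.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
UNIVERSAL = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Vertebrate mitochondrial code: UGA = Trp, AUA = Met, AGA and AGG = stop.
VERTEBRATE_MITO = dict(UNIVERSAL, UGA="W", AUA="M", AGA="*", AGG="*")

PYRIMIDINES = "UC"
PURINES = "AG"

def ideal_rule_violations(code):
    """Return codon pairs that break the quasi-doublet degeneracy rule."""
    violations = []
    for first, second in product(BASES, repeat=2):
        prefix = first + second
        for third_class in (PYRIMIDINES, PURINES):
            pair = tuple(prefix + third for third in third_class)
            if code[pair[0]] != code[pair[1]]:
                violations.append(pair)
    return violations

print(ideal_rule_violations(UNIVERSAL))        # [('UGA', 'UGG'), ('AUA', 'AUG')]
print(ideal_rule_violations(VERTEBRATE_MITO))  # []
```

The two pairs found in the universal code involve the start codon AUG and the stop codon UGA, in line with the remark that the deviations concern the beginning and the end of protein synthesis.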

230 substitutions do not change the class of the encoded amino acid.

7. Non-overlapping.

In 1956, George Gamow proposed a variant of an overlapping code. According to Gamow's code, each nucleotide, starting from the third in the gene, is part of 3 codons. When the genetic code was deciphered, it turned out to be non-overlapping, i.e. each nucleotide is part of only one codon.

Advantages of an overlapping genetic code: compactness, and less dependence of the protein structure on the insertion or deletion of a nucleotide.

Disadvantages: strong dependence of the protein structure on nucleotide substitutions, and restrictions on which amino acids can be neighbors.
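The difference between the two reading schemes can be sketched in a few lines. The sequence and function names below are illustrative assumptions; the overlapping reader simply slides a window of three nucleotides one position at a time, as in the fully overlapping scheme described above.

```python
def nonoverlapping_codons(seq):
    """Read consecutive triplets: each nucleotide belongs to exactly one codon."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]

def overlapping_codons(seq):
    """Read a sliding window of three: interior nucleotides belong to 3 codons."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2)]

def codons_changed(reader, seq, pos, new_base):
    """Count how many codons differ after substituting one nucleotide."""
    mutant = seq[:pos] + new_base + seq[pos + 1:]
    return sum(a != b for a, b in zip(reader(seq), reader(mutant)))

seq = "AUGGCCCUGUGG"  # made-up 12-nucleotide message
print(len(nonoverlapping_codons(seq)), "codons without overlap")
print(len(overlapping_codons(seq)), "codons with full overlap")
print(codons_changed(nonoverlapping_codons, seq, 5, "A"), "codon changed without overlap")
print(codons_changed(overlapping_codons, seq, 5, "A"), "codons changed with full overlap")
```

The same 12 nucleotides carry 10 code words instead of 4 when read with overlap (compactness), but a single substitution then alters three neighboring code words instead of one.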

In 1976, the DNA of the φX174 phage was sequenced. It has a single stranded circular DNA of 5375 nucleotides. The phage was known to encode 9 proteins. For 6 of them, genes located one after another were identified.

It turned out that there is an overlap. Gene E lies entirely within gene D. Its initiation codon appears as a result of a one-nucleotide shift in the reading frame. Gene J starts where gene D ends. The initiation codon of gene J overlaps with the termination codon of gene D because of a shift of two nucleotides. This arrangement is called a "reading frame shift" by a number of nucleotides that is not a multiple of three. To date, overlap has only been shown for a few phages.

Information capacity of DNA

There are 6 billion people on Earth. The hereditary information about them is contained in 6x10^9 spermatozoa. According to various estimates, a person has from 30 to 50 thousand genes. All humans together have ~30x10^13 genes, or 30x10^16 base pairs, which make up 10^17 codons. The average book page contains 25x10^2 characters. The DNA of 6x10^9 spermatozoa contains information equal in volume to approximately 4x10^13 book pages. These pages would occupy the volume of 6 NSU buildings. 6x10^9 spermatozoa take up half a thimble. Their DNA takes up less than a quarter of a thimble.
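The arithmetic behind this estimate can be retraced step by step. The sketch below uses the upper figure of 50 thousand genes per person; the value of about 1,000 base pairs per gene and the equating of one codon with one printed character are not stated in the text but are assumptions implied by its numbers.

```python
people = 6 * 10**9            # one spermatozoon per person
genes_per_person = 50_000     # upper estimate quoted in the text
bp_per_gene = 1_000           # assumption implied by the text's figures
chars_per_page = 25 * 10**2   # characters on an average book page

genes_total = people * genes_per_person   # 30 x 10^13 genes
base_pairs = genes_total * bp_per_gene    # 30 x 10^16 base pairs
codons = base_pairs // 3                  # 10^17 codons
pages = codons // chars_per_page          # one codon counted as one character

print(f"genes:      {genes_total:.0e}")
print(f"base pairs: {base_pairs:.0e}")
print(f"codons:     {codons:.0e}")
print(f"book pages: {pages:.0e}")
```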

GENETIC CODE (Greek genetikos, referring to origin; syn.: code, biological code, amino acid code, protein code, nucleic acid code) - a system for recording hereditary information in the nucleic acid molecules of animals, plants, bacteria and viruses by alternating the sequence of nucleotides.

Genetic information is transmitted from cell to cell and from generation to generation, except in RNA-containing viruses, by the replication of DNA molecules (see Replication). The hereditary information of DNA is realized during the life of the cell through 3 types of RNA: messenger (mRNA), ribosomal (rRNA) and transfer (tRNA), which are synthesized on DNA as a template by the enzyme RNA polymerase. The sequence of nucleotides in a DNA molecule uniquely determines the sequence of nucleotides in all three types of RNA (see Transcription). The information of a gene encoding a protein molecule is carried only by mRNA. The end product of the realization of hereditary information is the synthesis of protein molecules, whose specificity is determined by the sequence of their amino acids (see Translation).

Since only 4 different nitrogenous bases are present in DNA or RNA [in DNA: adenine (A), thymine (T), guanine (G), cytosine (C); in RNA: adenine (A), uracil (U), cytosine (C), guanine (G)], and their sequence determines the sequence of 20 amino acids in a protein, the problem of the genetic code arises, i.e., the problem of translating the 4-letter alphabet of nucleic acids into the 20-letter alphabet of polypeptides.

The idea of template synthesis of protein molecules, with a correct prediction of the properties of the hypothetical template, was first formulated by N. K. Koltsov in 1928. In 1944, O. Avery et al. found that DNA molecules are responsible for the transfer of hereditary traits during transformation in pneumococci. In 1948, E. Chargaff showed that in all DNA molecules the corresponding nucleotides are present in equal amounts (A-T, G-C). In 1953, F. Crick, J. Watson and M. H. F. Wilkins, based on this rule and on X-ray diffraction data (see), concluded that a DNA molecule is a double helix consisting of two polynucleotide strands linked by hydrogen bonds. Moreover, only T can lie opposite A of one strand in the other, and only C opposite G. Because of this complementarity, the nucleotide sequence of one strand uniquely determines the sequence of the other. The second significant conclusion that follows from this model is that the DNA molecule is capable of self-reproduction.

In 1954, G. Gamow formulated the problem of the genetic code in its modern form. In 1957, F. Crick put forward the adapter hypothesis, assuming that amino acids interact with the nucleic acid not directly but through intermediaries (now known as tRNA). In the years that followed, all the principal links in the general scheme for the transmission of genetic information, initially hypothetical, were confirmed experimentally. In 1957, mRNA [A. S. Spirin, A. N. Belozersky et al.; E. Volkin and L. Astrachan] and tRNA [M. B. Hoagland] were discovered; in 1960, DNA was synthesized outside the cell using existing DNA macromolecules as a template (A. Kornberg), and DNA-dependent RNA synthesis was discovered [S. B. Weiss et al.]. In 1961, a cell-free system was created in which, in the presence of natural RNA or synthetic polyribonucleotides, protein-like substances were synthesized [M. Nirenberg and J. H. Matthaei]. The problem of understanding the genetic code consisted of studying the general properties of the code and actually deciphering it, that is, finding out which combinations of nucleotides (codons) code for which amino acids.

The general properties of the code were elucidated independently of its deciphering, and mostly before it, by analyzing the molecular patterns of mutation formation (F. Crick et al., 1961; N. V. Luchnik, 1963). They come down to this:

1. The code is universal, i.e. identical, at least in the main, for all living beings.

2. The code is triplet, that is, each amino acid is encoded by a triple of nucleotides.

3. The code is non-overlapping, i.e. a given nucleotide cannot be part of more than one codon.

4. The code is degenerate, that is, one amino acid can be encoded by several triplets.

5. Information about the primary structure of the protein is read from mRNA sequentially, starting from a fixed point.

6. Most of the possible triplets have "meaning", i.e., encode amino acids.

7. Of the three "letters" of the codon, only two (obligate) are of primary importance, while the third (optional) carries much less information.

Direct deciphering of the code would consist in comparing the nucleotide sequence of a structural gene (or of the mRNA synthesized on it) with the amino acid sequence of the corresponding protein. However, this route was at first technically impossible. Two other routes were used: protein synthesis in a cell-free system using artificial polyribonucleotides of known composition as a template, and analysis of the molecular patterns of mutation formation (see). The first brought positive results earlier and historically played a big role in deciphering the genetic code.

In 1961, M. Nirenberg and J. H. Matthaei used a homopolymer as a template, a synthetic polyuridylic acid (i.e., an artificial RNA of composition UUUU...), and obtained polyphenylalanine. It followed that the phenylalanine codon consists of several U, i.e., in the case of a triplet code, it is UUU. Later, along with homopolymers, polyribonucleotides consisting of different nucleotides were used. In this case only the composition of the polymers was known, while the arrangement of nucleotides in them was random, so the analysis of the results was statistical and gave indirect conclusions. Quite quickly, at least one triplet was found for each of the 20 amino acids. It turned out that organic solvents, changes in pH or temperature, some cations and especially antibiotics make the code ambiguous: the same codons begin to stimulate the incorporation of other amino acids, and in some cases one codon began to encode up to four different amino acids. Streptomycin affected the reading of information both in cell-free systems and in vivo, and was effective only on streptomycin-sensitive bacterial strains. In streptomycin-dependent strains, it "corrected" the reading of codons that had been changed by mutation. Such results gave reason to doubt the correctness of deciphering the genetic code with a cell-free system; confirmation was required, primarily from in vivo data.

The main in vivo data on the genetic code were obtained by analyzing the amino acid composition of proteins in organisms treated with mutagens (see) with a known mechanism of action, for example, nitrous acid, which causes the replacement of C by U and of A by G. Useful information is also provided by the analysis of mutations caused by non-specific mutagens, by comparison of differences in the primary structure of related proteins in different species, by the correlation between the composition of DNA and proteins, etc.

Deciphering the genetic code on the basis of in vivo and in vitro data gave coinciding results. Later, three other methods for deciphering the code in cell-free systems were developed: binding of aminoacyl-tRNA (i.e., tRNA with an attached activated amino acid) to trinucleotides of known composition (M. Nirenberg et al., 1965), binding of aminoacyl-tRNA to polynucleotides starting with a certain triplet (Matthaei et al., 1966), and the use as mRNA of polymers in which not only the composition but also the order of nucleotides is known (H. G. Khorana et al., 1965). All three methods complement each other, and the results are consistent with the data obtained in experiments in vivo.

In the 1970s, especially reliable methods for verifying the results of deciphering the genetic code appeared. It is known that mutations arising under the influence of proflavin consist in the loss or insertion of individual nucleotides, which leads to a shift in the reading frame. In the T4 phage, proflavin was used to induce a number of mutations that changed the composition of lysozyme. This composition was analyzed and compared with the codons that should have resulted from a shift in the reading frame. There was a complete match. In addition, this method made it possible to establish which triplets of the degenerate code encode each of the amino acids. In 1970, J. M. Adams and his collaborators managed to partially decipher the genetic code by a direct method: in the R17 phage, the base sequence of a fragment 57 nucleotides long was determined and compared with the amino acid sequence of its coat protein. The results were in complete agreement with those obtained by less direct methods. Thus, the code has been deciphered completely and correctly.

The results of deciphering are summarized in a table. It lists the composition of mRNA codons. The composition of tRNA anticodons is complementary to the mRNA codons, i.e. instead of U they contain A, instead of A - U, instead of C - G and instead of G - C, and it corresponds to the codons of the structural gene (the strand of DNA from which the information is read), the only difference being that uracil takes the place of thymine. Of the 64 triplets that can be formed by combining 4 nucleotides, 61 have "sense", i.e., encode amino acids, and 3 are "nonsense" (devoid of meaning). There is a fairly clear relationship between the composition of triplets and their meaning, which was discovered even when analyzing the general properties of the code. In some cases, the triplets encoding a specific amino acid (e.g., proline, alanine) have the same first two nucleotides (obligate), while the third (optional) can be any. In other cases (when encoding, for example, asparagine or glutamine), two similar triplets have the same meaning: the first two nucleotides coincide, and the third position is occupied by any purine or by any pyrimidine.
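The counts and regularities described above can be reproduced with a short Python sketch. It assumes the standard codon table written as a 64-letter string in U, C, A, G order with one-letter amino acid codes and `*` for nonsense codons; the helper names `anticodon` and `obligate_only` are illustrative, not taken from the source.

```python
from itertools import product

BASES = "UCAG"
# Standard code, one letter per codon in UCAG order; '*' marks nonsense codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

sense = [c for c, aa in CODON_TABLE.items() if aa != "*"]
nonsense = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")

PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon(codon):
    """Base-by-base complement, written in the same order as the codon.

    Real codon-anticodon pairing is antiparallel, so anticodons are usually
    written 5'->3' in the reverse order; that detail is omitted here.
    """
    return "".join(PAIR[b] for b in codon)

def obligate_only(aa):
    """True if all codons for `aa` share two first bases and allow any third."""
    codons = [c for c, a in CODON_TABLE.items() if a == aa]
    return len(codons) == 4 and len({c[:2] for c in codons}) == 1

print(len(CODON_TABLE), len(sense), nonsense)  # 64 61 ['UAA', 'UAG', 'UGA']
print(anticodon("AUG"))                        # UAC
print(obligate_only("P"), obligate_only("A"))  # True True (proline, alanine)
```

For asparagine (`N`) and glutamine (`Q`), `obligate_only` returns False, because each has only two codons that differ in the third position.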

Nonsense codons, which have special names corresponding to the designations of phage mutants (UAA - ochre, UAG - amber, UGA - opal), do not encode any amino acid, but they are of great importance in reading the information: they encode the end of the polypeptide chain.

Information is read in the 5' -> 3' direction along the nucleotide chain (see Deoxyribonucleic acids). Protein synthesis proceeds from the amino acid with a free amino group to the amino acid with a free carboxyl group. The start of synthesis is encoded by the AUG and GUG triplets, which in this case bind a specific initiating aminoacyl-tRNA, namely N-formylmethionyl-tRNA. The same triplets, when located within the chain, encode methionine and valine, respectively. The ambiguity is removed by the fact that the beginning of reading is preceded by nonsense. There is evidence that the boundary between mRNA regions encoding different proteins consists of more than two triplets and that the secondary structure of RNA changes in these places; this issue is under investigation. If a nonsense codon occurs within a structural gene, the corresponding protein is built only up to the location of this codon.

The discovery and deciphering of the genetic code, an outstanding achievement of molecular biology, influenced all the biological sciences, in some cases laying the foundation for the development of special large branches (see Molecular genetics). The effect of the discovery of the genetic code and of the research connected with it is compared with the effect that Darwin's theory had on the biological sciences.

The universality of the genetic code is direct proof of the universality of the basic molecular mechanisms of life in all representatives of the organic world. Meanwhile, the large differences in the functions and structure of the genetic apparatus in the transition from prokaryotes to eukaryotes and from unicellular to multicellular organisms are probably associated with molecular differences, the study of which is one of the tasks of the future. Since research on the genetic code is only a matter of recent years, the significance of its results for practical medicine is only indirect, allowing us for the time being to understand the nature of diseases and the mechanism of action of pathogens and medicinal substances. However, the discovery of such phenomena as transformation (see), transduction (see) and suppression (see) indicates the fundamental possibility of correcting pathologically altered hereditary information, the so-called genetic engineering (see).

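Table. GENETIC CODE

First        Second nucleotide of the codon                           Third
nucleotide   U               C           A               G            nucleotide

U            Phenylalanine   Serine      Tyrosine        Cysteine     U
U            Phenylalanine   Serine      Tyrosine        Cysteine     C
U            Leucine         Serine      Nonsense*       Nonsense*    A
U            Leucine         Serine      Nonsense*       Tryptophan   G

C            Leucine         Proline     Histidine       Arginine     U
C            Leucine         Proline     Histidine       Arginine     C
C            Leucine         Proline     Glutamine       Arginine     A
C            Leucine         Proline     Glutamine       Arginine     G

A            Isoleucine      Threonine   Asparagine      Serine       U
A            Isoleucine      Threonine   Asparagine      Serine       C
A            Isoleucine      Threonine   Lysine          Arginine     A
A            Methionine**    Threonine   Lysine          Arginine     G

G            Valine          Alanine     Aspartic acid   Glycine      U
G            Valine          Alanine     Aspartic acid   Glycine      C
G            Valine          Alanine     Glutamic acid   Glycine      A
G            Valine**        Alanine     Glutamic acid   Glycine      G

* Encodes the end of the chain.

** Also encodes the beginning of the chain.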

Bibliography: Ycas M. The biological code, trans. from English, M., 1971; Luchnik N. V. Biophysics of cytogenetic lesions and the genetic code, L., 1968; Molecular genetics, trans. from English, ed. A. N. Belozersky, part 1, M., 1964; Nucleic acids, trans. from English, ed. A. N. Belozersky, M., 1965; Watson J. D. Molecular biology of the gene, trans. from English, M., 1967; Physiological genetics, ed. M. E. Lobashev and S. G. Inge-Vechtomov, L., 1976, bibliogr.; Desoxyribonucleinsäure, Schlüssel des Lebens, hrsg. v. E. Geissler, B., 1972; The genetic code, Cold Spr. Harb. Symp. quant. Biol., v. 31, 1966; Woese C. R. The genetic code, N. Y. a. o., 1967.

The genetic code is a way of encoding the sequence of amino acids in a protein molecule using the sequence of nucleotides in a nucleic acid molecule. The properties of the genetic code follow from the features of this coding.

Each amino acid of a protein is associated with three successive nucleotides of a nucleic acid, called a triplet or codon. Each of the nucleotides can contain one of four nitrogenous bases. In RNA, these are adenine (A), uracil (U), guanine (G) and cytosine (C). By combining nitrogenous bases (here, the nucleotides containing them) in different ways, you can get many different triplets: AAA, GAU, UCC, GCA, AUC, etc. The total number of possible combinations is 64, i.e. 4^3.

The proteins of living organisms contain about 20 amino acids. If nature had “chosen” to encode each amino acid not with three but with two nucleotides, the variety of such pairs would not be enough, since there would be only 16 of them, i.e. 4^2.

Thus, the main property of the genetic code is its triplet nature. Each amino acid is encoded by a triplet of nucleotides.

Since there are significantly more possible triplets than amino acids used in biological molecules, the genetic code has acquired the property of redundancy. Many amino acids are encoded not by one codon but by several. For example, the amino acid glycine is encoded by four different codons: GGU, GGC, GGA, GGG. Redundancy is also called degeneracy.

The correspondence between amino acids and codons is usually presented in the form of tables.

With respect to nucleotides, the genetic code has the property of uniqueness (or specificity): each codon corresponds to only one amino acid. For example, the GGU codon can code only for glycine and for no other amino acid.

To recap: redundancy means that several triplets can encode the same amino acid, while specificity means that each particular codon can code for only one amino acid.
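The difference between the two properties can be illustrated with a small Python sketch: the codon-to-amino-acid mapping is an ordinary dictionary (one value per key, hence specificity), while its inverse collects several codons per amino acid (redundancy). The table string assumes the standard code in U, C, A, G order with one-letter amino acid codes; `codons_for` is an illustrative name.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Specificity: looking up a codon always yields exactly one amino acid.
print(CODON_TABLE["GGU"])  # G (glycine)

# Redundancy: inverting the table gives a list of codons per amino acid.
codons_for = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    codons_for[aa].append(codon)

print(sorted(codons_for["G"]))  # ['GGA', 'GGC', 'GGG', 'GGU']
print(codons_for["M"], codons_for["W"])  # ['AUG'] ['UGG']: single-codon amino acids
```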

There are no special punctuation marks in the genetic code (except for stop codons, which indicate the end of polypeptide synthesis). The function of punctuation marks is performed by the triplets themselves: the end of one means that another begins next. This implies the following two properties of the genetic code: continuity and non-overlapping. Continuity means that triplets are read immediately one after another. Non-overlapping means that each nucleotide can be part of only one triplet. So the first nucleotide of the next triplet always comes after the third nucleotide of the previous triplet. A codon cannot start at the second or third nucleotide of the preceding codon. In other words, the code does not overlap.

The genetic code has the property of universality. It is the same for all organisms on Earth, which indicates the unity of the origin of life. There are very rare exceptions to this. For example, some triplets in mitochondria and chloroplasts code for amino acids other than their usual ones. This may indicate that at the dawn of the development of life there were slightly different variations of the genetic code.

Finally, the genetic code has noise immunity, which is a consequence of its redundancy. Point mutations, which sometimes occur in DNA, usually result in the replacement of one nitrogenous base with another. This changes the triplet. For example, it was AAA, and after the mutation it became AAG. However, such changes do not always lead to a change in the amino acid in the synthesized polypeptide, since both triplets, owing to the redundancy of the genetic code, can correspond to the same amino acid. Given that mutations are more often harmful, the noise immunity property is useful.
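As a rough illustration of noise immunity, the sketch below counts, for each codon position, what fraction of single-base substitutions in sense codons leaves the encoded amino acid unchanged. It assumes the standard codon table and treats all substitutions as equally likely, which is a simplification; `synonymous_fraction` is an illustrative name.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def synonymous_fraction(position):
    """Share of point substitutions at `position` (0, 1 or 2) that keep the amino acid."""
    same = total = 0
    for codon, aa in CODON_TABLE.items():
        if aa == "*":                      # skip nonsense codons
            continue
        for base in BASES:
            if base == codon[position]:
                continue                   # not a substitution
            mutant = codon[:position] + base + codon[position + 1:]
            total += 1
            same += CODON_TABLE[mutant] == aa
    return same / total

# The example from the text: AAA -> AAG keeps the same amino acid (lysine).
print(CODON_TABLE["AAA"], CODON_TABLE["AAG"])  # K K

fractions = [synonymous_fraction(p) for p in range(3)]
print(fractions)  # the third position tolerates substitutions far more often
```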

The genetic, or biological, code is one of the universal properties of living nature, proving the unity of its origin. The genetic code is a method of encoding the amino acid sequence of a polypeptide using the nucleotide sequence of a nucleic acid (messenger RNA or the complementary DNA section on which the mRNA is synthesized).

There are other definitions.

The genetic code is the correspondence between each amino acid (found in the proteins of living organisms) and a certain sequence of three nucleotides. The genetic code is the relationship between the bases of nucleic acids and the amino acids of proteins.

In the scientific literature, the genetic code is not understood as the sequence of nucleotides in the DNA of any organism, which determines its individuality.

It is wrong to assume that one organism or species has one code, and another has another. The genetic code is how amino acids are encoded by nucleotides (i.e. principle, mechanism); it is universal for all living things, the same for all organisms.

Therefore, it is incorrect to say, for example, "The genetic code of a person" or "The genetic code of an organism", which is often used in near-scientific literature and films.

In these cases, we usually mean the genome of a person, an organism, etc.

The diversity of living organisms and the characteristics of their vital activity is primarily due to the diversity of proteins.

The specific structure of a protein is determined by the order and quantity of the various amino acids that make it up. The amino acid sequence of the peptide is encrypted in DNA using the biological code. In terms of the diversity of its monomers, DNA is a more primitive molecule than a peptide: it is built from alternations of only four kinds of nucleotides. This long prevented researchers from considering DNA as the material of heredity.

How amino acids are encoded by nucleotides

1) Nucleic acids (DNA and RNA) are polymers made up of nucleotides.

Each nucleotide can include one of four nitrogenous bases: adenine (A), guanine (G), cytosine (C), thymine (T). In the case of RNA, thymine is replaced by uracil (U).

When considering the genetic code, only nitrogenous bases are taken into account.

Then the DNA chain can be represented as their linear sequence. For example:

The mRNA region complementary to this code will be as follows:
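As a minimal sketch of this complementarity, the Python snippet below builds the mRNA for a made-up DNA fragment. The sequence `TACCTAAGA` and the function name `transcribe` are illustrative assumptions, not the article's own example, and strand direction is ignored for simplicity.

```python
# Complementary pairing between a DNA template base and the RNA base built on it.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template):
    """Return the mRNA complementary to a DNA template, base by base."""
    return "".join(DNA_TO_RNA[base] for base in template)

template_dna = "TACCTAAGA"       # hypothetical DNA fragment
print(transcribe(template_dna))  # AUGGAUUCU
```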

2) Proteins (polypeptides) are polymers consisting of amino acids.

In living organisms, 20 amino acids are used to build polypeptides (a few more are very rare). One letter can also be used to designate them (although three are more often used - an abbreviation for the name of the amino acid).

Amino acids in a polypeptide are also linearly linked by a peptide bond. For example, suppose there is a region of a protein with the following sequence of amino acids (each amino acid is denoted by a single letter):

3) If the task is to encode each amino acid using nucleotides, then it boils down to how to encode 20 letters using 4 letters.

This can be done by matching the letters of the 20-letter alphabet to words made up of several letters of the 4-letter alphabet.

If one amino acid is encoded by one nucleotide, then only four amino acids can be encoded.

If each amino acid is matched with two consecutive nucleotides in the RNA chain, then sixteen amino acids can be encoded.

Indeed, if there are four letters (A, U, G, C), then the number of their different pair combinations will be 16: (AU, UA), (AG, GA), (AC, CA), (UG, GU), (UC, CU), (GC, CG), (AA, UU, GG, CC).

[Brackets are used for convenience of perception.] This means that only 16 different amino acids can be encoded with such a code (two-letter word): each will have its own word (two consecutive nucleotides).

From mathematics, the formula for determining the number of combinations looks like this: a^b = n.

Here n is the number of different combinations, a is the number of letters of the alphabet (or the base of the number system), b is the number of letters in a word (or digits in a number). If we substitute the 4-letter alphabet and words consisting of two letters into this formula, we get 4^2 = 16.

If three consecutive nucleotides are used as the code word for each amino acid, then 4^3 = 64 different amino acids can be encoded, since 64 different combinations can be made up of four letters taken three at a time (for example, AUG, GAA, CAU, GGU, etc.). This is already more than enough to code for 20 amino acids.
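The formula can be checked by direct enumeration. The sketch below lists every word of a given length over the four-letter alphabet and compares the count with a^b; `count_words` is an illustrative helper name.

```python
from itertools import product

ALPHABET = "AUGC"  # a = 4 letters

def count_words(alphabet, length):
    """Count all ordered words of `length` letters (repetition allowed)."""
    return sum(1 for _ in product(alphabet, repeat=length))

for b in (1, 2, 3):
    n = count_words(ALPHABET, b)
    print(b, n, n >= 20)  # word length, number of words, enough for 20 amino acids?
# 1 4 False
# 2 16 False
# 3 64 True
```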

It is precisely the three-letter code that is used in the genetic code. Three consecutive nucleotides that code for one amino acid are called a triplet (or codon).

Each amino acid is associated with a specific triplet of nucleotides.

In addition, since the number of possible triplets exceeds the number of amino acids, many amino acids are encoded by several triplets.

Three triplets do not code for any of the amino acids (UAA, UAG, UGA).

They mark the end of translation and are called stop codons (or nonsense codons).

The AUG triplet not only encodes the amino acid methionine but also initiates translation (plays the role of a start codon).

Below are tables of correspondence of amino acids to nucleotide triplets.

With the first table, it is convenient to find the amino acid corresponding to a given triplet; with the second, the triplets corresponding to a given amino acid.

Consider an example of the implementation of the genetic code. Let there be mRNA with the following content:

Let's break the sequence of nucleotides into triplets:

Let's compare each triplet with the amino acid of the polypeptide encoded by it:

Methionine - Aspartic acid - Serine - Threonine - Tryptophan - Leucine - Leucine - Lysine - Asparagine - Glutamine

The last triplet is a stop codon.
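The same three steps can be sketched in Python. The mRNA string below is an assumption: it is one of many sequences that yield the peptide listed above, chosen only for illustration. The table assumes the standard code with one-letter amino acid codes (M = methionine, D = aspartic acid, S = serine, T = threonine, W = tryptophan, L = leucine, K = lysine, N = asparagine, Q = glutamine).

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def split_into_triplets(mrna):
    """Cut the sequence into consecutive, non-overlapping triplets."""
    usable = len(mrna) - len(mrna) % 3   # ignore an incomplete trailing triplet
    return [mrna[i:i + 3] for i in range(0, usable, 3)]

def translate(mrna):
    """Translate triplet by triplet until a stop codon ('*') is reached."""
    peptide = []
    for codon in split_into_triplets(mrna):
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)

mrna = "AUGGAUUCUACUUGGCUUCUUAAAAAUCAAUAA"  # hypothetical example sequence
print(split_into_triplets(mrna))
print(translate(mrna))  # MDSTWLLKNQ
```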

Properties of the genetic code

The properties of the genetic code are largely a consequence of the way amino acids are coded.

The first and obvious property is the triplet nature of the code.

It is understood as the fact that the code unit is a sequence of three nucleotides.

An important property of the genetic code is its non-overlapping. A nucleotide included in one triplet cannot be included in another.

That is, the sequence AGUGAA can only be read as AGU-GAA, but not, for example, like this: AGU-GUG-GAA. That is, if a GU pair is included in one triplet, it cannot already be an integral part of another.
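The two ways of reading can be contrasted in a short Python sketch. The overlapping reader, which advances one nucleotide at a time, is purely hypothetical and is shown only to make clear what the real code does not do.

```python
def read_non_overlapping(seq):
    """Read consecutive triplets: each nucleotide belongs to exactly one codon."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]

def read_overlapping(seq):
    """Hypothetical overlapping reading: a new triplet starts at every nucleotide."""
    return [seq[i:i + 3] for i in range(len(seq) - 2)]

print(read_non_overlapping("AGUGAA"))  # ['AGU', 'GAA']
print(read_overlapping("AGUGAA"))      # ['AGU', 'GUG', 'UGA', 'GAA']
```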

The uniqueness of the genetic code means that each triplet corresponds to only one amino acid.

For example, the AGU triplet encodes the amino acid serine and no other amino acid.

On the other hand, several triplets can correspond to one amino acid. For example, the same serine, in addition to AGU, corresponds to the codon AGC. This property is called the degeneracy of the genetic code.

Degeneracy leaves many mutations harmless, since the replacement of one nucleotide in DNA often does not change the meaning of the triplet. If you look closely at the table of correspondence between amino acids and triplets, you can see that if an amino acid is encoded by several triplets, they often differ only in the last nucleotide, that is, it can be any.

Some other properties of the genetic code are also noted (continuity, noise immunity, universality, etc.).

Stability as an adaptation of plants to the conditions of existence. The main reactions of plants to the action of adverse factors.

Plant resistance is the ability to withstand the effects of extreme environmental factors (soil and air drought).


This property has been developed in the process of evolution and is genetically fixed. In areas with unfavorable conditions, stable decorative forms and local varieties of cultivated plants - drought-resistant - were formed. One or another level of resistance inherent in plants is revealed only under the action of extreme environmental factors.

As a result of the onset of such a factor, the irritation phase begins: a sharp deviation of a number of physiological parameters from the norm and their rapid return to normal. Then the intensity of metabolism changes and intracellular structures are damaged. At the same time, all synthetic processes are suppressed, all hydrolytic processes are activated, and the overall energy supply of the organism decreases. If the effect of the factor does not exceed the threshold value, the adaptation phase begins.

An adapted plant reacts less to repeated or increasing exposure to an extreme factor. At the level of the organism, the interaction between organs is added to the mechanisms of adaptation. The weakening of the flow of water, mineral and organic compounds through the plant intensifies competition between organs, and their growth stops.

The biological resistance of plants is determined by the maximum value of the extreme factor at which the plants still form viable seeds. Agronomic resistance is determined by the degree of yield reduction. Plants are characterized by their resistance to a specific type of extreme factor: winter-hardy, gas-resistant, salt-resistant, drought-resistant.

Roundworms (phylum), unlike flatworms, have a primary body cavity, the schizocoel, formed by the breakdown of the parenchyma that fills the gaps between the body wall and the internal organs; its function is transport.

It maintains homeostasis. The body is round in cross-section. The integument is covered with a cuticle. The musculature is represented by a layer of longitudinal muscles. The gut is a through-gut and consists of 3 sections: anterior, middle and posterior. The mouth opening is located on the ventral surface of the anterior end of the body. The pharynx has a characteristic triangular lumen. The excretory system is represented by protonephridia or special skin (hypodermal) glands. Most species are dioecious, with only sexual reproduction.

Development is direct, rarely with metamorphosis. They have a constant cellular composition of the body and lack the ability to regenerate. The anterior intestine consists of the oral cavity, pharynx, and esophagus.

They do not have a middle or rear section. The excretory system consists of 1-2 giant cells of the hypodermis. The longitudinal excretory canals lie in the lateral ridges of the hypodermis.

Properties of the genetic code. Proofs of the triplet code. Deciphering codons. Termination codons. The concept of genetic suppression.

The idea that a gene encodes information about the primary structure of a protein was made specific by F. Crick in his sequence hypothesis, according to which the sequence of gene elements determines the sequence of amino acid residues in the polypeptide chain. The validity of the sequence hypothesis is proved by the colinearity of the structures of the gene and of the polypeptide it encodes. The most significant achievement in 1953 was the idea that the code is most likely triplet.

DNA base pairs (A-T, T-A, G-C, C-G) can encode only 4 amino acids if each pair corresponds to one amino acid. As is known, there are 20 basic amino acids in proteins. If we assume that each amino acid corresponds to 2 base pairs, then 16 amino acids (4 * 4) can be encoded, which is again not enough.

If the code is triplet, then 64 codons (4 * 4 * 4) can be made from 4 base pairs, which is more than enough to encode 20 amino acids. Crick and his coworkers assumed that the code was triplet, that there were no "commas" (separating characters) between codons, and that the code within a gene is read from a fixed point in one direction. In the summer of 1961, Nirenberg and Matthaei reported the deciphering of the first codon and proposed a method for determining the composition of codons in a cell-free system of protein synthesis.

So, the codon for phenylalanine was deciphered as UUU in mRNA. Later, by applying the methods developed by Khorana, Nirenberg and Leder, a code dictionary in its modern form was compiled in 1965. Mutations in T4 phage caused by the deletion or addition of bases provided evidence of the triplet nature of the code (property 1). These deletions and additions, which shift the frame when the code is “read”, were compensated only when the correct reading frame was restored, which prevented the mutant phenotype from appearing. These experiments also showed that the triplets do not overlap, i.e., each base can belong to only one triplet (property 2).
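The logic of these frameshift experiments can be sketched in Python. The mRNA below and the positions of the inserted and deleted bases are made up for illustration; the table assumes the standard code with one-letter amino acid codes and `*` for stop.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(mrna):
    """Read non-overlapping triplets from the first base until a stop codon."""
    peptide = []
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)

original = "AUGGAUUCUACUUGGCUUUAA"           # hypothetical message
inserted = original[:3] + "A" + original[3:]  # one base added: frame shifts
restored = inserted[:6] + inserted[7:]        # one nearby base removed: frame restored

print(translate(original))  # MDSTWL
print(translate(inserted))  # MRFYLAL  (everything after the insertion is misread)
print(translate(restored))  # MRSTWL   (only the codon between the two changes differs)
```

A single insertion garbles all downstream codons, while a compensating deletion nearby restores the original reading, which is what one would expect from a comma-free, non-overlapping triplet code.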

Most amino acids have more than one codon. A code in which the number of amino acids is less than the number of codons is called degenerate (property 3), i.e. a given amino acid can be coded for by more than one triplet. In addition, three codons do not code for any amino acid at all (“nonsense codons”) and act as a “stop signal”. The stop codon is the end point of the functional unit of DNA, the cistron. Termination codons are the same in all species and are represented as UAA, UAG, UGA. A notable feature of the code is that it is universal (property 4).

In all living organisms, the same triplets code for the same amino acids.

The existence of three types of mutant terminator codons and their suppression has been shown in E. coli and yeast. The discovery of suppressor genes, which give "sense" to nonsense alleles of different genes, indicates that the translation of the genetic code can change.

Mutations affecting the tRNA anticodon change its codon specificity and create an opportunity for suppressing mutations at the level of translation. Suppression at the level of translation may also occur through mutations in the genes encoding some ribosomal proteins. As a result of these mutations, the ribosome "makes mistakes", for example, in reading nonsense codons and "understands" them with the help of some non-mutant tRNAs. Along with genotypic suppression, acting at the level of translation, phenotypic suppression of nonsense alleles is also possible: when the temperature is lowered, or when cells are exposed to aminoglycoside antibiotics that bind to ribosomes, such as streptomycin.

22. Reproduction of higher plants: vegetative and asexual. Spore formation, spore structure, equal and heterosporous. Reproduction as a property of living matter, that is, the ability of an individual to give rise to its own kind, existed in the early stages of evolution.

Forms of reproduction can be divided into 2 types: asexual and sexual. Actually asexual reproduction is carried out without the participation of germ cells, with the help of specialized cells - spores. They are formed in the organs of asexual reproduction - sporangia as a result of mitotic division.

The spore during its germination reproduces a new individual, similar to the parent, with the exception of spores of seed plants, in which the spore has lost the function of reproduction and settlement. Spores can also be formed by reduction division, with single-celled spores spilling out.

Propagation of plants by means of vegetative organs (part of a shoot, leaf or root; e.g. bulbs, cuttings), or by the division of unicellular algae in half, is called vegetative reproduction.

Sexual reproduction is carried out by special sex cells - gametes.

Gametes are formed as a result of meiosis, there are female and male. As a result of their fusion, a zygote appears, from which a new organism subsequently develops.

Plants differ in the types of gametes. In some unicellular organisms, the organism itself functions as a gamete at a certain time. Organisms of different sexes (acting as gametes) merge; this sexual process is called hologamy. If male and female gametes are morphologically similar and mobile, these are isogametes, and the sexual process is isogamous. If female gametes are somewhat larger and less mobile than male gametes, then these are heterogametes, and the process is heterogamy. In oogamy, female gametes are very large and immobile, and male gametes are small and mobile.


Genetic code - correspondence between DNA triplets and amino acids of proteins

The need to encode the structure of proteins in the linear sequence of mRNA and DNA nucleotides is dictated by the fact that during translation:

  • there is no correspondence between the number of monomers in the mRNA template and in the product, the synthesized protein;
  • there is no structural similarity between RNA and protein monomers.

This rules out complementary interaction between the template and the product, the principle on which new DNA and RNA molecules are built during replication and transcription.

From this it becomes clear that there must be a "dictionary" that makes it possible to find out which mRNA nucleotide sequence provides for the inclusion of amino acids in a given sequence in a protein. This "dictionary" is called the genetic, biological, nucleotide, or amino acid code. It allows you to encode the amino acids that make up proteins using a specific sequence of nucleotides in DNA and mRNA. It has certain properties.

Triplet nature. One of the main questions in elucidating the properties of the code was how many nucleotides determine the inclusion of one amino acid in a protein.

It was found that the coding elements for the amino acid sequence are indeed groups of three nucleotides, or triplets, which have been named "codons".
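A quick count shows why a triplet is the shortest workable code word: with four bases, single nucleotides give 4 code words and pairs give 16, both fewer than the 20 amino acids, while triplets give 64. The sketch below is only an illustration of that arithmetic; the names BASES, codon_count and all_codons are chosen here and do not come from the text.

```python
from itertools import product

BASES = "UCAG"  # the four mRNA bases


def codon_count(length):
    """Number of distinct code words of the given length."""
    return len(BASES) ** length


def all_codons(length=3):
    """List every code word of the given length, in U, C, A, G order."""
    return ["".join(p) for p in product(BASES, repeat=length)]


# Compare the number of code words with the 20 amino acids to be encoded.
for n in (1, 2, 3):
    verdict = "enough" if codon_count(n) >= 20 else "too few"
    print(n, codon_count(n), verdict)
```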

Meaning of codons.

It was established that, of the 64 codons, 61 triplets encode the inclusion of amino acids in the synthesized polypeptide chain, while the remaining 3 (UAA, UAG, UGA) do not encode any amino acid and were originally called meaningless, or nonsense, codons. It was later shown that these triplets signal the completion of translation, so they became known as termination or stop codons.

mRNA codons and the nucleotide triplets of the DNA coding strand, both read in the 5' to 3' direction, have the same sequence of nitrogenous bases, except that DNA contains thymine (T) in place of the uracil (U) characteristic of mRNA.
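The relationship between the coding strand and the mRNA can be written as a one-line substitution. This is a minimal sketch; the function name coding_strand_to_mrna and the example sequence are made up for illustration, and the input is assumed to be the coding strand written 5' to 3'.

```python
def coding_strand_to_mrna(dna_5_to_3):
    """Return the mRNA that has the same 5'->3' sequence as the DNA coding strand.

    Only the base alphabet changes: thymine (T) is written as uracil (U).
    """
    return dna_5_to_3.upper().replace("T", "U")


print(coding_strand_to_mrna("ATGGCTTGGTAA"))
```

Note that this applies to the coding strand only; the template strand is complementary to the mRNA and would need to be reverse-complemented first.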

Specificity.

Each codon corresponds to only one specific amino acid. In this sense, the genetic code is strictly unambiguous.

Table 4-3.

The main components of the protein synthesis system

No.  Required component                               Function
1    Amino acids                                      Substrates for protein synthesis
2    tRNA                                             Adapters: the acceptor end binds the amino acid, and the anticodon pairs with the mRNA codon
3    Aminoacyl-tRNA synthetases                       Each aa-tRNA synthetase catalyzes the specific binding of one of the 20 amino acids to its corresponding tRNA
4    mRNA                                             Template containing a linear sequence of codons that determines the primary structure of the protein
5    Ribosomes                                        Ribonucleoprotein subcellular structures that are the site of protein synthesis
6    Energy sources
7    Initiation, elongation, and termination factors  Specific extraribosomal proteins required for translation (12 initiation factors: eIF; 2 elongation factors: eEF1, eEF2; and termination factors: eRF)
8    Magnesium ions                                   Cofactor that stabilizes the structure of ribosomes

Notes: eIF (eukaryotic initiation factors) are initiation factors; eEF (eukaryotic elongation factors) are elongation factors; eRF (eukaryotic release factors) are termination factors.

Degeneracy. In mRNA and DNA, 61 triplets are sense codons, each of which encodes the inclusion of one of the 20 amino acids in a protein.

It follows that in informational molecules the inclusion of the same amino acid in a protein can be determined by several codons. This property of the biological code is called degeneracy.

In humans, only 2 amino acids are encoded by a single codon (Met and Trp), while Leu, Ser, and Arg are each encoded by six codons, and Ala, Val, Gly, Pro, and Thr by four codons (Table 4-4).

The redundancy of coding sequences is the most valuable property of the code, since it makes the flow of information more resistant to adverse effects of the external and internal environment. In determining which amino acid is included in a protein, the third nucleotide of a codon is less important than the first two. As can be seen from Table 4-4, for many amino acids, replacing the nucleotide in the third position of the codon does not change its meaning.
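The degree of degeneracy can be checked by counting codons per amino acid in the standard code. The sketch below is illustrative: it uses one-letter amino acid symbols (an assumption made here for compactness; the text uses three-letter abbreviations), and names such as CODON_TABLE and third_base_irrelevant are hypothetical.

```python
from collections import Counter

BASES = "UCAG"
# One-letter amino acid symbols in the order UUU, UUC, UUA, UUG, UCU, ...; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))

# Number of sense codons assigned to each amino acid (stop codons excluded).
codons_per_amino_acid = Counter(aa for aa in CODON_TABLE.values() if aa != "*")


def third_base_irrelevant(prefix):
    """True if all four codons sharing this two-base prefix have the same meaning."""
    return len({CODON_TABLE[prefix + base] for base in BASES}) == 1


# Two-base prefixes for which the third position never changes the amino acid.
four_fold_prefixes = [
    a + b for a in BASES for b in BASES if third_base_irrelevant(a + b)
]

print(sorted(codons_per_amino_acid.items(), key=lambda item: item[1]))
print(four_fold_prefixes)
```

With this table, Met (M) and Trp (W) each have one codon, Leu (L), Ser (S) and Arg (R) have six, and eight of the sixteen two-base prefixes fix the amino acid regardless of the third base.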

Linearity of information recording.

During translation, mRNA codons are "read" from a fixed starting point sequentially and do not overlap. There are no signals in the record of information indicating the end of one codon and the beginning of the next. The AUG codon is initiating and is read both at the beginning and in other regions of the mRNA as Met. The triplets following it are read sequentially without any gaps up to the stop codon, at which the synthesis of the polypeptide chain is completed.
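The reading rule described above (start at AUG, read non-overlapping triplets, stop at a termination codon) can be sketched as follows. This is a simplified illustration, not a model of real initiation: it assumes translation begins at the first AUG found in the string, uses one-letter amino acid symbols, and the name translate and the example sequence are invented here.

```python
BASES = "UCAG"
# One-letter amino acid symbols in the order UUU, UUC, UUA, UUG, UCU, ...; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(mrna):
    """Translate from the first AUG, triplet by triplet, until a stop codon."""
    start = mrna.find("AUG")  # assumption: the first AUG sets the reading frame
    if start == -1:
        return ""
    protein = []
    # Step by 3 so that codons never overlap and no nucleotide is skipped.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # UAA, UAG or UGA ends the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


# The internal AUG is read as an ordinary Met (M), as the text describes.
print(translate("GGAUGGCUAUGUGGUAAGG"))
```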

Universality.

Until recently, it was believed that the code is absolutely universal, i.e. the meaning of code words is the same for all studied organisms: viruses, bacteria, plants, amphibians, mammals, including humans.

However, an exception later became known: mitochondrial mRNA contains 4 triplets whose meaning differs from that in mRNA of nuclear origin. Thus, in mitochondrial mRNA, the UGA triplet encodes Trp, AUA codes for Met, and AGA and AGG are read as additional stop codons.
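One way to picture such a variant code is as the standard table with a few entries overridden. The sketch below assumes the vertebrate mitochondrial assignments (UGA as Trp, AUA as Met, AGA and AGG as stop); the names STANDARD_TABLE, MITO_OVERRIDES and MITO_TABLE are hypothetical, and one-letter amino acid symbols are used for compactness.

```python
BASES = "UCAG"
# One-letter amino acid symbols in the order UUU, UUC, UUA, UUG, UCU, ...; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
STANDARD_TABLE = dict(zip(CODONS, AMINO_ACIDS))

# Assumed vertebrate mitochondrial reassignments: W = Trp, M = Met, "*" = stop.
MITO_OVERRIDES = {"UGA": "W", "AUA": "M", "AGA": "*", "AGG": "*"}
MITO_TABLE = {**STANDARD_TABLE, **MITO_OVERRIDES}

# List the codons whose meaning differs between the two tables.
for codon in CODONS:
    if STANDARD_TABLE[codon] != MITO_TABLE[codon]:
        print(codon, STANDARD_TABLE[codon], "->", MITO_TABLE[codon])
```

Keeping the differences in a small override dictionary makes it easy to see that only 4 of the 64 code words change.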

Gene and product colinearity.

In prokaryotes, a linear correspondence between the sequence of codons of the gene and the sequence of amino acids in the protein product has been found, or, as they say, there is colinearity between the gene and the product.

Table 4-4.

Genetic code

First base   Second base
             U          C          A          G
U            UUU Phe    UCU Ser    UAU Tyr    UGU Cys
             UUC Phe    UCC Ser    UAC Tyr    UGC Cys
             UUA Leu    UCA Ser    UAA*       UGA*
             UUG Leu    UCG Ser    UAG*       UGG Trp
C            CUU Leu    CCU Pro    CAU His    CGU Arg
             CUC Leu    CCC Pro    CAC His    CGC Arg
             CUA Leu    CCA Pro    CAA Gln    CGA Arg
             CUG Leu    CCG Pro    CAG Gln    CGG Arg
A            AUU Ile    ACU Thr    AAU Asn    AGU Ser
             AUC Ile    ACC Thr    AAC Asn    AGC Ser
             AUA Ile    ACA Thr    AAA Lys    AGA Arg
             AUG Met    ACG Thr    AAG Lys    AGG Arg
G            GUU Val    GCU Ala    GAU Asp    GGU Gly
             GUC Val    GCC Ala    GAC Asp    GGC Gly
             GUA Val    GCA Ala    GAA Glu    GGA Gly
             GUG Val    GCG Ala    GAG Glu    GGG Gly

Notes: U - uracil; C - cytosine; A - adenine; G - guanine; * - termination codon.

In eukaryotes, the base sequences in a gene that are colinear with the amino acid sequence of the protein are interrupted by introns.

Therefore, in eukaryotic cells, the amino acid sequence of a protein is co-linear with the sequence of exons in a gene or mature mRNA after post-transcriptional removal of introns.
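The effect of introns on colinearity can be illustrated with a toy transcript. Everything in this sketch is hypothetical: the segment sequences are invented, the (kind, sequence) representation and the names splice and translate are chosen here, and real splicing depends on splice-site signals that are not modeled.

```python
BASES = "UCAG"
# One-letter amino acid symbols in the order UUU, UUC, UUA, UUG, UCU, ...; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def splice(segments):
    """Join the exon sequences in order, dropping the introns."""
    return "".join(seq for kind, seq in segments if kind == "exon")


def translate(mrna):
    """Read triplets from position 0 until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Invented primary transcript: two exons separated by one intron.
primary_transcript = [
    ("exon", "AUGGCU"),
    ("intron", "GUAAGUUUCAG"),
    ("exon", "UGGUAA"),
]

mature_mrna = splice(primary_transcript)
unspliced = "".join(seq for _, seq in primary_transcript)
print(mature_mrna, translate(mature_mrna))
print(unspliced, translate(unspliced))
```

Only the spliced sequence lines up codon by codon with the intended protein; reading straight through the intron gives a different product.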


