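Retrotransposon
A retrotransposon (or retroposon) is a transposable element that transposes through an RNA intermediate. In structure and replication it resembles a retrovirus, except that it lacks the env gene required for viral infection. The element is transcribed into mRNA, which is then reverse-transcribed into a new copy that integrates into the genome. Each transposition event adds one copy, so retrotransposons can enlarge the genome. As a result, they are the most abundant class of mobile genetic elements in many eukaryotes. They are particularly abundant in plants, where they are a major component of nuclear DNA. In maize, 49-78% of the genome consists of retrotransposons[1], and in wheat about 90% of the genome consists of repeated sequences and 68% of transposable elements[2]. In mammals, almost half of the genome (45% to 48%) consists of transposons or remnants of transposons. In the human genome, retrotransposons make up about 42%, while DNA transposons account for about 2-3%[3].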
Biological activity
Retrotransposons transpose replicatively through an RNA intermediate, which rapidly increases element copy number and can thereby increase genome size. Like DNA transposable elements (class II transposons), retrotransposons can induce mutations by inserting near or within genes. Retrotransposon-induced mutations are also relatively stable: because transposition proceeds by replication rather than excision, the sequence at the insertion site is retained.
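The effect of replicative transposition on copy number and genome size can be illustrated with a deliberately idealized sketch. The function name, the parameters, and the example sizes are hypothetical, and the model assumes every event succeeds and that a non-replicative (cut-and-paste) event leaves both counts unchanged.

```python
from typing import Tuple


def transpose(genome_bp: int, element_bp: int, copies: int,
              events: int, replicative: bool) -> Tuple[int, int]:
    """Return (copies, genome_bp) after a number of transposition events.

    Idealized model: a replicative (copy-and-paste) event leaves the donor
    copy in place and adds one new copy; a cut-and-paste event only moves
    the element, so copy number and genome size stay the same.
    """
    for _ in range(events):
        if replicative:
            copies += 1               # the donor copy is retained
            genome_bp += element_bp   # the new copy adds its own length
    return copies, genome_bp


if __name__ == "__main__":
    # Hypothetical numbers: a 1 Mb genome and one 5 kb element.
    print(transpose(1_000_000, 5_000, 1, 10, replicative=True))
    print(transpose(1_000_000, 5_000, 1, 10, replicative=False))
```

In this toy model, ten replicative events add ten copies and 50 kb of sequence, while ten cut-and-paste events change neither figure.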
Retrotransposons copy themselves to RNA and then back to DNA, which may integrate into the genome. The second step, forming DNA, may be carried out by a reverse transcriptase that the retrotransposon encodes.[4] Transposition and survival of retrotransposons within the host genome are possibly regulated by both retrotransposon- and host-encoded factors, which avoids deleterious effects on both host and retrotransposon, in a relationship that has existed for many millions of years between retrotransposons and their plant hosts. The understanding of how retrotransposons and their hosts' genomes have co-evolved mechanisms to regulate transposition, insertion specificities, and mutational outcomes in order to optimize each other's survival is still in its infancy.
Most retrotransposons are very old and, through accumulated mutations, are no longer able to retrotranspose.
Two main classes of retrotransposons
Retrotransposons can be divided into two main classes:
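- The first class is the LTR retrotransposons, which include the Ty1-copia and Ty3-gypsy groups. These elements carry long terminal repeats (LTRs), which are also a characteristic structure of retroviral genomes. LTR retrotransposons can encode reverse transcriptase and integrase and can transpose autonomously. Their transposition mechanism resembles that of retroviruses, but they cannot spread by free infection as retroviruses do. The retrotransposons of higher plants belong mainly to the Ty1-copia group, which is very widely distributed and is found in almost all higher plant species.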
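- The second class is the non-LTR retrotransposons, which include LINEs (long interspersed nuclear elements), SINEs (short interspersed nuclear elements), and composite SINE transposons. These elements have no long terminal repeats and do not themselves encode a transposase or integrase, so they depend on enzymes already present in the cell to transpose.
All retrotransposons share one feature: they generate short direct repeats at their insertion sites.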
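The short direct repeats at the insertion site (commonly called target site duplications) can be pictured with a small string model. This is only a sketch: the function name, the coordinates, and the sequences are hypothetical, and the model simply copies the target site to both flanks of the inserted element rather than modelling the staggered cut and repair that would produce it.

```python
def insert_with_tsd(genome: str, element: str, pos: int, tsd_len: int) -> str:
    """Insert `element` at `pos`, duplicating the `tsd_len` target bases.

    The target site genome[pos:pos + tsd_len] ends up on both sides of the
    element, forming a pair of short direct repeats.
    """
    if pos < 0 or tsd_len < 0 or pos + tsd_len > len(genome):
        raise ValueError("target site must lie inside the genome")
    target = genome[pos:pos + tsd_len]
    return genome[:pos] + target + element + target + genome[pos + tsd_len:]


if __name__ == "__main__":
    # Toy genome with target site "CCG"; lowercase marks the inserted element.
    print(insert_with_tsd("AAACCGTTT", "nnnnnn", pos=3, tsd_len=3))
```

In the toy example the target site `CCG` appears immediately before and after the inserted element, while the rest of the genome is unchanged.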
LTR retrotransposons
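LTR retrotransposons are transposons with long terminal repeats (LTRs), which range from about 100 bp to over 5 kb in length. They can be further classified, by sequence similarity and by the order of their encoded gene products, into three groups: Ty1-copia, Ty3-gypsy, and Pao-BEL.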
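As a rough illustration of what "terminal repeat" means at the sequence level, the sketch below reports the longest stretch that occurs identically at both ends of an element. The function name and the toy sequences are hypothetical. It assumes the two repeats are exact copies, which is a simplification, since the two LTRs of an old element would be expected to differ through accumulated mutations and would need an alignment-based comparison instead.

```python
def terminal_repeat_length(element: str, min_len: int = 4) -> int:
    """Length of the longest exact direct repeat shared by both ends.

    Returns 0 if no repeat of at least `min_len` bases is found. Only
    non-overlapping repeats are considered (at most half the element).
    """
    for k in range(len(element) // 2, min_len - 1, -1):
        if element[:k] == element[-k:]:
            return k
    return 0


if __name__ == "__main__":
    ltr = "TGTTGGAATAGA"        # toy 12-base repeat, far shorter than a real LTR
    internal = "CCCGGGCCC"      # stand-in for the internal coding region
    print(terminal_repeat_length(ltr + internal + ltr))
    print(terminal_repeat_length("ACGTACGGTTTAAACC"))
```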
The Ty1-copia and Ty3-gypsy groups of retrotransposons are commonly found in high copy number (up to a few million copies per haploid nucleus) in animal, fungal, protist, and plant genomes. Pao-BEL-like elements have so far been found only in animals.[5][6] About 8% of the human genome and approximately 10% of the mouse genome are composed of LTR retrotransposons.[7]
Ty1-copia retrotransposons
Ty1-copia retrotransposons are abundant in species ranging from single-cell algae to bryophytes, gymnosperms, and angiosperms.
Ty3-gypsy retrotransposons
Ty3-gypsy retrotransposons are also widely distributed and occur in both gymnosperms and angiosperms.
Non-LTR retrotransposons
Non-LTR retrotransposons consist of two sub-types, long interspersed nuclear elements (LINEs) and short interspersed nuclear elements (SINEs). They can also be found in high copy numbers (up to 250,000[citation needed]) in plant species.
Long interspersed nuclear elements
Long interspersed nuclear elements (also called long interspersed repetitive elements; abbreviated LINEs)[8] are a group of genetic elements that are found in large numbers in eukaryotic genomes. They are transcribed (or are the evolutionary remains of what was once transcribed) to an RNA using an RNA polymerase II promoter that resides inside the LINE. LINEs code for the enzyme reverse transcriptase, and many LINEs also code for an endonuclease (e.g. RNase H). The reverse transcriptase has a higher specificity for the LINE RNA than for other RNA, and makes a DNA copy of the RNA that can be integrated into the genome at a new site.[9]
The 5' UTR contains the promoter sequence, while the 3' UTR contains a polyadenylation signal (AATAAA) and a poly-A tail.[10] Because LINEs move by copying themselves (rather than by excising and reinserting, as DNA transposons do), they enlarge the genome. The human genome, for example, contains about 900,000 LINEs, which together make up roughly 21% of the genome.[11]
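The two 3' features named above can be located in a sequence with plain string operations. The sketch below is illustrative only: the function name and the toy sequence are hypothetical, and it assumes an exact AATAAA match and an uninterrupted run of A at the very end, whereas real elements may carry variant signals and interrupted tails.

```python
from typing import Tuple


def polya_features(seq: str, signal: str = "AATAAA") -> Tuple[int, int]:
    """Return (signal_pos, tail_len) for the 3' end of a sequence.

    signal_pos: 0-based start of the last exact occurrence of `signal`,
                or -1 if it does not occur.
    tail_len:   number of consecutive A bases at the very end.
    """
    seq = seq.upper()
    signal_pos = seq.rfind(signal)
    tail_len = len(seq) - len(seq.rstrip("A"))
    return signal_pos, tail_len


if __name__ == "__main__":
    # Toy 3' end: filler, signal, filler, then a 10-base poly-A tail.
    toy = "GGCCTTGG" + "AATAAA" + "GCTTGC" + "A" * 10
    print(polya_features(toy))
```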
Short interspersed nuclear elements
Short interspersed nuclear elements (also called short interspersed repetitive elements; abbreviated SINEs)[8] are short DNA sequences (<500 bases[12]) that represent reverse-transcribed RNA molecules originally transcribed by RNA polymerase III into tRNA, rRNA, and other small nuclear RNAs. SINEs do not encode a functional reverse transcriptase protein and rely on other mobile elements for transposition. The most common SINEs in primates are called Alu sequences. Alu elements are 280 base pairs long, do not contain any coding sequences, and can be recognized by the restriction enzyme AluI (hence the name). With about 1 million copies, SINEs make up about 13% of the human genome.[11] While historically viewed as "junk DNA", recent research suggests that in some rare cases both LINEs and SINEs were incorporated into novel genes and so acquired new functions.[13] The distribution of these elements has been implicated in some genetic diseases and cancers.
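Recognition by AluI can be illustrated with a simple site search. The sketch below assumes the standard AluI recognition sequence AGCT with a blunt cut between G and C, a detail the text above does not state. The function names and the toy sequence are hypothetical, and only the given strand is scanned, which is enough here because AGCT is its own reverse complement.

```python
from typing import List

ALUI_SITE = "AGCT"   # assumed recognition sequence (palindromic)
CUT_OFFSET = 2       # assumed blunt cut between G and C: AG^CT


def alui_cut_positions(seq: str) -> List[int]:
    """Return 0-based cut positions for every AGCT site in `seq`."""
    seq = seq.upper()
    cuts = []
    i = seq.find(ALUI_SITE)
    while i != -1:
        cuts.append(i + CUT_OFFSET)
        i = seq.find(ALUI_SITE, i + 1)
    return cuts


def alui_digest(seq: str) -> List[str]:
    """Split `seq` into the fragments produced by cutting at every site."""
    seq = seq.upper()
    bounds = [0] + alui_cut_positions(seq) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    toy = "GGAGCTCCAAGCTT"   # toy sequence with two sites
    print(alui_cut_positions(toy))
    print(alui_digest(toy))
```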
Composite SINE transposons
Two SINEs may act in concert to flank and mobilize an intervening single-copy DNA sequence. This was reported for a 710 bp DNA sequence upstream of the bovine beta globin gene.[14] The DNA arrangement forms a composite transposon whose presence has been confirmed by the complete bovine genomic sequence: the mobilized sequence may be found on bovine chromosome 15 in contig NW_001493315.1, nucleotides 1085432–1086142, and the originating sequence on bovine chromosome 2 in contig NW_001501789.2, nucleotides 1096679–1097389. It is likely that similar composite transposons exist in other bovine genomic regions and in other mammalian genomes. They could be detected with suitable algorithms.
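One possible first step for such an algorithm is sketched below. It is not the method used in the cited report. Given SINE annotations on one contig as 1-based inclusive (start, end) pairs, it lists adjacent SINE pairs whose intervening segment falls within a chosen length window. The function name, the length thresholds, and the example coordinates are all hypothetical, and a real screen would also need to check that the intervening sequence is single-copy and occurs elsewhere without the flanking SINEs.

```python
from typing import List, Tuple

Interval = Tuple[int, int]   # 1-based inclusive (start, end)


def candidate_composites(sines: List[Interval], min_gap: int = 100,
                         max_gap: int = 2000
                         ) -> List[Tuple[Interval, Interval, Interval]]:
    """Return (left SINE, right SINE, intervening interval) candidates.

    Only neighbouring SINEs are paired, and a pair is kept when the
    sequence between them is min_gap..max_gap bases long.
    """
    ordered = sorted(sines)
    hits = []
    for left, right in zip(ordered, ordered[1:]):
        gap_start, gap_end = left[1] + 1, right[0] - 1
        gap_len = gap_end - gap_start + 1
        if min_gap <= gap_len <= max_gap:
            hits.append((left, right, (gap_start, gap_end)))
    return hits


if __name__ == "__main__":
    # Hypothetical annotations: two SINEs 710 bp apart and one far away.
    annotations = [(9000, 9280), (100, 380), (1091, 1370)]
    for left, right, gap in candidate_composites(annotations):
        print(left, right, gap, gap[1] - gap[0] + 1)
```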
See also
- Retrovirus / Endogenous retrovirus
- Transposon
- Genomic organization
- Interspersed repeat
- Retrotransposon markers, a powerful method of reconstructing phylogenies.
- Transcription
- Central dogma of molecular biology
References
- SanMiguel P, Bennetzen JL. Evidence that a recent increase in maize genome size was caused by the massive amplification of intergene retrotransposons. Annals of Botany. 1998, 82 (Suppl A): 37–44.
- Li W, Zhang P, Fellers JP, Friebe B, Gill BS. Sequence composition, organization, and evolution of the core Triticeae genome. Plant J. 2004, 40 (4): 500–11. PMID 15500466. doi:10.1111/j.1365-313X.2004.02228.x.
- Lander ES, Linton LM, Birren B; et al. Initial sequencing and analysis of the human genome. Nature. 2001, 409 (6822): 860–921. PMID 11237011. doi:10.1038/35057062.
- Dombroski BA, Feng Q, Mathias SL; et al. An in vivo assay for the reverse transcriptase of human retrotransposon L1 in Saccharomyces cerevisiae. Mol. Cell. Biol. 1994, 14 (7): 4485–92. PMC 358820. PMID 7516468.
- Copeland CS, Mann VH, Morales ME, Kalinna BH, Brindley PJ. The Sinbad retrotransposon from the genome of the human blood fluke, Schistosoma mansoni, and the distribution of related Pao-like elements. BMC Evol. Biol. 2005, 5 (1): 20. PMC 554778. PMID 15725362. doi:10.1186/1471-2148-5-20.
- Wicker T, Sabot F, Hua-Van A; et al. A unified classification system for eukaryotic transposable elements. Nat. Rev. Genet. 2007, 8 (12): 973–82. PMID 17984973. doi:10.1038/nrg2165.
- McCarthy EM, McDonald JF. Long terminal repeat retrotransposons of Mus musculus. Genome Biol. 2004, 5 (3): R14. PMC 395764. PMID 15003117. doi:10.1186/gb-2004-5-3-r14.
- Singer MF. SINEs and LINEs: highly repeated short and long interspersed sequences in mammalian genomes. Cell. 1982, 28 (3): 433–4. PMID 6280868.
- Ohshima K, Okada N. SINEs and LINEs: symbionts of eukaryotic genomes with a common tail. Cytogenet. Genome Res. 2005, 110 (1-4): 475–90. PMID 16093701. doi:10.1159/000084981.
- Deininger PL, Batzer MA. Mammalian retroelements. Genome Res. 2002, 12 (10): 1455–65. PMID 12368238. doi:10.1101/gr.282402.
- Pierce, Benjamin C. Genetics: a conceptual approach. 2nd ed. San Francisco: W.H. Freeman. 2005: 311. ISBN 0-7167-8881-0.
- Stansfield, William D.; King, Robert C. A dictionary of genetics. 5th ed. Oxford [Oxfordshire]: Oxford University Press. 1997. ISBN 0-19-509441-7.
- Santangelo, Andrea; de Souza, Flavio; Franchini, Lucia; Bumaschny, Viviana; Low, Malcolm; Rubinstein, Marcelo. Ancient Exaptation of a CORE-SINE Retroposon into a Highly Conserved Mammalian Neuronal Enhancer of the Proopiomelanocortin Gene. PLoS Genetics (Public Library of Science). 2007-10, 3 (10): e166 [2007-12-31]. doi:10.1371/journal.pgen.0030166.
- Zelnick CR, Burks DJ, Duncan CH. A composite transposon 3' to the cow fetal globin gene binds a sequence specific factor. Nucleic Acids Res. 1987, 15 (24): 10437–53. PMC 339954. PMID 2827124.