RNA splicing, in molecular biology, is a form of RNA processing in which a newly made precursor messenger RNA (pre-mRNA) transcript is transformed into a mature messenger RNA (mRNA). During splicing, introns (non-coding regions) are removed and exons (coding regions) are joined together. For nuclear-encoded genes, splicing takes place within the nucleus either during or immediately after transcription. For those eukaryotic genes that contain introns, splicing is usually required in order to create an mRNA molecule that can be translated into protein. For many eukaryotic introns, splicing is carried out in a series of reactions catalyzed by the spliceosome, a complex of small nuclear ribonucleoproteins (snRNPs). Self-splicing introns, or ribozymes capable of catalyzing their own excision from their parent RNA molecule, also exist.
Several methods of RNA splicing occur in nature; the type of splicing depends on the structure of the spliced intron and the catalysts required for splicing to occur.
The word intron is derived from the terms intragenic region and intracistron, that is, a segment of DNA that is located between two exons of a gene. The term intron refers to both the DNA sequence within a gene and the corresponding sequence in the unprocessed RNA transcript. As part of the RNA processing pathway, introns are removed by RNA splicing either shortly after or concurrently with transcription. Introns are found in the genes of most organisms and many viruses. They can be located in a wide range of genes, including those that generate proteins, ribosomal RNA (rRNA), and transfer RNA (tRNA).
Within introns, a donor site (5' end of the intron), a branch site (near the 3' end of the intron) and an acceptor site (3' end of the intron) are required for splicing. The splice donor site includes an almost invariant sequence GU at the 5' end of the intron, within a larger, less highly conserved region. The splice acceptor site at the 3' end of the intron terminates the intron with an almost invariant AG sequence. Upstream (5'-ward) from the AG there is a region high in pyrimidines (C and U), called the polypyrimidine tract. Further upstream from the polypyrimidine tract is the branchpoint, which includes an adenine nucleotide involved in lariat formation. The consensus sequence for an intron (in IUPAC nucleic acid notation) is: G-G-[cut]-G-U-R-A-G-U (donor site) ... intron sequence ... Y-U-R-A-C (branch sequence 20-50 nucleotides upstream of acceptor site) ... Y-rich-N-C-A-G-[cut]-G (acceptor site). However, the specific sequence of intronic splicing elements and the number of nucleotides between the branchpoint and the nearest 3' acceptor site also affect splice site selection. In addition, point mutations in the underlying DNA or errors during transcription can activate a cryptic splice site in part of the transcript that usually is not spliced. This results in a mature messenger RNA with a missing section of an exon. In this way, a point mutation, which might otherwise affect only a single amino acid, can manifest as a deletion or truncation in the final protein.
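The consensus elements described above can be checked mechanically. The following Python sketch is illustrative only: the function name `check_intron`, its parameters, and the choice to measure the branchpoint distance from the branch adenine to the 3' end of the intron are assumptions made for this example, not part of any standard tool. It tests a candidate intron for the GU donor and AG acceptor dinucleotides and looks for Y-U-R-A-C branch motifs 20-50 nucleotides upstream of the acceptor site.

```python
import re

# IUPAC codes used in the consensus: R = purine (A or G), Y = pyrimidine (C or U).
# The lookahead lets overlapping Y-U-R-A-C matches be reported.
BRANCH_MOTIF = re.compile(r"(?=([CU]U[AG]AC))")


def check_intron(intron: str, min_dist: int = 20, max_dist: int = 50) -> dict:
    """Report which consensus features a candidate intron (5'->3') carries.

    The distance window is measured from the branch adenine to the 3' end
    of the intron; this convention is an assumption of the sketch.
    """
    intron = intron.upper().replace("T", "U")  # accept DNA or RNA spelling
    branch_positions = []
    for match in BRANCH_MOTIF.finditer(intron):
        a_index = match.start() + 3        # the A in Y-U-R-A-C
        dist = len(intron) - a_index       # nucleotides to the intron's 3' end
        if min_dist <= dist <= max_dist:
            branch_positions.append(a_index)
    return {
        "donor_GU": intron.startswith("GU"),
        "acceptor_AG": intron.endswith("AG"),
        "branch_A_positions": branch_positions,
    }


if __name__ == "__main__":
    # Toy intron: donor region, spacer, branch motif, pyrimidine tract, acceptor.
    toy = "GUAAGU" + "A" * 10 + "CUAAC" + "U" * 20 + "CAG"
    print(check_intron(toy))
```

Real splice site prediction relies on weighted models of the surrounding sequence rather than exact motif matches, so a sketch like this only flags candidates.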
Formation and activity
Splicing is catalyzed by the spliceosome, a large RNA-protein complex composed of five small nuclear ribonucleoproteins (snRNPs, pronounced 'snurps'). Assembly and activity of the spliceosome occur during transcription of the pre-mRNA. The RNA components of snRNPs interact with the intron and are involved in catalysis. Two types of spliceosomes have been identified (major and minor), which contain different snRNPs.
- The major spliceosome splices introns containing GU at the 5' splice site and AG at the 3' splice site. It is composed of the U1, U2, U4, U5, and U6 snRNPs and is active in the nucleus. In addition, a number of proteins including U2 small nuclear RNA auxiliary factor 1 (U2AF35), U2AF2 (U2AF65) and SF1 are required for the assembly of the spliceosome. The spliceosome forms different complexes during the splicing process:
- Complex A (pre-spliceosome): the U2 snRNP displaces SF1 and binds to the branch point sequence, and ATP is hydrolyzed;
- Complex B (pre-catalytic spliceosome): the U5/U4/U6 snRNP trimer binds, and the U5 snRNP binds exons at the 5' site, with U6 binding to U2;
- Complex B*: the U1 snRNP is released, U5 shifts from exon to intron, and U6 binds at the 5' splice site;
- Complex C (catalytic spliceosome): U4 is released; U6/U2 catalyzes the transesterification in which the 5' end of the intron is ligated to the branchpoint A within the intron, so that the 5' splice site is cleaved and the lariat forms; U5 binds the exon at the 3' splice site.
- This type of splicing is termed canonical splicing, or the lariat pathway, and accounts for more than 99% of splicing. By contrast, when the intronic flanking sequences do not follow the GU-AG rule, noncanonical splicing is said to occur (see "minor spliceosome" below).
- The minor spliceosome is very similar to the major spliceosome, but instead it splices out rare introns with different splice site sequences. While the minor and major spliceosomes contain the same U5 snRNP, the minor spliceosome has different but functionally analogous snRNPs for U1, U2, U4, and U6, which are respectively called U11, U12, U4atac, and U6atac.
In most cases, splicing removes introns as single units from precursor mRNA transcripts. However, in some cases, especially in mRNAs with very long introns, splicing happens in steps: part of an intron is removed first, and the remainder is spliced out in a following step. This was first found in the Ultrabithorax (Ubx) gene of the fruit fly, Drosophila melanogaster, and in a few other Drosophila genes, but cases in humans have been reported as well.
Self-splicing occurs for rare introns that form a ribozyme, performing the functions of the spliceosome by RNA alone. There are three kinds of self-splicing introns, Group I, Group II and Group III. Group I and II introns perform splicing similar to the spliceosome without requiring any protein. This similarity suggests that Group I and II introns may be evolutionarily related to the spliceosome. Self-splicing may also be very ancient, and may have existed in an RNA world present before protein.
Two transesterifications characterize the mechanism in which group I introns are spliced:
- 3'OH of a free guanine nucleoside (or one located in the intron) or a nucleotide cofactor (GMP, GDP, GTP) attacks phosphate at the 5' splice site.
- 3'OH of the 5' exon becomes a nucleophile and the second transesterification results in the joining of the two exons.
The mechanism by which group II introns are spliced (two transesterification reactions, as in group I introns) is as follows:
- The 2'OH of a specific adenosine in the intron attacks the 5' splice site, thereby forming the lariat.
- The 3'OH of the 5' exon triggers the second transesterification at the 3' splice site, thereby joining the exons together.
tRNA (also tRNA-like) splicing is another rare form of splicing that usually occurs in tRNA. The splicing reaction involves a different biochemistry than the spliceosomal and self-splicing pathways.
In the yeast Saccharomyces cerevisiae, a tRNA splicing endonuclease heterotetramer, composed of TSEN54, TSEN2, TSEN34, and TSEN15, cleaves pre-tRNA at two sites in the anticodon loop to form a 5'-half tRNA, terminating at a 2',3'-cyclic phosphodiester group, and a 3'-half tRNA, terminating at a 5'-hydroxyl group, along with a discarded intron. Yeast tRNA kinase then phosphorylates the 5'-hydroxyl group using adenosine triphosphate. Yeast tRNA cyclic phosphodiesterase cleaves the cyclic phosphodiester group to form a 2'-phosphorylated 3' end. Yeast tRNA ligase adds an adenosine monophosphate group to the 5' end of the 3'-half and joins the two halves together. NAD-dependent 2'-phosphotransferase then removes the 2'-phosphate group.
Splicing occurs in all the kingdoms or domains of life; however, the extent and types of splicing can be very different between the major divisions. Eukaryotes splice many protein-coding messenger RNAs and some non-coding RNAs. Prokaryotes, on the other hand, splice rarely, and mostly non-coding RNAs. Another important difference between these two groups of organisms is that prokaryotes completely lack the spliceosomal pathway.
Because spliceosomal introns are not conserved in all species, there is debate concerning when spliceosomal splicing evolved. Two models have been proposed: the intron late and intron early models (see intron evolution).
Spliceosomal splicing and self-splicing involve a two-step biochemical process. Both steps involve transesterification reactions that occur between RNA nucleotides. tRNA splicing, however, is an exception and does not occur by transesterification.
Spliceosomal splicing and self-splicing proceed via two sequential transesterification reactions. First, the 2'OH of a specific branchpoint nucleotide within the intron, defined during spliceosome assembly, performs a nucleophilic attack on the first nucleotide of the intron at the 5' splice site, forming the lariat intermediate. Second, the 3'OH of the released 5' exon performs a nucleophilic attack at the first nucleotide following the last nucleotide of the intron at the 3' splice site, thus joining the exons and releasing the intron lariat.
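The bookkeeping of the two steps can be followed on a plain string. The sketch below is a toy model, not a biochemical simulation: the function `splice`, the `SpliceProducts` type, and the index conventions are hypothetical names chosen for this example. It takes a pre-mRNA and the positions of the intron and branchpoint, and returns the joined exons together with the loop and tail portions of the released lariat.

```python
from typing import NamedTuple


class SpliceProducts(NamedTuple):
    mature: str       # 5' exon joined to 3' exon
    lariat_loop: str  # intron from its first nucleotide through the branchpoint
    lariat_tail: str  # intron nucleotides 3' of the branchpoint


def splice(pre_mrna: str, intron_start: int, intron_end: int, branch: int) -> SpliceProducts:
    """Toy model of the two transesterification steps.

    intron_start: index of the first intron nucleotide
    intron_end:   index one past the last intron nucleotide
    branch:       index of the branchpoint nucleotide (inside the intron)
    """
    if not (0 <= intron_start < branch < intron_end <= len(pre_mrna)):
        raise ValueError("branchpoint must lie inside the intron")

    # Step 1: the branchpoint 2'OH attacks the 5' splice site. The 5' exon is
    # released, and the intron's 5' end is now bonded to the branchpoint.
    five_exon = pre_mrna[:intron_start]
    lariat_intermediate = pre_mrna[intron_start:]  # looped intron + 3' exon

    # Step 2: the 3'OH of the 5' exon attacks the 3' splice site, joining the
    # exons and releasing the intron as a lariat.
    intron_len = intron_end - intron_start
    intron = lariat_intermediate[:intron_len]
    three_exon = lariat_intermediate[intron_len:]

    loop_len = branch - intron_start + 1
    return SpliceProducts(five_exon + three_exon, intron[:loop_len], intron[loop_len:])


if __name__ == "__main__":
    pre = "AAGG" + "GUAAGUCUAACUUUCAG" + "GCC"
    print(splice(pre, intron_start=4, intron_end=21, branch=13))
```

Representing the lariat as a loop plus a tail reflects that the 5' end of the intron is joined to the branchpoint, leaving the nucleotides 3' of the branchpoint as a linear extension.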
In many cases, the splicing process can create a range of unique proteins by varying the exon composition of the same mRNA. This phenomenon is called alternative splicing. Alternative splicing can occur in many ways: exons can be extended or skipped, or introns can be retained. It is estimated that 95% of transcripts from multiexon genes undergo alternative splicing, some instances of which occur in a tissue-specific manner and/or under specific cellular conditions. High-throughput mRNA sequencing can help quantify the expression levels of alternatively spliced isoforms, and differential expression levels across tissues and cell lineages have allowed computational approaches to be developed to predict the functions of these isoforms. Given this complexity, alternative splicing of pre-mRNA transcripts is regulated by a system of trans-acting proteins (activators and repressors) that bind to cis-acting sites or "elements" (enhancers and silencers) on the pre-mRNA transcript itself. These proteins and their respective binding elements promote or reduce the usage of a particular splice site. The binding specificity comes from the sequence and structure of the cis-elements. For example, HIV-1 has many donor and acceptor splice sites. Among these, ssA7, which is a 3' acceptor site, folds into three stem loop structures: the intronic splicing silencer (ISS), the exonic splicing enhancer (ESE), and the exonic splicing silencer (ESSE3). The solution structure of the intronic splicing silencer and its interaction with the host protein hnRNPA1 give insight into specific recognition. Adding to the complexity of alternative splicing, the effects of regulatory factors are often position-dependent. For example, a splicing factor that serves as a splicing activator when bound to an intronic enhancer element may serve as a repressor when bound to its splicing element in the context of an exon, and vice versa.
In addition to the position-dependent effects of enhancer and silencer elements, the location of the branchpoint (i.e., distance upstream of the nearest 3’ acceptor site) also affects splicing. The secondary structure of the pre-mRNA transcript also plays a role in regulating splicing, such as by bringing together splicing elements or by masking a sequence that would otherwise serve as a binding element for a splicing factor.
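To illustrate how varying exon composition yields several transcripts from one gene, the following Python sketch enumerates the isoforms produced by exon skipping alone. It rests on simplifying assumptions that are not taken from the text: the first and last exons are always retained, every internal exon may be skipped independently, and exon order is preserved. The function name `exon_skipping_isoforms` is hypothetical.

```python
from itertools import combinations


def exon_skipping_isoforms(exons: list) -> list:
    """Enumerate isoforms in which any subset of internal exons is skipped.

    Assumptions of this sketch: the first and last exons are always kept,
    internal exons are skipped independently, and exon order never changes.
    """
    if len(exons) < 2:
        return [tuple(exons)]
    first, *internal, last = exons
    isoforms = []
    # Start with all internal exons kept, then progressively skip more.
    for kept_count in range(len(internal), -1, -1):
        for kept in combinations(internal, kept_count):
            isoforms.append((first, *kept, last))
    return isoforms


if __name__ == "__main__":
    for isoform in exon_skipping_isoforms(["E1", "E2", "E3", "E4"]):
        print("-".join(isoform))
```

With n internal exons this model gives 2^n isoforms. In a cell, the regulatory proteins and elements described above restrict which of these combinations are actually produced, and exon extension and intron retention add further variants that the sketch does not cover.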
Splicing response to DNA damage
DNA damage affects splicing factors by altering their post-translational modification, localization, expression and activity. Furthermore, DNA damage often disrupts splicing by interfering with its coupling to transcription. DNA damage also has an impact on the splicing and alternative splicing of genes intimately associated with DNA repair. For instance, DNA damage modulates the alternative splicing of the DNA repair genes Brca1 and Ercc1.
Experimental manipulation of splicing
Splicing events can be experimentally altered by binding steric-blocking antisense oligos, such as Morpholinos or peptide nucleic acids, to snRNP binding sites, to the branchpoint nucleotide that closes the lariat, or to splice-regulatory element binding sites.
Splicing errors and variation
It has been suggested that one third of all disease-causing mutations affect splicing. Common errors include:
- Mutation of a splice site resulting in loss of function of that site. This results in exposure of a premature stop codon, loss of an exon, or inclusion of an intron.
- Mutation of a splice site reducing specificity. This may result in variation in the splice location, causing insertion or deletion of amino acids or, most likely, a disruption of the reading frame.
- Displacement of a splice site, leading to inclusion or exclusion of more RNA than expected, resulting in longer or shorter exons.
Although many splicing errors are safeguarded by a cellular quality control mechanism termed nonsense-mediated mRNA decay (NMD), a number of splicing-related diseases also exist, as suggested above.
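The effect of a splicing error on the reading frame follows from simple arithmetic: codons are three nucleotides long, so a change in mRNA length that is a multiple of three inserts or deletes whole amino acids, while any other change shifts the frame and can expose a premature stop codon. The sketch below illustrates this with two hypothetical helper functions, `length_change_effect` and `first_stop_codon`. The sequences are invented for the example, and the coding sequence is assumed to start at index 0.

```python
from typing import Optional

STOP_CODONS = {"UAA", "UAG", "UGA"}


def length_change_effect(delta_nt: int) -> str:
    """Classify a change in mRNA length (in nucleotides) by its effect on the frame."""
    if delta_nt == 0:
        return "no length change"
    if delta_nt % 3 == 0:
        return "in-frame"      # whole codons inserted or deleted
    return "frameshift"        # downstream codons are all read differently


def first_stop_codon(cds: str) -> Optional[int]:
    """Index of the first in-frame stop codon, reading from index 0; None if absent."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i
    return None


if __name__ == "__main__":
    normal = "AUGGCUGAACCCUAA"             # AUG GCU GAA CCC UAA
    # A displaced splice site that drops two extra nucleotides ("GC"):
    mis_spliced = normal[:3] + normal[5:]  # AUG UGA ACC CUA A
    delta = len(mis_spliced) - len(normal)
    print(length_change_effect(delta), first_stop_codon(normal), first_stop_codon(mis_spliced))
```

In the example, the two-nucleotide loss brings a UGA triplet into frame immediately after the start codon, which is the kind of premature stop codon that nonsense-mediated mRNA decay is described as safeguarding against.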
Allelic differences in mRNA splicing are likely to be a common and important source of phenotypic diversity at the molecular level, in addition to their contribution to genetic disease susceptibility. Indeed, genome-wide studies in humans have identified a range of genes that are subject to allele-specific splicing.
In addition to RNA, proteins can undergo splicing. Although the biomolecular mechanisms are different, the principle is the same: parts of the protein, called inteins instead of introns, are removed. The remaining parts, called exteins instead of exons, are fused together. Protein splicing has been observed in a wide range of organisms, including bacteria, archaea, plants, yeast and humans.