Protein targeting or protein sorting is the biological mechanism by which proteins are transported to their appropriate destinations inside or outside the cell. Proteins can be targeted to the inner space of an organelle, to different intracellular membranes, to the plasma membrane, or to the exterior of the cell via secretion. This delivery process is carried out based on information contained in the protein itself. Correct sorting is crucial for the cell; errors can lead to diseases.
- This article deals with protein targeting in eukaryotes except where noted.
Targeting signals are the pieces of information that enable the cellular transport machinery to correctly position a protein inside or outside the cell. This information is contained in the polypeptide chain or in the folded protein. A continuous stretch of amino acid residues in the chain that enables targeting is called a signal peptide or targeting peptide. There are two types of targeting peptides: presequences and internal targeting peptides. Presequences are often found as an N-terminal extension and are composed of between 6 and 136 basic and hydrophobic amino acids. In the case of peroxisomes, the targeting sequence is mostly found at the C-terminus. Other signals, known as signal patches, are composed of parts that are separate in the primary sequence and become functional when folding brings them together on the protein surface. In addition, protein modifications such as glycosylation can induce targeting.
In 1970, Günter Blobel conducted experiments on the translocation of proteins across membranes. He was awarded the 1999 Nobel Prize for his findings. He discovered that many proteins have a signal sequence, that is, a short amino acid sequence at one end that functions like a postal code for the target organelle. The translation of mRNA into protein by a ribosome takes place within the cytosol. If a synthesized protein "belongs" in an organelle, it can be transported there in one of two ways, depending on the protein: co-translational translocation (translocation during the process of translation) or post-translational translocation (translocation after the process of translation is complete).
Most proteins that are secretory, membrane-bound, or resident in the endoplasmic reticulum (ER), Golgi or endosomes use the co-translational translocation pathway. This process begins when the N-terminal signal peptide of the protein is recognized by a signal recognition particle (SRP) while the protein is still being synthesized on the ribosome. Synthesis pauses while the ribosome-protein complex is transferred to an SRP receptor on the ER in eukaryotes, or on the plasma membrane in prokaryotes. There, the nascent protein is inserted into the translocon, a membrane-bound protein-conducting channel composed of the Sec61 translocation complex in eukaryotes and the homologous SecYEG complex in prokaryotes. In secretory proteins and type I transmembrane proteins, the signal sequence is cleaved from the nascent polypeptide by signal peptidase as soon as it has been translocated into the membrane of the ER (eukaryotes) or the plasma membrane (prokaryotes). The signal sequences of type II membrane proteins and some polytopic membrane proteins are not cleaved off and are therefore referred to as signal anchor sequences. Within the ER, the protein is first covered by a chaperone protein that protects it from the high concentration of other proteins in the ER, giving it time to fold correctly. Once folded, the protein is modified as needed (for example, by glycosylation), then either transported to the Golgi for further processing and on to its target organelle, or retained in the ER by various ER retention mechanisms.
The amino acid chain of transmembrane proteins, which are often transmembrane receptors, passes through a membrane one or several times. These proteins are inserted into the membrane by translocation until the process is interrupted by a stop-transfer sequence, also called a membrane anchor sequence. At present, these complex membrane proteins are mostly understood using the same model of targeting that has been developed for secretory proteins. However, many complex multi-transmembrane proteins contain structural aspects that do not fit the model. Seven-transmembrane G protein-coupled receptors (which represent about 5% of the genes in humans) mostly do not have an amino-terminal signal sequence. In contrast to secretory proteins, their first transmembrane domain acts as the first signal sequence, which targets them to the ER membrane. This also results in the translocation of the amino terminus of the protein into the ER lumen, after that terminus has already been synthesized. This would seem to break the rule of "co-translational" translocation, which has always held for mammalian proteins targeted to the ER. This has been demonstrated for opsin in in vitro experiments. A great deal of the mechanics of transmembrane topology and folding remains to be elucidated.
Even though most secretory proteins are co-translationally translocated, some are translated in the cytosol and later transported to the ER (in eukaryotes) or the plasma membrane (in prokaryotes) by a post-translational system. In prokaryotes this requires certain cofactors such as SecA and SecB. This pathway is poorly understood in eukaryotes, but is facilitated by Sec62 and Sec63, two membrane-bound proteins.
In addition, proteins targeted to other destinations, such as mitochondria, chloroplasts, or peroxisomes, use specialized post-translational pathways. Proteins targeted to the nucleus are also translocated post-translationally. They pass through the nuclear envelope via nuclear pores.
Sorting of proteins
Mitochondria
Most mitochondrial proteins are synthesized as cytosolic precursors containing uptake peptide signals. Cytosolic chaperones deliver preproteins to channel-linked receptors in the mitochondrial membrane. A preprotein carrying a presequence targeted for the mitochondria is bound at the outer membrane by receptors and the General Import Pore (GIP); the receptors and GIP are collectively known as the Translocase of the Outer Membrane, or TOM. The preprotein is translocated through TOM as hairpin loops. It is then transported through the intermembrane space by small TIMs (which also act as molecular chaperones) to the TIM23 or TIM22 complex (Translocase of the Inner Membrane) at the inner membrane. Within the matrix, the targeting sequence is cleaved off by the mitochondrial processing peptidase (MPP), while the matrix chaperone mtHsp70 helps drive import.
Three mitochondrial outer membrane receptors are known:
- TOM70: Binds to internal targeting peptides and acts as a docking point for cytosolic chaperones.
- TOM20: Binds presequences.
- TOM22: Binds both presequences and internal targeting peptides.
The presequence translocase (TIM23) is localized to the mitochondrial inner membrane and acts as a pore-forming protein that binds precursor proteins with its N-terminus. TIM23 acts as a translocator for preproteins destined for the mitochondrial matrix, the inner mitochondrial membrane and the intermembrane space. TIM50 is bound to TIM23 on the intermembrane-space side and binds presequences. TIM44 is bound on the matrix side and binds mtHsp70.
The carrier translocase (TIM22) binds preproteins destined exclusively for the inner mitochondrial membrane.
Mitochondrial matrix targeting sequences are rich in positively charged and hydroxylated amino acids.
Proteins are targeted to submitochondrial compartments by multiple signals and several pathways.
Targeting to the outer membrane, intermembrane space, and inner membrane often requires another signal sequence in addition to the matrix targeting sequence.
Chloroplasts
The preprotein for chloroplasts may contain a stromal import sequence, or both a stromal and a thylakoid targeting sequence. The majority of preproteins are translocated through the Toc and Tic complexes located within the chloroplast envelope. In the stroma, the stromal import sequence is cleaved off, and the protein either folds there or continues through intra-chloroplast sorting to the thylakoids. Proteins targeted to the envelope of chloroplasts usually lack a cleavable sorting sequence.
Both chloroplasts and mitochondria
Many proteins are needed in both mitochondria and chloroplasts. In general, the targeting peptide of such a dual-targeted protein is intermediate in character between the two organelle-specific ones. The targeting peptides of these proteins have a high content of basic and hydrophobic amino acids and a low content of negatively charged amino acids. They have a lower content of alanine and a higher content of leucine and phenylalanine. Dual-targeted proteins have a more hydrophobic targeting peptide than both mitochondrial and chloroplastic ones.
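The compositional features described above (basic, hydrophobic, negatively charged and hydroxylated residues) can be tallied for any candidate targeting peptide. The following is a minimal sketch, not an established prediction method: the function name `composition`, the grouping of residues into classes, and the example peptide are all assumptions made for illustration.

```python
from collections import Counter

# Residue classes for this sketch. The groupings are an assumption;
# other classifications of amino acids exist.
BASIC = set("KR")
ACIDIC = set("DE")
HYDROPHOBIC = set("AVLIMFW")
HYDROXYLATED = set("ST")


def composition(peptide: str) -> dict:
    """Return the fraction of residues in each class for a peptide."""
    seq = peptide.strip().upper()
    if not seq:
        raise ValueError("peptide must not be empty")
    counts = Counter(seq)
    total = len(seq)

    def frac(group):
        # Counter returns 0 for residues that do not occur.
        return sum(counts[aa] for aa in group) / total

    return {
        "basic": frac(BASIC),
        "acidic": frac(ACIDIC),
        "hydrophobic": frac(HYDROPHOBIC),
        "hydroxylated": frac(HYDROXYLATED),
    }


# Hypothetical 8-residue peptide, used only to show the output shape.
profile = composition("MLSLRKSA")
print(profile)
```

Comparing such profiles across peptides only makes sense if the same residue classes are used throughout; the fractions say nothing by themselves about where a protein is actually imported.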
Peroxisomes
All peroxisomal proteins are encoded by nuclear genes.
To date there are two types of known Peroxisome Targeting Signals (PTS):
Peroxisome targeting signal 1 (PTS1): a C-terminal tripeptide with a consensus sequence (S/A/C)-(K/R/H)-(L/A). The most common PTS1 is serine-lysine-leucine (SKL). Most peroxisomal matrix proteins possess a PTS1 type signal.
Peroxisome targeting signal 2 (PTS2): a nonapeptide located near the N-terminus with a consensus sequence (R/K)-(L/V/I)-XXXXX-(H/Q)-(L/A/F) (where X can be any amino acid).
There are also proteins that possess neither of these signals. Their transport may be based on a so-called "piggy-back" mechanism: such proteins associate with PTS1-possessing matrix proteins and are translocated into the peroxisomal matrix together with them.
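The two consensus sequences above translate directly into regular expressions. The sketch below is illustrative only: the function name `find_pts` and the test sequences are made up, and the 40-residue window used to interpret "near the N-terminus" for PTS2 is an assumption, since the text gives no numeric limit. A match to a consensus pattern does not by itself establish that a protein is peroxisomal.

```python
import re

# PTS1: C-terminal tripeptide (S/A/C)-(K/R/H)-(L/A), anchored at the end.
PTS1 = re.compile(r"[SAC][KRH][LA]$")

# PTS2: (R/K)-(L/V/I)-XXXXX-(H/Q)-(L/A/F), where X is any residue.
PTS2 = re.compile(r"[RK][LVI][A-Z]{5}[HQ][LAF]")

# Assumption: "near the N-terminus" is taken as the first 40 residues.
N_TERMINAL_WINDOW = 40


def find_pts(sequence: str) -> list:
    """Return the names of the consensus signals matched by a sequence."""
    seq = sequence.strip().upper()
    hits = []
    if PTS1.search(seq):
        hits.append("PTS1")
    if PTS2.search(seq[:N_TERMINAL_WINDOW]):
        hits.append("PTS2")
    return hits


# Made-up sequences for demonstration.
print(find_pts("MAAAAAAGSKL"))           # ends in SKL
print(find_pts("MRLAAAAAHLGGGGGGGGGG"))  # RL-AAAAA-HL near the start
```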
In bacteria and archaea
As discussed above (see protein translocation), most prokaryotic membrane-bound and secretory proteins are targeted to the plasma membrane by either a co-translational pathway that uses bacterial SRP or a post-translational pathway that requires SecA and SecB. At the plasma membrane, these two pathways deliver proteins to the SecYEG translocon for translocation. Bacteria may have a single plasma membrane (Gram-positive bacteria), or an inner membrane plus an outer membrane separated by the periplasm (Gram-negative bacteria). Most prokaryotes lack the membrane-bound organelles found in eukaryotes, but they may assemble proteins onto various types of inclusions such as gas vesicles and storage granules.
In Gram-negative bacteria, proteins may be incorporated into the plasma membrane, the outer membrane or the periplasm, or secreted into the environment. Systems for secreting proteins across the bacterial outer membrane may be quite complex and play key roles in pathogenesis. These systems may be described as type I secretion, type II secretion, etc.
In most Gram-positive bacteria, certain proteins are targeted for export across the plasma membrane and subsequent covalent attachment to the bacterial cell wall. A specialized enzyme, sortase, cleaves the target protein at a characteristic recognition site near the protein C-terminus, such as an LPXTG motif (where X can be any amino acid), then transfers the protein onto the cell wall. Several analogous systems are found that likewise feature a signature motif on the extracytoplasmic face, a C-terminal transmembrane domain, and a cluster of basic residues on the cytosolic face at the protein's extreme C-terminus. The PEP-CTERM/exosortase system, found in many Gram-negative bacteria, seems to be related to extracellular polymeric substance production. The PGF-CTERM/archaeosortase A system in archaea is related to S-layer production. The GlyGly-CTERM/rhombosortase system, found in Shewanella, Vibrio, and a few other genera, seems to be involved in the release of proteases, nucleases, and other enzymes.
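As an illustration of the sortase recognition site described above, the following sketch looks for an LPXTG motif in the C-terminal part of a sequence. The function name `find_sortase_site`, the default 50-residue window standing in for "near the C-terminus", and the test sequence are assumptions made for this example; the sketch does not check for the transmembrane domain or the basic tail.

```python
import re

# LPXTG, where X can be any amino acid.
LPXTG = re.compile(r"LP[A-Z]TG")


def find_sortase_site(sequence: str, window: int = 50):
    """Return the 0-based start of the last LPXTG motif that begins
    within the final `window` residues, or None if there is none.

    The window size is an assumption; the text only says the site
    lies near the C-terminus.
    """
    seq = sequence.strip().upper()
    offset = max(0, len(seq) - window)
    # Start scanning at `offset`; reported positions still refer
    # to the full sequence.
    matches = list(LPXTG.finditer(seq, offset))
    return matches[-1].start() if matches else None


# Made-up sequence: filler, an LPETG motif, more filler, a basic tail.
demo = "M" + "A" * 20 + "LPETG" + "A" * 20 + "KRRK"
print(find_sortase_site(demo))
```

Restricting the search to a C-terminal window avoids reporting chance matches elsewhere in a long protein, at the cost of missing sites that lie further upstream than the chosen window.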
Identifying protein targeting motifs in proteins
Minimotif Miner is a bioinformatics tool that searches protein sequence queries for known protein targeting sequence motifs.
- Kanner EM, Friedlander M, Simon SM. (2003). "Co-translational targeting and translocation of the amino terminus of opsin across the endoplasmic membrane requires GTP but not ATP". J. Biol. Chem. 278 (10): 7920–7926. doi:10.1074/jbc.M207462200. PMID 12486130.
- Kanner EM, Klein IK. et al. (2002). "The amino terminus of opsin translocates "posttranslationally" as efficiently as cotranslationally". Biochemistry 41 (24): 7707–7715. doi:10.1021/bi0256882. PMID 12056902.