Revision as of 03:21, 1 October 2024
C1orf52
Chromosome 1 open reading frame 52 (C1orf52) is a protein in Homo sapiens encoded by the C1orf52 gene. The protein shows cytoplasmic and nuclear expression in most tissues.[1]
Gene
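C1orf52 is located on the minus strand of the short arm of chromosome 1 at 1p22.3.[2] Including introns and exons, the gene spans 9,720 base pairs and contains 3 exons.[3] C1orf52 lies downstream of BCL10. The B-cell lymphoma 10 (BCL10), B-cell lymphoma antisense 1 (BCL-AS1), dimethylarginine dimethylaminohydrolase 1 (DDAH1), and synapse defective Rho GTPase homolog 2 (SYDE2) genes are located in close proximity to C1orf52 on chromosome 1.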
Transcript
Including untranslated regions, the mRNA is 3,254 nucleotides long.[4] It has a short 5' untranslated region of 29 nucleotides.
Transcript Variants
A second transcript variant (variant 2) includes an additional exon.[2] This alternate exon lies in the coding region and causes a frameshift after nucleotide 306, followed by an early stop codon. Variant 2 does not produce the C1orf52 protein: its predicted product is significantly truncated, and the transcript is a candidate for nonsense-mediated decay.[2]
Transcript | Exon 1 (nt) | Exon 2 (nt) | Exon 3 (nt) | Exon 4 (nt) | Protein Length (amino acids) |
---|---|---|---|---|---|
Transcript Variant 1 | 306 | - | 199 | 2750 | 182 |
Transcript Variant 2 | 306 | 127 | 199 | 2750 | none |
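The frameshift described above can be checked from the exon lengths in the table: an inserted exon shifts the downstream reading frame whenever its length is not a multiple of three. The following is a minimal Python sketch, assuming the table values are exon lengths in nucleotides and that the extra exon falls inside the coding region, as the text states. The function and variable names are illustrative and do not come from any cited source.

```python
# Exon lengths copied from the table above (assumed to be nucleotides).
VARIANT_1 = [306, 199, 2750]
VARIANT_2 = [306, 127, 199, 2750]


def shifts_reading_frame(inserted_length: int) -> bool:
    """Return True if inserting this many nucleotides changes the reading frame."""
    return inserted_length % 3 != 0


def extra_exon_lengths(reference: list[int], variant: list[int]) -> list[int]:
    """Return exon lengths in the variant that have no in-order match in the reference."""
    remaining = list(reference)
    extras = []
    for length in variant:
        if remaining and length == remaining[0]:
            remaining.pop(0)  # this exon is shared with the reference transcript
        else:
            extras.append(length)  # this exon is unique to the variant
    return extras


extra = extra_exon_lengths(VARIANT_1, VARIANT_2)
print(extra)                                      # [127]
print([shifts_reading_frame(n) for n in extra])   # [True], because 127 % 3 == 1
```

Because 127 leaves a remainder of 1 when divided by 3, every codon downstream of the insertion is read out of frame, which is consistent with the early stop codon reported for variant 2.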
No protein isoforms of C1orf52 have been reported.[5]
Protein
General Properties
The primary encoded protein consists of 182 amino acids and has a molecular weight of approximately 20 kDa.[3] It contains a 98-amino-acid domain of unknown function (DUF4660), also known as pfam15559.[3] This domain is flanked by two disordered regions, which make up most of the rest of the protein. C1orf52 is reported to have RNA-binding activity.
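The molecular weight quoted above is consistent with a common rule of thumb of roughly 110 Da per amino acid residue. The sketch below shows that arithmetic, along with the share of the protein covered by DUF4660. The 110 Da average is an assumption for illustration and is not taken from the cited sources; a mass computed from the actual sequence would differ slightly.

```python
# Assumption: rule-of-thumb average mass of one amino acid residue, in daltons.
AVERAGE_RESIDUE_MASS_DA = 110.0

PROTEIN_LENGTH_AA = 182   # from the text
DOMAIN_LENGTH_AA = 98     # DUF4660 length, from the text


def approximate_mass_kda(n_residues: int, avg_residue_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Estimate protein mass in kDa from residue count alone."""
    return n_residues * avg_residue_da / 1000.0


def domain_fraction_percent(domain_length: int, protein_length: int) -> float:
    """Return the percentage of the protein covered by the domain."""
    return 100.0 * domain_length / protein_length


print(round(approximate_mass_kda(PROTEIN_LENGTH_AA), 1))                          # 20.0
print(round(domain_fraction_percent(DOMAIN_LENGTH_AA, PROTEIN_LENGTH_AA), 1))     # 53.8
```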
Homology
Paralogs
No paralogs of C1orf52 have been identified in the human genome.[5]
Orthologs
C1orf52 orthologs are found in all common classes of vertebrates: fish, birds, amphibians, reptiles, and mammals. Orthologs are also found in invertebrates, including sponges, a marine tunicate, and an annelid worm. No orthologs were found in insects, fungi, plants, or protists.
The most distantly related orthologs identified belong to the phylum Porifera (sponges).
Common Name | Genus and Species | Date of Divergence from Humans (MYA) | Accession Number | Sequence Length (amino acids) | Sequence Identity to Humans | Sequence Similarity to Humans |
---|---|---|---|---|---|---|
Human | Homo sapiens | 0 | NP_932343.1 | 182 | 100% | 100% |
House Mouse | Mus musculus | 90.5 | NP_079831.1 | 180 | 85.2% | 89.0% |
Chicken | Gallus gallus | 320.5 | NP_001264489.2 | 183 | 63.0% | 71.4% |
Zebrafish | Danio rerio | 429.6 | NP_956836.1 | 214 | 45.9% | 58.3% |
Smalltooth Sawfish | Pristis pectinata | 440 | XP_051869055.1 | 205 | 44.9% | 58.9% |
Deep Sea Sponge | Geodia barretti | 700 | CAI8039110.1 | 221 | 27.1% | 38.1% |
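The table suggests that sequence identity to the human protein falls as divergence time increases. The short Python sketch below checks that trend using only the values in the table. The tuple layout and function name are illustrative assumptions.

```python
# (common name, divergence from humans in MYA, % identity to human), from the table above.
ORTHOLOGS = [
    ("Human", 0.0, 100.0),
    ("House Mouse", 90.5, 85.2),
    ("Chicken", 320.5, 63.0),
    ("Zebrafish", 429.6, 45.9),
    ("Smalltooth Sawfish", 440.0, 44.9),
    ("Deep Sea Sponge", 700.0, 27.1),
]


def identity_decreases_with_divergence(rows) -> bool:
    """Return True if % identity never increases as divergence time grows."""
    ordered = sorted(rows, key=lambda row: row[1])      # sort by divergence date
    identities = [row[2] for row in ordered]
    return all(a >= b for a, b in zip(identities, identities[1:]))


print(identity_decreases_with_divergence(ORTHOLOGS))  # True
```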
References
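1. "C1orf52 protein expression summary - The Human Protein Atlas". www.proteinatlas.org. https://www.proteinatlas.org/ENSG00000162642-C1orf52. Retrieved 2024-09-21.
2. "NCBI (National Center for Biotechnology Information) Gene Entry on C1orf52". https://www.ncbi.nlm.nih.gov/gene/148423.
3. "C1orf52 Gene - Chromosome 1 Open Reading Frame 52". https://www.genecards.org/cgi-bin/carddisp.pl?gene=C1orf52.
4. "NCBI (National Center for Biotechnology Information) Nucleotide Entry on C1orf52". https://www.ncbi.nlm.nih.gov/nuccore/NM_198077.4.
5. "Protein BLAST: search protein databases using a protein query". blast.ncbi.nlm.nih.gov. https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins. Retrieved 2024-09-21.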