Revision as of 03:21, 1 October 2024


C1orf52

Chromosome 1 open reading frame 52 (C1orf52) is a protein in Homo sapiens that is encoded by the C1orf52 gene. C1orf52 is expressed in the cytoplasm and nucleus in most tissues.^[1]

Gene

C1orf52 is located on the minus strand of the short arm of Chromosome 1 at 1p22.3.^[2] Including introns and exons, the gene spans 9,720 base pairs and contains 3 exons.^[3] C1orf52 is located downstream of BCL10.

Transcript

Including untranslated regions, the mRNA is 3,254 base pairs long.^[4] The mRNA contains a short 5' untranslated region of 29 base pairs.

Transcript Variants

A second transcript variant includes an additional exon.^[2] In variant 2, this alternate exon lies in the coding region and causes a frameshift after nucleotide 306, followed by an early stop codon. This transcript does not produce the C1orf52 protein: the predicted product is significantly truncated, and the transcript is a candidate for nonsense-mediated decay.^[2]


Exon (length in nucleotides)	1	2	3	4	Protein Length (amino acids)
Transcript Variant 1	306	-	199	2750	182
Transcript Variant 2	306	127	199	2750	none

No protein isoforms of C1orf52 have been reported.^[5]
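
The frameshift in variant 2 follows from the exon lengths in the table: an inserted exon keeps the downstream reading frame only if its length is a multiple of three. The following is a minimal Python sketch of that check, assuming the table values are exon lengths in nucleotides; the function name `preserves_frame` is illustrative and not taken from any library or cited source.

```python
# Exon lengths in nucleotides, copied from the transcript variant table.
VARIANT_1 = [306, 199, 2750]
VARIANT_2 = [306, 127, 199, 2750]


def preserves_frame(inserted_length: int) -> bool:
    """Return True if inserting this many nucleotides keeps the reading frame."""
    return inserted_length % 3 == 0


# The extra exon accounts for the whole length difference between the variants.
extra_exon = sum(VARIANT_2) - sum(VARIANT_1)

print(extra_exon)                   # 127
print(extra_exon % 3)               # 1, so downstream codons are read one position off
print(preserves_frame(extra_exon))  # False
```

Because 127 leaves a remainder of 1 when divided by 3, every codon after the inserted exon is read out of frame, which is consistent with the early stop codon described for variant 2.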

Protein

General Properties

The primary encoded protein consists of 182 amino acids with a molecular weight of ~20 kDa.^[3] The protein contains a domain of unknown function (DUF4660), also known as pfam15559, that is 98 amino acids long.^[3] The domain is flanked by two disordered regions, which make up most of the rest of the protein. C1orf52 enables RNA binding activity.
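
The stated mass of about 20 kDa is in line with a common rule of thumb of roughly 110 Da per amino acid residue. The sketch below applies that rule to the 182-residue protein and also reports what share of the sequence the 98-residue domain covers. The 110 Da average is a general approximation and an assumption here, not a value from the cited sources, and `estimate_mass_kda` is a hypothetical helper name.

```python
# Rule-of-thumb average mass of one amino acid residue, in daltons.
# This is an approximation and an assumption, not a value from the cited sources.
AVERAGE_RESIDUE_MASS_DA = 110.0

PROTEIN_LENGTH = 182  # amino acids, from the text
DOMAIN_LENGTH = 98    # amino acids in DUF4660, from the text


def estimate_mass_kda(n_residues: int) -> float:
    """Rough protein mass in kDa estimated from residue count alone."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


mass = estimate_mass_kda(PROTEIN_LENGTH)
domain_fraction = DOMAIN_LENGTH / PROTEIN_LENGTH

print(f"{mass:.1f} kDa")          # 20.0 kDa
print(f"{domain_fraction:.1%}")   # 53.8%
```

An exact mass depends on the actual amino acid composition, so this estimate is only a sanity check on the order of magnitude.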

Homology

Paralogs

No paralogs of C1orf52 were identified in the human genome.^[5]

Orthologs

C1orf52 orthologs were found in all common classes of vertebrates: fish, birds, amphibians, reptiles, and mammals. Orthologs were also found in invertebrates, including sponges, a marine tunicate, and an annelid worm. Orthologs were not found in insects, fungi, plants, or protists.

Orthologs of C1orf52 were traced back to the phylum Porifera.


Common Name	Genus and Species	Date of Divergence from Humans (MYA)	Accession Number	Sequence Length (amino acids)	Sequence Identity to Humans	Sequence Similarity to Humans
Human	Homo sapiens	0	NP_932343.1	182	100%	100%
House Mouse	Mus musculus	90.5	NP_079831.1	180	85.2%	89.0%
Chicken	Gallus gallus	320.5	NP_001264489.2	183	63.0%	71.4%
Zebrafish	Danio rerio	429.6	NP_956836.1	214	45.9%	58.3%
Smalltooth Sawfish	Pristis pectinata	440	XP_051869055.1	205	44.9%	58.9%
Deep Sea Sponge	Geodia barretti	700	CAI8039110.1	221	27.1%	38.1%
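
The table shows sequence identity falling as the date of divergence from humans increases. A small Python sketch, using only the values listed in the table, checks that the trend is monotonic across these six species. The variable names are illustrative, and the check says nothing about species that are not listed.

```python
# (common name, divergence from humans in MYA, % sequence identity to human),
# copied from the ortholog table.
ORTHOLOGS = [
    ("Human", 0.0, 100.0),
    ("House Mouse", 90.5, 85.2),
    ("Chicken", 320.5, 63.0),
    ("Zebrafish", 429.6, 45.9),
    ("Smalltooth Sawfish", 440.0, 44.9),
    ("Deep Sea Sponge", 700.0, 27.1),
]

# Order the species from most recently to most anciently diverged.
by_divergence = sorted(ORTHOLOGS, key=lambda row: row[1])
identities = [identity for _, _, identity in by_divergence]

# True when each more distant species is no more identical than the previous one.
monotonic = all(a >= b for a, b in zip(identities, identities[1:]))
print(monotonic)  # True
```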

References

[1] "C1orf52 protein expression summary - The Human Protein Atlas". www.proteinatlas.org. Retrieved 2024-09-21.
[2] "NCBI (National Center for Biotechnology Information) Gene Entry on C1orf52".
[3] "C1orf52 Gene - Chromosome 1 Open Reading Frame 52".
[4] "NCBI (National Center for Biotechnology Information) Nucleotide Entry on C1orf52".
[5] "Protein BLAST: search protein databases using a protein query". blast.ncbi.nlm.nih.gov. Retrieved 2024-09-21.


@@ Line 3: / Line 3: @@
 = C1orf52 =
-'''Chromosome 1 open reading frame 52''', is a [[protein]] in [[Human|''homo sapiens'']], encoded by the C1orf52 [[gene]]. C1orf52 exhibits [[Cytoplasm|cytoplasmic]] and [[Cell nucleus|nuclear]] expression in most tissues.<ref>{{Cite web |title=C1orf52 protein expression summary - The Human Protein Atlas |url=https://www.proteinatlas.org/ENSG00000162642-C1orf52 |access-date=2024-09-21 |website=www.proteinatlas.org}}</ref>
+'''Chromosome 1 open reading frame 52''' is a [[protein]] in [[Human|''Homo sapiens'']], encoded by the C1orf52 [[gene]]. C1orf52 exhibits [[Cytoplasm|cytoplasmic]] and [[Cell nucleus|nuclear]] expression in most tissues.<ref>{{Cite web |title=C1orf52 protein expression summary - The Human Protein Atlas |url=https://www.proteinatlas.org/ENSG00000162642-C1orf52 |access-date=2024-09-21 |website=www.proteinatlas.org}}</ref>
 == Gene ==
-[[File:C1orf52_neighborhood.jpg|thumb|289x289px|C1orf52 gene neighborhood. B-cell lymphoma 10 (BCL10), B-cell lymphoma antisense 1 (BCL-AS1), dimethylarginine dimethylaminohydrolase 1 (DDAH1), and synapse defective Rho GTPase homolog 2 (SYDE2) genes are located in close proximity to C1orf52 on chromosome 1.]]
+[[File:C1orf52_Gene_Neighborhood.jpg|thumb|318x318px|C1orf52 gene neighborhood. B-cell lymphoma 10 (BCL10), B-cell lymphoma antisense 1 (BCL-AS1), dimethylarginine dimethylaminohydrolase 1 (DDAH1), and synapse defective Rho GTPase homolog 2 (SYDE2) genes are located in close proximity to C1orf52 on chromosome 1.]]
-C1orf52 is located on the minus strand of the short arm of [[Chromosome 1]] at 1p22.3.<ref name=":2">{{Cite web |title=NCBI (National Center for Biotechnology Information) Gene Entry on C1orf52 |url=https://www.ncbi.nlm.nih.gov/gene/148423}}</ref> Including [[Intron|introns]] and [[Exon|exons]], the gene is 9,720 [[Base pair|base pairs]] and spans the chromosomal [[Locus (genetics)|locus]] from 85,249,953 to 85,259,672.<ref name=":0">{{Cite web |title=C1orf52 Gene - Chromosome 1 Open Reading Frame 52 |url=https://www.genecards.org/cgi-bin/carddisp.pl?gene=C1orf52}}</ref> The gene contains 3 exons.
+C1orf52 is located on the minus strand of the short arm of [[Chromosome 1]] at 1p22.3.<ref name=":2">{{Cite web |title=NCBI (National Center for Biotechnology Information) Gene Entry on C1orf52 |url=https://www.ncbi.nlm.nih.gov/gene/148423}}</ref> Including [[Intron|introns]] and [[Exon|exons]], the gene is 9,720 [[Base pair|base pairs]] with 3 exons.<ref name=":0">{{Cite web |title=C1orf52 Gene - Chromosome 1 Open Reading Frame 52 |url=https://www.genecards.org/cgi-bin/carddisp.pl?gene=C1orf52}}</ref> C1orf52 is located downstream of BCL10.
 == Transcript ==
-Including [[Untranslated region|untranslated regions]], the mRNA is 3254 base pairs long.<ref>{{Cite web |title=NCBI (National Center for Biotechnology Information) Nucleotide Entry on C1orf52 |url=https://www.ncbi.nlm.nih.gov/nuccore/NM_198077.4}}</ref> The mRNA contains a short 5' untranslated region of 29 base pairs. No [[Protein isoform|isoforms]] of C1orf52 have been reported. <ref name=":1">{{Cite web |title=Protein BLAST: search protein databases using a protein query |url=https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins |access-date=2024-09-21 |website=blast.ncbi.nlm.nih.gov}}</ref>
+Including [[Untranslated region|untranslated regions]], the mRNA is 3254 base pairs long.<ref>{{Cite web |title=NCBI (National Center for Biotechnology Information) Nucleotide Entry on C1orf52 |url=https://www.ncbi.nlm.nih.gov/nuccore/NM_198077.4}}</ref> The mRNA contains a short 5' untranslated region of 29 base pairs.
 === Transcript Variants ===
-There is a [[Alternative splicing|transcript variant]] that includes an additional exon.<ref name=":2" /> This alternate exon in the [[coding region]] in variant 2 results in a [[Frameshift mutation|frameshift]] and early [[stop codon]]. The C1orf52 protein is not formed by this transcript because the product is significantly truncated and the transcript is a candidate for [[nonsense-mediated decay]].<ref name=":2" />
+There is a [[Alternative splicing|transcript variant]] that includes an additional exon.<ref name=":2" /> This alternate exon in the [[coding region]] in variant 2 results in a [[Frameshift mutation|frameshift]] after [[nucleotide]] 306 and an early [[stop codon]]. The C1orf52 protein is not formed by this transcript because the product is significantly truncated and the transcript is a candidate for [[nonsense-mediated decay]].<ref name=":2" />
+{| class="wikitable"
+|+
+!Exons
+!1
+!2
+!3
+!4
+!Protein Length (amino acids)
+|-
+|Transcript Variant 1
+|306
+| -
+|199
+|2750
+|182
+|-
+|Transcript Variant 2
+|306
+|127
+|199
+|2750
+|none
+|}
+No protein [[Protein isoform|isoforms]] of C1orf52 have been reported.<ref name=":1">{{Cite web |title=Protein BLAST: search protein databases using a protein query |url=https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins |access-date=2024-09-21 |website=blast.ncbi.nlm.nih.gov}}</ref>
 == Protein ==
 === General Properties ===
-The primary encoded protein consists of 182 [[Amino acid|amino acids]] with a molecular weight of 20.599 [[Dalton (unit)|kDa]].<ref name=":0" /> The protein contains a [[domain of unknown function]] (DUF4660) from amino acid 30 to 127.<ref name=":0" />
+The primary encoded protein consists of 182 [[Amino acid|amino acids]] with a molecular weight of ~20 [[Dalton (unit)|kDa]].<ref name=":0" /> The protein contains a [[domain of unknown function]] (DUF4660), also known as pFAM15559, that is 98 amino acids long.<ref name=":0" /> The domain of unknown function is flanked by two disordered regions, which make up the majority of the rest of the protein. C1orf52 enables RNA binding activity.
 == Homology ==
@@ Line 26: / Line 50: @@
 === Orthologs ===
+C1orf52 [[Sequence homology|orthologs]] were found in all common classes of [[Vertebrate|vertebrates]]: fish, birds, amphibians, reptiles, and mammals. Orthologs were also found in [[Invertebrate|invertebrates]], including sponges, a marine tunicate, and an annelid worm. Orthologs were not found in insects, fungi, plants, or protists.
-[[Sequence homology|Orthologs]] of C1orf52 were traced back to the [[phylum]] [[Sponge|Porifera]].
+Orthologs of C1orf52 were traced back to the [[phylum]] [[Sponge|Porifera]].
 {| class="wikitable"
 |+
@@ Line 38: / Line 64: @@
 |-
 |Human
-|Homo Sapiens
+|''Homo sapiens''
 |0
 |NP_932343.1
@@ Line 46: / Line 72: @@
 |-
 |House Mouse
-|[[House mouse|Mus musculus]]
+|[[House mouse|''Mus musculus'']]
 |90.5
 |NP_079831.1
@@ Line 54: / Line 80: @@
 |-
 |Chicken
-|[[Red junglefowl|Gallus gallus]]
+|[[Red junglefowl|''Gallus gallus'']]
 |320.5
 |NP_001264489.2
@@ Line 62: / Line 88: @@
 |-
 |Zebrafish
-|[[Zebrafish|Danio rerio]]
+|[[Zebrafish|''Danio rerio'']]
 |429.6
 |NP_956836.1
@@ Line 70: / Line 96: @@
 |-
 |Smalltooth Sawfish
-|[[Smalltooth sawfish|Pristis pectinata]]
+|[[Smalltooth sawfish|''Pristis pectinata'']]
 |440
 |XP_051869055.1
@@ Line 78: / Line 104: @@
 |-
 |Deep Sea Sponge
-|[[Geodia barretti]]
+|[[Geodia barretti|''Geodia barretti'']]
 |700
 |CAI8039110.1