Gene
Entrez ID - the gene ID in the NCBI Gene database.
1278
Gene name - the full gene name approved by the HGNC.
Collagen type I alpha 2 chain
Gene symbol - the official gene symbol approved by the HGNC.
COL1A2
Synonyms (NCBI Gene) - gene synonyms and aliases.
EDSARTH2, EDSCV, OI4
Disease acronyms (UniProt) - disease acronyms from the UniProt database.
EDSARTH2, EDSCV, OI4
Chromosome - chromosome number.
7
Chromosome location - the cytogenetic location of the gene or region on the chromosome.
7q21.3
Summary - summary of the gene provided by NCBI Entrez Gene.
This gene encodes the pro-alpha2 chain of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutatio
SNPs - SNP information provided by dbSNP.
SNP ID      Variation  Clinical significance  Consequence
rs66612022  G>A,T      Pathogenic             Missense variant, coding sequence variant
rs66619856  G>A,T      Pathogenic             Missense variant, coding sequence variant
rs66773001  G>A,T      Likely-pathogenic      Missense variant, coding sequence variant
rs66820119  G>A,C,T    Pathogenic             Splice acceptor variant
rs66883877  G>A,C,T    Likely-pathogenic      Missense variant, coding sequence variant
miRNA - miRNA information provided by the miRTarBase database.
miRTarBase ID  miRNA           Experiments                                       Reference
MIRT001928     hsa-miR-29c-3p  Luciferase reporter assay                         18390668
MIRT000472     hsa-let-7g-5p   qRT-PCR, Luciferase reporter assay, Western blot  20338660
MIRT001928     hsa-miR-29c-3p  Immunohistochemistry, qRT-PCR                     21125666
Transcription factors
Transcription factor  Regulation  Reference
CEBPZ                 Unknown     8910550
CIITA                 Repression  16439692
CIITA                 Unknown     15247294
EP300                 Unknown     24058639
FLI1                  Repression  24058639
Gene ontology (GO) - gene ontology annotations associated with the gene, provided by the GO database.
GO ID       Ontology  Definition                                   Evidence  Reference
GO:0001501  Process   Skeletal system development                  IMP       8841196, 17955022, 18375391
GO:0001568  Process   Blood vessel development                     IMP       17211858
GO:0002020  Function  Protease binding                             IPI       19932771
GO:0005201  Function  Extracellular matrix structural constituent  IBA       21873635
GO:0005201  Function  Extracellular matrix structural constituent  NAS       8982144
Other IDs - unique IDs of the gene in other databases such as OMIM, HGNC, and Ensembl.
MIM     HGNC  Ensembl
120160  2198  ENSG00000164692
Protein
UniProt ID P08123
Protein name Collagen alpha-2(I) chain (Alpha-2 type I collagen)
Protein function Type I collagen is a member of group I collagen (fibrillar forming collagen).
PDB 5CTD, 5CTI, 5CVA, 6JEC
Family and domains

Pfam

Accession  ID        Position in sequence  Description                               Type
PF01391    Collagen  29-82                 Collagen triple helix repeat (20 copies)  Repeat
PF01391    Collagen  88-150                Collagen triple helix repeat (20 copies)  Repeat
PF01391    Collagen  130-207               Collagen triple helix repeat (20 copies)  Repeat
PF01391    Collagen  460-529               Collagen triple helix repeat (20 copies)  Repeat
PF01391    Collagen  601-665               Collagen triple helix repeat (20 copies)  Repeat
PF01391    Collagen  1045-1114             Collagen triple helix repeat (20 copies)  Repeat
PF01410    COLFI     1131-1365             Fibrillar collagen C-terminal domain      Family
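
The Pfam coordinates above lend themselves to a quick consistency check. The following Python sketch uses only the rows listed in this record: it computes the length of each annotated region, confirms that every region fits within the reported sequence length of 1366, and flags regions whose coordinates overlap. The names PFAM_HITS, span_length, out_of_bounds, and overlapping_pairs are illustrative and not part of any database API, and positions are assumed to be 1-based and inclusive.

```python
# Pfam hits as listed in this record: (accession, Pfam ID, start, end).
# Assumption: positions are 1-based and inclusive.
PFAM_HITS = [
    ("PF01391", "Collagen", 29, 82),
    ("PF01391", "Collagen", 88, 150),
    ("PF01391", "Collagen", 130, 207),
    ("PF01391", "Collagen", 460, 529),
    ("PF01391", "Collagen", 601, 665),
    ("PF01391", "Collagen", 1045, 1114),
    ("PF01410", "COLFI", 1131, 1365),
]
SEQUENCE_LENGTH = 1366  # "Sequence length" reported in this record


def span_length(start, end):
    """Number of residues covered by an inclusive [start, end] range."""
    return end - start + 1


def out_of_bounds(hits, seq_len):
    """Return hits whose coordinates fall outside 1..seq_len or are reversed."""
    return [h for h in hits if h[2] < 1 or h[3] > seq_len or h[2] > h[3]]


def overlapping_pairs(hits):
    """Return pairs of hits that share at least one residue position."""
    ordered = sorted(hits, key=lambda h: h[2])
    pairs = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second[2] > first[3]:
                break  # sorted by start, so no later hit can overlap `first`
            pairs.append((first, second))
    return pairs


if __name__ == "__main__":
    for accession, pfam_id, start, end in PFAM_HITS:
        print(accession, pfam_id, f"{start}-{end}", span_length(start, end))
    print("out of bounds:", out_of_bounds(PFAM_HITS, SEQUENCE_LENGTH))
    for first, second in overlapping_pairs(PFAM_HITS):
        print("overlap:", first[2:], second[2:])
```

With the rows as listed, the sketch flags the regions at 88-150 and 130-207 as overlapping. It does not determine whether that overlap reflects the source annotation or an error in this record.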
Tissue specificity Forms the fibrils of tendon, ligaments and bones. In bones the fibrils are mineralized with calcium hydroxyapatite.
Sequence
MLSFVDTRTLLLLAVTLCLATCQSLQEETVRKGPAGDRGPRGERGPPGPPGRDGEDGPTG
PPGPPGPPGPPGLGGNFAAQYDGKGVGLGPGPMGLMGPRGPPGAAGAPGPQGFQGPAGEP
GEPGQTGPAGARGPAGPPGKAGEDGHPGKPGRPGERGVVGPQGARGFPGTPGLPGFKGIR
GHNGLDGLKGQPGAPGVKGEPGAPGENGTPGQTGARGLPGERGRVGAPGPAGARGSDGSV
GPVGPAGPIGSAGPPGFPGAPGPKGEIGAVGNAGPAGPAGPRGEVGLPGLSGPVGPPGNP
GANGLTGAKGAAGLPGVAGAPGLPGPRGIPGPVGAAGATGARGLVGEPGPAGSKGESGNK
GEPGSAGPQGPPGPSGEEGKRGPNGEAGSAGPPGPPGLRGSPGSRGLPGADGRAGVMGPP
GSRGASGPAGVRGPNGDAGRPGEPGLMGPRGLPGSPGNIGPAGKEGPVGLPGIDGRPGPI
GPAGARGEPGNIGFPGPKGPTGDPGKNGDKGHAGLAGARGAPGPDGNNGAQGPPGPQGVQ
GGKGEQGPPGPPGFQGLPGPSGPAGEVGKPGERGLHGEFGLPGPAGPRGERGPPGESGAA
GPTGPIGSRGPSGPPGPDGNKGEPGVVGAVGTAGPSGPSGLPGERGAAGIPGGKGEKGEP
GLRGEIGNPGRDGARGAPGAVGAPGPAGATGDRGEAGAAGPAGPAGPRGSPGERGEVGPA
GPNGFAGPAGAAGQPGAKGERGAKGPKGENGVVGPTGPVGAAGPAGPNGPPGPAGSRGDG
GPPGMTGFPGAAGRTGPPGPSGISGPPGPPGPAGKEGLRGPRGDQGPVGRTGEVGAVGPP
GFAGEKGPSGEAGTAGPPGTPGPQGLLGAPGILGLPGSRGERGLPGVAGAVGEPGPLGIA
GPPGARGPPGAVGSPGVNGAPGEAGRDGNPGNDGPPGRDGQPGHKGERGYPGNIGPVGAA
GAPGPHGPVGPAGKHGNRGETGPSGPVGPAGAVGPRGPSGPQGIRGDKGEPGEKGPRGLP
GLKGHNGLQGLPGIAGHHGDQGAPGSVGPAGPRGPAGPSGPAGKDGRTGHPGTVGPAGIR
GPQGHQGPAGPPGPPGPPGPPGVSGGGYDFGYDGDFYRADQPRSAPSLRPKDYEVDATLK
SLNNQIETLLTPEGSRKNPARTCRDLRLSHPEWSSGYYWIDPNQGCTMDAIKVYCDFSTG
ETCIRAQPENIPAKNWYRSSKDKKHVWLGETINAGSQFEYNVEGVTSKEMATQLAFMRLL
ANYASQNITYHCKNSIAYMDEETGNLKKAVILQGSNDVELVAEGNSRFTYTVLVDGCSKK
TNEWGKTIIEYKTNKPSRLPFLDIAPLDIGGADQEFFVDIGPVCFK
Sequence length 1366
Associated diseases - disease information provided by the ClinVar, GenCC, and GWAS databases.
Unknown
Disease term                           Disease name                                                             Evidence  References  Source
Arthrochalasia Ehlers-Danlos syndrome  Ehlers-Danlos syndrome, arthrochalasia type, 2                           -         -           GenCC
Ehlers-Danlos Syndrome                 Ehlers-Danlos syndrome, cardiac valvular type                            -         -           GenCC
Osteogenesis Imperfecta                Ehlers-Danlos/osteogenesis imperfecta syndrome, osteogenesis imperfecta  -         -           GenCC
Breast Cancer                          Breast Cancer                                                            Importantly, breast cancer patients bearing PRC2 LOF mutations displayed significantly worse prognosis compared with PRC2 wild-type patients  -  GWAS, CBGDA
Associations from Text Mining
Disease Name                      Relationship Type  References
Adenocarcinoma                    Associate          26482433, 36171259
Adenocarcinoma of Lung            Associate          31074380, 37795779
Adenoma                           Associate          30175151
Alopecia                          Associate          16755026
Alzheimer Disease                 Associate          26482433
Amelogenesis Imperfecta Type IB   Associate          8702873
Aneurysm                          Associate          27381111
Aneurysm Ruptured                 Associate          34290266
Anodontia                         Associate          23227268, 32234057
Aortic Valve Disease              Associate          32089075