Gene
Entrez ID (Gene ID in the NCBI Gene database)
1277
Gene name (full gene name approved by the HGNC)
Collagen type I alpha 1 chain
Gene symbol (official gene symbol approved by the HGNC)
COL1A1
Synonyms (gene aliases from NCBI Gene)
CAFYD, EDSARTH1, EDSC, OI1, OI2, OI3, OI4
Disease acronyms (from the UniProt database)
CAFYD, EDSARTH1, OI1, OI2, OI3, OI4
Chromosome
17
Chromosome location (cytogenetic location of the gene or region on the chromosome)
17q21.33
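A cytogenetic location such as the one above packs the chromosome, the arm (p for short, q for long), the band, and an optional sub-band into one string. The following is a minimal sketch of how it could be split into parts; the function name `parse_cytoband` and the returned keys are illustrative assumptions, and the pattern only covers simple single-band locations, not ranges.

```python
import re

# Simple pattern for a single cytogenetic band such as "17q21.33":
# chromosome (1-22, X or Y), arm (p or q), band digits, optional sub-band.
# Ranges such as "17q21.3-q22" are deliberately not handled in this sketch.
CYTOBAND = re.compile(
    r"^(?P<chrom>[1-9]|1[0-9]|2[0-2]|X|Y)"
    r"(?P<arm>[pq])"
    r"(?P<band>\d+)"
    r"(?:\.(?P<sub>\d+))?$"
)


def parse_cytoband(location):
    """Split a cytogenetic location into its parts (hypothetical helper)."""
    match = CYTOBAND.match(location.strip())
    if match is None:
        raise ValueError(f"unrecognized cytogenetic location: {location!r}")
    return {
        "chromosome": match.group("chrom"),
        "arm": match.group("arm"),        # "p" = short arm, "q" = long arm
        "band": match.group("band"),
        "sub_band": match.group("sub"),   # None when no sub-band is given
    }


if __name__ == "__main__":
    print(parse_cytoband("17q21.33"))
```

For this gene the sketch separates chromosome 17, the long arm (q), band 21, and sub-band 33.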
Summary (gene summary from NCBI Entrez Gene)
This gene encodes the pro-alpha1 chains of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutati
SNPs (SNP information from dbSNP)
SNP ID | Variation | Clinical significance | Consequence
rs1800211 | C>T | Likely-pathogenic, uncertain-significance | Coding sequence variant, missense variant, intron variant
rs1800214 | G>A,C,T | Conflicting-interpretations-of-pathogenicity, uncertain-significance | Coding sequence variant, missense variant
rs2586486 | G>A,T | Pathogenic | Coding sequence variant, stop gained, missense variant
rs8179178 | C>A,T | Pathogenic | Coding sequence variant, missense variant
rs34940368 | G>A,C | Likely-benign, pathogenic, benign | Synonymous variant, missense variant, coding sequence variant
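The variation column in the SNP rows above uses a compact notation: the reference allele, a `>` sign, then one or more comma-separated alternate alleles. A minimal sketch of splitting that notation follows; the function name `parse_alleles` is a hypothetical helper, not part of any dbSNP tooling, and it assumes the simple substitution form shown in this table.

```python
def parse_alleles(notation):
    """Split 'REF>ALT1,ALT2' into (reference, [alternates]).

    Hypothetical helper: assumes the simple substitution notation
    used in the SNP table, e.g. "G>A,C,T".
    """
    ref, sep, alts = notation.strip().partition(">")
    if not sep or not ref or not alts:
        raise ValueError(f"unrecognized allele notation: {notation!r}")
    return ref, alts.split(",")


if __name__ == "__main__":
    # Values copied from the table above.
    for snp_id, variation in [("rs1800211", "C>T"), ("rs1800214", "G>A,C,T")]:
        ref, alts = parse_alleles(variation)
        print(snp_id, "reference:", ref, "alternates:", alts)
```

A record with several alternates, such as rs1800214, can carry a different clinical significance per allele, which is one reason to separate them before any further processing.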
miRNA (miRNA information from the miRTarBase database)
miRTarBase ID | miRNA | Experiments | Reference
MIRT000928 | hsa-miR-29c-3p | Luciferase reporter assay | 18390668
Transcription factors
Transcription factor | Regulation | Reference
CIITA | Repression | 16439692
ETS1 | Unknown | 16564026
MKL1 | Activation | 22049076
MYB | Activation | 9989795
MYBL2 | Unknown | 14613485
Gene ontology (GO) (ontology terms associated with the gene, provided by the GO database)
GO ID | Ontology | Definition | Evidence | Reference
GO:0001501 | Process | Skeletal system development | IBA | 21873635
GO:0001501 | Process | Skeletal system development | IMP | 1874719, 8097422, 14976317
GO:0001503 | Process | Ossification | IBA | 21873635
GO:0001568 | Process | Blood vessel development | IBA | 21873635
GO:0001568 | Process | Blood vessel development | IMP | 17211858
Other IDs (unique IDs of the gene in databases such as OMIM, HGNC, and Ensembl)
MIM | HGNC | Ensembl
120150 | 2197 | ENSG00000108821
Protein
UniProt ID P02452
Protein name Collagen alpha-1(I) chain (Alpha-1 type I collagen)
Protein function Type I collagen is a member of group I collagen (fibrillar forming collagen).
PDB 1Q7D, 2LLP, 3EJH, 3GXE, 5CTD, 5CTI, 5CVA, 5CVB, 5K31, 5OU8, 5OU9, 7E7B, 7E7D
Family and domains
Pfam
Accession | ID | Position in sequence (start-end) | Description | Type
PF00093 | VWC | 40-95 | von Willebrand factor type C domain | Family
PF01391 | Collagen | 107-163 | Collagen triple helix repeat (20 copies) | Repeat
PF01391 | Collagen | 177-238 | Collagen triple helix repeat (20 copies) | Repeat
PF01391 | Collagen | 236-295 | Collagen triple helix repeat (20 copies) | Repeat
PF01391 | Collagen | 296-355 | Collagen triple helix repeat (20 copies) | Repeat
PF01391 | Collagen | 356-415 | Collagen triple helix repeat (20 copies) | Repeat
PF01391 | Collagen | 407-476 | Collagen triple helix repeat (20 copies) | Repeat
PF01391 | Collagen | 779-838 | Collagen triple helix repeat (20 copies) | Repeat
PF01391 | Collagen | 835-898 | Collagen triple helix repeat (20 copies) | Repeat
PF01391 | Collagen | 1013-1080 | Collagen triple helix repeat (20 copies) | Repeat
PF01391 | Collagen | 1076-1138 | Collagen triple helix repeat (20 copies) | Repeat
PF01391 | Collagen | 1133-1195 | Collagen triple helix repeat (20 copies) | Repeat
PF01410 | COLFI | 1227-1463 | Fibrillar collagen C-terminal domain | Family
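Several of the Pfam hits listed above share residues with their neighbours (for example 177-238 and 236-295). The sketch below shows one way to compute span lengths and list overlapping neighbours. It assumes the positions are 1-based and inclusive, which the page does not state; the names `DOMAINS`, `span_length`, and `overlapping_pairs` are illustrative only.

```python
# (accession, ID, start, end) copied from the Pfam table above.
# Assumption: coordinates are 1-based and inclusive.
DOMAINS = [
    ("PF00093", "VWC", 40, 95),
    ("PF01391", "Collagen", 107, 163),
    ("PF01391", "Collagen", 177, 238),
    ("PF01391", "Collagen", 236, 295),
    ("PF01391", "Collagen", 296, 355),
    ("PF01391", "Collagen", 356, 415),
    ("PF01391", "Collagen", 407, 476),
    ("PF01391", "Collagen", 779, 838),
    ("PF01391", "Collagen", 835, 898),
    ("PF01391", "Collagen", 1013, 1080),
    ("PF01391", "Collagen", 1076, 1138),
    ("PF01391", "Collagen", 1133, 1195),
    ("PF01410", "COLFI", 1227, 1463),
]


def span_length(start, end):
    """Number of residues in an inclusive span."""
    return end - start + 1


def overlapping_pairs(domains):
    """Return (left_span, right_span) for neighbouring hits that share residues."""
    ordered = sorted(domains, key=lambda d: d[2])
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        if right[2] <= left[3]:  # next hit starts before the previous one ends
            pairs.append(((left[2], left[3]), (right[2], right[3])))
    return pairs


if __name__ == "__main__":
    for accession, name, start, end in DOMAINS:
        print(accession, name, span_length(start, end))
    for left, right in overlapping_pairs(DOMAINS):
        print("overlap:", left, right)
```

Only neighbouring hits are compared after sorting by start position, which is enough for this table because no hit spans more than one of its neighbours.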
Tissue specificity Forms the fibrils of tendon, ligaments and bones. In bones, the fibrils are mineralized with calcium hydroxyapatite.
Sequence
MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDRDVWKPEPCRI
CVCDNGKVLCDDVICDETKNCPGAEVPEGECCPVCPDGSESPTDQETTGVEGPKGDTGPR
GPRGPAGPPGRDGIPGQPGLPGPPGPPGPPGPPGLGGNFAPQLSYGYDEKSTGGISVPGP
MGPSGPRGLPGPPGAPGPQGFQGPPGEPGEPGASGPMGPRGPPGPPGKNGDDGEAGKPGR
PGERGPPGPQGARGLPGTAGLPGMKGHRGFSGLDGAKGDAGPAGPKGEPGSPGENGAPGQ
MGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGPPGFPGAVGAKGEAGPQGP
RGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGANGAPGIAGAPGFPGARGPSGP
QGPGGPPGPKGNSGEPGAPGSKGDTGAKGEPGPVGVQGPPGPAGEEGKRGARGEPGPTGL
PGPPGERGGPGSRGFPGADGVAGPKGPAGERGSPGPAGPKGSPGEAGRPGEAGLPGAKGL
TGSPGSPGPDGKTGPPGPAGQDGRPGPPGPPGARGQAGVMGFPGPKGAAGEPGKAGERGV
PGPPGAVGPAGKDGEAGAQGPPGPAGPAGERGEQGPAGSPGFQGLPGPAGPPGEAGKPGE
QGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGANGAPGNDGAKGDAGAPGAPGS
QGAPGLQGMPGERGAAGLPGPKGDRGDAGPKGADGSPGKDGVRGLTGPIGPPGPAGAPGD
KGESGPSGPAGPTGARGAPGDRGEPGPPGPAGFAGPPGADGQPGAKGEPGDAGAKGDAGP
PGPAGPAGPPGPIGNVGAPGAKGARGSAGPPGATGFPGAAGRVGPPGPSGNAGPPGPPGP
AGKEGGKGPRGETGPAGRPGEVGPPGPPGPAGEKGSPGADGPAGAPGTPGPQGIAGQRGV
VGLPGQRGERGFPGLPGPSGEPGKQGPSGASGERGPPGPMGPPGLAGPPGESGREGAPGA
EGSPGRDGSPGAKGDRGETGPAGPPGAPGAPGAPGPVGPAGKSGDRGETGPAGPTGPVGP
VGARGPAGPQGPRGDKGETGEQGDRGIKGHRGFSGLQGPPGPPGSPGEQGPSGASGPAGP
RGPPGSAGAPGKDGLNGLPGPIGPPGPRGRTGDAGPVGPPGPPGPPGPPGPPSAGFDFSF
LPQPPQEKAHDGGRYYRADDANVVRDRDLEVDTTLKSLSQQIENIRSPEGSRKNPARTCR
DLKMCHSDWKSGEYWIDPNQGCNLDAIKVFCNMETGETCVYPTQPSVAQKNWYISKNPKD
KRHVWFGESMTDGFQFEYGGQGSDPADVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQ
TGNLKKALLLQGSNEIEIRAEGNSRFTYSVTVDGCTSHTGAWGKTVIEYKTTKTSRLPII
DVAPLDVGAPDQEFGFDVGPVCFL
Sequence length 1464
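The collagen triple helix regions of the sequence follow a repeating Gly-X-Y pattern, with glycine at every third residue. As a rough illustration, the sketch below measures how often glycine occurs in each of the three possible reading offsets of a fragment. `FRAGMENT` is one 60-residue row copied from the sequence above; the function name `glycine_fraction_by_frame` is a hypothetical helper.

```python
# One 60-residue row copied from the sequence above (helical region).
FRAGMENT = "PGPPGERGGPGSRGFPGADGVAGPKGPAGERGSPGPAGPKGSPGEAGRPGEAGLPGAKGL"


def glycine_fraction_by_frame(seq):
    """Fraction of glycine (G) at every third position, for offsets 0, 1 and 2."""
    fractions = {}
    for frame in range(3):
        positions = seq[frame::3]  # residues at frame, frame+3, frame+6, ...
        fractions[frame] = positions.count("G") / len(positions)
    return fractions


if __name__ == "__main__":
    fractions = glycine_fraction_by_frame(FRAGMENT)
    best = max(fractions, key=fractions.get)
    print("glycine fraction per offset:", fractions)
    print("offset with the most glycine:", best)
```

An offset whose glycine fraction is close to 1 while the other two stay low is consistent with an uninterrupted Gly-X-Y repeat in that stretch; the non-helical N- and C-terminal regions of the chain would not be expected to show it.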
Associated diseases (disease information from the ClinVar, GenCC, and GWAS databases)
Unknown
Disease term | Disease name | Evidence | References | Source
Ehlers-Danlos Syndrome | Ehlers-Danlos syndrome, classic type, Ehlers-Danlos syndrome | | | GenCC
Osteogenesis Imperfecta | Ehlers-Danlos/osteogenesis imperfecta syndrome | | | GenCC
Classical-Like Ehlers-Danlos Syndrome | Ehlers-Danlos syndrome, classic type, 1 | | | GenCC
Breast cancer | Breast cancer | Importantly, breast cancer patients bearing PRC2 LOF mutations displayed significantly worse prognosis compared with PRC2 wild-type patients | | GWAS, CBGDA
Associations from Text Mining
Disease Name | Relationship Type | References
Abnormalities Drug Induced | Associate | 33261612
Adenocarcinoma | Associate | 30670912, 32255255, 33371142, 36759514
Adenocarcinoma of Lung | Associate | 30732676, 32669531, 33511215, 37189138, 37287976
Aneurysm | Associate | 26918470
Aneurysm Ruptured | Associate | 34290266
Anodontia | Associate | 32234057
Aortic Aneurysm Thoracic | Associate | 37640670
Aortic Valve Insufficiency | Stimulate | 32386768
Arthritis Rheumatoid | Associate | 22736089, 34290266
Asthma | Associate | 32512817