GediPNet

COL4A1 (collagen type IV alpha 1 chain)

Gene
Entrez ID 1282
Gene name Collagen type IV alpha 1 chain
Gene symbol COL4A1
Synonyms BSVD, BSVD1, COL4A1s, PADMAL, RATOR
Chromosome 13
Chromosome location 13q34
Summary
This gene encodes a type IV collagen alpha protein. Type IV collagen proteins are integral components of basement membranes. This gene shares a bidirectional promoter with a paralogous gene on the opposite strand. The protein consists of an amino-terminal 7S domain, a triple-helix forming collagenous domain, and a carboxy-terminal non-collagenous domain. It functions as part of a heterotrimer and interacts with other extracellular matrix components such as perlecans, proteoglycans, and laminins. In addition, proteolytic cleavage of the non-collagenous carboxy-terminal domain results in a biologically active fragment known as arresten, which has anti-angiogenic and tumor suppressor properties. Mutations in this gene cause porencephaly, cerebrovascular disease, and renal and muscular defects. Alternative splicing results in multiple transcript variants. [provided by RefSeq, Dec 2014]
SNPs
SNP ID       Variation  Clinical significance                                        Consequence
rs75711155   A>C        Likely-benign, conflicting-interpretations-of-pathogenicity  Intron variant
rs113994104  C>A,T      Pathogenic                                                   Missense variant, coding sequence variant
rs113994105  C>T        Pathogenic                                                   Genic downstream transcript variant, missense variant, coding sequence variant
rs113994106  C>T        Pathogenic                                                   Genic downstream transcript variant, missense variant, coding sequence variant
rs113994107  C>T        Pathogenic                                                   Genic downstream transcript variant, missense variant, coding sequence variant
miRNA
miRTarBase ID  miRNA           Experiments                                       Reference
MIRT001927     hsa-miR-29c-3p  Luciferase reporter assay, Reporter assay; Other  18390668
MIRT001927     hsa-miR-29c-3p  Sequencing, PAR-CLIP                              20371350
MIRT001927     hsa-miR-29c-3p  PAR-CLIP                                          23592263
MIRT001927     hsa-miR-29c-3p  PAR-CLIP                                          23446348
MIRT001927     hsa-miR-29c-3p  Luciferase reporter assay                         22745231
Transcription factors
Transcription factor  Regulation  Reference
LMX1B                 Unknown     12602071
Gene ontology (GO)
GO ID       Ontology   Definition                                        Evidence  Reference
GO:0001569  Process    Branching involved in blood vessel morphogenesis  IMP       20818663
GO:0005201  Function   Extracellular matrix structural constituent       IBA       21873635
GO:0005201  Function   Extracellular matrix structural constituent       IMP       20818663
GO:0005515  Function   Protein binding                                   IPI       12011424
GO:0005576  Component  Extracellular region                              NAS       14718574
Other IDs
MIM
HGNC
Ensembl
Protein
UniProt ID P02462
Protein name Collagen alpha-1(IV) chain [Cleaved into: Arresten]
Protein function Type IV collagen is the major structural component of glomerular basement membranes (GBM), forming a 'chicken-wire' meshwork together with laminins, proteoglycans and entactin/nidogen. Arresten, comprising the C-terminal NC1 domain, inhibits angiogenesis and tumor formation. The C-terminal half is found to possess the anti-angiogenic activity. Specifically inhibits endothelial cell proliferation, migration and tube formation.
PDB 1LI1, 5NAX, 5NAY, 6MPX
Family and domains

Pfam

Accession  ID        Position in sequence  Description                                              Type
PF01391    Collagen  37-105                Collagen triple helix repeat (20 copies)                 Repeat
PF01391    Collagen  61-138                Collagen triple helix repeat (20 copies)                 Repeat
PF01391    Collagen  97-162                Collagen triple helix repeat (20 copies)                 Repeat
PF01391    Collagen  167-225               Collagen triple helix repeat (20 copies)                 Repeat
PF01391    Collagen  273-334               Collagen triple helix repeat (20 copies)                 Repeat
PF01391    Collagen  474-533               Collagen triple helix repeat (20 copies)                 Repeat
PF01391    Collagen  539-592               Collagen triple helix repeat (20 copies)                 Repeat
PF01391    Collagen  643-690               Collagen triple helix repeat (20 copies)                 Repeat
PF01391    Collagen  689-737               Collagen triple helix repeat (20 copies)                 Repeat
PF01391    Collagen  736-801               Collagen triple helix repeat (20 copies)                 Repeat
PF01391    Collagen  837-896               Collagen triple helix repeat (20 copies)                 Repeat
PF01391    Collagen  882-941               Collagen triple helix repeat (20 copies)                 Repeat
PF01391    Collagen  943-1005              Collagen triple helix repeat (20 copies)                 Repeat
PF01391    Collagen  999-1058              Collagen triple helix repeat (20 copies)                 Repeat
PF01391    Collagen  1057-1117             Collagen triple helix repeat (20 copies)                 Repeat
PF01391    Collagen  1384-1443             Collagen triple helix repeat (20 copies)                 Repeat
PF01413    C4        1446-1553             C-terminal tandem repeated domain in type 4 procollagen  Domain
PF01413    C4        1556-1667             C-terminal tandem repeated domain in type 4 procollagen  Domain
Sequence
MGPRLSVWLLLLPAALLLHEEHSRAAAKGGCAGSGCGKCDCHGVKGQKGERGLPGLQGVI
GFPGMQGPEGPQGPPGQKGDTGEPGLPGTKGTRGPPGASGYPGNPGLPGIPGQDGPPGPP
GIPGCNGTKGERGPLGPPGLPGFAGNPGPPGLPGMKGDPGEILGHVPGMLLKGERGFPGI
PGTPGPPGLPGLQGPVGPPGFTGPPGPPGPPGPPGEKGQMGLSFQGPKGDKGDQGVSGPP
GVPGQAQVQEKGDFATKGEKGQKGEPGFQGMPGVGEKGEPGKPGPRGKPGKDGDKGEKGS
PGFPGEPGYPGLIGRQGPQGEKGEAGPPGPPGIVIGTGPLGEKGERGYPGTPGPRGEPGP
KGFPGLPGQPGPPGLPVPGQAGAPGFPGERGEKGDRGFPGTSLPGPSGRDGLPGPPGSPG
PPGQPGYTNGIVECQPGPPGDQGPPGIPGQPGFIGEIGEKGQKGESCLICDIDGYRGPPG
PQGPPGEIGFPGQPGAKGDRGLPGRDGVAGVPGPQGTPGLIGQPGAKGEPGEFYFDLRLK
GDKGDPGFPGQPGMPGRAGSPGRDGHPGLPGPKGSPGSVGLKGERGPPGGVGFPGSRGDT
GPPGPPGYGPAGPIGDKGQAGFPGGPGSPGLPGPKGEPGKIVPLPGPPGAEGLPGSPGFP
GPQGDRGFPGTPGRPGLPGEKGAVGQPGIGFPGPPGPKGVDGLPGDMGPPGTPGRPGFNG
LPGNPGVQGQKGEPGVGLPGLKGLPGLPGIPGTPGEKGSIGVPGVPGEHGAIGPPGLQGI
RGEPGPPGLPGSVGSPGVPGIGPPGARGPPGGQGPPGLSGPPGIKGEKGFPGFPGLDMPG
PKGDKGAQGLPGITGQSGLPGLPGQQGAPGIPGFPGSKGEMGVMGTPGQPGSPGPVGAPG
LPGEKGDHGFPGSSGPRGDPGLKGDKGDVGLPGKPGSMDKVDMGSMKGQKGDQGEKGQIG
PIGEKGSRGDPGTPGVPGKDGQAGQPGQPGPKGDPGISGTPGAPGLPGPKGSVGGMGLPG
TPGEKGVPGIPGPQGSPGLPGDKGAKGEKGQAGPPGIGIPGLRGEKGDQGIAGFPGSPGE
KGEKGSIGIPGMPGSPGLKGSPGSVGYPGSPGLPGEKGDKGLPGLDGIPGVKGEAGLPGT
PGPTGPAGQKGEPGSDGIPGSAGEKGEPGLPGRGFPGFPGAKGDKGSKGEVGFPGLAGSP
GIPGSKGEQGFMGPPGPQGQPGLPGSPGHATEGPKGDRGPQGQPGLPGLPGPMGPPGLPG
IDGVKGDKGNPGWPGAPGVPGPKGDPGFQGMPGIGGSPGITGSKGDMGPPGVPGFQGPKG
LPGLQGIKGDQGDQGVPGAKGLPGPPGPPGPYDIIKGEPGLPGPEGPPGLKGLQGLPGPK
GQQGVTGLVGIPGPPGIPGFDGAPGQKGEMGPAGPTGPRGFPGPPGPDGLPGSMGPPGTP
SVDHGFLVTRHSQTIDDPQCPSGTKILYHGYSLLYVQGNERAHGQDLGTAGSCLRKFSTM
PFLFCNINNVCNFASRNDYSYWLSTPEPMPMSMAPITGENIRPFISRCAVCEAPAMVMAV
HSQTIQIPPCPSGWSSLWIGYSFVMHTSAGAEGSGQALASPGSCLEEFRSAPFIECHGRG
TCNYYANAYSFWLATIERSEMFKKPTPSTLKAGELRTHVSRCQVCMRRT
Sequence length 1669

© 2021, Biomedical Informatics Centre, NIRRH
ICMR-National Institute for Research in Reproductive Health, Jehangir Merwanji Street, Parel, Mumbai-400012
Tel: +91-22-24192104, Fax No: +91-22-24139412