LAB Assignment#1


COMSATS University Islamabad

Sahiwal Campus

LAB Assignment # 01

Submitted By:
Samina Naz
FA20-BCS-127

Subject:
Introduction to Bioinformatics

Section:
C
Q:1:
Gene Name: keratin 18 [Homo sapiens (human)] ODD Roll Numbers
a) Describe the official name and symbol
Official Gene Symbol: KRT18
Official Gene Name: Keratin 18
KRT18 is the official gene symbol used to represent keratin 18 in scientific
literature and databases. Keratin 18 is a type I keratin protein, which is a
structural protein found in epithelial cells, including those of the liver, intestine,
and pancreas.
b) Download its FASTA Sequence and complete dataset

FASTA Sequence:
>NC_000012.12:52948855-52952906 Homo sapiens chromosome 12, GRCh38.p14 Primary Assembly
ACCTGTCTTCTCCACTGCCTGTACCAGCCCACCTCAGGTGCCTTCTCGCCGGCCTTCCTCACCCACCATG
TCTCGGCAGTCCTCCATCACCTTCCAGTCTGGCAGCCGCAGGGGCTTCAGCACCACCTCGGCCATCACCC
CGGCAGCTGGCCGCTCCCGCTTCAGCTCTGTCTCTGTGGCCCGCTCTGCAGCAGGGAGTGGGGGCCTGGG
AAGGATCAGCAGTGCTGGGGCCAGCTTTGGAAGCCGCAGCCTCTACAACCTGGGGGGTGCCAAGCGGGTC
TCCATCAATGGGTGTGGCAGCAGCTGCCGAAGTGGCTTTGGTGGCAGGGCCAGCAACAGGTTTGGAGTCA
ACAGTGGATTTGGCTATGGGGGTGGAGTTGGAGGAGGCTTCAGTGGCCCCAGCTTCCCCGTGTGTCCCCC
TGGAGGCATCCAAGAGGTCACTGTCAACCAGAGTCTCCTGACTCCTCTTCACCTGCAAATCGACCCCACC
ATCCAGCGGGTGCGGGCCGAGGAGCGCGAGCAGATCAAGACCCTCAACAATAAGTTCGCCTCCTTCATCG
ACAAGGTAAGCAGGGGCTTCATCCACCCCCTTGGGTTTGGGATCAAATAAACTCTTGGAAGGGCCATCCC
ATGGGGGGAGAGCAATAATGCAATGACCCCACTGTGGGAATGAGCACTGTTCAGCACGGGCTCCCAGGGG
CTGAGACCCTTCCAAGTCAGGCCAGCTTGCCCCACAGGACCTGGTAGAAATTTCCTCTCTTTCGGAGCCA
CATGGGCTGGTTCAGTCAACACCAAGGGAAGAGTTTTGTTGATTCTCTACAGGAGAGTTGCTGCTCAGCA
AACTACCCTAACCCAGAGTAGGTGGTGTTGAGAAACTTAACCCAAGAGCAGCTCCCCAACAGAAGCCTCT
AGGCCCCACCACCCGAATCCTATGCAAGCCCTAGGGAACTTTGCGGTAGCTCCATGGACTGCCTCCTTTT
GGGTTGGAGTTGTCAGTTACATTATTGTCAATGGGTGCCAAGTGAAAAAAATATCTTTTTCTTCTCTCCC
TTATTAATAGGAGTGTCTAACTCTGCCTCTCCAACTTCTCAAGGTTTCATTCTCTCTTCTTCCCTCCAGG
TGAGGTTCTTGGAGCAGCAGAACAAGGTCCTGGAGACCAAGTGGGCCCTCCTGCAGGAGCAGGGCTCCAG
GACTGTGAGGCAGAACCTAGAGCCCCTCTTTGATTCCTATACCAGTGAGCTCCGACGGCAGCTGGAAAGC
ATCACCACCGAGAGGGGCAGGCTTGAAGCTGAACTGAGGAACATGCAGGATGTTGTGGAAGATTTCAAAG
TCAGGTAAGTGGGAGACTGGCTTCTGGCCACACACAGCCATCTGAAGGCTCCTTTGTGTGAGGACCAGAG
AGGTGCAAAGGAGCAAATGCCGATATCAGCCGGGAGCTTTGGAACTGCAGCCTTTATCCTGCAAGGTGGA
GACAGCACTGTGTGGGGTAGCACAAGGACCTTCATCTTGTGTATTGCTATGAAGATCCTATTCCTATTTG
TTCACATGACTGCAAAGGGATATCAACATCATCCGACAAAATATTGCTACTCACTCTAGAAATCATAATG
TAATTTCACAGTCAGCATATTGTTTAATTCCATTGGACATGGGCTCAATTATTGAGATGGTTTGCATTTC
CAGGATGGCATTCACTGTAGAGTGAAGAGAGGTAACAAGGAAGAGTTTAAAGGAGGGCAATCTGACTTTT
CTTGTGGGGGGAAACTTTTGACTGCACATCATCCAGGCTGCAGTAGGTGAGGTATCCAGTGGAAAGGGAT
TTGGTCCAGACTAACCCATTAGCCAGCTTGCCCTTATTTCTAAGCTTGAGCTACCCCTAGTTACAAAAAG
CATATTTCTCAGGGGCCACCCTGAGGTTCAGTGAAAATATATCAAAATCTACCCGAAACAGCCTGCTGAA
CAGATAAAGTTTACTCCACGATTTCGGTAGACACTTGAGGTGGAAAAACATTTTAGATTAGCATTGCATA
TCTATCAAAAGATGGAGTTGCATAAGTATATTTTTGTATTCCTTTAAGTTCCATGCTCACCTTCTACTCT
TAGAAAAGTTTATTAAAAACTAGCAGAAGAAATTATCCCCCTTTTATGGAGGGAGAAATAGAGGTTAAAT
GAATCACCCAAAGTCATGCCCAGGTCAGCAGAAGAGCTGGAAATGATTTGAGGGCTCTTAACTCTTGCTA
CACCAGCCTGGAAAGAGTTCAATGCAGACTTGCTCAACAGCACACCCCCCGCTCTGTCCTTATAGGTACG
AAGATGAAATTAACAAGCGCACAGCTGCTGAGAATGAATTTGTAGCCCTGAAAAAGGTGAGTGGGGATGT
TTCTCTCAAAGGAGAAGGTTTAAAATGGAATCTGGAGTGTGGGGTAACCTGACCTCTGACCCTTGGGCCA
CCCAAGAAATGTCACATCAACCTTGAGAATATCTGCAGAGTTCAAACCTCCCAACATGTTCCCTCTACGT
TGATGGCCCCATTAGCCCTACTTGGGTTTCTGTGAGCTCAGGGAATTCAAGCCCCCAGTTCTCCGTAATT
ACCCATCCCCAACCCCAAATCACCCAGACCCAGAGTTTTCTAAAATCCAAACTAGATGGGCTGGGGAGAA
ATCTGCTCAGCTTCTTTTGGACTAGATACTTGGGGCTGCAGACTCAAAGGAGCATCCTGCGCTGTCATTC
CAGGACGTAGATGCTGCCTATATGAACAAGGTGGAGCTGGAAGCCAAGGTCAAATCTCTGCCCGAGGAGA
TCAACTTCATCCACTCAGTCTTTGATGCAGTAAGAGTCTGCAAGTATTTCTGTCTCTCCTAGGTCTGGAG
CCTGGAAAGAAAAGGATGATGCATTGTGCCATTCATTCATTCAGTGCCTCTGCCCAGCATCTTGCTTGGA
TACTTCAAGCTGGGGCTTGGGTGGTAGGGGGACCAGGGAGAACCACTTGGAGCCTTGTCATACTAAACTA
CCCCAGCTCTGATGCTTCTCCCCCGAACTCCCTTATCCCATGGCAGGACAACATCTAACAAGGAGAGAGT
ACTGATTCCCAAATTCTAGGACACTGGGTAATTTCCCAAGGAATCAACACATACCTGCTCCCTTCCCCTT
CCCTGAAAAGTATTAAAAAAAAAAAAAGGCAAGCTGTCCCTCAGCTCATTGGTCAGGGGGCTTCCTACCC
TTTGAAGATGCCTCATTCTGGGCGCTACCCCTCCAAAGGCAGAGACCTGGGTCTGTGGAAAGGGAGAGAG
AGGTAGAAAGTGATGGAGTCACACTGTGCAGGGGGAAGCAGTGTCCCAGATGTCCCCAATGCTCTTAAAG
AAGCCGTTTATGTTGGCATACTGAACAGACACGTGTCACATTCAAATCTATGTAATTCCAATTCAGAGGA
TGCTTACCCACCCTCATGCCTGGGGAAAGCACACAGATGGGAGCCTTGAGAAGTAAGGCTGGGAAAGATT
TTCACCACACTGATCTTTTGGGATGAATGGGAATACCTAGGGAATAGCTGGGAGAATCTGCCAGGAATAC
CAAGGAAAGGATTCCTGACCTAGATGGGTAGTTAACCACGAAGATTTAAGATTCTTCATCTTATGCCTTG
GTGATGCTGAGTTTACTGCCCTGCAGGAGCTGTCCCAGTTGCAGACCCAGGTCGGTGACACATCCGTGGT
GCTGTCCATGGACAACAACCGCAACCTGGACCTGGATAGTATCATCGCCGAGGTCAAAGCACAATACGAG
GACATTGCCAACCGCAGCCGGGCCGAGGCTGAGTCCTGGTACCAGACCAAGGTGAGCATGGACACCTCCA
TGAGAGGTTCCAGGGTTAGTGTTCTCTGAGGCTCCACATTATCACTTAACTCAGCCTCAGGAAACGTGTG
AGCACATTCGTTTATTTCAACTTAGCAGGCATGTCTTTGATGCTATGACAACTTAGCTTGAAATGCATGT
GGAAACCGAACCAGACACACTAATACATGGTCAGCCCAATGCTGGGAGCTCAGGACATCCACTGGCCCCA
CATTCCTCAAGATCTGGGTGGGAGCAGGGTGAGACACCAGGACAACCGAGACACAGTCATGAAGCAGTTT
CTAAAAGGCTTATTTATTCTCTATATATTTTCTGAGCTCCTGCTGTATGCCAATCAGGGTTACAGGGTTG
CAAATAAATAAACTGCAAACAGAGAACCCAAGCTCTGGGAGGCCATGAAGTGAATGGACAATCATGGAAG
GGAAAAGATAGCATGAATAAAAAGCTTCCAGGAAGACATGGGGGCTTTGTACAGTTGGGAAGCCATGAGG
GACAAAAGATTGCTGAGGAGTGGGGAGAGGTTTAAGGCTGAACAAGGAGCTGGCAGGCAAGAACAAGCAA
GGGAGTTTATGTCAGGAAGAGGAAGGCTGGGATAAACACAAACAGCTACTGCCCAGAGCTCAGACAGCCG
CAAAGAAGTTTGGCTTTGCGGGGTACAATTGACCCATGATACCAGCTCCCTGTCAAATCCAGACCCCTCT
TGGGGCAGCTTCTCACCTACGAGCAGGTTCCAACTCTTTCCCTGCTCCATACGTTGCCTCATCCCTTCTG
GTCAGGAGTGTGGTGGAAAGGAAGGTGGGTAGCAGGGACCAGGGTTCCACAGGGCAGAGGCAGCGCCTTG
ACGGTGAAAGGAAACATGATGCACTTAACCCCAAGGTGAAGTGGTTGAAATCGATAGCAAACGATTCCTC
ATGTTGTTGGGTTGTTGCTCCATTTAATCATGTATACCTAGAAGCGGGAACCTGAGCTATTCAGCACTTT
CAAGAACCCCACAGATCTTGACTCTGGCAGGGGATCTCCTTTTGTCAGGGAAAGGTGTAGGTTTCACTTC
AGTCTGCTGGAGGAGACAGGGTGTAATTATTGCTCTTAAATTCACATGTCCTGGATATGCACCATTAGAT
TGAGAACTACCTGAGATTGGGAATACTTTTACAAAGTCTTCAAAATTGTGCCTTCCACAGCCTCTGCACC
ATCCCACACCATTCCCCCATATCTCCTCCCTGTTTCCCCAGCACTGGTGTTTGGAGGGCTACCAAAATTC
ATGGGCACAGTTGGTCTGGATGCACGCTCTGTGACCAGGAACTACCCAGGGACCTTGATCAAATCACCGT
CTCACTCCTAGAACTCACCATGTTCCTTCCCTGACCCAGAGTCTTCATGCAGGCTGTTTCCTTTGTCTGG
AATGTTCTCCCCCACAACTGGTCACTTAGGTCCCTCCTTCTCATCCTTCAGACCACAGTTCAAGCATCTC
CATCTCTGGAGAGACTTCTCTGACCACCACTTCCCACTTCCAAATCTAGGTCAGATTCCTTCATTACACT
CTCCCAGGACCCTGTTCATTTCCTAAGGGCACTTATCTTAGTGTGGAACTATACATTTGATAATATGCTA
ATTCAGTTAATGTCTATGTCCCCCAATAAACTGTAAGCTTCAGGGGGAATGAGTGAATGACCAGGAATGA
ATGAGCCTGCTTGTGGCACCCAGGGTGGGTCTGTGTGCACAGCGAGTGCCTGGGCCAGGCATTTGACTCA
GTGACTGGGTTTGCTCTGGTTCTCTCAGTACGAGGAGCTGCAGGTCACCGCAGGCAGACATGGGGATGAC
CTTCGAAACACCAAACAAGAGATCTCTGAAATGAACCGCATGATCCAGAGGCTGAGAGCTGAGATTGACA
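
The FASTA header above packs the accession and the coordinate range into its first token. As a minimal sketch, assuming the range is 1-based and inclusive, the header can be split into its parts and the expected region length computed. The function names `parse_region_header` and `gc_fraction` are illustrative and not part of any NCBI tool; the demo sequence is just the first 20 bases of the record above.

```python
def parse_region_header(header):
    """Split a header like '>ACC:START-END description' into its parts."""
    token, _, description = header.lstrip(">").partition(" ")
    accession, _, span = token.partition(":")
    start, end = (int(x) for x in span.split("-"))
    return {
        "accession": accession,
        "start": start,
        "end": end,
        "length": end - start + 1,  # assumes inclusive coordinates
        "description": description,
    }


def gc_fraction(sequence):
    """Fraction of G and C bases in a nucleotide string."""
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


header = (">NC_000012.12:52948855-52952906 "
          "Homo sapiens chromosome 12, GRCh38.p14 Primary Assembly")
region = parse_region_header(header)
print(region["accession"], region["length"])

# First 20 bases of the sequence shown in this report (illustrative only).
print(gc_fraction("ACCTGTCTTCTCCACTGCCT"))
```

Comparing the computed length with the number of bases actually downloaded is a quick way to check that the sequence was copied in full.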

Dataset:
a) Explain gene type
Gene type: protein coding.
KRT18 is a protein-coding gene. It encodes keratin 18, a structural protein of
epithelial cells in various tissues, including the liver, intestine, and pancreas.
Its primary function is to produce the protein needed to maintain the structural
integrity of these epithelial cells.

b) Describe its lineage and aliases


Lineage:
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria;
Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae; Homo

Aliases:
Other names or aliases for Keratin 18 include:
CK18: Cytokeratin 18, another name commonly used for Keratin 18. Cytokeratins
are a subgroup of keratins found specifically in epithelial cells.
KRT18E: This is an abbreviation for Keratin 18 Endo, which indicates the
endogenous form of Keratin 18.
c) Summarise its function and expression in your own words
Function:
Involved in the uptake of thrombin-antithrombin complexes by hepatic cells (By
similarity).
When phosphorylated, plays a role in filament reorganization. Involved in the
delivery of mutated CFTR to the plasma membrane. Together with KRT8, is
involved in interleukin-6 (IL-6)-mediated barrier protection.
Expression:
Tissue specificity:
Expressed in colon, placenta, liver and very weakly in exocervix. Increased expression observed in
lymph nodes of breast carcinoma.

d) Describe the location of gene in chromosome


Location: KRT18 gene is found on chromosome 12 at position 12q13.13.

Exon count: KRT18 gene has 8 exons.
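
The cytogenetic position 12q13.13 can be read piece by piece: chromosome 12, long arm (q), region 1, band 3, sub-band 13. A minimal sketch of that decomposition follows; the function `parse_band` is a hypothetical helper written for this report, and the pattern only covers simple positions of this form.

```python
import re


def parse_band(band):
    """Split a band such as '12q13.13' into chromosome, arm, and band parts."""
    match = re.fullmatch(r"(\d+|X|Y)([pq])(\d)(\d)(?:\.(\d+))?", band)
    if match is None:
        raise ValueError(f"unrecognised band: {band!r}")
    chromosome, arm, region, band_no, sub_band = match.groups()
    return {
        "chromosome": chromosome,
        "arm": "long (q)" if arm == "q" else "short (p)",
        "region": region,
        "band": band_no,
        "sub_band": sub_band,  # None when no sub-band is given
    }


print(parse_band("12q13.13"))
```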

e) Name any 10 interacting genes


1. KRT8 (Keratin 8)
2. KRT19 (Keratin 19)
3. KRT7 (Keratin 7)
4. KRT20 (Keratin 20)
5. TP53 (Tumor Protein P53)
6. EGFR (Epidermal Growth Factor Receptor)
7. CASP3 (Caspase 3)
8. CDH1 (Cadherin 1)
9. VIM (Vimentin)
10. FN1 (Fibronectin 1)

f) Give its protein information


Proteomes
Identifier: UP000005640
Component: Chromosome 12
g) Provide information of its reference sequences, its RNA and Protein information
Information about reference sequences, RNA, and protein for Keratin 18 (KRT18)
can be obtained from various databases and resources. Here's a summary of the
information typically available:
Reference Sequences:
Reference sequences for Keratin 18 can be found in databases like GenBank,
RefSeq, and Ensembl. These sequences represent the standard or canonical
sequences for KRT18 in different species, including Homo sapiens (human).

RefSeq status: REVIEWED

The reference sequences usually include the DNA sequence (nucleotide sequence)
encoding KRT18 gene.
RNA Information:
RNA information for Keratin 18 includes mRNA sequences transcribed from the
KRT18 gene. These mRNA sequences can be found in databases such as
GenBank, RefSeq, and Ensembl.
RNA expression data, such as tissue-specific expression patterns, can be obtained
from databases like GTEx (Genotype-Tissue Expression) and TCGA (The Cancer
Genome Atlas).
Protein Information:
Protein information for Keratin 18 includes the amino acid sequence of the KRT18
protein. This sequence can be retrieved from protein databases like UniProt.
Additional information about the protein structure, domains, post-translational
modifications, and functional annotations may also be available in protein
databases and literature.
To access specific information about reference sequences, RNA, and protein for
Keratin 18, you can search these databases using the gene symbol (KRT18) or its
associated identifiers.
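
As one possible way to run such a search programmatically, the sketch below builds a query URL for the NCBI E-utilities `esearch` endpoint against the Gene database. The field tags `[Gene Name]` and `[Organism]` and the `retmode=json` option are assumptions about the query syntax, and `gene_search_url` is an illustrative helper name; the code only builds the URL and does not contact the server.

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def gene_search_url(symbol, organism="Homo sapiens"):
    """Build an esearch URL that looks up a gene symbol in one organism."""
    # Assumed Entrez field tags; adjust if the search returns nothing.
    term = f"{symbol}[Gene Name] AND {organism}[Organism]"
    query = urlencode({"db": "gene", "term": term, "retmode": "json"})
    return EUTILS_BASE + "?" + query


url = gene_search_url("KRT18")
print(url)
```

Opening the printed URL in a browser should list the matching Gene record identifiers, which can then be used to pull the linked RefSeq RNA and protein records.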
Q:2:
a) Protein name of your corresponding gene and its annotation status and score
Protein names
Recommended name: Keratin, type I cytoskeletal 18
Alternative names:
o Cell proliferation-inducing gene 46 protein
o Cytokeratin-18 (CK-18)
o Keratin-18 (K18)

b) Information regarding its name and taxonomy


Recommended Name: Keratin, type I cytoskeletal 18
Alternative Names:
Cell proliferation-inducing gene 46 protein
Cytokeratin-18 (CK-18)
Keratin-18 (K18)
Gene Name:
Name: KRT18
Synonyms: CYK18
ORF Name:
PIG46
Organism Name:
Homo sapiens (Human)
Taxonomic Identifier: 9606 (NCBI Taxonomy)
Taxonomic Lineage:
Cellular organisms > Eukaryota > Opisthokonta > Metazoa > Eumetazoa >
Bilateria > Deuterostomia > Chordata > Craniata > Vertebrata > Gnathostomata >
Teleostomi > Euteleostomi > Sarcopterygii > Dipnotetrapodomorpha > Tetrapoda
> Amniota > Mammalia > Theria > Eutheria > Boreoeutheria > Euarchontoglires >
Primates > Haplorrhini > Simiiformes > Catarrhini > Hominoidea > Hominidae >
Homininae > Homo

c) Involvement of protein in different diseases


Involvement in disease
Cirrhosis (CIRRH) (2 publications)
Note: The disease is caused by variants affecting the gene represented in this entry.
Description: A liver disease characterized by severe panlobular liver-cell swelling
with Mallory body formation, prominent pericellular fibrosis, and marked deposits of
copper. Clinical features include abdomen swelling, jaundice and pulmonary
hypertension.
See also: MIM:215600
d) Details about its binary interactions
Binary interactions

P05783 has binary interactions with 31 proteins

e) Its structural information


Model Confidence:
o Very high (pLDDT > 90)
o Confident (90 > pLDDT > 70)
o Low (70 > pLDDT > 50)
o Very low (pLDDT < 50)

AlphaFold produces a per-residue confidence score (pLDDT) between 0 and 100. Some
regions with low pLDDT may be unstructured in isolation.
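
The confidence bands above can be expressed as a small lookup. This is a minimal sketch: the function name `plddt_band` is illustrative, the example scores are made up rather than taken from the KRT18 model, and the treatment of scores falling exactly on 90, 70, or 50 is an assumption, since the legend does not say which band they belong to.

```python
def plddt_band(score):
    """Map a per-residue pLDDT score (0 to 100) to its confidence band."""
    if not 0 <= score <= 100:
        raise ValueError("pLDDT must lie between 0 and 100")
    if score > 90:
        return "Very high"
    if score > 70:
        return "Confident"
    if score > 50:
        return "Low"
    return "Very low"  # boundary values fall into the lower band here


# Hypothetical scores, for illustration only.
for score in (95.2, 82.0, 61.5, 30.1):
    print(score, plddt_band(score))
```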

f) Domains and their description


Keratin 18 (KRT18) is a protein that belongs to the keratin family,
specifically type I cytoskeletal keratins. These proteins are structural
components of epithelial cells and play a crucial role in maintaining cell
integrity and providing mechanical strength.
The domain structure of keratin 18 typically includes:

Head domain: Located at the N-terminus, this domain is involved in
interactions with other keratin proteins and cytoskeletal elements.
Central α-helical rod domain: This domain comprises the majority of the protein
and consists of multiple α-helical segments that form coiled-coil structures. These
coiled-coil regions facilitate the assembly of keratin filaments, which provide
structural support to cells.
Tail domain: Found at the C-terminus, this domain is involved in regulating
filament formation and interactions with other cellular proteins.
Keratin 18 is primarily expressed in single-layered epithelial tissues, such as those
lining the gastrointestinal tract, liver, pancreas, and lung. It contributes to the
mechanical stability of these tissues and is involved in various cellular processes,
including cell signaling, apoptosis, and response to stress.

Mutations or abnormalities in keratin 18 can lead to disorders affecting epithelial
tissues, such as liver diseases like cirrhosis and certain types of cancer.
Understanding the domain structure and function of keratin 18 is crucial for
elucidating its role in normal cellular physiology and disease pathology.

g) Download its protein sequence
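
One way to retrieve the sequence is through the UniProt REST service, using the accession P05783 given in the binary interactions answer. The sketch below is a suggestion rather than the method used for this assignment; `uniprot_fasta_url` and `fetch_fasta` are illustrative helper names, and `fetch_fasta` needs network access, so it is defined but not called here.

```python
from urllib.request import urlopen


def uniprot_fasta_url(accession):
    """Return the UniProt REST URL for one entry in FASTA format."""
    return f"https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def fetch_fasta(accession, timeout=30):
    """Download the FASTA text for one accession (requires network access)."""
    with urlopen(uniprot_fasta_url(accession), timeout=timeout) as response:
        return response.read().decode("utf-8")


print(uniprot_fasta_url("P05783"))
```

Calling `fetch_fasta("P05783")` on a connected machine should return the header line followed by the amino acid sequence, which can be saved to a `.fasta` file.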
