Bioinformatics Assignment 1: Accessing NCBI Databases: International University - VNU HCMC School of Biotechnology


INTERNATIONAL UNIVERSITY – VNU HCMC

SCHOOL OF BIOTECHNOLOGY

BIOINFORMATICS

Assignment 1:

ACCESSING NCBI DATABASES

Student names:

Nguyễn Hà Vân Anh BTBTIU18004

Văn Thị Ngọc Dung BTBTIU18047

Nguyễn Ngọc Phương Uyên BTBTIU18269

Trần Hoàng Phương Trinh BTBTIU18252

Date of submission: Sat, Oct 17, 2020

Contents

Question 1: Search Taxonomy database

Question 2: Use the name “plague thrips” to search the Nucleotide database

Question 3: Search PubMed for “Thanh NM” (International University)

Question 4: Search Genome database for Homo sapiens

Question 5: Use accession number “CU329670” to search the Nucleotide database


Question 1: Search Taxonomy database

1) Homo sapiens

2) Heterodoxus macropus

3) E. coli

a. What is the common name of the species?


The common name for Homo sapiens is “human”.

Figure 1.1A: Information for Homo sapiens from NCBI taxonomy database

The common name for Heterodoxus macropus is “wallaby louse”.

Figure 1.1B: Information for Heterodoxus macropus from NCBI taxonomy database

Figure 1.1C: Results for “E.coli” from NCBI

Escherichia coli is commonly referred to as “E. coli”.

b. How many nucleotide or protein sequence records do you find (show your search
results in cropped windows)?

Figure 1.2A: Search results for nucleotide sequence in Homo sapiens

There are 27,672,287 nucleotide sequence records for Homo sapiens.

Figure 1.2B: Search results for protein sequence in Homo sapiens

There are 1,423,829 protein sequence records for Homo sapiens.

Figure 1.2C: Search results for nucleotide sequence in Heterodoxus macropus

There are 2 nucleotide sequence records and 26 protein sequence records for Heterodoxus macropus.

Figure 1.2D: Search results for nucleotide sequence in E. coli

There are 8,458,194 nucleotide sequence records for E. coli.

Figure 1.2E: Search results for protein sequence in E. coli

There are 60,512,147 protein sequence records for E. coli.
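
The counts above were read from the NCBI web pages. The same kind of count could probably also be requested through NCBI's E-utilities (ESearch). The sketch below only builds the request URL and shows how the count might be read from a JSON reply. The helper names `esearch_count_url` and `fetch_count`, and the `[Organism]` form of the query, are choices made for this illustration and are not part of the assignment. Counts returned today will differ from those recorded in October 2020.

```python
import json
import urllib.parse
import urllib.request

# Base address of the ESearch E-utility
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_count_url(db, organism):
    """Build an ESearch URL that asks only for the number of matching records."""
    params = {
        "db": db,                         # e.g. "nuccore" (Nucleotide) or "protein"
        "term": f"{organism}[Organism]",  # assumed query form: restrict by organism
        "rettype": "count",
        "retmode": "json",
    }
    return EUTILS_ESEARCH + "?" + urllib.parse.urlencode(params)


def fetch_count(db, organism):
    """Send the request and return the record count (requires network access)."""
    with urllib.request.urlopen(esearch_count_url(db, organism)) as reply:
        data = json.load(reply)
    return int(data["esearchresult"]["count"])


if __name__ == "__main__":
    # Only print the URLs here; call fetch_count() to contact NCBI.
    for name in ("Homo sapiens", "Heterodoxus macropus", "Escherichia coli"):
        print(name, esearch_count_url("nuccore", name))
```

Passing `"protein"` instead of `"nuccore"` would give the corresponding protein query.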

Question 2: Use the name “plague thrips” to search the Nucleotide database.

a. What is the scientific name of the plague thrips?

Fig.2.1. Scientific name of plague thrips


Selecting one of the search results shows that the scientific name of the plague thrips is Thrips imaginis.

b. How many sequence records do you find?

Fig.2.2. Sequence search results

Searching the Nucleotide database with the keyword “plague thrips” returned 16 sequence records.

c. Which genes or genomes of the plague thrips have been sequenced?
- Mitochondrial genome:

Fig.2.3. Sequenced Thrips imaginis mitochondrial genome

- Elongation factor 1 alpha (EF1a) gene:

Fig.2.4. Sequenced Thrips imaginis EF1a gene

- 5.8S ribosomal RNA gene and internal transcribed spacer 2 (ITS2):

Fig.2.5. Sequenced Thrips imaginis 5.8S rRNA gene and ITS2

- 28S ribosomal RNA gene:

Fig.2.6. Sequenced Thrips imaginis 28S ribosomal RNA gene

- 18S ribosomal RNA gene:

Fig.2.7. Sequenced Thrips imaginis 18S ribosomal RNA gene

- Histone 3 (H3) gene:

Fig.2.8. Sequenced Thrips imaginis Histone 3 gene


d. Provide information of the most recent publication that reported the
mitochondrial genome of the plague thrips including the authors, year and title of
the publication, title of the journal, volume, and page numbers.
Thrips imaginis mitochondrion, complete genome – ID: AF335993.2

Fig.2.9. Sequenced mitochondrial genome with publications

- The most recent publication: Reference 2 (bases 1 to 15407)
o Authors: Shao, R. and Barker, S.C.
o Title: The highly rearranged mitochondrial genome of the plague thrips, Thrips imaginis (Insecta: Thysanoptera): convergence of two novel gene boundaries and an extraordinary arrangement of rRNA genes
o Year: 2003
o Journal: Mol. Biol. Evol., volume 20, pages 362-370

Question 3: Search PubMed for “Thanh NM” (International University).

a. How many publications of Thanh NM were deposited in PubMed?

There are 2 publications by Thanh NM deposited in PubMed.

b. List the common names of 2 aquatic animals that Thanh NM worked on.

- Striped catfish (Pangasianodon hypophthalmus) 

- Giant freshwater prawn (Macrobrachium rosenbergii)

c. Provide information of publication by Thanh NM: year and title of the publication,
title of the journal, volume and page numbers.

- Publication 1:

o Title of the publication: A transcriptomic analysis of striped catfish (Pangasianodon hypophthalmus) in response to salinity adaptation: De novo assembly, gene annotation and marker discovery.
o Year: 2014
o Title of the journal: Comparative Biochemistry and Physiology. Part D, Genomics & Proteomics
o Page numbers: 52-63
o Volume: 10

Figure 3.1: 1st publication by Thanh NM

- Publication 2:

o Title of the publication: Optimizing de novo transcriptome assembly and extending genomic resources for striped catfish (Pangasianodon hypophthalmus).
o Year: 2015
o Title of the journal: Marine Genomics
o Page numbers: 87-97
o Volume: 23

Figure 3.2: 2nd publication by Thanh NM


- Publication 3:

o Title of the publication: A candidate gene association study for growth performance in an improved giant freshwater prawn (Macrobrachium rosenbergii) culture line.
o Year: 2014
o Title of the journal: Marine Biotechnology (New York, N.Y.)
o Page numbers: 161-180
o Volume: 16

Figure 3.3: 3rd publication by Thanh NM

Question 4: Search Genome database for Homo sapiens.

a. How many records did your search find?

The search of the Genome database for Homo sapiens displayed the 20 most recent records.

Figure 4.1: Homo sapiens genome records

b. Provide the GenBank accession number for the chromosome 1 of Homo sapiens, the
size of the chromosome 1.

- The accession number of chromosome 1 of Homo sapiens in GenBank: CM000663.2

- The size of chromosome 1: 248.96 Mb (248956422 bp)
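
The Mb value above follows from the base-pair count by dividing by one million and rounding to two decimals. A minimal sketch of that conversion, where `bp_to_mb` is a helper name invented for this example:

```python
def bp_to_mb(length_bp, digits=2):
    """Convert a length in base pairs to megabases (1 Mb = 1,000,000 bp)."""
    return round(length_bp / 1_000_000, digits)


chr1_bp = 248_956_422  # chromosome 1 length reported above
print(f"{bp_to_mb(chr1_bp)} Mb")  # prints "248.96 Mb"
```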

Figure 4.2: Chromosome 1 in the Homo sapiens genome


c. Provide information of the most recent publication that reported the chromosome 1
including the authors, year and title of the publication, title of the journal, volume and
page numbers.

- The most recent publication that reported chromosome 1 is “The DNA sequence and biological annotation of human chromosome 1”.
AUTHORS Gregory, S., Barlow, K., McLay, K. et al.
TITLE The DNA sequence and biological annotation of human chromosome 1
JOURNAL Nature, volume 441 (issue 7091), pages 315-321 (2006)

Question 5: Use accession number “CU329670” to search the Nucleotide database.

a. What is the type of sequence? What is the length of sequence? What is the name of
database division?

- Type of sequence: Chromosome I, complete sequence (DNA linear)

- Length of sequence: 5579133 bp

- Name of database division: PLN (the GenBank division for plant, fungal, and algal sequences)
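
The three answers above are all read from the first (LOCUS) line of the GenBank record. As a rough sketch, such a line can be split into its fields as shown below. The function name `parse_locus` is made up for this example, and the sample line is reconstructed from the values listed above rather than copied from the record: the modification date is left out, and the three-letter division code PLN is an assumption based on the organism being a fungus.

```python
def parse_locus(line):
    """Split a GenBank nucleotide LOCUS line into its whitespace-separated fields."""
    fields = line.split()
    if len(fields) < 7 or fields[0] != "LOCUS" or fields[3] != "bp":
        raise ValueError("not a nucleotide LOCUS line in the expected layout")
    return {
        "name": fields[1],
        "length_bp": int(fields[2]),
        "molecule": fields[4],   # e.g. DNA
        "topology": fields[5],   # linear or circular
        "division": fields[6],   # three-letter GenBank division code
    }


# Illustrative line only: date omitted, division code assumed (see note above)
sample = "LOCUS       CU329670             5579133 bp    DNA     linear   PLN"
info = parse_locus(sample)
print(info["length_bp"], info["molecule"], info["topology"], info["division"])
```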

Figure 5.1: Nucleotide database record for accession “CU329670”

b. What is the scientific name of organism?

- The scientific name of organism: Schizosaccharomyces pombe.

c. Name the protein product of the CDS and the length of protein.

- The name of protein product: RecQ type DNA helicase

- Length of the protein: 1887 aa

Figure 5.2: Length of CDS product

d. Write the first four amino acids.

- The first four amino acids are Methionine, Valine, Valine and Alanine.

Figure 5.3: Product of CDS and the first four amino acids

e. Write the nucleotide sequence of the coding strand that corresponds to these amino
acids.

The first four amino acids are Methionine, Valine, Valine and Alanine.

Amino acid                        Methionine  Valine  Valine  Alanine
Nucleotide sequence (coding)      ATG         GTC     GTC     GCT
Nucleotide sequence (non-coding)  TAC         CAG     CAG     CGA
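
The codon assignments above can be checked with a short script. The sketch below builds the standard genetic code table, translates the coding-strand codons, and derives the non-coding row as the base-by-base complement (so that row reads 3'→5' when the coding row is written 5'→3'). The names `CODON_TABLE`, `translate`, and `complement` are chosen for this illustration only.

```python
# Standard genetic code, with codons ordered by base T, C, A, G at each position
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(coding_dna):
    """Translate a coding-strand DNA sequence (5'->3') into one-letter amino acids."""
    usable = len(coding_dna) - len(coding_dna) % 3  # ignore an incomplete last codon
    return "".join(CODON_TABLE[coding_dna[i:i + 3]] for i in range(0, usable, 3))


def complement(dna):
    """Return the base-by-base complement (not reversed) of a DNA sequence."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))


coding = "ATGGTCGTCGCT"     # ATG GTC GTC GCT from the table above
print(translate(coding))    # MVVA = Met, Val, Val, Ala
print(complement(coding))   # TACCAGCAGCGA = TAC CAG CAG CGA
```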
