

Course Name: Seminars in Bioinformatics        Bioinformatics Program
Course Code: BNF412                            4th Year
Faculty of Computers and Information           February 14, 2021
Assiut University                              Question Bank

Answer the following questions:


I. Choose the correct answer for each of the following:
1. Next-generation sequencing (NGS) technologies include:
a. Illumina sequencing
b. 454 pyrosequencing
c. SOLiD sequencing
d. All of the above
2. Next-generation sequencing is defined as:
a. A DNA sequencing technology that has revolutionized biological
research
b. Very short sequences
c. Motifs
d. Domains
3. Single nucleotide polymorphisms are defined as:
a. Common protein families
b. The most common type of genetic variation among people
c. Common protein structures
d. Common gene functions
4. Transcriptome assembly of mRNA is defined as:
a. A powerful tool for detecting variations in gene expression and sequences
between conditions, tissues, or strains/species for both model and non-model
organisms
b. Variations in gene function
c. Variations in metabolic pathway
d. All of the above
5. Advantages of SOLiD sequencing include:
a. High throughput
b. Low cost per base
c. High accuracy when reference genome is available (resequencing)
d. All of the above
6. Disadvantages of PacBio Sequencing include:
a. Long fragments
b. High error rate (but random errors)
c. Average length 1000 bases
d. No PCR amplification step (no PCR bias)


7. Metagenomics is defined as:
a. A molecular tool
b. Environmental tool
c. A web platform
d. A molecular tool used to analyze DNA acquired from environmental samples,
in order to study the community of microorganisms present, without the
necessity of obtaining pure cultures.
8. Metagenomics approaches include:
a. Function-based approaches
b. Sequence-based approaches
c. Both a and b
d. Microarray gene expression approaches

9. An operational taxonomic unit (OTU) is defined as:
a. A function-based approach
b. A human microbiome
c. A definition used to classify groups of closely related individuals
d. A virus
10. Metagenomics data analysis includes:
a. Assembly (contigs)
b. Gene prediction
c. Count matrix calculation
d. All of the above
11. What are the correct keywords for the following abstract:
Abstract:
Human microbiota plays a key role in human health and growing evidence supports the
potential use of microbiome as a predictor of various diseases. However, the high-
dimensionality of microbiome data, often in the order of hundreds of thousands, yet low
sample sizes, poses great challenge for machine learning-based prediction algorithms.
This imbalance induces the data to be highly sparse, preventing from learning a better
prediction model. Also, there has been little work on deep learning applications to
microbiome data with a rigorous evaluation scheme. To address these challenges, we
propose DeepMicro, a deep representation learning framework allowing for an effective
representation of microbiome profiles. DeepMicro successfully transforms high-
dimensional microbiome data into a robust low-dimensional representation using various
autoencoders and applies machine learning classification algorithms on the learned
representation. In disease prediction, DeepMicro outperforms the current best approaches
based on the strain-level marker profile in five different datasets. In addition, by
significantly reducing the dimensionality of the marker profile, DeepMicro accelerates
the model training and hyperparameter optimization procedure with 8X–30X speedup
over the basic approach.
a. Human microbiota, machine learning-based prediction, DeepMicro, deep learning
representation, high dimensionality, and imbalance
b. Human microbiota, Next generation sequencing, data mining, hyperparameter,
and speedup
c. Human microbiota, Artificial intelligence, acceleration, model training, and model
building
d. Model training, model building, dimensionality reduction, artificial intelligence,
and marker profile
12. Suppose you are looking for a specific drug that has been discovered for curing
COVID-19. Which databases can you use?
a. PubMed and the BLAST tool
b. PubMed will be enough
c. PubMed and PDB
d. PubMed to mine the published literature, and DrugBank to check the chemical
compounds and side effects of a suggested drug

13. When you apply a data mining algorithm to literature databases to find suggested
drugs for COVID-19, you perform:
a. Text mining
b. Graph mining
c. Data visualization
d. On Hadoop framework
14. When you perform data analysis on biological networks, you perform:
a. Text mining
b. Graph Mining
c. Master-slave architecture
d. A Gap analysis
15. Unstructured data in life sciences means:
a. Data not in a specific format, such as text data
b. Data in relational databases
c. Data in Excel sheets
d. Data in object-oriented databases
16. NGS provides a paradigm shift from:
a. Complete genomes
b. RNA-seq
c. Single transcripts to whole transcriptomes
d. Complex pools


17. Resequencing means:
a. Low quality reference sequence available
b. SNP detection (one gene)
c. Genomic rearrangements
d. None of the above
18. Mass spectrometry is defined as:
a. An analytical technique that measures the mass-to-charge ratio of ions.
b. Measurements of microarray gene expression
c. Prediction of 3D protein structure
d. Prediction of protein-protein interaction
19. High throughput RNA sequencing is defined as:
a. A tool for analyzing under-expressed genes
b. A powerful tool that allows us to perform gene prediction and analyze tissue-
specific overexpression of genes
c. A tool for analyzing proteomics data
d. A data mining algorithm for performing text mining analysis
20. Advantages of 454 sequencing include:
a. Long reads 400-750 bp
b. Low throughput (compared to Illumina)
c. Relatively high running costs
d. None of the above

II. State whether each of the following statements is true or false:

1. A SNP may replace the nucleotide cytosine (C) with the nucleotide thymine (T) in a
certain stretch of DNA. ( )
2. With SNPs, we can search the genome for small variations ( )
3. Indel is classified among small genetic variations, measuring from 1 to 3000 base pairs in
length ( )
4. For NGS data analysis, genome annotation is performed as de novo DNA sequencing ( )
5. For NGS data processing with a reference genome, four main phases are required:
input data, genome resequencing, RNA-seq, and metagenomic sequencing. ( )
6. The main steps for input data in NGS data processing with a reference genome are:
preprocessing, alignment, and statistical inference. ( )
7. In performing metagenomic sequencing for NGS data processing, we perform
within-sample normalization ( )
8. Bacteria are not present in every habitat on Earth ( )
9. Microbes are necessary for every human ( )
10. The microbiome is also called the microbiota ( )
11. Microbiomes are essential communities of organisms ( )
12. The microbiome is essential for human development, immunity and nutrition. ( )


13. We study metagenomics to discover the homogeneity of life ( )


14. A review article surveys and summarizes previously published studies, rather than
reporting new facts or analysis. ( )
15. An analytical article presents some type of argument, or claim, about what you are
analyzing. ( )
16. In the introduction of the paper, you analyze your results and comment on them ( )
17. In the Conclusion of the paper, you use past tense and explain what you performed during
your analysis ( )
18. The abstract of the paper explains the future work of your paper ( )
19. When writing references, you only have one style to follow ( )
20. The keywords in a paper appear after the references section ( )

Good luck

Prof. Taysir Hassan A. Soliman
