Bioinformatics: Introduction and Methods

Week 3

Sequence Database Search


Due date: Nov 9, 7:59 AM CET
Attempts: 3 every 8 hours
To pass: 80% or higher
Grade: 20%
We keep your highest score
Graded Quiz • 30 min

Total points 5
Question 1

Which one of the following statements about the BLAST E-value is not correct?

1 point

When it is fixed, the corresponding p-value is fixed as well.

It can be larger than 1.

It indicates how much the corresponding hit can be trusted.

When it is near 1, it is nearly identical to its corresponding p-value.

It depends on both the length of the query sequence and the total length of the sequences in the database.

Question 2

Which one of the following options cannot reduce the false positives of BLAST?

1 point

Build an index for the database ahead of time

Discard isolated hits and keep only those hits that form hit clusters

Mask the repetitive low-complexity regions

Use the E-value to evaluate the statistical significance of alignments

Question 3

Which one of the following options cannot improve the speed of BLAST? Note that the improvement need not be significant compared with earlier pairwise sequence alignment algorithms.

1 point

Use shorter seed words

Keep only those neighborhood words that are highly similar to the current seed word

Build an index for the database ahead of time

Do not compute the p-value; compute only the E-value

Discard isolated hits and keep only those hits that form hit clusters

Mask the repetitive low-complexity regions of the database before using it in BLAST

Question 4

Given the following protein sequence, run BLAST to find similar protein sequences:

>Protein Sequence
MVRAPCCEKMGLKKGPWTPEEDQILISYIQSNGHGNWRALPKLAGLLRCGKSCRLRWTNYLRPDIKRGNFTREEEDSIIQ
LHEMLGNRWSAIAARLPGRTDNEIKNVWHTHLKKRLKNYQPPQSSKRHSKNKDSKAPCTSQIALKSSNNFSNIKEDGPGL
GSGPNSPQLSSSEMSTVTADSLAVTMDISNSNDQIDSSENFIPEIDESFWTDGLSTSGGGEELQVQFPFHDMKQENVEKD
VGAKLEDDMDFWYSVFIKSGDLLELPEF

BLAST: http://blast.ncbi.nlm.nih.gov

Parameters:

• Database: Non-redundant protein sequences (nr)
• Algorithm: blastp
• Word size: 3
• Matrix: BLOSUM62
• Gap Costs: Existence: 11 Extension: 1

Leave all other parameters at their defaults.

Q: Which program listed on the BLAST homepage should you use for this analysis?

1 point

nucleotide blast

protein blast

blastx

tblastn

tblastx
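
The search settings listed in Question 4 could also be submitted from a script instead of the web form. The following is a minimal sketch, assuming Biopython is installed and that its `NCBIWWW.qblast` function accepts these settings under the names `word_size`, `matrix_name`, and `gapcosts` (with the gap costs written as `"11 1"`). The helper names `to_fasta` and `run_search` are illustrative and not part of the assignment.

```python
# Query sequence from Question 4, in the fragments shown in the question text.
QUERY_FRAGMENTS = [
    "MVRAPCCEKMGLKKGPWTPEEDQILISYIQSNGHGNWRALPKLAGLLRCGKS",
    "CRLRWTNYLRPDIKRGNFTREEEDSIIQ",
    "LHEMLGNRWSAIAARLPGRTDNEIKNVWHTHLKKRLKNYQPPQSSKRHSKN",
    "KDSKAPCTSQIALKSSNNFSNIKEDGPGL",
    "GSGPNSPQLSSSEMSTVTADSLAVTMDISNSNDQIDSSENFIPEIDESFWTDGLS",
    "TSGGGEELQVQFPFHDMKQENVEKD",
    "VGAKLEDDMDFWYSVFIKSGDLLELPEF",
]
QUERY = "".join(QUERY_FRAGMENTS)

# Search settings from the question. The keyword names and the "11 1"
# gap-cost format are assumptions about Biopython's qblast interface.
SETTINGS = {
    "program": "blastp",
    "database": "nr",
    "word_size": 3,
    "matrix_name": "BLOSUM62",
    "gapcosts": "11 1",  # existence 11, extension 1
}


def to_fasta(name, sequence, width=80):
    """Format a sequence as FASTA text with fixed-width lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


def run_search(fasta_text):
    """Submit the query to NCBI BLAST (needs Biopython and network access)."""
    # Imported here so the rest of the sketch runs without Biopython.
    from Bio.Blast import NCBIWWW

    return NCBIWWW.qblast(
        SETTINGS["program"],
        SETTINGS["database"],
        fasta_text,
        word_size=SETTINGS["word_size"],
        matrix_name=SETTINGS["matrix_name"],
        gapcosts=SETTINGS["gapcosts"],
    )
```

Calling `run_search(to_fasta("Protein Sequence", QUERY))` would send the request to NCBI and return a handle to the result. Nothing is submitted until that call is made, and the hits returned can change as the nr database is updated.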

Question 5

In the BLAST result from Question 4, which species does the sequence with the highest similarity score come from?

1 point

Capsicum annuum (pepper)

Datura metel (devil's trumpet)

Petunia x hybrida (petunia)

Solanum lycopersicum (tomato)
