Practical Outline With Solutions

Intro
This practice session focuses on applying some of the theoretical work covered in previous
courses. The questions asked here closely resemble the exam questions, although the exam
will be supplemented with questions that involve correctly interpreting course material for a
practical application. If you understand the questions in this practical, this should be no
problem.
The use of ChatGPT is absolutely prohibited during the exam!
In this practical session, we will run through two separate cases. Each case involves a
specific protein, and several point mutations in that protein.
Topic 1: Amino Acids

Purely based on the physicochemical properties of the amino acids, would you expect the
following mutations to be detrimental?
● K → D Large positively charged to small negatively charged sidechain, likely
detrimental.
● L → V Small hydrophobic to small hydrophobic, likely not detrimental.
● R → K Large positive charge to large positive charge, likely not detrimental.
● W → V Large hydrophobic (aromatic) to small hydrophobic, likely detrimental.
● N → Q Very similar residues, both polar, Q has a slightly longer sidechain: likely not
detrimental.
● C → H Cys has SH in the sidechain and can make disulfide bridges. Replacing a Cys
might break a disulfide bridge. Histidine has a completely different sidechain with an
imidazole ring, which is larger and sometimes carries a charge: likely detrimental.
Sidenote: obviously the answers above are very crude estimations based on the residues'
general properties. Of course, even if properties are similar overall, it is still possible small
differences have large structural consequences, which is very dependent on the structural
context. In this exercise, we simply want you to consider general amino acid properties and
compare these.
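The comparison above can be sketched in a few lines of Python. This is only an illustration: the `PROPERTIES` table and the `compare` helper are hypothetical, and the size and character classes simply mirror the crude groupings used in the answers rather than any validated scale.

```python
# Crude, hypothetical property table: (size class, side-chain character).
# The classes mirror the rough reasoning in the answers, not a validated scale.
PROPERTIES = {
    "K": ("large", "positive"),
    "R": ("large", "positive"),
    "D": ("small", "negative"),
    "L": ("small", "hydrophobic"),
    "V": ("small", "hydrophobic"),
    "W": ("large", "hydrophobic"),
    "N": ("small", "polar"),
    "Q": ("small", "polar"),
    "C": ("small", "thiol"),
    "H": ("large", "imidazole"),
}


def compare(wild_type: str, mutant: str) -> str:
    """Return a rough verdict for a substitution based on size and character."""
    wt_size, wt_char = PROPERTIES[wild_type]
    mut_size, mut_char = PROPERTIES[mutant]
    changes = []
    if wt_size != mut_size:
        changes.append(f"size {wt_size} -> {mut_size}")
    if wt_char != mut_char:
        changes.append(f"character {wt_char} -> {mut_char}")
    if not changes:
        return "likely not detrimental (similar size and character)"
    return "likely detrimental (" + ", ".join(changes) + ")"


for wt, mut in ["KD", "LV", "RK", "WV", "NQ", "CH"]:
    print(f"{wt} -> {mut}: {compare(wt, mut)}")
```

Such a lookup ignores structural context entirely, which is exactly the limitation the sidenote points out.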
Topic 2: UniProt
If the protein has more than one isoform, answer the questions using the canonical one!
For each question below, the first answer refers to Case 1 and the second to Case 2.
Case 1:
Protein X: GTPase HRas
UniProt identifier: P01112
Residues A, B, C and D: A59, K5, L52, F78
Mutation M: Q61L
Case 2:
Protein X: Signal transducer and activator of transcription 3
UniProt identifier: P40763
Residues A, B, C and D: E760, V349, R350, V366
Mutation M: Y640F
Find the gene name of protein X

HRAS.
STAT3.
What is the subcellular location of protein X?
Cell membrane, Golgi apparatus and Golgi apparatus membrane.
Cytoplasm and Nucleus.
What is the FASTA sequence of protein X?
MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDGETCLLDILDTAGQEEYSAMRDQYMRTGEGFLCVFAINNTKSFEDIHQYREQIKRVKDSDDVPMVLVGNKCDL
AARTVESRQAQDLARSYGIPYIETSAKTRQGVEDAFYTLVREIRQHKLRKLNPPDESGPGCMSCKCVLS
MAQWNQLQQLDTRYLEQLHQLYSDSFPMELRQFLAPWIESQDWAYAASKESHATLVFHNLLGEIDQQYSRFLQESNVLYQHNLRRIKQFLQSRYLEKPMEIARIVARCLWEESRLLQT
AATAAQQGGQANPTAAVVTEKQQMLEQHLQDVRKRVQDLEQKMKVVENLQDDFDFNYKTLKSQGDMQDLNGNNQSVTRQKMQQLEQMLTALDQMRRSIVSELAGLLSAMEY
VQKTLTDEELADWKRRQQIACIGGPPNICLDRLENWITSLAESQLQTRQQIKKLEELQQKVSYKGDPIVQHRPMLEERIVELFRNLMKSAFVVERQPCMPMHPDRPLVIKTGQFTTKVR
LLVKFPELNYQLKIKVCIDKDSGDVAALRGSRKFNILGTNTKVMNMEESNNGSLSAEFKHLTLREQRCGNGGRANCDASLIVTEELHLITFETEVYHQGLKIDLETHSLPVVVISNICQMPN
AWASILWYNMLTNNPKNVNFFTKPPIGTWDQVAEVLSWQFSSTTKRGLSIEQLTTLAEKLLGPGVNYSGCQITWAKFCKENMAGKGFSFWVWLDNIIDLVKKYILALWNEGYIMGFIS
KERERAILSTKPPGTFLLRFSESSKEGGVTFTWVEKDISGKTQIQSVEPYTKQQLNMSFAEIIMGYKIMDATNILVSPLVYLYPDIPKEEAFGKYCRPESQEHPEADPGSAAPYLKTKFICVTP
TTCSNTIDLPMSPRTLDSLMQFGNNGEGAEPSAGGQFESLTFDMELTSECATSPM
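Before looking anything up for a residue, it can help to confirm that the residue labels actually match the sequence. The sketch below does this for Case 1, using only the first 80 residues of the HRas sequence quoted above. The helper `residue_matches` is a hypothetical name introduced for illustration.

```python
import re

# First 80 residues of the Case 1 sequence (GTPase HRas), copied from above.
HRAS_N_TERM = (
    "MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSY"
    "RKQVVIDGETCLLDILDTAGQEEYSAMRDQYMRTGEGFLC"
)


def residue_matches(sequence: str, label: str) -> bool:
    """Check a label such as 'A59' or 'Q61L' against a 1-based sequence."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z]?)", label)
    if match is None:
        raise ValueError(f"unrecognised residue label: {label}")
    wild_type, position = match.group(1), int(match.group(2))
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    # Sequence positions are 1-based; Python strings are 0-based.
    return sequence[position - 1] == wild_type


for label in ["A59", "K5", "L52", "F78", "Q61L"]:
    print(label, residue_matches(HRAS_N_TERM, label))
```

The same check works for Case 2 if the full STAT3 sequence is passed instead.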
Is this protein an enzyme? If so, what is the EC code?

Yes. The EC code is EC:3.6.5.2
No.
Based on the information in Uniprot, is residue A in an active site/binding site? What does
it bind?
Yes. It binds to GTP.
No.
What is the PDB ID of the best available experimentally determined structure for protein
X and why?
The best PDB is 4Q21, since it covers the entire protein with good resolution.
6TLC (better coverage but worse resolution) or 6NJS (worse coverage but better resolution).
What is the resolution?

2.00 Å.
2.90 Å (6TLC), 2.70 Å (6NJS).
What is the experimental method?
X-ray crystallography.
X-ray crystallography.
What does this resolution allow you to observe?
Side chains are well resolved. The plane of the peptide bond is resolved.
Side chains are partially resolved.
What is the coverage of this PDB?
Full coverage (1-189).
127-722 (6TLC); 127-688 (6NJS).
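To compare structures on coverage, the residue ranges can be turned into fractions of the full-length protein. The sketch below assumes the lengths 189 (HRas) and 770 (STAT3) that follow from the full-protein ranges quoted in this practical. The `coverage` function is a hypothetical helper.

```python
def coverage(first: int, last: int, protein_length: int) -> float:
    """Fraction of the full-length protein covered by residues first..last."""
    if not 1 <= first <= last <= protein_length:
        raise ValueError("residue range must lie within the protein")
    return (last - first + 1) / protein_length


# Residue ranges and protein lengths as quoted in this practical.
structures = {
    "4Q21": (1, 189, 189),
    "6TLC": (127, 722, 770),
    "6NJS": (127, 688, 770),
}

for pdb_id, (first, last, length) in structures.items():
    print(f"{pdb_id}: {coverage(first, last, length):.1%}")
```

This makes the trade-off explicit: 6TLC covers roughly 77% of STAT3 and 6NJS roughly 73%, while 6NJS has the better resolution.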
Is there an AlphaFold structure? What is the coverage of the AlphaFold structure?
Yes. The coverage of the AlphaFold structure is 1 - 189 (full protein).
Yes. The coverage of the AlphaFold structure is 1 - 770 (full protein).
Have a look at the AlphaFold structure in UniProt. What does the pLDDT score tell you?
The pLDDT score is a per-residue metric that indicates the confidence of the prediction.
Residues with a pLDDT score > 70 are expected to be modeled well. The positions of
residues with scores below 50, on the other hand, should not be interpreted; such residues
are expected to lie in disordered regions.
Same.
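The thresholds above can be written down as a small helper. This is a minimal sketch: the function name `plddt_band` is hypothetical, and the wording for the intermediate 50-70 band is an assumption, since the answer above only describes scores above 70 and below 50.

```python
def plddt_band(score: float) -> str:
    """Map a pLDDT score (0-100) to a rough interpretation."""
    if not 0 <= score <= 100:
        raise ValueError("pLDDT must be between 0 and 100")
    if score > 70:
        return "expected to be modeled well"
    if score >= 50:
        # Assumed wording: the text above does not describe this band.
        return "low confidence, interpret with caution"
    return "do not interpret; expected to be disordered"


for score in (98.72, 60.0, 35.0):
    print(score, "->", plddt_band(score))
```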
Download the AlphaFold structure from UniProt in PDB-format.
Topic 3: looking at a structure

Open the PDB-file in a protein structure viewer (YASARA, PyMOL, Microsoft Paint…). Write
down the protein structure viewer that you are using.
What is the secondary structure of residue A?

Loop/coil.
Loop/coil.
What is the secondary structure of residue B?
Beta-sheet.
Beta-sheet.
Is residue B exposed to the surface or buried?
Exposed.
Buried.
What is the pLDDT score of residue B? What does it mean?
98.72 (see above)
97.68 (see above)
Showing pLDDT scores in PyMOL
The pLDDT score shows the confidence of the prediction for each amino acid.
In X-ray crystallography, a parameter called the "B-factor" gives an indication of the
uncertainty of atom positions. For this reason, the pLDDT scores are stored inside PDB-files
in the same place where X-ray crystallographers put the B-factor. To see the pLDDT score in
PyMOL, you therefore have to look for a residue's B-factor:
Select the residue, then do (sele) -> L (label) -> b-factor. The pLDDT is now displayed for
each atom (every atom in a residue has the same pLDDT).
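The same value can also be read straight from the file, because the B-factor occupies fixed columns (61-66) of every ATOM record in the PDB format. The sketch below is self-contained: `make_atom_line` is a hypothetical helper that only builds two example records with made-up coordinates, and `plddt_per_residue` shows the actual column parsing.

```python
def make_atom_line(serial, atom, res_name, chain, res_seq, x, y, z, b_factor):
    """Build a fixed-width PDB ATOM record (coordinates here are made up)."""
    return (
        f"ATOM  {serial:5d} {atom:<4s} {res_name:>3s} {chain:1s}{res_seq:4d}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{b_factor:6.2f}"
    )


def plddt_per_residue(pdb_text: str) -> dict:
    """Return {(chain, residue number): pLDDT} read from the B-factor column."""
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            chain = line[21]
            res_seq = int(line[22:26])
            # Columns 61-66 hold the B-factor, which AlphaFold files reuse for pLDDT.
            scores[(chain, res_seq)] = float(line[60:66])
    return scores


# Two illustrative atoms of residue K5 carrying the pLDDT quoted above.
example_pdb = "\n".join([
    make_atom_line(1, "N", "LYS", "A", 5, 1.0, 2.0, 3.0, 98.72),
    make_atom_line(2, "CA", "LYS", "A", 5, 1.5, 2.5, 3.5, 98.72),
])

print(plddt_per_residue(example_pdb))
```

To use it on a real file, pass the contents of the downloaded AlphaFold PDB-file to `plddt_per_residue` instead of `example_pdb`.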
Is residue C hydrophobic or hydrophilic? Is it in the core of the structure or at the surface?
Is this what you would expect?
It is hydrophobic. It is at the surface of the protein, whereas hydrophobic residues are
normally expected to be in the core. One possible explanation is that this residue is part
of a protein-protein interaction site.
It is hydrophilic, but in the core of the protein. Burying a hydrophilic residue is energetically
unfavorable. However, looking at the surrounding residues, this residue is close to several
negatively charged residues, forming a stabilizing (attractive) interaction. Moreover, it forms
hydrogen bonds which further stabilize it in the core position (see explanation of how to see
this below).
Sphere representation with the R350 residue colored in yellow. To get this representation,
click S -> spheres. To get back to the cartoon view, you may have to use S -> as ->
cartoon. This representation shows that the residue is almost completely buried.
Using the commands above, you can select all positive and negative residues, and then color
them blue and red, respectively.
In the stick representation, and visualizing the polar contacts, you can see interactions with
three negatively charged residues, which stabilize the positively charged Arg in the core.
Note: Having colored all charged residues, you can now see how they generally tend to be at
the surface, as they are highly hydrophilic.
Is residue D hydrophobic or hydrophilic? Is it at the surface or in the core? Is this what you
would expect?
It is hydrophobic. It is in the core of the protein, as expected.
It is hydrophobic. It is in the core of the protein, as expected.
How many hydrogen bonds does residue D make?
Two hydrogen bonds.
One hydrogen bond.
Showing hydrogen bonds in PyMOL:
AF-P01112-F1-mod -> Action (A) -> find -> polar contacts -> within selection
Hydrogen bonds for the entire structure are shown as yellow dashed lines.
Residue 78 forms two hydrogen bonds, both through the backbone; the sidechain forms no
hydrogen bonds.
Topic 4: Mutation
Does mutation M violate the hydrophobicity rule?
Yes.
No.
Does mutation M violate the secondary structure propensity?
Yes.
No.
Is Mutation M related to cancer? If so, in which tissue is it mostly distributed?
Yes, mostly in the skin.
Yes, hematopoietic and lymphoid.
Is mutation M in an active/binding site?
No, but it is very close to a GTP binding site.
No.
What is the Zvelebil score of mutation M?
7 out of 10.
9 out of 10.
What is the Blosum score of mutation M?
-2.
3.
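The two scores above can be reproduced with a simple lookup. The sketch below hard-codes a small, hand-copied subset of the BLOSUM62 matrix, assuming that BLOSUM62 is the variant meant here. The dictionary and the `blosum62` helper are hypothetical names, and a full matrix from a library such as Biopython would be the more usual choice in practice.

```python
# Hand-copied subset of BLOSUM62 covering the substitutions in this practical.
BLOSUM62_SUBSET = {
    ("Q", "L"): -2,
    ("Y", "F"): 3,
    ("K", "D"): -1,
    ("L", "V"): 1,
    ("R", "K"): 2,
    ("W", "V"): -3,
    ("N", "Q"): 0,
    ("C", "H"): -3,
}


def blosum62(wild_type: str, mutant: str) -> int:
    """Look up a substitution score; the matrix is symmetric."""
    if (wild_type, mutant) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(wild_type, mutant)]
    if (mutant, wild_type) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(mutant, wild_type)]
    raise KeyError(f"{wild_type}/{mutant} is not in this subset")


for mutation in ("Q61L", "Y640F"):
    print(mutation, blosum62(mutation[0], mutation[-1]))
```

Negative scores mark substitutions that are observed less often than expected by chance in aligned proteins, positive scores those observed more often.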
FoldX gives a ΔΔG of -1.02 kcal/mol or -0.77 kcal/mol for this mutation. What does that
mean?
It means the mutation is stabilizing.
Same.
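The sign convention behind this answer is ΔΔG = ΔG(mutant) - ΔG(wild type), so a negative value means the mutant is predicted to be more stable. A minimal sketch of that reading follows. The function name `interpret_ddg` and the 0.5 kcal/mol cutoff for "roughly neutral" are assumptions made for illustration, not values taken from this course.

```python
def interpret_ddg(ddg_kcal_per_mol: float, threshold: float = 0.5) -> str:
    """Classify ddG = dG(mutant) - dG(wild type).

    The threshold is an assumed cutoff below which the change is treated
    as too small to call.
    """
    if ddg_kcal_per_mol <= -threshold:
        return "stabilizing"
    if ddg_kcal_per_mol >= threshold:
        return "destabilizing"
    return "roughly neutral"


for ddg in (-1.02, -0.77):
    print(f"{ddg:+.2f} kcal/mol -> {interpret_ddg(ddg)}")
```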
PROVEAN determines that this variant is pathogenic. Does this agree with the FoldX
classification? Assuming both are correct, how would you explain this?
No: FoldX classifies the mutation as stabilizing, whereas PROVEAN predicts it to be
pathogenic. The two tools answer different questions. PROVEAN predicts whether a variant is
deleterious based on sequences of diverse organisms across evolution, while FoldX is
primarily used to study protein stability and protein-protein interactions based on the
structure, so the outcomes can differ. This mutation does not seem to be problematic for the
intrinsic structure, but it is predicted to be clinically detrimental for some other reason.
Perhaps it affects an active-site residue. Given that this mutation affects an oncogene, it may
also be that this stabilizing mutation makes the protein constitutively active somehow,
driving tumorigenesis.
Same.
Topic 5: Neural network
Regarding the exam

● The exam is open book
● You need to be able to do and understand all the exercises in this practical for the
exam.
● Additional questions will be added to the exam which require some degree of
interpretation of the course material.
● The use of ChatGPT is prohibited – getting caught using it will result in termination of
your exam.
● Questions? Software issues?
o bert.houben@kuleuven.be
o ramonenrique.duranromana@kuleuven.be
o gabriele.orlando@kuleuven.be
