ARIA For Solution and Solid-State NMR
Abstract
In solution or solid-state, determining the three-dimensional structure of biomolecules by Nuclear
Magnetic Resonance (NMR) normally requires the collection of distance information. The interpretation
of the spectra containing this distance information is a critical step in an NMR structure determination. In
this chapter, we present the Ambiguous Restraints for Iterative Assignment (ARIA) program for auto-
mated cross-peak assignment and determination of macromolecular structure from solution and solid-state
NMR experiments. While the program was initially designed for the assignment of nuclear Overhauser
effect (NOE) resonances, it has been extended to the interpretation of magic-angle spinning (MAS) solid-
state NMR data. This chapter first details the concepts and procedures carried out by the program. Then,
we describe both the general strategy for structure determination with ARIA 2.3 and practical aspects of
the technique. ARIA 2.3 includes all recent developments, such as an extended integration of the
Collaborative Computing Project for the NMR community (CCPN), the incorporation of the log-harmonic
distance restraint potential and an automated treatment of symmetric oligomers.
Key words: Ambiguous distance restraint, Structure calculation, Automated assignment, MAS,
Solid-state NMR, CCPN, NOE, ARIA, PDSD, CHHC
1. Introduction
Alexander Shekhtman and David S. Burz (eds.), Protein NMR Techniques, Methods in Molecular Biology, vol. 831,
DOI 10.1007/978-1-61779-480-3_23, Springer Science+Business Media, LLC 2012
454 B. Bardiaux et al.
2. Materials
2.1. ARIA Software Package
The following software packages are required to use ARIA.
1. ARIA software package. ARIA (6) is written in the program-
ming language Python (15). The current version is 2.3.
Instructions on how to install ARIA can be found in the ARIA
installation archive, which should be downloaded from http://
aria.pasteur.fr. ARIA can be installed on computers operating
under Linux, Windows or Mac OS X.
2. CNS software. To enable specific features used by ARIA, it is
necessary to compile the CNS program (16) with libraries pro-
vided within the ARIA package.
3. Optional: CCPNmr Analysis software package (version 2 or
later) (17). ARIA uses the CCPN data model to read input
data and to store all results in a general format.
4. Optional: Access to a computer cluster for distributed
calculation.
2.2. Input Data
The minimal set of data required by ARIA consists of (see Note 1):
1. Definition of the molecular system.
2. List(s) of chemical shift assignments of 1H (for 2D-NOESY)
and 13C/15N if necessary for 3D-NOESY or for MAS solid-
state NMR spectra (see Note 2).
3. One or more lists of cross-peaks with chemical shift positions
in each dimension and peak volumes/intensities. Individual
peaks can be either fully assigned, partially assigned or com-
pletely unassigned. A list of cross-peaks generally corresponds
to the peaks picked in a particular spectrum. It is recommended
that similar experiments performed with different mixing times
are entered as separate lists.
ARIA also integrates various data types for additional experi-
mental information. All restraints must be in CNS tbl format
(see Note 1).
1. Hydrogen bonds: The distance between hydrogen donor and
acceptor as well as the distance between acceptor and hydrogen.
2. Dihedral angles: Dihedral angle restraints incorporated using a
flat-bottom harmonic-wall potential.
3. J-couplings: Calculated J-couplings are directly refined against
observed J-couplings.
4. Residual dipolar couplings: Residual dipolar coupling (RDC)
data as restraints.
5. Distance restraints: Preformatted distance restraints, e.g., from
manual assignments.
2.3. Software for Structural Quality Checks
Additional software required to analyze the quality of the final
structure ensembles:
1. PROCHECK (18).
2. WHAT IF (or WHAT_CHECK) (19).
3. ProSa II (or ProSa 2003) (20).
4. MolProbity suite (21).
3. Methods
3.1. Preparation Phase
Before cross-peak assignment and structure calculation, the following
steps are automatically performed by ARIA. First, the data
are checked and filtered for errors and inconsistencies. The program
then creates the molecular topology of the system.
3.1.1. Data Filtering
When checking the chemical shift assignments for consistency,
ARIA considers three possible situations:
1. A unique assignment consisting of a single atom and a single
chemical shift.
2. A degenerate chemical shift assignment, where one group of
equivalent atoms is assigned to exactly one chemical shift.
3. An assignment of the two substituents of a prochiral group,
which can have one or two chemical shifts.
In the last case, floating chirality assignment (22) is used in
the resulting restraints (cf. Subheading 3.3.5). Peaks that lack
frequency information or that have incorrect or missing peak sizes are
removed (see Note 3).
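
The peak filter described above can be illustrated with a minimal Python sketch. The `Peak` record and the functions `has_valid_size` and `filter_peaks` are hypothetical names and are not part of the ARIA API. The sketch assumes that an "incorrect" peak size is a volume that is zero or not a finite number; the criteria actually applied by ARIA may differ.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Peak:
    """Hypothetical cross-peak record (not an ARIA class)."""
    number: int
    shifts: Sequence[Optional[float]]  # ppm position per dimension, None if missing
    volume: Optional[float]            # peak size, None if missing


def has_valid_size(volume: Optional[float]) -> bool:
    # Assumption: a usable peak size is a finite, non-zero number.
    return volume is not None and math.isfinite(volume) and volume != 0.0


def filter_peaks(peaks: Sequence[Peak]) -> Tuple[List[Peak], List[Peak]]:
    """Split peaks into (kept, removed) using the two criteria named in the text."""
    kept, removed = [], []
    for peak in peaks:
        has_frequencies = len(peak.shifts) > 0 and all(s is not None for s in peak.shifts)
        if has_frequencies and has_valid_size(peak.volume):
            kept.append(peak)
        else:
            removed.append(peak)
    return kept, removed


# Illustrative data only.
peaks = [
    Peak(1, (8.21, 4.35), 1.2e5),
    Peak(2, (8.21, None), 3.0e4),  # frequency missing in one dimension
    Peak(3, (7.90, 1.02), None),   # size missing
    Peak(4, (7.55, 2.10), 0.0),    # zero size
]
kept, removed = filter_peaks(peaks)
print([p.number for p in kept])     # [1]
print([p.number for p in removed])  # [2, 3, 4]
```

Returning the removed peaks alongside the kept ones makes it easy to write a filtering report of the kind ARIA stores with the spectra.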
[Fig. 1 workflow diagram, panels a to j. Labels recoverable from the figure: user input (chemical shift assignments, cross-peak lists, molecular system definition, additional restraints such as dihedrals, RDC and distances, ARIA definition via the GUI); preparation (data filtering, molecular topology creation, initial cross-peak assignment); iterative protocol (calibration, violation analysis, noise peak removal, partial assignment, restraints merging, structure calculation, floating chirality assignment); final steps (refinement in explicit solvent, quality analysis, generation of report); results (structure ensemble, distance restraints, structure quality statistics).]
Fig. 1. Description of the ARIA protocol workflow. Rounded rectangles indicate steps performed by ARIA, folded rectangles
correspond to user provided input-data and trapezoids represent results.
3.1.2. Molecular Topology Creation
From the definition of the molecular system provided as input data,
ARIA creates a molecular topology file (MTF) with the program
CNS (16). Name, chemical type, charge and mass of each atom as
well as the covalent connectivity are defined in the MTF. An
extended conformation of the molecule is then generated by CNS
and the coordinates are stored in a PDB file (cf. Subheading 3.8).
The molecular topology is created automatically for standard biopolymers.
If applicable, topological features can be easily defined
by the user through the graphical interface (cf. Subheading 3.7).
3.2. Initial Cross-Peak Assignment
3.2.1. Chemical-Shift Based Assignment
For every cross-peak, ARIA uses the chemical shift lists from the
sequential resonance assignment to derive possible assignments. As
illustrated in Fig. 2, the peak position is defined by its frequency
coordinates (c1, c2) in each dimension of the spectrum. To account
for the limited precision in chemical shift measurements, for the
uncertainty of the cross-peak coordinates and for systematic experimental
errors, chemical shift tolerances (d1, d2) are applied around
the peak position. The tolerances should be chosen to be sufficiently
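[Fig. 2 plot: horizontal axis, dimension 1, with candidate resonances pa, pb, pc and pd inside the window from c1 - d1 to c1 + d1; vertical axis, dimension 2, with candidate resonances px, py and pz inside the window from c2 - d2 to c2 + d2.]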
Fig. 2. Illustration of the assignment of a cross-peak. c1 and c2 denote the peak coordinates in
frequency space. The assignment frequency window is indicated by the solid black square,
defined from the chemical shift tolerances d1 and d2. The coordinates of the (hypothetical)
correct assignment are represented by the gray dashed lines (pb, py). Multiple resonances
within the tolerance window (pa, pb, pc, pd in dimension 1 and px, py, pz in the other
dimension) give rise to 12 assignment possibilities.
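
The enumeration illustrated in Fig. 2 can be sketched in a few lines of Python. The function names and the chemical shift values below are illustrative assumptions and do not reflect the ARIA implementation; the sketch only shows how a tolerance window in each dimension leads to a combinatorial set of assignment possibilities.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple


def candidates(position: float, tolerance: float, shifts: Dict[str, float]) -> List[str]:
    """Names of resonances whose chemical shift lies within position +/- tolerance."""
    return sorted(name for name, ppm in shifts.items() if abs(ppm - position) <= tolerance)


def enumerate_assignments(
    peak: Sequence[float],
    tolerances: Sequence[float],
    shift_lists: Sequence[Dict[str, float]],
) -> List[Tuple[str, ...]]:
    """All combinations of candidate resonances, one per spectral dimension."""
    per_dimension = [candidates(c, d, s) for c, d, s in zip(peak, tolerances, shift_lists)]
    return list(product(*per_dimension))


# Hypothetical chemical shifts (ppm), named after the resonances of Fig. 2.
dim1 = {"pa": 4.10, "pb": 4.12, "pc": 4.13, "pd": 4.15, "pe": 4.60}
dim2 = {"px": 1.20, "py": 1.22, "pz": 1.24, "pw": 2.00}

assignments = enumerate_assignments((4.12, 1.22), (0.04, 0.03), (dim1, dim2))
print(len(assignments))  # 12
```

Four candidates in dimension 1 and three in dimension 2 give the 12 possibilities of the figure; the resonances `pe` and `pw` fall outside the window and are ignored.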
3.2.2. Structural Rules for Symmetric Oligomers
ARIA can use information about the secondary structure organization
of the system under investigation to remove unlikely assignments.
ARIA uses simple rules (23) to assign some cross-peaks as
intermonomer before the structure calculation, using the predicted
3.2.3. Network Anchoring
ARIA implements a network anchoring approach (8) to reduce the
number of possibilities of cross-peak assignments prior to structure
calculation. The approach is based on the ranking of each assign-
ment, calculated using the information about the assignments of
neighboring nuclei in 3D space, and is efficient because true assign-
ments form a self-consistent subset of the network of all possible
assignments (see ref. 8, 13 for details). The behavior of network
anchoring is controlled by a set of user-defined parameters:
1. High network-anchoring (NA) score per residue threshold ($N_{res}^{high}$).
2. Minimal NA score per residue threshold ($N_{res}^{min}$).
3. Minimal NA score per atom threshold ($N_{atom}^{min}$).
A peak is conserved if one of the following rules is verified:
$$S_{res} \geq N_{res}^{high} \qquad (1)$$
$$S_{res} \geq N_{res}^{min} \ \text{and} \ S_{atom} \geq N_{atom}^{min} \qquad (2)$$
where Sres and Satom are respectively the residue-wise and atom-wise
network anchoring score. Even though the network anchoring
approach does not directly rely on 3D structure information, it is
still possible to use it after the first ARIA iteration.
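
The two conservation rules can be written as a short predicate. This is a minimal sketch: the function name `keep_peak` is hypothetical, the comparisons are assumed to be "greater than or equal to", and the threshold values in the usage lines are purely illustrative, not ARIA defaults confirmed by this chapter.

```python
def keep_peak(s_res: float, s_atom: float,
              n_res_high: float, n_res_min: float, n_atom_min: float) -> bool:
    """Return True if a peak passes rule (1) or rule (2) of network anchoring."""
    rule_1 = s_res >= n_res_high                             # strong residue-wise support
    rule_2 = s_res >= n_res_min and s_atom >= n_atom_min     # moderate support at both levels
    return rule_1 or rule_2


# Illustrative thresholds (assumed values).
thresholds = dict(n_res_high=4.0, n_res_min=1.0, n_atom_min=0.25)

print(keep_peak(s_res=5.0, s_atom=0.0, **thresholds))  # True, rule (1)
print(keep_peak(s_res=1.5, s_atom=0.3, **thresholds))  # True, rule (2)
print(keep_peak(s_res=1.5, s_atom=0.1, **thresholds))  # False
```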
3.3. Iterative Structure Calculation
3.3.1. Ambiguous Distance Restraints
The most important idea that underlies the ARIA methodology is
the concept of Ambiguous Distance Restraints (ADR) (2). In the
framework of the ADR, each NOESY cross-peak is treated as the
superposition of the signals from each of its multiple assignment
possibilities: the NOE intensity depends on the sum of the inverse
sixth power of all the individual proton-proton distances that contribute
to the signal. An effective distance D is thus derived as:
$$D = \left( \sum_{c=1}^{N_c} d_c^{-6} \right)^{-1/6} \qquad (3)$$
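
As a worked illustration of the effective distance of an ADR, the following Python sketch evaluates the inverse-sixth-power sum for a list of contributing distances. The function name `effective_distance` is hypothetical and the distances are made-up values in angstroms.

```python
from typing import Iterable


def effective_distance(distances: Iterable[float]) -> float:
    """Effective ADR distance: (sum over contributions of d**-6) ** (-1/6)."""
    return sum(d ** -6 for d in distances) ** (-1.0 / 6.0)


single = effective_distance([3.0])            # one contribution: D equals d
mixed = effective_distance([3.0, 5.0, 8.0])   # dominated by the shortest distance
print(round(single, 3))
print(round(mixed, 3))
```

Because of the inverse sixth power, the effective distance is never larger than the shortest contributing distance, so a single correct short contact is enough to satisfy an ambiguous restraint.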
3.3.2. Distance Calibration
The simplest model to derive distances from NOE signal intensity
is the Isolated Spin Pair Approximation (ISPA), which considers
only the observed spin pair, neglecting spin diffusion through third
nuclei. For short mixing times, ISPA provides a good approxima-
tion to relate an NOE volume (Vij) to the distance dij of two inter-
acting spins i and j:
$$V_{ij} = C \, d_{ij}^{-6} \qquad (4)$$
$$C = \frac{\sum_i V_i^{exp}}{\sum_i \bar{d}_i^{-6}} \qquad (5)$$
where $\bar{d}$ is the average effective distance, and $V^{exp}$ and $V^{th}$ are the
experimental and theoretical NOE volumes, respectively. When
using spin-diffusion corrected distances, the distance bounds cal-
culated from the theoretical volume may also be of use for the
structure calculation (25). In ARIA 2.3, the spin-diffusion correction
is performed by the Python core of ARIA and not by CNS
routines. It is also important to note that every spectrum is inde-
pendently calibrated. Still, these models are approximate and it is
common practice to restrain the distance to an interval to account
for uncertainties in the distances (see Note 10). This interval is
thus defined by lower and upper distance bounds, L and U:
$$L = d - \Delta, \quad U = d + \Delta, \quad \text{where} \ \Delta = 0.125 \, d^2 \qquad (9)$$
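
The ISPA calibration and the derivation of distance bounds can be sketched as follows. The sketch assumes that the calibration factor is the ratio of the summed experimental volumes to the summed inverse sixth powers of the average distances (the "ratio of averages" mode), and that the bounds are d minus and plus 0.125 d squared. All function names are hypothetical and the numbers are illustrative.

```python
from typing import Sequence, Tuple


def calibration_factor(volumes: Sequence[float], avg_distances: Sequence[float]) -> float:
    """C = sum(V_exp) / sum(d_avg**-6), assuming the ratio-of-averages calibration."""
    return sum(volumes) / sum(d ** -6 for d in avg_distances)


def distance_from_volume(volume: float, c: float) -> float:
    """Invert V = C * d**-6 to obtain the target distance d."""
    return (volume / c) ** (-1.0 / 6.0)


def distance_bounds(d: float) -> Tuple[float, float]:
    """Lower and upper bounds d - delta and d + delta with delta = 0.125 * d**2."""
    delta = 0.125 * d ** 2
    return d - delta, d + delta


# Synthetic check: volumes generated with a known factor are calibrated back to it.
distances = [2.5, 3.0, 4.0]
volumes = [1000.0 * d ** -6 for d in distances]
c = calibration_factor(volumes, distances)
target = distance_from_volume(volumes[1], c)
print(round(c, 6), round(target, 6))
print(distance_bounds(4.0))  # (2.0, 6.0)
```

The quadratic error term makes the interval widen quickly with distance, which reflects the larger uncertainty of weak, long-range signals.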
3.3.3. Violation Analysis and Noise Peak Removal
To identify incorrect assignments and noise peaks, the calibrated
restraints are treated with a violation analysis, following the structural
consistency hypothesis (3, 26): incorrectly assigned peaks or
noise peaks are not consistent with the 3D structure determined
with all experimental data. To assess whether a particular restraint
follows the general trends imposed on the structures by the entire
data set, the obtained distance bounds are compared to the corre-
sponding distances found in the conformer ensemble. A restraint is
considered as violated if the distance found in the structure lies
outside the bounds by more than a user-defined violation tolerance,
t. To identify systematically violated restraints, each conformer in
the ensemble is analyzed. The fraction, $f_i$, of conformers violating
restraint i is calculated according to:
$$f_i = \frac{1}{S} \sum_{k=1}^{S} \Theta\left( \max\left( L_i - t - d_i^{(k)}, \; d_i^{(k)} - U_i - t \right) \right) \qquad (10)$$
where $L_i$ and $U_i$ denote the lower and upper bounds of the i-th
restraint, $d_i^{(k)}$ designates the distance found in the k-th conformer;
$\Theta$ is the Heaviside step function and S is the total number of
conformers analyzed. A restraint is classified as violated if $f_i$ exceeds a
user-defined violation threshold (50% by default). The corresponding
cross-peak is then removed from the list of active peaks for the
next iteration. During the course of the protocol, the violation
tolerance, t, is reduced from iteration to iteration to ensure that most
of the inconsistent peaks are removed.
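
The violation analysis can be sketched as a short Python function. The names `violated_fraction` and `is_violated` are hypothetical; the sketch assumes that the step function counts a conformer only when the distance lies strictly outside the bounds widened by the tolerance t, and it uses the 50% default threshold quoted above.

```python
from typing import Sequence


def violated_fraction(distances: Sequence[float], lower: float, upper: float,
                      tolerance: float) -> float:
    """Fraction of conformers whose distance violates [lower - t, upper + t]."""
    n_violating = sum(
        1 for d in distances
        if max(lower - tolerance - d, d - upper - tolerance) > 0.0
    )
    return n_violating / len(distances)


def is_violated(distances: Sequence[float], lower: float, upper: float,
                tolerance: float, threshold: float = 0.5) -> bool:
    """A restraint is violated when the fraction exceeds the violation threshold."""
    return violated_fraction(distances, lower, upper, tolerance) > threshold


# Distances (in angstroms) of one restraint in a five-conformer ensemble, made-up values.
ensemble = [3.0, 3.2, 7.0, 7.1, 6.9]
fraction = violated_fraction(ensemble, lower=2.0, upper=5.0, tolerance=0.5)
print(fraction)                                   # 0.6
print(is_violated(ensemble, 2.0, 5.0, 0.5))       # True
print(is_violated(ensemble, 2.0, 5.0, 2.5))       # False with a looser tolerance
```

The last line shows why reducing the tolerance from iteration to iteration removes progressively more inconsistent peaks.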
$$\sum_{c=1}^{N_c} w_c = 1 \qquad (12)$$
w
1
c p (13)
3.3.5. Calculation of Structure Ensemble
On the basis of the merged restraints list, a new structure ensemble
is calculated with the program CNS (16) through a molecular
dynamics simulated annealing (MDSA) protocol. ARIA provides
two forms of molecular dynamics: in Cartesian or torsion angle
space. Torsion angle molecular dynamics (TAD) (27) reduces the
calculation time and allows for higher MDSA temperatures, while
generally increasing the convergence radius. The molecular structures
obtained with TAD also provide better local geometries. The
MDSA protocol used in ARIA is divided into two phases: an initial
high temperature search phase, and a cooling phase where the temperature
slowly decreases. The second part of the cooling stage is
performed in Cartesian coordinates. The length of the cooling
stages determines the slope of the bath temperature cooling func-
tion. It has been shown that this parameter plays an important role
in the convergence properties of the ARIA calculation for highly
ambiguous data (28). The MDSA protocols implemented in ARIA
(3) are optimized for the application of ambiguous distance
restraints and for the violation analysis method. The minimization
protocols are based primarily on separate scaling of different energy
terms with relatively low force constants. Any other structural
Table 1
Important protocol parameters, their location in the GUI, and
default values (if applicable)
3.3.6. Restraint Energy Function
The aim of the MDSA protocol is to find a global energy minimum
of an objective function that incorporates experimental data and
physical energy. The latter is quantified by using a molecular
dynamics force field. Experimental data are integrated in the form
of conformational restraints entering the objective function via an
energy potential. For distance restraints, ARIA employs a flat-bottom
harmonic-wall potential with zero energy between the distance
bounds and linear asymptotes (3). This potential allows for
large distance violations as may occur in an automated assignment
procedure. Nevertheless, it is still difficult to correctly evaluate the
bounds and the relative weight to apply to the data. Recently, we
have introduced a new error-tolerant potential where lower and
upper bounds are replaced by a bounds-free log-harmonic potential
(14). This potential derives from a Bayesian analysis showing that
NOEs and the derived distances ideally follow the log-normal dis-
tribution (29, 30). In ARIA, we also retain another important fea-
ture of this Bayesian approach: automatic determination of the
optimal weight for the experimental data (31). The log-harmonic
potential is applied during the second cooling stage of the MDSA
and during water refinement. The weight for the distance restraints,
$w_{data}$, is iteratively evaluated as:
$$w_{data} = \frac{n}{\chi^2(X)} \qquad (14)$$
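
The qualitative shapes of the two restraint potentials discussed in this section can be compared with a small Python sketch. This is a simplified illustration and not the CNS implementation: the flat-bottom potential below switches to a linear continuation with matching value and slope after a hypothetical `switch` distance beyond the upper bound, and the log-harmonic term is assumed to be the squared logarithm of the ratio between the model distance and the target distance. Force constants and distances are illustrative.

```python
import math


def flat_bottom_harmonic_wall(d: float, lower: float, upper: float,
                              k: float = 1.0, switch: float = 1.0) -> float:
    """Zero between the bounds, harmonic walls, linear asymptote above upper + switch."""
    if d < lower:
        return k * (lower - d) ** 2
    excess = d - upper
    if excess <= 0.0:
        return 0.0
    if excess <= switch:
        return k * excess ** 2
    # Linear continuation with the same value and slope as the harmonic wall at `switch`.
    return k * switch ** 2 + 2.0 * k * switch * (excess - switch)


def log_harmonic(d: float, target: float, weight: float = 1.0) -> float:
    """Bounds-free potential: weight * ln(d / target) ** 2 (assumed functional form)."""
    return weight * math.log(d / target) ** 2


for d in (3.0, 4.5, 6.0, 10.0):
    print(d, flat_bottom_harmonic_wall(d, 2.0, 4.0), round(log_harmonic(d, 3.0), 3))
```

The linear asymptote keeps the restraining force bounded for grossly violated restraints, while the log-harmonic form needs no bounds at all and penalizes relative rather than absolute deviations.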
3.3.7. Symmetric Oligomers
The symmetry of the system is maintained during the calculation
by adding a symmetry target function to the objective energy function
(32). This target function contains terms that ensure the
symmetry relation between the monomers and keep them in the
vicinity of each other (Packing, see Note 11).
3.3.8. Floating Chirality Assignment
The treatment of unassigned prochiral groups is realized with a
floating chirality assignment approach (22). The two substituents
of a prochiral center (methylene protons or methyl protons of iso-
propyl groups) are often difficult to assign stereo-specifically, in
terms of chemical shifts. In each proton dimension, a resonance
3.4. Solvent Refinement
The simplified force field parameters for nonbonded contacts
applied to structure calculations in vacuo often produce structures
that contain artifacts (unrealistic side-chain packing and unsatisfied
hydrogen bond donors or acceptors). Therefore, the final structures
of the last ARIA iteration are automatically refined in a shell
of explicit solvent (water or DMSO molecules). This refinement
consists of a short MD simulation with a complete force field, which
includes Coulombic and Lennard-Jones potentials. The covalent parameters
used in the refinement (33) are consistent with the force field used
for structure calculation and validation, thus avoiding systematic
differences that could influence validation results. It has been
shown that the refinement in explicit solvent significantly improves the
quality of the structure (33–35).
3.5. Results Export and Generation of Output Files
3.5.1. Export to CCPN
At the end of the ARIA protocol, assigned peak lists, restraint lists,
along with violations, and final structure ensembles (last iteration
and solvent refined) are automatically exported into a CCPN project
(see Fig. 3). Data exchange, further analysis of results, and
management of ARIA runs are then facilitated through the use of
the CCPN program suite (cf. Subheading 3.11).
Fig. 3. Communication interface between ARIA and CCPN for import of input data and export of results.
3.5.2. Report Files
For every iteration, ARIA creates the following report files:
1. report summarizes analyses of the restraint lists and the
structure ensemble (number of restraints applied, violations,
ensemble precision).
2. noe_restraints.unambig, noe_restraints.ambig
tabulates information about unambiguous and ambiguous
restraints, respectively. For each restraint, the reference cross-
peak, restraint bounds and the average distance found in the
ensemble are provided. The result of violation analysis is also
given here (see Note 12).
3. noe_restraints.violations lists all violated restraints.
4. noe_restraints.assignments lists the tentative assign-
ments corresponding to every restraint. The nature of the
assignment(s) is also given (fully, partially or unassigned cross-
peaks).
5. noe_restraints.xml, noe_restraints.pickle stores
the complete list of cross-peak based distance restraints in
XML format and Python binary format. The latter is required
for further assignment analysis in the ARIA GUI (cf.
Subheading 3.10.2).
3.5.3. Quality Checks
To evaluate the structural quality of both the final set of structures
and the solvent-refined ensemble, ARIA makes use of the programs
WHAT IF (19), PROCHECK (18), ProSa (20) and MolProbity
(21). Separate report files are generated for every program, named
quality_checks.*, and are stored in the directories of the
respective ensembles (last iteration and solvent-refined). Overall
quality scores are tabulated in the file quality_checks, whereas
WHAT IF score profiles along the molecular sequence are generated
in both textual and graphical forms (cf. Subheading 3.10.3).
3.5.4. CNS Analyses
CNS scripts calculate restraint energies, ensemble RMSDs, an optimal
superposition of the final structure ensemble (with automated
determination of flexible and rigid regions), and an unminimized
average structure. Analyses of restraints from complementary
experimental data are also given. Results are stored in the directory
analysis/.
In the following sections, we detail the typical procedure to be
followed by a user to perform an ARIA calculation. In a structure
determination project, the general procedure consists of repeated
ARIA runs using revised results from a previous calculation as input
data (Fig. 4).
3.6. Conversion of Input Data
Since most NMR software packages use proprietary formats for
data storage, the interconversion step required to transfer data with
other applications such as ARIA can lead to a loss of information.
Fig. 4. A series of ARIA runs in a typical structure determination project, with several cycles of structure calculations and
cross-peak assignments punctuated by manual inspection and correction of experimental input data.
3.7. Specification of ARIA Project Parameters
1. Project creation. All program parameters and locations of the
input data are stored in a single project file (in XML format). To
conveniently change or review the project settings, ARIA provides
a Graphical User Interface (GUI) (Fig. 5). Entering the
following command will start the GUI and load the project
definition from project.xml (see Note 15):
aria2 --gui project.xml
Fig. 5. Graphical User Interface of ARIA 2.3 for project management, where data and protocol settings can be modified
graphically.
angle restraints (see Note 8), RDC (see Note 16) and J-couplings
(see Note 17) should be defined.
6. Symmetry. ARIA can treat oligomers with C2, C3, C5 or D2
symmetry (see Note 11).
7. Specifying topology patches. By default, ARIA supports the fol-
lowing cases: Disulfide bridges (unambiguous or ambiguous)
(2), Histidine protonation states, cis-proline and tetrahedral
coordination of Zinc ions. In the case of nonstandard residues
or other chemical compounds, manual intervention of the user
is required (see Note 18).
8. Iteration parameters. The mode of restraint calibration has to
be specified: ratio of average (default), spin-diffusion correction
or fixed bounds (see Note 10). For every iteration, default val-
ues are provided for protocol parameters (Table 1) and the
network-anchoring thresholds (see Note 19).
9. Job Manager. Distributing structure calculations to multiple
processors speeds up the ARIA protocol. ARIA provides sup-
port for several job submission modes (see Note 20). The
appropriate command should be entered and the correct path
to the remote CNS program executable should be specified.
10. Structure calculation parameters. The remaining parameters
are related to the molecular dynamics simulated annealing, and
in particular the number of steps, restraint force constants and
potential shape (flat-bottom-harmonic-wall and log-harmonic).
3.8. Project Setup
At this point, the project must be set up with the following
command.
aria2 --setup project.xml
The project is then validated and ARIA creates the directory
tree for the project (directory run1). As shown in Fig. 6, the results
of the successive iterations are stored in structures/, each iteration
having its own subdirectory, e.g., structures/it0/. Experimental
data files are copied into their respective directory in data/ (see
Note 21). Report files for the cross-peak filtering procedure are
stored in data/spectra/. All data, protocols, parameters, and topol-
ogy files used by CNS reside in the cns/ subdirectory.
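
To navigate the directory tree created by the setup step, a small helper can list the iteration subdirectories in numerical order. This is a minimal sketch that only assumes the layout described above (a run directory such as run1 containing structures/it0/, structures/it1/, and so on); the function name `iteration_dirs` is hypothetical and not part of ARIA.

```python
import re
from pathlib import Path
from typing import List


def iteration_dirs(run_dir: str) -> List[Path]:
    """Return the structures/it<N> subdirectories of an ARIA run, sorted by N."""
    structures = Path(run_dir) / "structures"
    found = []
    for child in structures.iterdir():
        match = re.fullmatch(r"it(\d+)", child.name)
        if child.is_dir() and match:
            found.append((int(match.group(1)), child))
    # Sort on the iteration number so that it10 comes after it2.
    return [path for _, path in sorted(found, key=lambda item: item[0])]


def last_iteration_dir(run_dir: str) -> Path:
    """Directory of the highest-numbered iteration found in the run."""
    return iteration_dirs(run_dir)[-1]
```

For example, `last_iteration_dir("run1")` would point to the directory holding the final structure ensemble before solvent refinement, provided the run has completed.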
3.9. Starting an ARIA Run
It is now possible to launch the ARIA calculation, using the following
command:
aria2 project.xml
ARIA will then automatically perform all the steps listed in
Subheadings 3.1–3.5. The main ARIA job will be executed on the
local machine where it has been started. According to the job man-
ager settings of the project, the structure calculations will be
[Fig. 6 directory tree. Labels recoverable from the figure: run1 (ARIA run directory); structures (files for each iteration are stored here), with subdirectories it1 ... it8 (last iteration); begin file; analysis (various analysis results, performed by CNS); graphics files (PostScript); molmol file to visualize restraints; cns files.]
Fig. 6. Illustration of the directory tree of an ARIA project and details about the content. Final results can be found in the
directories marked in gray.
3.10. Checking the Results
In the next paragraphs, we list the points of interest when inspecting
the calculation results, along with some guidance on how to
correct input data and adapt the protocol parameters.
3.10.2. Automated Assignments
1. The report files listed in Subheading 3.5 provide analyses of all
restraints and indicate, in particular, which restraints have been classified
as violated. Restraints showing consistent violation greater
than 0.1 Å should be inspected manually. Restraints with large
upper-bound violations (5 Å) in the majority of the conformers
(85%) usually result from incorrect assignments. Restraints
detected as such should not be used in a later ARIA run and
the corresponding cross-peak removed from its respective
spectrum. Other assignments should be considered as reli-
able in a subsequent run.
2. Analyzing text files for violations and assignments can be a
tedious task. ARIA also provides ways to investigate this in a
graphical manner (37). Postscript files describing the restraints,
based on the RMS of violations are generated automatically
during a run. These values are displayed at the residue level, in
the form of a profile along the protein sequence, or as a contact
map for the RMS of violations per residue pair (Fig. 7a). The
contact map displays the sum of the RMS of violations per resi-
due pair. In the profile, the sum of the RMS of violations per
residue is plotted along the protein sequence. In addition, the
program provides an interactive tool to browse assignments at
the residue level (Peak map). A peak-map can be viewed for all
iterations in the ARIA GUI (Fig. 8). Clicking on a contact
Fig. 7. Per-residue quality plots. (a) Contact map displaying the sums of RMS deviations and a profile of the RMS deviations.
(b) WHAT IF score profiles along the sequence. The RMS deviations are plotted on a color scale (figure adapted from ref. 25).
Fig. 8. Interactive peak map. Right panel of the ARIA 2.3 GUI showing the interactive peak map at iteration 8 of an ARIA run.
Each pixel of the map located between residues i and j is clickable and opens an assignment report, which contains the
list of peaks that exist between residues i and j, along with their contributions (figure adapted from ref. 25).
Fig. 9. Screenshot of CCPNmr Analysis windows showing the result of an ARIA run.
3.11. Preparing a New Run
To use the result of an ARIA run to further improve the structure,
it may be necessary to correct the input data. At this stage, we recommend
preparing a new ARIA project for better bookkeeping.
CCPNmr Analysis also offers a utility to manage the input and
output of successive ARIA runs (Fig. 9). The same CCPN project
can be used in multiple ARIA runs.
3.11.2. Adjusting Parameters
In the new project file, protocol parameters may also be changed
according to the result of a previous calculation. We list here the
most important parameters that ought to be adapted.
1. The number of dynamic steps required for convergence is
determined by the system size and the level of ambiguity or
incompleteness of the input data. Default values work well for
systems up to about 100 residues studied with NOESY.
However, for larger systems (e.g., symmetric oligomers) or
when MAS solid-state NMR data are used, it might become
necessary to increase the number of steps in the cooling stage
4. Notes
15. A user can also choose the New item in the GUI menu
Project to create a new project. As an alternative, the follow-
ing command
aria2 --project_template project.xml
will create a new project file.
16. Residual dipolar coupling data can be incorporated as restraints
following two alternative approaches: direct (SANI) or indirect
(VEAN). For SANI, the user has to specify the rhombicity and
magnitude of the alignment tensor (65). Several methods exist
to predict these parameters, from the distribution of the RDC
values (66) or from the shape of the molecule (67). VEAN
uses intervector projection angle restraints which must be gen-
erated with a separate program (68).
17. The correlation between a three-bond measured J-coupling
and the corresponding dihedral angle is modeled by the Karplus
curve. Default values for the parameters of the Karplus curve
are given for ³J(HN-Hα).
18. An MTF can be specified in the project file. Changes must also be
made to the CNS topology, linkage, and parameter files. Definitions
of the additional residues or compounds must be added to the
ARIA dictionary (files atomnames.xml and iupac.xml).
A detailed explanation is given on the ARIA Web site.
19. We recommend the use of the network-anchoring only for the
first 3 iterations. Too stringent thresholds or an application of
network-anchoring during more ARIA iterations may bias the
assignment process toward an incorrect structure (13).
20. Jobs can be submitted via ssh commands or with the follow-
ing batch queuing systems: PBS (69), SGE (70) or Condor
(71). Alternatively, CCPN users can submit their ARIA calcu-
lation to the CCPNGrid portal server at http://www.webapps.
ccpn.ac.uk/ccpngrid/.
21. Only local copies of data files are used for structure calculation.
Changes in the original files will thus become active only in the
next project setup.
22. For systems of about 100 residues, well converged ensembles
show average energies of the order of 1,000 kcal/mol. Normal
energy variation is about 10%, and the total average energy scales
approximately linearly with the system size.
23. Other methods are available to estimate the credibility of the
structures, notably by scrutinizing the information content of
the data (72). For instance, the completeness (73) of a restraint
set provides insight into the local reliability of each structure.
The completeness is the ratio between the number of observed
restraints and the number of expected restraints. We recom-
mend the method AQUA (73) to perform such analysis.
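
The completeness ratio of Note 23 can be illustrated per residue with a few lines of Python. This is only a sketch of the ratio as defined above (observed restraints divided by expected restraints); it is not the AQUA procedure, which also has to decide which restraints are expected. The function name and the counts are hypothetical.

```python
from typing import Dict


def completeness(observed: Dict[int, int], expected: Dict[int, int]) -> Dict[int, float]:
    """Per-residue ratio of observed to expected restraints.

    Residues with no expected restraints are skipped, since the ratio is undefined.
    """
    ratios = {}
    for residue, n_expected in expected.items():
        if n_expected > 0:
            ratios[residue] = observed.get(residue, 0) / n_expected
    return ratios


# Made-up restraint counts keyed by residue number.
observed_counts = {10: 6, 11: 2}
expected_counts = {10: 8, 11: 8, 12: 0}
print(completeness(observed_counts, expected_counts))  # {10: 0.75, 11: 0.25}
```

Residues with a low ratio are those where the local structure is supported by few of the restraints one would expect to observe.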
Acknowledgments
References
1. Wuthrich, K. (1986) NMR of Proteins and 6. Rieping, W., Habeck, M., Bardiaux, B.,
Nucleic Acids, Wiley-Interscience New York. Bernard, A., Malliavin, T., and Nilges, M.
2. Nilges, M. (1995) Calculation of protein struc- (2007) ARIA2: automated NOE assignment
tures with ambiguous distance restraints. and data integration in NMR structure calcula-
Automated assignment of ambiguous NOE tion. Bioinformatics 23, 381382.
crosspeaks and disulphide connectivities. J. Mol. 7. Castellani, F., van Rossum, B., Diehl, A.,
Biol. 245, 645660. Schubert, M., Rehbein, K., and Oschkinat, H.
3. Nilges, M. and ODonoghue, S. I. (1998) (2002) Structure of a protein determined by
Ambiguous NOEs and automated NOESY solid-state magic-angle-spinning NMR spec-
assignment. Prog. NMR Spec. 32, 107139. troscopy. Nature 420, 98102.
4. Linge, J. P., ODonoghue, S. I., and Nilges, M. 8. Herrmann, T., Gntert, P., and Wthrich, K.
(2001) Automated assignment of ambiguous (2002) Protein NMR structure determination
nuclear overhauser effects with ARIA. Methods with automated NOE assignment using the
Enzymol. 339, 7190. new software CANDID and the torsion angle
5. Linge, J. P., Habeck, M., Rieping, W., and dynamics algorithm DYANA. J. Mol. Biol. 319,
Nilges, M. (2003) ARIA: automated NOE 209227.
assignment and NMR structure calculation. 9. Fossi, M., Castellani, F., Nilges, M., Oschkinat,
Bioinformatics 19, 315316. H., and van Rossum, B. (2005) SOLARIA: a
23 ARIA for Solution and Solid-State NMR 481
protocol for automated cross-peak assignment and structure calculation for solid-state magic-angle spinning NMR spectroscopy. Angew. Chem. Int. Ed. Engl. 44, 6151–6154.
10. Loquet, A., Bardiaux, B., Gardiennet, C., Blanchet, C., Baldus, M., Nilges, M., Malliavin, T., and Böckmann, A. (2008) 3D Structure Determination of the Crh Protein from Highly Ambiguous Solid-State NMR Restraints. J. Am. Chem. Soc. 130, 3579–3589.
11. Manolikas, T., Herrmann, T., and Meier, B. (2008) Protein structure determination from (13)C spin-diffusion solid-state NMR spectroscopy. J. Am. Chem. Soc. 130, 3959–3966.
12. Wasmer, C., Lange, A., Melckebeke, H. V., Siemer, A., Riek, R., and Meier, B. (2008) Amyloid fibrils of the HET-s(218–289) prion form a beta solenoid with a triangular hydrophobic core. Science 319, 1523–1526.
13. Bardiaux, B., Bernard, A., Rieping, W., Habeck, M., Malliavin, T. E., and Nilges, M. (2009) Influence of different assignment conditions on the determination of symmetric homodimeric structures with ARIA. Proteins 75, 569–585.
14. Nilges, M., Bernard, A., Bardiaux, B., Malliavin, T., Habeck, M., and Rieping, W. (2008) Accurate NMR structures through minimisation of an extended hybrid energy. Structure 16, 1305–1312.
15. van Rossum, G., http://www.python.org/.
16. Brünger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., Grosse-Kunstleve, R. W., Jiang, J.-S., Kuszewski, J., Nilges, M., Pannu, N. S., Read, R. J., Rice, L. M., Simonson, T., and Warren, G. L. (1998) Crystallography and NMR system (CNS): A new software suite for macromolecular structure determination. Acta Cryst. sect. D 54, 905–921.
17. Vranken, W. F., Boucher, W., Stevens, T. J., Fogh, R. H., Pajon, A., Llinas, M., Ulrich, E. L., Markley, J. L., Ionides, J., and Laue, E. D. (2005) The CCPN data model for NMR spectroscopy: development of a software pipeline. Proteins 59, 687–696.
18. Laskowski, R. A., MacArthur, M. W., Moss, D. S., and Thornton, J. M. (1993) PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Cryst. 26, 283–291.
19. Vriend, G. (1990) WHAT IF: a molecular modeling and drug design program. J. Mol. Graph. 8, 52–56.
20. Sippl, M. J. (1993) Recognition of errors in three-dimensional structures of proteins. Proteins Struct. Funct. Genet. 17, 355–362.
21. Davis, I. W., Leaver-Fay, A., Chen, V. B., Block, J. N., Kapral, G. J., Wang, X., Murray, L. W., Arendall, W. B., Snoeyink, J., Richardson, J. S., and Richardson, D. C. (2007) MolProbity: all-atom contacts and structure validation for proteins and nucleic acids. Nucleic Acids Res. 35, W375–383.
22. Folmer, R. H., Hilbers, C. W., Konings, R. N., and Nilges, M. (1997) Floating stereospecific assignment revisited: application to an 18 kDa protein and comparison with J-coupling data. J. Biomol. NMR 9, 245–258.
23. Duggan, B., Legge, G., Dyson, H., and Wright, P. (2001) SANE (Structure Assisted NOE Evaluation): an automated model-based approach for NOE assignment. J. Biomol. NMR 19, 321–329.
24. Görler, A. and Kalbitzer, H. R. (1997) Relax, a flexible program for the back calculation of NOESY spectra based on complete relaxation matrix formalism. J. Magn. Reson. 124, 177–188.
25. Linge, J., Habeck, M., Rieping, W., and Nilges, M. (2004) Correction of spin diffusion during iterative automated NOE assignment. J. Magn. Reson. 167, 334–342.
26. Mumenthaler, C. and Braun, W. (1995) Automated assignment of simulated and experimental NOESY spectra of proteins by feedback filtering and self-correcting distance geometry. J. Mol. Biol. 254, 465–480.
27. Stein, E. G., Rice, L. M., and Brünger, A. T. (1997) Torsion-angle molecular dynamics as a new efficient tool for NMR structure calculation. J. Magn. Reson. 124, 154–164.
28. Fossi, M., Oschkinat, F., Nilges, M., and Ball, L. (2005) Quantitative study of the effects of chemical shift tolerances and rates of SA cooling on structure calculation from automatically assigned NOE data. J. Magn. Reson. 175, 92–102.
29. Rieping, W., Habeck, M., and Nilges, M. (2005) Modeling errors in NOE data with a log-normal distribution improves the quality of NMR structures. J. Am. Chem. Soc. 127, 16026–16027.
30. Rieping, W., Habeck, M., and Nilges, M. (2005) Inferential Structure Determination. Science 309, 303–306.
31. Habeck, M., Rieping, W., and Nilges, M. (2006) Weighting of experimental evidence in macromolecular structure determination. Proc. Natl. Acad. Sci. USA 103, 1756–1761.
32. Nilges, M. (1993) A calculation strategy for the structure determination of symmetric dimers by 1H NMR. Proteins 17, 297–309.
33. Linge, J. P., Williams, M. A., Spronk, C. A., Bonvin, A. M., and Nilges, M. (2003) Refinement of protein structures in explicit solvent. Proteins Struct. Funct. Genet. 20, 496–506.
34. Linge, J. P. and Nilges, M. (1999) Influence of non-bonded parameters on the quality of NMR structures: a new force-field for NMR structure calculation. J. Biomol. NMR 13, 51–59.
35. Nederveen, A., Doreleijers, J., Vranken, W., Miller, Z., Spronk, C., Nabuurs, S., Güntert, P., Livny, M., Markley, J., Nilges, M., Ulrich, E., Kaptein, R., and Bonvin, A. M. (2005) RECOORD: a REcalculated COORdinates Database of 500+ proteins from the PDB using restraints from the BioMagResBank. Proteins 59, 662–672.
36. The World Wide Web Consortium (2008), Extensible Markup Language (XML) 1.0 (Fifth Edition), http://www.w3.org/TR/xml/.
37. Bardiaux, B., Bernard, A., Rieping, W., Habeck, M., Malliavin, T., and Nilges, M. (2008) Graphical analysis of NMR structural quality and interactive contact map of NOE assignments in ARIA. BMC Struct. Biol. 8, 30–34.
38. Spronk, C. A. E. M., Nabuurs, S. B., Krieger, E., Vriend, G., and Vuister, G. W. (2004) Validation of protein structures derived by NMR spectroscopy. Progress in Nuclear Magnetic Resonance Spectroscopy 45, 315–337.
39. Saccenti, E. and Rosato, A. (2008) The war of tools: how can NMR spectroscopists detect errors in their structures? J. Biomol. NMR 40, 251–261.
40. Nabuurs, S., Krieger, E., Spronk, C., Nederveen, A., Vriend, G., and Vuister, G. (2005) Definition of a new information-based per-residue quality parameter. J. Biomol. NMR 33, 123–134.
41. Nabuurs, S., Spronk, C., Vuister, G., and Vriend, G. (2006) Traditional biomolecular structure determination by NMR spectroscopy allows for major errors. PLoS Comput. Biol. 2, e9.
42. Kraulis, P., Domaille, P. J., Campbell-Burk, S. L., van Aken, T., and Laue, E. D. (1994) Solution structure and dynamics of ras p21.GDP determined by heteronuclear three- and four-dimensional NMR spectroscopy. Biochemistry 33, 3515–3531.
43. Delaglio, F., Grzesiek, S., Vuister, G. W., Zhu, G., Pfeifer, J., and Bax, A. (1995) NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR 6, 277–293.
44. Johnson, B. A. and Blevins, R. A. (1994) NMRView: A computer program for the visualization and analysis of NMR data. J. Biomol. NMR 4, 603–614.
45. Garrett, D., Powers, R., Gronenborn, A., and Clore, G. (1991) A common sense approach to peak picking two-, three- and four-dimensional spectra using automatic computer analysis of contour diagrams. J. Magn. Reson. 95, 214–220.
46. Kjær, M., Andersen, K. V., and Poulsen, F. M. (1994) Automated and semiautomated analysis of homo- and heteronuclear multidimensional nuclear magnetic resonance spectra of proteins: the program PRONTO. Methods Enzymol. 239, 288–308.
47. Bartels, C., Xia, T.-H., Billeter, M., Güntert, P., and Wüthrich, K. (1995) The program XEASY for computer-supported NMR spectral analysis of biological macromolecules. J. Biomol. NMR 5, 1–10.
48. Güntert, P., Braun, W., and Wüthrich, K. (1991) Efficient computation of three-dimensional protein structures in solution from nuclear magnetic resonance data using the program DIANA and the supporting programs CALIBA, HABAS and GLOMSA. J. Mol. Biol. 217, 517–530.
49. Hall, S. R. and Cook, A. P. F. (1995) STAR dictionary definition language: Initial specification. J. Chem. Inf. Comput. Sci. 35, 819–825.
50. Markley, J. L., Bax, A., Arata, Y., Hilbers, C. W., Kaptein, R., Sykes, B. D., Wright, P. E., and Wüthrich, K. (1998) Recommendations for the presentation of NMR structures of proteins and nucleic acids. J. Mol. Biol. 280, 933–952.
51. Güntert, P., Mumenthaler, C., and Wüthrich, K. (1997) Torsion Angle Dynamics for NMR Structure Calculation with the New Program DYANA. J. Mol. Biol. 273, 283–298.
52. Wüthrich, K., Billeter, M., and Braun, W. (1983) Pseudo-structures for the 20 common amino acids for use in studies of protein conformations by measurements of intramolecular proton-proton distance constraints with nuclear magnetic resonance. J. Mol. Biol. 169, 949–961.
53. Lange, A., Luca, S., and Baldus, M. (2002) Structural constraints from proton-mediated rare-spin correlation spectroscopy in rotating solids. J. Am. Chem. Soc. 124, 9704–9705.
54. Szeverenyi, N., Sullivan, M., and Maciel, G. (1982) Observation of spin exchange by two-dimensional Fourier transform 13C cross polarization-magic-angle spinning. J. Magn. Reson. 47, 462–475.
55. Castellani, F., van Rossum, B., Diehl, A., Rehbein, K., and Oschkinat, H. (2003) Determination of solid-state NMR structures of proteins by means of three-dimensional 15N-13C-13C dipolar correlation spectroscopy and chemical shift analysis. Biochemistry 42, 11476–11483.
56. Takegoshi, K., Nakamura, S., and Terao, T. (2003) 13C-1H dipolar-driven 13C-13C recoupling without 13C rf irradiation in nuclear
magnetic resonance of rotating solids. J. Chem. Phys. 118, 2325–2341.
57. Lewandowski, J. R., Paëpe, G. D., Eddy, M. T., and Griffin, R. G. (2009) (15)N-(15)N proton assisted recoupling in magic angle spinning NMR. J. Am. Chem. Soc. 131, 5769–5776.
58. Fossi, M., Linge, J., Labudde, D., Leitner, D., Nilges, M., and Oschkinat, H. (2005) Influence of chemical shift tolerances on NMR structure calculations using ARIA protocols for assigning NOE data. J. Biomol. NMR 31, 21–34.
59. Wishart, D. S. and Sykes, B. D. (1994) The 13C chemical-shift index: a simple method for the identification of protein secondary structure using 13C chemical-shift data. J. Biomol. NMR 4, 171–180.
60. Cornilescu, G., Delaglio, F., and Bax, A. (1999) Protein backbone angle restraints from searching a database for chemical shift and sequence homology. J. Biomol. NMR 13, 289–302.
61. Cheung, M.-S., Maguire, M. L., Stevens, T. J., and Broadhurst, R. W. (2010) DANGLE: A Bayesian inferential method for predicting protein backbone dihedral angles and secondary structure. J. Magn. Reson. 202, 223–233.
62. Loquet, A., Gardiennet, C., and Böckmann, A. (2010) Protein 3D structure determination by high-resolution solid-state NMR. Comptes Rendus Chimie 13, 423–430.
63. Gardiennet, C., Loquet, A., Etzkorn, M., Heise, H., Baldus, M., and Böckmann, A. (2008) Structural constraints for the Crh protein from solid-state NMR experiments. J. Biomol. NMR 40, 239–250.
64. LeMaster, D. M. and Kushlan, D. M. (1996) Dynamical mapping of E. coli thioredoxin via 13C NMR relaxation analysis. J. Am. Chem. Soc. 118, 9255–9264.
65. Tjandra, N., Garrett, D. S., Gronenborn, A. M., Bax, A., and Clore, G. M. (1997) Defining long range order in NMR structure determination from the dependence of heteronuclear relaxation times on rotational diffusion anisotropy. Nature Struct. Biol. 4, 443–449.
66. Clore, G., Gronenborn, A., and Bax, A. (1998) A robust method for determining the magnitude of the fully asymmetric alignment tensor of oriented macromolecules in the absence of structural information. J. Magn. Reson. 133, 216–221.
67. Zweckstetter, M. and Bax, A. (2000) Prediction of sterically induced alignment in a dilute liquid crystalline phase: Aid to protein structure determination by NMR. J. Am. Chem. Soc. 122, 3791–3792.
68. Meiler, J., Blomberg, N., Nilges, M., and Griesinger, C. (2000) A new approach for applying residual dipolar couplings as restraints in structure calculations. J. Biomol. NMR 16, 245–252.
69. Jones, J. P. (2002) PBS: portable batch system, Beowulf cluster computing with Linux, MIT Press, Cambridge, MA, USA, 369–390.
70. Gentzsch, W. (2001) Sun Grid Engine: Towards creating a compute power grid, CCGRID '01: Proceedings of the 1st International Symposium on Cluster Computing and the Grid, IEEE Computer Society, Washington, DC, USA, 35.
71. Thain, D., Tannenbaum, T., and Livny, M. (2005) Distributed computing in practice: the Condor experience. Concurr. Comput.: Pract. Exper. 17, 323–356.
72. Nabuurs, S., Spronk, C., Krieger, E., Maassen, H., Vriend, G., and Vuister, G. (2003) Quantitative evaluation of experimental NMR restraints. J. Am. Chem. Soc. 125, 12026–12034.
73. Doreleijers, J. F., Raves, M. L., Rullmann, T., and Kaptein, R. (1999) Completeness of NOEs in protein structure: a statistical analysis of NMR data. J. Biomol. NMR 14, 123–132.
74. Bhattacharya, A., Tejero, R., and Montelione, G. T. (2007) Evaluating protein structures determined by structural genomics consortia. Proteins 66, 778–795.
75. Doreleijers, J. F., Vranken, W. F., Schulte, C., Lin, J., Wedell, J. R., Penkett, C. J., Vuister, G. W., Vriend, G., Markley, J. L., and Ulrich, E. L. (2009) The NMR restraints grid at BMRB for 5,266 protein and nucleic acid PDB entries. J. Biomol. NMR 45, 389–396.
76. Jehle, S., Rajagopal, P., Bardiaux, B., Markovic, S., Kühne, R., Stout, J. R., Higman, V. A., Klevit, R. E., van Rossum, B.-J., and Oschkinat, H. (2010) Solid-state NMR and SAXS studies provide a structural basis for the activation of alphaB-crystallin oligomers. Nat. Struct. Mol. Biol. 17, 1037–1042.