Detection of transient sub-pockets in crystal structures
As an example, we will use p38 MAP kinase.
There are three transient pockets (A, B, and C, see Fig.1). They arise from the motion of the beta-sheet at the ATP-binding site (A), flipping of the M109 residue with small backbone shifts (B), and reorganization of the activation loop containing the highly conserved Asp-Phe-Gly (DFG) motif (C). Sub-pocket C is closed in the crystal structures 1a9u and 1ouy but open in the crystal structure 1kv1.
The activation loop (the residues after the DFG loop, numbered between 170 and 185) is missing in many structures in which the transient sub-pocket C behind the DFG loop is open (inactive state II). The missing residues do not contribute to the binding site.
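To see which activation-loop residues are absent from a given structure, one can scan its ATOM records. The following is a minimal sketch, not part of TRAPP: the function names are hypothetical, and it assumes a standard fixed-column PDB file in which the chain identifier is in column 22 and the residue number in columns 23-26.

```python
def residue_numbers(pdb_lines, chain=None):
    """Return the set of residue numbers that have at least one ATOM record."""
    found = set()
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        if chain is not None and line[21] != chain:
            continue
        found.add(int(line[22:26]))  # residue sequence number, columns 23-26
    return found


def missing_residues(pdb_lines, first=170, last=185, chain=None):
    """List residue numbers in [first, last] that have no ATOM record."""
    present = residue_numbers(pdb_lines, chain)
    return [n for n in range(first, last + 1) if n not in present]


# Example call (requires a downloaded PDB file):
# with open("1kv1.pdb") as handle:
#     print(missing_residues(handle))
```

The default range 170-185 is taken from the text above; insertion codes and alternative numbering schemes are ignored in this sketch.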
2. To run these examples, you need the following files and parameters:
- prot.pdb The reference structure (generated from the 1a9u PDB file with grep ATOM 1a9u.pdb > prot.pdb)
- ligand.pdb The ligand used to define the binding site. We use just one ring of the ligand in the 1a9u PDB structure, positioned at approximately the pocket center.
- 17 PDB structures of the protein extracted from the archive PDB_structures.tar
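Note that grep ATOM matches the string anywhere in a line, so header lines that happen to contain it (for example REMARK lines mentioning "ATOMS") may also end up in prot.pdb. If that is a concern, a stricter filter can be sketched in Python as follows. The function names are hypothetical; the only assumption is the standard PDB layout, where the record name occupies the first six columns.

```python
def atom_records(pdb_lines):
    """Yield only the lines whose record name (columns 1-6) is ATOM."""
    for line in pdb_lines:
        if line[:6] == "ATOM  ":
            yield line


def write_reference(src_path, dst_path):
    """Copy the ATOM records of src_path to dst_path; return how many were kept."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        kept = list(atom_records(src))
        dst.writelines(kept)
    return len(kept)


# Example call with the file names used in this tutorial:
# write_reference("1a9u.pdb", "prot.pdb")
```

Like the grep command, this drops HETATM records (ligands, water), which is the intended effect for the reference protein structure.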
Because some residues in the vicinity of the binding site are missing in some structures, we use a relatively small pocket radius to be sure that the binding site is complete. Since our ligand is defined as a single aromatic ring, we set the pocket radius to 8 Angstroms.
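To get a feeling for which residues an 8 Angstrom radius covers, the selection can be approximated offline. The sketch below is an assumption about the criterion, not TRAPP's actual implementation: it selects every residue with at least one atom within the radius of any ligand atom, and TRAPP may measure the distance differently (for example from the ligand center). The function names are hypothetical.

```python
import math


def parse_atoms(pdb_lines):
    """Return (chain, residue number, residue name, (x, y, z)) for each atom record."""
    atoms = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.append((line[21], int(line[22:26]), line[17:20].strip(), xyz))
    return atoms


def residues_near_ligand(protein_lines, ligand_lines, radius=8.0):
    """Residues with at least one atom within `radius` Angstroms of any ligand atom."""
    ligand_xyz = [atom[3] for atom in parse_atoms(ligand_lines)]
    selected = set()
    for chain, number, name, xyz in parse_atoms(protein_lines):
        if any(math.dist(xyz, lig) <= radius for lig in ligand_xyz):
            selected.add((chain, number, name))
    return sorted(selected)


# Example call with the files used in this tutorial:
# with open("prot.pdb") as prot, open("ligand.pdb") as lig:
#     print(residues_near_ligand(prot, lig, radius=8.0))
```

Running it with a few different radii shows how quickly the residue list grows, which helps when checking that no selected residue falls in a region missing from some structures.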
3. Simulation procedure and results
- Step 1: Uploading reference structure and defining input parameters
Upload the reference structures of the protein and the ligand, and select Upload trajectories at the next step before you click Next Step.
- Step 2: Uploading PDB structures
On the next page, Pocket analysis will be started with the following parameters, you should see the JSmol visualization of the reference structure and the identified binding pocket. You can then upload all PDB structures (as PDB trajectories) one after another and launch TRAPP-structure/analysis. Since no method for generating new structures was selected in the previous step, only the uploaded structures (trajectories) will be analyzed.
- Step 3: Checking input data and starting simulations
On the page Upload Custom Trajectories, check that the list of PDB files at the bottom of the page contains all the structures you want to analyze. Then click Launch TRAPP structure/analysis.
- Step 4: Viewing simulation results of TRAPP-analysis
When the job is completed, you can open a page with preliminary results (link View Analysis). Note that at this step TRAPP-analysis runs with default parameters: backbone atoms are used for RMSD calculations, and a fast hierarchical clustering of the binding site conformations is carried out with a threshold of 3 Angstroms.
These parameters may not be well suited to our particular case. Specifically, the selected binding site residues may not completely represent the binding pocket or, alternatively, may include residues that are unrelated to the binding site regions (use the link View cluster representatives in JSmol to view the selected binding site residues, which are shown as sticks). This can be corrected by changing the parameters and clicking re-run TRAPP-analysis. Look in the Log_file to see whether any residues in the binding pocket are missing (for example, the line TRAPP:------trj: 165LEU(ref: 166PHE) -> mutated means that PHE166 is probably absent in your PDB structure and that the next residue will be used instead for alignment and RMSD calculations).
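With 17 structures the log can get long, so it may help to collect the "mutated" lines automatically. The sketch below is only an illustration: the pattern is inferred from the single example line quoted above, so other variants of the log format may not match, and the function name is hypothetical.

```python
import re

# Pattern inferred from the one example line in this tutorial (an assumption):
#   TRAPP:------trj: 165LEU(ref: 166PHE) -> mutated
MUTATED = re.compile(
    r"trj:\s*(\d+)([A-Z]{3})\(ref:\s*(\d+)([A-Z]{3})\)\s*->\s*mutated"
)


def mutated_residues(log_lines):
    """Return (trj_number, trj_name, ref_number, ref_name) for each 'mutated' line."""
    hits = []
    for line in log_lines:
        match = MUTATED.search(line)
        if match:
            trj_num, trj_name, ref_num, ref_name = match.groups()
            hits.append((int(trj_num), trj_name, int(ref_num), ref_name))
    return hits


print(mutated_residues(["TRAPP:------trj: 165LEU(ref: 166PHE) -> mutated"]))
# [(165, 'LEU', 166, 'PHE')]
```

Each tuple pairs the residue found in the uploaded structure with the reference residue it replaced, so reference residues that appear repeatedly are candidates for a closer look before the analysis is re-run.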
- Step 5: Re-run TRAPP-analysis using k-means clustering
We add residues 166, 169, and 170 (in particular Phe169 of the DFG loop, numbered as in the original PDB structure) and residue 28 from the flexible beta-sheet (transient pocket A) to the list.
We also choose k-means clustering and increase RMSD to be used for preliminary hierarchical clustering of structures to 5 Angstroms in order to reduce the number of clusters. We also increase Maximum value of deviation to be shown in plots to 10 Angstroms. Then click Re-run TRAPP-analysis. (Note that the binding site residues are renumbered as shown in the residue mapping and summary input page.)
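Why a larger RMSD threshold yields fewer clusters can be illustrated with a toy example. The sketch below is not TRAPP's algorithm: it assumes single-linkage hierarchical clustering cut at the threshold (equivalent to grouping structures connected by a chain of pairwise distances at or below the threshold), and the RMSD values are made up for illustration.

```python
def threshold_clusters(dist, threshold):
    """Single-linkage clusters of a symmetric distance matrix cut at `threshold`."""
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the cluster root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                parent[find(i)] = find(j)  # merge the two clusters

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical pairwise binding-site RMSD values (Angstroms) for four structures.
rmsd = [
    [0.0, 2.0, 4.5, 9.0],
    [2.0, 0.0, 4.0, 8.5],
    [4.5, 4.0, 0.0, 7.0],
    [9.0, 8.5, 7.0, 0.0],
]
print(threshold_clusters(rmsd, 3.0))  # [[0, 1], [2], [3]]
print(threshold_clusters(rmsd, 5.0))  # [[0, 1, 2], [3]]
```

With the 3 Angstrom threshold the four toy structures fall into three clusters; at 5 Angstroms two of them merge, leaving two clusters.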
- Step 6: Editing TRAPP-pocket parameters and running TRAPP-pocket
For our example, the TRAPP-pocket parameters are adjusted as follows:
- Check Use TRAPP-analysis based binding site residues to make sure that all binding site residues are included in pocket simulations.
- Check H-atoms to be generated for the reference structure and H-atoms to be generated for structures in trajectories to ensure the correct pocket shape. There is another (faster, but less accurate) way of taking hydrogen atoms into account in pocket calculations: increasing the Lennard-Jones radii used for pocket identification (for this, check Only heavy atoms to be used in Advanced parameters for the pocket identification procedure).
- Change Analysis will be done at the pocket occurrence of (%): to 6, 10, 12, 18, 30 (we have 17 structures, so 1 structure corresponds to about 6% occurrence). Check run PCA and run Clustering.
- Launch TRAPP-pocket.
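The occurrence values above follow from the number of uploaded structures. As a quick check of the arithmetic (a small sketch with a hypothetical function name, independent of TRAPP), the occurrence of a region that is open in a given number of the 17 structures is:

```python
def occurrence_percent(n_open, n_structures=17):
    """Percentage of structures in which a pocket region is open."""
    return 100.0 * n_open / n_structures


for n_open in (1, 2, 3, 5):
    print(n_open, round(occurrence_percent(n_open), 1))
# 1 5.9
# 2 11.8
# 3 17.6
# 5 29.4
```

The chosen values 6, 12, 18, and 30 are therefore close to the occurrences of regions open in 1, 2, 3, and 5 structures, respectively.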
- Step 7: Understanding TRAPP-pocket results
- JSmol.html with a summary of simulation results can be opened using the link "Combined trajectories"
- Conserved/transient regions:
- Link Clustering of transient regions opening in a percentage of snapshots
yellow - pocket population (see Fig.11a and Fig.11b).
red - transient pocket regions that are absent in the reference structure but appear in some other structures (see Fig.12a and Fig.12b).
Transient regions are shown at 20% occurrence, i.e. they are open in 3-4 structures (since we uploaded 17 structures).
This, however, does not mean that all transient regions shown are open simultaneously in the snapshots.
To find the snapshot/structure in which a particular transient region is open,
use the tab Pocket Characteristics.
Here the opening transient pocket regions are split into three pockets, and
the largest one (#2) is split once more into sub-regions (sub-pockets).
Pocket #3 arises from the motion of Y35 (sub-pocket A) and
is mostly open in structures 3lfa, 1wfc, and 4f9y
(see the left plot, Pocket occurrence in the trajectory, also shown in Fig.b;
use the link under the plot to get the original data).
Sub-pockets #4 and #6 correspond to transient pocket C, which arises from the motion of the DFG loop.
They are mostly open (to different degrees) in structures 3nnw, 3nnu, 3roc, 1kv2, and 1kv1.