Calculation of the Binding Pocket Druggability
Two models for predicting protein binding pocket druggability are implemented in TRAPP: logistic regression (LR) and a convolutional neural network (CNN).
All snapshots of a trajectory are first aligned and superimposed onto the user-defined reference structure. The pocket druggability is then computed for each snapshot and plotted to show how it varies along the trajectory.
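The per-snapshot workflow can be illustrated with a minimal Python sketch. It is not TRAPP code: the function names, the toy scoring function, the snapshot dictionaries and the 0.5 threshold are all assumptions made for illustration. It only shows the idea of scoring each aligned snapshot independently and then summarizing how the score varies.

```python
from statistics import mean, pstdev


def druggability_profile(snapshots, score_fn):
    """Score every (already aligned) snapshot independently, in trajectory order."""
    return [score_fn(snapshot) for snapshot in snapshots]


def summarize(scores, threshold=0.5):
    """Summarize the variation of the score; the threshold is an assumed cutoff."""
    return {
        "mean": mean(scores),
        "std": pstdev(scores),
        "min": min(scores),
        "max": max(scores),
        "fraction_above": sum(s >= threshold for s in scores) / len(scores),
    }


# Hypothetical snapshots, each reduced to a single made-up descriptor.
toy_snapshots = [{"volume": 300.0}, {"volume": 450.0}, {"volume": 600.0}]


def toy_score(snapshot):
    """Placeholder score in [0, 1]; stands in for a real druggability model."""
    return min(snapshot["volume"] / 600.0, 1.0)


profile = druggability_profile(toy_snapshots, toy_score)
summary = summarize(profile)
```

In practice the list `profile` would be what is plotted against the snapshot index.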
TRAPP-LR provides a linear model of pocket druggability, trained by logistic regression on global descriptors of the binding pocket (such as pocket volume and pocket hydrophobicity). TRAPP-CNN uses a convolutional neural network to process a spatial grid representation of the binding pocket properties. The TRAPP-LR model was trained on the NRDLD dataset (Krasowski et al. JCIM 2011, 51, 2829–2842). Note that druggability calculations use a fixed grid size and spacing (grid edge length of 24 Å, grid spacing of 0.75 Å). TRAPP-CNN was trained on a larger dataset (DaPB), obtained from the PDBbind refined set by thresholding the properties and the binding affinity and augmented with a negative (less druggable) class identified using the TRAPP-SVM model (a support vector machine with a linear kernel), which was itself trained on the NRDLD dataset and uses global descriptors.
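As a rough illustration of the two ideas above, the following Python sketch shows (a) how a logistic regression model turns global descriptors into a score between 0 and 1, and (b) the grid dimensions implied by the fixed edge length and spacing. The function names, descriptor values, weights and bias are hypothetical placeholders, not the fitted TRAPP-LR parameters, and whether TRAPP counts grid cells or grid points along an edge is not stated here, so the count below assumes cells.

```python
import math


def lr_druggability(descriptors, weights, bias):
    """Logistic regression score: sigmoid of a weighted sum of global descriptors."""
    z = bias + sum(weights[name] * value for name, value in descriptors.items())
    return 1.0 / (1.0 + math.exp(-z))


def grid_cells_per_edge(edge_length, spacing):
    """Number of grid cells along one edge (assumes the edge is a multiple of the spacing)."""
    return round(edge_length / spacing)


# Grid parameters quoted in the text (in Angstrom).
EDGE_LENGTH = 24.0
SPACING = 0.75

cells = grid_cells_per_edge(EDGE_LENGTH, SPACING)  # 24 / 0.75 = 32
voxels = cells ** 3                                # 32 * 32 * 32 = 32768

# Made-up descriptor values and weights, for illustration only.
example_descriptors = {"volume": 0.8, "hydrophobicity": 0.6}
example_weights = {"volume": 1.5, "hydrophobicity": 2.0}
example_score = lr_druggability(example_descriptors, example_weights, bias=-1.0)
```

With these assumptions the grid has 32 cells per edge, so a CNN input of this kind would hold 32768 voxels per property channel.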