3DLigandSite: This is an automated method for predicting ligand-binding sites. The user submits either a protein sequence or a protein structure; when a sequence is submitted, Phyre is run to predict its structure. The structure is then used to search a structural library for homologous structures with bound ligands. These ligands are superimposed onto the protein structure to predict the ligand-binding site [14]. It can be accessed at http://www.sbg.bio.ic.ac.uk/3dligandsite.
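The superimposition idea can be illustrated with a minimal sketch. It assumes that ligands taken from homologous structures have already been superimposed onto the query protein, and it simply reports every residue that has an atom within a distance cutoff of any ligand atom. The function name, the residue labels, the coordinates, and the 4.0 Å cutoff are illustrative assumptions; this is not the actual procedure or scoring used by 3DLigandSite.

```python
import math


def predict_binding_residues(residue_atoms, ligand_atoms, cutoff=4.0):
    """Return the sorted ids of residues that have at least one atom
    within `cutoff` angstroms of at least one ligand atom.

    residue_atoms: dict mapping a residue id to a list of (x, y, z) tuples
    ligand_atoms:  list of (x, y, z) tuples, assumed to be already
                   superimposed onto the query structure
    """
    hits = []
    for res_id, atoms in residue_atoms.items():
        close = any(
            math.dist(atom, lig) <= cutoff
            for atom in atoms
            for lig in ligand_atoms
        )
        if close:
            hits.append(res_id)
    return sorted(hits)


# Hypothetical toy coordinates, for illustration only.
residue_atoms = {
    "HIS57": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    "SER195": [(3.0, 3.0, 0.0)],
    "GLY10": [(20.0, 20.0, 20.0)],
}
ligand_atoms = [(2.0, 1.0, 0.0), (4.0, 4.0, 1.0)]

print(predict_binding_residues(residue_atoms, ligand_atoms))
```

In practice the coordinates would be read from structure files, and a real predictor would also weigh how many superimposed ligands support each residue.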
1.2.3 Molecular Modeling
When the structure of a target is not available and determining it experimentally is difficult, the protein structure has to be modeled from pre-existing data and the sequence. Protein-ligand binding plays an important role in drug design, so it is important to have a 3D structure of the protein. Known 3D structures are searched in the Protein Data Bank (PDB), a widely used repository for all the known protein 3D structures [15]. X-ray crystallography and NMR spectroscopy are the two important techniques for determining a protein's 3D structure experimentally. When no experimental structure exists, the structure can instead be predicted in silico by “homology-based modeling”, also referred to as molecular modeling. It is an important computational technique which helps in designing the structure of a novel compound. It plays an important role in the study of various biological processes, including protein folding and stability, enzyme catalysis, and the identification of novel proteins and other macromolecules [16]. The methodology works on the basis of sequence similarity, i.e., “proteins with similar sequences have similar structures”. The generated models usually share significant sequence identity, of more than 30%, with their template [17]. Under these conditions the method is accurate, allowing researchers to obtain a reliable structure which might be useful in drug design after further validation. For this reason, virtual screening has become an important part of the drug discovery process [18]. The most common modeling software packages, along with their descriptions, are listed in Table 1.1. Some of the important steps involved in molecular modeling are as follows:
a) Recognition of template and sequence alignment: Recognition of the template is the first step in homology modeling. To identify homologs of the unknown protein, its sequence is searched against existing sequences whose structures are known. The homologous sequences can be identified by similarity searches performed with sequence alignment programs such as BLAST (Basic Local Alignment Search Tool).
b) Model building: Some of the methods involved in building a model are spatial restraint, rigid-body assembly, segment matching, and artificial evolution.
c) Refinement modeling: Refinement of the model involves addition, deletion, and substitution of amino acid residues, and includes loop modeling and side-chain modeling. This kind of modeling is based on molecular dynamics simulations, genetic algorithms, and Monte Carlo methods. AMBER, CHARMM22, and MM3 are commonly used force fields for energy minimization of modeled structures.
1. Loop modeling: In homologous protein sequences, the variable portions of the protein in which amino acid residues are inserted or substituted are referred to as loops.
2. Side-chain modeling: It involves substitution of the side chains on the backbone structure of the protein. The substitution is analyzed by Root Mean Square Deviation (RMSD) values.
d) Validation of modeled protein structure: The protein structure obtained after homology modeling needs to be validated in order to check the accuracy of the modeling. This can be performed using tools and web servers such as WHATCHECK, WHAT IF, VADAR, and PROCHECK.
e) Small molecule databases: In drug discovery, compounds can be screened for novel and drug-like properties using small molecule databases such as NCBI, PubChem, and ChEMBL [19].
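Two quantities mentioned in the steps above are simple to compute: the percent sequence identity used when judging a template (step a) and the RMSD used to compare coordinates (step c). The sketch below is a minimal illustration. It assumes the two sequences are already aligned with "-" for gaps and that identity is taken over ungapped columns only (other conventions exist), and it assumes the two coordinate sets are already superimposed with atoms paired in the same order. The function names and the example data are hypothetical.

```python
import math


def percent_identity(aligned_target, aligned_template):
    """Percent identity over alignment columns where neither sequence
    has a gap ('-'). Both strings must have the same aligned length."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of
    (x, y, z) tuples. No superposition is performed here."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


# Hypothetical aligned sequences and coordinates, for illustration only.
target = "MKT-AYIAKQR"
template = "MKTLAYLA-QR"
identity = percent_identity(target, template)
print(f"identity = {identity:.1f}%  (above 30% threshold: {identity > 30.0})")

model_atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference_atoms = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(f"RMSD = {rmsd(model_atoms, reference_atoms):.2f}")
```

A real workflow would take the alignment from a program such as BLAST and would superimpose the structures before computing the RMSD.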
Table 1.1 Software packages commonly used in protein modeling.
S. no. | Name of software | Description | Reference |
1 | MODELLER | It involves homology modeling of the three-dimensional structures of the target protein. | [20] |
2 | UCSF Chimera | It helps in the visualization and analysis of molecular structures. | [21] |
3 | SWISS PDB VIEWER | It allows proteins to be analyzed and modeled. | [22] |
4 | Geno3D | It is an automatic web server for protein molecular modeling. | [23] |
5 | SWISS MODEL | Automated comparative modeling of protein structures can be performed. | [24] |
6 | CCP4 | It helps in macromolecular structure determination. | [25] |
7 | Abalone | It is a modeling program for molecular dynamics of biopolymers. | [26] |
8 | Tinker | It performs molecular mechanics and dynamics along with some unique features for biopolymers. | [27] |
MODELLER: It is a computer program which models the 3D structures of proteins and their assemblies, and it is the program most frequently used for homology modeling. To build a model, the user provides an alignment of the sequence to be modeled with known related structures [27]. The program then automatically builds a model containing all non-hydrogen atoms (https://salilab.org/modeller/).
SWISS PDB VIEWER: The Swiss PDB Viewer is a free molecular graphics program that allows several proteins to be analyzed at the same time. The proteins can be superimposed in order to deduce structural alignments and compare their active sites [28]. The program can be accessed at https://spdbv.vital-it.ch/.
SWISS MODEL: It is a fully automated protein structure homology-modeling server. It can be reached through the ExPASy web server or from Swiss PDB Viewer. Its main aim is to make protein modeling accessible to all life science researchers [29]. It can be accessed at https://swissmodel.expasy.org.
1.2.4 Virtual Screening
Virtual screening is an in silico method used in drug design. It is used to identify active compounds in chemical databases. It helps in identifying the structure of those compounds that may act as lead compounds with maximum affinity