A few years ago, the National Institutes of Health (NIH) created the Biomedical Information Science and Technology Initiative (BISTI) to examine the current state of bioinformatics in the United States. BISTI’s working definition of bioinformatics included its use in biomedical research, in
particular for drug discovery and development programs. Bioinformatics was seen as an emerging field with the potential to significantly improve how drugs are found, brought to clinical trials and
eventually released to the marketplace.
Computer-Aided Drug Design (CADD) is a specialized discipline that uses computational methods to simulate drug-receptor interactions. CADD methods are heavily dependent on bioinformatics tools,
applications and databases. As such, there is considerable overlap in CADD research and bioinformatics.
Bioinformatics can be thought of as a central hub that unites several disciplines and methodologies.
On the support side of the hub, Information Technology, Information Management, software applications, databases and computational resources all provide the infrastructure for bioinformatics. On
the scientific side of the hub, bioinformatic methods are used extensively in molecular biology, genomics, proteomics, other emerging areas (e.g., metabolomics, transcriptomics) and in CADD.
There are several key areas where bioinformatics supports CADD research.
Virtual High-Throughput Screening (vHTS). Pharmaceutical companies are always searching for new leads to develop into drug compounds. One search method is virtual high-throughput
screening. In vHTS, protein targets are screened against databases of small-molecule compounds to see which molecules bind strongly to the target. If there is a “hit” with a
particular compound, it can be extracted from the database for further testing. With today’s computational resources, several million compounds can be screened in a few days on sufficiently
large clustered computers. Pursuing a handful of promising leads for further development can save researchers considerable time and expense. ZINC
is a good example of a vHTS compound library.
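The screening loop described above can be sketched in a few lines of Python. This is only an illustration of the ranking step, not a docking program: `score_compound` is a hypothetical stand-in for a real docking engine, and the compound IDs and scores are made-up placeholders rather than entries from ZINC or any other library.

```python
from typing import Callable, Dict, List, Tuple


def screen_library(
    compound_ids: List[str],
    score_compound: Callable[[str], float],
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Score every compound against one target and return the top_n hits.

    Assumes lower (more negative) scores mean stronger predicted binding,
    as is common when scores are reported as estimated binding energies.
    """
    scored = [(cid, score_compound(cid)) for cid in compound_ids]
    scored.sort(key=lambda pair: pair[1])  # most negative score first
    return scored[:top_n]


# Hypothetical precomputed scores standing in for a real docking run.
FAKE_SCORES: Dict[str, float] = {
    "cmpd-001": -6.2,
    "cmpd-002": -9.1,
    "cmpd-003": -4.8,
    "cmpd-004": -8.3,
    "cmpd-005": -7.0,
}

hits = screen_library(list(FAKE_SCORES), FAKE_SCORES.__getitem__, top_n=2)
print(hits)
```

In a real campaign the scoring call dominates the run time, which is why the work is usually spread across a cluster. The ranking and cutoff logic stays this simple.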
Sequence Analysis. In CADD research, one often knows the genetic sequence of multiple organisms or the amino acid sequence of proteins from several species. It is very useful to determine
how similar or dissimilar the organisms are based on gene or protein sequences. With this information one can infer the evolutionary relationships of the organisms, search for similar sequences
in bioinformatic databases and find related species to those under investigation. There are many bioinformatic sequence analysis tools that
can be used to determine the level of sequence similarity.
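One simple measure of sequence similarity is percent identity between two sequences that have already been aligned. The sketch below assumes the alignment was produced elsewhere and uses "-" as the gap character. The function name and the example sequences are illustrative, and ignoring gapped columns in the denominator is only one of several conventions in use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns (one possible convention)
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Two made-up, pre-aligned protein fragments.
print(percent_identity("MKT-AYIAK", "MKTWAYLAK"))
```

Production tools go further by scoring substitutions with matrices and computing the alignment itself, but the identity figure they report is built on the same column-by-column comparison.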
Homology Modeling. Another common challenge in CADD research is determining the 3-D structure of proteins. Most drug targets are proteins, so it’s important to know their 3-D
structure in detail. It’s estimated that the human body has 500,000 to 1 million proteins. However, the 3-D structure is known for only a small fraction of these. Homology modeling is one
method used to predict 3-D structure. In homology modeling, the amino acid sequence of a specific protein (target) is known, and the 3-D structures of proteins related to the target (templates)
are known. Bioinformatics software tools are then used to predict the 3-D structure of the target based on the known 3-D structures of the templates. MODELLER is a well-known tool in homology modeling, and the SWISS-MODEL Repository is a database of protein
structures created with homology modeling.
Similarity Searches. A common activity in biopharmaceutical companies is the search for drug analogues. Starting with a promising drug molecule, one can search for chemical compounds with
similar structure or properties to a known compound. There are a variety of methods used in these searches, including sequence similarity, 2D and 3D shape similarity, substructure similarity,
electrostatic similarity and others. A variety of bioinformatic tools and search engines are available for this work.
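Substructure-style similarity is often quantified with the Tanimoto coefficient, the number of shared features divided by the number of features present in either compound. The following is a minimal sketch under simplifying assumptions: each compound is represented as a plain set of feature labels, and the feature names and compound IDs are hypothetical. Real fingerprints are generated by cheminformatics software rather than written by hand.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Tanimoto coefficient: |A intersect B| / |A union B|."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def rank_analogues(
    query: FrozenSet[str], library: Dict[str, FrozenSet[str]]
) -> List[Tuple[str, float]]:
    """Return library compounds sorted from most to least similar to the query."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical feature sets for a lead compound and three library compounds.
lead = frozenset({"aromatic_ring", "carboxylic_acid", "ester"})
library = {
    "A": frozenset({"aromatic_ring", "carboxylic_acid", "hydroxyl"}),
    "B": frozenset({"aromatic_ring", "amide", "hydroxyl"}),
    "C": frozenset({"aromatic_ring", "carboxylic_acid", "ester"}),
}

print(rank_analogues(lead, library))
```

A similarity cutoff can then be applied to the ranked list to decide which analogues are worth a closer look. The right threshold depends on the fingerprint type and the project.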
Drug Lead Optimization. When a promising lead candidate has been found in a drug discovery program, the next step (a very long and expensive step!) is to optimize the structure and
properties of the potential drug. This usually involves a series of modifications to the primary structure (scaffold) and secondary structure (moieties) of the compound. This process can be
enhanced using software tools that explore related compounds (bioisosteres) to the lead candidate. OpenEye’s WABE is
one such tool. Lead optimization tools such as WABE offer a rational approach to drug design that can reduce the time and expense of searching for related compounds.
Physicochemical Modeling. Drug-receptor interactions occur on atomic scales. To form a deep understanding of how and why drug compounds bind to protein targets, we must consider the
biochemical and biophysical properties of both the drug itself and its target at an atomic level. Swiss-PDB is an excellent tool for doing this. Swiss-PDB
can predict key physicochemical properties, such as hydrophobicity and polarity, that have a profound influence on how drugs bind to proteins.
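To make the idea of a predicted hydrophobicity concrete, here is a short sketch that scores a protein sequence with the widely used Kyte-Doolittle hydropathy scale. It is not a description of how Swiss-PDB works internally. The function names and the example sequences are illustrative assumptions.

```python
from typing import List

# Kyte-Doolittle hydropathy values (positive = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value over the sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[res] for res in sequence.upper()) / len(sequence)


def hydropathy_profile(sequence: str, window: int = 3) -> List[float]:
    """Mean hydropathy in each sliding window along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    return [gravy(seq[i:i + window]) for i in range(len(seq) - window + 1)]


print(gravy("AILK"))
print(hydropathy_profile("AILK", window=3))
```

Stretches with high window averages are candidates for buried or membrane-spanning regions, while strongly negative stretches tend to be solvent exposed. Both affect where a drug molecule can bind.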
Drug Bioavailability and Bioactivity. Most drug candidates fail in Phase III clinical trials after many years of research and millions of dollars have been spent on them. And most fail
because of toxicity or problems with metabolism. The key characteristics for drugs are Absorption, Distribution, Metabolism, Excretion, Toxicity (ADMET) and efficacy; in other words,
bioavailability and bioactivity. Although these properties are usually measured in the lab, they can also be predicted in advance with bioinformatics tools.
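One well-known example of predicting such properties in advance is Lipinski's rule of five, a heuristic for likely oral absorption based on four molecular descriptors. The sketch below assumes the descriptors have already been computed by other software. The `Descriptors` container and the two example compounds are hypothetical, with approximate, illustrative values. The rule covers only part of the ADMET picture and is not a substitute for lab measurement.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    mol_weight: float       # daltons
    logp: float             # octanol-water partition coefficient (log10)
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> List[str]:
    """List which of the four rule-of-five criteria a compound violates."""
    violations = []
    if d.mol_weight > 500:
        violations.append("molecular weight > 500 Da")
    if d.logp > 5:
        violations.append("logP > 5")
    if d.h_bond_donors > 5:
        violations.append("more than 5 hydrogen-bond donors")
    if d.h_bond_acceptors > 10:
        violations.append("more than 10 hydrogen-bond acceptors")
    return violations


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Commonly applied form of the rule: at most one violation is tolerated."""
    return len(lipinski_violations(d)) <= max_violations


# Illustrative values only: a small aspirin-like molecule and a large, greasy one.
small = Descriptors(mol_weight=180.2, logp=1.2, h_bond_donors=1, h_bond_acceptors=4)
large = Descriptors(mol_weight=720.0, logp=6.3, h_bond_donors=6, h_bond_acceptors=12)

print(passes_rule_of_five(small), lipinski_violations(large))
```

Filters like this are cheap enough to run over an entire compound library before any docking or lab work begins.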
Benefits of CADD
CADD methods and bioinformatics tools offer significant benefits for drug discovery programs.
Cost Savings. The Tufts Report suggests that the cost of drug discovery and development has reached $800 million for each drug successfully brought to
market. Many biopharmaceutical companies now use computational methods and bioinformatics tools to reduce this cost burden. Virtual screening, lead optimization and predictions of bioavailability
and bioactivity can help guide experimental research. Researchers can follow only the most promising experimental lines of inquiry and avoid experimental dead ends early, based on the results of these computational predictions.
Time-to-Market. The predictive power of CADD can help drug research programs choose only the most promising drug candidates. By focusing drug research on specific lead candidates and
avoiding potential “dead-end” compounds, biopharmaceutical companies can get drugs to market more quickly.
Insight. One of the non-quantifiable benefits of CADD and the use of bioinformatics tools is the deep insight that researchers acquire about drug-receptor interactions. Molecular models of
drug compounds can reveal intricate, atomic scale binding properties that are difficult to envision in any other way. When we show researchers new molecular models of their putative drug
compounds, their protein targets and how the two bind together, they often come up with new ideas on how to modify the drug compounds for improved fit. This is an intangible benefit that can help
design research programs.
CADD and bioinformatics together are a powerful combination in drug research and development. An important challenge for us going forward is finding skilled, experienced people to manage all the
bioinformatics tools available to us, which will be a topic for a future article.
Recent articles by Dr. Richard Casey