Bioinformatics in Computer-Aided Drug Design

Originally published May 10, 2005

A few years ago, the National Institutes of Health (NIH) created the Biomedical Information Science and Technology Initiative (BISTI) to examine the current state of bioinformatics in the United States.  BISTI’s working definition of bioinformatics included its use in biomedical research, in particular for drug discovery and development programs. Bioinformatics was seen as an emerging field with the potential to significantly improve how drugs are found, brought to clinical trials and eventually released to the marketplace.

Computer-Aided Drug Design (CADD) is a specialized discipline that uses computational methods to simulate drug-receptor interactions. CADD methods depend heavily on bioinformatics tools, applications and databases, so there is considerable overlap between CADD research and bioinformatics.

Bioinformatics Hub

Bioinformatics can be thought of as a central hub that unites several disciplines and methodologies.


On the support side of the hub, Information Technology, Information Management, software applications, databases and computational resources all provide the infrastructure for bioinformatics. On the scientific side of the hub, bioinformatic methods are used extensively in molecular biology, genomics, proteomics, other emerging areas (e.g., metabolomics, transcriptomics) and in CADD research.

There are several key areas where bioinformatics supports CADD research. 

  • Virtual High-Throughput Screening (vHTS). Pharmaceutical companies are always searching for new leads to develop into drug compounds. One search method is virtual high-throughput screening. In vHTS, protein targets are screened against databases of small-molecule compounds to see which molecules bind strongly to the target. If there is a “hit” with a particular compound, it can be extracted from the database for further testing. With today’s computational resources, several million compounds can be screened in a few days on a sufficiently large computer cluster. Narrowing the field to a handful of promising leads for further development can save researchers considerable time and expense. ZINC is a good example of a vHTS compound library.
  • Sequence Analysis. In CADD research, one often knows the genetic sequences of multiple organisms or the amino acid sequences of proteins from several species. It is very useful to determine how similar or dissimilar the organisms are based on gene or protein sequences. With this information one can infer the evolutionary relationships of the organisms, search for similar sequences in bioinformatic databases and find species related to those under investigation. Many bioinformatic sequence analysis tools can be used to determine the level of sequence similarity.
  • Homology Modeling. Another common challenge in CADD research is determining the 3-D structure of proteins. Most drug targets are proteins, so it’s important to know their 3-D structure in detail. It’s estimated that the human body has 500,000 to 1 million proteins. However, the 3-D structure is known for only a small fraction of these. Homology modeling is one method used to predict 3-D structure. In homology modeling, the amino acid sequence of a specific protein (target) is known, and the 3-D structures of proteins related to the target (templates) are known. Bioinformatics software tools are then used to predict the 3-D structure of the target based on the known 3-D structures of the templates.  MODELLER is a well-known tool in homology modeling, and the SWISS-MODEL Repository is a database of protein structures created with homology modeling.
  • Similarity Searches. A common activity in biopharmaceutical companies is the search for drug analogues. Starting with a promising drug molecule, one can search for chemical compounds whose structure or properties resemble it. A variety of methods are used in these searches, including sequence similarity, 2D and 3D shape similarity, substructure similarity, electrostatic similarity and others. A variety of bioinformatic tools and search engines are available for this work.
  • Drug Lead Optimization. When a promising lead candidate has been found in a drug discovery program, the next step (a very long and expensive step!) is to optimize the structure and properties of the potential drug. This usually involves a series of modifications to the primary structure (scaffold) and secondary structure (moieties) of the compound. This process can be enhanced using software tools that explore related compounds (bioisosteres) to the lead candidate. OpenEye’s WABE is one such tool. Lead optimization tools such as WABE offer a rational approach to drug design that can reduce the time and expense of searching for related compounds.
  • Physicochemical Modeling. Drug-receptor interactions occur on atomic scales. To understand how and why drug compounds bind to protein targets, we must consider the biochemical and biophysical properties of both the drug and its target at an atomic level. Swiss-PDB is an excellent tool for doing this. Swiss-PDB can predict key physicochemical properties, such as hydrophobicity and polarity, that have a profound influence on how drugs bind to proteins.
  • Drug Bioavailability and Bioactivity. Many drug candidates fail as late as Phase III clinical trials, after years of research and millions of dollars have been spent on them, often because of toxicity or problems with metabolism. The key characteristics for drugs are Absorption, Distribution, Metabolism, Excretion, Toxicity (ADMET) and efficacy; in other words, bioavailability and bioactivity. Although these properties are usually measured in the lab, they can also be predicted in advance with bioinformatics software.
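
Several of the methods listed above come down to simple comparisons that are easy to sketch in code. The Python example below is a minimal illustration under stated assumptions, not a production tool and not the method used by any package named in this article. It shows percent identity between two pre-aligned sequences (sequence analysis), the Tanimoto coefficient between two sets of structural features (similarity searches), and a count of Lipinski "rule of five" violations as a rough bioavailability screen. The rule of five is a widely used heuristic that this article does not itself mention; it is brought in here as an assumption. All function names, the Compound class and the example values are hypothetical.

```python
from dataclasses import dataclass


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumes the alignment was produced elsewhere; columns where either
    sequence has a gap character ('-') are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def tanimoto(features_a: set, features_b: set) -> float:
    """Tanimoto coefficient: shared features divided by total distinct features."""
    union = features_a | features_b
    if not union:
        return 0.0
    return len(features_a & features_b) / len(union)


@dataclass
class Compound:
    """Hypothetical record of precomputed properties for one compound."""
    name: str
    mol_weight: float  # molecular weight in daltons
    logp: float        # octanol-water partition coefficient (log P)
    h_donors: int      # hydrogen-bond donors
    h_acceptors: int   # hydrogen-bond acceptors


def lipinski_violations(c: Compound) -> int:
    """Count rule-of-five violations (a heuristic, not an ADMET prediction)."""
    return sum([
        c.mol_weight > 500,
        c.logp > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])


if __name__ == "__main__":
    # Made-up sequences and feature sets, for illustration only.
    print(percent_identity("MKT-AYIAK", "MKTLAYLAK"))   # 87.5
    print(tanimoto({"ring", "OH", "COOH", "ester"},
                   {"ring", "OH", "COOH", "amide"}))     # 0.6

    # Made-up property values, not measurements of real compounds.
    candidates = [
        Compound("cand-1", 350.4, 2.1, 2, 5),
        Compound("cand-2", 712.9, 6.3, 4, 12),
    ]
    for c in candidates:
        print(c.name, lipinski_violations(c))            # cand-1 0, cand-2 3
```

In practice the feature sets would be replaced by molecular fingerprints computed with a cheminformatics toolkit, and the alignment would come from a dedicated alignment program; the arithmetic of the comparison stays the same.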

Benefits of CADD

CADD methods and bioinformatics tools offer significant benefits for drug discovery programs.

  • Cost Savings. The Tufts Report suggests that the cost of drug discovery and development has reached $800 million for each drug successfully brought to market. Many biopharmaceutical companies now use computational methods and bioinformatics tools to reduce this cost burden. Virtual screening, lead optimization and predictions of bioavailability and bioactivity can help guide experimental research. Based on the results of CADD simulations, researchers can pursue only the most promising experimental lines of inquiry and abandon dead ends early.
  • Time-to-Market. The predictive power of CADD can help drug research programs choose only the most promising drug candidates. By focusing drug research on specific lead candidates and avoiding potential “dead-end” compounds, biopharmaceutical companies can get drugs to market more quickly. 
  • Insight. One of the non-quantifiable benefits of CADD and the use of bioinformatics tools is the deep insight that researchers acquire about drug-receptor interactions. Molecular models of drug compounds can reveal intricate, atomic scale binding properties that are difficult to envision in any other way. When we show researchers new molecular models of their putative drug compounds, their protein targets and how the two bind together, they often come up with new ideas on how to modify the drug compounds for improved fit. This is an intangible benefit that can help design research programs.

CADD and bioinformatics together are a powerful combination in drug research and development. An important challenge for us going forward is finding skilled, experienced people to manage all the bioinformatics tools available to us, which will be a topic for a future article.



  • Dr. Richard Casey

    Richard is the Founder and Chief Scientific Officer of RMC Biosciences Inc., a firm that offers services in Bioinformatics and Computer Aided Drug Design. Dr. Casey received a Ph.D. in Biological Sciences from Colorado State University. He has 20-plus years experience in Computational Sciences, Information Technology and High-Performance Computing. He has held corporate and academic positions at Hewlett-Packard, Boeing Computer Services, Arizona State University, Colorado State University, the Alabama Supercomputer Center, and the Institute for Computational Studies at CSU and was the founder of a software consulting firm, Alpine Computing Inc. He holds a Project Management Professional Certificate and a Bioinformatics Certificate from Stanford University. Richard can be reached at
