An expert review of potentially reactive features

In several recent posts1,2 we highlighted the usefulness of an expert review of potentially reactive features. This is particularly important when an out-of-domain result is returned or when an area of the test chemical is not being considered by the computational model. The US Food and Drug Administration (FDA) recently published a paper showing that expert knowledge plays an important role in increasing the reliability of (Q)SAR predictions3. Software to support various expert review methodologies is needed to facilitate thorough computational assessments.  

Examining how often different substructural features present in the test chemical appear in historical collections of chemicals with experimental results will tell us whether such a feature is potentially problematic (i.e., there are more toxic examples than you would generally expect) or whether there is no concern (i.e., the proportion of toxic chemicals is either similar to or less than the proportion observed in the whole database). The blog “The use of chemical analogs in expert reviews”2 includes some nice examples.
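The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the `assess_feature` name, the `min_matches` and `enrichment` thresholds, and the toy database are all hypothetical and do not represent the Leadscope implementation.

```python
from dataclasses import dataclass


@dataclass
class FeatureAssessment:
    n_matches: int        # chemicals in the database that contain the feature
    n_toxic: int          # how many of those chemicals are toxic
    feature_rate: float   # n_toxic / n_matches (0.0 when there are no matches)
    baseline_rate: float  # proportion of toxic chemicals in the whole database
    concern: bool         # True when the feature looks enriched for toxicity


def assess_feature(records, feature, min_matches=3, enrichment=1.5):
    """Compare the toxic proportion for one feature with the database baseline.

    records: iterable of (set_of_feature_names, is_toxic) pairs.
    min_matches and enrichment are hypothetical thresholds for illustration.
    """
    records = list(records)
    if not records:
        raise ValueError("the database is empty")
    baseline = sum(1 for _, toxic in records if toxic) / len(records)
    outcomes = [toxic for features, toxic in records if feature in features]
    n_matches = len(outcomes)
    n_toxic = sum(1 for toxic in outcomes if toxic)
    rate = n_toxic / n_matches if n_matches else 0.0
    # Flag a concern only with enough examples and a clearly elevated rate.
    concern = n_matches >= min_matches and rate > enrichment * baseline
    return FeatureAssessment(n_matches, n_toxic, rate, baseline, concern)


# Hypothetical toy database, for illustration only.
DATABASE = [
    ({"aromatic nitro", "benzene"}, True),
    ({"aromatic nitro"}, True),
    ({"aromatic nitro", "ester"}, True),
    ({"aromatic nitro", "benzene"}, False),
    ({"ester", "benzene"}, False),
    ({"ester"}, False),
    ({"ester", "alcohol"}, False),
    ({"alcohol"}, True),
    ({"benzene"}, False),
    ({"alcohol", "benzene"}, False),
]

nitro = assess_feature(DATABASE, "aromatic nitro")
ester = assess_feature(DATABASE, "ester")
```

In practice a simple ratio like this would probably be complemented by a statistical test and, above all, by an expert's inspection of the matching examples.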

The Leadscope products make use of a dictionary of over 27,000 substructural fragments, often described as the ‘Leadscope feature hierarchy’4. Matching the predefined structural features that are present in your test chemical against a database is a great starting point. However, there may still be questions about specific chemical fragments that are not included in this list.

Over the last year, we have been working hard to implement a brand-new chemical structure drawing package which is being integrated into the Leadscope tools. Although one of the more common applications of this tool will be to draw a chemical structure on which to apply computational models, the tool is also ideal for the assessment of ad hoc potentially reactive features.

Please get in touch with me (Glenn Myatt) if you would like to discuss this approach in more detail.


  1. How an expert-review could be used to resolve out-of-domains
  2. The use of chemical analogs in expert reviews
  3. Jayasekara, S et al., Assessing the impact of expert knowledge on ICH M7 (Q)SAR predictions. Is expert review still needed?, Regulatory Toxicology and Pharmacology, 125,  2021, 105006
  4. Roberts G, et al., Leadscope: Software for Exploring Large Sets of Screening Data. J. Chem. Inf. Comput. Sci., 2000, 40, 1302-1314.

Expanding the use of in silico toxicology

The application of in silico toxicology is constantly increasing as we better understand how such methods can support different applications (such as the assessment of genotoxic impurities, extractables and leachables, chemicals requiring classification and labelling, and so on). Position papers are critical to support this expansion. In recent blog posts we have reported progress on publications that outline protocols for using such methods1,2,3, on the expansion of our knowledge around structure-activity relationships4, and on publications assessing whether in silico methods are fit-for-purpose5,6.

In the development of such publications, we have learned that it is essential to thoroughly understand the context in which such methods are being applied. In addition, some common themes include the importance of high-quality toxicology databases, the use of multiple in silico methodologies, and access to transparent information to perform an expert review. Based on these best practices, we are currently working hard on the development of new and updated models to support these and newer applications.

Since the prediction of mutagenicity, sensitization, and irritation/corrosion may be used to support extractables and leachables assessments as well as classification and labelling, we are finalizing new and updated statistical-based and expert rule-based transparent models to cover these endpoints. Prediction of endocrine activity is another area of focus for us, as this supports many important applications.

We are also working on models to support some new applications for in silico approaches. These include new bioactivation alerts to support the FDA guidance for industry on in vitro drug interaction studies7 and a new database and expert alerts to support the assessment of abuse liability.

Please contact me (Glenn Myatt) if you would like to discuss any of these models or collaborative initiatives.


  1. Predicting organ toxicity
  2. Endocrine activity in silico protocol
  3. In silico toxicology consortia: impact and future direction
  4. Instem’s Computational Toxicology and Genetic Toxicology Groups at GTA 2021
  5. New acute toxicity (Q)SAR manuscript
  6. Are (Q)SAR models fit-for-purpose for classification and labelling?
  7. Cross-industry development of structural alerts to support the FDA Guidance on in vitro drug interaction studies

In silico toxicology vis-à-vis new therapeutics

In silico toxicology has an established place in the assessment of therapeutics and is used routinely to assess toxicity endpoints, with applications in research, discovery, and regulatory submissions. During the pandemic we have gotten a glimpse of innovative therapeutics, and the public now has an appreciation for how important these are in advancing patient care. Looking back at the past year, we have seen in silico toxicology supporting development in traditional ways, and we are now exploring non-traditional applications of in silico methods. New therapeutics require innovation in drug design and may also require new delivery systems, containers, and packaging materials. While systems are in place to assess the safety of new therapeutics, the question of how we can leverage the low cost, predictivity, and rapid nature of computational approaches to flag potential concerns at an early stage is a valuable one. Given the novelty involved, existing methods may be utilized in new application areas, and new approaches may also be considered. Collaboration and creativity around the use of in silico models will be required to address this question.

During this week’s Extractables and Leachables conference, I presented some additional content on this topic and provided an update on a working group which we are currently hosting. If you are wondering how early-stage predictions can support your process (even in non-traditional ways), please feel free to get in touch with me.

Predicting organ toxicity

The ability to predict organ toxicity directly from a chemical structure would support many applications throughout the product life cycle, from screening candidates to formulating testing strategies and assessing non-genotoxic impurities.

A recent cross-industry working group, as part of the in silico toxicology protocol project1, was initiated to understand the needs and challenges for in silico prediction of organ toxicity, focusing on the liver, heart, lung, and kidney, with a parallel effort focusing on neurotoxicity. This includes an assessment of current experimental approaches (such as off-target panels from secondary pharmacology batteries), the state of the art in in silico modelling (including sources of training data), and how to potentially combine experimental and in silico results as part of a defensible hazard assessment framework.

Although significant progress has been made in understanding the mechanistic basis for many of these adverse effects, there are still gaps that may hamper the development of robust in silico models. In addition, many in silico approaches predict the presence or absence of specific adverse effects, whereas an indication of the safe dose would be more valuable in many applications. Additional effort is also needed to better define how to follow up on any in silico signal for the different specific contexts of use (e.g., regulatory frameworks).

This information has been summarized in three cross-industry publications that we are getting ready to submit. We hope these papers will help drive the development of the next generation of in silico models to accelerate the development of new chemical products as well as to support the 3Rs.

If you would like more information on these initiatives, please contact me.


  1. Myatt, G.J., Ahlberg, E., Akahori, Y., Allen, D., Amberg, A., Anger, L.T., Aptula, A., Auerbach, S., Beilke, L., Bellion, P., Benigni, R., Bercu, J., Booth, E.D., Bower, D., Brigo, A., Burden, N., Cammerer, Z., Cronin, M.T.D., Cross, K.P., Custer, L., Dettwiler, M., Dobo, K., Ford, K.A., Fortin, M.C., Gad-McDonald, S.E., Gellatly, N., Gervais, V., Glover, K.P., Glowienke, S., Van Gompel, J., Gutsell, S., Hardy, B., Harvey, J.S., Hillegass, J., Honma, M., Hsieh, J.-H., Hsu, C.-W., Hughes, K., Johnson, C., Jolly, R., Jones, D., Kemper, R., Kenyon, M.O., Kim, M.T., Kruhlak, N.L., Kulkarni, S.A., Kümmerer, K., Leavitt, P., Majer, B., Masten, S., Miller, S., Moser, J., Mumtaz, M., Muster, W., Neilson, L., Oprea, T.I., Patlewicz, G., Paulino, A., Lo Piparo, E., Powley, M., Quigley, D.P., Reddy, M.V., Richarz, A.-N., Ruiz, P., Schilter, B., Serafimova, R., Simpson, W., Stavitskaya, L., Stidl, R., Suarez-Rodriguez, D., Szabo, D.T., Teasdale, A., Trejo-Martin, A., Valentin, J.-P., Vuorinen, A., Wall, B.A., Watts, P., White, A.T., Wichard, J., Witt, K.L., Woolley, A., Woolley, D., Zwickl, C., Hasselgren, C., 2018. In silico toxicology protocols. Regulatory Toxicology and Pharmacology 96, 1–17.

Instem at QSAR 2021

QSAR 2021 is around the corner and, after last year’s cancellation, we are excited to be in attendance. We are pleased to be contributors to this year’s program through participation in two platform sessions in addition to poster presentations.

If you are a frequent reader of this blog, you have read about our work on the development of in silico toxicology protocols, the nitrosamine SAR working group, and the fit-for-purpose evaluation of acute toxicity models for GHS classification. If you haven’t, I would encourage you to have a look through our past entries on these topics.

During this conference, we are taking a wide-ranging look at the successes and challenges of developing protocols for various toxicological endpoints such as skin sensitization, genetic toxicity, acute toxicity, carcinogenicity, and organ toxicity (Dr. Arianna Bassan’s poster takes a closer look at this). We discuss how knowledge of adverse outcome pathways, integrated approaches to testing and assessment (IATA), and defined approaches is useful in defining the rules and principles used to combine mechanistic information and toxicological effects. The extent to which this information is defined differs between endpoints. However, we can identify areas where in silico prediction of a specific effect or mechanism (acute oral toxicity, for example) is supported by robust data sets and fit-for-purpose evaluations; Dr. Glenn Myatt’s poster will provide details on this.

Dr. Kevin Cross’ talk on “Predicting N-Nitrosamine Activity from Structure-Activity Relationships” will address the question “Can we do better at predicting N-nitrosamine carcinogenic potency?” He will present an overview of the features that have been observed to reduce or eliminate the potency of dialkyl-N-nitrosamines and of how these features could be used to assist in predicting N-nitrosamine potency.

We hope to chat with you there; alternatively, please contact me for further information.


Cross-industry development of structural alerts to support the FDA Guidance on in vitro drug interaction studies

A recent FDA guidance for industry, titled “In Vitro Drug Interaction Studies – Cytochrome P450 Enzyme- and Transporter-Mediated Drug Interactions”1, includes the following statement:

 “A lower cut-off value for the metabolite-to-parent AUC ratio may also be considered for metabolites with structural alerts for potential mechanism-based inhibition (Orr, 2012; Yu, 2013; Yu, 2015)”

Today, a structure-activity relationship (SAR) assessment of mechanism-based inhibition (MBI) of cytochrome P450 (CYP450) enzymes without computational support would involve reading a series of publications and visually comparing the alerts mentioned in the papers against your chemicals of interest. The biological and chemical context of the alerts would need to be carefully studied to ensure the validity of such matches. Since this would likely take a considerable amount of time to complete for a single chemical, a more efficient approach would be to use a computational system. However, there are many challenges in developing such a solution.

Firstly, there are numerous publications – at least 14 – that discuss such alerts. These publications describe general bioactivation alerts leading to the formation of reactive metabolites, some of which would cause MBI of CYP450 enzymes. Although the identification of these different types of alerts is helpful for a variety of applications, the FDA guidance specifically singles out “structural alerts for potential mechanism-based inhibition”, and so these two classes of alerts need to be clearly differentiated.

Secondly, the different publications describe similar alerts but may use subtly different structural definitions. It is, therefore, challenging to harmonize these alerts across the different sources.

Thirdly, the context of such alerts, including specifics of the structure-activity relationships, is important when performing an expert review to conclude the alert is relevant for the chemicals of interest. This information should ideally be summarized for rapid review, including the reaction schemes underlying bioactivation.

Finally, the process of qualifying and refining these alerts using data is critical. Most of this inhibition data is generated by different companies and cannot easily be shared because of confidentiality concerns. However, it is still possible to use this data, without sharing proprietary information on individual chemicals or study results, using a similar approach to SAR fingerprinting2.
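As a rough illustration of how data could contribute without any exchange of structures, each organization could summarize its own results as per-alert counts and share only those counts. The sketch below is a simplification under stated assumptions; the function names, record layout, and alert names are hypothetical, and the published SAR fingerprinting method2 is more involved than this.

```python
from collections import Counter


def local_fingerprint(records, alerts):
    """Summarize one organization's data as per-alert counts.

    records: iterable of (set_of_matched_alert_names, is_active) pairs.
    Only aggregate counts are returned; no structure or individual study
    result appears in the output.
    """
    fingerprint = {alert: Counter(active=0, inactive=0) for alert in alerts}
    for matched, is_active in records:
        for alert in matched:
            if alert in fingerprint:
                key = "active" if is_active else "inactive"
                fingerprint[alert][key] += 1
    return fingerprint


def pool(fingerprints):
    """Add up the per-alert counts shared by several organizations."""
    pooled = {}
    for fingerprint in fingerprints:
        for alert, counts in fingerprint.items():
            pooled.setdefault(alert, Counter()).update(counts)
    return pooled


def active_fraction(counts):
    """Fraction of active examples for an alert, or None with no data."""
    total = counts["active"] + counts["inactive"]
    return counts["active"] / total if total else None


# Hypothetical alert names and data, for illustration only.
ALERTS = ["alert A", "alert B"]
company_1 = [({"alert A"}, True), ({"alert A"}, True), ({"alert B"}, False)]
company_2 = [
    ({"alert A"}, False),
    ({"alert B"}, False),
    ({"alert A", "alert B"}, True),
]

pooled = pool([
    local_fingerprint(company_1, ALERTS),
    local_fingerprint(company_2, ALERTS),
])
```

Each company runs `local_fingerprint` behind its own firewall; only the resulting counts are pooled, and these can then be used to qualify or refine an alert.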

If you are interested in working with us on this collaborative project, please get in touch (Glenn Myatt).


  2. Ahlberg, E., Amberg, A., Beilke, L.D., Bower, D., Cross, K.P., Custer, L., Ford, K.A., Gompel, J.V., Harvey, J., Honma, M., Jolly, R., Joossens, E., Kemper, R.A., Kenyon, M., Kruhlak, N., Kuhnke, L., Leavitt, P., Naven, R., Neilan, C., Quigley, D.P., Shuey, D., Spirkl, H.-P., Stavitskaya, L., Teasdale, A., White, A., Wichard, J., Zwickl, C., Myatt, G.J., 2016. Extending (Q)SARs to incorporate proprietary knowledge for regulatory purposes: A case study using aromatic amine mutagenicity. Regulatory Toxicology and Pharmacology 77, 1–12. doi:10.1016/j.yrtph.2016.02.003 

New book on Mutagenic Impurities

We have been really happy to contribute to a number of chapters in an important new book edited by Dr. Andrew Teasdale: “Mutagenic Impurities: Strategies for Identification and Control”1

This will be an essential read for students and professionals in the field of genotoxic impurity (GTI) assessment, covering topics including a history of the regulatory guidelines, a detailed examination of the use of in silico models, an examination of the mutagenic and carcinogenic potential of widely used reagents, issues related to N-nitrosamines, and other topics related to the toxicological and analytical assessment of GTIs.

Congratulations to Andrew for pulling such an important resource together!


  1. Mutagenic Impurities: Strategies for Identification and Control | Wiley

Instem’s Computational Toxicology and Genetic Toxicology Groups at GTA 2021

We are pleased to be attending this year’s virtual GTA meeting1 and will be presenting on several topics throughout the course of the event.

On Thursday May 6th, Dr. Kevin Cross will be presenting “Predicting N-Nitrosamine Activity from Structure-Activity Relationships” as part of a Symposium on “The 3Rs and in Silico Modeling”. This presentation addresses whether N-Nitrosamine carcinogenicity can be better predicted for regulatory purposes using mechanism-specific structure-activity relationships to identify potency categories and establish more precise methods for selection of analogs.

We are also presenting a poster entitled “Using 50,000 bacterial mutagenicity study results to improve expert alerts”. This poster describes a technique for improving the performance of an expert alert system, which is referred to as SAR-fingerprinting. This method enables the identification of structure-activity relationship knowledge from both public and, importantly, proprietary databases without the necessity to exchange confidential information on individual chemical structures and their individual results. The presented work will discuss SAR-fingerprinting in relation to bacterial mutation and describe how the data from multiple public and proprietary databases is analyzed and used to improve the latest version of the Leadscope expert alerts v8.

Throughout the event, scientists from both the Computational Toxicology and the Genetic Toxicology groups at Instem will be available at the Instem booth. Our genetic toxicology solutions make it easier to run regulatory assays by bringing together experimental set-up, data acquisition, and reporting into one easy-to-use, fully GLP-compliant system. They optimize Ames, Micronucleus, Chromosomal Aberrations, Comet, Neutral Red Uptake, and other assay workflows whilst improving regulatory compliance. Visit the booth to pick up a fact sheet or find out more about how to optimize your regulatory assay workflows.

We look forward to meeting you there.

Please send me (Dr. Glenn Myatt) a note if you would like to discuss any of these topics in more detail.



How an expert-review could be used to resolve out-of-domains

One of the more challenging outcomes from a (Q)SAR model is the out-of-domain (OOD) result. This result is possible because (Q)SAR models are, in many situations (such as under the ICH M7 guideline), required to perform an applicability domain analysis to satisfy OECD validation principles1. Although a (Q)SAR model may still generate a prediction, the OOD result generally means that the test chemical is outside the chemical space within which the model makes predictions with a given reliability. An expert-review is helpful to understand and reassess the reliability of an OOD result.

The procedures discussed in the blog entry titled “The use of chemical analogs in expert reviews” could be utilized in such an expert-review; however, understanding why the prediction is OOD in the first place could cue the assessor in to what the focus of the review should be. This is especially important because different model types vary in how they define their applicability domain. A prediction is considered within the applicability domain of Leadscope’s statistical models when at least one structural feature is included in the model’s prediction and there is a sufficiently similar analog in the training set. Meeting these criteria indicates that the model ‘knows’ something about your test chemical and has a basis for making a prediction. If one of these criteria is not met (for example, structural features are used in the prediction but there is no analog in the training set with sufficient similarity to the test chemical), the result is considered OOD. In some of these cases, the lack of a close neighbor is due to the inclusion of a sub-structure in the test chemical that is not familiar to the model. In addition to assessing potentially reactive features and the relevancy of the model features, the prediction for the core sub-structure that is within the applicability domain of the model could serve as part of an expert review. Assessing whether this sub-structure is potentially reactive is a good starting point, i.e., are there sufficient negative examples in the database to not consider the sub-structure a concern? If so, this review would support a reassessment of the prediction’s reliability.
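The two criteria described above, and the reason a prediction falls outside the domain, can be sketched as follows. This is a minimal illustration, not Leadscope’s algorithm: representing chemicals as sets of feature names, using Tanimoto similarity on those sets, and the `min_similarity` threshold of 0.3 are all assumptions made for the example.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of feature names."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def applicability_domain(test_features, model_features, training_set,
                         min_similarity=0.3):
    """Check both domain criteria and report why a result is out of domain.

    test_features: features present in the test chemical.
    model_features: features the statistical model uses in predictions.
    training_set: one feature set per training chemical.
    min_similarity: hypothetical threshold for a 'sufficiently similar' analog.
    """
    matched = set(test_features) & set(model_features)
    best = max((tanimoto(test_features, t) for t in training_set), default=0.0)
    has_feature = bool(matched)
    has_analog = best >= min_similarity
    if has_feature and has_analog:
        reason = "in domain"
    elif has_feature:
        reason = "no sufficiently similar analog"
    elif has_analog:
        reason = "no model features"
    else:
        reason = "no model features and no sufficiently similar analog"
    return {
        "in_domain": has_feature and has_analog,
        "reason": reason,
        "best_similarity": best,
        "matched_features": sorted(matched),
    }


# Hypothetical feature names, for illustration only.
MODEL_FEATURES = {"nitro", "amine"}
TRAINING_SET = [{"nitro", "benzene"}, {"amine"}]

inside = applicability_domain({"nitro", "benzene", "novel ring"},
                              MODEL_FEATURES, TRAINING_SET)
no_analog = applicability_domain(
    {"nitro", "novel ring"}, MODEL_FEATURES,
    [{"nitro", "amine", "benzene", "ester", "ether", "ketone", "thiol"}])
no_features = applicability_domain({"benzene"}, {"nitro"}, [{"benzene"}])
```

Returning the reason alongside the in/out flag mirrors the point made in the text: knowing which criterion failed tells the assessor where to focus the review.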

Readily available parameters, such as the prediction probability, are also helpful in such cases. A previous analysis by Amberg et al., 2019 showed that the risk of missing a mutagenic impurity, given an OOD statistical result with a probability <0.2 and a negative expert rule-based result, is approximately the same as when both methodologies predict negative2.
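A simple triage helper shows how the prediction probability might be used to prioritize OOD results for review. Only the 0.2 cut-off comes from the text above; the function name, the labels it returns, and the idea of encoding this as a rule are hypothetical, and the output is a prompt for expert review, not a conclusion.

```python
def ood_review_hint(in_domain, probability, expert_rule_call,
                    low_probability=0.2):
    """Suggest how much scrutiny a statistical prediction may need.

    in_domain: whether the statistical prediction is within the model domain.
    probability: the statistical model's positive-prediction probability.
    expert_rule_call: "negative", "positive", or another label from the
        expert rule-based system.
    """
    if in_domain:
        return "in_domain"
    if expert_rule_call == "negative" and probability < low_probability:
        # The case discussed above: OOD, low probability, negative expert call.
        return "ood_low_concern"
    return "ood_full_review"


hints = [
    ood_review_hint(True, 0.9, "positive"),
    ood_review_hint(False, 0.1, "negative"),
    ood_review_hint(False, 0.35, "negative"),
    ood_review_hint(False, 0.1, "positive"),
]
```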

In another instance, there may be sufficiently similar analogs in the training set, but the statistical model’s prediction is out of domain due to an absence of model features. Here the advantage of using complementary statistical and expert rule-based approaches becomes apparent: the expert rule-based prediction will likely be within its applicability domain because that domain is defined not by model features but by the sufficiency of analogs. The content of the blog entry on the use of chemical analogs in expert reviews is helpful in these situations. An expert-review is used to resolve OOD assessments, but there remains a computational aspect to the analysis, as the model/software provides information that facilitates the review.

Please contact me if you would like to discuss this further.

  1. OECD (2014), Guidance Document on the Validation of (Quantitative) Structure-Activity Relationship [(Q)SAR] Models, OECD Series on Testing and Assessment, No. 69, OECD Publishing, Paris.
  2. Amberg, Alexander et al. 2019. “Principles and Procedures for Handling Out-of-Domain and Indeterminate Results as Part of ICH M7 Recommended (Q)SAR Analyses.” Regulatory Toxicology and Pharmacology.

The use of chemical analogs in expert reviews

Computational tools offer a rapid, cost-saving advantage to toxicologists assessing the hazard of chemicals. The predictivity of a model for a group of structures is one aspect to be considered in a computational assessment. However, given the universe of chemicals, there are structural classes which a model will predict with a higher level of reliability than others. There may also be chemicals which are not confidently reflected in the model’s chemical space and these chemicals are typically not in the model domain (more on this topic in a later blog). It is important to assess the reliability of the model’s prediction as part of a computational assessment. The expert review is the method that allows the toxicologist to gain confidence in an assessment. I like to think of this process as being analogous to the review of experimental data through controls, statistical analyses, and the determination of false negative or positive results.

The degree to which a user can interact with the computational platform and understand how an assessment was made determines the level of review that can practically be performed. Access to the descriptors used in the prediction, and the ability to support their use through analysis of the underlying data, is important. Parameters such as the diversity of the training set examples, the extent to which any structural descriptors can be linked to a mechanism, and whether there are other structural moieties that explain the activity of the training set examples are used to evaluate the assessment.

In addition to the above, here are a couple of items that I like to evaluate as part of a computational assessment.

  • Are there any potentially reactive features that are not considered by the model? This provides added information to support negative predictions in cases where the structural features of a statistical model do not consider the entire structure. In Figure 1, the entire structure is considered as a feature. This feature maps to two examples which are assessed as negative.
Figure 1. An evaluation of LS-167087 for potentially reactive features

Figure 2 shows features not considered in the analysis of LS-181651. An analysis of potentially reactive features considers the 1,3,5-triazine,2-phenyl- feature. This feature mapped to 4 examples, which are all negative for sensitization hazard. One of these examples (LS-181621) is a close chemical analog. Such reviews support a negative prediction.

Figure 2. An evaluation of LS-181651 for potentially reactive features
  • Is an alerting fragment represented in a known negative example structure and how does the chemical environment of the alerting fragment compare to the target structure? Figure 3 shows an aromatic nitro indeterminate alert which matched the target structure. A search for analogs showed that the indeterminate alert is also present in LS-188180, a known negative.  The system performs a comparison of the target and analog structures and indicates that the alerting sub-structure is within the same chemical environment in both structures. Such a review supports a negative assessment for the target structure, which is also assessed as negative by a statistical model.
Figure 3. Assessment of an analog to support an expert-rule based prediction

Expert reviews give added reliability to an assessment. Transparent models and platforms facilitate such reviews and mitigate any black-box concerns around in silico tool use.

Please send me a note if you would like to discuss in more detail.