In silico toxicology consortia: impact and future direction

In a previous blog entry, Dr. Glenn Myatt discussed the impetus for developing a consortium to define best in silico practices around toxicological endpoints such as genetic toxicity, skin sensitization, carcinogenicity, neurotoxicity, and acute toxicity, to name a few. The aim of these protocols is to reduce the burden on industry and regulators to justify the use of in silico methods, as well as to ensure in silico assessments are performed in a consistent and reproducible manner that supports good in silico practices. The consortium’s activities support a number of existing or emerging regulatory guidelines, such as ICH M7: DNA reactive (mutagenic) impurities in pharmaceuticals.

Group activities result in protocols (which define implementable rules and principles), position papers (which describe the current state of the science and the extent to which in silico tool use is feasible), case studies, structure-activity relationships, or fit-for-purpose evaluations. To date, the reliability scoring paradigm1, which serves as a useful extension of the Klimisch scoring of experimental data, has been cited by the World Health Organization in EHC 240: Principles and Methods for the Risk Assessment of Chemicals in Food, subchapter 4.5 Genotoxicity2. Further, elements of the ‘In silico toxicology protocols’1 and ‘Principles and procedures for handling out-of-domain and indeterminate results as part of ICH M7 recommended (Q)SAR analyses’3 have been cited by the European Medicines Agency (EMA)’s reflection paper on the qualification of non-genotoxic impurities4 and the ICH guideline M7 questions and answers (Step 2b) on the assessment and control of DNA reactive (mutagenic) impurities in pharmaceuticals to limit potential carcinogenic risk5.

Several working groups are in progress, and new working group activities are being formed, including an expansion of the carcinogenicity position paper6 to develop case studies that support new approaches and protocols using information from target safety assessments7, in silico approaches, and in vitro/in vivo data. Additional working groups, including the assessment of biomolecule reactivity and drug-drug interactions, are on the horizon. If you are interested in any of these topics, or have a comment, question, or problem that you are facing, please contact gmyatt@leadscope.com or cjohnson@leadscope.com.

References

  1. Myatt, G.J., Ahlberg, E., Akahori, Y., et al. (2018), In Silico Toxicology Protocols. Regul. Toxicol. Pharmacol. 98, 1-17. doi:10.1016/j.yrtph.2018.04.014. Open  access: https://doi.org/10.1016/j.yrtph.2018.04.014
  2. World Health Organization & Food and Agriculture Organization of the United Nations (2020), Principles and methods for the risk assessment of chemicals in food. Subchapter 4.5 Genotoxicity. Environmental health criteria 240
  3. Amberg, A., Andaya, R.V., Anger, L.T., et al. (2019) Principles and procedures for handling out-of-domain and indeterminate results as part of ICH M7 recommended (Q)SAR analyses. Regul. Toxicol. Pharmacol. 102, 53–64. 10.1016/j.yrtph.2018.12.007
  4. European Medicines Agency (2018) Reflection paper on the qualification of non-genotoxic impurities
  5. European Medicines Agency (2020) ICH guideline M7 on assessment and control of DNA reactive (mutagenic) impurities in pharmaceuticals to limit potential carcinogenic risk – questions & answers Step 2b
  6. Tice et al. In Silico Approaches in Carcinogenicity Hazard Assessment: Current Status and Future Needs. Submitted to Regulatory Toxicology and Pharmacology
  7. https://www.instem.com/solutions/knowledgescan/index.php

What customer support question do we hear the most?

One of the most common questions we are asked is how an overall toxicity assessment for a given chemical is derived, especially when conflicting study results are available.

To answer this question, it is often helpful to review the process of producing the content in Leadscope’s databases, which is overseen by Leadscope’s Manager of Database Content, Dave Bower.

The starting point for this production process is the original data sources. For the genetic toxicity database, these sources include CCRIS, CDER, CFSAN, CPDB, DSSTox, EPA-Genetox, NTP, publications, donated company information, and many more.

Converting this information into an integrated database initially involves two parallel processes: (1) the chemical structure processing workflow and (2) the content (toxicity study) building process.

To ensure all studies for the same chemical are linked together, each chemical (test article) is compared against our existing database. It is either registered as a new chemical (and given a new Leadscope ID) or linked to a previously registered chemical. This process can be difficult when only a chemical name has been reported, particularly when a chemical has historically been referred to by different names. When the chemical structure is displayed within the source material, issues related to the depiction of its stereochemistry, aromaticity, and tautomerism may need to be taken into consideration. Mixtures and salt forms are often linked to the SAR form of the chemical to ensure the studies are readily accessible and to support computational modelling efforts.
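
As a simplified illustration of this registration step, the sketch below uses the open-source RDKit toolkit to strip salt components and key each test article by its InChIKey; the in-memory registry and the "LS-" identifier format are hypothetical stand-ins, not Leadscope's actual registration system.

from rdkit import Chem
from rdkit.Chem.SaltRemover import SaltRemover

registry = {}  # InChIKey -> identifier; hypothetical in-memory stand-in for the database

def register_test_article(smiles: str) -> str:
    """Register a test article or link it to a previously registered chemical.

    Illustrative only: real registration must also resolve name-only records,
    stereochemistry, tautomers, aromaticity conventions, and mixture components.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Unparseable structure: {smiles}")
    parent = SaltRemover().StripMol(mol)       # link salt forms to the SAR form
    key = Chem.MolToInchiKey(parent)           # structure-based identity key
    if key not in registry:
        registry[key] = f"LS-{len(registry) + 1:06d}"   # hypothetical ID format
    return registry[key]

print(register_test_article("Nc1ccccc1"))      # aniline -> new ID
print(register_test_article("Nc1ccccc1.Cl"))   # aniline hydrochloride -> same ID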

The content building process is also challenging since the underlying information may or may not be in an electronic form suitable for automatic processing. In certain situations, it is necessary to enter the information by hand; in others, it is possible to develop computational tools to read the content directly into the electronic database. An essential step here is to map the data elements described in the source material onto standardized terms. For example, the species and strain may be reported in one study as “S. typhimurium 100” and in another as “Sal. TA100”, yet both need to be mapped onto standardized species and strain terms (“Salmonella typhimurium” and “TA100”). In generating the content, multiple QA steps are included to ensure the integrity of the information.
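
To make this mapping step concrete, here is a minimal sketch of how free-text species/strain entries could be normalized onto standardized terms; the synonym table, regular expression, and function are illustrative assumptions rather than Leadscope's actual controlled vocabulary or code.

import re

# Illustrative synonym table; a real controlled vocabulary is far larger.
SPECIES_SYNONYMS = {
    "s. typhimurium": "Salmonella typhimurium",
    "sal": "Salmonella typhimurium",
    "salmonella typhimurium": "Salmonella typhimurium",
    "e. coli": "Escherichia coli",
    "escherichia coli": "Escherichia coli",
}

def standardize_species_strain(reported: str) -> tuple[str, str]:
    """Map a free-text species/strain entry onto standardized terms.

    Example: 'S. typhimurium 100' and 'Sal. TA100' both map to
    ('Salmonella typhimurium', 'TA100').
    """
    text = reported.strip()
    # Pull out a strain token such as 'TA100', 'WP2 uvrA', or a bare number like '100'.
    match = re.search(r"(TA\s?\d+|WP2\S*|\b\d{2,4}\b)", text, flags=re.IGNORECASE)
    strain = match.group(1).upper().replace(" ", "") if match else ""
    if strain.isdigit():
        strain = "TA" + strain                     # bare '100' interpreted as 'TA100'
    # Normalize whatever precedes the strain token as the species name.
    species_text = (text[:match.start()] if match else text).strip(" ,.").lower()
    species = SPECIES_SYNONYMS.get(species_text, species_text)
    return species, strain

print(standardize_species_strain("S. typhimurium 100"))   # ('Salmonella typhimurium', 'TA100')
print(standardize_species_strain("Sal. TA100"))           # ('Salmonella typhimurium', 'TA100')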

Once the chemical structure processing is complete and harmonized study records are linked to these chemicals, the chemicals can be graded. For example, an overall call for bacterial mutagenicity can be derived from the multiple data sources. This process involves an examination of the overall study calls and the underlying individual test results. Factors taken into consideration include whether the data source is trusted or authoritative and whether the study is compliant with accepted test protocols. Since the overall calls from different studies may conflict, the weight of the evidence needs to be considered when generating an overall grade for an individual chemical. However, the individual studies are always reported alongside any overall calls to support an expert review of the individual calls.
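
A deliberately simplified sketch of such a weight-of-evidence roll-up is shown below; the weights, fields, and tie-breaking rule are hypothetical illustrations, not Leadscope's actual grading algorithm.

from dataclasses import dataclass

@dataclass
class StudyRecord:
    call: str                   # "positive", "negative", or "equivocal"
    authoritative: bool         # e.g. from a trusted regulatory source
    guideline_compliant: bool   # follows an accepted test protocol

def overall_call(studies: list[StudyRecord]) -> str:
    """Derive an illustrative overall bacterial mutagenicity call.

    Hypothetical weighting: guideline-compliant studies from authoritative
    sources carry more weight, and positives are weighted conservatively so a
    single well-conducted positive is not outvoted by low-quality negatives.
    """
    pos = neg = 0.0
    for s in studies:
        weight = 1.0
        if s.authoritative:
            weight += 1.0
        if s.guideline_compliant:
            weight += 1.0
        if s.call == "positive":
            pos += weight
        elif s.call == "negative":
            neg += weight
    if pos == 0 and neg == 0:
        return "inconclusive"
    # Conservative bias: positives need only half the weight of negatives to drive the call.
    return "positive" if pos >= neg * 0.5 else "negative"

studies = [
    StudyRecord("negative", authoritative=False, guideline_compliant=False),
    StudyRecord("positive", authoritative=True, guideline_compliant=True),
]
print(overall_call(studies))  # positive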

Dave recently put together a slide deck explaining the process in detail, including a series of case studies illustrating the process. Please get in touch with me (gmyatt@leadscope.com) if you would be interested in learning more about this process or would like a copy of Dave’s slide deck.

Are (Q)SAR models fit-for-purpose for classification and labelling?

We were recently involved in a cross-industry project to determine whether (Q)SAR models are fit-for-purpose for classification and labelling. To test this, a series of companies across different industrial sectors each compiled a data set of chemicals with experimental acute oral rat toxicity data. These chemicals were run against the first version of the Leadscope acute rule-based and statistical-based models. The experimental results, along with the predictions generated by the (Q)SAR models, were then shared (no information on the individual chemical structures was shared), and the performance of the models on this blind data set was quantified.

We calculated a number of statistics to determine whether these models were fit-for-purpose, including an assessment of whether the (Q)SAR models predicted either the correct category or a more conservative (i.e., more potent) category.

The absolute percentage of correct or more conservative predictions was approximately 95%.
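
For readers interested in how that figure is computed, the sketch below counts a prediction as acceptable when it is correct or falls into a more conservative (more potent) category; the integer encoding (lower number = more potent, as in the GHS scheme) and the data are invented for illustration.

# Hedged sketch: GHS-style categories encoded as integers, where a lower
# number means a more potent (more conservative) classification.
experimental = [3, 4, 5, 4, 3, 5, 4, 2]   # invented experimental categories
predicted    = [3, 3, 5, 4, 4, 4, 4, 2]   # invented (Q)SAR predictions

acceptable = sum(
    1 for exp, pred in zip(experimental, predicted)
    if pred <= exp               # correct or more conservative
)
print(f"Correct or more conservative: {100 * acceptable / len(experimental):.1f}%")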

These results are part of a manuscript that was just accepted for publication. The paper also covers the performance of the different (Q)SAR methodologies, the performance across different industrial sectors, as well as the impact of an expert review on the results.

Please get in touch with me (gmyatt@leadscope.com) if you would like a copy of a recent poster from the ACT meeting on this topic or would like to talk in more detail.

What does the harmonization of in silico safety assessments mean for extractables and leachables?

The ICH M7 guideline1 presents the types of methodologies that are relevant to assessing mutagenic impurities. The guideline goes further than simply mentioning that (Q)SAR methods and analogs should be considered prior to conducting an experimental study: it details two complementary methodologies (statistical-based and expert rule-based) and how the results should be combined to derive an assessment of mutagenicity. These principles could be used in the safety assessment of extractables and leachables to assess mutagenicity, if warranted. However, there are additional areas of toxicology relevant to the safety assessment of extractables and leachables that could benefit from such procedural descriptions, such as skin sensitization and dermal and ocular irritation. Databases and tools are in place for utilizing existing information to make reliable predictions of a leachable’s toxicity.

The procedures described in the ICH M7 guideline1 for mutagenicity allow the assessor to obtain accurate and consistent predictions while reaping practical benefits. A harmonized guideline detailing the strategic use of in silico methods for safety assessments would reduce the burden on assessors and regulators to perform or review an in silico toxicological assessment. Based on our experience conducting assessments, here are some best-practice considerations for in silico assessments.

  • An important step is assessing the adequacy and reliability of data in the public domain, including the resolution of conflicting results. This may circumvent the need for an in silico assessment.
  • Two complementary methodologies (statistical-based and expert rule-based) offer an advantage. Statistical-based methodologies assess all areas of the molecule and the structural basis for the prediction can be explained, while expert alerts identify reactive groups that are linked to a mechanism. However, it can be difficult to support negative predictions using alerts alone. Rules should be defined to derive a consensus from the two methodologies that generates conservative predictions (see the sketch after this list).
  • Criteria, illustrated with case studies, should be defined for an expert review, such as how to assess a chemical that shares an alert with a known negative.
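
A minimal sketch of such a conservative consensus rule is shown below; the specific combination rules are our illustrative assumptions rather than the wording of ICH M7 or of any particular software implementation.

def consensus_call(statistical: str, expert: str) -> str:
    """Conservative consensus of two (Q)SAR methodologies (illustrative rules).

    Each input is one of "positive", "negative", "out-of-domain", "indeterminate".
    Any positive dominates; two concordant negatives give a negative; anything
    else is flagged for expert review.
    """
    calls = {statistical, expert}
    if "positive" in calls:
        return "positive"
    if calls == {"negative"}:
        return "negative"
    return "review required"

print(consensus_call("negative", "negative"))        # negative
print(consensus_call("out-of-domain", "negative"))   # review required
print(consensus_call("negative", "positive"))        # positive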

There is much anticipation for the ICH Q3E guideline2 on the assessment of extractables and leachables.

Please get in touch if you would like to have a copy of a recent poster that was presented at ACT on this topic (cjohnson@leadscope.com).

References

1. ICH M7 (R1) (2017) Assessment and control of DNA reactive (mutagenic) impurities in pharmaceuticals to limit potential carcinogenic risk.

2. ICH (2020) Final Concept Paper ICH Q3E: Guideline for Extractables and Leachables (E&L). https://database.ich.org/sites/default/files/ICH_Q3E_ConceptPaper_2020_0710.pdf

Understanding false negatives

For toxicological tests, a false negative (i.e., predicting a chemical is negative when it is in fact positive) is one type of error. It is often desirable to minimize the number of false negatives to decrease the risk of missing a toxic chemical.

Computational toxicology provides a rapid, high-throughput test of multiple chemicals and, like all tests, it can generate false negative results. To illustrate, the ICH M7 pharmaceutical impurities guideline1 recommends the use of two complementary (Q)SAR methodologies to predict the results of the bacterial mutagenicity test. By selecting the most conservative outcome, this approach reduces the number of false negatives compared with using a single methodology. The guideline also highlights the use of an expert review, especially when the results from the two methodologies are conflicting or inconclusive. The subsequent expert review further minimizes the risk of missing a mutagenic impurity.

For the different (Q)SAR outcomes (including positive, negative, out-of-domain, and indeterminate) generated by each computational method, is it possible to understand the risk of missing a mutagenic impurity?

The quantification of this risk would be helpful in addressing the scope of any subsequent expert review.

To understand this risk, computational models typically used in assessing pharmaceutical impurities were run over a series of proprietary collections where the results of the bacterial mutagenicity test were known. For each collection, the experimental results and the (Q)SAR predictions (from the two methodologies) were then shared with us. In total, information on approximately 16,000 chemicals was shared. We then grouped the chemicals by the different combinations of (Q)SAR outcomes for the two methodologies (“statistical” and “expert”), such as negative(statistical)&negative(expert) or out-of-domain(statistical)&negative(expert).

For each combination of outcomes, the proportion of positive experimental results was then calculated. For example, 7,978 chemicals were predicted as negative(statistical)&negative(expert) and 8.1% were mutagenic. By computing the percentage of mutagenic chemicals for each combination of outcomes, it is possible to understand the risk of missing a mutagenic impurity. This in turn can be used to determine the scope of any expert review and may also be used as part of the weight-of-the-evidence.
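
The grouping and proportion calculation itself is simple to reproduce; the sketch below uses a handful of invented records in place of the proprietary collections described above.

from collections import defaultdict

# Invented records: (statistical outcome, expert outcome, experimentally mutagenic?)
records = [
    ("negative", "negative", False),
    ("negative", "negative", True),
    ("out-of-domain", "negative", False),
    ("positive", "indeterminate", True),
    ("negative", "negative", False),
]

counts = defaultdict(lambda: [0, 0])   # combination -> [n_chemicals, n_mutagenic]
for statistical, expert, mutagenic in records:
    key = f"{statistical}(statistical)&{expert}(expert)"
    counts[key][0] += 1
    counts[key][1] += int(mutagenic)

for key, (n, n_pos) in counts.items():
    print(f"{key}: n={n}, % mutagenic={100 * n_pos / n:.1f}")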

The full results from this analysis, along with a series of case studies, have been reported in a recent publication2, which was cited in the European Medicines Agency’s 2020 ICH M7 questions and answers (Step 2b)3.

Please contact me (gmyatt@leadscope.com) for more information on how this approach can be used as part of an expert review.

References

1. ICH M7 (R1) (2017) Assessment and control of DNA reactive (mutagenic) impurities in pharmaceuticals to limit potential carcinogenic risk.

2. Amberg, A., Andaya, R.V., Anger, L.T., et al. (2019) Principles and procedures for handling out-of-domain and indeterminate results as part of ICH M7 recommended (Q)SAR analyses. Regul. Toxicol. Pharmacol. 102, 53–64. doi:10.1016/j.yrtph.2018.12.007

3. European Medicines Agency (2020) ICH guideline M7 on assessment and control of DNA reactive (mutagenic) impurities in pharmaceuticals to limit potential carcinogenic risk – questions & answers Step 2b. 2 July 2020. EMA/CHMP/ICH/321999/2020

Why is it important to define confidence?

As Candice described in her recent blog “So many pieces of information!”, making actionable decisions is extremely complicated, especially when integrating the results from in silico models alongside historical in vivo studies and in vitro experiments.

When is there sufficient information to make a decision?

Well, it depends on the type of decision and the associated risk of making an incorrect prediction.

  • Do I have enough information to support filling data gaps or performing prioritization?
  • Do I have sufficient information to include in a regulatory submission?
  • When should I generate additional experimental data?

Ultimately, this can be answered through an evaluation of the overall confidence in the final assessment. Lower confidence assessments may be sufficient for filling data gaps or the prioritization of chemicals, whereas a regulatory submission will require a high degree of confidence in the outcome.

Frameworks that include rules and principles for deriving such assessments of confidence support a transparent and reproducible evaluation and will ultimately lead to increased acceptance of these methods. One such framework is detailed in the in silico toxicology protocol1, which considers the reliability, relevance, and completeness of the information for a set of defined effects/mechanisms.

The framework describes a series of criteria to define which in silico models to include in the first place. Experimental data and/or in silico results may be used to assess these defined effects/mechanisms. A harmonized scoring system (referred to as the Reliability Score) is used to quantify the reliability of the information (based on both experimental data and in silico results) for each defined effect/mechanism.

The framework then outlines the rules and principles for combining this information on the effects and mechanisms based on their defined reliability, their relevance to the final assessment and the completeness of the information.
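
To give a flavour of such a roll-up, here is a deliberately simplified sketch combining reliability, relevance, and completeness into an overall confidence label; the scores, thresholds, and labels are hypothetical, and the published protocol1 defines its own scoring scheme and combination rules.

from dataclasses import dataclass

@dataclass
class EffectAssessment:
    effect: str
    reliability: int    # e.g. 1 (high reliability) to 5 (low reliability)
    relevance: float    # 0-1 weight of this effect for the final assessment

def overall_confidence(assessments: list[EffectAssessment],
                       required_effects: set[str]) -> str:
    """Illustrative roll-up of reliability, relevance, and completeness."""
    covered = {a.effect for a in assessments}
    completeness = len(covered & required_effects) / len(required_effects)
    # Relevance-weighted mean reliability (lower score = more reliable here).
    weighted = sum(a.reliability * a.relevance for a in assessments)
    total_weight = sum(a.relevance for a in assessments) or 1.0
    mean_reliability = weighted / total_weight
    if completeness >= 0.9 and mean_reliability <= 2:
        return "high confidence (may support regulatory use)"
    if completeness >= 0.5 and mean_reliability <= 3:
        return "moderate confidence (data-gap filling, prioritization)"
    return "low confidence (additional data recommended)"

assessments = [
    EffectAssessment("bacterial mutagenicity", reliability=1, relevance=1.0),
    EffectAssessment("clastogenicity", reliability=3, relevance=0.8),
]
print(overall_confidence(assessments, {"bacterial mutagenicity", "clastogenicity"}))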

Development of a consensus for defining these principles is really hard, yet essential to increase the acceptance of these alternative methods across different applications.

The in silico toxicology protocol publication was a good first step; however, we are continually working on these issues through working groups and the publication of case studies.

If you are interested in collaborating on these topics, please reach out to me at gmyatt@leadscope.com or Candice at cjohnson@leadscope.com.

1. Myatt, G.J., Ahlberg, E., Akahori, Y., et al. (2018) In Silico Toxicology Protocols. Regul. Toxicol. Pharmacol. 98, 1-17. doi:10.1016/j.yrtph.2018.04.014

Open access: https://doi.org/10.1016/j.yrtph.2018.04.014

Establishing potency categories for Nitrosamine impurities

Nitrosamine impurities currently belong to a “cohort of concern” because of their potential to be potent mutagenic carcinogens, as described in the ICH M7 guideline1.

They are also coming under increasing regulatory scrutiny, with the US Food and Drug Administration and the European Medicines Agency recently issuing new guidelines2,3 for the examination of this class. These guidelines include computational approaches to establish limits for N-Nitrosamine impurities.

In support of these important guidelines, Kevin Cross from Leadscope (an Instem company), together with others, has established a Nitrosamine SAR working group comprising over 46 members from 20 companies. One important activity is the development of predictive strategies for N-Nitrosamine carcinogenic potency, including documenting a comprehensive understanding of the structure-activity relationships (SAR) for N-Nitrosamines. This assessment is based upon an analysis of carcinogenicity and genotoxicity data, reaction mechanisms, and structural similarity. It is leading to the definition of categorical alerts to predict several carcinogenic potency categories.

SAR analysis is focused on understanding the impact of several reactivity sites near the N-Nitrosamine functional group that involve different reaction mechanisms leading to differences in potency. These reactivity sites include the nitrogen-nitrogen bond, the α-carbon, and the β-carbon. Access to these sites (e.g., due to steric hindrance), along with electronic effects, impacts reactivity and, consequently, potency. By considering both the reaction mechanism and the specific substituents at these positions, N-Nitrosamines may be assigned to very-high, high, medium, or low potency categories4.

Through examination of both structural similarity and the dominant reaction mechanism, categorical alerts can be established to predict N-Nitrosamine potency. This approach will provide a scientifically defensible methodology for establishing potency categories and corresponding exposure limits and will also avoid “activity cliffs”, where the structural similarity concept breaks down.
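
As a purely illustrative example of what a categorical alert scheme could look like in code, the sketch below assigns a potency category from a few structural features echoing the reactivity sites discussed above; the features, cut-offs, and category assignments are hypothetical and are not the working group's published alerts.

def nitrosamine_potency_category(alpha_hydrogens: int,
                                 bulky_alpha_substituent: bool,
                                 electron_withdrawing_beta_group: bool) -> str:
    """Assign an illustrative potency category from structural features.

    The features and cut-offs are hypothetical placeholders; the working
    group's categorical alerts are derived from carcinogenicity and
    genotoxicity data, reaction mechanisms, and structural similarity.
    """
    if alpha_hydrogens == 0:
        # No alpha-hydrogen available for metabolic activation.
        return "low"
    if bulky_alpha_substituent or electron_withdrawing_beta_group:
        # Steric or electronic deactivation of the reactive sites.
        return "medium"
    return "high" if alpha_hydrogens < 4 else "very high"

print(nitrosamine_potency_category(6, False, False))   # very high (hypothetical)
print(nitrosamine_potency_category(2, True, False))    # medium (hypothetical)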

If you would like to learn more, or collaborate on this important project, please reach out to me at gmyatt@leadscope.com or Kevin at kcross@leadscope.com.

References

1. ICH M7 (R1) (2017) Assessment and control of DNA reactive (mutagenic) impurities in pharmaceuticals to limit potential carcinogenic risk.

2. US Food and Drug Administration (2020) Guidance for Industry: Control of Nitrosamine Impurities in Human Drugs. September 2020.

3. European Medicines Agency (2020) Nitrosamine impurities in human medicinal products. EMA/369136/2020, 25 June 2020.

4. Cross, K.P. (2020) Predicting Nitrosamine Activity from Structure-Activity Relationships. Informa Nitrosamines Impurities Forum, 26 August 2020. Slides with audio presentation available from info@leadscope.com.

So many pieces of information!

The movement of toxicology away from an observation-based paradigm and towards a mechanism-based one is ongoing. One pertinent question is how to combine data across mechanistic pathways to derive an overall assessment of hazard, and what level of confidence should be placed in such a result. Further, where data gaps exist, how could in silico tools be used to support an assessment?

The area of skin sensitization is a good example. Hinging on knowledge of the adverse outcome pathway (AOP) for skin sensitization, key events across the AOP can be assessed and integrated to derive an overall assessment. But what if information is missing? Could I utilize the power of existing knowledge, stored in a database, to analyze reactive features, statistical correlations, and chemical and mechanistic similarity to facilitate an assessment? Could I really ‘pull out all the stops’, as they say? And if I did (congrats to you on getting this far!) and managed to standardize reporting for the various assessments, would I be able to reproducibly justify the overall conclusion across hundreds of chemicals and effectively (don’t forget reproducibly) communicate the confidence in the results?

There is so much to unravel here, but a major plus is that we have a starting point. Documents such as the OECD’s Guidance Document on the Reporting of Defined Approaches and Individual Information Sources to Be Used within Integrated Approaches to Testing and Assessment (IATA) for Skin Sensitisation1 are an excellent resource. The recently published skin sensitization in silico protocol2 is also a good resource for expert review considerations and guidance on how to assess the reliability and confidence of an in silico assessment. If you would like to talk more about implementing the principles outlined in the skin sensitization in silico protocol, we would like to hear from you. Together, we could explore solutions to many of the questions above.
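
As a simple illustration of integrating key-event results, the sketch below applies a majority call across three key-event assays, in the spirit of a ‘2 out of 3’ defined approach; the actual rules, including how in silico predictions may fill in for missing assays, are set out in the OECD guidance1 and the protocol2.

def integrate_key_events(dpra: str, keratinosens: str, h_clat: str) -> str:
    """Majority call across three key-event assays ('positive'/'negative').

    A simplified sketch in the spirit of a '2 out of 3' defined approach;
    not the exact rules of the OECD guidance or the in silico protocol.
    """
    calls = [dpra, keratinosens, h_clat]
    positives = calls.count("positive")
    negatives = calls.count("negative")
    if positives >= 2:
        return "sensitizer"
    if negatives >= 2:
        return "non-sensitizer"
    return "inconclusive (expert review needed)"

print(integrate_key_events("positive", "negative", "positive"))  # sensitizer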

  1. OECD (2017) Guidance Document on the Reporting of Defined Approaches and Individual Information Sources to Be Used within Integrated Approaches to Testing and Assessment (IATA) for Skin Sensitisation. doi:10.1787/9789264279285-en
  2. Johnson, C., Ahlberg, E., Anger, L.T., et al. (2020) Skin sensitization in silico protocol. Regul. Toxicol. Pharmacol. 116, 104688. doi:10.1016/j.yrtph.2020.104688

Can the burden on industry and regulators be reduced?


About 6 years ago, the International Conference on Harmonization (ICH) published the M7 guideline “Assessment and Control of DNA reactive (mutagenic) Impurities in Pharmaceuticals to limit Potential Carcinogenic Risk”.[1] This was a landmark moment for computational toxicology, as it was the first time such methods were recognized by the ICH as a regulatory test.

An implementation period followed the guideline’s publication, in which pharmaceutical companies used computational assessments of impurities in their submissions to regulatory authorities. During this period, the stakeholders raised many questions on how such an assessment should be performed and documented. After a series of discussions, we decided to form a consortium to address these issues through the development of a protocol outlining how computational toxicology assessments aligned with ICH M7 should be performed. This protocol was published in 2016 as “Principles and procedures for implementation of ICH M7 recommended (Q)SAR analyses” [2], and we were encouraged to see how well received the paper was. It is consistently among the most downloaded articles in Regulatory Toxicology and Pharmacology and is often cited in publications and presentations by regulators and industry scientists. A follow-on paper addressing the handling of inconclusive results, “Principles and procedures for handling out-of-domain and indeterminate results as part of ICH M7 recommended (Q)SAR analyses”, was published in 2019. [3]

Based on the success of this work, we began to wonder: could we repeat this experience for other toxicology endpoints? This would reduce the burden on industry and regulators to justify the use of in silico methods, as well as ensure in silico assessments are performed in a consistent and reproducible manner to support good in silico practices.

Propelled by an NIH grant and the enthusiasm of a wider consortium of over 60 members, we began to develop a framework for such protocols, which was published in 2018 as “In silico toxicology protocols”. [4] Individual working groups focusing on the development of endpoint-specific protocols were established; the “Genetic toxicology in silico protocol” [5] was published in 2019, followed by the “Skin sensitization in silico protocol” [6] in 2020.

Work continues on the development of other protocols and supporting position papers. For endpoints such as skin sensitization, where there are generally accepted Adverse Outcome Pathways (AOPs), Integrated Approaches to Testing and Assessment (IATAs), Defined Approaches (DAs), and so on, the process of putting together an accepted protocol is clearly more straightforward than for other areas, such as carcinogenicity. As such, position papers reflecting the current state of the art, as well as gaps in our current knowledge, are important steps towards in silico protocols for complex toxicological endpoints.

This year also marked another important landmark in the evolution of this project: the complete implementation of the published protocols within Leadscope’s products. The solution provides access to the computational methodologies and toxicity databases outlined in the protocols, incorporated within a visual decision framework that supports an inspection of the results, along with the ability to perform an expert review and document the entire assessment process.

We are excited by the progress to date and the momentum of this project to support the more widespread application of in silico methods through the adoption of good in silico practices.

Please get in touch if you’d like to collaborate on this project.

References

[1] ICH M7, 2017 (R1) (2017) Assessment and control of DNA reactive (mutagenic) impurities in pharmaceuticals to limit potential carcinogenic risk. https://database.ich.org/sites/default/files/M7_R1_Guideline.pdf

[2] Amberg, A., Beilke, L., Bercu, J., et al. (2016) Principles and procedures for implementation of ICH M7 recommended (Q)SAR analyses. Regul. Toxicol. Pharmacol. 77, 13–24. doi:10.1016/j.yrtph.2016.02.004

Open access: https://doi.org/10.1016/j.yrtph.2016.02.004

[3] Amberg, A., Andaya, R.V., Anger, L.T., et al. (2019) Principles and procedures for handling out-of-domain and indeterminate results as part of ICH M7 recommended (Q)SAR analyses. Regul. Toxicol. Pharmacol. 102, 53–64. 10.1016/j.yrtph.2018.12.007

Open access: https://doi.org/10.1016/j.yrtph.2018.12.007

[4] Myatt, G.J., Ahlberg, E., Akahori, Y., et al. (2018) In Silico Toxicology Protocols. Regul. Toxicol. Pharmacol. 98, 1-17. doi:10.1016/j.yrtph.2018.04.014

Open access: https://doi.org/10.1016/j.yrtph.2018.04.014

[5] Hasselgren, C., Ahlberg, E., Akahori, Y., et al. (2019) Genetic toxicology in silico protocol. Regul. Toxicol. Pharmacol.  107, 104403. doi:10.1016/j.yrtph.2019.104403 

Open access: https://doi.org/10.1016/j.yrtph.2019.104403

[6] Johnson, C., Ahlberg, E., Anger, L.T., et al. (2020) Skin sensitization in silico protocol. Regul. Toxicol. Pharmacol.  116, October 2020, 104688. doi: 10.1016/j.yrtph.2020.104688 

Open access: https://doi.org/10.1016/j.yrtph.2020.104688