We were recently involved in a cross-industry project to determine whether (Q)SAR models are fit for purpose for classification and labelling. To answer this question, companies from several industrial sectors each compiled a data set of chemicals with experimental rat acute oral toxicity data. These chemicals were run against the first version of the Leadscope acute rule-based and statistical-based models. The experimental results and the corresponding (Q)SAR predictions were then shared, without any information on the individual chemical structures, and the performance of the models on this blind data set was quantified.
We calculated a number of statistics to determine whether these models were fit for purpose. These included an assessment of whether the (Q)SAR models predicted either the correct category or a more conservative (that is, more potent) category.
The absolute percentage of correct or more conservative predictions was approximately 95%.
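To make this statistic concrete, here is a minimal Python sketch of how a "correct or more conservative" rate could be computed. It is an illustration, not the code used in the project. It assumes categories are encoded as integers where a lower number means a more potent (more toxic) category, so a prediction is acceptable when its number is less than or equal to the experimental one. The function name and the toy values are hypothetical and are not data from the study.

```python
from typing import Sequence


def fraction_correct_or_conservative(
    experimental: Sequence[int], predicted: Sequence[int]
) -> float:
    """Fraction of predictions that match the experimental category
    or fall in a more conservative (more potent) category.

    Assumption: categories are integers and a LOWER number means a
    MORE potent category, so "more conservative" means predicted < experimental.
    """
    if len(experimental) != len(predicted):
        raise ValueError("experimental and predicted must have the same length")
    if len(experimental) == 0:
        raise ValueError("at least one chemical is required")

    acceptable = sum(
        1 for exp_cat, pred_cat in zip(experimental, predicted) if pred_cat <= exp_cat
    )
    return acceptable / len(experimental)


# Invented toy values for illustration only (not results from the study).
experimental = [3, 4, 5, 2, 4]
predicted = [3, 3, 5, 3, 4]

rate = fraction_correct_or_conservative(experimental, predicted)
print(f"Correct or more conservative: {rate:.0%}")
```

In this toy example, only the fourth chemical counts against the models, because its predicted category is less potent than the experimental one.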
These results are part of a manuscript that has just been accepted for publication. The paper also covers the performance of the different (Q)SAR methodologies, the performance across different industrial sectors, and the impact of an expert review on the results.
Please get in touch with me (firstname.lastname@example.org) if you would like a copy of a recent poster from the ACT meeting on this topic or would like to talk in more detail.