Written by Manuj Vangipurapu, CEO and founder of Clinion
“The future of clinical research is AI!” It is now commonplace to hear this, but what does it mean? We have all heard how AI is being applied in basic research to identify molecules, to find disease patterns in potential patient populations, and in virtual trials. In this article I will briefly touch upon the well-known and a few lesser-known applications of AI and automation in the clinical trials process.
Machine Learning (ML) is a branch of AI that applies algorithms to data, enabling a system to ‘learn’ and improve. ML allows users to process large quantities of data, draw inferences and predict outcomes. These insights can be used to automate parts of the system, leading to a faster and more efficient clinical trial process. Automation allows the ML predictions to be fed back into the system so that specific actions can be taken, reducing the need for human intervention and improving quality and speed. ML and automation can be applied across every stage of the trial process.
Machine Learning can be applied to protocol design and language translation. Using existing protocol data and health libraries for specific therapeutic areas, the system can generate a protocol for a new study. The ML algorithms would be able to design an optimal protocol from this knowledge base, leading to shorter design times, fewer protocol amendments and fewer study disruptions. Language translation could also be done quickly and easily, and with a greater degree of accuracy than traditional methods, since the ML model would have a domain-specific language knowledge base to learn from.
ML can be used to automate the design and setup of the case report form (CRF) and study database. Using a library of CRFs for specific therapies and study designs, the ML model can be trained to design an optimal CRF, along with edit checks, based on the protocol. This approach can also incorporate edit checks that a human designer might otherwise miss. Automation then translates this output into the actual study setup and validates it. The validation report gives database designers the inputs they need to tweak the design where required and apply the finishing touches before go-live. ML can also be used to automate SDTM mapping or create SDTM-annotated studies.
A lot of automation involving machine learning is possible in trial management. Some of the obvious use cases are site selection, patient enrolment, Risk Based Monitoring (RBM) and Chatbots.
Site Selection: Optimal site selection is possible using machine learning models. These models can be trained to review site parameters such as enrolment, safety, compliance and data quality, and to predict which sites would be good candidates for a new study in a particular specialty. The prioritization of these parameters depends upon the type of trial and the CRO or sponsor. The algorithm could be trained on previous study data and would then be able to predict site performance for a new study.
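To make the idea of prioritizing site parameters concrete, here is a minimal sketch. It is not a trained model: it uses a plain weighted score as a stand-in for one, and the `SiteHistory` fields, the 0-to-1 scales, the site names and the weights are all assumptions made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class SiteHistory:
    # Hypothetical summary of a site's past performance, each on a 0..1 scale
    # where higher is better.
    name: str
    enrolment: float
    safety: float
    compliance: float
    data_quality: float

def rank_sites(sites, weights):
    """Rank sites by a weighted average of their historical parameters."""
    total_weight = sum(weights.values())

    def score(site):
        return sum(getattr(site, key) * w for key, w in weights.items()) / total_weight

    return sorted(sites, key=score, reverse=True)

# Illustrative data only.
sites = [
    SiteHistory("Site A", enrolment=0.9, safety=0.7, compliance=0.8, data_quality=0.6),
    SiteHistory("Site B", enrolment=0.5, safety=0.9, compliance=0.9, data_quality=0.9),
    SiteHistory("Site C", enrolment=0.4, safety=0.5, compliance=0.6, data_quality=0.5),
]

# A sponsor that cares most about enrolment speed.
enrolment_first = {"enrolment": 0.4, "safety": 0.2, "compliance": 0.2, "data_quality": 0.2}
# A sponsor that cares most about safety, compliance and data quality.
quality_first = {"enrolment": 0.1, "safety": 0.3, "compliance": 0.3, "data_quality": 0.3}

ranked_for_enrolment = [s.name for s in rank_sites(sites, enrolment_first)]
ranked_for_quality = [s.name for s in rank_sites(sites, quality_first)]
print(ranked_for_enrolment)
print(ranked_for_quality)
```

The same three sites come out in a different order under the two sets of weights, which is the point about prioritization depending on the trial and the sponsor. A real system would learn the weights, or a richer model, from previous study data rather than fixing them by hand.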
Patient Enrolment: Predictive analytics for patient enrolment is a popular use case. It draws on variables such as therapeutic area, study duration, disease prevalence (from health economics data), study complexity, adverse events, randomization and whether the study is multi-centric. The ML algorithm reviews these variables and selects the ones with the most impact on enrolment. The finalized model can then be used to predict patient enrolment for future studies. Even though this is a popular use case, the large number of external factors makes the prediction very challenging, and the resulting forecasts are often unreliable.
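The variable selection step can be sketched very roughly as follows. This sketch assumes a simple correlation screen, which is only one of many ways to judge impact and is not necessarily what a production model would use. The variable names and the four past studies are invented for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def rank_variables(history, target):
    """Rank candidate variables by absolute correlation with the target."""
    names = [name for name in history[0] if name != target]
    ys = [row[target] for row in history]
    scores = {
        name: abs(pearson([row[name] for row in history], ys))
        for name in names
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Made-up records of past studies (hypothetical variables and values).
history = [
    {"disease_prevalence": 1.0, "study_complexity": 3, "is_multicentric": 1, "patients_per_month": 10},
    {"disease_prevalence": 2.0, "study_complexity": 1, "is_multicentric": 1, "patients_per_month": 20},
    {"disease_prevalence": 3.0, "study_complexity": 2, "is_multicentric": 1, "patients_per_month": 30},
    {"disease_prevalence": 4.0, "study_complexity": 3, "is_multicentric": 1, "patients_per_month": 40},
]

ranking = rank_variables(history, "patients_per_month")
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy data, disease prevalence tracks enrolment perfectly, while the multi-centric flag never varies and so carries no signal. Real enrolment data is far noisier, which is exactly why this prediction is hard.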
Risk Based Monitoring: Risk Based Monitoring, or RBM, can be applied at various stages of a clinical trial to identify and mitigate the risks affecting it. One type of RBM uses some of the components of site selection, such as enrolment, safety, compliance and data quality, together with other variables such as therapeutic area and whether the trial is multi-centric, to predict site performance during the trial. These predictors can be used to de-risk the trial by identifying risk in advance and then working to alleviate it, closing some sites and opening new ones, or focusing on the sites that are doing well.
Chatbots: Chatbots are some of the simplest and easiest-to-develop examples of the power of machine learning. Different chatbots can be deployed for different types of users – site users, CRO staff, patients etc. Users can interact with them by text and voice, and the chatbots understand natural language and context and are able to respond accurately. This improves the user experience and reduces the burden on the support team.
Data management offers tremendous scope for AI-enabled automation. Some applications are listed below:
Smart Queries: In smart querying, the machine learning algorithm reads the entered trial data and identifies potential queries that can be raised for various field items. This identification is possible through a combination of previous study data and the therapeutic area. The algorithm learns the expected value ranges for a particular data point with regard to a therapeutic area and raises a query if it identifies a deviation. This query is then vetted by a data manager, who either qualifies it as a legitimate query or discards it. The ML algorithm also learns from this decision and improves its classification going forward.
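The loop described here (learn a range, flag deviations, learn from the data manager's decision) can be sketched in a few lines. The sketch rests on simple assumptions: the "learning" is just a mean plus or minus three standard deviations, and a discarded query widens the range. The field name, subject IDs and values are invented.

```python
from statistics import mean, stdev

def learn_range(values, k=3.0):
    """Learn an expected range as mean +/- k sample standard deviations."""
    m, s = mean(values), stdev(values)
    return (m - k * s, m + k * s)

def propose_queries(records, ranges):
    """Return one candidate query per value that falls outside its learned range."""
    queries = []
    for record in records:
        for field, value in record["data"].items():
            if field not in ranges:
                continue
            low, high = ranges[field]
            if not (low <= value <= high):
                queries.append({"subject": record["subject"], "field": field, "value": value})
    return queries

def apply_review(ranges, query, is_legitimate):
    """Feed the data manager's decision back: a discarded query widens the range."""
    if not is_legitimate:
        low, high = ranges[query["field"]]
        ranges[query["field"]] = (min(low, query["value"]), max(high, query["value"]))

# Illustrative systolic blood pressure values from previous studies.
ranges = {"systolic_bp": learn_range([118, 122, 130, 125, 135, 128, 120, 132])}

records = [
    {"subject": "001", "data": {"systolic_bp": 124}},
    {"subject": "002", "data": {"systolic_bp": 240}},
    {"subject": "003", "data": {"systolic_bp": 146}},
]

first_pass = propose_queries(records, ranges)
# Suppose the data manager keeps the query on subject 002 and discards the one on 003.
for query in first_pass:
    apply_review(ranges, query, is_legitimate=(query["subject"] == "002"))
second_pass = propose_queries(records, ranges)
print([q["subject"] for q in first_pass], [q["subject"] for q in second_pass])
```

After the review, the borderline value for subject 003 no longer raises a query, while the clearly implausible value for subject 002 still does. A real system would use a proper classifier and keep an audit trail of every decision.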
Medical Coding: Conventional rule-based programs can automatically code verbatim terms against the WHODD and MedDRA dictionaries only up to a certain percentage; a medical coder then has to review the data and manually code the remaining terms. ML algorithms can learn from coding libraries for various therapeutic areas and then match the verbatim text of the study with the correct dictionary term for that specialty. Machine learning can accomplish this coding to a high degree of accuracy.
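As a rough illustration of matching verbatim text to dictionary terms, the sketch below uses plain string similarity from Python's standard `difflib` module. This is a much simpler technique than a trained ML model and is shown only to make the workflow tangible. The tiny dictionary and its `CODE-00x` identifiers are placeholders, not real MedDRA or WHODD entries.

```python
from difflib import get_close_matches

# Placeholder dictionary: lower-cased term -> hypothetical code.
dictionary = {
    "headache": "CODE-001",
    "nausea": "CODE-002",
    "vomiting": "CODE-003",
    "dizziness": "CODE-004",
}

def auto_code(verbatim, dictionary, cutoff=0.8):
    """Return (term, code) for the closest dictionary term, or None if the
    best match is below the cutoff and needs a medical coder's review."""
    key = verbatim.strip().lower()
    matches = get_close_matches(key, list(dictionary), n=1, cutoff=cutoff)
    if not matches:
        return None
    term = matches[0]
    return term, dictionary[term]

for verbatim in ["Headach", "NAUSEA ", "feeling sick"]:
    print(repr(verbatim), "->", auto_code(verbatim, dictionary))
```

The misspelled "Headach" is still coded, while "feeling sick" shares too little text with any term and is left for manual coding. Closing that gap, by learning that "feeling sick" usually means nausea in a given specialty, is where a model trained on coding libraries would help.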
Query Management: Thousands of queries are raised for every clinical trial, and a large amount of time is spent responding to them. Many of these queries are redundant and are raised because of misconfigured edit checks in the EDC. Machine learning can use clustering to group similar queries together so that the underlying issue can be identified. Each cluster can then be dealt with in bulk, or appropriate edit checks can be configured mid-study to address the issue going forward.
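A minimal sketch of the clustering idea follows, assuming a greedy grouping by word overlap (Jaccard similarity). This is a deliberately simple stand-in for whatever clustering algorithm a real system would use. The query texts and the 0.6 threshold are invented for illustration.

```python
import re

def tokens(text):
    """Lower-case the text and mask digits so visit and subject numbers do not matter."""
    return frozenset(re.sub(r"\d+", "#", text.lower()).split())

def jaccard(a, b):
    """Share of words two token sets have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def cluster_queries(queries, threshold=0.6):
    """Greedy clustering: add each query to the first cluster whose
    representative is similar enough, otherwise start a new cluster."""
    clusters = []  # list of (representative tokens, member queries)
    for query in queries:
        query_tokens = tokens(query)
        for representative, members in clusters:
            if jaccard(query_tokens, representative) >= threshold:
                members.append(query)
                break
        else:
            clusters.append((query_tokens, [query]))
    return [members for _, members in clusters]

queries = [
    "Visit 2 weight is missing for subject 101",
    "Visit 3 weight is missing for subject 245",
    "Adverse event end date is before start date",
    "Visit 2 weight is missing for subject 310",
]

clusters = cluster_queries(queries)
for members in sorted(clusters, key=len, reverse=True):
    print(len(members), "x", members[0])
```

A large cluster of near-identical queries, such as the three "weight is missing" queries here, is a hint that one edit check may be misconfigured and that the whole group can be handled together.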
Smart SDV: Organizations spend a lot of time and money on trial monitoring. CRAs must travel to sites to monitor the study and to perform Source Data Verification (SDV). Machine learning can greatly reduce this manual effort. Site personnel can take an image of the source documents and upload it to the server. Machine learning algorithms can extract the text from these images and send it to the EDC. The EDC then compares the extracted data with the entered data and marks each matching item as source data verified. Where there is no match, it raises a query that needs to be verified manually.
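The comparison step at the end of this flow can be sketched as below. The text extraction itself is out of scope here: the sketch assumes it has already produced a dictionary of field values. The field names, status labels and sample values are hypothetical.

```python
def normalize(value):
    """Compare values case-insensitively and ignore extra whitespace."""
    return " ".join(str(value).strip().lower().split())

def verify_source(extracted, entered):
    """Mark each entered field as verified if it matches the value extracted
    from the source document, otherwise flag it for a manual query."""
    results = {}
    for field, entered_value in entered.items():
        source_value = extracted.get(field)
        if source_value is not None and normalize(source_value) == normalize(entered_value):
            results[field] = "SDV_VERIFIED"
        else:
            results[field] = "QUERY"
    return results

# Values assumed to come from text extraction on the uploaded source image.
extracted = {"weight_kg": "72.5", "sex": "Female ", "date_of_birth": "1980-02-11"}
# Values entered in the EDC by site staff.
entered = {"weight_kg": "72.5", "sex": "female", "date_of_birth": "1980-11-02", "height_cm": "168"}

results = verify_source(extracted, entered)
for field, status in results.items():
    print(field, status)
```

Here the transposed date of birth and the height that could not be found in the source both end up as queries, while the other two fields are marked as verified without a site visit.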
Machine Learning can provide many insights into clinical data during and after the trial. Classification, clustering and prediction are some of the techniques which can be used in data analysis to bring out critical insights into large datasets. Patient behavior, adverse events etc. can be predicted using machine learning.
Regulatory submission in clinical trials requires a large amount of documentation. Much of this documentation can be templatized and automated using machine learning.
CSR Automation: In CSR automation, the Clinical Study Report (CSR) is generated automatically using machine learning, which reads the Study Protocol and the Study Analysis Report (SAR). Following ICH GCP templates, most of the CSR can be generated this way. Natural Language Processing (NLP) algorithms can be used to adapt the language of the CSR and to generate the narratives. The draft can then be reviewed and edited by the Medical Writer to arrive at the final CSR. All of this is possible in 2-3 days. This shortens the regulatory submission process drastically and improves submission quality.