Computation and Data-Driven Discovery
RadBio-AI: Exploration of the Potential for Artificial Intelligence and Machine Learning to Advance Low-dose Radiation Biology Research
Understanding how low doses of radiation affect biological systems, including the human body, is a particularly complex challenge. The problem has been studied for decades, yet only recently has it shown promising signs of being tractable with artificial intelligence and machine learning (AI/ML). RadBio-AI’s primary aim is to assess the potential of AI/ML to advance the understanding of low-dose radiation biology across three broad domains:
- Identifying and characterizing the molecular markers of radiation-induced damage in biological tissues, and determining whether such markers can be differentiated by radiation type, dose, and exposure pattern.
- Automatically extracting relevant information from scientific literature and medical records to estimate an individual’s radiation doses, exposure history, radiation types, and associated comorbidities.
- Systematically searching for new ways to establish prior information that improves sensitivity in detecting molecular markers, evidence of radiation exposure, and eventual outcomes.
In each domain, a set of questions has been formulated that can be addressed quickly using existing datasets and largely established AI methods. What distinguishes this effort is the extensive use of ML and the integration of diverse strategies. Specifically, predictive models are being developed to identify molecular markers, information is being extracted from literature and medical records to formulate computational hypotheses, and prior information is being used to improve sensitivity and speed up learning. To accelerate these exploratory projects, the team is using existing datasets collected and curated through collaborations between the DOE and the National Cancer Institute (NCI), as well as other databases relevant to low-dose research. Notably, this project will not generate new data.
As part of this multi-institutional effort, Brookhaven Lab’s core objective is to distill insights from the scientific and biomedical literature that can advance the understanding of low-dose radiation biology, generate new hypotheses for computational testing, and build Bayesian priors that improve the predictive performance of ML methods, particularly when new data are limited. The principal components of this research are: natural language processing (NLP)-based extraction of knowledge from biomedical literature relevant to low-dose radiation, construction of knowledge-driven priors for Bayesian ML, methods for efficient sampling from complex knowledge-based priors, ML-based prediction and analysis of the molecular signatures underlying responses to low-dose radiation, and scaling of NLP/ML capabilities through high-performance computing. Key activities and accomplishments include:
- Automated extraction of context-specific gene regulatory relationships from biomedical literature via NLP
- Formulation of knowledge-based priors grounded in biological information, especially gene regulatory relationships extracted from existing literature
- Use of Bayesian ML with knowledge-based priors to improve the precision and robustness of low-dose radiation response studies, particularly when data are scarce
- Efficient techniques for large-scale sampling from complex knowledge-based priors
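
To make the role of a knowledge-based prior under data scarcity concrete, the following is a minimal sketch, not the project's actual method. It assumes a toy one-parameter model y = w * x + noise with a Gaussian prior on w, where the prior mean stands in for a regulatory effect reported in the literature. The function name, the data values, and the prior settings are all hypothetical and chosen only for illustration.

```python
def posterior_weight(xs, ys, prior_mean, prior_var, noise_var=1.0):
    """Posterior mean and variance of w in y = w * x + Gaussian noise,
    given a Gaussian prior on w (standard conjugate update)."""
    prior_prec = 1.0 / prior_var
    data_prec = sum(x * x for x in xs) / noise_var
    post_var = 1.0 / (prior_prec + data_prec)
    weighted_data = sum(x * y for x, y in zip(xs, ys)) / noise_var
    post_mean = post_var * (prior_mean * prior_prec + weighted_data)
    return post_mean, post_var


# Hypothetical scarce dataset: only two observations.
xs = [1.0, 2.0]
ys = [0.5, 3.5]

# Informative prior: assumed literature-derived effect of about 1.0.
informed = posterior_weight(xs, ys, prior_mean=1.0, prior_var=0.1)

# Vague prior: no prior knowledge, so the data dominate.
vague = posterior_weight(xs, ys, prior_mean=0.0, prior_var=100.0)

print("informed prior:", informed)
print("vague prior:   ", vague)
```

With so few observations, the informative prior pulls the estimate toward the assumed literature value and yields a smaller posterior variance than the vague prior. Whether that helps in practice depends on how accurate the prior knowledge is.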