Offering clinicians an online artificial intelligence guidance system to bridge the research gap between medicine and engineering
With the advent of deep learning, the use of artificial intelligence (AI) in medicine has grown steadily. Today, AI is applied in several areas of medicine, including disease diagnosis, electronic health records, medical image analysis, and even epidemic outbreak tracking and detection. More than ever, clinicians would like to develop and train their own models for use in their research studies. However, the range of available architectures, from the much-loved U-Net to its many derivatives such as the popular Attention U-Net, can be overwhelming. While there are guidelines on how to present AI-driven clinical reports (e.g., CONSORT-AI [1]), no comparable guidelines or assistance exist for choosing an AI model or for identifying the factors that contribute to its successful deployment.
The objective of this project is to provide an online platform on which clinicians can simply type out the requirements of their research and the type of AI tool they need. The platform, using a large language model (LLM), will then provide the clinician with a report that highlights (1) the most suitable AI architecture for their task, and (2) guidance on how to carry out the training, including suggested hyperparameters. The selected student will be responsible for developing the LLM.
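As a rough illustration of the interaction described above, the sketch below wraps a clinician's free-text request in a prompt that asks the LLM for the two report sections. The function name `build_report_prompt`, the section titles, and the template wording are assumptions made for illustration only; the project does not prescribe a prompt format.

```python
# Hypothetical section titles mirroring items (1) and (2) of the report.
REPORT_SECTIONS = (
    "Recommended AI architecture",
    "Training guidance and suggested hyperparameters",
)


def build_report_prompt(requirements: str) -> str:
    """Wrap a clinician's free-text request in a prompt asking for both report sections."""
    requirements = requirements.strip()
    if not requirements:
        raise ValueError("requirements must not be empty")
    # Number the sections (1), (2), ... so the model's answer follows the report layout.
    numbered = "\n".join(
        f"({i}) {title}" for i, title in enumerate(REPORT_SECTIONS, start=1)
    )
    return (
        "A clinician has described the following research requirements:\n"
        f"{requirements}\n\n"
        "Write a report with these sections:\n"
        f"{numbered}\n"
    )


# Example request (placeholder text, not taken from the project).
prompt = build_report_prompt("  Segment liver tumours in CT scans.  ")
```

The resulting string would be passed to whichever LLM the project ends up using; that call is deliberately left out because the model and serving stack are not specified.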
The following steps are expected to be completed as part of the project:
1. Background review
o Understanding the gap between how engineers and clinicians approach AI modelling
o Summary of AI models, their tasks, and their common areas of application in medicine
2. Data collection and LLM training
o Identify an existing LLM that has been pre-trained for next-word prediction on a large-scale text dataset.
o Collect two types of instruction-based datasets:
A freely accessible, high-quality general instruction-based dataset (e.g., Open Assistant)
A specialised instruction-based dataset grounded in medical AI texts, health records, and guidelines (this dataset will be generated by the student).
o Fine-tune the model using the collected datasets.
3. GitHub release
o The final work will be shared on GitHub, and any accompanying publications will be listed in the GitHub release.
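The data preparation in step 2 could be sketched as follows: examples from the general and the specialised instruction datasets are merged, shuffled reproducibly, and rendered into training text. This is a minimal sketch under stated assumptions, not the project's pipeline. The `InstructionExample` class, the `source` labels, the prompt template, and the sample records are all hypothetical placeholders; the actual template depends on the base model chosen, and the fine-tuning call itself is omitted because no training framework is specified.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class InstructionExample:
    instruction: str
    response: str
    source: str  # hypothetical label: "general" or "medical_ai"


# Hypothetical prompt template; the real one depends on the chosen base model.
TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n{response}"


def to_training_text(example: InstructionExample) -> str:
    """Render one instruction/response pair as a single training string."""
    return TEMPLATE.format(
        instruction=example.instruction.strip(),
        response=example.response.strip(),
    )


def mix_datasets(general, medical, seed=0):
    """Concatenate both datasets and shuffle them reproducibly."""
    combined = list(general) + list(medical)
    random.Random(seed).shuffle(combined)
    return combined


# Placeholder records for illustration only (not real dataset entries).
general = [
    InstructionExample("Summarise this paragraph.", "A short summary.", "general"),
]
medical = [
    InstructionExample(
        "Which model family could I try for medical image segmentation?",
        "U-Net and its variants are one option to consider.",
        "medical_ai",
    ),
]

mixed = mix_datasets(general, medical)
texts = [to_training_text(ex) for ex in mixed]
```

Keeping the `source` label on each example makes it possible to check later how much of the fine-tuning data came from each dataset.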
Results from this project are expected to be published as a journal or conference paper.