Mosaic Medicine

Common Sense Validation and Quality Control for Artificial Intelligence

As we discussed in our last “Mosaic Medicine” blog post, there are many new modalities and paradigm-shifting technologies in various stages of development and deployment that will revolutionize the clinical research and medical product development industries. Underpinning many of these advances (and the hottest topic on the Internet right now) is artificial intelligence (AI). Techniques like machine learning, deep learning, large language models, automatic speech recognition, agentic AI, vision language models and other approaches can advance fundamental discovery and clinical studies, improve disease detection, produce a range of smarter devices and improve healthcare in many other ways.

While the algorithms are quite complex, at its root AI is simply math. Training data is used to create a function, which is then used to make predictions based on new data. This oversimplification is useful because it highlights where the technology can go wrong. Biased training data, for example, will create biased algorithms, compromising their efficacy and potentially reinforcing existing fairness issues in healthcare. As the cliché goes: garbage in, garbage out.
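
To make that oversimplification concrete, here is a purely illustrative sketch in Python. The ages, risk scores and the simple linear “model” are invented for the example and stand in for a real clinical algorithm; the only point is that the same training procedure, fed an unrepresentative slice of the data, produces a different function and a different answer for the same patient.

```python
# Illustrative sketch only: a toy "risk model" fit with np.polyfit.
# All numbers are made up; this is not any real clinical algorithm.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground truth: risk climbs faster at older ages.
ages = rng.uniform(20, 90, size=500)
risk = 0.001 * ages**2 + rng.normal(0, 0.2, size=500)

# "Training" builds a prediction function from the data...
full_fit = np.polyfit(ages, risk, deg=1)

# ...but a biased sample (here, a training set containing only younger
# patients) builds a different function from the same population.
young = ages < 45
biased_fit = np.polyfit(ages[young], risk[young], deg=1)

# The same new patient gets noticeably different predictions.
new_patient_age = 80
print("Trained on representative data:", np.polyval(full_fit, new_patient_age))
print("Trained on biased sample:      ", np.polyval(biased_fit, new_patient_age))
```

In this toy setup, the model trained only on younger patients substantially underestimates the risk it assigns to the 80-year-old – garbage in, garbage out, in miniature.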

This tension between AI’s promise and its obvious vulnerabilities is dominating the current conversation. Companies are ready to dive headlong into the AI scrum and develop unique and valuable technologies. At the same time, the technology itself can be a black box: few people understand how the predictions are made or whether they are even accurate, which has precipitated regulatory caution.

To complicate matters, AI is not like a traditional medical device or a small molecule drug – it can change over time. That is often what makes AI, AI: the ability to continuously learn from the data it’s handling. This makes the technology adaptable, but it also drives risk. If an algorithm learns the wrong lessons from the data and makes bad predictions, that could endanger patients.

To advance this technology, especially in the wake of recent government actions such as the White House’s AI Action Plan and related Executive Orders, we must combine thoughtful innovation with regulatory adaptation. While regulation is sometimes considered a dirty word, it’s important to remember that good governance clears the way for innovation. Sound guidelines give industry well-delineated pathways to success. Conversely, poor governance creates commercial and public uncertainty, which can kill markets.

When considering AI-based healthcare products from a regulatory standpoint, there are three main categories. The first is software that’s regulated as a medical device. Algorithm-based, closed-loop systems that monitor blood glucose and drive insulin administration are good examples. Other applications include programs that flag areas of concern in radiology images and the AI-based complex biomarkers discussed above.

The second category is AI that may not be considered a medical product and is therefore unregulated. That might include products that manage rote administrative tasks, such as ambient listening systems that transcribe clinical notes. Not surprisingly, some AI falls into regulatory gray areas, such as laboratory developed tests (LDTs). The recent court decision nullifying the U.S. Food & Drug Administration’s (FDA’s) LDT rule (American Clinical Laboratory Association v. U.S. Food and Drug Administration) could muddy that landscape even further.

The third category is AI for evidence generation. This could be evaluating clinical trial results or converting real-world data into real-world evidence. However, it also includes using AI to help develop the large datasets used to generate, evaluate and monitor AI. In other words, we’re using AI to help manage AI, which lends an ouroboros element to this problem.

If AI is being used to clean up data from electronic health records, for example, it is continuously adapting as it does the job. This is great, in theory, but how will users know that the algorithm is not hallucinating from bad or misinterpreted data and producing errors? 

That leads to the overriding regulatory question: What mechanisms are in place to continuously crosscheck the data? The average car has more than 50 sensors to monitor its function and ensure safety. Many AI algorithms can change from one minute to the next; how do we gain confidence that they are making the correct changes?
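
What such a crosscheck might look like is easier to see with a sketch. The example below is a minimal drift check, with made-up scores and an arbitrary threshold, that compares a model’s recent outputs against a frozen, previously validated reference window and flags the batch for human review when they diverge. Real monitoring systems would track many more signals, but the principle is the same.

```python
# Hedged illustration of an ongoing "crosscheck": flag when recent model
# outputs drift away from a validated reference distribution.
# Window sizes, scores and the 3-sigma threshold are arbitrary assumptions.
import numpy as np

def drift_flag(reference: np.ndarray, recent: np.ndarray, z_threshold: float = 3.0) -> bool:
    """Flag when the mean of recent outputs sits more than z_threshold
    standard errors away from the reference mean."""
    se = reference.std(ddof=1) / np.sqrt(len(recent))
    z = abs(recent.mean() - reference.mean()) / se
    return bool(z > z_threshold)

rng = np.random.default_rng(1)
reference_scores = rng.normal(0.30, 0.05, size=5_000)  # validated baseline
todays_scores = rng.normal(0.38, 0.05, size=200)       # hypothetical live batch

if drift_flag(reference_scores, todays_scores):
    print("Drift detected: route this batch for human review before use.")
```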

Ultimately, AI needs to show its work, providing consistent, real-time validation and performance data. If we’re expecting oncologists to rely on a sophisticated algorithm to help them develop the most precise and personalized treatment plans, we need them to have complete faith that it’s doing exactly what it’s supposed to do – always. Given the U.S. Government’s focus on winning the global AI race and accelerating the country’s AI work, industry must decide how to best develop safe and reliable products.

To get there, the combined AI and healthcare ecosystems must develop the underlying infrastructure to ensure compliance. That means creating the multimodal, longitudinal and linked datasets required to enhance transparency, detect bias, support accountability and create the most reliable systems. Stakeholders must also enhance AI literacy and workforce training to ensure proper implementation.
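
As one hedged illustration of the bias-detection piece, the sketch below compares a model’s accuracy across patient subgroups in a linked dataset and flags any gap for review. The subgroup labels, error rates and tolerance are assumptions invented for the example, not recommendations.

```python
# Illustrative subgroup audit: compare accuracy across groups and flag gaps.
# Labels, error rates and the 5-point tolerance are made up for the sketch.
import numpy as np

def subgroup_accuracy_gaps(y_true, y_pred, groups, max_gap=0.05):
    """Return per-group accuracy and a flag if the best and worst groups
    differ by more than max_gap."""
    accs = {}
    for g in np.unique(groups):
        mask = groups == g
        accs[str(g)] = float((y_true[mask] == y_pred[mask]).mean())
    flagged = max(accs.values()) - min(accs.values()) > max_gap
    return accs, flagged

# Toy data: a hypothetical model that performs worse on one site's patients.
rng = np.random.default_rng(2)
groups = rng.choice(["site_A", "site_B"], size=1_000)
y_true = rng.integers(0, 2, size=1_000)
errors = np.where(groups == "site_B", rng.random(1_000) < 0.30, rng.random(1_000) < 0.10)
y_pred = np.where(errors, 1 - y_true, y_true)

accs, flagged = subgroup_accuracy_gaps(y_true, y_pred, groups)
print(accs, "review needed" if flagged else "within tolerance")
```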

AI has tremendous potential, but it must be deployed responsibly. That means policymakers, healthcare professionals, and tech developers must collaborate to develop the appropriate standards to support public confidence and maximize innovation.

This may seem like a heavy lift – and it is – but it is not unprecedented. For decades, the Federal Aviation Administration and the National Transportation Safety Board have worked closely with the airlines to keep air travel safe and the industry healthy. The healthcare ecosystem, software developers, professional medical associations and government regulators must all join forces to bring this technology – in its best possible form – to patient care.


Read Mosaic Medicine Part 4 on New Modalities here, Part 3 on the Rise of Data Mosaicism here, Part 2 on The Snail here, and Part 1 on A Vision of the Shifting Clinical Evidence Landscape here.
