Research & Innovation

Variant Interpretation

Variant Interpretation and Advances in Genomic Medicine

Scott Kahn

June 18, 2025

The Core Challenge

One of the largest challenges in connecting a patient’s health status to the genomic data at hand is the lack of an aggregated summary of published research data. This problem is well suited to AI methods such as machine learning (ML) and large language models (LLMs), yet few clinical applications of LLMs are in common use today because of:

the limited coverage that existing models have, and
the lack of knowledge around the sensitivity and specificity of LLMs in a clinical setting.

Potential of AI in Clinical Settings

ML models and LLMs that can interpolate from aggregated research data should perform better than those that have to extrapolate results from a data corpus. General-purpose LLMs might also suffer from:

confounding observations, and
a lack of topical or disease focus.

In short, while applying domain-specific AI to the clinical interpretation of a patient’s genomic information holds promise, the field lacks a rigorous and objective framework for assessing the state of the art.

The Role of PhenoPackets

The development of PhenoPackets [1] has provided an opportunity to construct the objective framework needed to evaluate the “clinical” performance of different variant ranking and interpretation methods. PhenoPackets catalog known genomic mutations that have been associated with disease conditions observed in patients. These mutations can be seeded into virtual genome-wide allelic profiles constructed from a healthy cohort such as the 1000 Genomes Project (1KGP) [2], creating virtual patients with a known disease and the disease’s known genomic variant.
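The seeding step described above can be sketched in a few lines of Python. This is a minimal illustration only, not the pipeline used in [3]: the `Variant` and `VirtualPatient` types, the `seed_virtual_patient` helper, the coordinates, and the phenotype term ID are all hypothetical placeholders, and a real implementation would read phenopackets and VCF files rather than in-memory lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A simplified variant record (hypothetical, for illustration)."""
    chrom: str
    pos: int
    ref: str
    alt: str


@dataclass
class VirtualPatient:
    patient_id: str
    phenotypes: list   # phenotype term IDs taken from the case record
    variants: list     # healthy background variants plus the seeded one
    causal: Variant    # ground truth, withheld from the ranking method


def seed_virtual_patient(patient_id, background, causal, phenotypes):
    """Spike one known disease variant into a healthy background profile."""
    # Drop any identical background record so the causal variant appears once.
    variants = [v for v in background if v != causal]
    variants.append(causal)
    # Keep a deterministic order so the seeded variant does not stand out
    # by position in the list.
    variants.sort(key=lambda v: (v.chrom, v.pos))
    return VirtualPatient(patient_id, list(phenotypes), variants, causal)


# Toy data: coordinates and the phenotype ID are illustrative placeholders.
background = [Variant("chr1", 1000, "A", "G"), Variant("chr2", 500, "C", "T")]
causal = Variant("chr1", 2000, "G", "A")
patient = seed_virtual_patient("vp-001", background, causal, ["HP:0001250"])
```

Because the causal variant is stored separately as ground truth, any ranking method can be run on `patient.variants` and `patient.phenotypes` and then scored against `patient.causal`.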

Assessing Sensitivity and Specificity

This framework of “diseased” virtual patients can be used to assess a variant ranking method’s ability to:

identify disease within a patient (sensitivity)
identify the disease-causing mutation within the genomic data (specificity)

Because most approaches return a rank-ordered list of variants, “specificity” is best evaluated by whether the disease-causing variant is the top-ranked variant, falls within the top five variants, and so on. The cohort of virtual patients must be large enough to yield statistically meaningful performance figures for comparative purposes.
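As a rough sketch of how such top-k figures could be computed, the Python below counts the virtual patients whose seeded variant appears among the first k ranked results and attaches a Wilson score interval to the resulting proportion. The function names and the toy cohort are assumptions made for illustration, and the choice of a Wilson interval is ours; the benchmark in [3] may use different statistics.

```python
import math


def top_k_hit_rate(ranked_lists, causal_variants, k):
    """Fraction of patients whose causal variant is within the top k ranks."""
    if len(ranked_lists) != len(causal_variants):
        raise ValueError("one ranked list is needed per causal variant")
    if not causal_variants:
        raise ValueError("cohort is empty")
    hits = sum(
        1
        for ranked, causal in zip(ranked_lists, causal_variants)
        if causal in ranked[:k]
    )
    return hits / len(causal_variants)


def wilson_interval(p, n, z=1.96):
    """Wilson score interval for a proportion p observed over n patients."""
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Toy cohort of four virtual patients (variant IDs are placeholders).
ranked_lists = [
    ["c1", "a", "b"],                 # causal variant ranked 1st
    ["a", "c2", "b"],                 # ranked 2nd
    ["a", "b", "c", "d", "e", "c3"],  # ranked 6th
    ["a", "b"],                       # causal variant not returned
]
causal_variants = ["c1", "c2", "c3", "c4"]

rates = {k: top_k_hit_rate(ranked_lists, causal_variants, k) for k in (1, 5, 10)}
# rates == {1: 0.25, 5: 0.5, 10: 0.75}
low, high = wilson_interval(rates[1], len(causal_variants))
```

With only four patients the interval around the top-1 rate is very wide, which illustrates why a large cohort is needed before methods can be compared.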

Real-World Application: InheriNext® System

This PhenoPacket-derived evaluation framework was recently used to evaluate the performance of the InheriNext® system from Compass Bioinformatics and four publicly available systems in common use. The results [3] reveal that InheriNext’s ML ranking algorithm led the way with sensitivities for:

the top variant – 84.6%
top 5 variants – 95.0%
top 10 variants – 98.6%

Importantly, the time required to obtain a result of this quality from each raw exome sequencing dataset was approximately 5 minutes.

LLMs and Interpretation

From here, interpretation of the top-ranked variants is an area where LLMs can be explored and used to evaluate these variants within the context of a patient’s presentation. LLMs are critical in converting ranked variants into useful variant interpretations that can help guide disease diagnosis and treatment.

Opportunities for Further Improvement

In the research-use-only (RUO) product, InheriNext® Edge has introduced Expert, a domain-specific LLM chatbot trained on medical texts that is already delivering useful results for variant interpretation. However, there are still many areas of potential improvement in the application of AI methods broadly, and LLMs specifically, to the clinical interpretation challenges in genomic medicine.

One opportunity for improvement involves the integration of LLMs with ranking methods to maintain very high sensitivity in the top-ranked variant (or even the top 3 variants) rather than the top 5 or top 10 variants. A more succinct result would simplify clinical application.

Another advance would be to use patient presentation to select a more focused LLM that considers a patient’s medical context to help diagnose subtle differences in the course of disease that could influence therapy decisions.

An additional improvement would be to leverage a patient’s genetic ethnicity to focus an LLM on the factors known to be relevant within specific ethnicities and not others. This would continue to reduce the historical bias in clinical knowledge, which has been derived primarily from European populations.

Fortunately, these are all directions of ongoing research and development in the broader application of AI to genomic medicine.
We should anticipate and encourage the efforts of all groups working to improve AI’s impact on the field.

REFERENCES

[1] Danis, D., et al. (2025). A corpus of GA4GH Phenopackets: Case-level phenotyping for genomic diagnostics and discovery. Human Genetics and Genomics Advances, 6(1), 100371.
[2] 1000 Genomes Project – Wikipedia.
[3] Chang, J.-Y., et al. (2025). Evaluating a Standard Benchmark for Gene Prioritization: The InheriNext® Algorithm’s Integration of Genomic and Phenotypic Information. bioRxiv, 2025-02. Preprint.
