RARE Daily

AI System Has Potential to Shorten Rare Disease Diagnostic Odyssey

March 25, 2026

Rare Daily Staff

A new artificial intelligence system could help shorten the diagnostic odyssey that families affected by a rare disease often face, according to a new study in the journal Nature.

Researchers in China and the United States report that their system, called DeepRare, matched or beat leading diagnostic software and even outperformed experienced rare disease specialists at identifying the correct condition from complex symptom lists. The tool is designed as a diagnostic copilot for doctors, not a replacement, and emphasizes showing its work with step-by-step explanations and links to medical evidence.

Rare diseases collectively impact more than 300 million people worldwide, most of them with genetic conditions. Yet patients typically spend more than five years moving between doctors, undergoing tests, and receiving misdiagnoses before getting the correct answer. The authors say this lag can delay treatment, increase medical costs, and deepen emotional strain on families.

DeepRare uses a large language model to coordinate more than 40 specialized tools and up-to-date medical knowledge sources. It can read free-text clinic notes, structured symptom codes, and results from genetic testing, then produce a ranked list of possible diagnoses alongside a narrative explaining why each disease is likely or unlikely for that patient. The system cross-checks its own answers in a self-reflection loop intended to reduce overconfident, incorrect guesses, a common problem with AI.

To test the system, the team evaluated DeepRare on 6,401 clinical cases drawn from nine datasets across Asia, North America, and Europe, covering 2,919 different rare diseases in 14 medical specialties, including neurology, endocrinology, and immunology. Using standardized symptom data, DeepRare correctly placed the true diagnosis in the top spot more than 57 percent of the time, outperforming the next-best AI method by nearly 24 percentage points. For patients at two children’s hospitals in China for whom both symptom and genetic sequencing data were available, its top-ranked answer was correct about 69 percent of the time; in one cohort, the widely used tool Exomiser reached nearly 56 percent.

In one real-world test, the researchers compared DeepRare against five physicians with at least a decade of rare disease experience, all working from the same structured symptom profiles and allowed to use search engines but not AI. DeepRare’s leading diagnosis matched the true disease 64.4 percent of the time, versus 54.6 percent for the doctors; looking at each group’s top five suggestions, the system reached 78.5 percent accuracy, compared with 65.6 percent for clinicians. The authors stress that the tool is meant to support specialists, particularly by rapidly surfacing obscure conditions and relevant literature, not to replace their judgment.
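
The "top spot" and "top five" figures above are what researchers usually call top-k accuracy: the share of cases in which the true diagnosis appears among the first k suggestions. The short Python sketch below shows how such a score can be computed. It is an illustration only, not the study's evaluation code; the function name top_k_accuracy and the toy cases are hypothetical and are not data from the paper.

```python
from typing import Sequence


def top_k_accuracy(ranked: Sequence[Sequence[str]], truth: Sequence[str], k: int) -> float:
    """Fraction of cases whose true diagnosis appears in the first k ranked suggestions."""
    if len(ranked) != len(truth):
        raise ValueError("ranked and truth must have the same length")
    if not truth:
        raise ValueError("at least one case is required")
    # A case counts as a hit when the true label is among its first k predictions.
    hits = sum(1 for preds, true in zip(ranked, truth) if true in preds[:k])
    return hits / len(truth)


# Hypothetical toy cases for illustration; NOT data from the study.
ranked_lists = [
    ["Fabry disease", "Gaucher disease", "Pompe disease"],
    ["Marfan syndrome", "Loeys-Dietz syndrome", "Ehlers-Danlos syndrome"],
    ["Alport syndrome", "Fabry disease", "Cystinosis"],
    ["Wilson disease", "Hemochromatosis", "Menkes disease"],
]
true_diagnoses = [
    "Fabry disease",
    "Loeys-Dietz syndrome",
    "Cystinosis",
    "Niemann-Pick disease",
]

top1 = top_k_accuracy(ranked_lists, true_diagnoses, k=1)
top3 = top_k_accuracy(ranked_lists, true_diagnoses, k=3)
print(f"top-1: {top1:.2f}, top-3: {top3:.2f}")  # prints "top-1: 0.25, top-3: 0.75"
```

In this toy set, only the first case has the right answer ranked first, while three of the four cases contain it somewhere in the top three. That gap is why top-five figures in the study run higher than top-one figures for both the system and the physicians.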

One feature the team highlights is traceability. DeepRare provides an evidence trail for each recommendation, citing research papers, case reports, and rare disease databases such as Orphanet and OMIM. Ten senior rare disease physicians reviewed 180 cases and rated the accuracy of these reference lists, finding that 95.4 percent of the cited evidence was both reliable and directly relevant to the AI’s conclusions. Errors tended to arise when the system generated realistic-sounding but nonexistent links, or when it chose the wrong diagnosis and, in turn, pulled the wrong supporting sources.

The study notes that the system still struggles in some areas. In a review of 200 cases where DeepRare missed the correct diagnosis in its top five, specialists found the most common issues were placing too much weight on nonspecific symptoms, or confusing diseases that look clinically similar, such as different genetic syndromes with overlapping features. Fundamental factual errors in reasoning or misuse of external evidence were uncommon, suggesting the main challenge is clinical nuance rather than basic knowledge.

The authors argue that agent-based AI systems like DeepRare — which break a problem into subtasks, call specialized tools, and then stitch results together — may be especially well suited to rare disease medicine, where data are sparse, knowledge changes quickly, and explanations matter. They note that DeepRare maintained strong performance across body systems, doing particularly well in kidney and urinary disorders, but showing weaker accuracy in lung and breathing conditions, highlighting areas where more work is needed.

DeepRare is already deployed as a web-based application, allowing clinicians to enter a patient’s history, refine symptoms, upload genetic files, and receive a structured report of likely diagnoses and supporting evidence.

The team envisions expanding the framework beyond diagnosis into suggesting treatments and forecasting disease progression, and exploring how similar systems might help non-specialists recognize when a rare disease should be suspected in the first place. They also acknowledge the need for more validation in diverse, real-world clinics, as well as careful oversight to ensure the technology reduces disparities rather than widening them.
