Overcoming the Barriers to Federated Data Systems

September 10, 2020

by Danny Levine

John Wilbanks Portrait by Nick Vedros

While there are technical, governance, and cultural barriers to establishing federated data systems in healthcare, John Wilbanks, chief commons officer at Sage Bionetworks and a government and policy advisor to RARE-X, says two of those challenges have been tackled.

“The old joke is data mining means the ‘data are mine,’” said Wilbanks. “That means that people don’t want to contribute their fragmented data to any kind of centralized repository. Federation offers the opportunity of allowing certain kinds of centralized access without having to hand over governance of your data to a third party.”

Federated data systems allow health information owners to share their data without giving up ownership and control. Pulling together collections of health data in this way promises to accelerate biomedical research by giving researchers larger pools of information to work with.
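To make the idea concrete, here is a minimal sketch of a federated count query in Python. It is an illustration only, not the architecture used by RARE-X or any system named in this article. The `Site` class, the `federated_count` function, the `dx` field, and the small-count suppression rule are all hypothetical. The point it shows is that each data holder answers the question locally and only an aggregate number leaves the site.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, object]


@dataclass
class Site:
    """A hypothetical data holder whose records never leave its control."""
    name: str
    records: List[Record] = field(repr=False)
    min_cell_size: int = 5  # assumed local policy: hide very small counts

    def count(self, predicate: Callable[[Record], bool]) -> Optional[int]:
        # The query runs where the data lives; only a number is returned.
        n = sum(1 for r in self.records if predicate(r))
        return n if n >= self.min_cell_size else None


def federated_count(
    sites: List[Site], predicate: Callable[[Record], bool]
) -> Tuple[int, List[str]]:
    """Send one query to every site and combine only the aggregate answers."""
    total = 0
    suppressed: List[str] = []
    for site in sites:
        n = site.count(predicate)
        if n is None:
            suppressed.append(site.name)  # site declined to report a small count
        else:
            total += n
    return total, suppressed


# Made-up example data: three sites, each holding its own records.
sites = [
    Site("A", [{"dx": "x"}] * 6 + [{"dx": "y"}] * 3),
    Site("B", [{"dx": "x"}] * 2),
    Site("C", [{"dx": "x"}] * 7 + [{"dx": "y"}]),
]
total, suppressed = federated_count(sites, lambda r: r["dx"] == "x")
```

In this sketch the caller never sees a single record. It learns a combined count and which sites withheld an answer, while each site keeps the power to decide what it reports.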

In the last five to eight years, a series of technologies has matured enough to overcome the technical challenges of federating data, says Wilbanks. These include cloud computing, data standards, authentication, and the security required in healthcare.

On the governance side, Wilbanks notes that the Health Insurance Portability and Accountability Act, the health privacy law known as HIPAA, has both made owners of data fearful and provided them with a convenient excuse to do nothing.

“No one ever got fired for not doing the new thing. Policy and compliance folks are conservative and hesitant by their nature,” he said. “Data is perceived to be a competitive advantage by executive branch folks. You have a lot of people on the executive side of hospitals who simply choose to hide behind the governance as a way to avoid doing it.”

That points to the cultural challenge, which is that “nobody likes sharing stuff in the health space,” he said.

In the past, the technical and legal barriers were significant enough that they allowed people to avoid addressing the cultural issues, but that has changed.

Consider the work Wilbanks has been doing with the National Institutes of Health’s National Center for Advancing Translational Sciences (NCATS) on its National COVID Cohort Collaborative (N3C) to build a centralized data resource for the research community to study COVID-19. Wilbanks is a co-chair of the data governance and partnerships working group of N3C, which brings together 65 healthcare institutions to share data about COVID-19.

“They have avoided data sharing in almost any form, against the wishes of their funders, for a decade or more. We have at this point 56 of the 65 awardees who have signed agreements to deposit and share their data, as well as to use a federated query architecture,” said Wilbanks, who has been convincing these research institutions to sign data transfer agreements, as well as working on issues like community guidelines and publication policies for the group with NCATS and NIH. “It took COVID to break the cultural barrier, but it only took a hundred days to solve the technology and the governance once we got past the cultural barrier.”

One of the benefits of federating the data is having much larger data sets to query, allowing for greater validity in the findings. “It enables the kinds of queries everyone wants to run with enough data to be meaningful. We have a lot of unreliable claims that come out of data science, mainly because the data sets are small and non-representative,” said Wilbanks. “And so simply the scale, the size of it means that those queries go from unreliable to reliable because you’ve gotten over the hump of local minimum data amounts.”
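The effect of scale that Wilbanks describes can be illustrated with a standard statistical formula. The sketch below computes the approximate 95 percent margin of error for an estimated proportion, assuming simple random sampling. The function name and the sample sizes are chosen for illustration and do not come from N3C or RARE-X. The formula captures only sampling noise. It says nothing about whether a data set is representative, which is the other problem Wilbanks raises.

```python
import math


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion p estimated from n records.

    Assumes simple random sampling; z = 1.96 is the usual normal quantile.
    """
    return z * math.sqrt(p * (1 - p) / n)


# Illustrative sizes only: one small local data set vs. a pooled one.
small = margin_of_error(0.5, 100)      # roughly plus or minus 10 percentage points
pooled = margin_of_error(0.5, 10_000)  # roughly plus or minus 1 percentage point
```

Under these assumptions, a hundredfold increase in records narrows the margin of error tenfold, which is one way a query can move from unreliable to reliable.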

In the case of RARE-X, writing a set of queries where users can plug in variables could allow someone who is not a data scientist to perform powerful interrogations of large data sets easily.
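A pre-written query with fill-in variables might look something like the following sketch, which uses Python's built-in `sqlite3` module and named parameters. The `participants` table, its columns, the sample rows, and the `run_template` function are all hypothetical and are not RARE-X's actual schema or interface. The sketch only shows how a non-specialist could supply values without writing the query itself.

```python
import sqlite3

# A query written once by a data scientist; users only supply the variables.
QUERY_TEMPLATE = """
SELECT COUNT(*)
FROM participants
WHERE diagnosis = :diagnosis
  AND age BETWEEN :min_age AND :max_age
"""


def run_template(conn: sqlite3.Connection, diagnosis: str,
                 min_age: int, max_age: int) -> int:
    """Fill the template's named parameters and return the matching count."""
    params = {"diagnosis": diagnosis, "min_age": min_age, "max_age": max_age}
    return conn.execute(QUERY_TEMPLATE, params).fetchone()[0]


# Made-up demonstration data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE participants (diagnosis TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO participants VALUES (?, ?)",
    [("condition_a", 4), ("condition_a", 12),
     ("condition_a", 30), ("condition_b", 9)],
)

pediatric_count = run_template(conn, "condition_a", 0, 17)
```

Passing values as bound parameters, rather than pasting them into the query text, keeps user input from altering the query itself.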

“It’s hard to use health data effectively to answer interesting questions. Part of that is because the kind of platform RARE-X is building has only been built in the past by venture-backed corporations, large hospital systems, or large technology companies,” said Wilbanks. “What those platforms do is lower the cost and the difficulty of doing data science. They do that for their employees and their stakeholders. RARE-X, because of its business model as a nonprofit organization, can make that kind of capacity publicly available to patients.”

Want to learn more about federated data? Watch our video, “RARE-X and Federated Data.”

John Wilbanks portrait taken in Kansas City, MO, 14 September 2009, by Nick Vedros. Copyright Nick Vedros, licensed to the public under CC BY-SA.
