The data revolution and AI in the Health sector

Article by Natalia de la Figuera – Co-Founder and COO of GENESIS Biomed

• The data and artificial intelligence revolution is driving deep changes in medical research.

• The increasing collection of health data poses significant challenges to the protection of patients’ privacy.

• The business model for the sale of patient data is a subject of debate.

Since the beginning of the industrial revolution around 1760, humankind has lived through periods in which technological change has profoundly transformed society, the economy and work. The industrial revolution was followed by the technological revolution, and now the data and AI revolution has arrived. The latter has had a major impact on medical research through the ability to collect large volumes of health data. From medical records to genetic test results, data allows us to gain a deeper understanding of diseases, improve existing treatments and develop new therapies.

However, the data collected from patients or individuals, whatever their source (clinical history, clinical trials, etc.), are often raw data that cannot be used directly and must first be processed into prepared, usable data. This process includes, among other steps, pseudo-anonymisation (called “pseudonymisation” in the GDPR), cleaning, and structuring of the data according to a common standard such as the OMOP (Observational Medical Outcomes Partnership) Common Data Model. These data can also be extended with so-called “synthetic data”: data generated by algorithms that follow statistical or behavioural rules in order to emulate the structure, distribution and correlations of the real data.
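To make the idea of synthetic data more concrete, the following is a minimal sketch in Python. It assumes a deliberately simple model (two numeric variables that follow a bivariate normal distribution); the function names `fit_bivariate_gaussian` and `sample_synthetic` and the sample values are hypothetical and invented for illustration. Real synthetic data generators are considerably more sophisticated.

```python
import math
import random
import statistics

def fit_bivariate_gaussian(rows):
    """Estimate means, standard deviations and correlation of two numeric columns."""
    xs, ys = zip(*rows)
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in rows) / (len(rows) - 1)
    rho = cov / (sd_x * sd_y)
    return mean_x, mean_y, sd_x, sd_y, rho

def sample_synthetic(params, n, seed=0):
    """Draw n artificial (x, y) pairs that mimic the fitted distribution."""
    mean_x, mean_y, sd_x, sd_y, rho = params
    rng = random.Random(seed)
    # Guard against tiny negative values caused by floating-point rounding.
    residual = math.sqrt(max(0.0, 1.0 - rho ** 2))
    synthetic = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mean_x + sd_x * z1
        # Mixing z1 into y reproduces the correlation between the two columns.
        y = mean_y + sd_y * (rho * z1 + residual * z2)
        synthetic.append((x, y))
    return synthetic

# Fictitious (age, systolic blood pressure) pairs, invented for illustration only.
real_rows = [(34, 118), (45, 125), (52, 131), (61, 140),
             (29, 115), (70, 148), (48, 128), (57, 135)]

params = fit_bivariate_gaussian(real_rows)
synthetic_rows = sample_synthetic(params, n=5000, seed=1)
refit = fit_bivariate_gaussian(synthetic_rows)
```

None of the generated rows corresponds to a real person, yet refitting the model on `synthetic_rows` should give means and a correlation close to those of the original data. This is the property the text describes as emulating the distribution and correlations of real data.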

Privacy and pseudo-anonymisation: the challenge of protecting confidentiality

As the use of health data grows, so do concerns about patient privacy. Pseudo-anonymisation, the process of removing or replacing direct personal identifiers so that data can no longer be attributed to an individual without additional information, is essential to protect confidentiality. However, despite these efforts, there remains a risk that individuals can be re-identified, especially when the data are combined with other information sets. For example, whole genome sequencing, used in the development of personalised medicine, carries particular risks because a genome constitutes a ‘genetic fingerprint’ that uniquely identifies the individual.
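The following Python sketch illustrates both points under stated assumptions: direct identifiers are replaced with a keyed hash (one common way of producing pseudonyms, not the only one), and a simple count then shows how indirect attributes can still single out a person. The field names, the secret key and the records are hypothetical and invented for illustration.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret; in practice it would be held separately by the data controller.
SECRET_KEY = b"replace-with-a-securely-stored-key"

def pseudonymise(record, key=SECRET_KEY):
    """Drop direct identifiers and replace the patient ID with a keyed hash."""
    digest = hmac.new(key, record["patient_id"].encode("utf-8"), hashlib.sha256)
    cleaned = {k: v for k, v in record.items() if k not in ("patient_id", "name")}
    cleaned["pseudonym"] = digest.hexdigest()[:16]
    return cleaned

def smallest_group_size(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same indirect attributes."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

# Fictitious records, invented for illustration only.
records = [
    {"patient_id": "P001", "name": "Patient A", "birth_year": 1980, "postcode": "080", "sex": "F"},
    {"patient_id": "P002", "name": "Patient B", "birth_year": 1980, "postcode": "080", "sex": "F"},
    {"patient_id": "P003", "name": "Patient C", "birth_year": 1975, "postcode": "080", "sex": "M"},
    {"patient_id": "P004", "name": "Patient D", "birth_year": 1992, "postcode": "170", "sex": "F"},
]

pseudonymised = [pseudonymise(r) for r in records]
k = smallest_group_size(pseudonymised, ["birth_year", "postcode", "sex"])
```

Here `k` is 1: at least one pseudonymised record is unique in its combination of birth year, postcode prefix and sex. Anyone holding another dataset with those three attributes plus a name could link the two, which is the re-identification risk described above.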

Therefore, as health data becomes an increasingly valuable currency, it is crucial to establish clear ethical principles to guide its use. In particular, clear and concise “Informed Consents” must be provided to patients or individuals before their data is used. These consents should state what the health data will be used for, present the information clearly and accessibly, allow the patient to withdraw consent on request and to rectify their data, and indicate, among other points, whether the data can be provided to third parties, who will be responsible for processing them, how long the data will be used, and how cybersecurity in the access to the data is guaranteed.
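For readers who build data platforms, the consent elements listed above can be pictured as a simple record that is checked before each use of the data. The Python sketch below is only an illustration; the class `ConsentRecord`, its fields and the example values are hypothetical and do not correspond to any standard or legal template.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

@dataclass
class ConsentRecord:
    pseudonym: str                 # link to the pseudonymised patient record
    purposes: List[str]            # what the health data may be used for
    data_processor: str            # who is responsible for processing
    third_party_sharing: bool      # whether data may be provided to third parties
    valid_until: date              # duration of use
    withdrawn_on: Optional[date] = None

    def withdraw(self, when: date) -> None:
        """Record the date on which the patient withdrew consent."""
        self.withdrawn_on = when

    def permits(self, purpose: str, on: date) -> bool:
        """True if the given use is covered by consent on the given date."""
        if self.withdrawn_on is not None and on >= self.withdrawn_on:
            return False
        return purpose in self.purposes and on <= self.valid_until

# Fictitious example values, invented for illustration only.
consent = ConsentRecord(
    pseudonym="a1b2c3d4",
    purposes=["oncology research"],
    data_processor="Example Hospital Research Unit",
    third_party_sharing=False,
    valid_until=date(2030, 12, 31),
)

allowed_before = consent.permits("oncology research", date(2025, 6, 1))
allowed_other = consent.permits("marketing", date(2025, 6, 1))
consent.withdraw(date(2025, 7, 1))
allowed_after = consent.permits("oncology research", date(2025, 8, 1))
```

In this sketch, a use is allowed only for a purpose the patient agreed to, within the agreed period, and before any withdrawal.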

These guidelines help align consent with the principles of the GDPR (General Data Protection Regulation, Regulation (EU) 2016/679) and of Regulation (EU) 2019/881 (the EU Cybersecurity Act) on confidentiality and the protection of personal data from cyber threats.

Commercialisation of health data

Debates around the commercialisation of health data have also gained momentum. Today, health data, once pseudo-anonymised, can be sold to technology or pharmaceutical companies without patients receiving compensation or, in many cases, even being informed. Companies that trade in health data are called “data brokers”: they collect, buy and sell pseudo-anonymised medical data. For this reason, a more practical approach could include financial compensation for patients whose data is used.

This business model, known as DaaS (Data as a Service), is for the moment usually limited to delivering a report that can be used for health planning (NHS) or to design inclusion criteria for a clinical trial (pharma or medtech industry). By contrast, the model for selling the data itself in bulk remains unresolved.

The biomedical industry is particularly interested in this data for the development of new products. For example, the pharmaceutical company Roche acquired Flatiron Health, an oncology data analytics company, to leverage its database of millions of cancer patients.

AI and machine learning: the future of data-driven healthcare

Finally, the emergence of artificial intelligence and machine learning is driving the next big wave of innovation in healthcare. These technologies make it possible to analyse large volumes of data quickly and efficiently, identifying patterns that previously went unnoticed. In medical research, AI algorithms can process data from thousands of patients to uncover relationships between genetic factors, treatments and clinical outcomes. This has the potential to accelerate the discovery of new treatments and optimise personalised medical care. All these changes are accompanied by a new regulatory framework following the recent approval of Regulation (EU) 2024/1689 on AI (the AI Act). Certification of an AI medical device must therefore continue to comply with the earlier Regulation (EU) 2017/745 on medical devices (MDR) or Regulation (EU) 2017/746 on in vitro diagnostic medical devices (IVDR), depending on the product concerned, in addition to Regulation 2024/1689 on AI. Among other requirements, Annex IV is incorporated into the technical documentation and ISO 42001 into the quality system.

In short, health data is transforming research, personalisation of treatments and innovation in the biomedical sector. However, the massive collection and use of this data also poses significant ethical challenges, especially in terms of privacy and pseudo-anonymisation. Technologies such as AI and machine learning have a key role to play in this change, but their success will depend on finding a balance between innovation and the protection of patients’ rights.