Precision medicine aims to provide health professionals with the information they need to act as quickly as possible, with the most effective interventions. The gathering and analysis of general intelligence (biomedical, pharmacological, socio-economic…), as well as data that is specific to each individual (medical records, family history, sensor measurements…), should enable screening, prevention, diagnosis and treatment of diseases to be refined and personalized. This in turn should improve the state of health of whole populations, by minimizing the need for major medical interventions.
When it comes to the various data sources that can be considered as part of this information gathering, genomics is of paramount importance. This discipline, which studies the role of the genome in living processes, clearly demonstrates that the genetic profile is key in determining the onset, development and inheritance of diseases. Understanding variations in the genome of individuals and incorporating them into medical records enables healthcare professionals to intervene early and tailor the management of each patient, as well as to follow them up more effectively. For example, people with a genetic predisposition to diabetes can be regularly monitored, enabling appropriate interventions as soon as any clinical signs appear. Or healthcare providers can check before prescribing a drug whether the patient is likely to suffer side effects from it because of a genetic predisposition.
As these examples show, one of the key challenges of precision medicine is to translate advances in genomics into information that’s actually usable in a clinical setting. But this is an especially complicated problem…
Genomics is turning to computer science
The human genome is both a very simple and a highly complex structure, consisting of 3.2 billion pairs of its four basic molecular building blocks. It was first fully sequenced in 2003, at a cost of around $2 billion. With the introduction of high-throughput sequencing in the mid-2000s, that cost has been slashed to about $1,000 today, and in the near future it could fall as low as $100. But with high-throughput sequencing, which involves reading hundreds of thousands of DNA segments in parallel, genomics has effectively become a branch of information technology, because the real difficulty now is how to make use of the many terabytes of unstructured data produced by sequencers.
The first task is to order and assemble these pieces of information, using reference genomes. The aim is first to identify variations from reference genomes (alignment) and then, in a second step, to understand their role and influence (annotation and interpretation). To do all this, bioinformaticians use algorithms chained together in semi-automated workflows (pipelines). This demands considerable processing and analytics capacity, which means that genomics is one of the major users of high-performance computing (HPC) and Big Data solutions. The National Centre for Genomic Analysis (CNAG) in Barcelona, Spain, has set up a robust analytics platform that relies on a supercomputer with over 3,400 computational cores and 7.6 petabytes of storage. This state-of-the-art Bull system, supplied and implemented by Atos, enables CNAG to turn the sequences into truly valuable insight.
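To make the alignment and annotation steps more concrete, here is a deliberately tiny sketch in Python. It is an illustration only, not the pipeline used at CNAG or by any real tool: the sequences, the function names (`align`, `find_substitutions`, `annotate`) and the `KNOWN_EFFECTS` table are all hypothetical, and the brute-force alignment shown here would be far too slow for real sequencing data, where specialised indexed aligners are used instead.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    position: int  # 0-based position in the reference
    ref_base: str  # base found in the reference genome
    alt_base: str  # base found in the sequenced read


def mismatches(reference: str, read: str, offset: int) -> int:
    """Count the bases that differ when the read is placed at `offset`."""
    return sum(1 for i, base in enumerate(read) if reference[offset + i] != base)


def align(reference: str, read: str) -> int:
    """Naive alignment: try every offset and keep the one with fewest mismatches."""
    offsets = range(len(reference) - len(read) + 1)
    return min(offsets, key=lambda off: mismatches(reference, read, off))


def find_substitutions(reference: str, read: str, offset: int) -> list[Variant]:
    """List single-base differences between the aligned read and the reference."""
    return [
        Variant(offset + i, reference[offset + i], base)
        for i, base in enumerate(read)
        if reference[offset + i] != base
    ]


def annotate(variants: list[Variant], known_effects: dict) -> list[tuple]:
    """Attach a label to each variant from a lookup table (hypothetical data)."""
    return [
        (v, known_effects.get((v.position, v.alt_base), "unknown significance"))
        for v in variants
    ]


# Toy data, invented for this example only.
REFERENCE = "ACGTACGTAC"
READ = "TACCTA"
KNOWN_EFFECTS = {(6, "C"): "hypothetical: altered drug response"}

offset = align(REFERENCE, READ)
variants = find_substitutions(REFERENCE, READ, offset)
report = annotate(variants, KNOWN_EFFECTS)
print(offset, report)
```

Even in this toy form, the three stages are visible: placing a read on the reference, listing where it differs, and looking up what those differences might mean. Real pipelines repeat this for hundreds of millions of reads, which is where the need for HPC capacity comes from.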
Power, storage, integration and analytics: the four big challenges of genomics
As well as sheer computing power, data storage is another major issue in genomics, not only when it comes to capacity and performance, but also security. Of course, the data in question is highly sensitive and confidential. So it has to be encrypted because, even if it is anonymized, it might be possible to identify an individual by cross-referencing various sources. And very strict rules around the protection of medical data have to be followed; these vary from country to country and are often subject to change.
Finally, the remaining challenge is to enable the medical profession to truly benefit from the knowledge gained. That means integrating that information into hospital systems, comparing it with other data and presenting the findings in a clear, digestible format in order to make them easy to use in a clinical setting. In Valencia, Spain, the Prince Felipe Research Centre (CIPF) teamed up with Atos to develop just these kinds of tools, which extract and visualize useful information from genomic data. In so doing, they are putting in place the last piece of one of the fundamental pillars of precision medicine.
This is a challenge that Atos addresses with its Atos Codex offerings.