Designer Medicine Needs More Than Big Data, It Needs New Science

Human genome — Genomes don't translate to an understanding of disease.

(Image credit: Shaury Nash, CC BY-SA)

This article was originally published at The Conversation. The publication contributed the article to Live Science's Expert Voices: Op-Ed & Insights.

Science rests on data, of that there can be no doubt. But peer through the hot haze of hype surrounding the use of big data in biology and you will see plenty of cold facts that suggest we need fresh thinking if we are to turn the swelling ocean of "omes" — genomes, proteomes and transcriptomes — into new drugs and treatments.

The relatively meagre returns from the human genome project reflect how DNA sequences do not translate readily into understanding of disease, let alone treatments. The rebranding of "personalized medicine" — the idea that decoding the genome will lead to treatments tailored to the individual — as "precision medicine" reflects the dawning realization that using the -omes of groups of people to develop targeted treatments is quite different from using a person's own genome.

Because we are all ultimately different, the only way to use our genetic information to predict how an individual will react to a drug is if we have a profound understanding of how the body works, so we can model the way that each person will absorb and interact with the drug molecule. This is tough to do right now, so the next best thing is precision medicine, where we look at how genetically similar people react and then assume that a given person will respond in a similar way.

Most importantly, the fact that "most published research findings are false," as famously reported by John Ioannidis, an epidemiologist from Stanford University, underlines that data is not the same as facts; one critical dataset — the conclusions of peer reviewed studies — is not to be relied on without evidence of good experimental design and rigorous statistical analysis. Yet many now claim that we live in the "data age." If you count research findings themselves as an important class of data, it is very worrying to find that they are more likely to be false (incorrect) than true.

The worship of big data downplays many issues, some profound. To make sense of all this data, researchers are using a type of artificial intelligence known as neural networks. But no matter their "depth" and sophistication, they merely fit curves to existing data. They can fail in circumstances beyond the range of the data used to train them. All they can say, in effect, is that "based on the people we have seen and treated before, we expect the patient in front of us now to do this."
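The point about fitted curves failing outside their training range can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the saturating "dose response" function is invented, the names are hypothetical, and a straight-line least-squares fit stands in for a neural network, which is a far more flexible curve fitter but faces the same limitation.

```python
# Toy illustration only: an invented saturating response stands in for an
# unknown biological mechanism, and a straight line stands in for the model.

def true_response(dose):
    """Hypothetical mechanism: the response levels off at high doses."""
    return dose / (1.0 + dose)

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# The training data covers only low doses, from 0.0 to 1.0.
train_doses = [i / 10 for i in range(11)]
train_responses = [true_response(d) for d in train_doses]
slope, intercept = fit_line(train_doses, train_responses)

def predict(dose):
    """Prediction of the fitted line at any dose, seen before or not."""
    return slope * dose + intercept

def error(dose):
    """Absolute gap between the fitted line and the true response."""
    return abs(predict(dose) - true_response(dose))

inside, outside = 0.5, 10.0
print(f"inside training range, dose {inside}: error {error(inside):.3f}")
print(f"outside training range, dose {outside}: error {error(outside):.3f}")
```

Within the range it was trained on, the fitted line tracks the true response closely. At a dose ten times larger than anything in the training data, it keeps climbing while the true response has levelled off, so the prediction overshoots several-fold. Nothing in the fit itself signals that this has happened.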

Still, they can be useful. Two decades ago, one of us (Peter) used big data and neural networks to predict the thickening times of complex slurries (semi-liquid mixtures) from infrared spectra of cement powders. But, even though this became a commercial offering, it has not brought us one iota closer to understanding what mechanisms are at play, which is what is needed to design new kinds of cement.

The most profound challenge arises because, in biology, big data is actually tiny relative to the complexity of a cell, organ or body. One needs to know which data is important for a particular objective. Physicists understand this only too well. The discovery of the Higgs boson at CERN's Large Hadron Collider required petabytes of data; nevertheless, they used theory to guide their search. Nor do we predict tomorrow's weather by averaging historic records of that day's weather — mathematical models do a much better job with the help of daily data from satellites.

To effectively use the explosion in big data, we need to improve the modelling of biological processes. As one example of the potential, Peter is already reporting results that show how it will soon be possible to take a person's genetic makeup and — with the help of sophisticated modelling, heavyweight computing and clever statistics — select the right customised drug in a matter of hours. In the longer term, we are also working on virtual humans, so treatments can be initially tested on a person's digital doppelganger.
