The Perils of Data Mining

By Dr David Laing Dawson

Allowing computers to search through large medical databases may one day uncover an association of great importance, one that stands up as an actual causal link. My problem is really with the headlines that accompany the reporting of these studies.

These headlines appear on Google searches, Google News, newspapers, and trade e-publications such as Psychiatric Times. I suppose the purpose of a headline or lead is to make the reader want to read the article, or in these cases, the research findings and the methodology.

If I read that eating bacon is going to double my chance of a heart attack, I am compelled to read the actual study. In that case (an example from a few years ago) I concluded, after reading the actual study and juggling statistics with reality, that I would have to increase my bacon consumption from occasional to every day to raise my chance of dying from cardiovascular disease within the next ten years from 14 percent to 16 percent.
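The gap between a "double your risk" headline and a move from 14 percent to 16 percent is the gap between relative and absolute risk. The sketch below works through that arithmetic. The 14 and 16 percent figures come from the paragraph above; the function name `risk_summary` and the 28 percent comparison case are illustrative assumptions, not anything taken from the study.

```python
def risk_summary(baseline: float, exposed: float) -> dict:
    """Compare two absolute risks given as fractions (0.14 means 14 percent)."""
    absolute_increase = exposed - baseline   # difference in percentage points
    relative_risk = exposed / baseline       # the ratio headlines tend to quote
    return {
        "absolute_increase": absolute_increase,
        "relative_risk": relative_risk,
        # How many people would need the exposure for one extra event,
        # assuming the two risks apply uniformly (a simplification).
        "people_per_extra_event": 1 / absolute_increase,
    }


# Figures from the text: occasional bacon vs. bacon every day.
bacon = risk_summary(baseline=0.14, exposed=0.16)

# Hypothetical: what a genuine doubling of a 14 percent risk would look like.
doubled = risk_summary(baseline=0.14, exposed=0.28)

for label, result in (("14% -> 16%", bacon), ("14% -> 28%", doubled)):
    print(
        f"{label}: relative risk {result['relative_risk']:.2f}, "
        f"absolute increase {result['absolute_increase'] * 100:.1f} points, "
        f"about {result['people_per_extra_event']:.0f} people per extra event"
    )
```

On these numbers the relative risk is roughly 1.14, not 2, and the absolute increase is two percentage points, which is one extra death per fifty or so daily bacon eaters over ten years.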

You can’t make good carbonara without bacon or prosciutto.

Butter is good, butter is bad, and now butter is good again.

These data mining exercises can never account for all variables, and they certainly don’t prove cause. In fact, they are quite dumb in the sense of ignoring the obvious, and they often seem to start from a prejudice that informs the headline but is belied by the actual results of the study.

Others have pointed out that there is a very strong correlation between the presence of an ambulance and a road side accident. My satire on the subject would compare the rate of death from cancer in people who have taken anticancer drugs with that in people who never have.

But I am writing this because of a Psychiatric Times headline that implied a causal relationship in the elderly between antidepressant treatment and hip fractures, which forced me to read at least the synopsis of the study.

The study compared elderly people (mean age 80) who were not taking antidepressants with those who were, and found that more of those taking antidepressants had suffered hip fractures. In the details of the study, however, the peak incidences of hip fracture fell 30 and 90 days before the initiation of antidepressants. Yes, before the initiation of antidepressants.

This throws the notion of antidepressants causing hip fractures out the window and hints at a much more complicated relationship between hip fractures, falls, osteoporosis, and depression. Depression is, after all, an illness that affects the body as well as the mind: diet, lifestyle, exercise, concentration, isolation, sleep, carelessness, memory, and awareness, along with low mood.
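One way to see how such an association can arise with no causal link at all is a toy simulation. The sketch below uses entirely synthetic numbers: a hypothetical underlying condition (`frail`, standing in for depression and its bodily effects) raises both the chance of a hip fracture and the chance of being prescribed an antidepressant, while the drug itself has no effect on fractures. Every probability and name here is an assumption made for illustration; none of it comes from the study.

```python
import random


def simulate(n: int = 50_000, seed: int = 0) -> dict:
    """Synthetic cohort in which the drug has NO effect on fracture risk."""
    rng = random.Random(seed)
    # counts[(frail, treated)] = [number of people, number of fractures]
    counts = {(f, t): [0, 0] for f in (False, True) for t in (False, True)}

    for _ in range(n):
        frail = rng.random() < 0.30
        # The underlying condition makes a prescription more likely...
        treated = rng.random() < (0.40 if frail else 0.10)
        # ...and a fracture more likely. Note `treated` is not used here.
        fracture = rng.random() < (0.15 if frail else 0.03)
        cell = counts[(frail, treated)]
        cell[0] += 1
        cell[1] += fracture

    def rate(cells):
        people = sum(counts[c][0] for c in cells)
        fractures = sum(counts[c][1] for c in cells)
        return fractures / people

    return {
        "treated": rate([(False, True), (True, True)]),
        "untreated": rate([(False, False), (True, False)]),
        "frail_treated": rate([(True, True)]),
        "frail_untreated": rate([(True, False)]),
    }


rates = simulate()
print(f"Fracture rate, on antidepressants:  {rates['treated']:.3f}")
print(f"Fracture rate, not on them:         {rates['untreated']:.3f}")
print(f"Frail only, on antidepressants:     {rates['frail_treated']:.3f}")
print(f"Frail only, not on them:            {rates['frail_untreated']:.3f}")
```

In the pooled comparison the treated group fractures far more often, yet once people with the same underlying condition are compared the difference all but disappears. That is only one possible explanation for the study's pattern, but it is the kind a headline leaves out.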

Of course, with the elderly all drugs need to be prescribed with added caution, often at lower doses, and closely monitored. But if not newspaper editors, then at least the medical and science writers should refrain from writing headlines that are not actually supported by these data mining exercises.

More often today, all the other interpretations of the data, the cautions, the caveats, the list of missing variables, and the call for more research are added at the end. But few readers today, as we know, read more than the headline and first paragraph.

6 thoughts on “The Perils of Data Mining”

  1. As for me, I keep pushing the bacon.

    Yesterday I visited a palliative care place for a dear friend who is ebbing at 94. A very good lifestyle has my friend.

    I also came across two events in there. Both had landed in hospital with seriously broken ribs. One because she had crossed the road from a dark house and tripped. The lights were on across the road and she was out to find out why her side of the road was in the dark! Another occupant, whom I did not visit, had slipped while decorating the Christmas tree. Much older than I am and much more prominent, also a heavy smoker to boot, or had been…

    She had mashed several ribs. Stuff happens and it alas hurts. But I can see that both of these people had really had a very good life with humour and work. The smoker had been a very big person working in all kinds of prominent ways for the community.

    The “wellness brigade” really gets up my nose with their questionable beliefs and peddling of everlasting fake news.


  2. Terms like Evidence-Based Practice and According to Best Practices should now be taken with a grain of salt. When these terms are used, we should exercise the opportunity to challenge the evidence that is being claimed.

    But it is hard to find someone to address these issues to.

    I have observed these terms used carelessly in the field of mental health by people who are not sincere. And what is the motivation? Possibly to promote a publicly funded program that may not be efficient. Government funders should look more closely at what they are funding with our money. And taxpayers need to be vigilant. I am thinking about the popularity of peer-driven services to the mentally ill. Yes, it helps to talk with someone who has had a similar experience to you. But if that person has no training in serious mental illness, then his or her service has limits. As a caregiver, I have observed this firsthand regarding my relative.

    If I had any say, I would ask that this money be spent on increasing the number of beds so that people can stay in hospital until they are well. And then have appropriate supports to help in the community following discharge.


  3. Regarding the “strong correlation between the presence of an ambulance and a road side accident,” my favorite is the correlation between the post WW2 return of the storks to the rooftops of Amsterdam and the birth rate.

