Using the Wrong Tools in Health Research
‘Although it is inappropriate, and potentially inaccurate, researchers
frequently use linear regression on nonlinear phenomena, calculus on
discontinuous functions, or χ² when data points are interdependent.’
(Eric Dent, PhD, 1999)
The principles of systems science are clear: we’re using research tools
designed for static, isolated, linear and mechanical systems, but human
beings are nonlinear, adaptive, biological and heavily influenced by
interactions with the rest of our constantly changing world.
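To make the mismatch concrete, here is a minimal sketch in Python (using
NumPy) of what can happen when a straight line is fitted to a nonlinear
relationship. The data are synthetic and purely hypothetical: "exposure"
and "risk" are illustrative names, not drawn from any study, and the
U-shaped curve is an assumption chosen to keep the example simple.

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical, noise-free data: risk is lowest at a moderate exposure
# and rises in both directions (a U-shaped, nonlinear relationship).
exposure = np.linspace(-3.0, 3.0, 61)
risk = exposure ** 2

# Fit a straight line (degree 1) and a quadratic (degree 2) by least squares.
linear_fit = np.polyval(np.polyfit(exposure, risk, 1), exposure)
quadratic_fit = np.polyval(np.polyfit(exposure, risk, 2), exposure)

r2_linear = r_squared(risk, linear_fit)
r2_quadratic = r_squared(risk, quadratic_fit)

print(f"R^2, straight line: {r2_linear:.3f}")
print(f"R^2, quadratic:     {r2_quadratic:.3f}")
```

In this sketch the straight line explains essentially none of the
variation (R² close to 0), even though risk is completely determined by
exposure, while the quadratic fit recovers the relationship (R² close to
1). A linear tool applied here would report "no association" where a
strong one exists.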
In his book “Predictive Analytics: The Power to Predict Who Will Click,
Buy, Lie or Die,” Eric Siegel tells the story of IBM Watson’s victory in
the game show Jeopardy! against its two greatest human champions at the
time.⁴ He calls it one of the best examples of machine learning and of
the potential for wonderful benefits if it is applied appropriately to
health care. But he is clear about two needs: an accurate understanding
of the context that produces the data, and multiple collaborations to
share what is learned from the data.
In health care, we have a very poor understanding of the context that
produces the available data, which comes from fragmented care and often
primarily from coding and billing records. This data is highly
inaccurate. Some estimate that nearly 50% of billing data is wrong or
omitted, with the great majority of the errors arising from human and
systems error, not fraud.⁵
But when the principles of systems science are applied well, and Siegel
gives many examples in his book, a well-defined context and meaningful
collaboration can prevent diminishing returns, or what Siegel calls
“overlearning,” and so allow for continuous improvement of value over
time.