Our guest blogger this week is Laurie Gelb, a researcher based in Seattle who has worked and published at consultancies, Sanofi-Aventis, WellPoint and the MD Anderson Cancer Center.
When we research and attempt to maximize favorable patient-reported outcomes (PROs), apples and oranges abound.
We may think we are centering on benefits for the patient, or the value of therapy. Yet in many diseases, the patient sees proxies for benefit (such as an A1c in diabetes, lipid or blood pressure measurement) that may correlate poorly with health outcomes, quality of life or longevity, even presuming these proxies are tracked consistently. And tracking becomes less likely when patients cannot see a value to tracking. Knowing their BP over time may not make them feel better or confer greater safety in their view, meaning that it may not drive decisions.
But PRO researchers say, the BP numbers are not enough. We must examine quality of life, function, cost and more. And soon we enter a level of abstraction that is unfamiliar and counterintuitive (as well as long and tedious) for most patients whose most salient outcomes we supposedly are measuring. Cost-effectiveness, QALYs and other attempts to quantify absolute and comparative value have so many warts that even professionals express skepticism.
As for “satisfaction” and “preference,” which you might hope could best summarize what/how patients think, these should have been banished to the scrap heap long ago, as they require abstraction that few people are capable of. We seldom mention either in real life. You may prefer one brand of peas to another, or aspirin that is in a smaller tablet as opposed to a larger one, but do you “like” either? Do you even “prefer” either? Or is it a question of behavior — what do you do/use and why — that defies deconstruction because you never really internalized a choice that is so easy?
But, you might protest, peas are easy and treatments are hard. In many cases, you would be wrong. The patient believes the rationale for a mechanism of harm (e.g., high blood pressure numbers mean I might die like my grandmother did), trusts valued influencers, trusts a clinician, trusts a Web site or two, whatever the case. It is all of a piece and often not abstracted. Even a decision like surgery vs. observation is seldom subject to rational deconstruction, given the emotional costs of information and care-seeking.
Then you, a researcher, come along and want to deconstruct a choice, or pre/post outcomes, that just don’t mean much, if anything, to your subject, yet everything you say you do is in her name.
PRO research thus benefits from a naturalistic, conversational data collection approach in which we do not ask patients (or anyone else) questions other than those we might pose to friends. Properly analyzed, “In the last week, what has the arthritis kept you from doing?” or “What do your tablets seem to be doing?” or “In the last week, how many nights did you not sleep as long as you wanted to?” can yield as much structured insight as all the validated scales in the world. And since function and pain are also often seen as of a piece, after we separate them (as we must, of course), we have to put them back together to see if/how the patient does.
To inform value-based purchasing, we have charts and claims data, but we also have surveys that can get at what happened — not just what the expectation was, not how the patient selected therapies, but what really, actually happened that justified a payor’s and/or patient’s investment. And there is great power when we triangulate these data, and even more when we include clinicians.
But another question is, do we care what the arthritis is doing or more what the drug or injection is doing? If we understand the delta, did the baseline merit millions of dollars and hundreds or thousands of days? Granted that scientific research generally adopts some sort of pre/post design, health care decision support today demands shorter lag times from research approval to report or publication in order to be useful. Doing away with expendable baseline data and/or extrapolating it from charts, claims and self-report is one way to accomplish this aim. Adaptive and “real time” trial design falls into this paradigm.
One way to look at measurement, whether pre/during/post, is to consider (1) what domains the patient will use or has used to judge effectiveness (pain is often one); (2) what measures are used within each domain (such as the ability to extend limbs); and (3) what thresholds apply within each measure for each patient who uses that domain/measure (e.g., whether the patient can reach food products on the highest shelf).
Eliciting domains, measures and thresholds for each patient enables us to build a decision or effectiveness model that can be expressed in the aggregate but is based on the N=1 notion that no two patients need see the same questions (that is what piping, branching and the like accomplish for digital questionnaires, and non-linear discussion guides) nor provide the same answers. This speaks to the individuality that we say PROs support, yet so seldom do. Adherence is often a recurring/daily decision, so modeling the effects of new stimuli and how criteria evolve really merits more than the glittering generalities of many studies.
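To make the domain/measure/threshold idea concrete, here is a minimal sketch in Python. It assumes each patient supplies his or her own criteria and that we later record one observed value per criterion. The `Criterion` class, the `share_meeting_own_threshold` function, the patient labels and all the numbers are hypothetical, invented only for illustration. They are not drawn from any study or validated instrument.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One patient's own yardstick (all field values here are hypothetical)."""
    domain: str                    # what the patient judges by, e.g. "pain"
    measure: str                   # how the patient gauges it
    threshold: float               # the patient's own bar for "it is working"
    higher_is_better: bool = True

    def met(self, observed: float) -> bool:
        # Compare the observed value with this patient's own threshold.
        if self.higher_is_better:
            return observed >= self.threshold
        return observed <= self.threshold


def share_meeting_own_threshold(patients):
    """Aggregate N=1 results: per domain, the share of criteria that were met.

    `patients` maps a patient id to a list of (Criterion, observed value).
    Patients need not share measures or thresholds, only domain labels.
    """
    met = defaultdict(int)
    asked = defaultdict(int)
    for answers in patients.values():
        for criterion, observed in answers:
            asked[criterion.domain] += 1
            met[criterion.domain] += criterion.met(observed)
    return {domain: met[domain] / asked[domain] for domain in asked}


# Illustrative data only: two patients with different measures and thresholds.
patients = {
    "A": [
        (Criterion("pain", "highest shelf reached (cm)", 180), 185),
        (Criterion("sleep", "short-sleep nights per week", 2,
                   higher_is_better=False), 4),
    ],
    "B": [
        (Criterion("pain", "blocks walked without stopping", 4), 3),
    ],
}

summary = share_meeting_own_threshold(patients)
print(summary)
```

The point of the design is that the aggregate is built from each patient's own bar rather than from a single scale imposed on everyone. Whether a simple share per domain is the right summary is a separate question that this sketch does not settle.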
In the US, the newly-established Patient-Centered Outcomes Research Institute (PCORI), for which I recently served as a merit reviewer, is now the designated center for thought leadership in this area. And it has attracted a variety of interesting applications. However, PCORI is just about to make its first funding awards, so as yet the most concrete progress at the Federal level has been made by AHRQ (Agency for Healthcare Research and Quality), whose patient “satisfaction” questionnaire uses a zero to ten scale to measure extent of achievement on various desired outcomes. I began this approach decades ago, with a health system that was willing to ask discharged patients to what extent the call button was answered within 15 minutes of pressing it, etc.
As an example of “naturalistic” question sets: when asking about specific events, as in the call button case, it is often more “natural” to ask categorically about frequency (for example: never, sometimes, half the time, usually, always) than about levels of agreement. For other measures, however, such as the subjective evaluation of the physician’s willingness to listen and respond appropriately, extent of agreement (from not at all to completely, for example) is more appropriate.
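The two answer formats can be sketched as ordered category lists. This is a rough Python illustration, not a validated scoring scheme. The frequency labels come from the text above. The three middle agreement labels, the `score` helper and the equal spacing between categories are all assumptions made for this example.

```python
# Frequency categories as listed above, for questions about specific events.
FREQUENCY = ("never", "sometimes", "half the time", "usually", "always")

# Agreement categories: only the two endpoints come from the text;
# the three middle labels are assumed for illustration.
AGREEMENT = ("not at all", "a little", "somewhat", "mostly", "completely")


def score(answer: str, scale: tuple) -> float:
    """Map a category to a 0-1 position on its scale.

    Equal spacing between categories is an assumption, not a finding.
    """
    return scale.index(answer) / (len(scale) - 1)


# Event question: "How often was the call button answered within 15 minutes?"
call_button = score("usually", FREQUENCY)

# Evaluation question: "To what extent did the physician listen to you?"
listening = score("completely", AGREEMENT)

print(call_button, listening)
```

Keeping the two scales separate means an event question is never answered in agreement terms, or the reverse, which is the distinction drawn above.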
The best tools and constructs for the job are relatively easy to access when you keep in mind that patients are in fact your friends, or could be, and are the last people who should be subjected to research outside their context. When you bring PROs to the table that patients can understand, you meet the ultimate prerequisite for their being relevant to those who suffer and thus are much more able to optimize adherence.
If you found this blog interesting or helpful, please forward it to a colleague.