How the cognitive sciences can improve patient reported outcome (PRO) measures

Reliance on patient self-report (patient reported outcome (PRO) measures) to gain insight into the impact of living with a disease or the outcomes of an intervention rests on the assumption that the respondent understands the question and terminology in the same way as the researcher, can accurately recall the relevant information, and can formulate and report the appropriate answer. Unfortunately, the task is not so simple, and answering the question can place a considerable cognitive burden on the respondent.


A variety of variables can affect the respondent’s ability to report accurately, including the:

  • instructions
  • available response options
  • time the information has been held in memory
  • question order

Although several major models have been proposed to explain how a respondent answers a question, they generally share the same steps:

  1. Understanding the question
  2. Recalling the relevant behaviour or event
  3. Inference and estimation
  4. Mapping the answer onto the available response options
  5. Editing the answer for social desirability

Whilst these models represent a sequential process in which the respondent arrives at the appropriate answer and accurately reports it, cognitive research suggests otherwise. Even apparently simple questions can be highly ambiguous.

The item “How often have you felt that your blood sugars have been unacceptably low recently?” is one of eight items of the Diabetes Treatment Satisfaction Questionnaire (DTSQ) selected for a study demonstrating the burden of hypoglycemia on patients’ quality of life. The response options range from 0 (none of the time) to 6 (most of the time).

Readability of the item based on the Flesch Reading Ease was just below “Fairly easy”. However, readability aside, what cognitive tasks must the respondent perform to answer the question within the context of the study? Clearly the key task for the respondents was to determine what the term “unacceptably low recently” meant. Does “unacceptably low” refer to low levels with respect to future complications, to levels low enough to require medical care or assistance from others, to levels that caused some interruption to activities, or to a blood sugar level below what the respondent considered “perfect”? Similarly, the time period “recently” is not defined.
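For readers unfamiliar with the readability score mentioned above, the following is a minimal sketch of the standard Flesch Reading Ease formula and its conventional interpretation bands, in which “Fairly easy” covers scores from 70 to 80. The function names and the word, sentence and syllable counts are illustrative assumptions; they are not the counts for the DTSQ item, and the study's own scoring tool is not described in this post.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease computed from raw text counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


# Conventional interpretation bands as (lower bound, label), highest first.
BANDS = [
    (90, "Very easy"),
    (80, "Easy"),
    (70, "Fairly easy"),
    (60, "Standard"),
    (50, "Fairly difficult"),
    (30, "Difficult"),
]


def reading_ease_band(score: float) -> str:
    """Map a score to its conventional label."""
    for lower, label in BANDS:
        if score >= lower:
            return label
    return "Very confusing"


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    score = flesch_reading_ease(words=100, sentences=5, syllables=150)
    print(round(score, 1), reading_ease_band(score))
```

A score “just below Fairly easy” would therefore sit a little under 70, in the “Standard” band. The formula only reflects sentence length and word length; it says nothing about whether a phrase is ambiguous to the respondent.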

To understand the question, respondents will draw on a wide range of contextual information, and to report on their behaviour or the frequency of some event or symptom they will use a range of strategies to arrive at an estimate. This contextual information will include the response options presented and the given time frame in which the behaviour or event has occurred.


Cognitive research has shown that the response options presented can influence respondents’ reporting of the frequency of vaguely defined events and that the set of response alternatives is treated as information in the interpretation of the question.

In an earlier study we examined whether such effects would occur in the context of respondents’ reporting of health-related events using high and medium frequency closed format response categories, which might be used interchangeably by researchers. Overall, our findings showed that respondents presented with response alternatives discriminating at medium frequency reported significantly fewer target events than those presented with high frequency response alternatives.

Research evidence suggests that respondents use the response categories as a guide to whether the event or behaviour in question is of minor or major importance from the perspective of the investigator, with high frequency response categories, e.g. “Once a week” to “everyday”, being associated with minor events and behaviour. In contrast, when a question offers lower frequency options, e.g. “Less than once a year” to “more than once a month”, the event or episode is interpreted as more major.

Because response options carry meaning that influences the respondent’s interpretation of the question, developers need to consider the implications of the chosen response scale for the respondent’s behaviour.


When reporting a particular event or behaviour, respondents are unlikely to scan the prescribed recall period and count the instances (the recall and count strategy), unless the event or behaviour is memorable and infrequent, such as needing medical attention as a result of severe hypoglycemia. Several factors make the recall and count strategy unsuitable for many of the behaviours and episodes that are evaluated using PROs:

  • Memory decreases over time irrespective of the importance of the event, in particular when the question relates to frequent events or behaviour.
  • Respondents are unlikely to have detailed representations of the various episodes or behaviour when they are of a frequent nature. Rather, the closely related events and episodes blend into a global representation that lacks time or location.
  • Our autobiographical memory is not structured by categories of behaviour or events, such as having low blood sugar levels, but is thought of as a hierarchical network in which lower-level events (was hospitalised with flu) are nested within higher order events (such as diagnosed with diabetes) and within these are the specific or summarised events at a lower level (frequently had low blood sugar levels).

To quantify the frequency and severity of hypoglycemia, Marrett et al asked patients to read a list of symptoms and record the frequency of episodes, which was collected as 1-2 episodes, 3-6 episodes for each level of severity (mild, moderate, severe and very severe) for the preceding 6 months.

Given the complexity of such a task over a long recall period, a feature of this approach is that the cognitive task has been decomposed into a number of separate questions (levels of severity) which act as retrieval cues, thus simplifying the recall and estimation task for the respondent. However, although this can increase the reported frequency of behaviour, it does not reliably increase the accuracy of the reports. There is also evidence (Means & Loftus) that when recalling five or more individual events, respondents’ memories appear to be generic rather than episodic. That is, the representation is global, without fixed recall links.

Global questions, e.g. “experiencing low blood sugar levels”, pertain to more frequent events than more specific ones, e.g. “experiencing low blood sugar levels requiring assistance”, and tend to foster underestimates. In contrast, a series of specific and narrow questions (“requiring assistance”, “requiring medical help”, “little or no interruption to activities”) tends to foster small overestimates, which together result in a considerable overestimate of experiencing low blood sugar levels.

Another consideration in Marrett et al’s work is the response order effect in the presentation of the symptoms: in general, an item is more likely to be endorsed when presented early rather than late in a list in a self-administered questionnaire or on a show card.


The ordering of questioning is also important and cognitive research suggests that asking respondents to recall the most recent event aids recall of previous episodes. This is probably due to the most recent recall serving as a cue for recalling previous events.

Designing the questioning sequence to be more compatible with memory retrieval processes suggests that the event should first be recalled and then dated. Although Marrett et al are not clear in their description of the process, questions about the specific episodes and the circumstances surrounding them should be asked before attempting to record their frequency or date them.


The quality of data obtained from patient self-reported outcome measures (PROs) can be greatly enhanced by the application of cognitive techniques as well as the use of formatting and design principles within a cognitive framework.

This post has outlined just a few of the many issues involved in asking respondents questions about their behaviour etc.

Not least is the respondent’s comprehension of just what the question is asking and the various strategies the respondent will use to answer the question.

We know that frequent behaviour is poorly represented in memory and that respondents will use a variety of strategies to derive meaningful estimates. We also know that the response options offered may provide information that the respondent uses in interpreting the question, which may systematically bias respondent reporting.

Respondents are unlikely to have detailed representations of the various episodes or behaviour when they are of a frequent nature. Rather, the closely related events and episodes blend into a global representation that lacks time or location.

Our autobiographical memory is not structured by categories of behaviour or events but is thought of as a hierarchical network in which lower-level events are nested within higher order events and within these are the specific or summarised events at a lower level.

High frequency events tend to be underestimated and low frequency events tend to be overestimated.

Temporal direction of recall can act as a cue for recall.

  Very nice summary Keith. I'll be sharing this with colleagues. Thank you!

