Examining the language demands of informed consent documents in patient recruitment to cancer trials using tools from corpus and computational linguistics

Talia Isaacs, Jamie Murdoch, Zsófia Demjén, Fiona Stevenson

Research output: Contribution to journal > Article > peer-review



Obtaining informed consent (IC) is an ethical imperative, signifying participants’ understanding of the conditions and implications of research participation. One setting where the stakes for understanding are high is randomized controlled trials (RCTs), which test the effectiveness and safety of medical interventions. However, the use of legalese and medicalese in ethical forms, coupled with the need to explain RCT-related concepts (e.g. randomization), can increase patients’ cognitive load when reading text. There is a need to systematically examine the language demands of IC documents, including whether the processes intended to safeguard patients by providing clear information might do the opposite through complex, inaccessible language. Therefore, the goal of this study is to build an open-access corpus of patient information sheets (PIS) and consent forms (CF) and analyze each genre using an interdisciplinary approach to capture multidimensional measures of language quality beyond traditional readability measures. A search of publicly available online IC documents for UK-based cancer RCTs (2000-17) yielded corpora of 27 PIS and 23 CF. Textual analysis using the computational tool Coh-Metrix revealed different linguistic dimensions relating to the complexity of IC documents, particularly low word concreteness for PIS and low referential and deep cohesion for CF, although both had high narrativity. Key part-of-speech analyses using Wmatrix corpus software revealed a contrast between the overrepresentation of the pronoun ‘you’ plus modal verbs in PIS and ‘I’ in CF, exposing the contradiction inherent in conveying uncertainty to patients using tentative language in PIS while making them affirm certainty in their understanding in CF.
Original language: English
Pages (from-to): 431-456
Number of pages: 26
Issue number: 4
Early online date: 13 Oct 2020
Publication status: Published - 1 Jul 2022


  • cancer
  • clinical trials
  • corpus linguistics
  • informed consent
  • research ethics
