Coding a heresy trial, one clause at a time

Extracting structured data from texts is central to digital humanities research, yet the process has long sat uneasily alongside the interpretive, context-sensitive reading practices that humanists rely on. In response to this gap, DISSINET has proposed a novel solution: Computer-Assisted Semantic Text Modelling (CASTEMO), an approach that captures not only informational content but also its discursive and contextual embedding. For digital historical studies, it thus allows source criticism to be placed at the heart of data-led analysis. Our new article, published in Digital Scholarship in the Humanities, demonstrates the benefits of this methodology on the basis of the trial record of Bernard-Oth of Niort and his family.

17 Apr 2026

A document map of the 1234/5 Niort family trial document, showing the charge responses and key social categories of the witnesses, as well as the referential links between witness depositions. Unlinked witness depositions have been placed on the left. The black bars indicate document sections, labelled in [textual order]/[chronological order] format: e.g. 2/3 = second document section, representing the third sitting of the trial.

In our new article “Syntactic-semantic capture of historical texts as a platform for source-critical analysis: telling the story of a premodern heresy trial with Computer-Assisted Semantic Text Modelling (CASTEMO)”, recently published in Digital Scholarship in the Humanities (Oxford University Press), Robert L. J. Shaw, Katalin Suba, Tomáš Hampejs, and David Zbíral present a new approach to data capture from textual sources. Rather than extracting selected facts from texts, CASTEMO lets researchers reconstruct clauses as structured data statements, capturing syntactic structure, semantic qualities, lexical choices, epistemic framing, and contextual relationships (including who is speaking, with what degree of certainty, and in what discursive context). The resulting data can be queried as a knowledge graph, enabling quantitative analyses that remain deeply attentive to how texts construct and present knowledge.
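To make the idea of a clause recast as a data statement more concrete, here is a minimal Python sketch. It is not the CASTEMO data model: the field names, the edge labels, and the toy clauses are all assumptions invented for illustration. It only shows how a statement that records its speaker, certainty, and context can be flattened into graph triples and queried.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    # All field names are illustrative assumptions, not the CASTEMO schema.
    sid: str        # statement identifier
    subject: str    # who acts in the clause
    action: str     # the verb, kept close to the source wording
    obj: str        # what the action is directed at
    speaker: str    # who is reported as saying it
    certainty: str  # e.g. "certain", "believed", "alleged"
    context: str    # discursive context, e.g. a deposition label


def to_triples(st: Statement) -> list[tuple[str, str, str]]:
    """Flatten one statement into (node, edge, node) triples."""
    return [
        (st.sid, "has_subject", st.subject),
        (st.sid, "has_action", st.action),
        (st.sid, "has_object", st.obj),
        (st.sid, "spoken_by", st.speaker),
        (st.sid, "certainty", st.certainty),
        (st.sid, "in_context", st.context),
    ]


def query(triples, edge, target):
    """Return the ids of statements whose given edge points at target."""
    return sorted({s for s, e, t in triples if e == edge and t == target})


# Invented toy clauses, not taken from the trial record.
statements = [
    Statement("S1", "person_x", "attended", "meeting", "witness_a", "certain", "deposition_1"),
    Statement("S2", "person_x", "sheltered", "person_y", "witness_b", "alleged", "deposition_2"),
    Statement("S3", "person_y", "preached", "sermon", "witness_b", "alleged", "deposition_2"),
]
graph = [t for st in statements for t in to_triples(st)]
alleged_ids = query(graph, "certainty", "alleged")
```

Because the framing (speaker, certainty, context) is stored on the same statement as the reported action, a query can filter on how something is said as easily as on what is said.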

We demonstrate the benefits of applying CASTEMO in its most intensive, “maximalist” form through a case study on the earliest surviving medieval inquisition trial record, the 1234/5 proceedings against Bernard-Oth of Niort and his family (ca. 5,000 words and 113 witness depositions). The charges against the Niorts centred largely on reputation rather than directly witnessed acts, an unusual feature that historians have previously struggled to interpret systematically. Applying CASTEMO to these records produced a total of 1,102 statements and revealed several key benefits:

  • Bridging close reading and computation. The process of systematically encoding clauses as data statements is itself a form of structured close reading, compelling the researcher to attend to discursive and contextual patterns that other, more selective forms of data extraction might miss. Analysis and interpretation develop hand in hand with data capture rather than after it. When applied to the Niort trial, CASTEMO directed our attention toward the epistemic basis of witness claims, the referential structure of depositions, and the social conditioning of testimony, all of which enhanced our understanding of the document's conditions of production.
  • Capturing epistemic nuance at scale. CASTEMO records not just the information reported in a text, but how it is phrased and conveyed. This includes the “mood” of an action (such as belief, allegation, or certitude), the “mood variant” (realis/irrealis), and epistemic levels (textual, interpretive, inferential) that distinguish close textual modelling from analytical inference. Coding such information in the Niort trial records revealed that around 45% of witness depositions affirmed each charge on hearsay alone, exposing the pervasiveness of hearsay with new precision.
  • Representing relational and referential complexity. CASTEMO's data model accommodates intra-textual cross-references, enabling their semantic implications to be made explicit and quantified. In the Niort trial, modelling the pervasive “he said the same as…” phrasings exposed an apparent notarial strategy of embedding weak, hearsay-based evidence within networks of cross-referenced depositions, making the fragility of such evidence less conspicuous and, arguably, making the charges appear better supported than they were.
  • Integrating contextual information within the same relational structure. Social, spatial, and temporal context can be captured alongside discursive features in CASTEMO, enabling their correlation in analysis. In the Niort case, capturing witness occupations, geolocated positions, and the placement of depositions within the context of trial events helped reveal how inquisitors approached the investigation and responded to evidential challenges. Across the trial sittings, they alternated between confident and knowledgeable witnesses, often close to the Niort stronghold of Laurac, whose evidence produced complications, and a more distant but pliable cast who affirmed charges on hearsay. Obtaining clear affirmations of guilt together with solid evidence proved challenging, a tension that CASTEMO's relational data brought into view.
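The kind of count behind a figure such as the share of depositions resting on hearsay alone, and the effect of "said the same as" cross-references on it, can be sketched as follows. This is a hedged illustration only: the `Deposition` structure, the `resolve` and `hearsay_only_share` helpers, and the four toy depositions are hypothetical and do not reproduce the article's dataset or method.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Deposition:
    # Hypothetical structure; names are assumptions for illustration only.
    witness: str
    # charge -> evidence bases stated directly, e.g. {"hearsay"} or {"witnessed"}
    bases: dict = field(default_factory=dict)
    same_as: Optional[str] = None  # witness this deposition refers back to


def resolve(dep, by_witness, seen=None):
    """Merge a deposition's own bases with those inherited via 'same as' links."""
    seen = set() if seen is None else seen
    if dep.witness in seen:  # guard against circular references
        return {}
    seen.add(dep.witness)
    merged = {charge: set(b) for charge, b in dep.bases.items()}
    if dep.same_as is not None and dep.same_as in by_witness:
        inherited = resolve(by_witness[dep.same_as], by_witness, seen)
        for charge, b in inherited.items():
            merged.setdefault(charge, set()).update(b)
    return merged


def hearsay_only_share(depositions, charge):
    """Share of all depositions that affirm `charge` on hearsay and nothing else."""
    by_witness = {d.witness: d for d in depositions}
    hits = sum(
        1 for d in depositions
        if resolve(d, by_witness).get(charge) == {"hearsay"}
    )
    return hits / len(depositions)


# Invented toy depositions, not taken from the trial record.
depositions = [
    Deposition("w1", {"charge_1": {"witnessed"}}),
    Deposition("w2", {"charge_1": {"hearsay"}}),
    Deposition("w3", same_as="w2"),  # "said the same as w2"
    Deposition("w4"),                # says nothing on the charge
]
share = hearsay_only_share(depositions, "charge_1")
```

In this toy set, the cross-referenced deposition `w3` states no basis of its own; it only counts as hearsay-based once the reference to `w2` is followed, which is why modelling the cross-references explicitly changes what can be counted.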

To read the full article, go to https://doi.org/10.1093/llc/fqag016. To access the dataset upon which the article is based, go to https://zenodo.org/records/14289840.
