JUSTICE - No. 77

65 Spring 2026

partial view. The Eichmann Trial’s corpus exceeds the bounds of what any reader can process. With more than a hundred witnesses and nearly two million words, the trial resists the kind of comprehensive attention that would allow a scholar to say with confidence what the entire archive of voices actually contained. The methodological alternative, computational distant reading, carries the opposite risk: it processes everything but hears nothing in particular, producing statistical patterns without the contextual knowledge needed to interpret them. Legal transcripts, and testimonies especially, are not amenable to analysis that strips away their narrativity entirely. What the study on which this essay is based attempted, and what it argues digital humanities can distinctively offer, is a scaled approach that moves between these two modes: first zooming out to analyze the full corpus computationally, identifying latent topical structures across all sessions; then zooming back in to examine which speaker groups are most associated with each topic, reconnecting the statistical patterns to the voices that produced them. The result is a method that can listen to all the voices without being deafened by their number, volume or traumatic content, and without resorting to the selection of a privileged few.

III. What Computational Analysis Can Hear

The computational method employed, topic modeling using Latent Dirichlet Allocation (LDA), is best understood as a tool for discovering the latent semantic structure of a large body of text.8 It identifies groups of words that tend to appear together across documents, without the researcher specifying in advance what those groups should be. This is its crucial advantage in a context as historically and legally charged as the Eichmann Trial: the algorithm does not know what the trial “should” be about and carries no predefined scholarly bias. It finds what the corpus actually contains.
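The second, zoomed-in step described above (asking which speaker groups are most associated with each topic) can be illustrated with a minimal sketch. It assumes that a fitted topic model has already assigned topic proportions to transcript segments, and that each segment is attributed to one speaker group. The group labels, the three-topic vectors, and the function names below are hypothetical and chosen only for illustration; they are not the study's data or code.

```python
from collections import defaultdict

# Hypothetical transcript segments: each carries an assumed speaker-group
# label and assumed topic proportions (three topics for brevity; the study
# reports ten). Each proportion vector sums to 1.
segments = [
    {"speaker_group": "survivor witness", "topics": [0.10, 0.20, 0.70]},
    {"speaker_group": "survivor witness", "topics": [0.05, 0.15, 0.80]},
    {"speaker_group": "prosecution", "topics": [0.60, 0.30, 0.10]},
    {"speaker_group": "defendant", "topics": [0.20, 0.70, 0.10]},
]


def mean_topic_share_by_group(records):
    """Average the topic proportions of all segments in each speaker group."""
    sums = {}
    counts = defaultdict(int)
    for record in records:
        group = record["speaker_group"]
        if group not in sums:
            sums[group] = [0.0] * len(record["topics"])
        for k, share in enumerate(record["topics"]):
            sums[group][k] += share
        counts[group] += 1
    return {g: [s / counts[g] for s in sums[g]] for g in sums}


def top_group_per_topic(group_means):
    """For each topic index, return the group with the highest mean share."""
    n_topics = len(next(iter(group_means.values())))
    return [
        max(group_means, key=lambda g: group_means[g][k])
        for k in range(n_topics)
    ]


group_means = mean_topic_share_by_group(segments)
leaders = top_group_per_topic(group_means)
for k, group in enumerate(leaders):
    share = group_means[group][k]
    print(f"Topic {k}: most associated with {group} (mean share {share:.2f})")
```

Averaging proportions per group is only one possible measure of association; it treats every segment equally regardless of length, which a fuller analysis might want to correct for.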
Applied to the full corpus of 119 session transcripts, covering sessions from the reading of the indictment through the prosecution’s witnesses, Eichmann’s own testimony, closing arguments, and the final judgment, the analysis identified ten coherent topics. While the number of topics was set by the researcher on methodological grounds, the scope and content of these word clusters were not imposed by the researcher but emerged from the statistical relationships among words across the entire trial collection.

Figure 1: Top ten terms per topic, Eichmann Trial corpus (LDA topic model). Each column presents the ten highest-prevalence terms for a given topic, ranked in descending order. Bar length indicates relative term prevalence within the topic. Topic titles reflect interpretive labels assigned on the basis of domain expertise.

8. David M. Blei, Andrew Y. Ng and Michael I. Jordan, “Latent Dirichlet Allocation,” 3 JOURNAL OF MACHINE LEARNING RESEARCH 993–1022 (2003). LDA is a hierarchical probabilistic model that represents each topic as a distribution over terms and represents each document as a mixture of the topics; Adji B. Dieng, Francisco J. R. Ruiz and David M. Blei, “Topic Modeling in Embedding Spaces,” 8 TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS 439–453 (2020). When fit to a collection of documents, in our case the Eichmann corpus, the topics summarize their contents, and the topic proportions provide a low-dimensional representation of each document (Dieng et al., “Topic Modeling”). This type of model is characterized as a bag-of-words (BOW) model, in which word order plays no role in the inference of topics. I implemented an LDA algorithm using the Gensim package for Python; see Radim Rehurek and Petr Sojka, “Software Framework for Topic Modelling with Large Corpora,” in PROCEEDINGS OF THE LREC 2010 WORKSHOP ON NEW CHALLENGES FOR NLP FRAMEWORKS (CITESEER 2010).
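Three ideas in the footnote can be made concrete with a small sketch: the bag-of-words representation, a topic as a distribution over terms, and a document as a mixture of topics. The sketch uses only the Python standard library and does not fit an LDA model. The toy vocabulary, the probabilities, and the names `topic_a` and `topic_b` are assumptions made up for illustration, not values from the study.

```python
from collections import Counter


def bag_of_words(text):
    """Reduce a text to word counts, discarding word order."""
    return Counter(text.lower().split())


# Two toy sentences with the same words in a different order.
first = "the witness remembered the train"
second = "train the remembered witness the"
same_representation = bag_of_words(first) == bag_of_words(second)

# Hypothetical topics: each is a probability distribution over a toy
# vocabulary (values are invented and sum to 1 within each topic).
topics = {
    "topic_a": {"train": 0.5, "transport": 0.3, "law": 0.1, "section": 0.1},
    "topic_b": {"train": 0.05, "transport": 0.05, "law": 0.5, "section": 0.4},
}

# Hypothetical document: a mixture of the two topics (weights sum to 1).
doc_mixture = {"topic_a": 0.75, "topic_b": 0.25}


def top_terms(topic, n=10):
    """Rank a topic's terms by probability, highest first, as in a top-terms chart."""
    return sorted(topic.items(), key=lambda item: item[1], reverse=True)[:n]


def word_probability(word, mixture, topic_table):
    """Probability of a word in a document: topic weights times term probabilities."""
    return sum(
        weight * topic_table[name].get(word, 0.0)
        for name, weight in mixture.items()
    )


print("Same bag of words:", same_representation)
print("Top terms, topic_a:", top_terms(topics["topic_a"], n=2))
print("P(train | document):", word_probability("train", doc_mixture, topics))
```

In Gensim, the package the footnote names, the counting step is typically done with `Dictionary.doc2bow` and ranked terms are read from a fitted `LdaModel` with `show_topic`; the study's own preprocessing and parameter choices are not given in this section.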
[Figure 1 panels, each ranking terms 1 to 10: Topic 0, Legal Procedure; Topic 1, Nazi Bureaucracy; Topic 2, Negotiations; Topic 3, Deportations; Topic 4, Poland Jewry Holocaust; Topic 5, Final Solution; Topic 6, Death Camp; Topic 7, Eichmann's Statement; Topic 8, Legal Indictment; Topic 9, Hungary Jewry Holocaust]

The potential of topic modeling for this form of analysis is well illustrated by the process of interpreting and naming Topic 3. At first glance, the topic appears to be a disparate list of unrelated words. Yet when examined through a lens of domain expertise, that is, prior knowledge of the Eichmann Trial and the Holocaust, these words
