Charles S. Peirce is widely acknowledged as one of the founders of philosophical pragmatism and semiotics. After he died in 1914, Harvard University acquired Peirce’s writings to preserve them for future study, collecting approximately 1,650 unpublished manuscripts totaling over 100,000 pages. Carefully cataloged by the scholar Richard Robin, the archive was microfilmed in the 1960s, and the Houghton Library subsequently digitized the microfilms. Their limited legibility, however, recently persuaded the library to undertake a renewed digitization effort working from the original manuscripts. This new effort offers an exceptional opportunity to study Peirce’s sometimes difficult writings through AI-based computational techniques that allow for automated transcription and the study of idiosyncratic graphical markings, color codes, revisions, insertions, and annotations.
Alongside traditional editing practices, the Peirce Interprets Peirce project aims to leverage the power of these new techniques, from machine learning to text analysis to data visualization, to develop a prototype for an interactive digital edition of the entire Peirce archive. A small team of undergraduate and graduate students with varying skillsets, from literary and paleographic analysis to coding to graphic design, will be assembled to this end and will collaborate with the University of Lausanne, the University of Groningen, and the Bibliotheca Hertziana, with metaLAB (at) Harvard serving as the project hub. The project goals are articulated in three phases.
The textual analysis of Peirce’s work aims to demonstrate how the manuscripts open up a broad interpretive space. The first goal is to connect the corpus to its broader context through techniques of wikification and entity linking provided by Linked Open Data. These connections will rest on an ontological framework based on Peirce’s semiotic theory. This process aims to track semantic slippages in his writings over the years.
The methods used today to build interoperable catalogs of cultural significance are heavily influenced by Peirce’s work. Knowledge graphs and the Semantic Web offer standards for constructing semantically enhanced digital scholarly editions. While TEI and IIIF can be adopted for publishing textual and visual data, the editions will be available as RDF-based knowledge graphs and paired with semantic annotations of the archived content. This process will situate Peirce’s notes in the landscape of computational linguistics and cognitive computing, and will represent his contributions and theories using widely adopted vocabularies and ontologies for scholarly publishing and annotation.
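As a rough illustration of what a semantic annotation in an RDF-based knowledge graph might look like, the following minimal Python sketch serializes one annotation as N-Triples using terms from the W3C Web Annotation vocabulary. It is not the project's actual data model: the example.org identifiers and the helper names triple and annotate are hypothetical placeholders introduced only for this example.

```python
# Minimal sketch: serialize one semantic annotation as N-Triples (stdlib only).
# Assumption: all example.org URIs are hypothetical placeholders,
# not identifiers used by the project.

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OA = "http://www.w3.org/ns/oa#"  # W3C Web Annotation vocabulary namespace


def triple(subject: str, predicate: str, obj: str) -> str:
    """Format one N-Triples statement whose three terms are all IRIs."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def annotate(annotation: str, target: str, body: str) -> list[str]:
    """Describe an annotation that links a manuscript page to a concept."""
    return [
        triple(annotation, RDF_TYPE, OA + "Annotation"),
        triple(annotation, OA + "hasTarget", target),
        triple(annotation, OA + "hasBody", body),
    ]


triples = annotate(
    annotation="https://example.org/annotation/1",   # hypothetical
    target="https://example.org/manuscript/page/1",  # hypothetical page IRI
    body="https://example.org/concept/sign",         # hypothetical concept IRI
)
print("\n".join(triples))
```

In a real edition the target would more plausibly be an IIIF canvas or a TEI fragment identifier, and the body an entity drawn from a Linked Open Data source; plain strings are used here only to keep the sketch free of external dependencies.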
From the standpoint of data visualization and interface design, our goal is to provide end users with a series of data visualizations capable of working as indices to specific facets of the archive. In addition to standard tools like text search and classification, comprehensive visualizations will allow for the thorough exploration of the archive and for zooming in on particulars.