Digital Tools Assignment 3

Making Text into Data

As you know from class, XML (eXtensible Markup Language) is a metalanguage (a language used to describe or mark up another language) used to process and store data. Markup languages help digital archivists and instructional technologists do two things: first, markup creates a rich data set from the text in question, which can then be converted into a number of formats; second, markup encodes how a particular text should be presented on a webpage, digital publication, data set, etc. Marking up a document in XML is a matter of directing a computer as to how the manuscript’s content should be presented and interpreted. Decades ago, the Text Encoding Initiative (TEI) was developed to standardize how digital humanists encoded texts.

For this assignment, you’ll be adding limited XML to a page of Middle English medical recipes that you’ll transcribe from Trinity College Cambridge MS O.8.35, a mid-fifteenth-century English compendium of medical knowledge. Your marked-up transcription will then be added to the digital critical edition of the manuscript, Old Books, New Science, created using EditionCrafter, an open-access tool that marries XML-encoded text to images served in IIIF.

I’ll assign your page of text in class on Tuesday, February 17, and you’ll work with your classmates on Thursday, February 19 to complete your transcription.

We’ll work together on Steps 1 and 2 of this assignment on Thursday, February 19 and Thursday, February 26. Steps 1, 2, and 3 are due March 6 at 11:59 pm; Step 4 is due by March 13 at 11:59 pm.

Trinity College Cambridge MS O.8.35, digitized by the Trinity College Cambridge Library

Step 1: Transcribe a page of a fifteenth-century manuscript

Each of you is assigned one page of the manuscript to transcribe for this digital tools assignment. Consult the table below to find your assigned page and its corresponding XML ID. To begin transcribing your page of the manuscript, open a Google doc and name it after the XML ID that appears in the table below. So, for example, the first page in the table (f. 64r) would be transcribed in a Google doc named f142.

In the Google doc you’ve created, you will transcribe the entire page you’ve been assigned. Be sure to hit enter to create a line break whenever there’s a line break in the manuscript. You’ll also be sure to mirror the spelling and punctuation that you see exactly. This will be difficult! The best way to get better is to keep trying. I encourage you to try your best to get a rough draft, then swap with a partner and have them read what you’ve done and offer suggestions.

Student	Page	XML ID
Jane Allinger	f. 64r	f142
Fiona Corrigan	f. 64v	f143
Camila Erazo	f. 65r	f144
Hudson Hahn	f. 65v	f145
Stella Lenzie	f. 66r	f146
Kasia Love	f. 66v	f147
Jonathan Martinez	f. 67r	f148
Kylie Millar	f. 67v	f149
Aiden Reed	f. 68r	f150
Keilah Scott	f. 68v	f151
David Smith	f. 69r	f152
Andrew Stillwell	f. 69v, f. 70v	f153, f155
Hailey Stuart	f. 71v	f157
Katie Tovar	f. 72r	f158
Vivian Velasquez	f. 71r	f156

Step 2: Mark up your transcription with XML according to TEI hierarchies

The EditionCrafter workflow will automatically create a TEI header and a number of other XML tags to merge these separate txt files into one XML file, so you don’t need to open your transcription with a <p> or <ab> tag.

However, you should still add internal tags to the text you transcribe when necessary. The following tags are included in the TEI header:

<head>: denotes a title that is separate from the block of text
<title>: denotes a title that is in-line with the block of text
<PersonName>: exactly what it sounds like
<term>: indicates a word that should be included in an eventual glossary because it is unusual or unique. These tags are uncommon, and they should be wrapped in <ref> tags.
<supplied>: should be used when you add something to the transcription that isn’t entirely indicated in the manuscript. These are uncommon tags, and you do not add them when you simply expand a common abbreviation.
<measure>: indicates a unit of measurement in the recipe
<time>: indicates when a length of time is referenced in the recipe

Please refer to the first sixteen folios of transcribed pages in dyngley-data/transcriptions for a model of how to incorporate tags. You can also refer to the glossary of official TEI elements here, or you can review the basic TEI tags in the Basic Tagging tutorial from the Women Writers Project.
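Before you submit, it can help to confirm that every tag you opened is also closed. The following is a minimal Python sketch of one way to check that, not part of the EditionCrafter workflow. The sample lines are invented for illustration and do not come from the manuscript, and the temporary <wrapper> element is a hypothetical name that exists only because your file has no enclosing tag of its own.

```python
import xml.etree.ElementTree as ET

# Invented sample lines for illustration only; NOT taken from the manuscript.
SAMPLE = """<head>For the hede ache</head>
Take <measure>a handfull</measure> of rewe and sethe it in water
and drynke it <time>thre dayes</time> fastyng"""


def check_well_formed(fragment: str) -> list[str]:
    """Return the tag names used in the fragment, in document order.

    Raises xml.etree.ElementTree.ParseError if a tag is unclosed or mismatched.
    """
    # The transcription has no enclosing <p> or <ab>, so wrap it in a
    # temporary root element (hypothetical name) to make it parseable.
    root = ET.fromstring(f"<wrapper>{fragment}</wrapper>")
    return [el.tag for el in root.iter() if el is not root]


if __name__ == "__main__":
    print(check_well_formed(SAMPLE))  # ['head', 'measure', 'time']
    try:
        check_well_formed("Take <measure>a handfull of rewe")
    except ET.ParseError as err:
        print("Not well-formed:", err)
```

This only checks that the tags are well-formed XML. It does not check that you chose the right tag, and a bare & character in a transcription would also be reported as an error.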

Step 3: Submit your transcription with a pull request

Once you’re sure you’ve got the TEI and transcription right, it’s time to add your transcription to the dyngley-data repository in our GitHub organization so that I can include your work in the critical edition of TCC MS O.8.35.

First, you’ll want to download your file from Google docs as a .txt file. Be sure it’s saved somewhere on your computer as xmlid.txt. So, for example, if you were assigned f. 66v in the manuscript, you’d see in the table above that the XML ID for that page is f147. You’d name your transcription file f147.txt.

Next, just as we did with the spring-2026 repository, you’ll create a new branch in the dyngley-data repository named lastname-dt3. Within that branch, locate the transcriptions folder. Click Add file and upload your .txt file with your transcription and XML.
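The naming rules above can be sketched in a few lines of Python. This is only an illustration: the folio-to-XML-ID pairs are copied from the table in Step 1, the function name submission_names is hypothetical, and lowercasing the last name in the branch name is an assumption rather than a stated course rule.

```python
# Folio-to-XML-ID pairs copied from the assignment table in Step 1.
XML_IDS = {
    "f. 64r": "f142", "f. 64v": "f143",
    "f. 65r": "f144", "f. 65v": "f145",
    "f. 66r": "f146", "f. 66v": "f147",
    "f. 67r": "f148", "f. 67v": "f149",
    "f. 68r": "f150", "f. 68v": "f151",
    "f. 69r": "f152", "f. 69v": "f153",
    "f. 70v": "f155", "f. 71r": "f156",
    "f. 71v": "f157", "f. 72r": "f158",
}


def submission_names(folio: str, last_name: str) -> tuple[str, str]:
    """Return (transcription filename, branch name) for one assigned page."""
    xml_id = XML_IDS[folio]            # KeyError if the folio is not in the table
    filename = f"{xml_id}.txt"         # e.g. f147.txt
    # Assumption: the branch uses a lowercase last name, as in "lastname-dt3".
    branch = f"{last_name.lower()}-dt3"
    return filename, branch


if __name__ == "__main__":
    print(submission_names("f. 66v", "Love"))  # ('f147.txt', 'love-dt3')
```

If you were assigned two pages, each page gets its own .txt file, and both can be uploaded to the same branch.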

Finish by submitting a pull request to the main branch of the dyngley-data repository. Once I’ve looked over your transcription, I’ll merge your pull request and then run EditionCrafter to incorporate your transcription into our digital edition.

Note: If you’d like more information on how EditionCrafter works to combine the XML of your marked-up transcription with the XML of the IIIF manifest, you can read about that here. I’m also happy to explain it in more detail in class, though I’m not the developer who wrote the program!

Step 4: Explain yourself

In the spring-2026 repository, create a Markdown file following our course protocols (create a new branch named lastname-dt3, navigate to the _posts folder and create a new .md file titled yyyy-mm-dd-your title.md, or draft the post in Google docs and then download and upload the .md file) and include the appropriate YAML header.
Note: Do not name your file Digital Tools 3. We can’t have multiple files merged with the same filename. They’ll overwrite one another once I merge your post into the course repo. Think of something more creative!

Somewhere in your post, copy your transcription and TEI encoding from your .txt file. Next, link to the page in the digital critical edition featuring your transcription. You’ll want to refer to these when answering the questions below.

In the same .md file, write a 3-4 paragraph blog post responding to the following questions:

What recipes were in your transcription?
What was the process of learning to transcribe Middle English like?
What questions do you have about the act of recording this recipe in a manuscript?
How might encoding the text with XML mark-up transform its utility?
According to Pamela Smith, communicating embodied knowledge in writing required acts of translation: taking tacit knowledge and making it explicit. In what ways does the act of encoding text with XML tags mirror the process of ‘translation’ described by Smith?

Step 5: Submission

Finish by submitting a pull request to the technologies-of-history/spring-2026 repository. Name the pull request lastname-dt3. That’s it!