Project: Encoding Thoreau

By: Emilio Garcia, Cindy Castillo, Nicole Logrieco, Mallory DelSignore, and Anonymous

Our project, Encoding Thoreau, set out to look more deeply into TEI, the markup language most commonly used for scholarly digital editing. TEI was covered only briefly in class, so our group focused on understanding it more thoroughly and on using it to encode two of Thoreau’s journal entries. The purpose of our project was not only to transcribe the text into TEI, but also to add more dimension to the text itself. By identifying locations, we were able to make a map, and through the tags we added more detail about Thoreau’s journal entries to our files. Together, we produced TEI encodings of these entries in an effort to better understand not only many aspects of TEI, but also Thoreau and his life.

As a group, we first had to decide what path we wanted our project to take. We initially toyed with the idea of tracking changes across manuscripts, but eventually settled on taking two entries and encoding them in TEI using the software Oxygen. After the long journey of finding our focus, everything fell into place quickly. We divided the tasks between two teams: the Research Team and the TEI Team. To begin, one of us created a Google document, shared it with the team, and typed up our two chosen journal entries, January 30th and May 14th. Mallory and another group member marked up and color coded the text, using different colors to identify proper nouns, nouns, and real and artificial nouns. From there, Nicole took those nouns and assigned them the proper tags. Emilio then used those tags in Oxygen to complete the actual encoding and asked Cindy to look over the resulting files to make sure everything was in order, as well as to edit and format the group’s blog post. As central as the process of encoding was to our project, we also conducted a great deal of research in order to gain a deeper understanding of the content we were marking up so meticulously. For this reason, we were very interested in researching Thoreau and the locations he mentions in his journal. To help our viewers picture these locations more concretely, one group member went a step further and used Google Maps to put together a map of all the places mentioned in the journal entries we encoded.

As we step back and reflect on where we excelled and where we had the most difficulty, we all agree that our first challenge stemmed from the fact that we did not fully grasp the task of our own project or the TEI language overall. We all identified ourselves as novices with TEI, having never worked with it or been given a project like the one proposed by this course, so we were all a tad flustered in the beginning, and probably more than just a “tad.” In fact, when Professor Schacht first assigned us this project, we toyed with the idea of using fluid text. We eventually abandoned this approach because we could not locate enough versions of the journal entries that varied from the original one we had. Once we accepted that this approach was not feasible, we went back to square one. From there, we decided to regroup and focus on identifying the nouns in the journal entries we had all agreed on so that we could encode them. Although this was a step in the right direction, we ran into the issue of marking up nouns that we could not encode, because we did not yet know the TEI guidelines well enough.

As the course progressed, we grew more comfortable with our project and came to appreciate the power of TEI. From tagging words to recording location coordinates and relative distances, all while structuring the text properly, the TEI language offers a great deal of potential. Despite our growing understanding of the language, it is just as important to outline where we found limitations. These arose primarily when we wanted to relate pronouns to the nouns they referred to, but could not find a way to do so in TEI. We also ran into some trouble with the tags, particularly since there weren’t specific tags for everything we needed. This limitation eased when we later found out that we could create the tags we needed. As our project comes to an end, we hope it serves as a testament to the power of TEI and, in the same way, as an outline of the limitations it can impose. This resonated with us and was very true in our case, since, given our time restrictions, the limited tagging capabilities did not serve what our project was trying to accomplish. That is not to discredit the TEI language, but rather to acknowledge the areas where it is most useful and the areas where it is not. We could most likely have worked past these areas of concern if our time had not been so limited.
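To make the idea of tagged places and coordinates more concrete, here is a minimal sketch in Python of how place names and coordinates might be read back out of a TEI file, for example to feed a map. It is not the group's actual encoding. The TEI fragment, the `xml:id` values, the sentence of text, and the approximate coordinates are all invented for illustration, and the fragment omits the header a complete TEI file would need. The function name `places_with_coordinates` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI P5 documents.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented fragment for illustration only: the sentence is placeholder text,
# not a transcription of Thoreau, and the coordinates are approximate.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <listPlace>
        <place xml:id="concord">
          <placeName>Concord</placeName>
          <location><geo>42.4604 -71.3489</geo></location>
        </place>
        <place xml:id="boxborough">
          <placeName>Boxborough</placeName>
          <location><geo>42.4912 -71.5173</geo></location>
        </place>
      </listPlace>
      <p>A walk from <placeName ref="#concord">Concord</placeName>
         toward <placeName ref="#boxborough">Boxborough</placeName>.</p>
    </body>
  </text>
</TEI>"""


def places_with_coordinates(tei_xml):
    """Return {place name: (latitude, longitude)} for each <place> entry."""
    root = ET.fromstring(tei_xml)
    places = {}
    for place in root.iterfind(".//tei:place", TEI_NS):
        name = place.findtext("tei:placeName", namespaces=TEI_NS)
        geo = place.findtext("tei:location/tei:geo", namespaces=TEI_NS)
        # <geo> is read here as "latitude longitude", separated by a space.
        lat, lon = (float(value) for value in geo.split())
        places[name] = (lat, lon)
    return places


if __name__ == "__main__":
    for name, (lat, lon) in places_with_coordinates(SAMPLE).items():
        print(name, lat, lon)
```

In this sketch, each place is described once in a list, and the mentions in the running text point back to it through the `ref` attribute, so the coordinates only have to be entered a single time.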

What we collectively learned from encoding Thoreau was quite substantial. From just a few journal entries, we discovered a great deal about Thoreau that we would not have learned had we approached the project differently. We were able to track down his various locations, discovering that his family home is in Concord, MA, and that he moved frequently, changing homes a total of three times. In addition, we discovered that he visited Boxborough at least three times. In our research, we also learned that Thoreau found a kitten near Mr. Picard’s land in 1853 with his sister, Sophia. This discovery was particularly significant because this kitten, who was named Min, was the focal point of a few of his entries, including the May 14th journal entry we encoded. There was also an older cat referenced throughout his entries, which we later learned he owned as well. Our research ultimately allowed us to notice that the two cats mentioned in Thoreau’s entries were in fact different cats, and thus we were able to tag them separately. As a reader, it is easy to notice Thoreau’s affinity for cats, as he often observed stray and house cats all over Concord, writing that they were creatures who are “so naturally stealthy, skulking and creeping about, affecting holes and darkness” (Thoreau).
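As an illustration of what tagging the two cats separately could look like, here is a minimal sketch in Python. It assumes each mention is wrapped in TEI's `<rs>` (referring string) element, with a `ref` attribute pointing at an identifier for the animal. This is one possible approach and not necessarily the markup the group used. The identifiers `#min` and `#oldcat`, the `type="animal"` value, the sample sentence, and the function name `mentions_by_referent` are all invented for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI = "http://www.tei-c.org/ns/1.0"

# Invented placeholder sentence, not a transcription of Thoreau's journal.
SAMPLE = (
    '<p xmlns="http://www.tei-c.org/ns/1.0">'
    '<rs type="animal" ref="#min">Min</rs> watched '
    '<rs type="animal" ref="#oldcat">the old cat</rs>, and then '
    '<rs type="animal" ref="#min">the kitten</rs> hid.'
    "</p>"
)


def mentions_by_referent(tei_xml):
    """Group the text of each <rs> mention under the id it points to."""
    root = ET.fromstring(tei_xml)
    mentions = defaultdict(list)
    for rs in root.iter(f"{{{TEI}}}rs"):
        text = "".join(rs.itertext()).strip()
        mentions[rs.get("ref")].append(text)
    return dict(mentions)


if __name__ == "__main__":
    for target, texts in mentions_by_referent(SAMPLE).items():
        print(target, texts)
```

Because "Min" and "the kitten" share one identifier while "the old cat" has another, a reader or a program can tell the two animals apart even when the wording of a mention varies.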

By creating this project, we learned the TEI language through marking up Thoreau’s journal. We created a map, as well as TEI files, using the free 30-day trial of the Oxygen software. After participating in this course and in this project, our group understands that as we move into an even more digitized world, knowing the TEI language will be a great asset. It is not only a great tool for transcribing and producing digital manuscripts, but it also allows us to understand and experience the “feel” of contributing to a project as large as the one produced in this course. Through TEI, which allows more detailed information to be placed in the original text, we were able to give these journal entries more complexity and allow them to come to life.

