Holly Gilbert – Digital Humanities at Geneseo

December 14, 2017 by Holly Gilbert - No Comments

What Women Wrote

19th century American women writers received plenty of disapproval and even derision for their work during their lifetimes, and their texts were often placed into boxes – domesticity, sentimentality – that may not be fair representations of what they were really interested in writing about. Take Nathaniel Hawthorne’s biting remark about his female contemporaries as an encapsulation of this attitude:

“America is now wholly given over to a damned mob of scribbling women, and I should have no chance of success while the public taste is occupied with their trash-and should be ashamed of myself if I did succeed.”

With this in mind, I aimed to perform a textual analysis in two parts: on an individual author basis, with five selected writers I have studied in the past year, and on a macro level, with a larger sample of women’s writing from this period gathered together for a much more distant read. My intention was to identify patterns throughout these texts that, for the individual authors, indicate the weighty social issues being addressed (slavery, education, law, and economic empowerment are big ones), and for the macro-level analysis, show us how sentimental their writing may be.

Analyzing the language of texts is a very traditional humanities aim; essentially, I’m performing the age-old task of drawing connections between a piece of literature and its historical and social contexts, as well as challenging preconceptions about who we read and why. Where this project really differs from traditional scholarly efforts to read and understand texts is in the way it analyzes literature. Instead of performing a close reading, I’m using digital tools to “distantly” read literature – pulling out patterns and data that may suggest what writers are focused on without necessarily having a strong, intimate understanding of the individual text. This reading also allows humanists to evaluate an entire subset of writing – the nineteenth century, women, etc. – and find intriguing results to further evaluate, rather than build an understanding from the ground up by developing an expertise in specific works of literature.

This work was all compiled on a WordPress site, which went through a few different themes before I settled on one that would provide the easiest navigation for visitors – a side menu bar with the various options listed in full as readers are scrolling down each page, paired with a top menu that also shows where these pages can be found at a glance. I was also aware that I didn’t have too many eye-grabbing images to include, so I needed a more text-centric theme that was still appealing to visitors.

The first page users will come across is the welcome page, which introduces the project, addresses its impact, and briefly points to some of the conclusions that can be found in the analysis pages of the site. I’ve also incorporated an overview of the texts being individually analyzed in a Timeline JS timeline, which can be seen below:

My intention is that users will first, if interested, look through the author analysis pages in order to learn more about each of the women’s writings I selected. On each page, I include a brief introduction of the writer, and then the text analysis put in context with my own close reading of the novels or other pieces being examined. For the analysis, I used Voyant Tools, a free online resource that allows users to upload text files and analyze them in a variety of ways – for my purposes, I mainly used Word Cloud, which is a visual representation of word frequencies, and Trends, which creates a graph tracking the relative frequency of specified words across one or multiple texts.
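To make the idea behind a Trends-style chart concrete, here is a minimal Python sketch of relative frequency: how often a chosen word appears as a share of all the words in a text. It is only an illustration, not how Voyant Tools works internally, and the two short "texts" and the names `tokenize`, `relative_frequency`, and `corpus` are made up for the example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def relative_frequency(text, term):
    """Return the occurrences of `term` divided by the total word count."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return Counter(tokens)[term] / len(tokens)

# Hypothetical stand-ins for full novels.
corpus = {
    "text_a": "The school was small, and the school was cold.",
    "text_b": "She left the farm and never saw a school again.",
}

# One value per text; plotted side by side, these would form the trend line.
trend = {name: relative_frequency(text, "school") for name, text in corpus.items()}
```

Dividing by the length of each text is what lets a short work and a long one be compared on the same chart.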

The introductory text for E.D.E.N. Southworth.

The best way I found to represent these results was to embed them directly from Voyant Tools, which allows users to interact with the graphs rather than just view them as an image. This has the potential to create issues with my project – it’s unclear how permanent these embeds are, since they’re hosted by Voyant Tools itself. So far, it’s been a couple of months and there have been no complications from this, but I did save the graphs as PNG images just in case I need to insert those instead. The benefits of the embedding can best be seen in the Word Tree chart I used for E.D.E.N. Southworth’s The Hidden Hand – with this tool, users can click on selected words to see their linkages and contexts. Try it for yourself below, and explore Southworth’s usages of the terms girl and work.

As a whole, I found some really interesting patterns using Voyant Tools on the smaller-scale level. Although I cannot include all the author analysis pages on this post, Hannah Webster Foster’s Trends chart is another example of the conclusions that may be drawn using digital tools.

Here, I’ve embedded the Trends chart from this page. Although readers can see what they find on their own by clicking through it, I’ve also included my own analysis:

The visualization of the pattern discussed above offers readers the chance to notice what may not be apparent in a close reading, or understand how Foster might be pushing against social norms even if they haven’t had the opportunity to read her novel The Coquette for themselves.

The second part of my project is the macro analysis, and the best description of how this was developed can be found right in its page on the WordPress site:

Sentiment analysis became a major focus for this part of the project – using digital tools to evaluate the emotions, positivity, subjectivity, and other non-quantitative aspects of a piece of writing. Polarity is one of the key aspects of sentiment analysis, and what I’ll be using here to demonstrate some of the difficulty I’ve had in drawing conclusions from such a tricky tool. For a description of sentiment analysis and some of its issues, read another excerpt from the project:

Here, I’ve embedded the line chart that has been produced from Kirk’s data on the polarity of these nineteenth-century American women writers, and also shared my interpretation of it below. As noted, a higher positive value indicates a more positive use of language (closer to +1), while -1 represents maximum negativity.
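For readers unfamiliar with how a polarity score between -1 and +1 might be produced, the sketch below shows one simple approach: averaging the scores of words found in a small sentiment lexicon. This is an assumption made for illustration, not a description of the tool behind the chart, and the `LEXICON` values are invented.

```python
import re

# Hypothetical toy lexicon: each word maps to a score in [-1, 1].
LEXICON = {
    "joy": 0.8, "love": 0.6, "kind": 0.5,
    "grief": -0.7, "cruel": -0.9, "trash": -0.6,
}

def polarity(text):
    """Average the lexicon scores of the words found in `text`.

    Returns 0.0 when no word in the text has a known score.
    """
    words = re.findall(r"[a-z]+", text.lower())
    scores = [LEXICON[w] for w in words if w in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0
```

A scorer this simple reads "not kind" as positive because it sees only the word "kind", which hints at why polarity results need cautious interpretation.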

Text analysis was a much more conceptually simple tool when I was using it to evaluate individual texts that I was already quite familiar with, and I think the context I lend the graphs in my individual author pages helps readers draw significant conclusions – yes, women writers in this time were interested in some pretty major and controversial national debates. Harriet Beecher Stowe, for example, wrote one of the most influential abolitionist works in our nation’s history, but my site also lets these results be read alongside the less well-known Harriet E. Wilson’s examination of the same issue in a much different context:

The macro analysis is inherently more problematic, but I end my project with the conclusion that sentiment analysis can offer humanists potential directions to take future work. I can’t make decisive conclusions just from the polarity and subjectivity results in the scope of my project, but they would have much more meaning if compared with the same values from a sample of men’s writing, or even from a sample of twentieth-century women writers. There are so many directions to take this kind of work, and I think my project succeeds in at least addressing this possibility. This has been a conceptually challenging undertaking, but over the course of the semester, I’ve been able to increase not only my technical skills (building a WordPress site, performing some basic text analysis, doing plenty of embedding) but also my theoretical understanding of how humanists can use digital tools and technical, quantitative analysis to draw conclusions about literature.

The end of my welcome page, addressing these same issues.

November 5, 2017 by Holly Gilbert - No Comments

Progress: Women Writers & Social Change

My project is a text analysis of the works of American women writers in the 19th century. Based on the core texts I read in Dr. Caroline Woidat’s ENGL 439: American Ways: Plotting Women, I hope to prove with digital tools that women writers in this period were intent on tackling pervasive and even controversial social issues. This work will attempt to break down misconceptions that early American women were confined to the sphere of domesticity in their writing by examining their works’ relationships to topics such as slavery, education, economic empowerment, and more.

Harriet Beecher Stowe, Harriet E. Wilson, and E.D.E.N. Southworth

Establishing connections between literary pieces and historical contexts is a traditional humanities aim, allowing us to understand what these texts may be responding to or influencing. Like many such projects, mine will focus on identifying trends, patterns, and language to support this goal. The benefit of digital tools, however, is that they allow us to push the boundaries of what a single human scholar can study; using Python, Voyant Tools, and/or other programs with text analysis capabilities, many lifetimes of reading can be processed at once.

One of my main obstacles in tackling this project has been stepping outside of my comfort zone and embracing the broad scope that digital tools can allow my work here to have. I’ve read roughly eight texts for the class this project is stemming from, and I initially intended on just analyzing those – I know them well, which means I can run them through text analysis with the results already envisioned. Below, the spreadsheet I’ve been developing this semester lists these women and their literary works on top:

Discussion in class and my meeting with Kirk Anne, however, have pushed me to include many more writers and texts in this analysis. Kirk can pull every text file from Project Gutenberg (happily, works from the time period of focus here are now in the public domain) and apply commands to them in order to identify the word choices, patterns, and relationships I’m looking for. That’s tens of thousands of options, which opens up so many opportunities – we could compare trends in women’s writing across time periods, compare them to a canon of typically much more well-known men’s works, etc. There is an honestly overwhelming number of approaches, but for now, I am focused on finding a wider selection of American female authors within the window of the 19th century, using the writers I initially intended to analyze as major touchpoints.

This process is a challenge in itself, as I find myself running into the most sweeping questions English as a discipline faces – who do we read, and why? Which books and authors should I include, and on what basis? It’s also technically difficult; Project Gutenberg does not make distinctions between male and female authors, so Kirk has been working to parse that out based on name – leaving a massive chunk of authors in uncertain territory. The Excel file he has compiled of all the women writers on Project Gutenberg that he can identify is still monstrous, however!
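The name-based sorting described above can be pictured with a small Python sketch. It is not Kirk's actual method, which the post does not detail; the name lists, the `guess_gender` function, and the assumed "Surname, Given Names" entry format are all hypothetical.

```python
# Hypothetical first-name lists; a real run would need far larger ones.
FEMALE_NAMES = {"harriet", "susan", "hannah", "louisa"}
MALE_NAMES = {"nathaniel", "henry", "herman"}

def guess_gender(author):
    """Guess from the first given name in a 'Surname, Given Names' entry."""
    if "," not in author:
        return "unknown"
    given = author.split(",", 1)[1].split()
    first = given[0].strip(".").lower() if given else ""
    if first in FEMALE_NAMES:
        return "female"
    if first in MALE_NAMES:
        return "male"
    return "unknown"
```

An entry such as "Southworth, E. D. E. N." comes back as "unknown" because initials carry no usable first name, which is exactly the kind of author left in uncertain territory.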

Kirk Anne’s Excel spreadsheet listing all 4,103 identified women writers on Project Gutenberg; here are just a few in the range of birthdates to death-dates I’m focused on.

As I broaden my canon, I also have to identify which women are American on my own. I’ve been doing this by cross-referencing the Excel file with some Internet research on women writers within the given period (which, again, brings me back to an ideological conundrum about finding authors who aren’t going to pop up in Wikipedia lists but still had important things to say).

Here, I’ve identified Susan Warner as American and have highlighted her many texts as possibilities.

There’s also the question of genre; in my original selection, most texts are fictional novels, but there are lots of women writers in this period who were publishing nonfiction, poetry, or, in Harriet Beecher Stowe’s case, defenses of their own novels. There’s a lot I’d like to include that will force me to really consider how comparable different types of texts are and where data might be skewed.

Another, less mind-boggling obstacle is just my level of technical proficiency. Voyant Tools, intended for this type of digital scholarship, has been a great free resource for me as I begin to explore the connections between texts and social concerns on my own. With it, I can create frequency-based word clouds using the Cirrus tool, visualize the relative frequencies of word usages across all my texts with a graph, and get some data about which texts are longest and use the most unique words. Voyant Tools is user-friendly; you simply upload the text files you intend to use and it runs them for you on its website. Then, you can change which tools you’re using, make your own restrictions as to which words you want to search for within texts, and then export the results as images or embeddable HTML.
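The counting that sits behind a frequency-based word cloud can be sketched in a few lines of Python. This is a simplified illustration rather than Voyant's implementation; the short `STOP_WORDS` set and the `top_words` function are invented for the example, and a real stop-word list would be much longer.

```python
import re
from collections import Counter

# A tiny stop-word list for illustration only.
STOP_WORDS = {"said", "he", "she", "the", "and", "a", "to", "of"}

def top_words(text, n=3):
    """Return the `n` most frequent words in `text`, ignoring stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS).most_common(n)
```

Filtering out stop words first keeps words like "the" and "said" from crowding out the terms that actually say something about a text's concerns.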

A Cirrus word cloud based on some of the core texts I’m using with specific “stop words” edited out (i.e. said, he, she, the, and).

Tracking the relative frequencies of a few key terms (orange = slave, green = christian, pink = school*) across texts – slavery hits a huge peak in *A Key to Uncle Tom’s Cabin*.

Voyant Tools does have significant limits when it comes to more advanced or particular text analysis attempts, which means Kirk Anne has been doing a ton of work on this project, pulling files and running his own commands. I do not have the programming knowledge to carry much of this out, which means I’m really going to be relying on his expertise and working to decide what I want to look for and how that can be pulled off. I’ve been really impressed with the range of possibilities this has opened up; for me, analyzing texts has always meant closely reading them, whereas he can find patterns in over 10,000 works at once.

Once I finalize my list of texts, I can start delving into finding the patterns I’m hoping to identify. In the next few weeks, I can hopefully compile some interesting results and display them – which leads to more choices and digital tools, for I am debating creating a WordPress to host this project and also developing a TimelineJS timeline of the women writers being showcased.

September 20, 2017 by Holly Gilbert - No Comments

Folger Digital Texts

Folger Digital Texts is a project undertaking a very traditional humanities aim – preserving, organizing, and making available Shakespeare’s collection of works – but breaks newer ground with digital tools and a great contribution to the collaborative, open source nature of digital humanities initiatives.

Shakespeare can be found everywhere, to be sure, but these editions are unique because they offer a highly readable, navigable, and interactive text to users. Physical books, PDFs, and simpler websites containing Shakespeare’s work are limited when it comes to these features, which is why the digital tools the Folger Shakespeare Library implements really shine. Public domain texts with the widespread cultural impact that Shakespeare has are always going to be available, but a project like this ensures a quality reading experience to those who might otherwise lack access to a reliable physical edition or digital file.

For instance, once you open a play of your choice – let’s say, Hamlet – you are immediately shown a few displays. In the center of the website is the play itself, formatted much like a physical book with a clear, attractive layout. On the left, you have a display of information about the text, which is present throughout the reading experience. You can switch between a synopsis, a character list, and the table of contents of the work. If you have forgotten who Reynaldo is, for instance, you only need to slide your eyes to the left to see that he is Polonius’ servant. If you have been reading Act III for what feels like an eternity and want to know exactly how many scenes are left to get through, again, look to the left. There is no flipping between pages or scrolling endlessly to find what you need here; the tools exist for you to find background information, lines, and scenes immediately.

Furthermore, the text is marked at various points to indicate some of the changes between versions of the play; wherever there is a square bracket one can hold their cursor over, a note will appear with information about the version this word or phrase came from. This illustrates how scholarly work has been carefully encoded into the edition, although this feature is fairly limited in this project – it would be fantastic to have more complete scholarly notes strewn throughout the text. For a free online resource, however, this may be a tall order.

Projects such as this can be achieved through the use of TEI (Text Encoding Initiative), an XML-based standard that presents humanists with a general set of guidelines for digitally encoding texts. A thorough explanation of TEI can be found here, but you can see the process that may have been undertaken to encode Titus Andronicus in one of this website’s examples, including some important elements:

Outside of creating an accessible and dynamic collection of Shakespeare’s works, Folger Digital Texts also has another purpose, one that sets it further apart from traditional humanities projects. For every play, there is an option to download the complete, edited text in seven different file formats: XML, HTML, PDF, DOC with line numbers, DOC without line numbers, TXT, and TEI Simple. This turns the project into an excellent, centralized source for those accessing Shakespeare’s work to use in further digital projects – maybe humanists can pull a file to perform textual analysis of a play, examine relationships between Shakespeare plays and more recent works, or develop a digital edition with a different aim.
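As a rough idea of what pulling an encoded file into a further project could involve, the Python sketch below groups lines by speaker in a TEI-style fragment. The fragment is hand-written in the general style of TEI drama markup and is far simpler than an actual Folger download, so the `SAMPLE` text and the `lines_by_speaker` function should be read as assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# The TEI namespace, needed to find elements in a namespaced document.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# A hand-written fragment, not copied from any Folger file.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <sp who="#Hamlet"><speaker>HAMLET</speaker><l>To be, or not to be</l></sp>
    <sp who="#Ophelia"><speaker>OPHELIA</speaker><l>Good my lord</l></sp>
    <sp who="#Hamlet"><speaker>HAMLET</speaker><l>I humbly thank you</l></sp>
  </body></text>
</TEI>"""

def lines_by_speaker(xml_text):
    """Map each speaker label to the list of verse lines they speak."""
    root = ET.fromstring(xml_text)
    result = {}
    for sp in root.iterfind(".//tei:sp", TEI_NS):
        speaker = sp.findtext("tei:speaker", default="", namespaces=TEI_NS)
        lines = [line.text or "" for line in sp.iterfind("tei:l", TEI_NS)]
        result.setdefault(speaker, []).extend(lines)
    return result
```

Because the markup already records who speaks each line, a question like "how much does each character say?" becomes a matter of counting rather than rereading.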

This spirit of open collaboration is a bit different from the type of collective work traditionally done through scholarly references, panels, and journals. Scholars can use this project to build many others simultaneously, freely, and easily (at least to the extent that they can begin with a reliable text file). There is no need to build a digital edition of a Shakespeare play from scratch for further initiatives when a TEI Simple file can be accessed at the click of a button. The Folger Shakespeare Library has recognized a need to share its labor with other digital humanists, and this focus on the growth of the field is very promising for those of us just getting started.

May 12, 2015 by Holly Gilbert - No Comments

Building a Thoreau Timeline

The Thoreau Timeline

Henry David Thoreau was a busy man, and our group was tasked with listing and detailing his many lectures, publications, journal entries, and general biographical facts along with his work on A Week on the Concord and Merrimack Rivers and Walden. We were provided with a rough outline of what such a timeline should encompass from Stephen Adams’ and Donald Ross’ Revising Mythologies: The Composition of Thoreau’s Major Works, as shown below.

A valuable resource but not the most intuitive.

Our main goal was to expand upon this information and put it in a more visual, interactive format. To do so, we used TimelineJS. This service allows users with little technical knowledge to create an attractive, intuitive timeline complete with media and layered information. TimelineJS provides a template for a Google spreadsheet, then generates embed code for the published timeline so that users can place it on a website.

An example of what entries in the spreadsheet look like.

Our group split the categories into six parts, and we initially created six different spreadsheets of data. Alexa had the biography, Cassie had the journal entries, Gabe had the lectures and articles, and Holly had A Week and the Walden versions. Our text was drawn from a variety of sources (the most helpful to us being Walter Harding’s The Days of Henry Thoreau), and the images were generally licensed for use under Creative Commons and gathered from Flickr or Wikimedia Commons. If possible, Gabe linked articles and essays written by Thoreau into the timeline using PagePeeker, which provides an image and link of the website he found the essay on. Our main obstacles were working with TimelineJS itself – figuring out how to enter dates without knowing specific days of month, where to source our information, dealing with invalid image links, and tagging information that fell into different categories.

We tagged our entries in order to create a six-layered timeline.

In the end, we combined our six timelines into one monstrous spreadsheet dubbed “The Master Timeline.” Once this was published and fine-tuned, we had a visually intriguing and well-organized timeline that allows readers to easily connect different areas of Thoreau’s life to a certain time period. Building this timeline not only pulled together some of the technological skills we had learned over the semester, but gave each of us insight into Thoreau’s life and work.
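The merge was done by hand in a spreadsheet, but the same step can be sketched in Python to show what combining and tagging involves. The field names (`year`, `month`, `day`, `headline`, `tag`) are placeholders rather than TimelineJS's actual template headers, and the three sample entries are only illustrative.

```python
def merge_timelines(timelines):
    """Combine per-category entry lists into one list, tagged and sorted by date.

    `timelines` maps a tag (e.g. "Walden") to a list of entries. Each entry
    has a "year" and may omit "month" or "day" when the exact date is unknown.
    """
    merged = []
    for tag, entries in timelines.items():
        for entry in entries:
            merged.append({**entry, "tag": tag})
    # Missing months or days sort as 0, so vaguer dates come first within a year.
    merged.sort(key=lambda e: (e["year"], e.get("month", 0), e.get("day", 0)))
    return merged

sample = {
    "Biography": [{"year": 1845, "month": 7, "day": 4, "headline": "Moves to Walden Pond"}],
    "Walden": [{"year": 1854, "month": 8, "headline": "Walden published"}],
    "A Week": [{"year": 1849, "headline": "A Week published"}],
}
master = merge_timelines(sample)
```

Treating a missing month or day as optional is one way to handle entries whose exact date is not known, one of the obstacles mentioned earlier.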

(Holly) A Week and Walden: The timeline allowed me to pull together various aspects of the class – the various readings, the fluid text edition, the significance of the manuscript changes – and really illustrate the development of Thoreau’s literary works. Tracing the progress of Thoreau’s two books showed me how much Thoreau’s writing was impacted by his life experiences and the ups and downs of his literary career. What struck me most was how prepared Thoreau seemed to be for the publication of Walden after finishing and working on publishing his first book, A Week on the Concord and Merrimack Rivers. As we know, it would be many years and manuscript changes later before Walden was actually published, and this was (arguably) partially due to the failure of A Week. What would Walden be had Thoreau’s first book sold well on the market? The timeline is a valuable tool that offers a big-picture perspective on how Thoreau’s life experiences interacted with his writing. Studying the intertwining nature of some of the entries, for example, can show how certain events may have shaped others, as is the case with A Week and Walden.

From the Walden timeline.

(Alexa) Biography: When creating my timeline on the biography of Thoreau’s life, I certainly found that Thoreau was a very exploratory person. He made multiple visits to Cape Cod and Maine, as well as Hill & Plymouth in Massachusetts for exploration purposes and to gather information for his journals. He also took the opportunity to travel to Canada mainly because the train ticket was very cheap. Along the course of Thoreau’s life, I found dates in the biography relating to what we found through reading Walden: when the cabin was being built at Walden Pond, when Thoreau lived at Walden Pond, the date Thoreau spent the night in jail, and when he returned to his home in Concord. I also found it interesting that this timeline helped show a different perspective on Thoreau not really seen when you read Walden. There was an event on the timeline dealing with the Burns Affair, for example. After researching this, it turns out this had to do with an escaped slave on trial, Anthony Burns, and Thoreau was protesting the return of Burns to his slave owner. I found some great detail about Thoreau’s opinion on slavery.

(Cassie) Journals: This task was challenging because it required me to organize the two journal sections from the original timeline, and then figure out where each entry recorded is located in the journal. The journal entries ended up being separated into two different sections, the 1906 and the Princeton/Mss versions. Although there was a bit of overlap, for the most part, where one journal ends, the other journal begins. In both of these versions, Thoreau manages to stay consistent with narrating the little things that go on in his day-to-day life. By organizing these entries, I was able to learn more about Thoreau’s thought process. When Thoreau was deciding to leave “high society,” for example, I was surprised to learn that he had much more going on in his life than just this desire to “live deliberately.” I also found that he left to “cure” his writer’s block. He talked about his fear that if he did not leave, he would never be able to finish his writings. In some ways, Thoreau may have surprised himself with how much his departure from society changed him. This realization made it easier for me to be able to connect the writing with the human. This is partly why the timeline itself is important; it connects all the parts of Thoreau’s life to give an impression of the real man, not just a series of disjointed writings. While reading these entries and figuring out a timeline of the feelings, thoughts, and emotions reflected in them, I was better able to understand not only the premise for Walden but also the way that Thoreau’s brain works.

(Gabe) Lectures and Publications: My portion of the timeline was dedicated to recording the dates of the lectures that Thoreau would give on his various works across the Northeast, as well as the publication of those works beginning with his first lecture in May of 1835. I think the thing that most interested me about this reconstruction was how it demonstrated how hard it is to reestablish the past – so much of what I inputted was only known about from an off-handed mention in a journal entry of Thoreau’s, and because of that we often know that a lecture on a subject occurred but cannot give an exact date, or know Thoreau lectured on a certain date but are unaware of what it was on. I think that really demonstrates the necessity of linking the digital world with the humanities, so that in the future we do not have to piece together fragments in order to gain only the shadow of an understanding of our context, but can preserve it whole for the future.

March 27, 2015 by Holly Gilbert - 1 Comment

How Coding is Affecting Citizenship

Digital tools, as we’ve been learning, are a valuable asset when it comes to studying the humanities. With this in mind, I wondered, what other unexpected areas are similarly touched by technology? As it turns out, there’s a concentrated effort to transform the government – in all its massive, extensive, and inefficient glory – with open-source programs and teams of everyday coders. Code for America, founded in 2009, is a nonprofit organization that seeks to change the government with the use of technology.

This organization focuses on working with local governments. It enlists technologically adept “fellows” to work in partnership with local governments for a year, in an effort to improve health, economic development, and safety & justice. In addition, it sponsors volunteer brigades, a network of interested government workers, and a few other groups in order to make the government more technologically minded.

In this TED Talk video, Jennifer Pahlka talks about the vision Code for America has for government and citizenship. She makes it clear that embracing open source technology and creating apps that encourage people to take on more civic responsibilities can have a huge impact on the relationship between government and citizens.

While I was browsing through some of the applications Code for America has created, I found a few great examples of the different ways in which technology can impact us. For instance, take the “Public Art Finder” app, which is currently available in five U.S. cities. It allows users to find and learn about public art using a map interface, thus supporting local art and bringing interested citizens to it. I also looked at Boston’s “DiscoverBPS” app, which offers parents information on the admittance requirements, data, and test scores of area schools they might consider for their children.

What really stood out to me was how open all of these resources are. While I don’t know nearly enough about technology to understand how these apps work and how they can be spread, it is clear that this organization provides anyone looking at these programs with their codebases and instructions (albeit complicated ones) on how to employ them in your area. All of the information is accessible and open, serving as an example of the kind of change that can be made with the use of digital tools.

February 18, 2015 by Holly Gilbert - No Comments

Music as a Fluid Text

Non-percussionists may even find some comments disturbing.

In music ensembles, I’m given pages of marked-up, wrinkled pieces. People cross out entire measures, add in parts that didn’t previously exist, or change the dynamics and tempo of the piece without much explanation. When the piece gets to me, possibly after years of use, I get to see the many different interpretations of one musical composition and add my own comments.

As Casey has posted about in reference to Shakespeare, edits can change the meaning of a single text, and this is true with music pieces too. Take these two videos of cellists taking on the same piece, for instance – one performer uses a traditional approach, while the other incorporates beatboxing. Who is to say that one version is more valid than another?

Every time a piece is performed, it’s likely to be interpreted slightly differently by each individual conductor and player. Often, we learn about the composer’s original intent before playing the piece, but so many small changes and revisions are made each time it’s performed that you could say there are countless versions of every musical “text.”

In my opinion, the collaborative nature of a music score is comparable to digital humanities. I think of a music score as a fluid text, much like the digital Walden texts we use that allow readers to trace the various manuscript changes or leave their own thoughts in the margins. As Jack Stillinger argues in “A Practical Theory of Versions,” textual pluralism (what he calls the idea that every version of a work should be considered) is becoming more possible with digital efforts such as these. With music pieces, glancing at a marked-up copy of sheet music or listening to different performers can show you a wealth of interpretations that you may not have considered on your own. Working with digital literature seems to have similar benefits – I not only learn from Walden itself, but from the record of its changes over time and my fellow classmates’ varied thoughts about the text.