Category: Self-reflection

Data-Mining Walden: Tools for Literary Analysis

Henry David Thoreau had a fraught relationship with technology. As we discussed in our presentation, it is difficult to tell whether he would be on board with our digital projects regarding his work. What we can say for sure is that the technologies we have engaged with this semester have allowed us to read his book, Walden, as deliberately and as reservedly as it was written. By apprehending his text in the digital dimension, we achieved new insights into the way Thoreau thought about place and how he crafted his thoughts into writing.

Melissa, Sean, Cal, and Emma each took a chapter to mine in order to track the language of place and its development throughout the text. This required downloading and installing some software with the help of Kirk Anne and Dr. Schacht. Brianne worked on answering the “so what?” question by analyzing the data collected by the other group members. We worked with the Natural Language Toolkit (NLTK) and spaCy, both of which allowed us to mine for certain words and types of words. However, each proved to have its own limitations within each chapter. We found that spaCy was better suited to Cal’s mining of “The Ponds,” whereas NLTK was more helpful for Melissa, Sean, and Emma.

Zooming out, data mining a text such as Walden did not come without challenges. Whether we worked on the virtual machine or the local server, Python proved to be a very demanding language, one with a steep learning curve that kept us guessing much of the time. Similarly, NLTK and spaCy had to be downloaded directly to our devices in order to accomplish the task at hand. It became pretty clear that while digital tools can often make reading easier, learning the tools necessary to do so is anything but simple. Still, when grappling with the limitations of all of our tools, we seemed to be simultaneously addressing larger questions about the utility of technology, just as Thoreau does in Walden.

Nevertheless, the technology proved indispensable for our project because it helped us to expedite the mining and reading process. Python, the language we used to learn more about Walden, allowed us to operate on the text, while spaCy and NLTK provided a bank of resources that we could apply to the chapters we each chose. Each tool gave us a general sense of place, which we followed up with closer readings. We were able to distinguish clearly between the broadly spatial chapters (“The Village” and “House Warming”) and the specifically geographic ones (“The Ponds” and “Conclusion”). Whether he was talking about physical places or metaphorical spaces, as in headspace, Thoreau constantly framed his thinking through place-specific language. This sort of “mapping” truly makes Thoreau into the “Surveyor of the Soul” that Huey Coleman claims him to be. His attention to the local and the distant, from Concord to Siberia, demonstrates both the interconnectedness that technology in the 19th century was making possible and the expansive reach of an inner geography, a soul whose territory outran the map.
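To make the step from mining to close reading concrete, here is a minimal sketch of a keyword-in-context search, the kind of lookup that lets a reader jump from a word hit back to the surrounding sentence. It uses only Python's standard library rather than NLTK or spaCy, and the function name `keyword_in_context`, the `window` parameter, and the sample sentence are illustrative assumptions, not our group's actual code.

```python
import re

def keyword_in_context(text, keyword, window=3):
    """Return each occurrence of `keyword` with `window` words on either side."""
    # Crude tokenizer (an assumption): runs of letters and apostrophes.
    # NLTK and spaCy tokenize far more carefully than this.
    tokens = re.findall(r"[A-Za-z']+", text)
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            hits.append((" ".join(left), token, " ".join(right)))
    return hits

# Hypothetical sample sentence written for this sketch, not a quotation from Walden.
sample = "We walked from the village to the pond, and from the pond back to the village."

for left, word, right in keyword_in_context(sample, "pond"):
    print(f"{left:>25} [{word}] {right}")
```

Running the same function over a plain-text chapter with words like "pond," "village," or "Concord" would give a list of passages to return to for closer reading.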

Just as some of Thoreau’s themes exceed the scope of a geographically specific reading, so too did our task at hand exceed the capabilities of some of our tools. One thing our group really wanted to stress in our presentation is the importance of validating failure in digital projects. All of the setbacks, miscues, and limitations we faced in engaging with Jupyter Notebook, Atom, Python, Anaconda, spaCy, NLTK, and beyond were as useful to thinking about the digital humanities as our successes with each of these tools. When we encountered errors in our work, we were forced to ask why. This moment of self-reflection was critical for doing digital work because of the knowledge that stood to be gained by asking questions about the tools. Because we came to this class with a variety of digital backgrounds, it was very important that we move as a unit. Fortunately, the tools we used lent themselves well to collaboration and, ultimately, this project became about creating our own community space around Walden.

From his comparative measures of White and Walden Ponds, to his rambles through Concord, to his building of a house in the woods, and his reflections on place inward and outward, Thoreau was constantly attuned to the language of place. We too were attuned to language, constantly seeking the instances of geography in his text by moving through it digitally. Just as Thoreau spatializes his world in Walden, so too do we attend to space by tracking its relative importance throughout the book. By using digital tools, we were able to read Walden collectively, collaboratively, effectively, and deliberately.

Digital Humanities and Literacy

In class a couple weeks ago we started to have a discussion on TEI and XML. At the start of this conversation I had no idea what those acronyms stood for or what they meant. I came to learn that TEI stands for Text Encoding Initiative and XML stands for eXtensible Markup Language. Even after learning what these acronyms stand for I still don’t really understand what they mean or why they are important. We started talking about how TEI and XML add a rigorous structure to data and they take the shape of a tree, like a hierarchy. I still don’t completely understand what this means, but I found a connection between this and my Literacy Education course I’m taking this semester. We discussed how our ebooks and books in general also take the structure of a tree. The title could be seen as the trunk because it’s the base of what you’ll read later on and then as you go up the tree things get smaller such as the paragraphs, sentences, words, and individual letters. We also have to take into account the punctuation, spacing, and all individual bits of data. While creating our ebooks it’s important to represent that data and be able to recognize it. I found a relation between this and my literacy course because on the first day of class the professor showed us a picture of a bunch of symbols and asked “What do you need to know in order to read this?”

 

Some of the answers that we came up with were what each symbol stands for, what sound is associated with each symbol, how one symbol sounds compared with two of the same symbol together, and that you have to read left to right. There are five pillars of early literacy that are essential to learn in order to be successful in reading and writing. They are: phonological awareness, phonics, fluency, vocabulary, and comprehension. In my literacy course we focused on phonological awareness and phonics. Phonological awareness is the general appreciation of how language can be divided into its components. For example, we speak in sentences. Sentences can be broken down into words and words into syllables. Just as TEI and XML have their own bits of data that people have to recognize and understand to be successful, children have to recognize and understand the small bits of language in order to read and write.
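The tree idea can be sketched in a few lines of code. The following is a minimal illustration using Python's standard `xml.etree.ElementTree` module; the element names (`book`, `title`, `paragraph`, `sentence`, `word`) are made up for this example and are not real TEI tags.

```python
import xml.etree.ElementTree as ET

# Element names are hypothetical, chosen to mirror the book hierarchy; they are not TEI.
book = ET.Element("book")
ET.SubElement(book, "title").text = "A Tiny Book"
paragraph = ET.SubElement(book, "paragraph")
sentence = ET.SubElement(paragraph, "sentence")
for w in ["We", "speak", "in", "sentences"]:
    ET.SubElement(sentence, "word").text = w

def outline(element, depth=0):
    """Return one indented line per element, from the root down to the leaves."""
    label = element.tag if element.text is None else f"{element.tag}: {element.text}"
    lines = ["  " * depth + label]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

print("\n".join(outline(book)))
```

Each level of indentation in the printed outline is one step further out along a branch, which is roughly how a sentence breaks down into words in the literacy example.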

Blog post #2: What I learned, what I didn’t learn, and what I hope I will learn

“By the time I respond to this post at the end of the semester, I hope to have a greater proficiency in html and a better understanding of what can be done with a VM and why we should want to do these things. But mostly, I want to be able to create fuller answers surrounding the question of the digital humanities’ value.”

-Me, earlier this semester.

Part I: Reality Check


Democracy and Digitization

Like the human brain or the deepest parts of the ocean, the potential for discovery in the digital age seems boundless, especially to someone new to computing like me. Literature and Literary Study in the Digital Age has provided me with keys to locks on doors that I never even knew existed. The technical tools and languages fascinate me, how they command my computer to do things I never thought possible. However, I want to focus on how these technical things build a sort of digital democracy and how this might act as a model for other social environments. We have learned that most of what makes the internet work is open source and free to use/observe. Granted, editing the web can be limited by administrative privileges, but if I learned anything, it is that I am more in control than I thought when it comes to shaping my computing experience.

Of the applications of these technical tools and concepts, The Reader’s Thoreau is the best example of the sort of democracy I am talking about. This community, in which Thoreauvians can exchange questions and ideas about his works, is a microcosmic formation of democracy made possible by the computer. Apprehending a plain text version of Walden, raw and unbound from the material book, gives readers access to the words at a level beyond that of the book. Plain text and plain-text editing with XML or HTML make things like CommentPress possible. Digitizing Walden has not only brought the text to more readers, it has engaged them in conversations with other readers. Here, then, is an example of how the technical can perform the conceptual, how digitization can democratize. After working with XML and HTML in the fall to digitize Yeats, I ultimately wanted all of my digital humanist work to surround this core issue, the democratization of information. Little did I know that the internet is set up perfectly for this type of work.

In my investigations of Lessig and Free Culture it became clear to me that computers are the backbone of what Lessig calls “remix culture.” Markup languages like XML and HTML are instructive and thus can produce and reproduce texts that shed new light on old words. Similar to riffing in music or stigmergy in organizational theory, these languages allow developers (citizens of the web) to repeat and revise content in new and interesting ways. Lessig writes, “democratic tools gave ordinary people a way to express themselves more easily than any tools could before” (33). Just like a camera, the computer allows people to take control of their reality, to revise and remix it to their liking. This makes the internet rich in texture and vibrant in culture. It reflects what is so good about democracy and it relies on technical copying and revision. This copying and revision happens, for us, at the command line, where we have been spending some time this semester. We can participate actively in the process of making and remaking by directly accessing our computers’ internal structure. Knowing the technical hierarchy gives each of us the chance to govern ourselves, which is both fundamental to democracy and vital to self-preservation in the hyper-surveillance culture we live in today.

True, the accessibility computers provide people can be used for harm. We are living in an era of “memetic warfare,” where hate can be propagated through the exact same methods of copying and revision. Open sourcing the internet is always at risk of this. Trolls on YouTube and Wikipedia will constantly disrupt the ideal digital democracy, just as corruption and scandal will plague our own democracy. However, the moment we attempt to purify this democracy by placing tight restrictions on spaces like Wikipedia and YouTube we sacrifice that very same democracy. In my directed study with Dr. Doggett, we are talking about this precise issue. The theorist we are reading, Slavoj Zizek, would say that to purify democracy is actually a totalitarian move. Thus, we must preserve the aberrations and deal with hate quickly and effectively. Wikipedia does this by running a “Talk” page alongside each entry, a separate HTML file for people to discuss and suggest changes to each page. It relies on a democratic schema to self-organize and create good.

Similarly, we have seen both sides of computer-as-society with The Reader’s Thoreau. We have engaged in a rich conversation about Walden all semester with each other and readers around the world. Blogging and commenting have fostered a community that exemplifies what we should strive for on and off the internet. We have also seen individuals penetrate the community looking to cause harm (I am referring to the woman asking for money). However, thanks to the self-organizing principles of the internet and some quick action from the site’s administrator, the community was able to move past this and get back to reading deliberately.

All of this has been made possible by a hyperlinked internet that allows users to move freely between data points and information. As Jeffrey Pomerantz points out, the potential of an HTML file is the precise reason why we have the internet. This is the underlying technical structure of what makes the computer a democratic tool. Texts connect to other texts, which connect people to texts and people to people. This is probably the most important thing I will take from this class. The computer’s ability to convene more and different people around a text, inviting new perspectives always, intrigues me as a student and excites me as a person. I want to take the digital humanities into my education going forward, as it has proved so helpful in considering the ethics of writing, something I think about constantly. In short, the technicalities of digitization have prompted me to think in new ways about things that have always been important to me. By continuing in the pursuit of discovery, I will continue in my pursuit of democracy.

The War of the STEM and the Human

STEM and Humanities majors, engaging in brutal warfare

There is a war currently being waged between STEM and humanities majors. This is not a violent conflict; no blood has yet been shed. The only casualties are those who become polluted into thinking that either field is inherently better than the other.

The common perception is that STEM majors are intelligent but rigid, unappreciative of art and unable to see the “human” factor. This is contrary to the humanities major, who is wise and thoughtful but so focused on their particular niche that they lose sight of what can be truly useful to society. Such toxic mentalities are rampant in college circles, as students stick to their clustered cliques, tut-tutting the other side for just not getting it.

Into this open vacuum of stuffy jingoism comes this strange little class: the Digital Humanities. And into this strange little class came a student who has long held the belief that those STEM majors just don’t get it. He had heard of this strange little class with its oxymoronic title and he just had to check it out. So this poor student, who has always held a certain degree of contempt for those STEM majors and their stupid intelligence, finds himself learning that the reason he has access to all of his beloved media, his art, his literature, is because of a few lines of code and some strips of a metal he probably couldn’t even pronounce the name of.

That poor student, being me of course, finds himself in an awkward position when at the end of the course he actually thinks that all of these technical STEM things are all of the stuff he always thought was monopolized by the humanities. Languages are just issues of communication solved through the grouping of symbols and sounds. Every time a writer goes back and edits their existing material, trimming away the lines that don’t work, they are engaging in the same behavior as the scientist or mathematician or computer technician who is untangling a complicated piece of code, or discovering a black hole. What is in our books? A rigid structure of chapter>page>paragraph>sentence>word>letter. It’s just the same as in coding or mathematics. Librarians, those brave guardians of the humanities, use coding and mathematical processes of data collection, as we learned in “Metadata”. So… I guess problem solving isn’t just in the realm of the STEM majors.

Towards the beginning of this class, we discussed what the study of humanities was, and it’s a question I keep returning to even as we trudge into the muck of HTML and source code. What we do in class, picking through backchannels on networking websites and adding brackets to couplets of letters, that is part of the humanities. It could also be considered part of the STEM field. This class helped to dispel the notion that there is a binary existence of art on one side and math on the other, only separated by a thin demilitarized zone where business majors eke out some sort of meager existence. Rather, it is a nebulous field where both exist, each borrowing elements most think belong to the other and transforming them into what we recognize as its pure form. I guess “Digital Humanities” isn’t such an oxymoron.

Whatever I say, I’ve still got my eye on you, STEM majors.

Emilio’s Blog Post #2

I knew about the existence of languages like HTML and XML long before taking this class. However, it seemed like such a foreign concept to me, as if one would need to be a genius with technology in order to make anything with them. As we moved further along in the semester I learned that markup and markdown languages aren’t a mystery; they’re tools, and just about anyone can learn to use them if they have the passion for it. Granted, I myself have a long way to go before I can confidently work with TEI in my group project, but I am confident that by the end of the semester it will be no mystery.

What I think is important about markup languages is how valuable they are as tools. As a blacksmith cannot work without a hammer, a web designer cannot work without a universal markup language. Unlike a hammer, XML is limited only by the creative mind of its user. We have seen in class some of the practical ways in which XML and other languages have been used. Some of these include:

  • Distinguishing differences, such as what was included, left out, or changed, in writings such as the Gettysburg Address and in Walden.
  • Highlighting certain words in specific colors with TEI, such as proper nouns being blue, in order to determine where something happened, at what time, and telling us who was involved.
  • And building a website to showcase information that would otherwise be hard to find, as seen with Omeka.

Granted, on a conceptual level, most of what can be done online can also be done offline; however, what I have come to appreciate is how much easier and more focused studying literature can become with technology. In a novel we read for this class, we learned that when it came down to computing, the hardest part was working through the equations. The equations themselves were not difficult to solve; the burden was the effort needed to plug in each and every number by hand, and the amount of time that took away from research into a topic. With technology, the most laborious factor gets taken away, and this also connects to our study in Digital Humanities: without the difficulty of searching through varying sources of literature, Walden for example, we can instead focus on the differences in those sources, which can then be documented through a markup language.

This knowledge has helped me better understand the use of information. Honestly, I haven’t changed much as a person, but overall I must say that I am more inquisitive about the ways in which information is used. The idea that we can compare the number of times the words “tree” or “trees” are used in Walden with the number of times people are mentioned, and then interpret meaning from that, is fascinating.
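As a rough illustration of that kind of comparison, here is a minimal sketch in Python using only the standard library. The function name `count_terms`, the two word lists, and the sample text are my own assumptions for the example; a real study would read a plain-text copy of Walden from a file and would need a more careful list of "people" words.

```python
import re
from collections import Counter

def count_terms(text, terms):
    """Count how often each word in `terms` appears in `text`, ignoring case."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return {term: counts[term] for term in terms}

# Hypothetical sample text written for this sketch, not a quotation from Walden.
sample = "The tree stood alone. Two men passed the trees; one man stopped by a tree."

nature = count_terms(sample, ["tree", "trees"])
people = count_terms(sample, ["man", "men", "people"])

print("nature:", nature, "total", sum(nature.values()))
print("people:", people, "total", sum(people.values()))
```

The counts alone do not say what the difference means; they only point to where an interpretation might begin.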

Connections

I remember on the first day of class, I told my partner I was pretty proficient in computers. All those days I spent playing tech support for all my older relatives, and being the go-to graphic design girl in my business class in high school, gave me a faux sense of confidence when it came to computers. My first sign of trouble was that I initially thought learning how to use the command line had to do with making graphic lines. VirtualBox was a bit of a shock at first, considering the expectations I had. On top of that, even though I’m concentrating in English, I guess I had never given much thought to linguistics itself. The relationships among linguistics, information, and technology have been presented to me over the course of this class in a way I could never have fathomed.

The essence of the biggest lesson I have taken from this class is this: information, in both linguistics and technology, is not a creation of this current digital age. There are decades and centuries of this coexisting relationship behind us. I had personally been clueless about this, just listening to the elders who talked down to my fellow screen-addicted peers, saying that this was only a product of our generation. Looking at Walden alone opened my eyes to this, especially when I had to dig to try to make connections to our class. I remember being given that very task, sitting there thinking, “What? Walden is the polar opposite of all the things we do in class, there’s no way to make a meaningful connection.” But as I began to dig, the puzzle pieces started to connect as I realized how heavily the influence of that relationship was present in Walden, written over 100 years ago. Going forward, I think it’s worthwhile to keep looking for this everlasting relationship as our current technology continues to grow and boom. For me as a future teacher, teaching students to make meaningful connections between a text and the world around us is a key point of literature itself. I think it will be important to give these sorts of challenges to open students’ eyes to all the relationships and connections around us.

On the technical side, one of my favorite hands-on skills I have learned is using Atom. As I journal in it every day and maneuver through the wonders of plain text, I can’t help but think of the long-term use. As I move forward in my teaching career, knowing how to use a program like Atom to save text in different formats makes every future worksheet, e-journal, and lesson plan easier and more efficient to do on a computer. Not to mention, this is a skill I could implement with upper elementary students. Especially if digital technology continues to advance at its current rapid rate, skills like markdown and plain text will only become more handy. Having an understanding of a deeper use of the technology we use daily makes me more versatile in the classroom.

Just last week, when I applied for a job at a local library, I wrote in the special skills box that I was currently taking this class that was teaching me about metadata and encoding. I was already comfortable in the humanities, but now my ease with my new computer knowledge makes my skill set more functional and resourceful than ever. I’m glad not just to know the surface-level function of these machines, but to be able to navigate my machine in relation to my studies, and even far beyond those.

Timeline JS

As I said in my first blog post, I had no idea this course was going to consist of understanding how our computer works and things such as coding. I guess I didn’t even know it was going to be a Digital Humanities course. Therefore, when I walked in on the first day I was taken aback and a little worried the content would be over my head. However, after taking the time to listen in class I have added new knowledge about technology to my brain that I never thought I’d learn or want to learn. I still don’t know all there is to know about technology, especially with the command line in VirtualBox, but one thing that I have learned that has been particularly interesting is Timeline JS. In class we learned how to add in different information, images, and links to additional sources to create a timeline through Timeline JS. The first thing that came to my mind, as a future educator, is how I can implement this in my future classroom. Timeline JS would be perfect for a Social Studies lesson where the students could take information they have learned, such as the years of major wars fought around the world or dates of major events during the Civil Rights movement, and input the dates on Timeline JS to create their very own timeline. It can also be used for other subjects. For example, English teachers could have their students make a timeline of important events that occurred in a book. I used to do this in school, but it was on paper. Technology has become such an important aspect of education today and it’s only going to progress further, so the more we can incorporate it into our schools, the better off our students will be in the future. Timeline JS has also helped me fulfill the Walden Project. I now know how to create a timeline to show the stages of composition of Walden in relation to other events in Thoreau’s life and important events taking place in the world around him.
I feel fortunate to have gotten the opportunity to take this course because I now have a new understanding of how technology works. I’m looking forward to using this new information as I move forward in my career and teaching my future students about Timeline JS.
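Timeline JS is usually fed from a Google spreadsheet, but it can also read a JSON file. As a minimal sketch, assuming the JSON layout described in the Timeline JS documentation (an `events` list whose entries each have a `start_date` and a `text`), a few lines of Python can turn a list of dates into that file. The helper name `make_event`, the timeline headline, and the two sample events are illustrative choices, not part of our class project.

```python
import json

def make_event(year, month, day, headline, text=""):
    """Build one event dictionary in the shape Timeline JS reads from JSON."""
    return {
        "start_date": {"year": year, "month": month, "day": day},
        "text": {"headline": headline, "text": text},
    }

# Two well-known dates, used only as sample data and entered out of order on purpose.
events = [
    make_event(1854, 8, 9, "Walden is published"),
    make_event(1845, 7, 4, "Thoreau moves to Walden Pond"),
]

# Sort chronologically so the file reads in order.
events.sort(key=lambda e: (e["start_date"]["year"],
                           e["start_date"]["month"],
                           e["start_date"]["day"]))

timeline = {"title": {"text": {"headline": "Composing Walden"}}, "events": events}
print(json.dumps(timeline, indent=2))
```

Students could fill in the list of events themselves, which keeps the focus on choosing and ordering the dates rather than on the tool.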

Technical and Conceptual Learning as a Vehicle for Change

With more than half a semester’s worth of learning and work behind me, it is safe to say that I have a great deal more knowledge now than when I began English 340 in January. The scope of skills and concepts I learned about and learned from is extremely wide. Indeed, from learning the basics of the open-source code editor application, Atom, to using the command line on my virtual machine to generate a program that creates journal entries in Atom, I have made great strides in my knowledge of computers, computing, and coding. However, I believe that the most interesting, most life-altering thing I have learned in English 340 is how to understand, write, and utilize HTML.


Gradual Growth

Looking back on how nervous I was walking into class during syllabus week, not knowing what to expect, now makes me laugh. I find it amazing how much I have learned over the course of a few months and how my perspective on technology has changed overall. Although all of this new information can be overwhelming at times, it has given me the courage to apply to jobs on campus such as SA Tech and the CIT desk so that even after this class comes to an end, I will continue to learn and broaden my knowledge.

Change is something that I have never been comfortable with, and I felt like taking this class was way out of my comfort zone and a huge risk. I was not sure if I was capable of doing well in it, and how little I understood about computers worried me. With technology constantly advancing, I knew I had to become more knowledgeable about and familiar with the gadgets surrounding me. I have wanted to be a teacher practically my whole life, so what kind of disservice would I be doing to my students by not having them engage with technology? Implementing the use of technology is so important in the classroom because it is an effective approach to connect with students who may learn differently. In high school, I had a teacher who insisted on using one of those old projectors, even though there was a smart board in the classroom. I found it extremely difficult to stay focused and engaged in any lesson and easily got distracted. After that, I promised myself that when I become a teacher one day, I will create my lessons according to the needs and wants of my students.

Nowadays, there is such a negative stigma surrounding technology and all of the harmful impacts it has on us. Although I am not disagreeing with this statement as social media has been proven to be detrimental to one’s mental health, I feel that because we dwell on the negative, we do not appreciate the benefits that technology has brought us. We do not give enough credit to the cures and treatments that new technology has provided for those with diseases and illnesses. We are too busy focusing on what we do wrong, rather than what we do right. Like anything else, I believe it is essential to find a good balance. I do not think we should rid ourselves completely of technology but instead monitor how much time we spend on social media.

After doing some more research on technology and its impact on mental health, I decided to go on a social media cleanse for a week. I have to say, I felt so out of the loop. Social media is my prime form of communication, which made it even tougher to give up. I also do not watch the news and get all of my news updates from apps such as Twitter or Instagram. One of the organizations I am a part of here at Geneseo uses the apps Facebook and GroupMe to inform us about events as well, which made it that much harder to keep up with what was going on. I have to say I am definitely addicted to my phone, and giving up social media was not easy. By the time the week came to an end, I had become more okay with it and actually felt kind of free. I believe that a social media cleanse is something that everyone should do once in a while.

One thing that I know now that I did not know at the beginning of the semester is what XML is. XML is a markup language with a rigid structure that lets us mark up textual data in ways that are specific enough for computers to read. This rigid structure is important because computers do not handle loose data well. XML enables us to add rigorous structure to textual data. I liked how Professor Schacht compared XML’s structure to the structure of a book. Books are loosely structured data. Although this may seem like such a little takeaway, it is something that I would have never been able to comprehend before. This is a milestone for me since I came into this class knowing nothing.
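Here is a small sketch of what "rigid" means in practice, using the XML parser that comes with Python. The function name `is_well_formed` and the tag names in the two snippets are made up for this example. The parser accepts the first snippet, where every tag is opened and closed in order, and rejects the second, where a paragraph tag is never closed.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup):
    """Return True if `markup` parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

# Hypothetical snippets with made-up tag names.
strict = "<book><chapter><p>Every tag is closed.</p></chapter></book>"
loose = "<book><chapter><p>The paragraph tag is never closed.</chapter></book>"

print(is_well_formed(strict))
print(is_well_formed(loose))
```

A human reader would understand the second snippet without any trouble, which is the difference between loosely structured data and the strict structure a computer needs.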

I am glad that our Walden projects give us the opportunity to explore other websites and applications. This class is not set up like a traditional English class, since it includes a lot of hands-on learning that is enjoyable and interesting and keeps me engaged during class. I believe all English classes should incorporate some kind of technology.