Author: Paul Schacht

Paul Schacht is Professor of English at SUNY Geneseo.

Algorithmic Criticism and the Humanities

In a characteristically lively and thoughtful post, Katie Allen looks at some articles about computer programs that automate the evaluation of student writing. She eloquently expresses a concern that many in the humanities, myself included, share about the use of machines to perform tasks that have traditionally relied on human judgment. “Those of us who study English do so because we recognize literature to be an art form, and because we believe in the power of language to give shape to the world,” she writes. A computer can run algorithms to analyze a piece of writing for length and variety of sentences, complexity of vocabulary, use of transitions, etc., but it still takes a trained human eye, and a thinking subject behind it capable of putting words in context, to recognize truth and beauty.

Yet if we’re right to be skeptical about the capacity of machines to substitute for human judgment, we might ask whether there is some other role that algorithms might play in the work of humanists.

This is the question that Stephen Ramsay asks in his chapter of Reading Machines titled “An Algorithmic Criticism.”

Katie’s post makes Ramsay sound rather like he’s on the side of the robo-graders. She writes that he “favors a black-and-white approach to viewing literature that I have never experienced until this class… . [He] suggests we begin looking at our beloved literature based on nothing but the cold, hard, quantitative facts.”

In fact, though, Katie has an ally in Ramsay. Here is what he says about the difference, not between machines and humans, but more broadly between the aims and methods of science and those of the humanities:

… science differs significantly from the humanities in that it seeks singular answers to the problems under discussion. However far ranging a scientific debate might be, however varied the interpretations offered, the assumption remains that there is a singular answer (or set of answers) to the question at hand. Literary criticism has no such assumption. In the humanities the fecundity of any particular discussion is often judged precisely by the degree to which it offers ramified solutions to the problem at hand. We are not trying to solve [Virginia] Woolf. We are trying to ensure that the discussion of [Woolf’s novel] The Waves continues.

Critics often use the word “pattern” to describe what they’re putting forth, and that word aptly connotes the fundamental nature of the data upon which literary insight relies. The understanding promised by the critical act arises not from a presentation of facts, but from the elaboration of a gestalt, and it rightfully includes the vague reference, the conjectured similitude, the ironic twist, and the dramatic turn. In the spirit of inventio, the critic freely employs the rhetorical tactics of conjecture — not so that a given matter might be definitely settled, but in order that the matter might become richer, deeper, and ever more complicated. The proper response to the conundrum posed by [the literary critic George] Steiner’s “redemptive worldview” is not the scientific imperative toward verification and falsification, but the humanistic propensity toward disagreement and elaboration.

This distinction — which insists, as Katie does, that work in the humanities requires powers and dispositions that machines don’t possess and can’t appreciate (insight, irony) — provides the background for Ramsay’s attempt to sketch out the value of an “algorithmic criticism” for humanists. Science seeks results that can be experimentally “verified” or “falsified.” The humanities seek to keep a certain kind of conversation going.

We might add that science seeks to explain what is given by the world through the discovery of regular laws that govern that world, whereas the humanities seek to explain what it is like to be, and what it means to be, human in that world, as well as what humans themselves have added to it. To perform its job, science must do everything in its power to transcend the limits of human perspective; for the humanities, that perspective is unavoidable. As the philosopher Charles Taylor has put it, humans are “self-interpreting animals”: we are who we are partly in virtue of how we see ourselves. It would be pointless for us to try to understand what matters to us as humans from some neutral vantage outside the frame of human subjectivity and human concerns (“pointless” in the sense of “futile,” but also in the sense of “beside the point”). Sharpening our view of things from within that frame is precisely what the humanist is trying to do. If you tried to sharpen the view without simultaneously inhabiting it, you would have no way to gauge your own success.

The gray areas that are the inevitable territory of the English major, and in which Katie, as an exemplary English major, is happy to live, are — Ramsay is saying — the result of just this difference between science and the humanities. As a humanist himself, he’s happy there, too. He’s not suggesting that the humanities should take a black-and-white approach to literature. On the contrary, he insists repeatedly that texts contain no “cold, hard facts” because everything we see in them we see from some human viewpoint, from within some frame of reference; in fact, from within multiple, overlapping frames of reference.

Ramsay also warns repeatedly against the mistake of supposing that one could ever follow the methods of science to arrive at “verifiable” and “falsifiable” answers to the questions that literary criticism cares about.

What he does suggest, however, is that precisely because literary critics cast their explanations in terms of “patterns” rather than “laws,” the computer’s ability to execute certain kinds of algorithms and perform certain kinds of counting makes it ideally suited, in certain circumstances, to aid the critic in her or his task. “Patterns” of a certain kind are just what computers are good at turning up.

“Any reading of a text that is not a recapitulation of that text relies on a heuristic of radical transformation,” Ramsay writes. If your interpretation of Hamlet is to be anything other than a mere repetition of the words of Hamlet, it must re-cast Shakespeare’s play in other words. From that moment, it is no longer Hamlet, but from that moment, and not until that moment, understanding Hamlet becomes possible. “The critic who endeavors to put forth a ‘reading’ puts forth not the text, but a new text in which the data has been paraphrased, elaborated, selected, truncated, and transduced.”

There are many ways to do this. Ramsay’s point is merely that computers give us some new ones, and that the “radical transformation” produced by, for example, analyzing linguistic patterns in Woolf’s The Waves may take the conversation about the novel in some heretofore unexpected, and, at least for the moment, fruitful direction, making it richer, deeper, more complicated.
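To make the idea of computer-aided “pattern” finding a little more concrete, here is a minimal Python sketch of one kind of counting: ranking the words most particular to each speaker in a text by a simple tf-idf-style score. The technique is chosen here purely for illustration and is not presented as Ramsay’s own procedure; the function name `distinctive_words`, the speaker labels, and the sample passages are all invented, and the passages are not quotations from any novel.

```python
import math
from collections import Counter


def distinctive_words(passages, top_n=3):
    """Rank each speaker's words by a simple tf-idf-style score.

    `passages` maps a (hypothetical) speaker label to that speaker's text.
    """
    # Raw word counts per speaker (crude tokenizing: lowercase, split on spaces).
    counts = {name: Counter(text.lower().split()) for name, text in passages.items()}
    n_docs = len(counts)
    # Document frequency: in how many speakers' passages each word appears.
    doc_freq = Counter(word for c in counts.values() for word in c)
    ranked = {}
    for name, c in counts.items():
        # A word shared by every speaker scores zero; rarer words score higher.
        scores = {w: tf * math.log(n_docs / doc_freq[w]) for w, tf in c.items()}
        # Sort by descending score, breaking ties alphabetically.
        ranked[name] = sorted(scores, key=lambda w: (-scores[w], w))[:top_n]
    return ranked


# Invented sample text for illustration only.
passages = {
    "speaker_a": "the sea the sea and the light on the sea",
    "speaker_b": "the garden and the light in the garden",
}
result = distinctive_words(passages, top_n=1)
print(result)
```

The resulting list of words settles nothing by itself. In the spirit of the argument above, it is one more transformed version of the text for a critic to read, question, and argue about.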

At a time when those of us in the humanities justly feel that what we do is undervalued in the culture at large, while what scientists do is reflexively celebrated (even as it is often poorly understood), there are, I believe, two mistakes we can make.

One is the mistake that Ramsay mentions: trying to make the humanities scientific, in the vain hope that doing so will persuade others to view what we do as important, useful, “practical.” (Katie identifies a version of this mistake in the presumption that robo-grading can provide a more “accurate” — that is, more scientific — assessment of students’ writing skills than humans can.)

But the other mistake would be to take up a defensive posture toward science, to treat the methods and aims of science as so utterly alien, if not hostile, to the humanities that we should guard ourselves against contamination by them and, whenever possible, proclaim from the rooftops our superiority to them. Katie doesn’t do this, but there are some in the humanities who do.

In a recent blog post on The New Anti-Intellectualism, Andrew Piper calls out those humanists who seem to believe that “the world can be neatly partitioned into two kinds of thought, scientific and humanistic, quantitative and qualitative, remaking the history of ideas in the image of C.P. Snow’s two cultures.” It’s wrongheaded, he argues, to suppose that “Quantity is OK as long as it doesn’t touch those quintessentially human practices of art, culture, value, and meaning.”

Piper worries that “quantification today is tarnished with a host of evils. It is seen as a source of intellectual isolation (when academics use numbers they are alienating themselves from the public); a moral danger (when academics use numbers to understand things that shouldn’t be quantified they threaten to undo what matters most); and finally, quantification is just irrelevant.”

That view of quantification is dangerous and unfortunate, I think, not only because we need quantitative methods to help us make sense of such issues of pressing human concern as wealth inequality and climate change, but also because artists themselves measure sound, syllable, and space to take the measure of humanity and nature.

As Piper points out, “Quantity is part of that drama” of our quest for meaning about matters of human concern, of our deeply human “need to know ‘why.’”

Admin’s note: This post has been updated since its original appearance.

Net Gain?

Does the Internet bend towards a certain kind of politics? Democracy? Anarchy? Totalitarianism? Something else?

Is its basic tendency to promote the freedom and autonomy of its users? Rob them of their privacy? Cultivate a stance of critical detachment? Distract them into complacency?

Does it have no particular bent? Is it just a tool, capable of promoting whatever purpose the user puts it to?

The authors in our next set of readings engage questions of this kind. Although they all acknowledge various abuses to which the Internet is susceptible, they’re broadly optimistic about its overall impact. In the final chapter of Small Pieces Loosely Joined (not, unfortunately, one of the chapters you can read for free on the book’s website), David Weinberger suggests that “The Web’s movement is towards human authenticity” – and, consequently, away from “alienation.” In “The Wealth of Networks,” Yochai Benkler argues that the Internet’s networked structure (the same feature referenced in Weinberger’s title) tilts towards more autonomy in our relationship to culture, more power to find and assess information, more opportunity to engage in democratic deliberation, and more space for non-market and non-proprietary production (simply put, stuff made for love rather than money). The title of Clay Shirky’s book – “Here Comes Everybody: The Power of Organizing Without Organizations” – references not the structure of the Internet but its human corollary: the loosely joined individuals and groups whom the Internet enables to circumvent centralized political and economic organizations in the pursuit of shared goals. Again, the vision offered is one of greater freedom and autonomy.

The Sunday before last, in its SundayReview section, the New York Times published a piece by Jeremy Rifkin titled “The Rise of Anti-Capitalism.” Like Benkler, Rifkin sees profound, and profoundly liberating, consequences in a central economic fact about digital production: that the marginal cost of that production is near zero.

You’ll find a much darker view of the Internet, however, in the talk that Maciej Ceglowski delivered last month at Webstock 2014 in Wellington, New Zealand. Titled “Our Comrade The Electron,” Ceglowski’s talk doesn’t dispute what the writers above argue about the bent of the Internet’s architecture, but it asks us to consider the possibility that maintaining that architecture may just be too hard. And it asks us to contemplate the consequences if that turns out to be the case.

Ceglowski’s piece isn’t on the syllabus, but you’d be a fool not to read it. It’s thoughtful and lively, and the argument is embedded in some fascinating history.

When you’ve finished it — but only then — soothe yourself by listening to Pamela Kurstin play “Autumn Leaves” on the theremin.

The Gettysburg Address as Fluid Text

Digital Thoreau’s “fluid text edition” of Henry D. Thoreau’s Walden is so named in reference to John Bryant’s 2002 book The Fluid Text: A Theory of Revision and Editing for Book and Screen. Every text is fluid, Bryant suggests, insofar as it represents not the definitive articulation of a fixed intention but rather one entry in the record of an author’s evolving and shifting intentions. The full record of those intentions would involve, at a minimum, all of the author’s drafts, and perhaps even information about authorial decisions in flux between the moment a pen is raised and the moment it touches paper.

Some texts are more obviously fluid than others because we have more information about their genesis. Such is the case with Walden. And some are fluid not only because of their pre-publication history but also because of their post-publication history. An example of the latter is Lincoln’s “Gettysburg Address.”

The fluidity of this famous speech briefly became a matter of lively public discussion in 2013, its sesquicentennial year, when conservative media outlets expressed outrage over a recording of it made by President Obama. The reason for the outrage? Obama’s omission of the words “under God” from the final sentence.

As it turned out, Obama had given a historically faithful reading of one of the address’s five versions: the so-called “Nicolay copy,” sometimes referred to as the “first draft” of the address because it’s the earliest surviving manuscript copy and may have been the copy from which Lincoln read at the cemetery’s dedication on November 19, 1863. At the request of Ken Burns, Obama recorded the Nicolay copy as part of Burns’ Learn the Address project, which encourages “everyone in America to video record themselves reading or reciting the speech.” The project might also remind us that the post-publication fluidity of some texts (most obviously, perhaps, speeches and plays) is partly a consequence of their having been intended for performance.

Google Cultural Institute has a nice timeline of the address’s interesting textual history. It draws largely from the House Divided Project, a digital humanities Civil War project at Dickinson College to which Dickinson undergraduates have contributed, and where you can read all five drafts.

The Gettysburg Foundation also provides transcriptions of the five versions, highlighting the differences between them in boldface.

In ENGL 340 tomorrow, we’ll take these five versions and encode the differences between them in XML, using the critical apparatus tagset of TEI. Then we’ll display them side by side using the Versioning Machine and — if time allows, and if all goes well — Juxta in order to see how visualization tools can help us understand the fluid nature of one of our nation’s most important texts.
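To give a rough sense of what that encoding looks like, here is a minimal Python sketch that compares two readings word by word and wraps each point of difference in a TEI-style `<app>` element, with one `<rdg>` per witness. It rests on several assumptions: the two sample strings are simplified stand-ins for the Nicolay and Bliss readings of a phrase from the final sentence, not authoritative transcriptions; the witness ids `nicolay` and `bliss` and the helper name `encode_variants` are made up for illustration; and a real TEI file would also need a header and a witness list, which are omitted here.

```python
import difflib
from xml.sax.saxutils import escape


def encode_variants(wit_a, text_a, wit_b, text_b):
    """Return the text with differences wrapped in TEI-style <app>/<rdg> elements."""
    a, b = text_a.split(), text_b.split()
    # Align the two readings as sequences of whole words.
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Shared text is passed through unchanged.
            parts.append(escape(" ".join(a[i1:i2])))
        else:
            # Any difference becomes one <app> holding a reading per witness;
            # an empty <rdg> marks words that a witness lacks.
            rdg_a = escape(" ".join(a[i1:i2]))
            rdg_b = escape(" ".join(b[j1:j2]))
            parts.append(
                f'<app><rdg wit="#{wit_a}">{rdg_a}</rdg>'
                f'<rdg wit="#{wit_b}">{rdg_b}</rdg></app>'
            )
    return " ".join(parts)


# Simplified sample readings for illustration only.
nicolay = "that the nation shall have a new birth of freedom"
bliss = "that this nation under God shall have a new birth of freedom"
encoded = encode_variants("nicolay", nicolay, "bliss", bliss)
print(encoded)
```

With five witnesses rather than two, each `<app>` would carry up to five readings, and a pairwise comparison like this one would no longer be enough to align them. That is part of why we’ll do the encoding by hand in class and leave the side-by-side display to the visualization tools.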


Alone Together?

I’ve been enjoying people’s recent posts on the question of anxiety over technological change.

In contemporary social critics’ and social scientists’ efforts to grasp the cultural consequences of digital technology, one theme that has emerged is epitomized in the title of a book by Sherry Turkle. The book is Alone Together: Why We Expect More From Technology and Less From Each Other. Turkle is Abby Rockefeller Mauzé Professor of the Social Studies of Science and Technology at MIT and the director of the Initiative on Technology and Self.

Turkle’s book offers evidence for the idea that in some ways digital technology is driving us farther apart even as it offers new possibilities for making connections. So it wouldn’t be fair to say that her thesis is refuted by a few images at Kids These Days, the Tumblr maintained by Nathaniel Rivers, an assistant professor of English at Saint Louis University who studies rhetorical theory, blogs as pure_sophist_monster and tweets as @sophist_monster.

But the images do offer reason to be skeptical of the near-apocalyptic claims you sometimes hear about “kids these days” and their devices. (To be fair to Turkle, her own approach to this question isn’t in the apocalyptic vein but is in fact more nuanced.) I routinely hear it said that “kids these days” just aren’t capable of having “real” social interactions — only the “virtual” ones made possible by social media such as Facebook and Twitter.

As Oscar Wilde has Algernon say in The Importance of Being Earnest, “The truth is rarely pure and never simple. Modern life would be very tedious if it were either…”

Literature and Literary Study in the Digital Age, Spring 2014 Edition

“Literature and Literary Study in the Digital Age” is in its third incarnation at SUNY Geneseo as ENGL 340 – a brand-new course with its own place in the line-up of offerings under Geneseo’s new English major. The course began in Spring 2011 as HONR 206, Digital Humanities, and was then offered twice as ENGL 390, first as Studies in Literature: Literature in the Digital Age and then as Literature and Literary Study in the Digital Age.

The latest iteration of the course has its home in this space, a group blog for all students and faculty at Geneseo interested in digital humanities. The blog is part of a larger community organized as English @ SUNY Geneseo, a community powered by the open-source blogging platform WordPress and the open-source plugin Commons In A Box.

If you’re a student in the course, this is where you’ll be blogging this semester, following guidelines you’ll find on the page How to Blog Here. If you’re not a student in the course but you’ve joined the group, please join the conversation and follow the same guidelines.

This year the course coincides with the rollout of two projects at Digital Thoreau, a collaboration among SUNY Geneseo, the Thoreau Society, and The Thoreau Institute at the Walden Woods Project. The two projects are Walden: A Fluid Text Edition and The Readers’ Thoreau. Students in previous iterations of the course have contributed to both projects, and this year’s group will use the projects as resources and carry them further, while also continuing work on Digital Thoreau’s third project, The Days of Walter Harding, Thoreau Scholar.

I’m posting here from San Marino, California, where I’ve just spent the past two days in the Huntington Library with two Thoreau scholars who’ve been instrumental to Digital Thoreau and to the development of this course: Ron Clapper and Beth Witherell. We’ve been looking together at HM 924, the manuscript of Walden. More on that to come.