We can use numbers to better understand literature. That’s right: there are fruitful ways for readers to deepen their analysis of books that are computational and quantitative. Doing so can be really intimidating though, evoking terms like text analysis, natural language processing, and data science.
There have to be better ways to create entry points for readers to explore literature quantitatively.
After exploring this idea for a few years now, I have developed what I think will be an accessible and enjoyable activity for readers and English teachers alike. Allow me to walk you through the process for using my customized Shakespearean spreadsheets (which I affectionately call “readsheets”) to create charts that quantitatively present how collections of words and themes develop throughout Shakespeare’s plays.
OK. Let’s dive in.
Step 1: Count Every Word
We begin with this spreadsheet. It contains the frequency of every word used in Hamlet, excluding stop words (functional words like and, a, the, etc.)*. I like presenting the spreadsheet in wide form, with one row per word and one column per scene, for reasons that will become clear in a moment. As a result, the column headers indicate the Act/Scene of the play.
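For the curious, here is a minimal sketch in Python of what "counting every word" could look like behind the scenes. It is not the program used to build the readsheets. The stop-word list, the `count_words` helper, and the two short scene snippets are all assumptions chosen for illustration.

```python
from collections import Counter
import re

# Hypothetical, tiny stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "and", "the", "to", "of", "is", "that", "or", "in"}

def count_words(text):
    """Return a Counter of lowercase words in text, skipping stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)

# Toy stand-in for the play: scene label -> a single line, not the full scene.
scenes = {
    "3.1": "To be, or not to be, that is the question",
    "5.1": "Alas, poor Yorick! I knew him, Horatio",
}

# One Counter per scene mirrors one spreadsheet column per Act/Scene.
counts_by_scene = {scene: count_words(text) for scene, text in scenes.items()}
```

Looking up `counts_by_scene["3.1"]["be"]` then plays the role of reading one cell: the row for a word, under the column for a scene.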
Step 2: Identify Your Inquiry
In this example, I focus on using the spreadsheet during or after one has already read the play. It’s a way to analyze the text more deeply, to stoke text-based discussions and insights. (Though it is totally possible to use the spreadsheet as a pre-reading exercise as well.) In the case of Hamlet, it is not uncommon in high schools, for instance, to examine Shakespeare’s diction as it relates to death. For our purposes, let’s consider a version of such a prompt, but with a computational twist: Using both quantitative and qualitative data, tell me how Shakespeare portrays death in Hamlet.
Step 3: Create a Customized Table
Now that you know what you are interested in, it’s time to identify all of the Bard’s deathly diction AND to tally how many times each word is used in each scene. It sounds overwhelming, but really is quite simple. Go over to the first column and click on the filter drop-down. Above the list of words, you will see a search box. In it, type “death.”
In a split second, you will see all the words that contain “death” on the spreadsheet. This is worth playing with, because Google Sheets lets you select a few words and then search for others you can select as well. Copy that filtered list.
Next, paste the list of filtered words on the Custom tab of the spreadsheet. That’s where you are building your own customized table to visualize later. Finally, repeat the process for other words that you or your students believe refer to death, whether directly or indirectly. In my sample, I include variations of the following words: death, sleep, and dream. It’s all about using the first tab to filter, and then copying interesting words onto the second tab. That’s the gist.
Once you have the tallies for individual words set, use the SUM function to create a tally of your collection of words. Essentially, you manually create a new data point that captures words-related-to-death-in-Hamlet.
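If you prefer to see the filter-and-SUM idea outside a spreadsheet, the sketch below mimics it in Python. The table values are made up for illustration and are not real Hamlet tallies; `filter_words` and `sum_rows` are hypothetical helper names.

```python
# Made-up toy table in the readsheet's layout: one row per word,
# one count per Act/Scene column. These are NOT real Hamlet tallies.
scenes = ["1.1", "1.2", "1.3"]
table = {
    "death": [1, 2, 0],
    "deaths": [0, 1, 0],
    "sleep": [0, 0, 1],
    "crown": [3, 1, 0],
}

def filter_words(table, fragments):
    """Keep rows whose word contains any of the search fragments."""
    return {w: row for w, row in table.items() if any(f in w for f in fragments)}

def sum_rows(rows):
    """Add the selected rows column by column, like a SUM row."""
    return [sum(col) for col in zip(*rows.values())]

# Search for variations of death, sleep, and dream, then total them per scene.
custom = filter_words(table, ["death", "sleep", "dream"])
totals = sum_rows(custom)
```

Here `custom` stands in for the Custom tab and `totals` for the SUM row: one number per scene for the whole collection of words.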
Finally, now that you have your new tallies, you need to create a new table from which you will generate your graph. First, copy your SUM row and Paste Special below it, pasting Values Only to the new row. (This replaces the formula with its results; leaving the formula in place would likely result in an error later.) Second, copy the top row of the table, which includes the Acts/Scenes, and Paste Special it below your table using the Transpose option. Do the same for the values-only SUM row, placing it in the column beside the scenes. Rename the headers Scenes and Frequency (or whatever you like) and you should be set to chart! (This could also be done on a separate table, if that is preferable.)
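The Transpose step simply turns two wide rows into a tall, two-column table. As a rough illustration in Python, with made-up numbers standing in for the header row and the SUM row:

```python
# Made-up values standing in for the Acts/Scenes header row and the SUM row.
header_row = ["1.1", "1.2", "1.3"]
sum_row = [1, 3, 1]

# Transposing pairs each scene with its total: one (scene, frequency) per line.
chart_table = [("Scenes", "Frequency")] + list(zip(header_row, sum_row))

for row in chart_table:
    print(row)
```

Each printed line corresponds to one row of the new table you would highlight before charting.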
Step 4: Make a Graph of Your Keywords
So, you have a customized data table with the sum of your keyword tallies. Now you can create a chart. It’s easy enough to do so right there in Google Sheets. Just highlight the new data set you created. Click on Insert and select Chart. For these data, an area chart tends to work really well. Once the basics look right, you can also play with the Chart Editor options, which is how I got different labels and colors. (Note: You can also copy your data set and paste it into other visualization tools online like this one.)
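Google Sheets does the drawing for you, but the idea underneath a chart is simple enough to sketch. The Python below prints a rough text-only preview of the shape, using made-up frequencies; `text_chart` is a hypothetical helper, and this is a stand-in for a real area chart rather than a replacement for one.

```python
def text_chart(rows):
    """Return one line per scene, with a bar of '#' as long as the count."""
    return [f"{scene:>4} | {'#' * freq} {freq}" for scene, freq in rows]

# Made-up (scene, frequency) pairs, not real Hamlet tallies.
rows = [("1.1", 1), ("1.2", 3), ("1.3", 0)]

for line in text_chart(rows):
    print(line)
```

A scene with no bar at all is the text equivalent of the dips you would look for in the finished chart.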
Step 5: Identify Insights and Inquiries from the Chart, and Re-read
Now that you can see the data visualized, what insights and inquiries surface for you? When I study the chart above, I wonder what is happening in Act 2, Scene 1. It’s one of only a handful of moments where the Bard’s deathly diction disappears. The scene is where Ophelia tells Polonius that she thinks something is wrong with Hamlet, highlighting the idea that Hamlet might not just be glum or eccentric, but mad. When I re-read the brief scene, I am drawn to this passage in which Ophelia says:
My lord, as I was sewing in my chamber,
Lord Hamlet, with his doublet all unbrac’d,
No hat upon his head, his stockings foul’d,
Ungart’red, and down-gyved to his ankle,
Pale as his shirt, his knees knocking each other,
And with a look so piteous in purport
As if he had been loosed out of hell
To speak of horrors, he comes before me.
Though she doesn’t speak of death here per se, her line “As if he had been loosed out of hell // To speak of horrors” resonates with me. It suggests that Hamlet re-appears as a phantom en route from hell. That is, his earthly behavior and mental state transform him into a tortured spirit. Though Shakespeare’s explicitly deathly diction is absent, the passage introduces a powerful concept: that the mind itself is the bridge between this world and the next. It’s like those lines in Paradise Lost where Milton’s Satan declares, “The mind is its own place, and in itself // Can make a Heav’n of Hell, a Hell of Heav’n.” Hamlet’s mental state is the epicenter of the play, particularly as it relates to death. As I processed what the qualitative data (Ophelia’s excerpt) was suggesting, a new inquiry emerged: What is the correlation between Shakespeare’s language of death and his language of the mind?
I decided to go back into the data. This time, I created a new dataset for words associated with the mind, including: think, mind, brain, and mad. It looked like this:
Following the same steps as before, I created a new column for mind-related words and added it to my previous table. The result is a compelling visualization that offers some quantitative perspective on how Shakespeare portrays the relationship between death and the mind.
Were I teaching this text now or discussing it with my book club, I would love to explore the relationship between death and the mind further, especially in those few moments in the play where the language of the mind outshines the language of death: 2.2 and 3.4.
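For readers who want to put a number on that relationship, here is one hedged way to do it in Python. It assumes "correlation" is read in the statistical sense (Pearson's r), which is only one interpretation of the question, and it uses made-up per-scene totals rather than real Hamlet tallies. The helper names `pearson` and `mind_leads` are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def mind_leads(scenes, death, mind):
    """List the scenes where mind-words outnumber death-words."""
    return [s for s, d, m in zip(scenes, death, mind) if m > d]

# Made-up per-scene totals, for illustration only.
scenes = ["1.1", "1.2", "1.3", "1.4"]
death = [4, 1, 3, 2]
mind = [2, 3, 1, 5]

r = pearson(death, mind)
leading_scenes = mind_leads(scenes, death, mind)
```

A value of `r` near 1 would mean the two vocabularies rise and fall together, near -1 that one tends to rise as the other falls, and near 0 that there is no clear linear pattern. The `leading_scenes` list points you back to the scenes worth re-reading.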
Step 6: Try and Share with the World
Now that you have seen how I created a data visualization with Hamlet, it’s time to try one for yourself. I’ve made data sets available for some of my favorite Shakespearean plays! Here they are:
And I’d love to see what you create. Please follow and tag me on social media (@tomliamlynch) and I’ll check out your creations. Now off thou go!
* I create these “readsheets” for individual works of literature, ensuring that they are as accurate as possible. It requires previewing e-texts for boilerplate disclaimer language and transcription oddities, manually structuring the texts for analysis, verifying that the text mining program is reading the work correctly, and ensuring nothing gets lost in translation when saved as a spreadsheet. I’m exploring ways to create these for more works. If you have one you’d like to see, tell me on Twitter.