In honor of National Poetry Month, I unearthed this computational example of reading poetry. I found an e-text version of Walt Whitman’s “Song of Myself” online, copied it, and pasted it into a spreadsheet. I then organized the text so that each line of the poem occupied one cell in its own row, labeling the verse of the poem in a separate column. The next step was to generate a quantity associated with each line. I could, for instance, have used a simple formula to tally the number of words per line, but that would not have been interesting in this case. Instead, I used a proprietary tool to calculate a sentiment score for each line of the poem, a process called sentiment analysis. Sentiment analysis is an increasingly popular way for companies to determine whether people are speaking positively or negatively about them online, especially on social media. In the end, each line of the poem received a score from -1 to 1 based on how negative or positive the words it contained appeared to be. I then used the data to create a visualization that shows the sentiment analysis score for each verse, in the order the verses appear in the poem.
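For readers who would rather script these steps than use a spreadsheet, here is a minimal sketch in Python. It is not the proprietary tool I used: the tiny word lists, the scoring formula, and the choice to average line scores within each verse are all assumptions made for illustration, and the names `score_line` and `score_by_verse` are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical toy lexicon (an assumption for illustration only).
# A real sentiment engine uses a far larger vocabulary and its own rules.
POSITIVE = {"celebrate", "good", "love", "joy", "happy", "perfect", "sweet"}
NEGATIVE = {"hate", "death", "pain", "sick", "foul", "despair", "agony"}


def score_line(line):
    """Score one line from -1 to 1.

    The score is (positive words - negative words) / (words found in the
    lexicon). A line with no lexicon words is treated as neutral (0.0).
    """
    words = re.findall(r"[a-z']+", line.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def score_by_verse(rows):
    """Average the line scores within each verse.

    rows is a sequence of (verse_label, line_text) pairs, mirroring the
    two spreadsheet columns. Averaging is an assumed aggregation.
    """
    scores = defaultdict(list)
    for verse, line in rows:
        scores[verse].append(score_line(line))
    return {verse: sum(vals) / len(vals) for verse, vals in scores.items()}


rows = [
    (1, "I celebrate myself, and sing myself,"),
    (1, "And what I assume you shall assume,"),
    (2, "An invented line of pain and despair"),  # not from the poem
]

# Print one row per verse, in the order the verses appear.
for verse, score in score_by_verse(rows).items():
    print(f"verse {verse}: {score:+.2f}")
```

Even a toy like this gives students something to argue with: the second line of the poem scores as perfectly neutral only because none of its words happen to be on either list.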
What does this quantification and visualization convey? First, I was struck that there were sequential verses that the software read as drastically positive and drastically negative. What are the topics of those verses, and why might Whitman have written with such extremity? Is there a clear aesthetic effect he might have been trying to achieve? Second, it was surprising that there were a couple of instances of verses that were either perfectly neutral or close to it. What are these verses about, and why are they placed where they are? If students are afforded the opportunity to analyze the data visualization, dive into the poem to look more closely at the text, and present their insights, it seems to me a sound use of time. Further, I encourage students to question the validity of the numbers. Do you agree with how the sentiment analysis engine scored the verse? If you agree, what is the logic that the analytical engine seems to be using? What kinds of words are being counted as positive and negative? How might the fact that such analytical engines are used by companies for social media be affecting the kinds of scores they generate? The goal is not for students to place their faith in pure quantitative objectivity. It is to explore the fact that the divide between numbers and words is a false one, that numbers are used all the time to tell stories, and that those stories can be as fictional as they are factual.
*To read more about computationality in the English Language Arts classroom, check out my collection of essays called Strata and Bones available on Amazon.*