Applying psychological theory and text-analyzing software, researchers at the University of Texas at Austin have discovered a unique psychological profile that characterizes Shakespeare’s established works, and this profile strongly identifies Shakespeare as an author of the long-contested play Double Falsehood.
“Research in psychology has shown that some of the core features of who a person is at their deepest level can be revealed based on how they use language. With our new study, we show that you can actually take a lot of this information and put it all together at once to understand an author like Shakespeare rather deeply,” says researcher Ryan Boyd of the University of Texas at Austin.
The study, conducted in collaboration with James Pennebaker, also at UT-Austin, goes beyond examining authorship from the standpoint of word counts and linguistic regularities, providing a deeper exploration of an author’s psychological profile.
“This research shows that it is indeed possible to start modeling peoples’ mental worlds in much more complete ways. We don’t need a time machine and a survey form to figure out what type of person Shakespeare was — we can determine that very accurately just based on how he wrote using methods that are objective and easy to do,” Boyd explains.
Double Falsehood was published in 1728 by Lewis Theobald, who claimed to have based the play on three original Shakespeare manuscripts. The manuscripts have since been lost, presumably destroyed by a library fire, and authorship of the play has been hotly contested ever since. Some scholars believe that Shakespeare was the true author of Double Falsehood, while others believe that the play was actually an original work by Theobald himself that he tried to pass off as an adaptation.
Boyd and Pennebaker realized that using psychological theory to inform analysis of the playwrights’ respective works might shed light on the authorship question. They examined 33 plays by Shakespeare, 12 by Theobald, and 9 by John Fletcher, a colleague (and sometime collaborator) of Shakespeare. The texts were stripped of extraneous material (such as publication details) and were processed using software that evaluated the works for specific features determined by the researchers.
For example, the researchers’ software examined the playwrights’ use of function words (e.g., pronouns, articles, prepositions) and words belonging to various content categories (e.g., emotions, family, sensory perception, religion). They had the software identify themes present in each of the works to generate an overarching thematic signature for each author.
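As a rough illustration of this kind of word-category counting, the sketch below computes the share of a text's words that fall into a few categories. The word lists and the names `CATEGORIES`, `tokenize`, and `category_rates` are hypothetical and deliberately tiny; the article does not describe the dictionaries or software the researchers actually used.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny word lists for illustration only.
# The study's real category dictionaries are not given in the article.
CATEGORIES = {
    "pronouns": {"i", "you", "he", "she", "we", "they", "it", "thou", "thee"},
    "articles": {"a", "an", "the"},
    "prepositions": {"of", "in", "to", "with", "on", "by", "for", "from"},
    "family": {"father", "mother", "son", "daughter", "brother", "sister"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def category_rates(text):
    """Return, for each category, the fraction of tokens that belong to it."""
    tokens = tokenize(text)
    total = len(tokens)
    if total == 0:
        return {name: 0.0 for name in CATEGORIES}
    counts = Counter(tokens)
    return {
        name: sum(counts[word] for word in words) / total
        for name, words in CATEGORIES.items()
    }
```

Expressing each category as a rate rather than a raw count keeps long and short plays comparable.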
They also examined the works to determine how “categorical” the writing was. Categorical writing tends to be heavy on nouns, articles, and prepositions, and it indicates an analytic or formal way of thinking. Research has shown that people who rate high on categorical thinking tend to be emotionally distant, applying problem-solving approaches to everyday situations. People who rate low on categorical thinking, on the other hand, tend to live in the moment and are more focused on social matters.
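One simple way to picture a "categorical" score is to add the rates of articles and prepositions and subtract the rate of pronouns, so that formal, noun-heavy prose scores higher than conversational prose. The formula, the word lists, and the function name `categorical_score` below are assumptions made for illustration; the article does not state how the researchers computed their measure.

```python
import re

# Hypothetical word lists; illustrative only, not the researchers' dictionaries.
ARTICLES = {"a", "an", "the"}
PREPOSITIONS = {"of", "in", "to", "with", "on", "by", "for", "from", "at"}
PRONOUNS = {"i", "you", "he", "she", "we", "they", "me", "him", "her",
            "us", "them", "it", "thou", "thee"}


def categorical_score(text):
    """Assumed toy score: article rate + preposition rate - pronoun rate."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0

    def rate(words):
        # Fraction of tokens that appear in the given word set.
        return sum(token in words for token in tokens) / len(tokens)

    return rate(ARTICLES) + rate(PREPOSITIONS) - rate(PRONOUNS)


formal = "The structure of the argument in the essay"
informal = "I think you and I should go now"
print(categorical_score(formal) > categorical_score(informal))
```

Under these assumptions, the formal sentence scores higher because it is dense with articles and prepositions, while the informal one is dominated by pronouns.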
By aggregating dozens of psychological features of each playwright, Boyd and Pennebaker were able to create a psychological signature for each individual. They were then able to look at the psychological signature of Double Falsehood to determine who the author was most likely to be.
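The matching step can be pictured as a nearest-signature comparison: average each author's feature values across their works, then see which average lies closest to the disputed text. The sketch below uses Euclidean distance to a per-author centroid, which is an assumption chosen for simplicity; the article says only that three statistical approaches were used and does not name them. The author labels and numbers are invented toy values, not data from the study.

```python
import math


def centroid(vectors):
    """Average a list of equal-length feature vectors, feature by feature."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_author(disputed, works_by_author):
    """Return the author whose centroid is nearest, plus all distances."""
    distances = {
        author: euclidean(disputed, centroid(vectors))
        for author, vectors in works_by_author.items()
    }
    return min(distances, key=distances.get), distances


# Invented toy feature vectors (for example, two category rates per play).
works_by_author = {
    "author_a": [[0.1, 0.2], [0.3, 0.4]],
    "author_b": [[0.8, 0.9], [1.0, 0.7]],
}
disputed_play = [0.25, 0.3]

best, distances = closest_author(disputed_play, works_by_author)
print(best, distances)
```

In practice, features on different scales would usually be standardized before distances are compared, so that no single feature dominates the result.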
Looking at the plays as whole units, the results were clear: Every measure but one identified Shakespeare as the likely author of Double Falsehood. Theobald was identified as the best match only when it came to his use of content words, and even then only by one of the three statistical approaches the researchers used.
When Boyd and Pennebaker broke the play down into acts and analyzed the texts across acts, they found a more nuanced picture. For the first three acts, the analyses continued to identify Shakespeare as the likely author; for the fourth and fifth acts, the measures varied between Shakespeare and Fletcher. Again, Theobald’s influence on the text appeared to be very minor.
“Honestly, I was surprised to see such a strong signal for Shakespeare showing through in the results,” says Boyd. “Going into the research without any real background knowledge, I had just kind of assumed that it was going to be a pretty cut and dry case of a fake Shakespeare play, which would have been really interesting in and of itself.”
By using measures that tapped into the author’s psychological profile, Boyd and Pennebaker were able to see that the author of Double Falsehood was likely sociable and fairly well educated — findings that don’t jibe with accounts of Theobald as well educated but also rigid and abrasive.
Together, these findings show that exploring the psychological dimensions of a literary work can add insight to textual analysis beyond what word counts and linguistic regularities alone provide.
“I’ve always held huge admiration for scholars who grapple with literature — there is a great deal of detective work that goes into figuring out who the authors really are ‘deep down,’ their motivations, their lives, and how these factors are embedded within their work,” says Boyd. “We demonstrate with our current work that an incredible amount of this information can be extracted automatically from language.”
The research was supported by grants from the Army Research Institute (W5J9CQ12C0043) and the National Science Foundation (IIS-1344257).
For more information about this research, please contact study author Ryan L. Boyd.
For a copy of the research article and access to other Psychological Science research findings, please contact Anna Mikulak, Association for Psychological Science, 202.293.9300.