Hello, my name is Emma, and I am a first-year student in the Slavic PhD program. I graduated from New College of Florida in 2019 with a joint bachelor’s degree in Computer Science and Russian Language & Literature, so my work lies at the intersection of these two areas: I use natural language processing techniques to extract information from nineteenth-century Russian literature. My past work in this area quantified the semantic similarity between music and sexuality in Tolstoy’s The Kreutzer Sonata. I used the word2vec algorithm to generate word embeddings (vector representations of tokens within a work), then measured the cosine similarity between these vectors: words whose vectors have the smallest angle between them (that is, the highest cosine similarity) are the most similar and appear in similar contexts in a work.
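To make the cosine-similarity step concrete, here is a minimal sketch in plain Python. The three-dimensional vectors and the words in `embeddings` are invented for illustration only; they are not output from word2vec or values from the Tolstoy project, and the helper names `cosine_similarity` and `most_similar` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings (assumption: real word2vec vectors would be
# learned from the text and typically have 100 or more dimensions).
embeddings = {
    "music": [0.9, 0.1, 0.3],
    "passion": [0.8, 0.2, 0.4],
    "railway": [0.1, 0.9, 0.2],
}


def most_similar(word, embeddings):
    """Return the other word whose vector has the highest cosine similarity."""
    target = embeddings[word]
    scores = {
        other: cosine_similarity(target, vec)
        for other, vec in embeddings.items()
        if other != word
    }
    return max(scores, key=scores.get)


print(most_similar("music", embeddings))
```

With these made-up vectors, "music" points in nearly the same direction as "passion" and almost at a right angle to "railway", so the first pair scores much higher.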

Currently, I’m working on an implementation of Named Entity Recognition (NER), a technique that takes a text as input and outputs a tagged version of the text in which the names of characters, places, etc. are identified as such. NER models are typically trained on data such as news stories, tweets, and Wikipedia articles; however, the naming patterns that appear in text of this kind are distinct from those that occur in literature, which can take on a more nested form. The implementation I’m working on is trained on a corpus of Russian literature in the original, so that when it is tested on other works of Russian literature, it is able to pick up on the syntactic forms that are distinctive to text of this type.
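The input and output of NER can be sketched as follows. This is only a toy illustration of the tagged format: it uses a hand-written lookup table rather than a trained model, so it is not the implementation described above. The function `tag_entities`, the `gazetteer` entries, the English example sentence, and the BIO-style labels (`B-` for the first token of a name, `I-` for a continuation, `O` for everything else) are all assumptions made for the example.

```python
def tag_entities(tokens, gazetteer):
    """Tag tokens with BIO labels using longest-match dictionary lookup.

    gazetteer maps a tuple of tokens (a full name) to an entity label.
    """
    tagged = []
    max_len = max((len(name) for name in gazetteer), default=0)
    i = 0
    while i < len(tokens):
        match = None
        # Try the longest candidate span first so multi-word names win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + n])
            if candidate in gazetteer:
                match = (n, gazetteer[candidate])
                break
        if match:
            n, label = match
            tagged.append((tokens[i], "B-" + label))
            tagged.extend((tok, "I-" + label) for tok in tokens[i + 1:i + n])
            i += n
        else:
            tagged.append((tokens[i], "O"))
            i += 1
    return tagged


# Hypothetical lookup table and sentence, for illustration only.
gazetteer = {("Anna", "Karenina"): "PER", ("Moscow",): "LOC"}
sentence = ["Anna", "Karenina", "arrived", "in", "Moscow"]

for token, tag in tag_entities(sentence, gazetteer):
    print(token, tag)
```

A trained model replaces the lookup table with patterns learned from annotated text, which is what lets it recognize names it has never seen before.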

I’m taking this course because of my general interest in the digital humanities, as well as a desire to distance myself further in methodology from what has become standard in the field of computer science, where quantitative rigor is given precedence over philological engagement with the source text. Accordingly, my goal for the course is to familiarize myself with various critical standpoints in the digital humanities across fields, not just in work with text, so that I can better understand the implications of digital work with other forms of data.

One thought on “Emma’s Intro”

  1. Welcome, Emma!
