World Historical Gazetteer

For this portion of the assignment, I chose to look at Saint Petersburg, as I know it refers to a place in both Florida and Russia. When I searched this term, I was not surprised to see dots appear in these two places on the map, but it was interesting to see the other places that are referenced by this word pairing.  

Along with Florida and Russia, “Saint Petersburg” references locations in Colorado, Pennsylvania, and South Dakota. Of the seven returned results, Florida is the only place with more than one result, containing the variants “Saint Petersburg,” “Saint Petersburg Beach,” and “Port of Saint Petersburg.”

Apart from the various Saint Petersburgs around the world, I wanted to look at this particular city because its Russian name has changed over time. When I clicked on the returned result in Russia, there were five different attestations: “Saint Petersburg,” “Sankt-Peterburg,” “Leningrad,” “Petrograd,” and “St Petersburg.” “Sankt-Peterburg” is the transliteration of the Russian name for the city (Санкт-Петербург), while “Leningrad” and “Petrograd” are names given to the city in the twentieth century. “Grad” [град] is the Old Slavic counterpart of “gorod” [город], meaning “city,” so “Leningrad” means Lenin’s city and “Petrograd” means Peter’s city (after Peter the Great). Saint Petersburg was renamed Petrograd in 1914, at the outbreak of the First World War, then renamed Leningrad following Lenin’s death in 1924. Seeing that Санкт-Петербург appeared as a listed variant of “Saint Petersburg,” I then searched for the city’s name in Cyrillic. Unsurprisingly, the only returned result was in Russia, as opposed to the seven results returned for the English form of the name.

Overall, I found the interface easy to use and interesting; however, the one question that I have relates to the numbers that appear over each green dot in the “temporal attestations” view. I’m unsure as to what these numbers refer to, and there isn’t a link on the numbers, as is present on other reference numbers that appear throughout the interface.

Recogito 

Originally, I wanted to look at a Russian text in the original language, so I used an excerpt of Dostoevsky’s The Double [Двойник] to test whether that would be possible. When I tagged the protagonist’s name (Yakov Petrovich Golyadkin [Яков Петрович Голядкин]), the interface recognized another appearance of the term and tagged it as a name as well. However, there was another occurrence of the name that was not recognized: the last name in the genitive case (Golyadkina [Голядкина]). I’m unsure of exactly how the technology works, but it does seem that patterns are matched through an exact match of character strings, as opposed to through coreference resolution, where terms that differ in spelling yet reference the same entity (such as a character’s name and nickname) can be recognized. I also wondered whether each tag has some way of telling whether it is a repeatable reference to a single entity: does each occurrence of “Yakov Petrovich Golyadkin” that is tagged as a name internally recognize that it is referencing the same named entity (maybe through an id number)? Because of grammatical case, I decided to look at a Russian text in English translation instead, as place names also change across cases. For example, the sentence “I live in Saint Petersburg” would be “Я живу в Санкт-Петербурге,” with the name of the city in the prepositional case, differing from the nominative form of the city, Санкт-Петербург. In place-dense texts, this would present problems as characters move to and from cities, as well as attribute things to cities, changing the case of the word. I wanted to make sure that cities that differ only in case are not recognized as distinct entities: this problem isn’t present in English, so I went with that instead.
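To make the case problem concrete, here is a minimal sketch of the difference between exact string matching and a looser match that tolerates case endings. It is not Recogito’s actual matching logic, which I have not seen; the function names, the sample sentence, and the prefix trick are all assumptions made up for illustration.

```python
import re

def tokens(text):
    # Split into words, keeping hyphenated names such as "Санкт-Петербург" whole.
    return re.findall(r"[\w-]+", text)

def exact_matches(text, name):
    # Exact matching: a token counts only if it equals the tagged name.
    return [t for t in tokens(text) if t == name]

def prefix_matches(text, stem):
    # Naive workaround (an assumption, not a real lemmatizer):
    # accept any token that starts with the nominative form.
    return [t for t in tokens(text) if t.startswith(stem)]

# Invented sample text: one nominative and one prepositional form.
sample = "Санкт-Петербург велик. Я живу в Санкт-Петербурге."

exact = exact_matches(sample, "Санкт-Петербург")
loose = prefix_matches(sample, "Санкт-Петербург")

print(exact)
print(loose)
```

The exact matcher finds only the nominative form, while the prefix matcher also catches the prepositional form. A prefix check is crude (it would fail for names whose stem changes), so a real solution would more likely rely on lemmatization.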

For the sake of tagging repeatable place names that can be recognized as such through consistency of spelling, I looked at Tolstoy’s War & Peace [Война и Мир], which is a fictional account of the lives of three central families during the time of the Napoleonic Wars (1803-1815). I chose this novel in particular due to its long passages describing battles/battlefields, as well as movement of forces across space and place. For the tagging, I looked at a chapter describing the events following the Battle of Borodino, tagging the names of people and places.

I then looked at the “Summary” pie chart, which stated that there were 38 annotations: 14 people and 24 places. It would be interesting to see a breakdown of this information: which places and which people are referenced the most? This also relates to my question about ids for the tags: are repeated references recognized as distinct or as related to the same named entity? In the document, there are 6 distinct place names, which occur a total of 24 times. There is a distinction to be made between 24 place occurrences and 24 places; I would be interested to see a count of the occurrences of each distinct place, separate from the more general “places that have been counted in the document” view. Overall, I think that this is an interesting tool that is well-designed and accessible, although I want to know more about the underlying technology.
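The breakdown I have in mind could be sketched roughly as follows. This assumes the annotations can be represented as simple (type, name) pairs; that format, the function name, and the sample data are hypothetical and are not taken from Recogito or from my actual annotations.

```python
from collections import Counter

# Hypothetical annotations as (type, name) pairs; invented for illustration.
annotations = [
    ("place", "Moscow"), ("person", "Kutuzov"), ("place", "Borodino"),
    ("place", "Moscow"), ("person", "Napoleon"), ("place", "Moscow"),
]

def count_by_entity(annotations, kind):
    # Group occurrences of one annotation type by the name they refer to.
    return Counter(name for k, name in annotations if k == kind)

place_counts = count_by_entity(annotations, "place")

total_occurrences = sum(place_counts.values())  # every tagged mention
distinct_places = len(place_counts)             # unique place names

for name, n in place_counts.most_common():
    print(name, n)
print(total_occurrences, "occurrences of", distinct_places, "distinct places")
```

Keeping the two numbers separate is the point: the summary chart reports the first (occurrences), while the question of which places matter most needs the per-name counts.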
