The goal of this project, funded by DI4DH, is twofold. First, it is intended as a contribution to the emerging field of “distant viewing”, which uses quantitative methods to assess corpora consisting of large numbers of visual media. Currently, deep learning methods play only a minor role in distant viewing, as most projects rely on pretrained networks. This is understandable, as training is not trivial. However, using pretrained networks significantly reduces the number of research questions that can be addressed. Second, a better understanding of the training process allows us to contribute to the field of “critical machine learning”; specifically, we try to point out some of the benefits and pitfalls of training an artificial neural network for a humanities research project.
We selected YouTube as our example because it has become the most important online media outlet in Russia. In 2020, 82% of those aged 14–64 used it daily, making it the most successful social media platform in Russia. It is therefore of vital importance for Slavic cultural and media studies to develop analytic tools for this platform.
Three test cases are used in our study: the Ukrainian nationalist Stepan Bandera (1909–1959), who was instrumentalized by both sides of the Ukraine conflict starting in 2013; the prominent Russian opposition leader Aleksei Naval’nyi; and the Belarusian president Aliaksandr Lukashenka, who was re-elected in fall 2020. In the cases of Naval’nyi and Lukashenka, YouTube clips with several million views helped to bring the protest to the streets. Demonstrations rely heavily on visual symbols such as flags and thus allow us to test our theories. Protesters in Belarus, for example, do not use the official flag and coat of arms, which stem from Soviet times, but rather those of the first Belarusian republic, founded in 1918.
For these test cases, deep learning is used to train an artificial neural network (Resnet1010) to automatically detect 45 predefined nationalist(ic) symbols and 40 politicians from Eastern Europe, which in turn allows us to analyze the symbolic language of both state-run and oppositional propaganda discourses in Eastern Europe. This research question is of interest not only for Slavic media studies, but also for Eastern European history and political science; the methodology, in turn, provides important impulses for opening up the Digital Humanities to non-text media.
Initial examples of our trained networks are available on GitHub.