The goal of this project is twofold. First, it is intended as a contribution to the emerging field of “distant viewing”, which uses quantitative methods to assess corpora consisting of a large number of visual media. Currently, deep learning methods play a minor role in distant viewing, as most projects rely on pretrained networks. This is understandable, as training a network is not trivial. However, relying on pretrained networks significantly reduces the number of possible research questions. Second, a better understanding of the training process allows us to contribute to the field of “critical machine learning” as well; specifically, we try to point out some of the benefits and pitfalls of training an artificial neural network for a humanities research project.
We selected YouTube as an example because it has become the most important online media outlet in Russia. In 2020, 82% of those aged 14–64 used it daily, making it the most successful social media platform in the country. It is therefore of vital importance for Slavic cultural and media studies to develop analytic tools for this platform.
Three test cases are used in our study: the Ukrainian nationalist Stepan Bandera (1909–1959), who was instrumentalized by both sides of the conflict in Ukraine that began in 2013; the prominent Russian opposition leader Aleksei Naval’nyi; and the Belarusian president Aliaksandr Lukashenka, who was re-elected in fall 2020. In the cases of Naval’nyi and Lukashenka, YouTube clips with several million views helped to bring the protests to the streets. Demonstrations rely heavily on visual symbols such as flags and thus allow us to test our theories. Protesters in Belarus, for example, do not use the official flag and coat of arms, which stem from Soviet times, but rather those of the first Belarusian republic, founded in 1918.
For these test cases, deep learning is used to train an artificial neural network (Resnet1010) to automatically detect 45 predefined nationalist(ic) symbols and 40 politicians from Eastern Europe, which in turn allows us to analyze the symbolic language of both state-run and oppositional propaganda discourses in Eastern Europe. This research question is of interest not only for Slavic media studies but also for Eastern European history and political science; the methodology, for its part, provides important impulses for opening up Digital Humanities to non-text media.
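Once the network has labelled the videos, comparing state-run and oppositional discourses largely becomes a counting exercise. The following is a minimal sketch under stated assumptions, not the project's actual pipeline: it assumes that each video has already been reduced to the set of labels detected in it, and the label names, the discourse tags and the function `symbol_share` are hypothetical and chosen for illustration only.

```python
from collections import Counter, defaultdict


def symbol_share(videos):
    """Return, per discourse, the share of videos in which each label occurs.

    `videos` is an iterable of (discourse, labels) pairs, where `labels`
    holds the symbols or persons detected anywhere in that video.
    """
    totals = Counter()            # discourse -> number of videos
    hits = defaultdict(Counter)   # discourse -> label -> number of videos
    for discourse, labels in videos:
        totals[discourse] += 1
        for label in set(labels):  # count each label once per video
            hits[discourse][label] += 1
    return {
        discourse: {
            label: n / totals[discourse]
            for label, n in hits[discourse].items()
        }
        for discourse in totals
    }


# Hypothetical detections, invented for illustration only.
sample = [
    ("state", {"official_flag", "lukashenka"}),
    ("state", {"official_flag"}),
    ("opposition", {"flag_1918_republic"}),
    ("opposition", {"flag_1918_republic", "lukashenka"}),
]
shares = symbol_share(sample)
```

Counting each label once per video, rather than once per frame, keeps long clips from dominating the comparison; whether that is the appropriate unit depends on the research question.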
First examples of trained networks are already available on GitHub and serve as a starting point for our project. As of now, our corpus consists of roughly 800 videos about Bandera, 500 each about Lukashenka and Naval’nyi, and all editions of ‘Vremia’, Russia’s most watched TV news broadcast, that have aired since 12 July 2019 (also more than 500 clips). Additionally, the media archive of the Institute for Slavic Studies at the University of Innsbruck has archived all ‘Vremia’ broadcasts from 2014, and we have also been creating daily snapshots of YouTube search results concerning Belarus (since September 2020) and Russia (since January 2021). These snapshots can be used to at least partially uncover the inner workings of YouTube’s search algorithm, which presents itself as a black box.
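One simple way to probe such a black box is to measure how much the result list for a fixed query changes from one snapshot to the next. The sketch below illustrates this idea only; it assumes that each snapshot is stored as a ranked list of video IDs keyed by an ISO date, and the function names and sample data are hypothetical rather than taken from the project.

```python
def jaccard(a, b):
    """Overlap of two result lists, ignoring rank: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty result lists are treated as identical
    return len(a & b) / len(a | b)


def snapshot_turnover(snapshots):
    """Compare each snapshot with the one taken immediately before it.

    `snapshots` maps an ISO date string to the ranked list of video IDs
    returned for one query on that day.
    """
    days = sorted(snapshots)  # ISO date strings sort chronologically
    report = []
    for prev, curr in zip(days, days[1:]):
        old, new = snapshots[prev], snapshots[curr]
        report.append({
            "day": curr,
            "overlap": jaccard(old, new),
            "entered": [v for v in new if v not in old],
            "dropped": [v for v in old if v not in new],
        })
    return report


# Hypothetical snapshots, invented for illustration only.
snapshots = {
    "2020-09-01": ["a", "b", "c", "d"],
    "2020-09-02": ["b", "a", "c", "e"],
    "2020-09-03": ["e", "f", "g", "b"],
}
report = snapshot_turnover(snapshots)
```

A sudden drop in overlap, or a video that disappears from the results without being deleted, marks a date worth examining more closely; the measure ignores rank order, so reshuffling within the list would need a separate, rank-aware comparison.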