A window into Yandex’s censorship

A source code leak reveals how Russia’s top tech company protects Putin’s image

Source: Meduza

Last week, Russian Internet giant Yandex suffered a major source code leak when an unknown user (likely a former employee) published parts of the company’s internal repository online. The leaked code provides new insight into the inner workings of Russia’s largest search engine, which has faced growing criticism in recent years for cooperating with the Kremlin. Among other things, the breach confirmed that Yandex has censored image and video search results to prevent the Z symbol and images of Putin from appearing in contexts that might embarrass the Russian authorities. Meduza explains how.

Censoring users’ requests

Yandex’s leaked code shows that when users search for images on Yandex, their search queries are sometimes automatically edited behind the scenes. This is done with the help of a special-purpose piece of code called ImgPatch, whose function is described in the leaked fragments as follows:

Allows you to organize a quick ban of images and videos by editing original requests. From minor changes to total reformulations.

This code is most often used to prevent Yandex from returning pornographic content. Its second most common use, however, is to remove images of Vladimir Putin from search results. This was first noted by Twitter user @bantg.
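To make the idea of "editing original requests" concrete, here is a minimal sketch of how a query-rewriting table could work. It is an illustration only, not the leaked code: the names `REWRITE_RULES`, `normalize`, and `patch_query`, the placeholder queries, and the minus sign used as an exclusion operator are all assumptions.

```python
from typing import Dict

# Hypothetical rule table: normalized query -> rewritten query.
# The entries are invented placeholders, not rules from the leaked code.
REWRITE_RULES: Dict[str, str] = {
    # Minor change: keep the query but append an assumed exclusion operator.
    "placeholder query one": "placeholder query one -unwanted",
    # Total reformulation: swap the query for a different one.
    "placeholder query two": "unrelated safe query",
}


def normalize(query: str) -> str:
    """Lowercase the query and collapse runs of whitespace."""
    return " ".join(query.lower().split())


def patch_query(query: str) -> str:
    """Return the rewritten query if a rule matches, else the query unchanged."""
    return REWRITE_RULES.get(normalize(query), query)
```

In this sketch, `patch_query("Placeholder  Query ONE")` would be silently replaced with the first rule's rewritten form, while a query with no matching rule, such as `patch_query("cats")`, passes through untouched. The user never sees that the request was changed.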

Protecting Putin

Yandex’s developers have done their best to ensure that certain search terms won’t return images of Vladimir Putin. These terms include:

As well as phrases such as:

This part of the code was written to apply regardless of a user’s location. We don’t know when it was first implemented or whether it’s still in use; before this article was published, we managed to retrieve images of Putin using each of the terms on this list except for “Dick in a spacesuit.”

Yandex since February 2022

'Toxic assets' How Russia’s invasion of Ukraine tore Yandex apart

Protecting the letter Z

Yandex’s search engine also contains code intended to prevent certain images and terms from appearing when users search the letter Z, which has become one of Russia’s main symbols of its war against Ukraine.

ImgPatch forces results for the search request “symbol z” to exclude images associated with any of the following terms (among others):

  • Luftwaffe
  • German
  • Germany
  • President
  • Slavic
  • Army
  • Reich
  • Wehrmacht
  • Nazis
  • SS
  • Hitler
  • Nazi
  • USA
  • Hitler Youth
  • WW2

It’s difficult to determine whether these parts of the code are still in effect. In the case of Putin, the code requires Yandex to exclude images of the Russian president when a user searches certain terms, but in the case of the Z symbol, the search engine is supposed to block various “banned” symbols, which is more difficult. Typing in the phrase “symbol z” still returns images of swastikas, for example, but this could be either because the filter has been turned off or simply because it doesn’t work very well.
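One way to picture why a term-based filter might let such images through is the sketch below. It assumes, purely for illustration, that each image result carries a set of text labels and that exclusion works by matching those labels. The function `filter_results` and the `labels` field are hypothetical, and the term list is only the partial one given above.

```python
from typing import Dict, FrozenSet, List

# Terms the article lists for the query "symbol z" (lowercased for matching).
# The article notes this list is partial.
EXCLUDED_TERMS: FrozenSet[str] = frozenset({
    "luftwaffe", "german", "germany", "president", "slavic", "army",
    "reich", "wehrmacht", "nazis", "ss", "hitler", "nazi", "usa",
    "hitler youth", "ww2",
})


def filter_results(query: str, results: List[Dict]) -> List[Dict]:
    """Drop results labeled with an excluded term, but only for 'symbol z'."""
    if " ".join(query.lower().split()) != "symbol z":
        return list(results)
    kept = []
    for image in results:
        # Hypothetical data model: each result has a list of text labels.
        labels = {label.lower() for label in image.get("labels", [])}
        if labels.isdisjoint(EXCLUDED_TERMS):
            kept.append(image)
    return kept
```

Under these assumptions, an image labeled “Wehrmacht” would be dropped, but an image of a banned symbol that carries no matching label would stay in the results. A filter of this kind is only as good as the labels attached to each image.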

The captcha blacklist

The leaked code also contains a list of words that are banned from being used in Yandex’s captchas (short tests used for determining whether a user is human).

captcha.tar.bz2: data/blacklist_ru.txt

Most of the list items are unsurprising; they include terms like “Google” and “death,” as well as curse words and ethnic slurs. The last two words on the list, however, are “Lvov” and “surrender,” which suggests they were added most recently. While the word “Lvov” is homonymous with one form of the Russian word for “lions,” the word on the captcha blacklist presumably refers to the Russian name for Lviv, the Ukrainian city that Russia’s military has repeatedly shelled since the start of the full-scale war in Ukraine.
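A blacklist like this is typically applied as a simple membership check before a word is shown to the user. The sketch below assumes a plain-text file with one word per line, which the article does not confirm. The function names are hypothetical, and the sample entries are the article’s English renderings rather than the Russian words in the actual file.

```python
from typing import List, Set


def load_blacklist(text: str) -> Set[str]:
    """Parse blacklist text, assuming one word per line; blank lines are skipped."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def allowed_words(candidates: List[str], blacklist: Set[str]) -> List[str]:
    """Keep only candidate captcha words that are not blacklisted (case-insensitive)."""
    return [word for word in candidates if word.strip().lower() not in blacklist]


# Stand-in sample using English renderings mentioned in the article.
SAMPLE_BLACKLIST = "google\ndeath\nlvov\nsurrender\n"
```

With this sample, `allowed_words(["river", "Lvov", "garden", "Surrender"], load_blacklist(SAMPLE_BLACKLIST))` keeps only “river” and “garden.” If new entries are appended to the end of such a file, their position offers a rough hint about when they were added, which is the reasoning behind the observation about the last two words.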

Note: On January 30, Yandex published the initial results of its internal investigation into the leak. Among other things, they reported that, in certain cases, the “logic of the work of [Yandex’s] services was corrected not with algorithms but with ‘crutches’ (which is a term developers use to refer to temporary solutions that are implemented hastily and sub-optimally). It was through these ‘crutches’ that certain mistakes of the recommendation system, which is responsible for additional elements of search results, were fixed, and search settings for pictures and videos were adjusted.”

Story by Denis Dmitriyev

English-language summary by Sam Breazeale