Twitter has released the source code for its recommendation algorithms. Judging by the code, the algorithm disfavors Ukraine-related content and gives Elon Musk a head start

Algorithms are not above politics

The published portion of the Twitter algorithm mentions terms related to the political situation. One of the most striking examples is the “Ukrainian Crisis” (UkraineCrisisTopic). This category belongs to the “safety labels” and sits alongside coordinated harassment, misinformation, adult content, and copyright violations (CoordinatedHarmfulActivity, Misleading, Nsfw, MedicalMisinfo, GenericMisinfo, DmcaWithheld). These labels appear to be designed to push content down in the rankings. Alternatively, these may be the labels introduced back in 2020, which Twitter uses to flag potentially harmful content. The variable name gives no indication of what kind of content related to the war in Ukraine receives this label: it could be content that users report, content containing violent footage, or any such content at all. Since the code is undocumented and carries no developer comments, it is impossible to say for certain. Ukrainian users, however, have noticed in recent weeks that their tweets are getting less exposure than before.
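A minimal sketch of how such safety labels could feed into ranking. The label names come from the published code, but the downranking logic and the penalty factor are assumptions for illustration, since the real effect of these labels is undocumented:

```python
# Safety labels observed in the published code (real identifiers);
# the scoring effect below is an assumption, not the published logic.
SAFETY_LABELS = {
    "CoordinatedHarmfulActivity",
    "Misleading",
    "Nsfw",
    "MedicalMisinfo",
    "GenericMisinfo",
    "DmcaWithheld",
    "UkraineCrisisTopic",
}

DOWNRANK_FACTOR = 0.5  # assumed penalty; the real value is unknown

def adjust_score(base_score: float, labels: set) -> float:
    """Downrank a tweet if it carries any safety label (illustrative only)."""
    if labels & SAFETY_LABELS:
        return base_score * DOWNRANK_FACTOR
    return base_score
```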

Similar labels appear to have been used in the same context for elections in Brazil, the United States, France, and the Philippines, and for the COVID-19 vaccine. Those labels, however, are now marked deprecated, meaning they are obsolete and no longer meant to be used.

The “Ukrainian crisis” appears once again among the reasons for banning users (policyInViolationToReason). There, too, it is the only geopolitics-related reason; the others cover scams, harassment, glorification of violence, and so on.

In addition, on March 31 users noticed that the algorithm contains variables referencing Elon Musk by name. In all likelihood they boost his tweets in everyone's feed, regardless of whether a user follows Twitter's current owner.

Apparently, Musk is the only user whose promotion is hardcoded. Presumably, Twitter compares how tweets by Musk, Republicans, and Democrats perform against each other; in other words, as Twitter's owner put it in general terms, it collects statistics. Exactly how “Democrats” and “Republicans” are identified here is unknown: whether everyone presumed to support these American parties receives the marking, or only politicians who explicitly state their affiliation.
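The author-group comparison described above could be tracked with simple counters. The group names below mirror the feature flags users spotted in the code (author_is_elon and friends); the counter mechanics are an invented sketch of what collecting such statistics might look like:

```python
from collections import defaultdict

# Author buckets users spotted in the published code; the counting
# mechanism itself is an assumed illustration, not Twitter's pipeline.
AUTHOR_GROUPS = ("author_is_elon", "author_is_republican",
                 "author_is_democrat", "author_is_power_user")

impressions = defaultdict(int)

def record_impression(author_group: str) -> None:
    """Count how often tweets from each author group get shown."""
    if author_group in AUTHOR_GROUPS:
        impressions[author_group] += 1

record_impression("author_is_elon")
record_impression("author_is_democrat")
```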

Twitter tries to show you accounts of people who may be similar to you and who post content that interests you. Separately, it can start inserting candidates for elective office into your feed. Whether this applies only to American elections or to other countries as well is hard to say.

The same goes for Twitter's suggestions on who to follow: among other things, it will recommend following candidates for elected office.

For the US and some other countries, Twitter remains the main social network where people discuss elections. Last year Twitter introduced special account labels for American candidates. In 2022 the Washington Post conducted its own investigation and found that some election falsehoods, especially those spread by the US Republican Party, never get labeled as false information. At the same time, tweets containing false information on other topics began to be flagged under Musk even more often than right after such labels were introduced.

On the one hand, Twitter itself tries to influence the popularity of electoral candidates; on the other, it does not hide that its algorithms leave room for states to interfere in internal processes. Otherwise, why would the company's developers have created an entire class called GovernmentRequested (“at the request of the government”), alongside other recommender classes such as “Be aware”, “Latest news”, and “Inauthentic content”?

Social credit for likes and shitposts

Twitter also published the part of the code that assigns weights to the actions that make a tweet potentially more popular. It turns out that “likes” (favorites) carry the most weight: the more of them there are, the more likely the tweet is to land in recommendations.

A like is worth 30 points; retweets come second at 20 points. Twitter Blue, the paid verification badge, adds another two to four points. In other words, if a user pays for a Twitter subscription, their tweets may be recommended more often than those of users who do not. Previously the “blue check” was given only to users who verified their identity through a dedicated procedure. The arrival of a paid verification mark alarmed security experts, who immediately spotted fraudulent and fake accounts impersonating official authorities or companies; having bought the “check”, such accounts potentially gain more credibility.
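Using the weights reported above (30 for a like, 20 for a retweet, two to four for Twitter Blue), a back-of-the-envelope score could look like this. The formula is a deliberate simplification, not the published ranking pipeline, and the Blue bonus of 4 is an assumption within the reported range:

```python
def tweet_score(likes: int, retweets: int, has_blue: bool) -> int:
    """Toy linear score using the weights described in the article.

    The real ranker is far more complex; this only shows the
    relative contribution of each signal as reported.
    """
    LIKE_WEIGHT = 30
    RETWEET_WEIGHT = 20
    BLUE_BONUS = 4  # reported as two to four points; 4 assumed here
    score = likes * LIKE_WEIGHT + retweets * RETWEET_WEIGHT
    if has_blue:
        score += BLUE_BONUS
    return score
```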

A post is also shown more often to people from your “inner circle”, as is a post with a photo or video. How much you write about “your topics” probably matters too: the luceneScore variable points to this. It is named after Lucene, a search library used, among other things, to find similar topics. This may mean that if you always post cats and complain about work, and then one day go to an anti-government rally, your tweets about the rally will be downranked.

Posts without text, or containing only a link, drop in the ranking, since spammers often post exactly that. Users also noticed the unknownLanguage label, which probably lowers a tweet's chances of appearing in your suggestions. Some believe it covers languages to which Twitter cannot assign a specific language label: for example, Tatar, Chechen, and every other language used in Russia besides Russian and Ukrainian would land here. The same fate may await thousands of other languages around the world that not only are missing from Twitter's architecture but rarely have any machine translation at all. Alternatively, the point may simply be to show users more content in the languages they already read.
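One reading of the unknownLanguage signal, sketched in code. The label name comes from the published repository, but the list of recognized language codes and the penalty factor are assumptions for illustration:

```python
# Languages assumed here to have dedicated detection; the real set
# supported inside Twitter's pipeline is not documented.
RECOGNIZED_LANGUAGES = {"en", "ru", "uk", "es", "fr", "pt"}

def language_label(detected: str) -> str:
    """Return a per-tweet language label, falling back to unknownLanguage."""
    if detected in RECOGNIZED_LANGUAGES:
        return detected
    return "unknownLanguage"

def language_penalty(label: str, score: float) -> float:
    """Assumed downranking for unrecognized languages (illustrative)."""
    return score * 0.8 if label == "unknownLanguage" else score
```

Under this reading, a tweet in Tatar (`tt`) would fall through to the fallback label and get downranked, while a Russian-language tweet would not.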

In addition, the published part of the algorithm for selecting personally recommended tweets indicates that users who post a lot will be suggested most often. Roughly speaking, if a politician, activist, or scholar does not post consistently, their new tweets are less likely to be recommended; whereas for a person who posts at least something every day, sometimes with photos, many tweets stand a better chance of landing in the recommended feed.

Transparency is better than opacity

Twitter became the first big IT company to publish its recommendation algorithms. Announcing the release, Elon Musk admitted that parts of the code may reveal direct interference with the recommender systems. It is impossible to tell how much of the code published on March 31 and April 1 was written recently and how much predates Musk's purchase of the company. It is also important to keep in mind that the code contains very few comments: developer notes that would explain what a given function does or what exactly a given label means.

A good example of why you should not guess at what variable names mean is the n**ger_thread constant, which existed for many years in the source code of the Yandex search engine. Some employees knew about it, but the company acknowledged its existence only after the code leaked in January 2023, when users outside Yandex began pointing out that the code contained offensive language. Nevertheless, we offer here possible interpretations of some variable, label, and function names, on the assumption that they were chosen to be as accurate and intelligible as possible.
