State Social Operator Detector

Overview

State-funded social media operators are a significant, hard-to-detect, and growing threat to any democracy with free speech. In recent years, the extent of these state-funded campaigns has become clear. Russian campaigns to influence elections are the most prominent in the news, but others have been identified that aim to turn South American countries against the US, spread disinformation about the invasion of Ukraine, and foment conflict in America's own culture wars by influencing all sides, as part of an effort to weaken America's hegemonic status.

Iranian and Chinese efforts are also well funded, though not as widespread or aggressive as Russia's. Even so, Chinese influence is growing. It often uses social media to spread specific narratives on Xinjiang and the Uyghur situation, Hong Kong, COVID-19, and Taiwan, and sometimes supports Russian efforts.

We need better tools to combat this disinformation, both for social media administrators and for the public. To that end, we have created a proof-of-concept tool, operable as a browser extension, that identifies likely state-funded social media operators on Twitter by running inference on tweet content.

The core of the tool is a DistilBERT transformer language model that has been fine-tuned on 250K samples of known state operator tweets and natural tweets pulled from the Twitter API. It distinguishes normal users from state operators with high accuracy (99%), but has some limitations due to sampling recency bias. We intend to improve the model iteratively over time.

Usage

You can try out the model by entering a sequence of 1 to 10 tweets, separated by pipes, as follows: "this is tweet one | this is tweet two." The model will then classify the sequence as belonging to either a state operator or a normal user.
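
For anyone preparing input programmatically, the pipe-separated format described above can be checked before submission. The following is only a minimal sketch: the function name split_tweets and its max_tweets parameter are hypothetical and are not part of this project's code, and how the tool itself handles whitespace or empty segments is an assumption here.

```python
from typing import List


def split_tweets(raw: str, max_tweets: int = 10) -> List[str]:
    """Split a pipe-separated string into a list of individual tweets.

    Hypothetical helper: assumes surrounding whitespace is insignificant
    and that empty segments (e.g. from a trailing pipe) should be ignored.
    """
    # Break on the pipe character and trim whitespace around each tweet.
    tweets = [part.strip() for part in raw.split("|")]
    # Discard empty segments.
    tweets = [tweet for tweet in tweets if tweet]
    # Enforce the documented range of 1 to 10 tweets per sequence.
    if not 1 <= len(tweets) <= max_tweets:
        raise ValueError(
            f"expected between 1 and {max_tweets} tweets, got {len(tweets)}"
        )
    return tweets


if __name__ == "__main__":
    print(split_tweets("this is tweet one | this is tweet two."))
```

A check like this catches empty input or more than ten tweets before the sequence reaches the model. One limitation of this sketch is that a tweet containing a literal pipe character would be split in two.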

Further Information

Further information on the data collection and training used to create this model is available in the following GitHub repo: State Social Operator Detection

Contact

You can reach me at [email protected].