Online disinformation campaigns. How do these work? What can we do?

written by Silvia Puglisi (Hiro) on 2020-04-05

Every social media user has probably noticed that content is often driven by very strong emotions. Posts range from extremely cute kittens to dramatic stories, but above all other emotions, online content seems to be especially fueled by anger [1] [2].

This might be especially true for politically driven content [3], but how did people go from coming together to comment on Eurovision [4] to insulting each other during political campaigns?

To understand how we got here, we cannot ignore everything that happened with Facebook and Cambridge Analytica [5]. Between 2013 and 2018 the company combined the aggregation of Facebook data (later proved illegal) and data mining and analysis with communication campaigns during electoral processes around the world, with the goal of influencing their outcome in favour of its clients [6].

What does misinformation really look like?

If we take a look at any controversial or political topic on Twitter, we get a glimpse of how companies like Cambridge Analytica operate.

At their core, misinformation campaigns are built to spread a certain political agenda by using specific language and amplifying only certain sources.

Over the last week I have been analysing topics spanning from the #covid-19 crisis to Spanish political propaganda. Here is what I found.

A few accounts and news sources spread the main messages, then a network of automated accounts makes sure to retweet, highlight and reply to these messages.

A few examples

We have been observing a number of accounts that follow this pattern. Here is an example:

User: @alvisepf

We have chosen this user because it is one of the "top talkers" among the Spanish far-right accounts.

For this user we extracted a subset of users that are retweeting their posts. For this subset of users we extracted a subset of their timelines.

This is the archive.

We were interested in finding out which other accounts these users were engaging with. We found that most of their tweets are actually retweets of accounts in the Spanish far-right political spectrum. In other words, these accounts are part of an amplification machine for the same group of top talkers of the Spanish far right.

Hence we counted the number of retweets per account, with the idea of extracting a distribution.

Again here is the result.

And here is a distribution per account:

[Figure: distribution of retweets per account (full image)]
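The per-account counting described above can be sketched in a few lines of Python. This is a minimal illustration and not the script used for this analysis: the input format (a list of dicts with a hypothetical `retweeted_user` field) and the sample data are assumptions.

```python
from collections import Counter

def retweet_distribution(tweets):
    """Count how many times each account is retweeted.

    `tweets` is assumed to be a list of dicts where the hypothetical
    field `retweeted_user` holds the screen name of the retweeted
    account, or None for original tweets.
    """
    counts = Counter(
        t["retweeted_user"] for t in tweets if t.get("retweeted_user")
    )
    total = sum(counts.values())
    # (account, retweets, share of all retweets), most retweeted first.
    return [(user, n, n / total) for user, n in counts.most_common()]

# Made-up sample data, for illustration only.
sample = [
    {"retweeted_user": "account_a"},
    {"retweeted_user": "account_a"},
    {"retweeted_user": "account_b"},
    {"retweeted_user": None},
]
for user, n, share in retweet_distribution(sample):
    print(f"{user}: {n} retweets ({share:.0%})")
```

A distribution where a handful of accounts collect most of the retweets is what one would expect from an amplification network, although on its own it is only a hint and not proof.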

What does a typical amplification account look like

This archive contains approximately 400 tweets from a single user, who coincidentally never sleeps ;).
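One way to make the "never sleeps" observation measurable is to count in how many of the 24 hours of the day an account is active. Here is a minimal sketch, assuming tweet timestamps are available as ISO 8601 strings; the function name and the sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime

def active_hours(timestamps):
    """Return a Counter mapping hour of day (0-23) to number of tweets."""
    return Counter(datetime.fromisoformat(ts).hour for ts in timestamps)

# Made-up timestamps: one tweet every hour for two days.
sample = [f"2020-04-0{d}T{h:02d}:15:00" for d in (1, 2) for h in range(24)]
hours = active_hours(sample)
print(f"active in {len(hours)} of 24 hours")
```

A human account usually shows a gap of several hours every day. An account that is active in nearly all 24 hours over many days is worth a closer look, but this is a heuristic and not proof of automation.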

What news sources are linked

Here is a list of identified news sources:

We noticed that all these accounts always retweet the same news sources. So here we compiled a list of tweets linking to these recurrent news sources from accounts that retweeted @alvisepf.

Finally here is a complete archive of a set of tweets linking to articles from these news sources: archive.

What can be done

The patterns observed are pretty simple and repeat across countries and issues. Accounts like these are exposed every other day by researchers, but also by the social network operators themselves [7].

Why are operators not taking a stand against automated accounts? The patterns are neither sophisticated nor difficult to spot, and distinctive network and traffic metadata could easily be identified. More importantly, these accounts far exceed the average number of tweets per hour of a normal user. A simple Proof of Work (PoW) [8] mechanism could substantially increase the cost of automating a large number of highlights and retweets.

A PoW is a mathematical mechanism that asks a client to perform a certain operation whose computational difficulty increases as the client makes more requests to a service. PoWs deter denial-of-service attacks and other service abuses, such as spam on a network.
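To make the idea concrete, here is a minimal hashcash-style sketch. It does not describe a mechanism that any social network actually deploys: the function names, the challenge string and the difficulty parameters (`base_bits`, `step`, `max_bits`) are all assumptions chosen for illustration.

```python
import hashlib
from itertools import count

def difficulty_for(request_count, base_bits=8, step=100, max_bits=20):
    """Required leading zero bits; grows with the client's request count.

    All parameter values are illustrative assumptions.
    """
    return min(base_bits + request_count // step, max_bits)

def leading_zero_bits(digest):
    """Number of leading zero bits in a bytes digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()

def solve(challenge, bits):
    """Client side: search for a nonce whose hash has enough zero bits."""
    for nonce in count():
        digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
        if leading_zero_bits(digest) >= bits:
            return nonce

def verify(challenge, nonce, bits):
    """Server side: a single hash is enough to check the work."""
    digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
    return leading_zero_bits(digest) >= bits

challenge = b"retweet:12345:"  # hypothetical server-issued challenge
bits = difficulty_for(request_count=250)  # 8 + 250 // 100 = 10 bits
nonce = solve(challenge, bits)
print(verify(challenge, nonce, bits))
```

Each extra bit of difficulty doubles the expected work for the client, while verification stays a single hash. An occasional user would barely notice the cost, whereas an account sending hundreds of requests would pay more and more for each one.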

What can you do

You can run your own research. I use a mix of my own scripts calling the Twitter APIs and twint. For visualizations I use Metabase.

If you are a researcher or a journalist please get in touch. I'd be happy to collaborate with you on monitoring different political scenarios around the world. I'd also be happy to share access to my DB and give you full access to the data.

REFERENCES:

[1] https://www.wired.com/story/this-big-beef-exposes-the-ugly-underbelly-of-vegan-vlogging/

[2] https://www.theguardian.com/science/2018/may/16/living-in-an-age-of-anger-50-year-rage-cycle

[3] https://time.com/4838673/anger-and-partisanship-as-a-virus/

[4] https://blog.twitter.com/en_gb/topics/marketing/2017/eurovision-2017.html

[5] https://www.theguardian.com/technology/2019/mar/17/the-cambridge-analytica-scandal-changed-the-world-but-it-didnt-change-facebook

[6] https://en.wikipedia.org/wiki/Cambridge_Analytica

[7] https://cyber.fsi.stanford.edu/io/news/april-2020-twitter-takedown

[8] https://en.wikipedia.org/wiki/Proof_of_work

Decentralization meets anonymity. Using the onion service protocol to build p2p applications.

written by Silvia Puglisi (Hiro) on 2020-03-25

The architecture of the web is based on the client/server paradigm where the Hypertext Transfer Protocol (HTTP) occupies a predominant role [1]. HTTP was designed to transfer resource representations, to abstract over lower-layered transport protocols, such as TCP or UDP, and to act as the primary application-level protocol.

Representational State Transfer (REST) architectures are a generalization of the Web based on the HTTP protocol. In this sense the World Wide Web represents an Internet-scale implementation of the RESTful architectural style.

RESTful architectures identify three main building blocks:
– The identification mechanism, or a Uniform Resource Identifier (URI)
– The communication process between agents
– The representation of data being exchanged

We used to think of the web as hypertext documents linked to one another, but nowadays web documents are more like data objects linked to other objects, or in other words: hyperdata. The constraints imposed by the RESTful architectural style make the Web's architecture particularly malleable. The components making up the web are continually changing and providing new capabilities, adding new resources in the form of novel websites and web services, supporting new representations for resources, etc. [2].

The web architecture is hence inherently decentralized, but the data shared on the Internet and the services used to store this data are completely centralized [3]. This means that access to this data is controlled by the service owner, who is often different from the entity that produced the data. This approach might be considered extremely pragmatic, and in most cases also cheaper and easier to manage than in-house data storage services. This process of centralizing data in a handful of locations has been described as dataveillance [4], or a form of surveillance using the massive collection and storage of vast quantities of personal data. While users lose control over their data, they also lose control over their privacy.

Personal data store approaches have been proposed to overcome these issues [5] and allow users to regain control over who can access their data. But decentralized access to data is only one part of the problems associated with web privacy.

Tor is an important tool providing privacy and anonymity online. The Tor network itself is only a part of what Tor is. Tor also provides privacy at the application level through the Tor Browser. The Tor Browser was designed to provide privacy while surfing the web and defend users against both network and local forensic adversaries. The same properties can be adopted by applications and services wishing to integrate the Tor network in their architecture. Furthermore, onion services provide better authentication and assurance of who you are talking to. With onion services Tor can provide bi-directional anonymity by making it possible for users to hide their locations while offering various kinds of services, such as web publishing or an instant messaging server.

An .onion service needs to advertise its existence in the Tor network before clients will be able to contact it. Therefore, the service randomly picks some relays, builds circuits to them, and asks them to act as introduction points by telling them its public key. An onion service lives on the Tor network and its name is its long term master identity key. This is encoded as a hostname by encoding the entire key in Base 32, including a version byte and a checksum, and then appending the string “.onion” at the end. The result is a 56-character domain name.
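The address encoding can be sketched as follows. The details that are not spelled out above (a SHA3-256 checksum over the string ".onion checksum", the public key and the version byte 0x03, truncated to two bytes) follow the version 3 onion service specification as I understand it, and the all-zero key is only a placeholder, not a real identity key.

```python
import base64
import hashlib

VERSION = b"\x03"  # version byte for v3 onion services

def onion_address(pubkey):
    """Encode a 32-byte ed25519 public key as a v3 .onion hostname."""
    if len(pubkey) != 32:
        raise ValueError("expected a 32-byte public key")
    checksum = hashlib.sha3_256(b".onion checksum" + pubkey + VERSION).digest()[:2]
    # 32 + 2 + 1 = 35 bytes, which Base32 encodes to exactly 56 characters.
    encoded = base64.b32encode(pubkey + checksum + VERSION).decode().lower()
    return encoded + ".onion"

def is_valid_onion_address(hostname):
    """Check length, version byte and checksum of a v3 .onion hostname."""
    label, _, tld = hostname.lower().rpartition(".")
    if tld != "onion" or len(label) != 56:
        return False
    try:
        raw = base64.b32decode(label.upper())
    except ValueError:
        return False
    pubkey, checksum, version = raw[:32], raw[32:34], raw[34:]
    expected = hashlib.sha3_256(b".onion checksum" + pubkey + version).digest()[:2]
    return version == VERSION and checksum == expected

addr = onion_address(bytes(32))  # placeholder bytes, not a real key
print(addr, is_valid_onion_address(addr))
```

Because the hostname is derived from the key itself, a client that knows the address already knows which key the service must prove it holds. This is where the stronger authentication of onion services comes from.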

This means that onion service operators do not need a public IP address to publish an onion service, but only its onion service address (URI), making the protocol ideal for p2p applications.

While the introduction points and others are told the onion service's identity (public key), we don't want them to learn about the onion server's location (IP address). By using a full Tor circuit, it's hard for anyone to associate an introduction point with the .onion server's IP address. Because .onion services are only accessible via the Tor network, users do not need hosting or a public IP address to offer a service via a .onion address. This means .onion services are a gateway to a decentralized, peer-to-peer internet, where users regain control over the content they create and who they share it with.

Access control for an onion service is imposed at multiple points. The first stage of access control happens when downloading onion service (HS) descriptors. Specifically, in order to download a descriptor, clients must know the public key of the service. Also, if optional client authorization is enabled, onion service descriptors are super-encrypted using each authorized user's identity x25519 key, to further ensure that unauthorized entities cannot decrypt them. The final level of access control happens at the server itself, which may decide to respond or not respond to the client's request depending on the contents of the request.

By using the Tor Browser (or any web browser supporting the Tor protocol), clients can interact with onion service applications following the same RESTful paradigms. Users can still interact with applications via hyperlinks, and the browser will render the representation of the accessed resource in a similar way to how it renders the content of a website. In fact a website shared via .onion is still a website. For a set of web services, the onion service protocol is just another way that they can be reached, i.e. via the Tor network. For p2p applications instead it is a way to take advantage of the flexibility, far reach and ease of use of web technologies to create decentralized, privacy-friendly services. Furthermore, because .onions can be hosted locally, on someone's personal computer, we can start imagining services that are available for as long as the user desires, and disappear when the .onion operator wishes to shut down the server.
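As a rough sketch of what such a locally hosted service could look like, the snippet below defines a tiny web server bound to localhost and prints the two torrc lines that would map an onion port to it. `HiddenServiceDir` and `HiddenServicePort` are real torrc options, but the directory path, the port numbers and the helper names are assumptions made for this example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    """Serve one static page; a stand-in for any local web application."""

    def do_GET(self):
        body = b"<h1>Hello from a local onion service</h1>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def torrc_lines(service_dir, local_port, onion_port=80):
    """Return the torrc lines that forward an onion port to a local port."""
    return (
        f"HiddenServiceDir {service_dir}\n"
        f"HiddenServicePort {onion_port} 127.0.0.1:{local_port}\n"
    )

def serve(local_port=8080):
    """Run the server on localhost only, until interrupted."""
    HTTPServer(("127.0.0.1", local_port), HelloHandler).serve_forever()

# Hypothetical directory; Tor stores the service keys and hostname there.
print(torrc_lines("/var/lib/tor/my_service/", 8080))
```

Calling `serve()` starts the local server, and stopping it takes the service offline. The server only listens on 127.0.0.1, so it stays reachable through the onion address alone, without a public IP address.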

There are a number of applications that use the onion service protocol in a p2p fashion. One of these is OnionShare [6], which allows users to share files or publish a website without using a centralized server. Another is Haven [7], which transforms an Android phone into a physical security device and uses an onion service to check the logs of the phone's sensors remotely. We can envision apps allowing people to communicate with one another or to develop their own social network. Hopefully more applications will be developed in the near future, creating a dynamic onion service app ecosystem and fostering a more privacy-friendly decentralized web.

REFERENCES:

[1] Berners‐Lee, T., Cailliau, R., Groff, J.F. and Pollermann, B., 1992. World‐Wide Web: the information universe. Internet Research. https://www.emeraldgrouppublishing.com/products/backfiles/pdf/backfiles_sample_5.pdf

[2] Fielding, R.T., Taylor, R.N., Erenkrantz, J.R., Gorlick, M.M., Whitehead, J., Khare, R. and Oreizy, P., 2017, August. Reflections on the REST architectural style and "principled design of the modern web architecture" (impact paper award). In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering (pp. 4-14). https://pdfs.semanticscholar.org/1942/68d1965a7704d1361e440c3ebf599249a005.pdf

[3] Robinson, D.C., Hand, J.A., Madsen, M.B. and McKelvey, K.R., 2018. The Dat Project, an open and decentralized research data tool. Scientific data, 5. https://www.nature.com/articles/sdata2018221

[4] Solove, D.J., 2004. The digital person: Technology and privacy in the information age (Vol. 1). NYU Press. https://scholarship.law.gwu.edu/cgi/viewcontent.cgi?article=2501&context=faculty_publications

[5] Mansour, E., Sambra, A.V., Hawke, S., Zereba, M., Capadisli, S., Ghanem, A., Aboulnaga, A. and Berners-Lee, T., 2016, April. A demonstration of the solid platform for social web applications. In Proceedings of the 25th International Conference Companion on World Wide Web (pp. 223-226). https://pdfs.semanticscholar.org/5ac9/3548fd0628f7ff8ff65b5878d04c79c513c4.pdf

[6] https://onionshare.org

[7] https://play.google.com/store/apps/details?id=org.havenapp.main&hl=en

Internet Censorship

written by Silvia Puglisi (Hiro) on 2020-03-04

When we talk about things like internet censorship and surveillance, these might appear as abstract concepts to some people, especially in global north countries. This post tries to explain the effects of these activities, and what can be done to help people subjected to both.

What does internet censorship look like?

Internet censorship can take many different forms. In most countries there is some form of internet censorship, meaning that the government might decide to block some websites for different reasons. Usually, in democratic settings, internet blocking is regulated by the state and takes the shape of content restrictions based on various constitutional rights and principles, enacted through the democratic process and formal legislation.

Generally speaking, a network can be censored by blocking or slowing down access to certain websites, services or protocols. The OONI project [1] has been developing tests to find out when and how censorship is happening in a network [2]. How blocking is implemented varies widely from country to country, and even with specific political situations or events. Measures that can be adopted to circumvent censorship at a certain moment might stop working after a while. It is said that censorship circumvention is an arms race: as circumvention measures become more sophisticated, so does censorship technology. Eventually the right to freedom and democracy is won at the political level.

OONI keeps a blog [3] where they present each internet censorship event and describe both the political background and the technical details of how the blocking is performed.

While internet censorship might seem secondary, it is important to point out that authoritarian regimes use communication control and blocking to shut down or reduce people's ability to organize and protest. There is evidence that as digital rights are restricted, human rights are violated [4] [5].

Internet censorship doesn't affect everyone equally either. Marginalized groups and people living in the global south are the ones that suffer digital rights violations the most [6] [7].

Relationship between internet censorship and surveillance

When communication networks are controlled and blocked, citizens are surveilled [8][9]. Controlling dissent by targeting certain groups and dissidents is a common practice for many authoritarian governments [10]. In certain cases these governments prefer to engage online themselves or partner with online media companies to control content and who has access to it [11]. In other cases the techniques are more sophisticated. In certain cases, a state actor surveilling its citizens even affects users and businesses outside that country. An example is China weaponizing legitimate requests to attack certain websites outside the Great Firewall (GFW) and even targeting the global Chinese community [12].

How to protect yourself from surveillance and circumvent censorship

The Global Investigative Journalism Network advises journalists to protect themselves and their sources from possible threats [13]. The first step is usually to protect communications [14] by using end-to-end encrypted services, followed by taking the necessary measures to protect access to personal accounts and using secure and encrypted tools and services to store and share documents.

The ability to interact with sources is strategically important for journalists. This is why tools like SecureDrop [16] are used in newsrooms around the globe. Often journalists rely on more lightweight applications. One of these is OnionShare [17]. OnionShare allows users to share or receive files, or to publish a website, directly from their computer. OnionShare uses the Onion Service protocol [19] and exposes the service on the Tor network only. The Onion Service protocol allows for bi-directional anonymity, meaning both the party offering the service and the party receiving the service are anonymous and protected by a 3-hop Tor circuit.

The Onion Service protocol also allows news organizations to reach surveilled and censored individuals [21] by offering their website on the Tor network. These website addresses end in the TLD .onion. Similar to how the https:// protocol of a website provides more security than the http:// protocol, an onion address also appears to be the same site but gives a visitor more privacy and security through end-to-end encryption and improved authentication. Visiting an onion address is easy. All that’s needed is Tor Browser (Tor Browser is built from Firefox and is similar to use); you visit the onion address in Tor Browser like you visit any web address.

Because journalists and activists often have a public profile, both in real life and on social media, they need to take extra care to protect themselves from phishing attacks [15]. To understand the magnitude of phishing operations we just have to consider that NSO Group's spyware Pegasus was targeting people in 45 countries [18].

In other situations activists and journalists need to protect themselves by remaining anonymous. Tools like Tor [20] allow activists to safely browse the internet, do research, publish articles, and plan actions, all without being tracked.

[1] https://ooni.org

[2] https://ooni.org/nettest/

[3] https://ooni.org/post/

[4] https://www.washingtonpost.com/politics/2019/11/27/iran-shut-down-internet-stop-protests-how-long/

[5] https://netblocks.org/reports/evidence-of-internet-disruptions-in-russia-during-moscow-opposition-protests-XADErzBg

[6] https://www.accessnow.org/digital-rights-101-understanding-how-technology-affects-human-rights-for-all/

[7] https://ooni.org/post/2019-blocking-abortion-rights-websites-women-on-waves-web/

[8] https://www.theguardian.com/technology/2012/mar/02/censorship-inseperable-from-surveillance

[9] https://www.washingtonpost.com/world/asia_pacific/chinas-scary-lesson-to-the-world-censoring-the-internet-works/2016/05/23/413afe78-fff3-11e5-8bb1-f124a43f84dc_story.html

[10] https://www.wired.com/2017/04/internet-censorship-is-advancing-under-trump/

[11] https://www.zdnet.com/article/internet-censorship-its-on-the-rise-and-silicon-valley-is-helping-it-happen/

[12] https://www.eff.org/deeplinks/2019/10/chinas-global-reach-surveillance-and-censorship-beyond-great-firewall

[13] https://gijn.org/digital-security/

[14] https://www.icij.org/blog/2018/01/five-digital-security-tools-to-protect-your-work-and-sources/

[15] https://cpj.org/2019/07/digital-safety-kit-journalists.php

[16] https://securedrop.org/

[17] https://onionshare.org/

[18] https://citizenlab.ca/2018/09/hide-and-seek-tracking-nso-groups-pegasus-spyware-to-operations-in-45-countries/

[19] https://2019.www.torproject.org/docs/onion-services.html.en

[20] https://torproject.org

[21] https://blog.torproject.org/news-orgs-activists-onionize-your-sites-against-censorship