Analysis, Modelling and Protection of Online Private Data
Websites and applications use personalisation services to profile their users, collect their patterns and activities and eventually use this data to provide tailored suggestions. User preferences and social interactions are therefore aggregated and analysed. Online communications generate a considerable amount of data flowing among users, services and applications. This information results from the interactions between different parties, and once collected, it is used for a variety of purposes, from marketing profiling to product recommendations, from news filtering to relationship suggestions. Understanding how data is shared and used by services on behalf of users is the motivation behind this work. When a user creates a new account on a certain platform, this creates a logical container that will be used to store the user’s activity. The service aims to profile the user. Therefore, every time some data is created, shared or accessed, information about the user’s behaviour and interests is collected and analysed. Users produce this data but are unaware of how it will be handled by the service, and of with whom it will be shared. More importantly, once aggregated, this data could reveal more over time than the same users initially intended. Information revealed by one profile could be used to obtain access to another account, or during social engineering attacks. The main focus of this dissertation is modelling and analysing how user data flows among different applications and how this represents an important threat to privacy. A framework defining privacy violation is used to classify threats and identify issues where user data is effectively mishandled. User data is modelled as categorised events, and aggregated as histograms of relative frequencies of online activity along predefined categories of interests. Furthermore, a paradigm based on hypermedia to model online footprints is introduced. This emphasises the interactions between different user-generated events and their effects on the user’s measured privacy risk. Finally, the lessons learnt from applying the paradigm to different scenarios are discussed.
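As an illustration of the profile model described in the abstract, the following is a minimal sketch of how categorised events might be aggregated into a histogram of relative frequencies. The category names and the function `relative_frequency_profile` are hypothetical and chosen only for this example; the dissertation's actual taxonomy and implementation are not given here.

```python
from collections import Counter

# Hypothetical interest categories, assumed for illustration only.
CATEGORIES = ["news", "sports", "music", "travel"]


def relative_frequency_profile(events, categories=CATEGORIES):
    """Aggregate categorised events into a histogram of relative frequencies.

    Each event is represented simply by its category label. Events whose
    label is not in `categories` are ignored.
    """
    counts = Counter(e for e in events if e in categories)
    total = sum(counts.values())
    if total == 0:
        # No recognised activity yet: return an all-zero histogram.
        return {c: 0.0 for c in categories}
    return {c: counts[c] / total for c in categories}


# Four events, two of them in the "news" category.
events = ["news", "music", "news", "travel"]
profile = relative_frequency_profile(events)
```

In this sketch the profile is a mapping from each category to its share of the user's recorded activity, so the values sum to one whenever at least one recognised event exists.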
Ph.D. dissertation, Universitat Politècnica de Catalunya, Jun. 2017
citation: S. Puglisi. (2017). "Analysis, Modelling and Protection of Online Private Data." Ph.D. dissertation, Universitat Politècnica de Catalunya, Jun. 2017.
On the Anonymity Risk of Time-Varying User Profiles
Websites and applications use personalisation services to profile their users, collect their patterns and activities and eventually use this data to provide tailored suggestions. User preferences and social interactions are therefore aggregated and analysed. Every time a user publishes a new post or creates a link with another entity, either another user or some online resource, new information is added to the user profile. Exposing private data not only reveals information about single users’ preferences, increasing their privacy risk, but can also expose more about their network than single actors intended. This mechanism is self-evident in social networks where users receive suggestions based on their friends’ activities. We propose an information-theoretic approach to measure the differential update of the anonymity risk of time-varying user profiles. This expresses how privacy is affected when new content is posted and how much third-party services get to know about the users when a new activity is shared. We use actual Facebook data to show how our model can be applied to a real-world scenario.
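The idea of a differential update can be sketched as follows. The abstract does not state which information-theoretic quantity is used, so this example assumes, purely for illustration, that risk is measured as the Kullback-Leibler divergence between a user's profile and a reference population profile. The function names and the sample numbers are hypothetical.

```python
import math


def kl_divergence(p, q):
    """KL divergence D(p || q) in bits; terms with p_i == 0 contribute zero.

    Assumes every q_i is strictly positive.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def normalise(counts):
    """Turn non-negative counts (with a positive total) into relative frequencies."""
    total = sum(counts)
    return [c / total for c in counts]


def risk_update(counts, new_category, population):
    """Change in the assumed risk measure after one new event.

    `counts` holds the number of past events per category and
    `new_category` is the index of the category of the new event.
    """
    before = kl_divergence(normalise(counts), population)
    updated = list(counts)
    updated[new_category] += 1
    after = kl_divergence(normalise(updated), population)
    return after - before


# Assumed uniform population profile over four categories.
population = [0.25, 0.25, 0.25, 0.25]
counts = [2, 1, 1, 0]

delta_peak = risk_update(counts, 0, population)  # new event in the dominant category
delta_rare = risk_update(counts, 3, population)  # new event in an unused category
```

Under this assumed measure, an event in the user's already dominant category moves the profile further from the population and the update is positive, while an event in a previously unused category moves it closer and the update is negative.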
citation: S. Puglisi, D. Rebollo-Monedero, J. Forne. (2017). "On the Anonymity Risk of Time-Varying User Profiles." Entropy 2017. 19(5), 190
journal: Entropy 2017
On web user tracking of browsing patterns for personalised advertising
On today’s Web, users trade access to their private data for content and services. App and service providers want to know everything they can about their users, in order to improve their product experience. Also, advertising sustains the business model of many websites and applications. Efficient and successful advertising relies on predicting users’ actions and tastes to suggest a range of products to buy. Both service providers and advertisers try to track users’ behaviour across their product network. For application providers, this means tracking users’ actions within their platform. For third-party services, following users means being able to track them across different websites and applications. It is well known that, while surfing the Web, users leave traces regarding their identity in the form of activity patterns and unstructured data. These data constitute what is called the user’s online footprint. We analyse how advertising networks build and collect users’ footprints and how the suggested advertising reacts to changes in user behaviour.
citation: S. Puglisi, D. Rebollo-Monedero, J. Forne. (2017). "On web user tracking of browsing patterns for personalised advertising." International Journal of Parallel, Emergent and Distributed Systems. 32 (502-521)
journal: International Journal of Parallel, Emergent and Distributed Systems