Website Fingerprinting Attacks
Abstract
Most privacy-conscious users rely on HTTPS and an anonymity network such as Tor to mask source and destination IP addresses. However, it has been shown that encrypted and anonymized network traffic traces can still leak information through a class of attacks called website fingerprinting (WF) attacks. The adversary records the network traffic and observes only the number of incoming and outgoing messages, the size of each message, and the time between messages. Previous work has shown that website fingerprinting achieves an accuracy of over 90% when Tor is used as the anonymity network; an Internet Service Provider can thus successfully identify the websites its users visit. Mitigations against these attacks include adding cover/decoy traffic as noise, padding all network packets to the same size, and introducing network delays to confuse an adversary. Although these mitigations have been shown to be effective, reducing attack accuracy to 10%, their overhead is very high: the latency overhead is above 100% and the bandwidth overhead is at least 40%. We introduce a new, realistic cover traffic algorithm, based on a user's previous network traffic, to mitigate website fingerprinting attacks. In simulations, our algorithm reduces attack accuracy to 14% with zero latency overhead and about 20% bandwidth overhead. In real-world experiments, it reduces attack accuracy to 16% with only 20% bandwidth overhead.

One main concern about website fingerprinting is its practicality. Previous work commonly assumes that a victim visits one website at a time and that the complete network trace of that visit is available. However, this is not realistic. In our work, we aim to narrow the gap between lab experiments and realistic conditions. We propose a new algorithm based on a Hidden Markov Model to handle situations where the victim visits one website after another.
After that, we employ a deep learning algorithm to handle situations where the captured traces are imperfect, such as partial traces, two-page traces, or traces with background noise.
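The abstract does not give the details of the cover traffic algorithm, but its core idea — injecting decoy packets drawn from the user's own traffic history so the noise looks realistic — can be illustrated with a minimal, hypothetical sketch. The function name, the injection probability (chosen here to approximate the stated ~20% bandwidth overhead), and the (direction, size) trace representation are all assumptions for illustration, not the dissertation's actual design.

```python
import random

def generate_cover_traffic(history, real_trace, inject_prob=0.2, seed=0):
    """Hypothetical sketch: pad a live trace with decoy packets.

    history    : list of past traces, each a list of (direction, size) tuples
    real_trace : the live trace to protect
    inject_prob: chance of injecting a decoy after each real packet
                 (~inject_prob bandwidth overhead on average)

    Decoys are sampled from the user's own past traffic, so the added
    noise follows the user's real packet-size distribution rather than
    an arbitrary one.
    """
    rng = random.Random(seed)
    pool = [pkt for trace in history for pkt in trace]
    combined = []
    for pkt in real_trace:
        combined.append(pkt)
        if pool and rng.random() < inject_prob:
            combined.append(rng.choice(pool))  # decoy drawn from history
    return combined
```

Because decoys come from the user's historical distribution and are only interleaved (never delaying real packets), a scheme of this shape adds bandwidth overhead but no latency overhead, consistent with the trade-off the abstract reports.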
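To make the Hidden Markov Model idea concrete: recovering which website was being visited at each point in a back-to-back trace is a standard HMM decoding problem, solvable with the Viterbi algorithm. The toy sketch below is illustrative only — the states, transition/emission probabilities, and discretized packet-size observations ('small'/'large') are invented for the example and are not the dissertation's actual model.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations.

    obs     : list of observation symbols (e.g. discretized packet sizes)
    states  : list of hidden states (e.g. candidate websites)
    start_p : {state: prior probability}
    trans_p : {state: {state: transition probability}}
    emit_p  : {state: {symbol: emission probability}}
    """
    # Log-probabilities avoid underflow on long traces.
    V = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    path = {s: [s] for s in states}
    for t in range(1, len(obs)):
        V.append({})
        new_path = {}
        for s in states:
            prev, score = max(
                ((p, V[t - 1][p] + math.log(trans_p[p][s]) + math.log(emit_p[s][obs[t]]))
                 for p in states),
                key=lambda x: x[1])
            V[t][s] = score
            new_path[s] = path[prev] + [s]
        path = new_path
    best = max(states, key=lambda s: V[-1][s])
    return path[best]
```

In this toy setup, two "websites" emit mostly small or mostly large packets, and decoding a trace that switches mid-stream recovers the point where the victim moved from one site to the next — the sequential-visit scenario the abstract targets.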
Collections
- OSU Dissertations [11222]