How this de-anonymization attack works is hard to explain, but relatively easy to understand once you grasp the essence. A person carrying out the attack needs a few things to get started: a website they control, a list of accounts tied to the people they want to identify as having visited that site, and content posted to the platforms of the accounts on their target list that either allows those accounts to see it or blocks them from seeing it – the attack works both ways.
Next, the attacker embeds that content on the malicious website. Then they wait to see who clicks. If anyone on the target list visits the site, the attacker will know who they are by analyzing which users can (or cannot) see the embedded content.
The attack exploits a number of factors that most people probably take for granted: Many major services – from YouTube to Dropbox – allow users to host media and embed it on third-party sites. Regular users typically have accounts with these ubiquitous services and, crucially, often stay logged in to them on their phones or computers. Finally, these services let users restrict access to content uploaded to them. For example, you can set up your Dropbox account to privately share a video with one or a handful of other users. Or you can post a video publicly on Facebook but block certain accounts from viewing it.
These “block” or “allow” relationships are at the heart of how the researchers found they could reveal identities. In the “allow” version of the attack, for example, hackers can quietly share an image on Google Drive with a Gmail address of potential interest. Then they embed the image on their malicious web page and lure the target there. When a visitor’s browser tries to load the image via Google Drive, the attacker can infer whether that visitor has permission to access the content – and, by extension, whether they control that email address.
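To make that logic concrete, here is a minimal, hypothetical Python sketch – the names and structure are illustrative, not taken from the researchers' code – of how the outcome of the embedded-content request maps back to an identity in both variants of the attack:

```python
def infer_identity(mode, content_loaded, target):
    """Map the result of an embedded-content request to an identity guess.

    mode: "allow" if the content was privately shared with the target,
          "block" if it was public but the target was blocked from it.
    content_loaded: whether the visitor's browser could access the content
                    (in the real attack this is inferred via a side channel,
                    never observed directly).
    target: the account the content was shared with, or blocked for.
    """
    if mode == "allow":
        # Only the target was granted access, so a successful load
        # means the visitor is logged in as the target.
        return target if content_loaded else None
    if mode == "block":
        # Everyone except the target can see the content, so a failed
        # load means the visitor is the blocked target.
        return None if content_loaded else target
    raise ValueError("mode must be 'allow' or 'block'")

print(infer_identity("allow", True, "target@example.com"))
```

In this toy model, `infer_identity("allow", True, …)` identifies the visitor as the target, while `infer_identity("block", True, …)` rules them out – which is why the attack works in both directions.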
Because of the major platforms’ existing privacy protections, the attacker cannot directly check whether a site visitor was able to load the content. But the NJIT researchers realized they could analyze accessible information about the victim’s browser and the behavior of their processor while the request was being made, and from that conclude whether the content request was allowed or denied.
The technique is known as a “side channel attack” because the researchers found they could make this determination accurately and reliably by training machine-learning algorithms to analyze seemingly unrelated data about how the victim’s browser and device process the request. Once the attacker knows that the one user they allowed to see the content has seen it (or that the one user they blocked was blocked), they have de-anonymized the site visitor.
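The classification step can be illustrated with a deliberately simplified toy model. The sketch below stands in for the researchers' machine-learning pipeline with a basic nearest-centroid classifier over simulated timing measurements; the feature (a single request timing) and the sample values are my own assumptions, chosen only to show the idea that "allowed" and "denied" requests can leave distinguishable footprints:

```python
from statistics import mean

def train_centroids(allowed_samples, denied_samples):
    # Learn one mean timing profile per class from calibration runs the
    # attacker performs on their own accounts (the real attack trains on
    # much richer browser- and processor-level features).
    return {"allowed": mean(allowed_samples), "denied": mean(denied_samples)}

def classify(timing, centroids):
    # Assign an observed timing to the class with the nearest centroid.
    return min(centroids, key=lambda label: abs(timing - centroids[label]))

# Hypothetical calibration data, in milliseconds: in this toy model,
# denied requests resolve faster than allowed ones.
centroids = train_centroids(
    allowed_samples=[120.0, 130.0, 125.0],
    denied_samples=[40.0, 45.0, 42.0],
)
print(classify(118.0, centroids))  # closest to the "allowed" centroid
```

The point is not the specific numbers but the structure: the attacker calibrates on requests whose outcome they control, then classifies the victim's request without ever seeing the content decision directly.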
As complicated as it may sound, the researchers warn that the attack would be easy to carry out once an attacker has done the preparatory work. It would take only a few seconds to potentially expose each visitor to the malicious site – and it would be virtually impossible for an unsuspecting user to detect. The researchers developed a browser extension that can thwart such attacks, available for Chrome and Firefox. However, they note that it may affect performance and is not available for all browsers.
Through a broad disclosure process involving several web services, browsers, and web-standards bodies, the researchers say they have started a larger discussion about how to address the problem comprehensively. So far, Chrome and Firefox have not publicly responded. And Curtmola says that solving the problem at the chip level would require fundamental, and probably infeasible, changes to the way processors are designed. Still, he says collaborative discussions through the World Wide Web Consortium or other forums could ultimately produce a broad solution.
“Vendors are trying to see whether it’s worth the effort to fix this,” he says. “They need to be convinced that it’s a serious enough problem to invest in fixing it.”