Computers, Privacy & the Constitution

Do users have a right to control data about themselves?

-- By LeonHuang - 05 Mar 2017 (Revised - 29 September 2017)

Protecting our privacy depends on finding someone we can trust. Our quest for privacy began when we lost trust in online service providers like Google and Facebook to protect their users’ privacy. The quest continued with the realization that (1) we have become addicted to the cornucopia of convenience provided by Google, Facebook, and the like and (2) we do not have the technical know-how to reinvent the wheel. The quest will reach its end when we can entrust someone else to do that for us.

I was naïve to trust you, Google.

Online service providers, like Google and Facebook, keep records of how users use their services. When our initial ecstasy over free sign-ups subsides, we become worried about the threat to our privacy. Although the service providers promise, as conscientiously as they can, to strip away identifying information when they collect the data, we understand that data can never be perfectly anonymized, and our worries remain.

Data devoid of identifying information may still threaten our privacy. For example, the TLC Trip Record Data provides publicly available information on the dates, times, and locations of all taxi pick-ups and drop-offs in New York City in a given year. Although the data does not include the identity of the passengers, it nevertheless increases the risk of privacy violations when it is used in conjunction with other publicly available information. Celebrity cab rides become easier to identify. Anyone can find a photo of a celebrity getting into a cab and use the date, time, and location to find out where the celebrity was going. In turn, people may easily find out where the celebrity lives. The average Joe faces the same risks. An acquaintance may easily find out where you live after seeing you leave in a cab after work. This tension between privacy and highly accurate geolocation data has led to a proposal that the TLC should reveal only census tracts instead of the exact coordinates.1
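The effect of releasing coarser locations can be sketched in a few lines of Python. This is a hypothetical illustration, not the TLC's method or the cited proposal's: the coordinates are invented, and rounding to two decimal places is only a rough stand-in for a census tract, which in reality is an irregular polygon.

```python
from collections import Counter

# Invented drop-off coordinates (latitude, longitude); not real TLC records.
dropoffs = [
    (40.741895, -73.989308),
    (40.742310, -73.988150),
    (40.744012, -73.991477),
    (40.758896, -73.985130),
]

def coarsen(point, places=2):
    """Round a coordinate pair to a grid cell (an assumed stand-in for a census tract)."""
    lat, lon = point
    return (round(lat, places), round(lon, places))

# Count how many drop-offs share each exact location and each coarse cell.
exact_counts = Counter(dropoffs)
coarse_counts = Counter(coarsen(p) for p in dropoffs)

# Exact coordinates: every drop-off stands alone and points at one address.
smallest_exact_group = min(exact_counts.values())
# Coarse cells: several drop-offs become indistinguishable from one another.
largest_coarse_group = max(coarse_counts.values())

print("smallest group with exact coordinates:", smallest_exact_group)
print("largest group after coarsening:", largest_coarse_group)
```

The larger the group of trips that share a published location, the less a single record says about where any one passenger went. The cost is that the data becomes less useful for analysis that needs street-level detail.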

Anonymized data can reveal key information about users’ identity when it is combined with data from other sources. A link can be established between datasets when they overlap significantly, and such overlaps are not as difficult to find as one would expect. In the case of the TLC Trip Record Data, date, time, and location provide enough overlap to link a particular cab ride to the rider in the photograph. In the case of purportedly anonymized records collected by service providers, there have already been efforts to combine such data from multiple sources to construct complete profiles.2
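The linking step described here can be sketched as a simple join on time and place. Everything in the sketch is an assumption made for illustration: the trip records, the sighting, the field names, and the five-minute window are invented and do not reflect the actual layout of the TLC data.

```python
from datetime import datetime, timedelta

# Invented "anonymized" trip records: no passenger names, only times and places.
trips = [
    {"pickup_time": datetime(2016, 7, 8, 23, 10),
     "pickup_place": "W 14th St & 9th Ave",
     "dropoff_place": "E 72nd St & Park Ave"},
    {"pickup_time": datetime(2016, 7, 8, 21, 45),
     "pickup_place": "W 14th St & 9th Ave",
     "dropoff_place": "Bedford Ave & N 7th St"},
    {"pickup_time": datetime(2016, 7, 8, 23, 11),
     "pickup_place": "Broadway & W 42nd St",
     "dropoff_place": "W 110th St & Broadway"},
]

# Invented outside information: a dated photo of a known person entering a cab.
sighting = {
    "person": "Celebrity X",
    "time": datetime(2016, 7, 8, 23, 12),
    "place": "W 14th St & 9th Ave",
}

def link(sighting, trips, window=timedelta(minutes=5)):
    """Return trips picked up at the sighting's place within the time window."""
    return [
        t for t in trips
        if t["pickup_place"] == sighting["place"]
        and abs(t["pickup_time"] - sighting["time"]) <= window
    ]

matches = link(sighting, trips)
if len(matches) == 1:
    # A single match ties the "anonymous" ride to a named person.
    print(sighting["person"], "was probably dropped off at", matches[0]["dropoff_place"])
```

Neither dataset names the passenger's destination on its own. The overlap in date, time, and location is what connects them, which is why removing names alone does not anonymize the records.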

We cannot entrust our privacy to the service providers, so long as they keep on collecting our data. And the service providers will keep on collecting our data. It is key to the business model which has made them so valuable to their investors, both private and public.

You know nothing, John Doe.

We, as the regular John Does and Jane Does, do not have the technical know-how to provide for ourselves. A personal story can illustrate this point: I am addicted to the easy access to all my files on the go enabled by cloud services like Google Drive. In March, I set out to install a personal cloud that I hoped would do the same thing with anonymity.

After two hours of preliminary research, I learned what kind of hardware I needed to purchase: a single-board computer such as a Raspberry Pi plus accessories such as an SD card for storage, at a total of $86.92. Once I obtained the equipment, I spent another two hours learning how to install the operating system image and how to communicate with the equipment from my laptop. Then I spent four hours installing personal cloud software on the equipment, following a guide that I found online. Finally, in order to make my cloud accessible from outside my local network, I spent another two hours setting up port forwarding and dynamic DNS. In sum, I spent 10 hours of my time and $86.92 of my dime to set up a workable personal cloud with 32GB of storage. In return, I gained the freedom to use cloud storage without being forced to have my data collected by someone else.

It turns out the tradeoff is not limited to the initial setup cost. My personal cloud is excruciatingly slow compared to the established cloud services. It sometimes stops responding until I reboot the equipment, making it rather unreliable as a service intended for remote access. And I still cannot trust its security, because I know it is set up and maintained by an amateur with little knowledge of network security and little time even to keep its operating system up to date.

I cannot think of any files that I need to access remotely from time to time, that are worth tolerating the quirks of my personal cloud in order to keep service providers from harvesting data on me, and that are not so sensitive that I would worry about targeted hacking. In the end, I did not know what to do with my personal cloud. I pulled the plug by the end of April.

According to Professor Moglen, I would have been better off had I consulted with experts. I would have spent $150 instead of $90 on a single-board computer much faster than a Raspberry Pi, and I would have installed the FreedomBox software, which would have given me a much more powerful personal cloud. Taking his point further, I now believe my laughable attempt to reinvent the wheel in 10 hours is an affront to the highly specialized division of labor that we associate with modern civilization.

Can I trust you?

If we cannot trust the established service providers or ourselves, the only way out is to seek help from other experts in the field. In my case, I could have asked Professor Moglen. In the more general case, we would need to find someone who (A) has the necessary expertise, (B) has little or no conflict of interest, and (C) cares enough about privacy to conduct due diligence. While someone who meets all three criteria can be hard to find within one’s social circle, he or she is likely within reach over the internet. But how can I be certain when someone on the internet claims to meet all three criteria?

Building trust among strangers over the internet is as difficult as it is in the real world. When the apparent stakes are high, people are willing to go to extreme lengths to prove their genuine intentions. For example, the initiation of Zcash, a cryptocurrency, involved a lengthy ceremony conducted simultaneously by several participants across the globe while being video-recorded live from all angles.3 In the case of privacy protection, the stakes are less apparent and the consequences less direct. How can we make sure that a website purporting to provide secure cloud services is genuine? Privacy protection in the end revolves around this question of trust.

r4 - 29 Sep 2017 - 21:47:44 - LeonHuang
All material on this collaboration platform is the property of the contributing authors.
All material marked as authored by Eben Moglen is available under the license terms CC-BY-SA version 4.