Computers, Privacy & the Constitution

Privacy Concerns with AI Language Models

Section 1. Sora and Virtual Reality

In the new digital world, technological innovations continue to muddy the boundaries of privacy and personal autonomy. Sora, OpenAI's generative video model, which turns textual prompts into strikingly photorealistic scenes, is of particular concern. While its developers promote educational uses such as training simulations and reenactments of important historical events, the potential for more "vulgar" uses like pornography and violence is immediately apparent. Users can render any scenario they want and, by viewing it through virtual reality hardware, experience an immersive reality of their choosing. As the technology evolves, a user could theoretically put on a VR headset, prompt the model by voice, and watch the environment change around them. There are supposedly safety measures in place to limit what users are able to recreate, but it is not clear how effective they will be or how far the limitations reach. This essay examines the implications of Sora's capacity to create imagery that is indistinguishable from reality, the probable misuses of that technology, and the importance of preserving privacy amid the surge of advanced digital technologies, against a historical backdrop of government intrusion and private misuse of sensitive data.

Section 2. Sora's Capacity to Reveal Deep Secrets

Sora's ability to construct immersive imagery from textual prompts grants individuals enormous creative latitude. While this fosters creativity and exploration, it also presents notable risks, particularly the risk of revealing personal secrets. The allure of anonymity and the appearance of digital confidentiality may embolden individuals to share their secrets through Sora under the guise of fiction or artistic expression.

Consider a concrete example. A user, Bob, decides to use Sora to create a virtual scenario based on a personal experience he wants to visualize for artistic purposes. Bob inputs a series of prompts describing a specific memory, perhaps a secluded beach where he once had a significant conversation. Unbeknownst to Bob, his prompts contain personal details: names, locations, emotional nuances. As Sora interprets these prompts and generates the immersive virtual reality, it faithfully incorporates those details into the digital environment. The resulting virtual landscape not only visually replicates Bob's memory but also embeds personal identifiers within the scene. The critical point is that Bob's intention was creative expression, not data disclosure. Yet through Sora's transformative process, the personal elements inherent in his prompts are unintentionally exposed in the virtual output. What began as a private artistic endeavor could divulge intimate details to anyone who interacts with or accesses the resulting digital creation.

This scenario also underscores the distinction between Sora and a conventional search box. A search engine retrieves existing information in response to an explicit query; Sora operates on a more interpretive level. It does not merely retrieve data; it generates new content based on user prompts, potentially revealing the hidden information embedded within those prompts.
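To make the mechanism concrete, here is a minimal sketch, in Python, of what happens when a prompt like Bob's is sent to any hosted generation service. The endpoint, field names, and prompt details below are hypothetical (Sora has no published public API to cite here); the point is simply that the prompt travels to the operator verbatim, personal details included.

```python
import requests

# Hypothetical endpoint standing in for any hosted text-to-video
# service; this is not OpenAI's actual Sora API.
GENERATION_ENDPOINT = "https://hosted-video-service.example/v1/generate"

# Bob thinks of this as an artistic prompt. To the service operator it
# is data about Bob: a real place, a named companion, a private event.
prompt = (
    "A secluded beach at sunset where Bob tells Alice he is leaving, "
    "the conversation from our last trip together, melancholy mood."
)

# Nothing in the client distinguishes 'creative' text from 'personal'
# text: the prompt is transmitted verbatim and retained according to
# whatever policy the operator chooses to follow.
response = requests.post(
    GENERATION_ENDPOINT,
    json={"prompt": prompt, "duration_seconds": 10},
    timeout=30,
)
print(response.status_code)
```

The search-box comparison sharpens here as well: a search query is usually a few generic keywords, while a generative prompt rewards exactly the kind of specific, personal detail that makes Bob's memory identifiable.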

Section 3. Internal Company Turmoil Raises Data Privacy Concerns

OpenAI, Sora's developer and owner, is a young company. It was founded in 2015 as an effective-altruism-branded nonprofit with the goal of promoting and developing human-friendly AI in a transparent and public manner. In 2019, the company transitioned from a nonprofit to a capped-profit structure, and OpenAI's valuation skyrocketed into the billions with the release of the first public edition of ChatGPT in November 2022. Concerningly, the company has already weathered severe internal turmoil. Sam Altman, the CEO of OpenAI, was ousted by the board of directors without warning in November 2023. He was reinstated a few days later, after most employees threatened to quit unless he returned, but the episode remains a red flag signaling the company's lack of cohesion and unity. That is particularly worrying given the vast amount of personal and private data that Sora and ChatGPT will give the company access to. Additionally, it is well known that software can use browser-history detection to infer users' personal preferences; whether such techniques could also reveal Sora usage is unclear. Nor is it clear what steps the company takes to keep this data secure or what data it retains. The transition from effective-altruism nonprofit to for-profit enterprise also raises the concern that OpenAI will attempt to capitalize on the vast amount of data it collects.

Section 4. Intensely Private Data in a Surveillance State

Government intrusion into the private sphere is by no means unheard of. Government surveillance programs, such as the United States National Security Agency's (NSA) mass surveillance program revealed by whistleblower Edward Snowden, have demonstrated the extent to which governments will intrude upon individuals' privacy in the name of national security. The NSA's bulk collection of telecommunications metadata, including phone records and, particularly relevant here, internet activity, raises significant concerns about the erosion of privacy and civil liberties. The FBI can legally hack your computer, and the CIA monitors social media. The government will take advantage of any opportunity to gather data on its citizens, and Sora will hold data that reveals citizens' most personal secrets, which gives the government an even more compelling reason to want it. There is no rational state interest in viewing citizens' private Sora creations, yet it seems extremely likely that the government can and will access this information.

Section 5. Conclusion

Sora's capacity to unveil personal secrets in digital form highlights the importance of safeguarding privacy in the new digital age. While technological progress offers unparalleled opportunities for innovation and connectivity, it simultaneously poses significant risks to personal privacy and autonomy. By fortifying privacy protections, we can mitigate the risks of unauthorized disclosure and exploitation, cultivating a digital landscape that respects and preserves individual privacy in the pursuit of progress and innovation.

-- JaredBivens - 01 Mar 2024

There are a couple of obvious routes to improvement. You don't ever substantiate the idea that secrets are "revealed" because a transformer is given user prompts. An actual illustrative example is needed. Why this is different than the search box should also be explained.

In essence, this is just another example of a service being offered in return for the surveillance of the user, which in this case is relatively closely limited to the input. But running the model by presenting input to an instance running elsewhere, like any remote service, is replaceable. I don't use a speech-to-text bot to make my audio transcriptions remotely, though that would be very "convenient." I run whisper.cpp on one of my own computers, so the actual recording of my speech in class is never transmitted to a third party (though I will make it available to the public anyway, through the web). That model running in my own computer also transcribes my day-to-day voice notes, which are indeed private, privileged, and definitely not something to be given to a third-party listener of any kind. But I don't have to. If I wanted to be running Mistral in a laptop, or one of the Meta LLaMa products, I could be doing that too. Using slightly different software slightly differently changes the entire nature of the situation and completely solves the problem. That's important to keep in mind, and to help the reader understand.
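To make that local alternative concrete, here is a minimal sketch assuming whisper.cpp has already been built on the user's machine and a ggml model downloaded; the paths are placeholders, and the binary name and CLI flags may differ between whisper.cpp releases.

```python
import subprocess
from pathlib import Path

# Placeholder paths: adjust to wherever whisper.cpp was built and the
# model was downloaded. Older builds name the binary "main"; newer
# releases ship it as "whisper-cli".
WHISPER_BIN = Path("~/src/whisper.cpp/main").expanduser()
MODEL = Path("~/src/whisper.cpp/models/ggml-base.en.bin").expanduser()
RECORDING = Path("~/notes/voice-memo.wav").expanduser()

# Run the transcription entirely on this machine. With -otxt the
# transcript is written alongside the input as <name>.wav.txt; neither
# the audio nor the text is ever sent to a third-party service.
subprocess.run(
    [str(WHISPER_BIN), "-m", str(MODEL), "-f", str(RECORDING), "-otxt"],
    check=True,
)

print(Path(str(RECORDING) + ".txt").read_text())
```

The same pattern applies to text generation: a quantized Mistral or LLaMA model run through local inference software keeps the prompts, and whatever they reveal, on hardware the user controls.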
