ChatGPT’s Other Risk: Oversharing Confidential Data
As many as 6.5% of employees have pasted their organization’s data into ChatGPT, and 3.1% have copied and pasted sensitive data to the program, recent research showed.
It’s not necessarily what’s coming out of ChatGPT that’s concerning security researchers, but what’s going in.
After being trained on massive numbers of digital text-based sources scraped from the internet, OpenAI’s ChatGPT can produce essays, presentations, articles and computer code when presented with a carefully crafted prompt. It finds the most relevant answers in its knowledge base and assembles that information into human-friendly formats or programming scripts. Even with practiced, specific prompts, some of its responses contain untrustworthy, incorrect, outdated or even nonsensical information. Still, its many fans say it can produce a pretty good draft or code script for a human to evaluate and follow as appropriate.
But the risks aren’t just about what the program puts out; they are also about what users put in.
Sometimes users submit data to ChatGPT as part of a conversation to help them refine their queries, perhaps asking for a cleaned-up version of meeting notes or code. That uploaded information is used to improve the program’s AI models and is essentially public. Sensitive data that is entered may not stay within ChatGPT; it may also be shared with “other consumer services,” according to OpenAI’s data usage policies.
The site’s FAQ tells users that specific prompts cannot be deleted from their conversation histories, warning users: “Please don't share any sensitive information in your conversations.” But not everyone has been paying attention.
In early April, two different programmers at Samsung’s Korean semiconductor business sent ChatGPT some buggy, confidential computer code, asking the AI to find the problems and fix them. When a third employee sent meeting notes to ChatGPT asking for a summary, company leaders realized the risk of exposing proprietary information and limited each employee’s ChatGPT prompt to 1,024 bytes.
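How a byte cap like the one Samsung reportedly imposed might be checked can be sketched in a few lines of Python. This is only an illustration: the article does not say how the limit is enforced, the function names are hypothetical, and UTF-8 encoding is an assumption.

```python
# Minimal sketch of a prompt-size check. The 1,024-byte figure comes from
# the reporting above; UTF-8 encoding and all names here are assumptions.
MAX_PROMPT_BYTES = 1024


def prompt_size_bytes(prompt: str) -> int:
    """Return the size of the prompt in bytes when encoded as UTF-8."""
    return len(prompt.encode("utf-8"))


def is_within_limit(prompt: str, limit: int = MAX_PROMPT_BYTES) -> bool:
    """Return True if the encoded prompt fits within the byte limit."""
    return prompt_size_bytes(prompt) <= limit
```

A cap counted in bytes is not the same as a cap counted in characters. Under the UTF-8 assumption, each Korean Hangul syllable takes three bytes, so a 1,024-byte limit allows far fewer Korean characters than plain English letters.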
As many as 6.5% of employees have pasted company data into ChatGPT, and 3.1% have copied and pasted sensitive data to the program, according to research by the data security company Cyberhaven, which analyzed ChatGPT use across 1.6 million workers who use its product.
Stopping this risky user behavior is challenging. Traditional cybersecurity solutions cannot prevent users from pasting text into ChatGPT in a browser, leaving organizations unable to assess the scope of the problem. Many security products can track when files are uploaded to or downloaded from the internet, but not when users copy information from their screens and paste it into a ChatGPT browser window, Cyberhaven VP of Marketing Cameron Coles wrote in a blog post. Additionally, confidential data may not be as easily recognized and blocked as Social Security numbers or credit card numbers. “Without knowing more about its context, security tools today can’t tell the difference between someone inputting the cafeteria menu and the company’s M&A plans,” he said.
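The gap Coles describes can be illustrated with a short Python sketch of pattern-based detection. It is a simplified assumption about how such checks generally work, not a description of Cyberhaven’s or any other vendor’s product. The regular expressions and function names are hypothetical, and the card check uses the standard Luhn checksum.

```python
import re

# Hypothetical, simplified patterns for illustration only.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_sensitive_patterns(text: str) -> list[str]:
    """Return labels for the fixed-format identifiers found in the text."""
    findings = []
    if SSN_PATTERN.search(text):
        findings.append("ssn")
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            findings.append("card")
            break
    return findings
```

A check like this flags text containing something shaped like a Social Security number or a card number that passes the checksum. It returns an empty list for both a cafeteria menu and a paragraph of acquisition plans, because neither has a fixed format to match.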
Due to the risks, a number of major companies have restricted their employees from using ChatGPT, including Amazon, Walmart, Accenture, Verizon, JPMorgan Chase and several other financial institutions.
National Association of Counties Chief Information Officer Rita Reynolds offers a number of strategies for local governments considering the use of ChatGPT.
In an April 4 article, she advised counties to create a responsible AI policy that addresses privacy and data security, transparency and accountability, fairness and bias as well as informed consent. Staff should be trained on ChatGPT’s risks and how to use it responsibly. She also suggested IT staff investigate an in-house, enterprise ChatGPT through the Azure OpenAI Service.
“Counties’ use of ChatGPT is in its infancy,” Reynolds said. “Even with the challenges, the benefits are significant.”