OpenAI: A Redis bug caused a recent ChatGPT data exposure incident

Pierluigi Paganini March 26, 2023

OpenAI revealed that a Redis bug was the root cause of the recent exposure of users’ personal information and chat titles in the ChatGPT service.

On Friday, OpenAI revealed that the recent exposure of users’ personal information and chat titles in its chatbot service was caused by a bug in the Redis open-source library.

On March 20, 2023, several ChatGPT users began reporting that other users’ conversation histories were appearing in their accounts.

The same day, the history function showed the error message “Unable to load history,” and the chatbot service was temporarily interrupted. OpenAI CEO Sam Altman published a message about the incident.

The company identified the bug and quickly addressed it.

“We took ChatGPT offline earlier this week due to a bug in an open-source library which allowed some users to see titles from another active user’s chat history. It’s also possible that the first message of a newly-created conversation was visible in someone else’s chat history if both users were active around the same time,” reads an update published by the company.

The company investigated the impact of the issue and discovered that it may have caused the unintentional visibility of payment-related information of 1.2% of the ChatGPT Plus subscribers who were active during a specific nine-hour window. The company pointed out that full credit card numbers were never exposed.

“In the hours before we took ChatGPT offline on Monday, it was possible for some users to see another active user’s first and last name, email address, payment address, the last four digits (only) of a credit card number, and credit card expiration date. Full credit card numbers were not exposed at any time,” continues the update.

The company discovered that the bug was present in redis-py, the open-source Redis client library. The service uses Redis to cache user information on its server.

OpenAI uses the redis-py library to interface with Redis from its Python server, which runs with Asyncio.

The library maintains a shared pool of connections between the server and the cluster. According to the company, once a request completes, its connection is returned to the pool and recycled for another request.
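To illustrate what recycling a connection means, here is a minimal sketch of a pool. It is not the redis-py implementation: the ConnectionPool class, its methods, and the string "connections" are hypothetical stand-ins chosen for the example.

```python
class ConnectionPool:
    """Hypothetical toy pool: hands out idle connections and takes them back."""

    def __init__(self, size):
        # Plain strings stand in for real network connections.
        self._idle = [f"conn-{i}" for i in range(size)]

    def acquire(self):
        # Take an idle connection out of the pool for one request.
        return self._idle.pop()

    def release(self, conn):
        # Return the connection so a later request can reuse it.
        self._idle.append(conn)


pool = ConnectionPool(size=1)

first = pool.acquire()   # request 1 borrows the only connection
pool.release(first)      # request 1 is done, the connection goes back
second = pool.acquire()  # request 2 receives the very same connection

reused = first is second
```

Because the same connection object serves unrelated requests, any state left behind on it by one request is visible to the next.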

“When using Asyncio, requests and responses with redis-py behave as two queues: the caller pushes a request onto the incoming queue, and will pop a response from the outgoing queue, and then return the connection to the pool,” continues the update. “If a request is canceled after the request is pushed onto the incoming queue, but before the response popped from the outgoing queue, we see our bug: the connection thus becomes corrupted and the next response that’s dequeued for an unrelated request can receive data left behind in the connection.”
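The failure mode described in the quote can be reproduced with a small simulation. This is a sketch under stated assumptions, not redis-py code: FakeConnection, get, and demo are hypothetical names, and an asyncio.Queue stands in for the ordered stream of replies on a real connection.

```python
import asyncio


class FakeConnection:
    """Hypothetical stand-in for a pooled connection: replies arrive in send order."""

    def __init__(self):
        self.responses = asyncio.Queue()

    async def send(self, key):
        # The simulated server answers every request it receives, in order.
        self.responses.put_nowait(f"value-for-{key}")

    async def read(self):
        await asyncio.sleep(0.01)  # simulated network latency
        return await self.responses.get()


async def get(conn, key):
    await conn.send(key)      # the request is now on the wire
    return await conn.read()  # a cancellation here strands the reply


async def demo():
    conn = FakeConnection()  # one shared connection, as if recycled by a pool

    # User A's request is canceled after sending but before reading.
    task_a = asyncio.create_task(get(conn, "user-a"))
    await asyncio.sleep(0)  # let task_a run until it waits inside read()
    task_a.cancel()
    try:
        await task_a
    except asyncio.CancelledError:
        pass

    # User B reuses the connection and dequeues A's leftover reply.
    return await get(conn, "user-b")


result = asyncio.run(demo())
print(result)
```

In this toy model, user B's request returns the value meant for user A, because A's unread reply was still first in line on the shared connection.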

The company explained that the corrupted data matches the data type the requester was expecting only in some cases. When it does, the response returned from the cache appears valid even though it belongs to another user.

On March 20, the company accidentally introduced a change to its server that caused a spike in Redis request cancellations. As a result, each connection had a chance of returning data belonging to another user.

The company notified impacted users and also implemented redundant checks to ensure the data returned by its Redis cache matches the requesting user.
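OpenAI has not published how these checks work. One way such a check could look, purely as an illustration, is to store the owner's identifier alongside each cached value and verify it on every read. The function names and the JSON layout below are assumptions made for this sketch.

```python
import json


def encode_entry(user_id, payload):
    # Hypothetical cache format: the owner is stored next to the payload.
    return json.dumps({"owner": user_id, "payload": payload})


def decode_entry(raw, requesting_user_id):
    # Reject any entry whose recorded owner is not the requester.
    entry = json.loads(raw)
    if entry.get("owner") != requesting_user_id:
        return None  # treat a mismatch as a cache miss
    return entry["payload"]


cached = encode_entry("alice", {"email": "alice@example.com"})

own_read = decode_entry(cached, "alice")    # owner matches, payload returned
foreign_read = decode_entry(cached, "bob")  # owner differs, nothing returned
```

With a check like this, a reply delivered to the wrong request is discarded instead of being shown to the wrong user, regardless of how the mix-up happened.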

