Personal and social information of 1.2B people exposed on an open Elasticsearch install

Pierluigi Paganini November 22, 2019

A security duo discovered the personal and social information of 1.2 billion people exposed online on an unsecured Elasticsearch server.

Researchers Bob Diachenko and Vinny Troia discovered an unsecured Elasticsearch server containing an unprecedented 4 billion user accounts.

The database, discovered on October 16, 2019, contained more than 4 terabytes of data, making it one of the largest data leaks from a single source organization in history.

The leaked data contained names, email addresses, phone numbers, LinkedIn and Facebook profile information.

According to the researchers, it contains personal and social information that appears to originate from 2 different data enrichment companies.

“The discovered Elasticsearch server containing all of the information was unprotected and accessible via web browser at No password or authentication of any kind was needed to access or download all of the data.” reads the post published by the experts.

“The majority of the data spanned 4 separate data indexes, labeled “PDL” and “OXY”, with information on roughly 1 billion people per index. Each user record within the databases was labeled with a “source” field that matched either PDL or Oxy, respectively.”
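To illustrate the labeling the researchers describe, a minimal Python sketch of grouping records by their "source" field might look like the following. The sample records and the function name count_by_source are hypothetical; only the field name and its two labels come from the post, and the case normalization is an assumption based on the labels appearing as both "OXY" and "Oxy".

```python
from collections import Counter

# Hypothetical sample records: only the "source" field and its two
# labels (PDL, Oxy/OXY) come from the researchers' description.
records = [
    {"source": "PDL", "name": "example person 1"},
    {"source": "Oxy", "name": "example person 2"},
    {"source": "PDL", "name": "example person 3"},
]


def count_by_source(rows):
    """Count records per "source" label, normalizing case (assumption)."""
    return Counter(str(row.get("source", "unknown")).upper() for row in rows)


counts = count_by_source(records)
print(counts["PDL"], counts["OXY"])
```

A tally of this kind only shows how records split across the two labels; it says nothing about who operated the server.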


Researchers believe the data in the PDL indexes originated from People Data Labs, a data aggregator and enrichment company.

The archive contained nearly 3 billion PDL user records associated with roughly 1.2 billion unique people, including 650 million unique email addresses. The data in the three PDL indexes appears to have been scraped from LinkedIn (for example, email addresses and phone numbers) and from social media profiles, such as a person’s Facebook, Twitter, and GitHub URLs.

The experts reported their findings to PDL, which replied that the exposed Elasticsearch instance does not belong to it.

“In order to test whether or not the data belonged to PDL, we created a free account on their website which provides users with 1,000 free people lookups per month.

“The data discovered on the open Elasticsearch server was almost a complete match to the data being returned by the People Data Labs API. The only difference being the data returned by the PDL also contained education histories.”

“When I checked my account on, the returned results were identical – including that phone number.
Since I have never seen this phone number appear in any of my previously breached/leaked records, this is a very good indication that the leaked database originated from PDL.”
continues the post.

The exposed archive also includes records that appear to belong to the data enrichment company OxyData.

The “Oxy” database contained records scraped from LinkedIn, including recruiter information. Once notified of the discovery, OxyData told the researchers that the server did not belong to it.

The researchers speculate that the server was operated by an organization that is a customer of both People Data Labs and OxyData. At this stage, however, it is impossible to attribute ownership of the server to a specific company.

“If this was a customer that had normal access to PDL’s data, then it would indicate the data was not actually “stolen”, but rather mis-used. This unfortunately does not ease the troubles of any of the 1.2 billion people who had their information exposed.” concludes the post.

“Because of obvious privacy concerns cloud providers will not share any information on their customers, making this a dead end.
Agencies like the FBI can request this information through legal process (a type of official Government request), but they have no authority to force the identified organization to disclose the breach.”


Pierluigi Paganini

(SecurityAffairs – Elasticsearch server, social information)
