Dirty Harry and His Black Coffee: What Does One Do After Finding 472,461 Email Addresses Online?
I’ve found some crazy stuff on the internet over the years.
I’ve long been a fan of Google Dorks. For many years, I would whittle away at the returned search results from a Google Dork, only to stumble across something strange. Most of the time I neglected to write down the odd things that I found. Even when I found it necessary to contact someone, I often did it with ProtonMail email accounts that I used for a few weeks and then discarded. Or, before Google and Yahoo became completely enamored with two-factor authentication using a mobile number, I would simply create dozens upon dozens of email accounts for “one-and-done” communication.
Here on Medium I’ve written about email addresses from PayPal accounts being spit out in error messages and .gov domains incorrectly hosting the resumes of folks applying for municipal jobs. After years of finding stuff like this and shooting off emails, I figured that I might as well try to write some of these down — especially as I have kind of forgotten some of them over time.
Pointing out (in the “responsible disclosure” lingo of latte-land) PII or PHI data leaks or misconfigured SharePoint forms is always a thankless job. No one wants to hear from you, and most organizations have zero incentive to acknowledge or thank you for what you’ve reported to them. I’ve been thanked by SAP in their hall of fame. But that medium-sized company that you responsibly alerted to the fact that they were exposing the Social Security numbers and the first and last names of thousands of waitresses and nurses? Think they’ll thank you in any meaningful way?
Yeah. No way, Jose.
Yes, there are bug bounty programs. And yes, there are “pathways” of “responsible disclosure” in the United States. But the corporate wonderland of the United States gives all possible avenues of escape and coddling behavior to corporations — especially to big tech.
If you call someone a crack-addled out-of-control fiend in a letter to your local newspaper? Your local newspaper will likely be sued out of existence for publishing. Call someone a crack-addled out-of-control fiend on social media? No worries. God bless that Section 230.
Find an American corporation or entity that is leaking PII or PHI from thousands (millions maybe) of Americans? One must tread lightly down the primrose path of “responsible” disclosure. Find the PII or PHI of thousands of Americans on some shady website in the Philippines or Russia that was discovered after just a click or two from a Google search? Have fun with that. Google’s responsibility? Fuck no. Any recourse for the American consumer? You’re joking, right?
Which kind of gets to the heart of an issue that plagues “responsible disclosure” and data leaks and everything else: what does one do when the attribution of something is difficult to discern?
The frustrating thing? There’s virtually no one to report this to. The company that’s hosting the data, Google that’s indexing said data, the individuals whose private data is now indexed into search engines — there’s no one place to report this, and almost as if by design, there’s no real way to confront an issue like this. Europe has the GDPR, but with the clock ticking on the Biden administration, it sure doesn’t look like anyone in Washington will do anything to upset the corporate suits.
Take a look at this thing that I found last night — this is the kind of stuff that one will immediately start finding once one starts looking.
You find nearly half a million email addresses online that are stored improperly — what do you do?
Yesterday, at around 10:00 at night, I was browsing through some different things online. I decided (as I will do occasionally) to throw up a random Google Dork search term.
I try a simple and common one that searches all .com domains and searches for a URL that includes a string related to email addresses from Yahoo. This is the kind of search I try all the time, every week or so, just to see what might come up.
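For readers who haven’t used search operators before, here is a minimal sketch of how a query of that general shape is put together. It assumes the standard `site:` and `inurl:` operators; the helper names `build_dork` and `search_url` are made up for illustration, and `PLACEHOLDER` stands in for the URL string, which is deliberately not the one I used.

```python
from urllib.parse import quote_plus


def build_dork(tld: str, url_term: str) -> str:
    """Combine the site: and inurl: operators into one query string.

    Hypothetical helper for illustration only.
    """
    return f"site:.{tld.lstrip('.')} inurl:{url_term}"


def search_url(query: str) -> str:
    """URL-encode the query so it can be pasted into a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


# PLACEHOLDER is an assumption, not the real search term.
query = build_dork("com", "PLACEHOLDER")
print(query)
print(search_url(query))
```

Nothing here automates anything. It only builds the string that you would otherwise type into the search box by hand.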
The results with Google (I usually scroll through 5 to 6 pages of returned search results) are rarely that interesting with such a basic search. Meaning: I see the same stuff over and over again.
But not this time! In the top results I see a header that shows the telltale signs of a list of email addresses. In addition, I see cdn (content delivery network) in the URL, which automatically raises an eyebrow. I decide to dive in.
I open the suspicious Google Search result and I see email addresses. A lot of email addresses.
Now, when you find something like this, you have to look at where it’s being stored. Usually, you might find something like this on a site using WordPress. Or on an old abandoned site from the late 1990s. Finding the provenance isn’t usually that hard.
Even when the source looks like a mystery at first, it usually isn’t that difficult to track down.
I remember finding a large cache of email addresses about two years ago on a misconfigured website. Included were the full names and cell phone numbers of people along with their email addresses. While it wouldn’t run afoul of the lenient PII regulations of any state in the United States, it sure would be a violation in the European Union. The addresses were all being used by some shitty, fly-by-night spam entity. I thought: “Oh! Unique email addresses! What is this!?!”
About 20 minutes later I realized: Someone had simply taken the contents of the River City Media data breach and posted them up on their own poorly configured site. I even logged it (the presence of these email addresses from the River City Media breach being indexed into Google Search results) with Google’s internal developer website.
“Say, hey, uh, guys? Remember that River City Media data breach thing? Well someone else recently took some of that data, and then used it on the back-end of their site. And now, wouldn’t you know it, every single one of those email addresses is being indexed by Google. Shouldn’t you work to get these emails indexed out of search?”
This was Google’s response:
“Huh?” I thought to myself. “That’s odd.”
You know what else was odd? The domain for multiscreensite.com leads nowhere, and support pages lead to no identifying information or social media profile. The ghost of Pablo Escobar could be running this from Putin’s dacha using seed money from the Trump organization as far as I could tell.
I decide to do a search on Twitter for multiscreensite.com and spot this:
So, shadiness is pretty much confirmed at this point. I’m not expecting to run across a cute “community manager” for multiscreensite.com who’ll help to get this figured out.
I go into it with the assumption that all of multiscreensite.com is shady and, sure enough, I soon find it:
If you’re still reading at this point, you probably need more help than I do. Just kidding. Sort of.
Anyways: long story short.
A shady hosting company without contact information. A cyber-crime forum that hosted questionable data. And Google, who indexed said data so that it is easy to discover.
Is there any way to report this? In a Dirty Harry voice: “I’ll find out and I’ll let you know.”