Troy Hunt

I create courses for Pluralsight and am a Microsoft Regional Director and MVP who travels the world speaking at events and training technology professionals.

Rss preview of Blog of Troy Hunt

Processing 630 Million More Pwned Passwords, Courtesy of the FBI

2025-12-13 05:29:39

The sheer scope of cybercrime can be hard to fathom, even when you live and breathe it every day. It's not just the volume of data, but also the extent to which it replicates across criminal actors seeking to abuse it for their own gain, and to our detriment.

We were reminded of this recently when the FBI reached out and asked if they could send us 630 million more passwords. For the last four years, they've been sending over passwords found during the course of their investigations in the hope that we can help organisations block them from future use. Back then, we were supporting 1.26 billion searches of the service each month. Now, it's... more:

Just as it's hard to wrap your head around the scale of cybercrime, I find it hard to grasp that number fully. On average, that service is hit nearly 7 thousand times per second, and at peak, it's many times more than that. Every one of those requests is a chance to stop an account takeover. But the real scale goes well beyond the API itself. Because the data model is open source and freely available, many organisations use the Pwned Passwords Downloader to take the entire corpus offline and query it directly within their own applications. That tool alone calls the API around a million times during download, but the resulting data is then queried… well, who knows how many times after that. Pretty cool, right?
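For anyone who hasn't called the service before, here's a rough sketch of what a single lookup involves. It assumes the public range endpoint at api.pwnedpasswords.com, where only the first 5 characters of the password's SHA-1 hash are sent and every matching suffix comes back with its prevalence count. Five hex characters gives 16^5 = 1,048,576 possible ranges, which lines up with the "around a million" calls mentioned above. The function names are made up for illustration, and this is not how the Pwned Passwords Downloader itself is implemented.

```python
import hashlib
from urllib.request import urlopen

# Assumed public endpoint for range searches
RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_hash(password: str) -> tuple[str, str]:
    """Return (5-char prefix, 35-char suffix) of the upper-case SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(suffix: str, body: str) -> int:
    """Find the prevalence count for `suffix` in a 'SUFFIX:COUNT' response body."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0  # suffix not in the range, so the password isn't in the corpus


def fetch_range(prefix: str) -> str:
    """Download one range; only the 5-char prefix ever leaves the machine."""
    with urlopen(RANGE_URL + prefix, timeout=10) as response:
        return response.read().decode("utf-8")


def pwned_count(password: str) -> int:
    """Hypothetical helper tying the steps together (performs a network call)."""
    prefix, suffix = split_hash(password)
    return count_in_range(suffix, fetch_range(prefix))
```

Calling `pwned_count("some password")` returns how many times that password has been seen, or 0 if it isn't present. Anything above 0 is a good reason to block it.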

This latest corpus of data came to us as a result of the FBI seizing multiple devices belonging to a suspect. The data appeared to have originated from both the open web and Tor-based marketplaces, Telegram channels and infostealer malware families. We hadn't seen about 7.4% of them in HIBP before, which might sound small, but that's 46 million vulnerable passwords we weren't giving people using the service the opportunity to block. So, we've added those and bumped the prevalence count on the other 584 million we already had.

We're thrilled to be able to provide this service to the community for free and want to also quickly thank Cloudflare for their support in providing us with the infrastructure to make this possible. Thanks to their edge caching tech, all those passwords are queryable from a location just a handful of milliseconds away from wherever you are on the globe.

If you're hitting the API, then all the data is already searchable for you. If you're downloading it all offline, go and grab the latest data now. Either way, go forth and put it to good use and help make a cybercriminal's day just that much harder 😊

Weekly Update 481

2025-12-05 15:14:33

Twelve years (and one day) since launching Have I Been Pwned, it's now a service that Charlotte and I live and breathe every day. From the first thing every morning to the last thing each day, from holidays to birthdays, in sickness and in heal... wait a minute - did we marry each other or a data breach service?! We decided to do a 12th-birthday special together today to give everyone a bit more insight into what she does and what life is like running this service. It's a different weekly vid, and we really hope you enjoy watching it 😊

References

  1. Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
  2. Just because a "fake" email address is in HIBP, it doesn't mean HIBP isn't accurately indexing data breaches (if it looks like an email address, it's an email address)

Why Does Have I Been Pwned Contain "Fake" Email Addresses?

2025-12-04 07:37:06

Normally, when someone sends feedback like this, I ignore it, but it happens often enough that it deserves an explainer, because the answer is really, really simple. So simple, in fact, that it should be evident to the likes of Bruce, who decided his misunderstanding deserved a 1-star Trustpilot review yesterday:

Now, frankly, Trustpilot is a pretty questionable source of real-world, quality reviews anyway, but the same feedback has come through other channels enough times that let's just sort this out once and for all. It all begins with one simple question:

What is an Email Address?

You think you know - and Bruce thinks he knows - but you might both be wrong. To explain the answer to the question, we need to start with how HIBP ingests data, and that really is pretty simple: someone sends us a breach (which is typically just text files of data), and we run the open source Email Address Extractor tool over it, which dumps all the unique addresses into a file. That file is then uploaded into the system, where the addresses become searchable.

The logic for how we extract addresses is all in that GitHub repository, but in simple terms, it boils down to this:

  1. There must be an @ symbol
  2. There can be up to 64 characters before it (the alias)
  3. There can be up to 255 characters after it (the domain)
  4. The domain must contain a period
  5. The domain must also have a valid TLD
  6. A few other little criteria that are all documented in the public repo
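To make those rules concrete, here's a minimal sketch in Python. It is not the Email Address Extractor's actual code (that lives in the public repo). The function name, the tiny TLD allow-list and the requirement of at least one character on each side of the @ are assumptions for illustration, and the "other little criteria" from point 6 are left out entirely.

```python
# Hypothetical, deliberately tiny allow-list; a real check would use the full list of TLDs
VALID_TLDS = {"com", "net", "org", "au"}

MAX_ALIAS_LENGTH = 64    # rule 2: characters before the @
MAX_DOMAIN_LENGTH = 255  # rule 3: characters after the @


def looks_like_email(candidate: str) -> bool:
    """Structural check only; says nothing about whether a mailbox exists."""
    alias, at_sign, domain = candidate.rpartition("@")
    if not at_sign:                                   # rule 1: must contain an @
        return False
    if not 1 <= len(alias) <= MAX_ALIAS_LENGTH:       # rule 2 (non-empty is an assumption)
        return False
    if not 1 <= len(domain) <= MAX_DOMAIN_LENGTH:     # rule 3 (non-empty is an assumption)
        return False
    if "." not in domain:                             # rule 4: domain contains a period
        return False
    tld = domain.rsplit(".", 1)[1].lower()
    return tld in VALID_TLDS                          # rule 5: TLD must be valid
```

Note that nothing in here cares whether the alias looks like a real person's name. If the structure is valid, the address passes.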

That is all! We can't then tell if there's an actual mailbox behind the address, as that would require massive per-address processing, for example, sending an email to each one and seeing if it bounces. Can you imagine doing that 7 billion times?! That's the number of unique addresses in HIBP, and clearly, it's impossible. So, that means all the following were parsed as being valid and loaded into HIBP (deep links to the search result):

  1. [email protected]
  2. [email protected]
  3. [email protected]

I particularly like that last one, as it feels like a sentiment Bruce would express. It's also a great example as it's clearly not "real"; the alias is a bit of a giveaway, as is the domain ("foo" is commonly used as a placeholder, similar to how we might also use "bar", or combine them as "foo bar"). But if you follow the link and see the breach it was exposed in, you'll see a very familiar name:

Which brings us to the next question:

How Do "Fake" Email Addresses End up in Real Websites?

This is also going to seem profoundly simple when you see it. Here goes:

Any questions, Bruce? This is just as easily explainable as why we considered it a valid address and ingested it into HIBP: the email address has a valid structure. That is all. That's how it got into Adobe, and that's how it then flowed through into HIBP.

Ah, but shouldn't Adobe verify the address? I mean, shouldn't they send an email to the address along the lines of "Hey, are you sure you want to sign up for this service?" Yes, they should, but here's the kicker: that doesn't stop the email address from being added to their database in the first place! The way this normally works (and this is what we do with HIBP when you sign up for the free notification service) is you enter the email address, the system generates a random token, and then the two are saved together in the database. A link with the token is then emailed to the address and used to verify the user if they then follow that link. And if they don't follow that link? We delete the email address if it hasn't been verified within a few days, but evidently, Adobe doesn't. Most services don't, so here we are.
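Here's a rough sketch of that sign-up flow, showing why the address lands in the database before anyone has verified it. This is an in-memory illustration under stated assumptions, not HIBP's or Adobe's actual code: the class and method names are hypothetical, and the 3-day window is just a stand-in for "a few days".

```python
import secrets
from datetime import datetime, timedelta, timezone

VERIFY_WINDOW = timedelta(days=3)  # assumption standing in for "a few days"


class PendingSignups:
    """Hypothetical in-memory stand-in for the subscriptions table."""

    def __init__(self) -> None:
        self._rows: dict[str, dict] = {}

    def sign_up(self, email: str, now: datetime) -> str:
        # The address is stored straight away, before any verification happens
        token = secrets.token_urlsafe(32)
        self._rows[email] = {"token": token, "created": now, "verified": False}
        return token  # in a real system this goes into the emailed link

    def verify(self, email: str, token: str) -> bool:
        row = self._rows.get(email)
        if row is None or not secrets.compare_digest(row["token"], token):
            return False
        row["verified"] = True
        return True

    def purge_unverified(self, now: datetime) -> list[str]:
        # The clean-up step many services never run
        stale = [
            email
            for email, row in self._rows.items()
            if not row["verified"] and now - row["created"] > VERIFY_WINDOW
        ]
        for email in stale:
            del self._rows[email]
        return stale

    def __contains__(self, email: str) -> bool:
        return email in self._rows
```

Skip the `purge_unverified` step and every made-up address anyone ever typed into the form stays in the table, ready to turn up in a breach later.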

How Can I Be Really Sure Actual Fake Addresses Aren't in HIBP?

This is also going to seem profoundly obvious, but genuinely random email addresses (not "thisisfuckinguseless@") won't show up in HIBP. Want to test the theory? Try 1Password's generator (yes, Bruce, they also sponsor HIBP):

Now, whack that on the foo.com domain and do a search:

Huh, would you look at that? And you can keep doing that over and over again. You’ll get the same result because they are fabricated addresses that no one else has created or entered into a website that was subsequently breached, ipso facto proving they cannot appear in the dataset.

Conclusion

Today is HIBP's 12th birthday, and I've taken particular issue with Bruce's review because it calls into question the integrity with which I run this service. This is now the 218th blog post I've written about HIBP, and over the last dozen years, I've detailed everything from the architecture to the ethical considerations to how I verify breaches. It's hard to imagine being any more transparent about how this service runs, and per the above, it's very simple to disprove the Bruces of the world. If you've read this far and have an accurate, fact-based review you'd like to leave, that'd be awesome 😊

Weekly Update 480

2025-12-01 14:11:03

Well, I now have the answer to how Snapchat does age verification for under-16s: they give an underage kid the ability to change their date of birth, then do a facial scan to verify. The facial scan (a third party tells me...) allows someone well under 16 to pass it easily. So, is that control "reasonable"? I guess that will depend on whether this case is an outlier or a much more common scenario, and a sample set of one isn't particularly scientific. Either way, I expect that what we're seeing is representative of a pretty obvious problem: privacy-preserving age verification is very unlikely to be reliable. It will inevitably result in letting too many young kids through, whilst blocking too many people of legitimate age. Or we end up with people needing to start uploading formal age-verification documents, which creates a whole new problem. Absolutely none of this should come as any surprise whatsoever!

References

  1. Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
  2. This week, it's all about Australia's social media ban for under 16s (link to the thread that sparked all the debate)
  3. I wrote about "sharenting" back in 2020 (lots in there about protecting kids online whilst also making appropriate use of technology)
  4. Our eSafety Commissioner has an FAQ on what the ban means (lots of use of the word "reasonable" in there)

Weekly Update 479

2025-11-23 12:44:21

I gave up on the IoT water meter reader. Being technical and thinking you can solve everything with technology is both a blessing and a curse; dogged persistence has given me the life I have today, but it has also burned serious amounts of time because I never want to let a problem go unsolved. But sometimes, common sense and the ROI of my time have to prevail, so I packed up all the gear and went back to processing data breaches. If you happen to solve this problem in a way that doesn't require any more time investment on my end, I'd love to hear it 😊

References

  1. Sponsored by: 1Password Extended Access Management: Secure every sign-in for every app on every device
  2. We've had a massive month on HIBP (20M+ visits is a solid number!)

Weekly Update 478

2025-11-16 16:13:02

This week, it was an absolute privilege to be at Europol in The Hague, speaking about cyber offenders and at the InterCOP conference, and spending time with some of the folks involved in the Operation Endgame actions. The latter in particular gave me a new sense of just how much coordination is involved in this sort of operation, all the way down to some of the messaging in the videos they've since released. I've seen some social commentary on these already; check them out and see what you think, especially as it relates to the psyops those videos play a role in.

References

  1. Sponsored by: Malwarebytes Browser Guard blocks phishing, ads, scams, and trackers for safer, faster browsing
  2. Operation Endgame saw a significant amount of criminal infrastructure taken down by Europol and friends (it's now the third "season" of Endgame that has ended up in HIBP)