2025-12-13 05:29:39

The sheer scope of cybercrime can be hard to fathom, even when you live and breathe it every day. It's not just the volume of data, but also the extent to which it replicates across criminal actors seeking to abuse it for their own gain, and to our detriment.
We were reminded of this recently when the FBI reached out and asked if they could send us 630 million more passwords. For the last four years, they've been sending over passwords found during the course of their investigations in the hope that we can help organisations block them from future use. Back then, we were supporting 1.26 billion searches of the service each month. Now, it's... more:
Just as it's hard to wrap your head around the scale of cybercrime, I find it hard to grasp that number fully. On average, that service is hit nearly 7 thousand times per second, and at peak, it's many times more than that. Every one of those requests is a chance to stop an account takeover. But the real scale goes well beyond the API itself. Because the data model is open source and freely available, many organisations use the Pwned Passwords Downloader to take the entire corpus offline and query it directly within their own applications. That tool alone calls the API around a million times during download, but the resulting data is then queried… well, who knows how many times after that. Pretty cool, right?
This latest corpus of data came to us as a result of the FBI seizing multiple devices belonging to a suspect. The data appeared to have originated from both the open web and Tor-based marketplaces, Telegram channels and infostealer malware families. We hadn't seen about 7.4% of them in HIBP before, which might sound small, but that's 46 million vulnerable passwords we weren't giving people using the service the opportunity to block. So, we've added those and bumped the prevalence count on the other 584 million we already had.
We're thrilled to be able to provide this service to the community for free and want to also quickly thank Cloudflare for their support in providing us with the infrastructure to make this possible. Thanks to their edge caching tech, all those passwords are queryable from a location just a handful of milliseconds away from wherever you are on the globe.
If you're hitting the API, then all the data is already searchable for you. If you're downloading it all offline, go and grab the latest data now. Either way, go forth and put it to good use and help make a cybercriminal's day just that much harder 😊
2025-12-05 15:14:33

Twelve years (and one day) since launching Have I Been Pwned, it's now a service that Charlotte and I live and breathe every day. From the first thing every morning to the last thing each day, from holidays to birthdays, in sickness and in heal... wait a minute - did we marry each other or a data breach service?! We decided to do a 12th-birthday special together today to give everyone a bit more insight into what she does and what life is like running this service. It's a different weekly vid, and we really hope you enjoy watching it 😊
2025-12-04 07:37:06

Normally, when someone sends feedback like this, I ignore it, but it happens often enough that it deserves an explainer, because the answer is really, really simple. So simple, in fact, that it should be evident to the likes of Bruce, who decided his misunderstanding deserved a 1-star Trustpilot review yesterday:

Now, frankly, Trustpilot is a pretty questionable source of real-world, quality reviews anyway, but the same feedback has come through other channels enough times that let's just sort this out once and for all. It all begins with one simple question:
You think you know - and Bruce thinks he knows - but you might both be wrong. To explain the answer to the question, we need to start with how HIBP ingests data, and that really is pretty simple: someone sends us a breach (which is typically just text files of data), and we run the open source Email Address Extractor tool over it, which then dumps all the unique addresses into a file. That file is then uploaded into the system, where the addresses are then searchable.
The logic for how we extract addresses is all in that Github repository, but in simple terms, it boils down to this:
That is all! We can't then tell if there's an actual mailbox behind the address, as that would require massive per-address processing, for example, sending an email to each one and seeing if it bounces. Can you imagine doing that 7 billion times?! That's the number of unique addresses in HIBP, and clearly, it's impossible. So, that means all the following were parsed as being valid and loaded into HIBP (deep links to the search result):
I particularly like that last one, as it feels like a sentiment Bruce would express. It's also a great example as it's clearly not "real"; the alias is a bit of a giveaway, as is the domain ("foo" is commonly used as a placeholder, similar to how we might also use "bar", or combine them as "foo bar"). But if you follow the link and see the breach it was exposed in, you'll see a very familiar name:

Which brings us to the next question:
This is also going to seem profoundly simple when you see it. Here goes:

Any questions, Bruce? This is just as easily explainable as why we considered it a valid address and ingested it into HIBP: the email address has a valid structure. That is all. That's how it got into Adobe, and that's how it then flowed through into HIBP.
Ah, but shouldn't Adobe verify the address? I mean, shouldn't they send an email to the address along the lines of "Hey, are you sure you want to sign up for this service?" Yes, they should, but here's the kicker: that doesn't stop the email address from being added to their database in the first place! The way this normally works (and this is what we do with HIBP when you sign up for the free notification service) is you enter the email address, the system generates a random token, and then the two are saved together in the database. A link with the token is then emailed to the address and used to verify the user if they then follow that link. And if they don't follow that link? We delete the email address if it hasn't been verified within a few days, but evidently, Adobe doesn't. Most services don't, so here we are.
This is also going to seem profoundly obvious, but genuinely random email addresses (not "thisisfuckinguseless@") won't show up in HIBP. Want to test the theory? Try 1Password's generator (yes, Bruce, they also sponsor HIBP):

Now, whack that on the foo.com domain and do a search:

Huh, would you look at that? And you can keep doing that over and over again. You’ll get the same result because they are fabricated addresses that no one else has created or entered into a website that was subsequently breached, ipso facto proving they cannot appear in the dataset.
Today is HIBP's 12th birthday, and I've taken particular issue with Bruce's review because it calls into question the integrity with which I run this service. This is now the 218th blog post I've written about HIBP, and over the last dozen years, I've detailed everything from the architecture to the ethical considerations to how I verify breaches. It's hard to imagine being any more transparent about how this service runs, and per the above, it's very simple to disprove the Bruces of the world. If you've read this far and have an accurate, fact-based review you'd like to leave, that'd be awesome 😊
2025-12-01 14:11:03

Well, I now have the answer to how Snapchat does age verification for under-16s: they give an underage kid the ability to change their date of birth, then do a facial scan to verify. The facial scan (a third party tells me...) allows someone well under 16 to pass it easily. So, is that control "reasonable"? I guess that will depend on whether this case is an outlier or a much more common scenario, and a sample set of one isn't particularly scientific. Either way, I expect that what we're seeing is representative of a pretty obvious problem: privacy-preserving age verification is very unlikely to be reliable. It will inevitably result in letting too many young kids through, whilst blocking too many people of legitimate age. Or we end up with people needing to start uploading formal age-verification documents, which creates a whole new problem. Absolutely none of this should come as any surprise whatsoever!
2025-11-23 12:44:21

I gave up on the IoT water meter reader. Being technical and thinking you can solve everything with technology is both a blessing and a curse; dogged persistence has given me the life I have today, but it has also burned serious amounts of time because I never want to let a problem go unsolved. But sometimes, common sense and the ROI of my time have to prevail, so I packed up all the gear and went back to processing data breaches. If you happen to solve this problem in a way that doesn't require any more time investment on my end, I'd love to hear it 😊
2025-11-16 16:13:02

This week, it was an absolute privilege to be at Europol in The Hague, speaking about cyber offenders and at the InterCOP conference and spending time with some of the folks involved in the Operation Endgame actions. The latter in particular gave me a new sense of just how much coordination is involved in this sort of operation, all the way down to some of the messaging in the videos they've since released. I've seen some social commentary on these already, check them out and see what you think, especially as it relates to the psyops those videos play a role in.