2025-10-22 03:20:01
Where is your data on the internet? I mean, outside the places you've consciously provided it, where has it now flowed to and is being used and abused in ways you've never expected? The truth is that once the bad guys have your data, it often replicates over and over again via numerous channels and platforms. If you're able to aggregate enough of it en masse, you end up with huge volumes of "threat intelligence data", to use the industry buzzword. And that's precisely what Ben from Synthient has done, and then sent it to Have I Been Pwned (HIBP).
Ben is in his final year of college in the US and is carving out a niche in threat intelligence. He's written up a deeper dive in The Stealer Log Ecosystem: Processing Millions of Credentials a Day, but the headline gives you a sense of the volumes. Have a read of that post and you'll see Ben is pulling data from various sources, including social media, forums, Tor and, of course, Telegram. He's managed to aggregate so much of it that by the time he sent it to us, it was rather sizeable:
That's 3.5 terrabytes of data, with the largest file alone being 2.6TB and, combined, they contain 23 billion rows. It's a vast corpus, and if we were attempting to compete with recent hyperbolic headlines about breach sizes, this would be one of the largest. But I'm not going to play the "mine is bigger than yours" game because it makes no sense once you start analysing the data. Part of what makes the data so large is that we're actually looking at both stealer logs and credential stuffing lists, so let's assess them separately, starting with those stealer logs.
Stealer logs are the product of infostealers, that is, malware running on infected machines and capturing credentials entered into websites on input. The output of those stealer logs is primarily three things:
Someone logging into Gmail, for example, ends up with their email address and password captured against gmail.com, hence the three parts. Due to the fact that stealer logs are so heavily recycled (they're posted over and over again to the sorts of channels Ben monitors), the first thing we always do is try to get a sense of how much is genuinely new:
This is the output of a little PowerShell script we use to guage where the email addresses in a new breach corpus have been seen before. Especially when there's a suspicion that data might have been repurposed from elsewhere, it's really useful to run them against the HIBP API and see what comes back. What the output above tells us is that after checking a sample of 94k of them, 92% had been previously seen, mostly in stealer log corpuses we'd loaded in the past. This is an empirical demonstration of what I wrote in the opening paragraph - "it often replicates over and over again" - and as you can see, most of what has been seen before was in the ALIEN TXTBASE stealer logs.
Back to the console output again, and having previously seen 92% of addresses also means we haven't seen 8% of the addresses. That's 8% of a considerable number, too: we found 183M unique email addresses across Ben's stealer log data, so we're talking about 14M+ addresses that have never surfaced in HIBP. (The final number once the entire data set was loaded into HIBP was 91% pre-existing, with 16.4M previously unseen addresses in any data breach, not just stealer logs.) But as with everything we load, the question has to be asked: Is it legit? Can you trust the shady criminals who publish this data not to fill it with junk? The only way to know for sure is to ask the legitimate owners of the data, so I reached out to a bunch of our subscribers and sought their support in verifying.
One of the respondants was already concerned there could be something wrong with his Gmail account and sure enough, he had one stealer log entry for "https://accounts.google.com/signin/challenge/pwd/1" with a, uh, "suboptimal" password:
Yes I can confirm that was an accurate password on my gmail account a few months ago
Another respondant who offered support had somewhat of a recognisable pattern in the sites he'd been visiting:
To his credit, he responded and confirmed that the list did indeed contain sites he'd visited, which also included online casinos, crypto websites and VPN services:
They all look like websites I have used and some still do use
As it turns out, he also had two other email addresses in the corpus of data, both with the same collection of passwords used on the first address he replied from. They also both aligned to services based on the same TLD as the other email address which suggested which country he's located in. (Incidentally, the online privacy offered by VPNs kinda falls apart when there's malware on your machine watching every site you visit and recording your credentials.)
Even without a response from a subscriber, it's still easy to get a sense of the legitimacy of the data in a privacy-preserving fashion (i.e. not logging in with their credentials!) just by testing enumeration vectors. For example, one subscriber had an account at ShopBack in the Philippines which offers what I'll refer to as "account enumeration as a service":
I simply added some character's in front of the email address and ShopBack happily confirmed that address didn't exist. However, remove the invalid characters and there's a very different response:
All of these little "tells" add up; another subscriber had a high prevalence of Greek websites they used, showing exactly the sort of pattern you'd expect to see for someone from that corner of the world. Another had various online survey sites they'd used, and like our "assandfurious" friend from earlier, a clear pattern emerged consistent with the apparent interests of the address's owner. Time and time again, the data checked out, so we loaded it. Those 183M email addresses are now searchable in HIBP, and the passwords are currently being loaded into Pwned Passwords (I'll revised this part of the post when that's complete), which has become rather popular:
Pwned Passwords just served 17.45 billion requests in 30 days 🤯 That's an *average* of 6,733 requests per second, but at our peak, we're hitting 42k per second in a 1-minute block. Crazy numbers! Made possible by @Cloudflare 😎 pic.twitter.com/Io6u1PiqJf
— Troy Hunt (@troyhunt) October 17, 2025
The website addresses are also now searchable, either in the stealer log section of your personal dashboard or by verified domain owners using the API. You'll find this data named "Synthient Stealer Log Threat Data" in HIBP, but stealer logs are only part of the Synthient story - the small part!
Ben's data also contained credential stuffing lists. Unlike stealer logs, which are the product of malware on the victim's machine, credential stuffing lists are typically aggregated from other places where email address and password pairs are obtained. For example, from data breaches where the passwords are either stored in plain text or protected with easily crackable hashing algorithms. Those lists are then used to access the other accounts of victims where they've reused their passwords.
Quick sidenote: Credential stuffing lists can be enormously damaging because they contain the keys to so many different services. Not only are they the gateway to so many takeovers of social media accounts, email addresses and other valuable personal resources, they're also responsible for many subsequent very serious data breaches. The 2017 Uber breach was attributed to previously breached employee credentials. Five years later, and the same approach provided the initial access to Uber again, after which MFA-bombing sealed the deal. Then there was the 23andMe breach in 2023, which was also traced back to credential stuffing. Similar but different was when Dunkin' Donuts had 20k customer details exposed in a show of how multifaceted this style of attack is: they were subsequently sued for not having sufficient controls to stop hackers from simply logging in with victims' legitimate credentials. It's wild; it's the attack that just keeps on giving.
Ever since loading Collection #1 in 2019, I have been extra cautious about dealing with credential stuffing lists. The 400+ comments on that blog post will give you just a little taste of how much attention that exercise garnered. Frankly, it was a significant contributor to the feeling that it was all getting a bit too much, leading to the decision that HIBP needed to find another home (which fortunately, never eventuated). The primary issue with credential stuffing lists is that we can't attribute a given row to a specific source website or data breach, and we don't offer a service to look up credential pairs. As you'll see from many of the comments on that post, I had angry people upset that, without knowing specifically which password was exposed in the list, the knowledge that they were in there was not actionable. I disagree, because by loading those passwords into Pwned Passwords, there are now three easy ways to check if you're using a vulnerable one:
My vested interest in 1Password aside, Watchtower is the easiest, fastest way to understand your potential exposure in this incident. And in case you're wondering why I have so many vulnerable and reused passwords, it's a combination of the test accounts I've saved over the years and the 4-digit PINs some services force you to use. Would you believe that every single 4-digit number ever has been pwned?! (If you're interested, the ABC has a fantastic infographic using a heatmap based on HIBP data that shows some very predictable patterns for 4-digit PINs.)
As of the time of publishing this blog post, only the stealer logs have been loaded, and as mentioned earlier, the data in HIBP has been called "Synthient Stealer Log Threat Data". We intend to load the credential stuffing data as a separate corpus next week and call it "Synthient Credential Stuffing Threat Data", assuming it's sufficiently new and the accuracy is confirmed with our subscribers! We're doing this in two parts simply because of the scale of the data and the fact that we want to break it into two discrete corpuses given the data originates via different means. I'll revise this blog post accordingly after we finish our analysis.
Something that is becoming more evident as we load more stealer logs is that treating them as a discrete "breach" is not an accurate representation of how these things work. The truth is that, unlike a single data breach such as Ashley Madison, Dropbox, or the many other hundreds already in HIBP, stealer logs are more of a firehose of data that's just constantly spewing personal info all over the place. That, combined with the duplication of previously seen data, means that we need a rethink on this model. The data itself is still on point, but I'd like to see HIBP better reflect that firehose analogy and provide a constant stream of new data. Until then, Synthient's Threat Data will still sit in HIBP and be searchable in all the usual ways.
2025-10-20 15:09:27
You're not going to believe this - the criminals that took the Qantas data ignored the injunction 😮 I know, I know, we're all a bit stunned that making crime illegal hasn't appeared to stop it, but here we are. Just before the time of writing, I was contacted by someone who received a breach alert from a similar service to HIBP in another part of the world and while it didn't explicitly say "Qantas" (side note: I hate it when other services redact the name), it sure as hell sounded like them based on the description and timing. So, good guys have it, bad guys definitely have it, but we don't have it. Everything goes a bit topsy-turvey once the lawyers get involved...
Oh, and apologies for the audio being a couple of seconds out of sync with the video. Something obviously glitched after a Windows update and reboot. Hope you enjoy listening anyway.
2025-10-12 11:25:42
This week's video was recorded on Friday morning Aussie time, and as promised, hackers dumped data the following day. Listening back to parts of the video as I write this on a Sunday morning, pretty much what was predicted happened: data was dumped, it included Qantas, and the injunction did nothing to stop it. I knew that in advance, and I'm also certain Qantas did too, but that hasn't stopped their messaging from implying the contrary:
This wording remains worrying: "we have an ongoing injunction in place to prevent the stolen data being accessed, viewed, released, used, transmitted or published" Clearly, this hasn't "prevented" the release and broad distribution of the data. More: https://t.co/SiuMqDlyHB
— Troy Hunt (@troyhunt) October 12, 2025
I'll save more for the next weekly vid as there's a lot to unpack, suffice to say that since this recording, I've been rather busy with media commentary, including explaining how the data is now out there, it's not just on "the dark web", we don't have it, but the bad guys definitely do.
2025-10-09 08:03:52
You see it all the time after a tragedy occurs somewhere, and people flock to offer their sympathies via the "thoughts and prayers" line. Sympathy is great, and we should all express that sentiment appropriately. The criticism, however, is that the line is often offered as a substitute for meaningful action. Responding to an incident with "thoughts and prayers" doesn't actually do anything, which brings us to court injunctions in the wake of a data breach.
Let's start with HWL Ebsworth, an Australian law firm that was the victim of a ransomware attack in 2023. They were granted an injunction, which means the following:
The final interlocutory injunction restrained hackers from the ALPHV, or “BlackCat”, hackers group from publishing the HWL data on the internet, sharing it with any person, or using the information for any reason other than for obtaining legal advice on the court’s orders.
To paraphrase, the injunction prohibits the Russian crime gang that hacked the law firm and attempted to extort them from publishing the data on the internet. Right... The threat actor was subsequently served with the injunction, to which, per the article, they responded in an entirely predictable fashion:
Fuck you fuckers
And then they dumped a huge trove of data. Clearly, criminals aren't going to pay any attention whatsoever to an injunction, but this legal construct has reach far beyond just the bad guys:
The injunction will also “assist in limiting the dissemination of the exfiltrated material by enabling HWLE to inform online platforms, who are at risk of publishing the material”, Justice Slattery said.
In other words, the data is also off limits to the good guys. Journalists, security firms and yes, Have I Been Pwned (HIBP) are all impacted by injunctions like this. To some extent, you can understand this when the data is as sensitive as what a law firm typically holds, and you need only use a little bit of imagination to picture how damaging it can be for data like this to fall into the wrong hands. But data in a breach of a company like Qantas is very different:
And now here’s mine. Still no indication of specifically which service was breached, but feels very much like loyalty program data (i.e. nothing to do with specific flights, password, passport or payment details). pic.twitter.com/r7KnlfM8TV
— Troy Hunt (@troyhunt) July 11, 2025
As well as my interest in running HIBP, I also appear to be a victim of their data breach, along with my wife and kids. And just to highlight how much skin I have in the game, I'm also a Qantas shareholder and a very loyal customer:
Sitting at the airport about to take my 301st (tracked) @Qantas flight. Nice banter with the staff: “you can lose my data, just don’t lose my bags” 😬 pic.twitter.com/ZGxc4I0aB1
— Troy Hunt (@troyhunt) July 2, 2025
As such, I was particularly interested when they applied for, and were granted, a court injunction of their own. Why? What possible upside does this provide? Because by now, it's pretty clear what's going to happen to the data:
This is from a Telegram channel run by the group that took the Qantas data, along with some other huge names:
🚨🚨🚨BREAKING - New data leak site by Scattered LAPSUS$ Hunters exposes Salesforce customers. Dozens of global companies involved in a large-scale extortion campaign.
— Hackmanac (@H4ckmanac) October 3, 2025
Scattered LAPSUS$ Hunters claims to have breached Salesforce, exfiltrating ~1B records.
They accuse Salesforce… pic.twitter.com/u2PAO7miyP
"Scattered LAPSUS$ Hunters" is threatening to dump all the data publicly in a couple of days' time unless a ransom is paid, which it won't be. The quote from the Telegram image is from a Qantas spokesperson, and clearly, the injunction is not going to stop the publishing of data. Much of my gripe with injunctions is the premise that they in some way protect customers (like me), when clearly, they don't. But hey, "thoughts and prayers", right?
Without wanting to give too much credit to criminals attempting to ransom my data (and everyone else's), they're right about the media outlets. An injunction would have had a meaningful impact on the Ashley Madison coverage a decade ago, where the press happily outed the presence of famous people in the breach. Clearly, the Qantas data is nowhere near as newsworthy, and I can't imagine a headline going much beyond the significant point balances of certain politicians. The data just isn't that interesting.
The injunction is only effective against people who meet the following criteria:
The first two points are obvious, and an asterix adorns the third as it's very heavily caveated. This from a chat with a lawyer friend thir morning who specialises in this space:
it would depend on which country and whether it has a reciprocal agreement with Australia eg like the UK and also who you are trying it enforce it against and then it’s up to the court in that country to determine - but as this is an injunction (so not eg for a debt against a specific person) it’s almost impossible - you can’t just register a foreign judgement somewhere against the world at large as far as I know.
So, if the injunction is so useless at providing meaningful protections to data breach victims, what's the point? Who does it protect? In researching this piece, the best explanation I could find was from law firm Clayton Utz:
Where that confidentiality is breached due to a hack, parties should generally do - and be seen to be doing - what they can to prevent or minimise the extent of harm. Even if injunctions might not impact hackers, for the reasons set out above, they can provide ancillary benefits in relation to the further dissemination of hacked information by legitimate individuals and organisations. Depending on the terms, it might also assist with recovery on relevant insurance policies and reduce the risk of securities class actions being brought.
That term - "be seen to be doing" - says it all. This is now just me speculating, but I can envisage lawyers for Qantas standing up in court when they're defending against the inevitable class actions they'll face (which I also have strong views on), saying "Your honour, we did everything we could, we even got an injunction!" In a previous conversation I had regarding another data breach that had successfully been granted an injunction, I was told by the lawyer involved that they wanted to assure customers that they'd done everything possible. That breach was subsequently circulated online via a popular clear web hacking site (not "the dark web"), but I assume this fact and the ineffectiveness of the injunction on that audience was left out of customer communications. I feel pretty comfortable arguing that the primary beneficiary of the injunction is the shareholder, rather than the customer. And I assume the lawyers charge for their time, right?
Where this leaves us with Qantas is that, on a personal note, as a law-abiding Australian who is aware of the injunction, I won't be able to view my data or that of my kids. I can always request it of Qantas, of course, but I won't be able to go and obtain it if and when it's spread all over the internet. The criminals will, of course, and that's a very uncomfortable feeling.
From an HIBP perspective, we obviously can't load that data. It's very likely that hundreds of thousands of our subscribers will be impacted, and we won't be able to let them know (which is part of the reason I've written this post - so I can direct them here when asked). Granted, Qantas has obviously sent out disclosure notices to impacted individuals, but I'd argue that the notice that comes from HIBP carries a different gravitas: it's one thing to be told "we've had a security incident", and quite another to learn that your data is now in circulation to the extent that it's been sent to us. Further, Qantas won't be notifying the owners of the domains that their customers' email addresses are on. Many people will be using their work email address for their Qantas account, and when you tie that together with the other exposed data attributes, that creates organisational risk. Companies want to know when corporate assets (including email addresses) are exposed in a data breach, and unfortunately, we won't be able to provide them with that information.
I understand that Qantas' decision to pursue the injunction is about something much broader than the email addresses potentially appearing in HIBP. I actually think much of the advice Qantas has given is good, for example, the resources they've provided on their page about the breach:
These are all fantastic, and each of them has many good external resources people worried about scams should refer to. For example, ScamWatch has this one:
And cyber.gov.au has a handy tip courtesy of our Australian Signals Directorate makes this suggestion:
Not to miss a beat, our friends at IDCARE also offer great advice:
And, of course, the OAIC has some fantastic guidance too:
The scam resources Qantas recommends all link through to a service that will never return the Qantas data breach. Did I mention "thoughts and prayers" already?
As threatened, the Qantas data was dumped publicly 2 days after writing this post. The data appeared on a clear web file sharing service linked to by both their .onion website and a clear web site that popped up on a new domain shortly after the data was publicised. Due to the injunction, I've not accessed the data myself but have had security folks in other parts of the world reach out and confirm my record and that of my family members is present. I've also seen public commentary from other researchers analysing the data, and have had multiple people contact me and offer to send it.
Clearly, the injunction has proven to be extremely limited in its ability to stop the spread of data. Further, as a Qantas customer, I've not heard anything from them in relation to my data having now been publicly released. None of this should surprise anybody, including Qantas.
As to questions in the comments about the legitimacy of the injunction and where it can be obtained, the advice I've obtained from the law firm we use is that it is absolutely legitimate and interested parties would need to contact Qantas if they want to see it (likely a redacted version). I'm not savvy with the mechanics of how courts issue these and why they're not more publicly accessible, but I suggest this story with quotes from Justice Kunc is worth a read. Frankly, it's all a bit nuts, but that's the environment we're operating in and the rules we need to adhere to.
2025-10-06 14:23:48
This probably comes through pretty strongly in this week's video, but I love the vibe at CERN. It's a place so focused on the common good of science that all the other cultural attributes that often put people at odds these days fade into the distance. That hit me more than it did on my last visit in 2019, perhaps because of the world events of late that have become so divisive. So, I'm exceptionally happy to give CERN the same level of access to HIBP data as we have the dozens of other national governments that use the service, hear all about that and more in this week's vid:
2025-09-29 15:03:48
It's hard to explain the significance of CERN. It's the birthplace of the World Wide Web and the home of the largest machine ever built, the Large Hadron Collider. The bit that's hard to explain is, well, I mean, look at it!
Charlotte and I visited CERN in 2019, nestled in there between Switzerland and France, and descended into the mountainside where we saw the world's largest particle accelerator firsthand. I can't explain this! The physics are just mind-bending.
A few months ago, we headed back there and saw even more stuff I can't explain:
How on earth do you make antimatter?! I know there's a lot of magnets involved, but that's about the limit of my understanding.
But what I do understand a little better is the importance of CERN. They're working to help humanity understand the most profound questions about the universe by exploring fundamental physics—the very building blocks of nature. And closer to my heart (or at least to my expertise), their role in the World Wide Web and the contribution CERN has made to the internet as we know it today cannot be overstated. It's also staffed by passionate individuals with a love of science that transcends borders and politics, including many from parts of the world that don't normally see eye-to-eye. This passion was evident on both our visits, and perhaps that's an extra poignant observation in a time with so much conflict.
In relation to HIBP and our ongoing support of governments, CERN is similar yet different. It's an intergovernmental organisation operating outside the jurisdiction of any one nation. However, they face the same online threats, and just like sovereign government states, their people sign up to services that get breached and end up in HIBP. And, like the governments we support, services that can be provided to help them tackle that threat are always appreciated. I was surprised to hear on our last visit that the sum total of contributions from their member states amounts to the price of a cup of coffee per person per year! For the work they do and the contribution they make to society, onboarding CERN as the 41st (inter)government was a no-brainer. They now have full and free access to query all CERN domains across the breadth of HIBP data. Welcome aboard CERN!