2025-07-04 05:58:02
I think that a lot of resistance to AI coding tools comes from the same place: fear of losing something that has defined you for so long. People are reacting against overblown hype, and there is overblown hype. I get that, but I also think there’s something deeper going on here. When you’ve worked hard to build your skills, when coding is part of your identity and where you get your worth, the idea of a tool that might replace some of that is very threatening.
— Adam Gordon Bell, When AI Codes, What’s Left for me?
Tags: llms, careers, ai, generative-ai, ai-assisted-programming
2025-07-04 05:16:51
TIL: Rate limiting by IP using Cloudflare's rate limiting rules
My blog started timing out on some requests a few days ago, and it turned out there were misbehaving crawlers that were spidering my/search/
page even though it's restricted by robots.txt
.
I run this site behind Cloudflare and it turns out Cloudflare's WAF (Web Application Firewall) has a rate limiting tool that I could use to restrict requests to /search/*
by a specific IP to a maximum of 5 every 10 seconds.
Tags: rate-limiting, security, cloudflare, til
2025-07-04 04:36:56
Frequently Asked Questions (And Answers) About AI Evals
Hamel Husain and Shreya Shankar have been running a paid, cohort-based course on AI Evals For Engineers & PMs over the past few months. Here Hamel collects answers to the most common questions asked during the course.There's a ton of actionable advice in here. I continue to believe that a robust approach to evals is the single most important distinguishing factor between well-engineered, reliable AI systems and YOLO cross-fingers and hope it works development.
Hamel says:
It’s important to recognize that evaluation is part of the development process rather than a distinct line item, similar to how debugging is part of software development. [...]
In the projects we’ve worked on, we’ve spent 60-80% of our development time on error analysis and evaluation. Expect most of your effort to go toward understanding failures (i.e. looking at data) rather than building automated checks.
I found this tip to be useful and surprising:
If you’re passing 100% of your evals, you’re likely not challenging your system enough. A 70% pass rate might indicate a more meaningful evaluation that’s actually stress-testing your application.
Via Hacker News
Tags: ai, generative-ai, llms, hamel-husain, evals
2025-07-04 04:19:34
Trial Court Decides Case Based On AI-Hallucinated Caselaw
Joe Patrice writing for Above the Law:[...] it was always only a matter of time before a poor litigant representing themselves fails to know enough to sniff out and flag Beavis v. Butthead and a busy or apathetic judge rubberstamps one side’s proposed order without probing the cites for verification. [...]
It finally happened with a trial judge issuing an order based off fake cases (flagged by Rob Freund). While the appellate court put a stop to the matter, the fact that it got this far should terrify everyone.
It's already listed in the AI Hallucination Cases database (now listing 168 cases, it was 116 when I first wrote about it on 25th May) which lists a $2,500 monetary penalty.
Tags: law, ai, generative-ai, llms, ai-ethics, hallucinations
2025-07-04 03:23:05
I built something that changed my friend group's social fabric
I absolutely love this as an illustration of the thing where the tiniest design decisions in software can have an outsized effect on the world.Dan Petrolito noticed that his friend group weren't chatting to each other using voice chat on their Discord server because they usually weren't online at the same time. He wired up a ~20 lines of Python Discord bot to turn people joining the voice channel into a message that could be received as a notification and had a huge uptick in conversations between the group, lasting several years.
Via Hacker News
Tags: social-software, discord
2025-07-03 22:28:56
Something I've realized about LLM tool use is that it means that if you can reduce a problem to something that can be solved by an LLM in a sandbox using tools in a loop, you can brute force that problem.
The challenge then becomes identifying those problems and figuring out how to configure a sandbox for them, what tools to provide and how to define the success criteria for the model.
That still takes significant skill and experience, but it's at a higher level than chewing through that problem using trial and error by hand.
My x86 assembly experiment with Claude Code was the thing that made this click for me.
Tags: llm-tool-use, ai-assisted-programming, claude-code, sandboxing, generative-ai, ai, llms