2026-03-03 14:04:04
I am currently developing "co-opt", a web-based optical design software that runs entirely in the browser.
In this post, I want to share the insights I gained while building a custom "constrained optimization engine" for lens design from scratch.
Before we dive in, a quick disclaimer: the methods introduced in this article are likely well-known, standard approaches to experts in mathematical optimization. Also, to prioritize clarity, I will be breaking down the concepts so that anyone who has touched basic calculus in high school or early college can understand them. Please note that this means the article contains expressions that are not strictly mathematically rigorous.
In lens design, we have specific goals, such as "I want the focal length to be exactly 50mm" or "I want to minimize optical aberration (blurring)."
Initially, I tried to handle this optimization using the LM (Levenberg-Marquardt) algorithm alone.
However, I immediately hit a fatal roadblock. While the LM method is ferociously good at "making the score (blur) as small as possible," it completely failed to adhere to constraints like "the center thickness of the lens must be at least 1mm." As a result, in its blind pursuit of a better score, the algorithm ran out of control, producing negative lens thicknesses and causing the whole process to crash.
Just as I was agonizing over how to handle these constraints, an expert on X (formerly Twitter) gave me a crucial piece of advice: look into the KKT conditions (Karush-Kuhn-Tucker conditions). Looking back, I think I might have learned this in university, but I had completely forgotten it. Time is cruel, isn't it?
I’d like to take this opportunity to deeply thank the person who gave me this hint. It meant the world to me.
That said, knowing the term "KKT conditions" and knowing how to translate it into code are two very different things. So, I started using AI as a sounding board, repeating endless cycles of brainstorming and debugging via prompts. After much trial and error, I discovered that I could break through this wall by combining the "Augmented Lagrangian Method (ALM)" with the "LM Method".
Before we get into the math, let me explain just one symbol:
∇f(x) (the symbol ∇ is read as "nabla"). This represents the gradient (slope) of the function f.
Imagine mountain climbing with multiple variables.
∇f(x) is simply an arrow (vector) that tells you "the direction of the steepest uphill slope from your current position, and how steep that slope is." In optimization (searching for the bottom of the valley), the basic rule is to move in the exact opposite direction of this arrow.
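To make this concrete, here is a tiny Python sketch (using a made-up bowl-shaped function, nothing from co-opt) that estimates the gradient numerically and repeatedly steps against the arrow:

```python
def f(x, y):
    # A made-up bowl-shaped "score": its valley bottom is at (3, -1).
    return (x - 3) ** 2 + (y + 1) ** 2

def grad_f(x, y, h=1e-6):
    # Numerical gradient: how much f changes per tiny step in each variable.
    dfdx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dfdy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return dfdx, dfdy

x, y = 0.0, 0.0
for _ in range(100):
    gx, gy = grad_f(x, y)
    # Move against the arrow: downhill.
    x -= 0.1 * gx
    y -= 0.1 * gy

print(round(x, 3), round(y, 3))  # approaches the minimum at (3, -1)
```

Each step shrinks the distance to the minimum by a constant factor, so the point slides down to the valley bottom.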
The method of Lagrange multipliers is a technique used to solve problems like: "Minimize the score f(x), but absolutely obey the equality rule (constraint) that g(x) = 0."
To handle this, we define a combined function called the Lagrangian, L(x, λ) = f(x) + λ·g(x). It's a beautiful theorem stating that if we find the point where the derivative of this L is zero (∇L = 0), we will find the point that minimizes the score while obeying the constraints. Expressed as an equation:

∇L = ∇f(x) + λ∇g(x) = 0
This represents a state where the "force trying to walk down the slope of the objective function (∇f)" and the "repulsive force pushing back from the constraint wall (λ∇g)" are perfectly balanced and at a standstill.
We often hear that "the KKT conditions are an extension of the method of Lagrange multipliers," and to get straight to the point: yes, it's true.
The method of Lagrange multipliers can only handle "equality constraints (g(x) = 0)".
However, in actual design, we absolutely need "inequality constraints," such as "the lens thickness must be 1mm or more." Writing the thickness t ≥ 1 and rearranging gives c(x) = 1 − t ≤ 0, which is the standard form c(x) ≤ 0 (the constraint violation must be zero or less).
The KKT (Karush-Kuhn-Tucker) conditions are exactly the theoretical extension of the Lagrange method that allows it to handle these inequality constraints. They are a collective set of mathematical rules that the optimal solution must satisfy, essentially stating, "It is impossible to improve the score any further without breaking the constraints."
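For reference, in the standard form "minimize f(x) subject to c(x) ≤ 0," the KKT conditions for an optimal point x* with multiplier μ read:

```latex
\begin{aligned}
&\text{Stationarity:} && \nabla f(x^*) + \mu \nabla c(x^*) = 0 \\
&\text{Primal feasibility:} && c(x^*) \le 0 \\
&\text{Dual feasibility:} && \mu \ge 0 \\
&\text{Complementary slackness:} && \mu \, c(x^*) = 0
\end{aligned}
```

Complementary slackness captures the wall intuition: either the constraint is inactive (c(x*) < 0, so μ = 0 and the wall pushes with no force), or the point sits right on the wall (c(x*) = 0) and μ > 0 measures how hard it pushes back.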
So, we understand our target destination: the KKT conditions. But how do we write a program to find that destination?
The defining feature of the Augmented Lagrangian Method (ALM) is that it runs a "double loop (inner loop and outer loop)" to gradually learn the repulsive force of the wall and inch closer to the optimal solution.
Words alone can be confusing, so first, take a look at the overall calculation flowchart implemented in my software (optimizer-mvp.ts).
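The double-loop idea can also be sketched in a few lines of Python. Everything below is illustrative (made-up names and a toy 1-D problem, not the actual optimizer-mvp.ts code), and a one-step Newton solve stands in for the LM inner solver:

```python
def f(x):
    # Objective ("score"): by itself it wants x = 2.
    return (x - 2.0) ** 2

def g(x):
    # Equality constraint: we must end up with x = 1, i.e. g(x) = 0.
    return x - 1.0

x, lam, mu = 0.0, 0.0, 1.0  # start point, multiplier, penalty weight
for _ in range(20):      # outer loop: learn the wall's repulsive force
    for _ in range(5):   # inner loop: minimize the augmented Lagrangian
        # L_A(x) = f(x) + lam*g(x) + (mu/2)*g(x)**2
        grad = 2.0 * (x - 2.0) + lam + mu * g(x)
        x -= grad / (2.0 + mu)  # Newton-style step (curvature of L_A is 2 + mu)
    lam += mu * g(x)  # update the multiplier from the remaining violation
    mu *= 2.0         # and tighten the penalty

print(round(x, 4), round(lam, 4))
```

Even though the unconstrained minimum sits at x = 2, the outer loop pushes x to the constrained optimum x = 1, and lam converges to the true Lagrange multiplier (λ = 2 for this toy problem).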
2026-03-03 13:58:13
Your "private" prompts are being baked into a training model that NEVER forgets. I just saw a firm lose a massive contract over ONE copy-paste. 📉
Stop being the training data:
https://www.sparkgoldentech.com/en/blog/2026/03/02/stop-treating-chatgpt-like-your-therapist-the-privacy-trap-nobodys-talking-about
2026-03-03 13:52:02
This is a submission for the Built with Google Gemini: Writing Challenge
I developed CV Advisor PRO, a senior career auditor designed to move beyond simple spell-checking and into high-level career coaching. Many talented professionals miss out on opportunities because they lack the "language of leadership." This tool democratizes access to the kind of feedback usually reserved for expensive executive consultants.
Google Gemini serves as the brain of the operation. Leveraging the Gemini 3 Flash Preview model via Google AI Studio, the app performs deep-tissue scans of professional documents. It uses Gemini’s massive context window and multimodal capabilities to:
Here is my video demo.
Building this tool taught me that career coaching is a delicate balance of data science and psychology.
The experience with Gemini 3 Flash was a game-changer for this specific use case:
The Wins: The model's responses are consistent and useful; I applied several improvements to my own CV. The speed is also incredible. For a real-time conversational coach, latency is the enemy, and Flash handled complex reasoning almost instantly. Its ability to "understand" the nuance between a responsibility and an achievement was far superior to previous models I've tested.
The Friction: Fine-tuning the "AI Career Coaching" aspect required several iterations. Initially, the model tended to be too polite (it praised everything, even irrelevant details). Another point of friction was that it types the response much faster than the user can read it. That's why I only enable interaction after the AI finishes speaking.
Future Support: I’d love to see even deeper integration for real-time web-scraping within the AI Studio environment to facilitate the "Live Job Comparison" feature I have planned next.
2026-03-03 13:49:39
You write something simple like this:
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
response = client.responses.create(
    model="gpt-4o",
    input="Explain backpressure in simple terms"
)
A few hundred milliseconds later, text begins streaming back.
It feels instant.
It feels simple.
But that single API call triggers a surprisingly complex distributed system involving:
An LLM API is not just “a model running on a server.”
It is a real-time scheduling and resource allocation system built on top of extremely expensive hardware.
Under the hood, your request is competing with thousands of others for:
Understanding this pipeline changes how you think about:
In this article, we’ll walk through exactly what happens — step by step — from the moment your request hits the edge of the network to the moment tokens stream back to your client.
No hype.
No marketing language.
Just the infrastructure.
Before diving into details, here’s the high-level flow of a typical LLM API request:
Each of these steps exists for a reason.
Each introduces tradeoffs.
And each can become a bottleneck under load.
Let’s break them down.
Title: LLM API Request Lifecycle
Client
↓
Edge / Load Balancer
↓
API Gateway
↓
Auth & Quota
↓
Request Queue
↓
Scheduler
↓
GPU Worker
↓
Streaming Response
↓
Client
Small labels under each step:
This diagram gives the reader a mental map before diving deeper.
Title: Why Latency Explodes Under Load
Key takeaway:
Arrival rate > processing rate → queue grows → latency explodes
This makes queueing behavior intuitive without math.
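To see the takeaway numerically, here is a small single-server queue simulation in Python (made-up arrival and service rates, purely illustrative):

```python
import random

def avg_wait(arrival_rate, service_rate, n=50000, seed=0):
    # Single-server queue: each request waits for everyone ahead of it.
    rng = random.Random(seed)
    t, free_at, total_wait = 0.0, 0.0, 0.0
    for _ in range(n):
        t += rng.expovariate(arrival_rate)  # next request arrives
        start = max(t, free_at)             # waits if the server is busy
        total_wait += start - t
        free_at = start + rng.expovariate(service_rate)  # service time
    return total_wait / n

# Server handles 10 req/s on average; watch waits as arrivals approach 10.
for lam in (5, 8, 9, 9.5):
    print(lam, round(avg_wait(lam, 10.0), 3))
```

Waits stay small at 50% utilization but blow up disproportionately as the arrival rate approaches the service rate, which is exactly the "latency explodes" regime.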
Title: Naive Batching vs Continuous Batching
Naive batching:
Time →
[ Batch 1 ] idle [ Batch 2 ] idle [ Batch 3 ]
Continuous batching:
Time →
A B C
D E
F
(all decoding together)
This diagram explains why modern inference systems behave differently.
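A toy simulation makes the difference measurable (illustrative Python with made-up sequence lengths; real schedulers are far more involved):

```python
def naive_batch_steps(lengths, batch_size):
    # Run fixed batches; each batch occupies the GPU until its LONGEST
    # sequence finishes, so short sequences leave slots idle.
    steps = 0
    for i in range(0, len(lengths), batch_size):
        steps += max(lengths[i:i + batch_size])
    return steps

def continuous_batch_steps(lengths, batch_size):
    # Admit a new sequence the moment a slot frees up.
    slots = [0] * batch_size   # remaining tokens per slot
    queue = list(lengths)
    steps = 0
    while queue or any(slots):
        for i in range(batch_size):
            if slots[i] == 0 and queue:
                slots[i] = queue.pop(0)
        steps += 1
        slots = [max(0, s - 1) for s in slots]
    return steps

lengths = [100, 10, 10, 10, 100, 10, 10, 10]
print(naive_batch_steps(lengths, 4), continuous_batch_steps(lengths, 4))
```

With these mixed lengths, naive batching holds every slot hostage to the longest sequence in its batch, while continuous batching backfills freed slots and finishes in roughly half the decode steps.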
Title: Two Phases of LLM Inference
Prefill phase:
Decode phase:
Visual flow:
Prompt tokens → KV Cache
KV Cache → Token 1 → Token 2 → Token 3 → ...
This diagram makes streaming behavior obvious.
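The two phases can be caricatured in a few lines of Python (stand-in data structures, not a real attention implementation):

```python
def prefill(prompt_tokens):
    # Process the whole prompt in one pass and cache per-token state.
    kv_cache = [("KV", t) for t in prompt_tokens]  # stand-in for real K/V tensors
    return kv_cache

def decode_step(kv_cache):
    # Generate ONE token, attending over everything cached so far.
    next_token = f"tok{len(kv_cache)}"
    kv_cache.append(("KV", next_token))  # the cache grows by one entry
    return next_token

cache = prefill(["Explain", "backpressure"])
generated = [decode_step(cache) for _ in range(3)]
print(generated)
```

Because decode produces exactly one token per step while reusing the cache, token-by-token streaming is simply the natural output shape of the decode loop.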
Title: Token Streaming Lifecycle
Request start
↓
Prefill
↓
Decode token 1 → sent
Decode token 2 → sent
Decode token 3 → sent
↓
Client disconnect?
├─ Yes → cancel → cleanup resources
└─ No → continue decoding
This highlights an often-overlooked detail:
cancellation must propagate through the system to avoid wasted GPU work.
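Here is a minimal illustration in Python (a hypothetical decode loop, not any real inference server) of checking for disconnects between tokens:

```python
def decode_stream(max_tokens, is_disconnected):
    # Stand-in for the real decode loop: one token per iteration.
    produced = 0
    for _ in range(max_tokens):
        if is_disconnected():
            break                  # propagate cancellation: stop decoding
        produced += 1              # (real systems: run one decode step)
        yield f"token_{produced}"
    # cleanup would happen here: release KV cache, return the slot

# Simulate a client that disconnects after reading 3 tokens.
received = []
gone = lambda: len(received) >= 3
for tok in decode_stream(max_tokens=1000, is_disconnected=gone):
    received.append(tok)

print(received)  # only 3 tokens decoded instead of 1000
```

Without the `is_disconnected()` check, the loop would happily decode all 1000 tokens for a client that left after three, wasting the GPU the whole time.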
Next, we’ll dive into what happens when your request enters the inference queue — and why that queue is where most latency problems begin.
2026-03-03 13:43:59

Announcement
With the font generator project I released today, you can create any font you want in a style you define. You can freely recreate the unique fonts of a movie or game you are a fan of. The link is right below 👇
davincifont.labdays.io
Context
I started the Vibe Coding Challenge. I plan to release a new product every day, and today is my 5th day. You can visit my website (labdays.io) to learn about the process.
Notes from the 5th day of the Challenge
Starting a new project is my favorite thing. The later stages and polishing are a bit more tedious.
Generally, some work from the previous day's project spills over into the next day.
Lack of sleep visibly reduces productivity (Going to bed at 3 AM - Waking up at 7 AM).
I used to spend a long time looking at the projects I had done and the articles I had written, thinking about them over and over again. Now I don't think about them anymore (I don't have time!).
This challenge taught me the importance and satisfaction of working fast in just 5 days. Even if I work on a different project in the future, I should replicate this fast-paced challenge format there as well.
The fact that AI can code saddened me as a developer. I felt like the workers who broke the machines during the industrial revolution. But now I love it and I am integrating it into my workflow.
I am noticing how my previous projects are compounding into my future ones, and this is amazing.
When I look at the old projects of the creator of Openclaw, we can see the small pieces that lead to Openclaw. I hope this will happen for me too and lead to a big project.
Yesterday I only slept for 4 hours, and this directly impacted my productivity. I took a 30-minute break, and even that was enough to recharge my energy.
Half the day is over, and I have 5 half-finished projects.
The initial state of a project is very important in terms of its progress. When overly detailed plans are provided, the project spirals into an unmanageable state through its own decisions.
It was disappointing to realize that even the projects I had planned and considered niche had already been built.
I suppose once everything has been built, what remains will be novel inventions and the synthesis of multiple things, transforming them through proper orchestration.
Some days the AI works well, other days it works very badly. Even if the providers don't disclose it to users, I think they are secretly throttling the performance.
The more projects, the better. Some don't lead anywhere good, and it's better to have your hands full than to end up empty-handed. Days are short.
Although design shapes thought, functionality comes first. I shouldn't make the mistake of jumping into design without nailing down the project's functionality ever again.
Instead of providing a long prompt, it works much better when I point to the task.md file and ask it to perform the tasks there.
The project name is also an interface and changes the way you think about the project.
It's 6 hours until midnight and I still haven't produced anything tangible (Lack of sleep is not good at all).
Time constraints make implementing features like membership systems difficult. One of my next projects might be a membership template.
2026-03-03 13:43:40
Generated with Claude (Anthropic) based on a real session with @ankitg12.
Tags: #claude #ai #aiassisted
This is the second post in a two-part series. The first post covers getting Unix tools (grep, ls -lt, sed, awk) working in PowerShell — and cost ~102,000 tokens to figure out. Once the post was written, the natural next question was: can we skip the browser and publish it programmatically?
Turns out — yes, but not how you'd expect. Here's what we tried and what actually worked.
Medium was the first choice. They used to have a clean REST API. Not anymore.
"Medium will not be issuing any new integration tokens for our API and will not allow any new integrations."
The GitHub repo was archived in March 2023. Existing tokens still work — but if you don't already have one, you can't get one. A paid membership doesn't help either.
Workaround: Publish to dev.to first, then use Medium's Import a Story tool to pull the URL in. Medium even sets the canonical URL back to dev.to, which is good for SEO.
devto-cli by sinedied is the right idea — a purpose-built npm CLI for publishing markdown files to dev.to:
npm install -g @sinedied/devto-cli
dev push article.md
Hit this immediately:
No GitHub repository provided.
Use --repo option or .env file to provide one.
devto-cli requires a GitHub repo because it rewrites relative image URLs to point to raw GitHub content. Smart for image hosting — but unnecessary overhead for a text-only post with no images.
Could pass --repo username/repo and move on, but at this point it felt like the tool was doing more than needed.
dev.to has a straightforward REST API. The API key is still alive and well (unlike Medium's). A minimal test first:
curl -s -w "\nHTTP:%{http_code}" \
-X POST https://dev.to/api/articles \
-H "api-key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"article":{"title":"test","body_markdown":"hello","published":false}}'
Got HTTP:201. No firewall, no issues.
For the real article, build the JSON payload with Python (to handle escaping cleanly) and pass it via --data @file:
import json

with open("article.md", encoding="utf-8") as f:
    content = f.read()

# Strip H1 title — dev.to uses the title field separately
body = "\n".join(content.split("\n")[2:]).strip()

payload = {
    "article": {
        "title": "Your Article Title",
        "body_markdown": body,
        "published": False,  # draft first
        "tags": ["powershell", "windows", "tutorial", "ai"]
    }
}

with open("payload.json", "w", encoding="utf-8") as f:
    json.dump(payload, f)
Then publish:
curl -s -X POST https://dev.to/api/articles \
-H "api-key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
--data @payload.json
To update an existing draft (get the article ID from /api/articles/me/unpublished):
# Get draft IDs
curl -s -H "api-key: YOUR_API_KEY" \
  https://dev.to/api/articles/me/unpublished \
  | python3 -c "
import sys, json
for a in json.load(sys.stdin):
    print(a['id'], a['title'])
"
# Update draft
curl -s -X PUT https://dev.to/api/articles/ARTICLE_ID \
-H "api-key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
--data @payload.json
The URL returned by the API is not directly accessible:
https://dev.to/username/article-slug-temp-slug-XXXX
To preview a draft, go to dev.to/dashboard, open the draft, and dev.to generates a preview URL with a token:
https://dev.to/username/slug?preview=<long_token>
This is expected behavior — drafts are private until published.
Write markdown locally
↓
python3 build_payload.py # build payload.json
↓
curl --data @payload.json # POST to dev.to (draft)
↓
Review on dev.to dashboard
↓
Hit Publish
↓
Import URL into Medium # medium.com/p/import
| Method | Status | Notes |
|---|---|---|
| Medium API | Dead | No new tokens since 2023 |
| devto-cli | Works | Needs GitHub repo even for no-image posts |
| curl + Python payload | Works | Simplest, no dependencies |
| Python urllib | 403 | SSL/TLS difference vs curl on Windows |
| Medium Import tool | Works | Pull from dev.to URL after publishing |