2025-10-24 09:00:00
Just one week after I published my Sept 2025 edition of How I Vibe Coding, Anthropic released Claude Sonnet 4.5. That's a quick reminder that in this field, SOTA doesn't stay still. This post updates my mindset after a month with the new model.
There is one addition to my mindset: slow is fast.
The more time I spend coding, the more I realize that attention is my most important resource. The bottleneck is increasingly which task I choose to focus on. Faster models don’t necessarily make people more productive; they consume attention rather than free up capacity.
There is a matrix around quality and speed: {high, low} × {quality, speed}. Obviously we should avoid low quality and low speed, and high quality with high speed is not achievable. The tradeoff then comes down to choosing quality or speed; that’s the question.
My understanding now is that quality is the most important thing. It's more acceptable to me that a model is slow but produces high quality results. At the current stage, attention is my most valuable resource, and I need to allocate it wisely. A slow but correct model consumes less of my attention and lets me focus on finding the right path. A fast but flawed model drains all my attention just to fix its errors, remove unnecessary comments and tests, or correct the lies it keeps telling.
The improvement brought by each model update seems smaller and smaller. Claude Sonnet 4.5 may be better than Claude Sonnet 4, but it is still not as good as Claude Opus 4.1. I believe people like Claude Sonnet 4.5 mainly because it is much cheaper and faster than Claude Opus 4.1. People say Claude Sonnet 4.5 is good enough, but I stopped believing that after getting stuck in a dead loop for a long time. Now I care more about quality than cost (in both money and time).
A fast model that drains my attention is slower in the long run. Slow models that get things right free me to think, and that’s what real speed feels like.
No change here: I'm still using codex, xlaude, and zed.
More details can be found in the Sept 2025 edition.
One experiment I tried this month is using Codex via Cloud, Slack, and GitHub. Here are my thoughts:
It’s tempting to spawn tasks over the cloud, just delegate and walk away. I can monitor progress from my browser or phone, which feels convenient.
But in practice, it doesn’t work well for me.
As a result, I rarely use Cloud Codex for real tasks. The only useful cases I’ve found are for asking questions and exploring codebases—like querying the system as if it were an architect. These tasks aren’t time sensitive, so running them in the cloud feels cozy. I just check the results later.
I integrated Codex into our team’s Slack. But it’s still at a very early stage: It’s just a quick shortcut to start a new task. All you get are two notifications: “I’m running” and “I’m finished.” It can’t submit PRs directly from Slack, can’t comment on results in chat, and can’t hold a conversation. Every operation requires clicking through to the Cloud task page.
This design makes the Slack Codex integration far less useful than I expected. I haven’t tried Cursor yet; does it work better?
Codex also has GitHub integration: you can enable Codex reviews and request changes. But again, it’s essentially just a way to trigger a Cloud Codex task. The overall UX is underwhelming.
That said, I’ve noticed that Codex’s code reviews are surprisingly high quality. It catches real, important bugs in PRs, not the meaningless nitpicks that GitHub Copilot often delivers. When everything looks fine, it just leaves a simple 👍. I like this approach.
The maintainer still needs to review PRs themselves, but Codex reviews have at least been helpful instead of adding more work for maintainers. I believe that’s already a big improvement.
Today, Kimi announced a coding plan for its users. Nearly all major model providers have shifted to subscription-based pricing. I believe this is the trend, just like how everyone now uses mobile data plans instead of paying per GB. That’s human nature: people prefer predictable, affordable pricing.
This shift also pushes service providers to improve cost efficiency and quality. I’m not overly concerned about providers “dumbing down” their services because the market is fiercely competitive. Reducing quality means losing users to competitors.
One of my friends complains there’s too much new stuff to keep up with every day. I agree, it’s true. But you can allocate your attention wisely: focus only on SOTA models and use the best available in your daily work. Don’t get caught up in debates over whether Model A is better than Model B on Task X.
We are developers. Our attention is our most valuable resource. Allocate it carefully. It’s not our job to hunt down the absolute best models; the market will reveal them. Sometimes it’s unclear what’s truly the best, but SOTA is easy to identify. As of now, the top choices are OpenAI, Anthropic, and Google. I’m not including Qwen since Qwen3 Max isn’t affordable for coding use. Pick one from these three based on your preference.
Reevaluate this decision every one to three months. At other times, you’re still a developer. AI news should occupy no more than 10% of your input information. Focus instead on your language, tools, frameworks, and industry trends.
Hope you're enjoying the coding as before.
2025-09-22 09:00:00
I wrote How I vibe Coding? back in June. Nearly three months later, things have changed. Time to update this article to reflect my current setup. In this piece, I’ll share how I use my tools to do vibe coding. As always, I’m writing this down not just to document my journey, but to inspire you and welcome your feedback.
My current toolset consists of codex, xlaude, and zed.
I switched from Claude Code to Codex solely to access GPT-5-High, and now there's a better version called GPT-5-Codex-High. I’m not happy with Claude’s recent drama, and I’m glad GPT-5-High is outstanding. It’s on par with Opus 4.1, if not better.
Some obvious good points about GPT-5 I can tell:
- It runs git status -sb to check the diff properly.

Codex doesn’t have the ecosystem that Claude Code offers, and its agentic workflow isn’t as rich. It only just added resume support and still doesn’t support resuming sessions based on project paths. Generally, I think Codex isn’t as strong as Claude Code, but the gap is shrinking fast. The team behind Codex has been doing great work lately. Still, Codex has some clear advantages of its own.
Overall, I’m happy with this switch to a better model even if the agent capabilities are slightly worse.
Xlaude, Xuanwo’s Claude Code, is a CLI tool I built for myself to manage Claude instances using git worktrees for parallel development workflows. Originally designed for Claude, it now supports Codex too.
Whenever I need to work on something, I run xlaude create abc. Under the hood, it does this:
- Creates a git worktree named <project>-abc in the parent folder of <project>
When I’m done, I run xlaude delete to remove the worktree.
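If you are curious what that looks like in terms of plain git commands, here is a rough Rust sketch of the idea. The worktree location, branch naming, and flags are my assumptions about the approach, not xlaude's actual implementation:

```rust
use std::process::Command;

// Rough sketch (my assumption, not xlaude's actual code) of what
// `xlaude create abc` does: add a git worktree named `<project>-abc` in the
// parent folder of the project, on a fresh `abc` branch.
fn main() -> std::io::Result<()> {
    let name = "abc";
    let project = std::env::current_dir()?;
    let project_name = project.file_name().unwrap().to_string_lossy().into_owned();
    let target = project.join("..").join(format!("{project_name}-{name}"));

    // Equivalent to: git worktree add -b abc ../<project>-abc
    let status = Command::new("git")
        .args(["worktree", "add", "-b", name])
        .arg(&target)
        .status()?;
    assert!(status.success(), "git worktree add failed");

    // `xlaude delete` later does roughly the reverse: git worktree remove ../<project>-abc
    Ok(())
}
```

The nice property of worktrees is that each agent gets its own checkout and branch, so several instances can work on the same repository in parallel without stepping on each other.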
Xlaude also includes a dashboard powered by tmux that runs Codex inside a persistent session, so I don’t lose my work if I accidentally close the terminal. But these days, I prefer the simpler xlaude create command. It’s easier to track. Another great feature is its list of active worktrees, so I never lose sight of ongoing projects.
I’m now running Codex inside Ghostty directly instead of using Zed’s terminal tab. After spending more time with the code agent, I’ve realized I don’t need an IDE open 80% of the time. I only need it in two scenarios: when a task is nearly complete and I’m ready to review, or when a task fails and I need to dig deep to understand what’s going wrong and guide the code agent on what to do next.
Zed is perfect for these two cases. I can open Zed in the current directory from Ghostty by typing zed .. It starts instantly, letting me keep my train of thought without losing momentum while waiting. Zed also offers excellent diff views at both the project and file levels. The overall review experience is pretty great.
So in conclusion: I’m using Codex + Xlaude + Zed for vibe coding. Codex for coding, Xlaude for task management, and Zed for code reviewing and troubleshooting.
It’s interesting that my mindset hasn’t changed much since three months ago. LLMs are still pretty much like a recent graduate at a junior level, maybe a bit sharper but still junior. As the driver, we still need to stay in control of the task and be ready to take over anytime.
Here’s what I’ve learned this year:
Three months ago, Claude Opus 4 was unbeatable. Then Opus 4.1 came out and was slightly better. But now, in September 2025, GPT-5-Codex with high reasoning is superior. The best part is you only need ChatGPT Plus at $20 to access the top coding model. With Claude, people can only use Sonnet 4 on the Pro plan at $17; to get access to Opus, you need at least the Max plan at $100.
Three months later, MCP for coding is still a lie. People don’t really need MCP. Any MCP can just become a plain CLI or simple curl call, which LLMs have already mastered. Adding too many MCP servers is just a waste of context.
Use tools instead of configuring MCP servers in coding.
Subscription is the future. Code agents are designed to be token-intensive, so just subscribe to the best model you want; paying per request or per token makes no sense anymore. I especially can’t understand why anyone pays over $200, even $1,000, for Cursor or APIs. By simply switching to subscription-based services, you achieve massive cost savings.
More and more subscription services are emerging. OpenAI lets ChatGPT users access Codex, and Cerebras has announced Cerebras Code with a similar pricing strategy. GLM also has its own plans. It’s not hard to predict that Google will soon join this battle and let Workspace users access Gemini.
Stop paying for tokens. Use subscriptions now. By the way, don’t pay yearly; stick to monthly. Always stay ready to switch to a better option.
Hope you're enjoying the coding vibes: create more, hype less.
2025-08-06 09:00:00
It's often said that Rust has a steep learning curve. I disagree with this notion. I'm a strong believer in learning by doing. Rust is a programming language, and like any language, it should be learned by applying it to real projects rather than relying solely on books or videos. However, learning by doing can't solve every problem that newcomers might encounter. While it helps with grasping the basics, when it comes to mastering Rust's advanced features like ownership, traits, lifetimes, and async, we need more than just hands-on practice. We need to understand. We need to reason. Thanks to Code Agents, I discovered something even better: learning Rust by reasoning (with Code Agents).
When we reason with code, we're doing more than just following its execution. We're trying to piece together the thought process behind it. We're imagining the mindset of the person (or the AI) who created it, and questioning it.
| Reading | Reasoning |
|---|---|
| "This line uses Pin<&mut Self>." | "Why do we need Pin here? What breaks if we remove it?" |
| "Here we use a match statement." | "Why not if let? Would it change the behavior?" |
| "The field is wrapped in Arc." | "Is sharing needed? Who else uses this data?" |
Reasoning always involves a question, not just a fact. The power of AI-assisted programming is not in generating code. It’s in giving us something to reason about.
Reasoning mimics how we truly understand complex systems.
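To make the table concrete, take the second row. Here is the kind of throwaway experiment I might write to answer it (a toy sketch of my own, not code from any real PR), checking whether swapping match for if let changes behavior:

```rust
// Toy experiment: does swapping `match` for `if let` change behavior?
fn describe(value: Option<i32>) -> &'static str {
    match value {
        Some(n) if n > 0 => "positive",
        Some(_) => "zero or negative",
        None => "missing", // match forces us to handle this arm explicitly
    }
}

fn main() {
    let value = Some(-3);

    // `if let` only names the cases we care about; everything else silently
    // falls into the `else`, which is exactly the behavioral difference to probe.
    if let Some(n) = value {
        println!("if let saw {n}; match says: {}", describe(value));
    } else {
        println!("no value at all");
    }
}
```

Writing and running a tiny experiment like this is what turns a passive read into an answer you actually trust.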
A Reasoning-Driven Learning Loop looks like this: generate a diff, ask why, ask what if, and don’t stop until it makes sense.
This is not passive consumption. This is active debugging: not of the code, but of our understanding.
Not all diffs are equally educational. Here’s what I look for: sometimes it’s just one line, but one line is enough if we go deep enough.
Let's try it step by step.
First, we need an idea to work on or a problem to solve. It should be moderately challenging, not as simple as fixing a typo or renaming a type, but not so complex that it turns into an entire project. Ideally, it could be something like a PR with about 250 lines of code that is self-contained.
If you don't have an idea, feel free to pick an issue from Apache OpenDAL so you can contribute to open source while learning Rust!
I will use an example from OpenDAL: the good first issue Migrate all layers to context-based.
First, we discuss the issue with the maintainer and agree on the general idea of how to implement it. Then we start Claude Code. Note that we are currently Rust newbies, so let's focus on getting Claude Code working on this issue first before reasoning through it.

Claude Code will address this issue and handle the work behind the scenes. Let's just prepare a cup of coffee and wait for a while. We might be interrupted a few times if something goes wrong, but eventually, we'll have a basic implementation. Remember to ask Claude to run cargo clippy to ensure the code compiles.

Don't rush here; our work has just begun. Please NEVER submit a PR without thoroughly reasoning it out. Otherwise, it could waste time for both us and the maintainer.
We will need to review the code line by line. Let me give you an example. In this PR, we have a manually implemented Stream:
```rust
// Excerpt from OpenDAL's logging layer; LoggingStream, Buffer, and
// LoggingInterceptor are types defined elsewhere in the crate.
use std::pin::Pin;
use std::task::{Context, Poll};

use futures::Stream;

impl<S, I> Stream for LoggingStream<S, I>
where
    S: Stream<Item = Result<Buffer>> + Unpin + 'static,
    I: LoggingInterceptor,
{
    type Item = Result<Buffer>;

    fn poll_next(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
        ...
    }
}
```
There are many new concepts for a Rust beginner: Unpin, Pin, Poll, Context. Let's focus on one of them: what does Context mean here? What happens if I don't pass it correctly in inner.poll_next_unpin()? To prevent Claude from interpreting our questions as a code improvement suggestion, it's better to make a git commit first and then ask Claude clearly.
I’m a Rust learner and I’m just learning by reasoning about your code. Please tell me what the context means in the Stream trait impl? What happens if I don’t pass it down in inner.poll_next_unpin()?

Most of the time, Claude can provide good explanations like this one. However, the most important thing is that you must keep reasoning and not trust them without verification. Sometimes, you might need to ask Claude to provide runnable examples and repeat the process. It can be challenging at first, but most of your problems will disappear after several iterations. You'll also find that more and more of your reasoning becomes project-related and not just about Rust.
When that happens, enjoy programming by reasoning with code agents.
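For instance, when I asked about Context, the runnable example I ended up with looked roughly like this (a simplified sketch with hypothetical names, using the futures crate rather than OpenDAL's real types). It shows why cx must be passed down: it carries the waker that the inner stream registers so our task gets rescheduled later.

```rust
use std::pin::Pin;
use std::task::{Context, Poll};

use futures::{Stream, StreamExt};

// A stream that forwards to an inner stream and doubles each item.
struct Doubling<S> {
    inner: S,
}

impl<S> Stream for Doubling<S>
where
    S: Stream<Item = i32> + Unpin,
{
    type Item = i32;

    fn poll_next(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
        // Passing `cx` down lets the inner stream register the current task's
        // waker. If we polled it with a dummy context instead, nothing would
        // ever wake this task once the inner stream returns Pending.
        self.inner.poll_next_unpin(cx).map(|item| item.map(|n| n * 2))
    }
}

fn main() {
    let s = Doubling { inner: futures::stream::iter([1, 2, 3]) };
    let out: Vec<i32> = futures::executor::block_on(s.collect::<Vec<_>>());
    println!("{out:?}"); // [2, 4, 6]
}
```

With stream::iter nothing ever returns Pending, so to actually observe the hang you would need an inner stream backed by I/O or a timer; that is exactly the kind of follow-up experiment worth asking Claude to build.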
Why does this method work? I believe it aligns with how humans think and understand concepts. By reasoning through existing code, we learn how code is actually written in the real world and connect it with the pieces already in our minds. Most importantly, we transform the learning process so we can focus only on what we don't understand or have concerns about, instead of repeatedly reading what we already know. This saves a lot of time compared to reading entire books just to answer one small question.
Code agents are not here to replace our learning. They are here to generate raw materials for our thoughts. Learning Rust by reasoning is not about trusting the AI. It's about building our own judgment on top of it.
The last thing I want to emphasize is: reasoning comes first! Don’t generate low-quality PRs this way. We must put our own thinking into it and take responsibility for the results. We are the captain. AI can help generate routes for us, but we must decide the next actions ourselves.
If you want to try reasoning with a real-world issue, feel free to check out OpenDAL's good first issues.
Generate a diff. Ask why. Ask what if. And don’t stop until it makes sense.
That's all, thanks for reading and happy learning!
2025-07-24 09:00:00
I've been pinged by @vaibhaw_vipul to read and share my thoughts on Just make it scale: An Aurora DSQL story. It's actually quite enjoyable. I agree with @iavins's comment that "It's less about database development and more like a love letter to Rust."
I've talked about rewrite bigdata in rust (RBIR) quite a lot, and this article offers great examples for my theory: Rust is a good choice for data-intensive infrastructure. It's time for us to rewrite bigdata in Rust.
While each database service we’ve launched has solved critical problems for our customers, we kept encountering a persistent challenge: how do you build a relational database that requires no infrastructure management and which scales automatically with load? One that combines the familiarity and power of SQL with genuine serverless scalability, seamless multi-region deployment, and zero operational overhead? Our previous attempts had each moved us closer to this goal. Aurora brought cloud-optimized storage and simplified operations, Aurora Serverless automated vertical scaling, but we knew we needed to go further. This wasn’t just about adding features or improving performance - it was about fundamentally rethinking what a cloud database could be.
As a brief background, Aurora DSQL is a serverless, distributed SQL database built on PostgreSQL. It is well-suited for global operations and highly consistent, distributed OLTP workloads. Comparable products include Google Cloud Spanner, CockroachDB, and TiDB.
To validate our concerns, we ran simulation testing of the system – specifically modeling how our crossbar architecture would perform when scaling up the number of hosts, while accounting for occasional 1-second stalls. The results were sobering: with 40 hosts, instead of achieving the expected million TPS in the crossbar simulation, we were only hitting about 6,000 TPS. Even worse, our tail latency had exploded from an acceptable 1 second to a catastrophic 10 seconds. This wasn’t just an edge case - it was fundamental to our architecture. Every transaction had to read from multiple hosts, which meant that as we scaled up, the likelihood of encountering at least one GC pause during a transaction approached 100%. In other words, at scale, nearly every transaction would be affected by the worst-case latency of any single host in the system.
GC pauses are real. We talk about GC every day, which makes it seem like a well-solved problem, but it's not. In high-load distributed systems, this can be a serious issue that renders your system unusable. As the author mentioned: "they were very real problems we needed to solve"
The language offered us predictable performance without garbage collection overhead, memory safety without sacrificing control, and zero-cost abstractions that let us write high-level code that compiled down to efficient machine instructions.
Well, it's boring, right? I know some friends who almost experienced PTSD when it comes to "memory safety", but please let me expand the discussion about memory safety in C and C++ one last time.
Many people treat this as a skill issue, claiming that it's possible to write memory-safe code in C and C++ as well. I believe them. I trust that they can write memory-safe C in the right projects, with the right people, at the right time.
I emphasize right here because I also recognize that they're human (correct me if I'm wrong!). Humans make mistakes. Some projects might lack documentation, so we may not realize we need to read within a 1024-byte limit; sometimes new team members join and aren't yet familiar with the hidden context; or there are days when we simply can't think clearly.
In other words, while we can write safe code in C or C++, Rust ensures our code is safe through zero-cost abstractions. That changes everything. As a reviewer, we no longer need to catch every pointer usage. Instead, we can focus on reviewing business logic.
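Here is a tiny sketch of my own (not from the AWS article) showing the kind of mistake the compiler now catches for the reviewer:

```rust
fn main() {
    let data = vec![1, 2, 3];

    // Ownership of `data` moves into the spawned thread...
    let handle = std::thread::spawn(move || {
        println!("in thread: {data:?}");
    });

    // ...so touching it afterwards is a compile-time error, not a data race
    // a reviewer has to spot by eye:
    // println!("in main: {data:?}"); // error[E0382]: borrow of moved value: `data`

    handle.join().unwrap();
}
```

In C or C++ this is exactly the kind of sharing bug that depends on the reviewer knowing the hidden context; in Rust it never reaches code review.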
Returning to the title of this article: just make it scale. Rust is a scalable language; it can grow with our team and our project's size.
Rather than tackle the complex Crossbar implementation, we chose to start with the Adjudicator – a relatively simple component that sits in front of the journal and ensures only one transaction wins when there are conflicts. This was our team’s first foray into Rust, and we picked the Adjudicator for a few reasons: it was less complex than the Crossbar, we already had a Rust client for the journal, and we had an existing JVM (Kotlin) implementation to compare against.
Good choice.
I've seen many teams fail when migrating to Rust because they try to switch everything at once. They send out a team message saying, 'Hey everyone, we're rewriting everything in Rust now!' Soon after, they find their team burning out quickly: they need to learn an unfamiliar language to build features, and they still have to be oncall for the old services. It can be really tough.
Starting with a small project or module within your existing projects is always a great idea. It allows you to evaluate Rust's value and gives developers some time to learn Rust first.
But after a few weeks, it compiled and the results surprised us. The code was 10x faster than our carefully tuned Kotlin implementation – despite no attempt to make it faster. To put this in perspective, we had spent years incrementally improving the Kotlin version from 2,000 to 3,000 transactions per second (TPS). The Rust version, written by Java developers who were new to the language, clocked 30,000 TPS.
I've decided not to comment on this too much. THIS IS RUST, guys.
We decided to pivot and write the extensions in Rust. Given that the Rust code is interacting closely with Postgres APIs, it may seem like using Rust wouldn’t offer much of a memory safety advantage, but that turned out not to be true. The team was able to create abstractions that enforce safe patterns of memory access.
I wonder if they are using pgrx or simply building a Rust API against Postgres's C API.
At first, things went well. We had both the data and control planes working as expected in isolation. However, once we started integrating them together, we started hitting problems. DSQL’s control plane does a lot more than CRUD operations, it’s the brain behind our hands-free operations and scaling, detecting when clusters get hot and orchestrating topology changes. To make all this work, the control plane has to share some amount of logic with the data plane. Best practice would be to create a shared library to avoid “repeating ourselves”. But we couldn’t do that, because we were using different languages, which meant that sometimes the Kotlin and Rust versions of the code were slightly different.
I wonder if they've considered implementing it in Rust and exposing it through JNI. It seems natural to me, but I'm not sure why they didn't mention this option in the post.
Rust turned out to be a great fit for DSQL. It gave us the control we needed to avoid tail latency in the core parts of the system, the flexibility to integrate with a C codebase like Postgres, and the high-level productivity we needed to stand up our control plane. We even wound up using Rust (via WebAssembly) to power our internal ops web page.
This statement reminds me of Niko Matsakis's Rust in 2025: Targeting foundational software.
I see Rust's mission as making it dramatically easier to create and maintain foundational software.
Rust is an excellent fit for foundational software like DSQL and your project!
We assumed Rust would be lower productivity than a language like Java, but that turned out to be an illusion. There was definitely a learning curve, but once the team was ramped up, they moved just as fast as they ever had.
Given the popularity of code agent tools like Claude Code, I want to point out that Rust is especially well-suited for vibe coding.
It's quite easy for Claude Code to use tools like cargo clippy to build and fix compiler errors in Rust code. Users just need to provide the correct instructions for code agents. Once it builds and passes all tests, we can use it with confidence, without worrying about memory safety or runtime type mismatches.
Worth a try!
Rust is hyped for solid reasons. It's a foundational language. Welcome it, embrace it, enjoy it!
2025-06-29 09:00:00
TL;DR: I'm leaving Databend to join LanceDB.
I still remember the day I spoke with Bohu about joining Databend. It was a winter day with the sun shining brightly (Actually, I don't really remember if it was sunny and bright). We talked a bit about data warehouses, Rust, Snowflake, and open source. It was a time before AI, lakehouses, and table formats.
In the years at Databend, I went through my fastest growth both in life and in tech.
During my years at Databend, I married my wife, who has become the center of my life. We bought a cozy house near Tianjin, where we live with our two lovely dogs (Naihu and Theo). Databend witnessed every major milestone in my personal life.
At the same time, Databend stood behind every milestone in my career.
OpenDAL entered and then graduated from the ASF Incubator; I was nominated as an ASF Member, and I was invited as a committer for iceberg-rust, etc. If QingCloud was a school that taught me how to work, then Databend is the place that allowed me to shine and fully realize my potential.
Databend is now entering a new and steadier phase, exactly what we set out to build back in that Rust-and-Snowflake conversation with Bohu. My heartbeat is still synced to the 0 → 1 phase. Designing protocols, chasing regressions, turning bold ideas into first commits. So I chose to hand over my modules while everything is smooth. I leave with deep gratitude, knowing that Databend's codebase, and the friendships behind it, will continue to grow long after I step aside.
My next adventure is LanceDB, where vector lakes meet Rust. Different waves, same ocean.
Readers may know that my personal vision is Data Freedom, and I am currently working on Rewriting Big Data in Rust. I really don't want to work at a big tech company (honestly, I'm not sure I could pass the interview), and I also don't want to become a competitor to Databend (even though they haven't asked me to sign an NCC—I just don't want to).
LanceDB appeared: LanceDB is an open-source multimodal database designed for efficient storage, retrieval, and management of vectors and multimodal data. It's quite appealing to me to create a new format that could set the standard for the new AI era. Concretely, Lance's columnar container layout and on-disk HNSW index open a playground for Rust-level SIMD tuning, exactly the kind of hard-core storage work I've been missing. I would have the opportunity to work on low-level storage optimizations to support vector search and unstructured data storage. This would also be a great chance to learn many new things.
LanceDB is strongly committed to open source. By joining LanceDB, I could stay active in many open source communities like opendal, arrow, datafusion, iceberg, and even databend in the future. I could still connect them all to build something truly fantastic.
Joining LanceDB feels like the most straightforward way to achieve that vision of Data Freedom, one vector at a time.
TL;DR in one line: grateful waves goodbye, curious waves hello. See you at the next PR review — perhaps in LanceDB, perhaps back in Databend, always in open source.
2025-06-26 09:00:00
Hello everyone, long time no see. I've been evaluating various AI copilots extensively lately and have developed a fairly stable workflow that suits my context and background. I'm now writing it down to hopefully inspire you and to receive some feedback as well.
I'm Xuanwo, an open source Rust engineer.
Open source means I primarily work in open source environments. I can freely allow LLMs to access my code as context, without needing to set up a local LLM to prevent code leaks or meet company regulations. It also means my work is publicly available, so LLMs can easily search for and retrieve my code and API documentation.
Rust means I spend most of my time writing Rust. It's a nice language with great documentation, a friendly compiler with useful error messages, and top-notch tooling. Rust has a strong ecosystem for developer tools and is highly accessible. Most LLMs already know how to use cargo check, cargo clippy, and cargo test. Writing Rust also means that both I and the AI only need to work with code in text form. We don't need complex workflows like those often seen in frontend development: coding, screen capturing, image diffing, and so on.
Engineer means I'm an engineer by profession. I earn my living through my coding work. I'm not a content producer or advertiser. As an engineer, I choose the most practical tools for myself. I want these tools to be fast, stable, and useful. I don't need them to be flashy, and I don't care whether they can build a website in one shot or write a flappy bird with correct collision detection.
My current toolset consists of Zed and Claude Code. More specifically, I run claude in a Zed terminal tab, which allows me to access both the code and its changes alongside the LLM.

To give claude-code its full capabilities, I actually run it in a container I built myself. Whenever I need to run claude, I use docker run instead. I also have an alias claudex for this purpose:
```bash
# claudex
alias claudex='docker run -it --rm \
  -v $(pwd):/workspace \
  -v ~/.claude:/home/user/.claude \
  -v ~/.claude.json:/home/user/.claude.json \
  -v ~/.config/gh:/home/user/.config/gh \
  -v ~/Notes:/home/user/Notes \
  xuanwo-dev'
```
Before introducing my workflow, I want to share my current mindset on LLMs. At the time of writing, I see LLMs as similar to recent graduates at a junior level.
As juniors, they have several strengths: They possess a solid understanding of widely used existing techniques. They can quickly learn new tools or patterns. They are friendly and eager to tackle any task you assign. They never complain about your requests. They excel at repetitive or highly structured tasks, as long as you pay them.
However, as juniors, they also have some shortcomings. They lack knowledge of your specific project or tasks. They don't have a clear goal or vision and require your guidance for direction. At times, they can be overly confident, inventing nonexistent APIs or using APIs incorrectly. Occasionally, they may get stuck and fail to find a way out.
As a mentor, leader, or boss, my job is to provide the right context, set a clear direction, and always be prepared to step in when needed. Currently, my approach is to have AI write code that I can personally review and take responsibility for.
For example, I find that LLMs are most effective when refactoring projects that have a clear API and nice test coverage. I will refactor the service for aws first, and then have the LLMs follow the same patterns to refactor the azure and gcs services. I rarely allow LLMs to initiate entirely new projects or create completely new components. Most of the time, I define the API myself and ask the LLMs to follow the same design and handle the implementation details.
My workflow is quite simple: I arrange my day in 5-hour chunks, which aligns with Claude usage limits. I map those two chunks to each day's morning and afternoon.
In the morning, I collect, read, think, and plan. I write my thinking down in my Notes, powered by Obsidian. All my notes are in markdown format, so LLMs like Claude Opus 4 can understand them without any other tools. I feed my notes to claude code directly and ask it to read them when needed.
In the afternoon, I will run claudex inside my projects, as I mentioned earlier. I will monitor their progress from time to time and prepare myself to step in when necessary. Sometimes, I use git worktree to spawn additional Claude instances so they can collaborate on the same projects.
Claude works very quickly, so I spend most of my time reviewing code. To reduce the burden of code review, I also design robust test frameworks for my projects to ensure correct behavior. Rust's excellent developer experience allows me to instruct the LLMs to run cargo check, cargo clippy, and cargo test on the code independently. They may need to repeat this process a few times to get everything right, but most of the time, they figure it out on their own.
While reviewing code, I pay close attention to the public API and any tricky parts within the codebase. LLMs are like junior developers. Sometimes, they might overemphasize certain aspects of a task and lose sight of the overall context. For example, they can focus too much on minor details of API design without realizing that the entire approach could be improved with a better overall design. This also reinforces my belief that you should only allow LLMs to write code you can control. Otherwise, you can't be sure the LLMs are doing things correctly. It's very dangerous if the LLMs are working in a direction you don't understand.
In my workflow, I only need claude and zed. claude excels at using tools and understanding context, while zed is fast and responsive. As a Rust developer, I don't have a strong need for various extensions, so the main drawback of zed, its limited extension support, isn't a major issue for me.
Here are some tips I've learned from my recent exploration of AI agents and LLMs.
Claude 4 Sonnet and Opus are the best coding models available so far.
Many people have different opinions on this and might argue: hey, o3, gemini-2.5-pro, or deepseek-r1 are better than Claude 4, they can build a working website in one shot! Unfortunately, I disagree, at least for my needs right now. As a Rust developer, I don't care if a model can build a website or demonstrate strong reasoning. What matters to me is whether it can use tools intelligently and efficiently. LLMs used for vibe coding should have a strong sense of planning and be skilled at coding. A smart model that doesn't know how to edit files can't truly serve as your coding copilot.
I'm not a content creator; I'm an engineer. I need a reliable tool that can help me complete my work. I'm not building demos or marketing materials. This isn't a game or a show that can be restarted repeatedly. I'm working on a project with downstream users, and I have to take responsibility for whatever the LLMs do. I need to collaborate with LLMs to achieve both my goals and my company's goals.
Claude 4 is the right tool.
MCP is useless for vibe coding.
Claude 4 is good at using tools. As long as you let it know that a tool is installed locally, it can use the tool effectively. It can even use --help to learn how to use it correctly. I've never encountered a scenario where I needed to use an MCP server. I tried the GitHub MCP server before, but it performed much worse than simply letting LLMs use the gh CLI locally.
Use tools instead of configuring MCP servers.
Integrate AI into your existing workflow instead of adapting yourself to AI.
AI workflows are constantly evolving. Stay calm and add the best tools to your toolkit. Don't change yourself just to fit a particular AI workflow. If a tool can't be integrated into your existing workflow, that's the tool's problem.
I've had some unsuccessful attempts at using Cursor or Windsurf. My progress began when I started incorporating Claude Code into portions of my daily workflow, rather than completely switching to a new IDE.
Thank you for reading my post. I also recommend the following posts if you want to try vibe coding:
Hope you're enjoying the coding vibes: create more, hype less.