The Practical Developer: a constructive and inclusive social network for software developers.

Distillation Attacks: Risks and Controversies

2026-03-01 22:50:07

Did you know you can replicate the behaviour of a large language model with a small model that offers faster inference, lower computation cost, and competitive benchmark performance?

This is possible using a technique called knowledge distillation, where a smaller model, commonly called the "student" model, learns to mimic a larger "teacher" model.

Let's explore this powerful concept and the issues that arise along with it.

What is Model Distillation?

Model distillation is a technique whose basic idea is to make a smaller model (the student) mimic the behaviour of a larger model (the teacher). The goal is better generalization in the student model than training it from scratch would achieve. The student model may learn from:

  • Soft probability distributions: instead of predicting a single hard label, the model outputs a probability distribution over all possible classes or tokens.

  • Logits: the raw outputs of the neural network before softmax is applied.

  • Generated text outputs: the final text produced by the model. Even if probabilities or logits are hidden, attackers can collect (prompt, generated_text) pairs to perform distillation.
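To make the soft-target idea concrete, here is a minimal, framework-free sketch of a temperature-softened distillation loss. The function names, logits, and temperature value are illustrative, not from any particular system:

```python
import math

def softmax(logits, temperature=1.0):
    # Scale logits by temperature; a higher temperature produces a
    # "softer" distribution that reveals more of the teacher's knowledge.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    # KL(p || q): how far the student distribution q is from the teacher p.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # The student is trained to minimize this value.
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return kl_divergence(teacher_probs, student_probs)

teacher = [4.0, 1.0, 0.2]  # toy teacher logits for three classes
student = [3.5, 1.2, 0.1]  # toy student logits for the same input
print(distillation_loss(teacher, student))  # a small non-negative number
```

In practice this term is usually combined with a standard cross-entropy loss on the hard labels, but the KL term above is the part that transfers the teacher's "soft" knowledge.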

From Compression to Exploitation

Originally, model distillation was designed as a compression technique, a way to transfer knowledge to a smaller model. In this setting, both the teacher and student models belong to the same organization, and the goal is optimization.

However, the context changes when the teacher model is closed-source and accessible only through an API.

Distillation becomes an attack when:

  • The teacher model is proprietary and not publicly available
  • The student model is trained without permission
  • The attacker only interacts through API queries
  • The objective is not compression, but capability replication

In this scenario, the attacker does not need access to the model’s architecture, training data, or weights. Instead, they rely purely on the outputs generated by the system.

This process is commonly referred to as:

  • Model extraction

  • Model stealing

  • Black-box distillation

How Does a Distillation Attack Work?

A distillation attack involves the following steps:

Step 1: Querying
The attacker generates prompts in a specific domain, often based on synthetic data. This collection of queries is fed to the teacher model to generate outputs.

Step 2: Dataset Construction
Each collected prompt is paired with the output the teacher model generated for it, and these (prompt, output) pairs are stored as a dataset.

Step 3: Student Model Training
In this final step, the attacker trains a smaller transformer model on the constructed dataset. The objective is simple: given the same input prompt, the student model tries to predict the teacher model’s output as closely as possible. This is typically done by minimizing cross-entropy loss; in more refined setups, KL divergence is used to match the teacher’s probability distributions. Over time, the student does not just memorize responses, but begins to mimic the teacher’s behaviour.
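Steps 1 and 2 can be sketched in a few lines. Note that `query_teacher` below is a stand-in for a real API call, not an actual client library:

```python
def query_teacher(prompt):
    # Placeholder for a call to the closed teacher model's API;
    # a real pipeline would hit the provider's HTTP endpoint here.
    return f"teacher answer to: {prompt}"

# Step 1: generate domain-specific prompts (here, a toy hand-written list)
prompts = ["What is knowledge distillation?", "Define logits."]

# Step 2: store (prompt, generated_text) pairs as the training dataset
dataset = [(p, query_teacher(p)) for p in prompts]
print(len(dataset))  # one pair per prompt
```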

Recent Controversies around Distillation Attacks

Recently, Anthropic claimed that three Chinese companies, DeepSeek, Moonshot AI, and MiniMax, ran large-scale distillation attacks on its Claude models. Anthropic alleges that:

  • ~24,000 fraudulent accounts were used to access Claude.

  • Over 16 million prompt–response exchanges were generated to extract capability data.

  • These interactions were then allegedly used to train rival AI systems.

Beyond formal reports, online communities have sparked controversies of their own, sometimes based on technical quirks rather than confirmed evidence.

One example that gained traction on Reddit was a post claiming that Anthropic’s Sonnet 4.6 model responded to certain Chinese‑language prompts with:

“I am DeepSeek‑V3, an AI assistant developed by DeepSeek.”

This led some users to speculate that:

  • Sonnet had been distilled from or trained on DeepSeek’s model outputs

Conclusion: Who Really Owns AI Knowledge?

Large Language Models are trained on massive datasets scraped from the internet. Much of this content is publicly available, yet its use in training is often debated legally and ethically.

If training an AI on the internet is considered legal, some might ask: "why is it illegal to replicate a model using its outputs through distillation?"

What do you think? Drop your thoughts in the comments.

I Asked Gemini One Question! It Became an Accessibility App

2026-03-01 22:44:07

This is a submission for the Built with Google Gemini: Writing Challenge

VisionVoice — From Idea to Impact: Making Signs Speak with AI

What I Built with Google Gemini

Every meaningful project starts with a real-world problem.

While experimenting with Google AI Studio, I asked myself:

What if public signs could literally speak for visually impaired people?

That question became VisionVoice — a multilingual visual accessibility assistant powered by Google Gemini.

VisionVoice helps visually impaired users understand their surroundings by:

  • 📸 Detecting text from real-world signs (emergency notices, directions, warnings)
  • 🌐 Translating content into multiple languages
  • 🔊 Converting text into natural speech narration

🎯 The Goal

Increase independence and safety for visually impaired individuals in public spaces.

🧠 How Gemini Powered the Project

Google Gemini became the core intelligence layer of VisionVoice:

  • Image → Text extraction
  • Context understanding
  • Multilingual translation
  • Speech-ready output generation

Instead of stitching together multiple AI services, Gemini enabled a unified multimodal pipeline inside Google AI Studio.

This allowed the app to:

  • Process images
  • Understand context
  • Translate meaning
  • Generate narration

All within a single AI-driven workflow.
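As a rough illustration, the workflow above could be sketched with placeholder functions. `extract_text`, `translate`, and `narrate` are hypothetical stand-ins for the Gemini-backed steps, not the app's real code:

```python
# Hypothetical sketch of the VisionVoice pipeline; every function here is a
# placeholder for a Gemini-backed step, not a real API.
def extract_text(image):
    # Stand-in for Gemini's image -> text extraction
    return image.get("sign_text", "")

def translate(text, target_lang):
    # Stand-in: a real implementation would prompt Gemini for translation
    return f"[{target_lang}] {text}"

def narrate(text):
    # Stand-in for text-to-speech output
    return f"speaking: {text}"

def vision_voice(image, target_lang="es"):
    text = extract_text(image)                 # Image -> Understanding
    translated = translate(text, target_lang)  # -> Language
    return narrate(translated)                 # -> Voice

print(vision_voice({"sign_text": "EXIT"}))
```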

✨ Key Features

  • Image-to-Text Recognition — reads real-world signage
  • Multilingual Translation — removes language barriers
  • Text-to-Speech Narration — accessibility-first interaction
  • Mobile-First UI — quick interaction in real environments

VisionVoice transforms static signs into interactive spoken guidance.

Demo

🌍 Live App URL

💻 GitHub Repository

🎥 YouTube Video Demo

What I Learned

This project changed how I think about building products with AI.

🧩 1. Multimodal AI Changes Product Thinking

Traditional applications process a single input type.

Gemini allowed me to design around human interaction flows, not technical pipelines:

Image → Understanding → Language → Voice

It felt natural — almost human.

⚙️ 2. Prompt Engineering is Product Design

Prompts are not just instructions.

They are UX decisions.

Small refinements dramatically improved:

  • Translation accuracy
  • Context interpretation
  • Narration clarity

I realized AI behavior is part of system architecture.

🌍 3. Accessibility is a Design Mindset

Building for accessibility forced me to rethink assumptions:

  • Minimal UI > Feature-heavy UI
  • Speed > Aesthetic polish
  • Audio clarity > Visual complexity

AI becomes most powerful when it removes friction for users who need it most.

🚀 4. AI Accelerates Solo Development

Gemini acted as a:

  • Research assistant
  • Architecture reviewer
  • Debugging partner
  • Rapid prototyping engine

I shipped VisionVoice faster than any previous project I’ve built.

Google Gemini Feedback

✅ What Worked Extremely Well

  • Multimodal reasoning felt natural and powerful
  • Fast prototyping inside Google AI Studio
  • Strong image understanding for real-world inputs
  • Easy experimentation without heavy setup

Gemini reduced the gap between:

Idea → Prototype → Working Product

⚠️ Where I Faced Friction

  • Output consistency required prompt tuning
  • Blurred or low-light images needed additional handling logic
  • Audio formatting occasionally required post-processing

These challenges helped me understand how to design AI-assisted systems thoughtfully, rather than relying blindly on AI output.

🔮 What’s Next for VisionVoice

This challenge made me realize VisionVoice can evolve beyond a prototype:

  • 📱 Real-time mobile camera mode
  • 🧭 Navigation assistance
  • 🗣️ Offline accessibility support
  • 🤖 Context-aware environmental guidance

My goal is to grow VisionVoice into a real AI-powered accessibility companion.

Final Reflection

The Built with Google Gemini Writing Challenge is about reflection — not just shipping code.

VisionVoice taught me that AI isn’t only about automation.

It’s about amplifying human ability.

Sometimes, the most powerful software doesn’t add new screens…

…it gives someone the ability to understand the world around them.

#devchallenge #geminireflections #gemini #ai #accessibility

Your Memes. At Light Speed

2026-03-01 22:43:12

I built Raycast, but for memes: you can search memes at lightning speed.

Book Review: Co-Intelligence by Ethan Mollick

2026-03-01 22:42:53

Table of Contents

  • Introduction
  • AI as a Thinking Companion
  • The Human-in-the-Loop Principle
    • Critical Thinking
    • Disruption in the job market
  • Centaur vs Cyborg approaches
  • Resources
  • Conclusion

Introduction

We recently finished reading Co-Intelligence: Living and Working with AI by Ethan Mollick in our company's book club. The book shares four core principles for AI collaboration and outlines various practical applications. Some really stuck with me, and I've tried to incorporate them in my work. Reading the author's perspective and learning his way of thinking definitely improved how I look at these tools. But if you know me, you know how skeptical I am. There are some chapters and opinions that I don't agree with.

So in this post, I'll share the key insights from our book club in the context of software development, plus some personal opinions as always 🙂.

AI as a Thinking Companion

One of the most practical takeaways for me was viewing AI as a co-worker and thinking companion. When done right, this can be incredibly useful. Some people use it heavily for deep research, not so much to delegate tasks to it. André Santos gave some examples of tasks where it has been useful, like writing Terraform code or generating bash scripts. For those tasks, we can write a detailed prompt alongside proper documentation (e.g. Context7 MCP) and ask it to write the Terraform, since that's simpler and faster. Even just making a POC or demo, turning an idea you have into working software to see how viable the idea is. That is a perfect use case for delegating the front-end and back-end to AI. It's not code that will ship to production; it's a way to make prototypes or quick demo apps that you'd otherwise never spend the time to build.

I've enjoyed using models like Claude to help me around my tasks at work because they often uncover possibilities I haven't thought about. The conversational style of going back and forth helps me fine-tune my own solution. It's not just "give me code," it's "let's discuss this architecture". At the end of the conversation, we can generate a good draft of a PRD (Product Requirements Document). Notice I don't delegate my thinking to it, it's a tool that helps me think of solutions or just interview me sometimes.

However, it can be annoying. I'd like to minimize the number of times I have to tell it "no, you're wrong. The Microsoft documentation for Azure Container Apps does not state X as you said" 😅.
To fix this, I've tried giving an explicit instruction in my system prompts:

"It's also very important for you to verify if there is official documentation that supports your claims and statements. Please find official documentation supporting your claims before responding to a user. If there isn't documentation confirming your statement, don't include it in the response."

I have had better results with this, still not perfect. In a longer conversation, I think it doesn't always verify the docs (memory limits, perhaps), but sometimes I get the response: "(...) Based on my search through the official documentation, I need to be honest with you (...)".

I really find it funny that Claude "needs" to be honest with me 😄. Sycophancy is truly annoying, especially since we are talking about AI as a thinking companion. If your AI partner always agrees with you, how useful is it really as a thinking companion?

The Human-in-the-Loop Principle

While Mollick's vision of a collaborative future with AI is profoundly optimistic, he is also a realist. One of the most important principles, and a recurring theme in the book, is the absolute necessity of human oversight - the "human-in-the-loop" principle.
This is a key quote from the book:

For now, AI works best with human help, and you want to be that helpful human. As AI gets more capable and requires less human help — you still want to be that human. So the second principle is to learn to be the human in the loop.

One of Mollick's key warnings is about falling asleep at the wheel. When AI performs well, humans stop paying attention. This has been referenced by Simon Willison as well, in his recent insightful post 2025: The year in LLMs.
All I'm saying is I understand --dangerously-skip-permissions is useful as a tool when used in a secure sandbox environment. But we should verify our confidence level in the AI's output and the autonomy and tools we give it. If we don't, we risk using AI on tasks that fall outside the Jagged Frontier, which can lead to security issues, nasty bugs, and a weakened ability to learn.

I say this knowing full well that I trust Claude Opus 4.5 more on any task I give it. So I have to actively force myself to verify its suggestions just as rigorously, verify which tools I gave it access to, and which are denied. For example, I use Claude Code hooks to prevent any appsettings, .env, or similar files from being accessed. I still try to read the LLM reasoning/thinking text, so that I understand better, and simply out of curiosity as well.

I simply can't forget seeing the Claude Sonnet 4 and Opus 4 System Card, with the "High-agency behavior" Anthropic examined. Whistleblowing and other misalignment problems are possible; for example, this is a quote from the Opus 4.6 System Card:

In our whistleblowing and morally-motivated sabotage evaluations, we observed a low but persistent rate of the model acting against its operator’s interests in unanticipated ways. Overall, Opus 4.6 was slightly more inclined to this behavior than Opus 4.5.

All I'm saying is let's be conscious of these behaviors and results on the evals.

In my opinion, the human-in-the-loop principle is crucial. Don't just copy/paste or try to vibe your way into production. Engineers are the ones responsible for software systems, not tools or alien minds. If there are users who depend on your software, and your AI code causes an incident in production, you are responsible. Claude or Copilot won't wake up at 3 AM if prod is on fire (or maybe Azure SRE agent will if you pay for it 🤔...). Having an engineering mindset and being in the driver's seat is what I expect from myself and anyone I work with.

Critical Thinking

Within this principle, we have a topic I have a lot of strong opinions on. This quote says it all:

LLMs are not generally optimized to say "I don’t know" when they don't have enough information. Instead, they will give you an answer, expressing confidence.

Basically, to be the human in the loop, we really must have good critical thinking skills. This ability, plus our experience, brings something very valuable to this AI collaboration: detecting the "I don't know". It may help to know some ways to reduce hallucinations in our prompts.
But still, we can't blindly believe AI output is correct based on its confidence that the proposed solution works. Now more than ever, we need to continue developing critical thinking skills and apply them when working with AI, so that in the scenarios where it should have responded "I don't know", we rely more on our own abilities.

Sure, there are tasks we are more confident delegating to AI, but for the ones we know fall outside the Jagged Frontier, we must proceed with caution and care. We discussed our confidence in AI output a lot. For example, André Santos said it depends on the task we give it, but André Oliveira also argues that we can only validate the output on topics we know. It serves as an amplifier because it's only a tool. If the wielder of the tool doesn't fact-check the output, we risk believing hallucinations and false claims.

Pedro Vala also talked about a really good quote from the Agentic Design Patterns book that is super relevant to this topic:

An AI trained on "garbage" data doesn’t just produce garbage-out; it produces plausible, confident garbage that can poison an entire process - Marco Argenti, CIO, Goldman Sachs

Now imagine we read the AI output and at first glance it looks okay, but it's only plausible garbage. That is a real risk, especially with the AI-generated content already available on the internet. Again, I hope developers continue to develop their critical thinking skills and don't delegate their thinking to tools.
Right now, the only process I have of filtering out garbage on the internet is consuming most content from authors I respect, and I know for a fact are real people 😅.

Disruption in the job market

Mollick also talks about disruption in the job market, which is a hot topic in our industry, especially the impact AI has on junior roles. We have debated this in a few sessions of our book club, and again, critical thinking and adaptability are crucial. We simply have to adapt and learn how to use this tool, nothing less, nothing more. How much value we bring to the table when working with AI matters. If you don't bring any value and just copy/paste, you are not a valuable professional in my view.

It's a good idea to keep developing our skills and expertise. Andrej Karpathy talks about an intelligence "brownout" when LLMs go down; this is extremely scary to me, especially if I see this behaviour in juniors or college grads. I truly hope we stop delegating so much of our intelligence to a tool. I don't want engineers to rely on LLMs when production is down and on fire. It would be sad to see engineers unable to troubleshoot or fix incidents in production... just because AI tools are not available 😐.

Centaur vs Cyborg approaches

The book distinguishes between two ways of working with AI:

  1. Centaur: You divide tasks between human and machine. You handle the "Just me" tasks (outside the Jagged Frontier), and delegate specific sub-tasks to the AI that you later verify.
  2. Cyborg: You integrate AI so deeply that the workflow becomes a hybrid, often automating entire processes.

For software development, I'm definitely in the Centaur camp right now.
We should be careful about what tasks we delegate. Mollick warns about "falling asleep at the wheel." When the AI is very good, humans have no reason to work hard and pay attention. They let the AI take over instead of using it as a tool, which can hurt our learning process and skill development. Or in some scenarios, it can lead to your production database being deleted...

This is just a tool. We are still responsible at work. If the AI pushes a bug to production, you pushed a bug to production!

The author does give some "Cyborg examples" of working with AI, here is a quote from the book:

I would become a Cyborg and tell the AI: I am stuck on a paragraph in a section of a book about how AI can help get you unstuck. Can you help me rewrite the paragraph and finish it by giving me 10 options for the entire paragraph in various professional styles? Make the styles and approaches different from each other, making them extremely well written.

This is that ideation use case that is super useful when you have writer's block, or just want to brainstorm a bit on a given topic. In our industry, a lot of teams are integrating AI into many phases of the SDLC. I haven't found many workflows that work well in some parts of the SDLC, since we are focusing on adopting AI for coding and code review. But in most workflows, the cyborg practice is to steer the AI more and manage the tasks where you collaborate with it as a co-worker. The risk remains even when someone uses cyborg practices but then fails to spot hallucinations or false claims. The takeaway is really to be conscious of our AI adoption and usage. The number one cyborg practice I try to apply naturally is to push back: if I smell something is off, I will disagree with the output and ask the AI to reconsider. This leads to a far more interesting back-and-forth conversation on a given topic.

Resources

Here are some resources if you want to dive deeper:

Conclusion

This was a great book; I truly recommend it to anyone who is even slightly interested in AI. Co-intelligence is something we can strive for, focusing on adopting this new tool to help us develop ourselves, our expertise, and our skills. When it was written, we had GPT-3.5 and GPT-4 was recent, I believe... now we have GPT-5.3-Codex, Opus 4.6, GLM 4.7, and Kimi K2.5. In two years, things just keep on changing 😅. The Jagged Frontier will keep changing, so this calls for experimentation. AI pioneers will do most of this experimentation, running evals and whatnot, to understand where each type of task falls on the Jagged Frontier. Pay attention to what they share, what works, and what doesn't.

AI has augmented my team and me, mostly on "Centaur" tasks while we improve our AI fluency and usage. In my personal opinion, I don't see us reaching the AGI scenario Ethan talks about in the last chapter. Actually, much of our industry keeps talking about and hyping AGI... even the exponential-growth scenario raises some doubts for me. But I agree with Ethan when he says: "No one wants to go back to working six days a week (...)" 😅.
We should continue to focus on building our own expertise, not delegating critical thinking to AI. There is a new skill in town, the LLM whisperer 😅, and having it can indeed augment you even further. Just remember the fundamentals don't change. Engineers still need to know those!

There are hundreds of "Vibe Coding Cleanup Specialists" now 🤣. Let's remember to be the human in the loop. Apply critical thinking to any AI output, do fact-checking, and take ownership of the final result. Please don't create AI slop 😅.

Hope you enjoyed this post! My next blog post will be about how we are using agentic coding tools, so stay tuned! Feel free to share in the comments your opinion too, or reach out and we can have a chat 🙂.

We built a new AI EdTech service for developers.

2026-03-01 22:41:31

Hello, we built an AI learning service that lets you create a curriculum on any topic and study without worrying about hallucinations.

Have you ever had a frustrating experience trying to learn with ChatGPT? A big reason is that it’s a general-purpose chatbot: it tends to give direct answers quickly, without reliably filtering hallucinations, and it doesn’t stay in a structured learning flow.

AI is powerful, and it makes learning far more affordable. But using it to study well is still surprisingly hard. While many AI learning services focus on how much text inside a PDF the AI can read and answer from, we focused on something different: how the AI should explain and guide so real learning actually continues.

It’s simple to use:

  1. Enter what you want to learn, choose your level and the number of chapters, and we generate a curriculum.
  2. Click the curriculum card you want and start learning right away.
  3. Each chapter includes a tutorial and a chatbot, and the chatbot continues the conversation while staying aware of the current chapter’s context.
  4. So even if you miss a quiz question, you don’t have to restart from scratch—the AI understands where you are and helps immediately.

Thank you.

Service link: https://runtric.com/curricula

Claude Code Cross-Session Context Recovery: From 8 Corrections to 0 in Practice

2026-03-01 22:38:59

If you've used Claude Code on a project for more than a week, you've probably hit this scenario:

You finish half a feature in the morning, open a new session in the afternoon to continue, and Claude remembers nothing of the earlier work. You spend 10 minutes explaining the project background, current progress, and the reasoning behind technical decisions. Worse, even after the explanation it often misunderstands: you say "continue the token refresh work in the login module", and it starts rewriting the entire authentication system.

This is not a one-off problem. Claude Code is a session-scoped tool; every new session starts from a blank slate. For short tasks that's fine, but in continuously iterating project development, cross-session context loss creates three compounding problems:

  • Rising recovery cost: the more complex the project, the more explanation each recovery requires
  • Rising correction cost: the probability of Claude making wrong judgments from incomplete context grows with project complexity
  • Falling consistency: when the same feature is developed across sessions, technical approaches, naming styles, and architectural decisions drift

I ran a quantified test: on the same mid-sized project (a dozen or so feature modules), recovering context after 3 simulated session interruptions with a manually maintained CLAUDE.md required 8 manual corrections before Claude accurately aligned with the previous work progress and technical decisions.

Limits of Existing Approaches

There are two common approaches in the community today:

Approach A: manually maintained CLAUDE.md

Write the project description, technical constraints, and current progress in CLAUDE.md. The problems:

  • You have to remember to update it: finish a feature, forget to update the progress, and the next session is out of sync
  • Inconsistent format: hand-written notes vary in granularity each time, so Claude parses them unreliably
  • Only "what", no "where": Claude knows the project uses React, but not how far the token refresh work has progressed

Approach B: TodoWrite task lists

Track progress with Claude's built-in TodoWrite. The problems:

  • Completed tasks disappear: there is no historical context
  • No handling of change: when you say "let's not do this for now", TodoWrite has no concept of "pause and preserve the work done so far"
  • No quality assurance: Claude skipping tests, or approving its own work, happens all the time

The shared flaw of both approaches: they manage information, not process. What's actually needed is automatic maintenance of project state plus enforced process rules.

The Approach: Let Claude Maintain Project State Itself

The core design has three parts:

  1. Automatic state maintenance: at the end of each session, Claude automatically updates the project state files, and the next session automatically reads them to recover. Humans maintain nothing by hand
  2. Structured Markdown: all state files are plain Markdown with an explicit schema. Both LLMs and humans can read and write them, with no dependency on external services
  3. Rule-driven behavior: behavioral rules are injected via the Claude Code Plugin Rules mechanism, so Claude automatically runs the process in specific situations (creating change requests, quality checks, waiting for human approval, and so on)

Based on this approach I built devpace: a Claude Code Plugin, v1.6.0, MIT open source.

Core Mechanisms

Dual-mode design

devpace has two working modes:

  • Explore mode (default): freely read code, analyze, discuss. Triggers no process and modifies no state files. Zero friction when asking questions or reading code.
  • Advance mode (when changing code): triggered via /pace-dev. Creates a change request (CR), enters a state machine (created → developing → verifying → in_review → approved → merged), and automatically runs quality checks.

You don't need to think about modes. Asking questions is Explore; starting to implement a feature is Advance.

.devpace/ directory layout

Running /pace-init generates a minimal state directory:

.devpace/
├── state.md          # project state snapshot (capped at 15 lines)
├── project.md        # value/feature tree: objectives → features → tasks
├── backlog/          # empty at first; CR files are created automatically once development starts
│   └── CR-001.md     # (appears automatically with the first task)
└── rules/
    ├── workflow.md   # CR state-machine rules
    └── checks.md     # quality checks (auto-detected from the toolchain)

Key point: state.md is capped at 15 lines. An overlong state file burns tokens and lowers recovery accuracy. Those 15 lines contain: a business-objective summary, work currently in progress, and suggested next steps.

11 hooks manage the full session lifecycle:

  • session-start.sh: reads state.md and injects context. Claude reports the current status in one sentence.
  • session-end.sh: updates state.md, recording progress and next steps.
  • pre-compact.sh: saves state before context-window compaction, preventing information loss in long conversations.
  • intent-detect.mjs: detects code-change intent and suggests entering Advance mode.

Progressive disclosure: initialization creates only the minimal file set. Iteration files appear with /pace-plan, metrics files with /pace-retro, release files with /pace-release. You never see complexity you don't need.
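As a hypothetical illustration of what a hook like session-start.sh does (devpace's real hook is a shell script; this Python sketch is only for clarity), the idea is simply "read the snapshot if it exists, inject it as context":

```python
from pathlib import Path

def session_start(project_root="."):
    # Read the 15-line state snapshot, if present, so it can be injected
    # as context at the start of a session.
    state_file = Path(project_root) / ".devpace" / "state.md"
    if not state_file.exists():
        return ""  # no devpace state: behave like a plain session
    return state_file.read_text(encoding="utf-8")
```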

state.md example

> Objective: user permission system → success metric: support RBAC + OAuth login
> Iteration: ITER-002 (core permission module) | Progress: 3/5 product features complete

- **In progress**: token refresh mechanism → step 3/5: middleware implementation
- **Awaiting review**: role permission matrix
- **Decisions**: JWT over sessions | Redis cache for the token blacklist
- **Next**: finish the middleware, run integration tests, then submit for review

When a session starts, Claude reads this and outputs: "Last time we stopped at the middleware implementation for token refresh; the role permission matrix is awaiting review. Continue?" No IDs, no state-machine jargon, just plain language.

Three-level quality gates

| Level | Trigger | Executed by |
| --- | --- | --- |
| Gate 1 | Development complete → enter verification | Claude, automatic |
| Gate 2 | Verification passed → submit for review | Claude, automatic |
| Gate 3 | Review → approve and merge | Human approval, cannot be skipped |

Gate 3 is a hard constraint, baked into the rules: no matter how simple Claude thinks the change is, it must wait for human confirmation.
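As a rough illustration only (the real plugin enforces this through rules, not code), the CR state machine with Gate 3 as a hard stop could be sketched like this:

```python
# Illustrative sketch of the CR lifecycle described above; transitions are
# assumed to be strictly linear.
STATES = ["created", "developing", "verifying", "in_review", "approved", "merged"]

def advance(state, human_approved=False):
    i = STATES.index(state)
    if i == len(STATES) - 1:
        raise ValueError("CR is already merged")
    nxt = STATES[i + 1]
    # Gate 3: the in_review -> approved transition requires a human decision
    if nxt == "approved" and not human_approved:
        raise PermissionError("Gate 3: human approval required")
    return nxt

print(advance("created"))                        # moves to the next state
print(advance("in_review", human_approved=True))
```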

Three specialized subagents

  • pace-engineer: handles implementation (/pace-dev, /pace-test). Technical lens, focused on code and tests.
  • pace-pm: handles planning and change (/pace-change, /pace-plan). Business lens, focused on value and priorities.
  • pace-analyst: handles retrospectives (/pace-retro). Data lens, focused on metrics and trends.

Comparison Data

Cross-session recovery test

| Metric | devpace | Manual CLAUDE.md |
| --- | --- | --- |
| Corrections after 3 interruptions | 0 | 8 |
| Recovery method | automatic read of state.md | manual explanation |

Capability comparison across 7 dimensions

| Dimension | devpace | Manual approach | TodoWrite |
| --- | --- | --- | --- |
| Cross-session recovery | automatic | manual | no persistence |
| Change management | 5 scenarios + impact analysis | | |
| Quality assurance | 3-level gates, human review cannot be skipped | | |
| Value traceability | full OBJ → BR → PF → CR chain | | task → done |
| Metrics | DORA proxy metrics | | |
| Risk management | pre-flight scan + tiered response | | |
| Role adaptation | 5 role perspectives | | |

Total: devpace 21/21 vs. the manual approach 7/21 vs. TodoWrite 4/21

Change Management: the Core Differentiator

Requirements changing mid-development is the norm, yet existing tools don't really handle it.

devpace's /pace-change supports 5 change scenarios:

| Scenario | Trigger | How Claude handles it |
| --- | --- | --- |
| Requirement insertion | "add an XX feature" | impact analysis → capacity assessment → create CR |
| Requirement pause | "let's not do this for now" | preserve work done → pause the CR → release dependencies |
| Requirement resume | "continue the one we paused" | resume from the pause point → check whether the context is stale |
| Priority reordering | "do XX first" | reorder the plan → update state.md |
| Requirement modification | "the requirements for XX changed" | identify rework → reset quality checks → update acceptance criteria |

It also supports undo (revert the last change) and batch (bulk changes).

Installation and Trying It Out

Marketplace install (recommended):

/plugin marketplace add arch-team/devpace
/plugin install devpace@devpace

Install from source:

git clone https://github.com/arch-team/devpace.git
claude --plugin-dir ./devpace

Then:

/pace-init

Claude asks you one question: "describe what the project does in one sentence". It then auto-detects the toolchain and creates .devpace/. Start working, close the session, open a new one. See whether Claude still remembers.

Limitations, Honestly

  • Delayed perceived value: the first session doesn't feel different; it takes 2-3 cross-session recoveries to notice
  • Conceptual overhead: the BizDevOps value chain takes some learning for developers unfamiliar with it (though day-to-day you only use 3-4 commands)
  • Scope: not for one-off scripts. The target is multi-session, continuously iterated projects
  • Platform: currently Claude Code only

GitHub: github.com/arch-team/devpace
License: MIT | Version: v1.6.0 | 18 skills | 34 validated scenarios

Built by a developer who got tired of telling Claude "no, we already decided that" three times every morning.