2026-01-07 15:49:21
Compute governance is the critical bottleneck for AI scaling. Compute itself needs to be redefined, from hardware that is merely consumed to a core asset, along a path that has long been undervalued.

I have officially joined Dynamia as VP of Open Source Ecosystem / Partner, responsible for the company’s long-term work on open source, technical narrative, and the AI Native Infrastructure ecosystem.
I chose to join Dynamia not because it is a company trying to “solve all AI problems,” but precisely the opposite: Dynamia focuses intensely on one unavoidable yet long-undervalued core issue in AI Native Infrastructure. Compute, and Graphics Processing Units (GPUs) in particular, is evolving from a “technical resource” into an infrastructure element that requires refined governance and economic management.
Through years of practice in cloud native, distributed systems, and AI infrastructure (AI Infra), I’ve formed a clear judgment: as Large Language Models (LLMs) and AI Agents enter the stage of large-scale deployment, the real bottleneck limiting system scalability and sustainability is no longer just model capability itself, but how compute is measured, allocated, isolated, and scheduled, and how a governable, accountable, and optimizable operational mechanism takes shape at the system level. From this perspective, the core challenge of AI infrastructure is essentially evolving into a problem of “resource governance and Token economics.”
Dynamia is an AI native infrastructure technology company rooted in open source DNA, driving efficiency leaps in heterogeneous compute through technological innovation. Its flagship open source project, HAMi (Heterogeneous AI Computing Virtualization Middleware), is a Cloud Native Computing Foundation (CNCF) sandbox project that provides virtualization, sharing, isolation, and topology-aware scheduling for GPUs, NPUs, and other heterogeneous devices, and it has been adopted by more than 50 enterprises and institutions.
In this context, Dynamia’s technical approach—starting from the GPU layer, which is the most expensive, scarcest, and least unified abstraction layer in AI systems, treating compute as a foundational resource that can be measured, partitioned, scheduled, governed, and even “tokenized” for refined accounting and optimization—aligns highly with my long-term judgment on AI native infrastructure.
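To make this concrete, here is what “measurable and partitionable” compute can look like at the Kubernetes layer: a workload requesting a slice of one GPU through HAMi-style extended resources. This is only a sketch; the exact resource names (nvidia.com/gpumem, nvidia.com/gpucores) and their units depend on the HAMi version and cluster configuration.

```python
# Illustrative only: a Pod manifest requesting a slice of one GPU via HAMi-style
# extended resources. Resource names and units are assumptions to adapt to your
# HAMi version and cluster configuration.
import json

pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "shared-gpu-demo"},
    "spec": {
        "containers": [
            {
                "name": "trainer",
                "image": "nvidia/cuda:12.4.1-runtime-ubuntu22.04",
                "command": ["sleep", "infinity"],
                "resources": {
                    "limits": {
                        # One virtual GPU slice...
                        "nvidia.com/gpu": 1,
                        # ...capped at ~3 GiB of device memory (assumed unit: MiB)
                        "nvidia.com/gpumem": 3000,
                        # ...and roughly 30% of the SM cores (assumed unit: percent)
                        "nvidia.com/gpucores": 30,
                    }
                },
            }
        ]
    },
}

# Write the manifest so it can be applied with `kubectl apply -f shared-gpu-demo.json`
with open("shared-gpu-demo.json", "w") as f:
    json.dump(pod_manifest, f, indent=2)
```

Applied to a cluster, several such pods can share a single physical GPU while the scheduler enforces the declared memory and core caps, which is exactly the kind of fine-grained accounting that turns GPUs from consumed hardware into a governed resource.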
This path doesn’t use “model capabilities” or “application innovation” as selling points in the short term, nor is it easily packaged into simple stories. However, with rising compute costs, heterogeneous accelerators becoming the norm, and AI systems moving toward multi-tenant and large-scale operations, these infrastructure-level capabilities are gradually becoming prerequisites for the establishment and expansion of AI systems.
As Dynamia’s VP of Open Source Ecosystem / Partner, I will focus on the technical narrative of AI native infrastructure, open source ecosystem building, and global developer collaboration, helping move compute from a hardware resource that is merely consumed to a governable, measurable, and optimizable core asset of AI infrastructure, and laying the foundation for the scaling and sustainable evolution of AI systems in the next stage.
Joining Dynamia is an important milestone in my career and a concrete expression of my long-term optimism about AI native infrastructure. Compute governance is not a short-term trend that yields quick results, but an infrastructure proposition that cannot be bypassed on the way to large-scale AI deployment. I look forward to exploring, building, and delivering solutions on this long-undervalued path with developers around the world.
2026-01-04 10:25:48
I’ve been spending more time recently experimenting with vibe coding tools on real projects, not demos. One of those projects is my own website, where I constantly tweak content structure, navigation, and layout.
During this process, I started using Verdent’s standalone Mac app more seriously. What stood out was not any single feature, but how different the experience felt compared to traditional AI coding tools.

Verdent doesn’t behave like an assistant waiting for instructions. It behaves more like an environment where work happens in parallel.
Most AI coding tools begin with a conversation. Verdent begins with tasks.
When I opened my website repository in the Verdent app, I didn’t start with a long prompt. I created multiple tasks directly: one to rethink navigation and SEO structure, another to explore homepage layout improvements, and a third to review existing content organization.
Each task immediately spun up its own agent and workspace. From the beginning, the app encouraged me to think in parallel, the same way I normally would when sketching ideas on paper or jumping between files.
This framing alone changes how you work.
Switching contexts is unavoidable in real development work. What usually breaks is continuity.
Verdent handles this well. Each task preserves its full context independently. I could stop one task mid-way, switch to another, and come back later without re-explaining the problem or reloading files.
For example, while one agent was analyzing my site’s navigation structure, another was exploring layout options. I moved between them freely. Nothing was lost. Each agent remembered exactly what it was doing.
This feels closer to how developers think than how chat-based tools operate.
Parallel work only becomes truly safe when code changes are isolated. When parallelism moves from discussion to actual code modification, risk management becomes essential.
Verdent solves this with Workspaces. Each workspace is an isolated, independent code environment with its own change history, commit log, and branches. This isn’t just about separation—it’s about making concurrent code changes manageable.
What this means in practice:
I intentionally let different agents operate on overlapping parts of my project: one modifying Markdown content and links, another adjusting CSS and layout logic. Both ran in parallel. No conflicts emerged. Later, I reviewed the diffs from each workspace and merged only what made sense.
This kind of isolation removes significant anxiety from AI-assisted coding. You stop worrying about breaking things and start experimenting more freely, knowing that each change exists in its own contained environment.
Parallelism doesn’t mean that all agents complete the same phase of work at the same time—instead, by isolating and overlapping phases, what was once a strictly sequential process is compressed into a more efficient, collaborative mode.
In Verdent, each agent runs in its own workspace, essentially an automatically managed branch or worktree. In practice, I often create multiple tasks with different responsibilities for the same requirement, such as planning, implementation, and review. But this doesn’t mean they all complete the same phase simultaneously.
These tasks are triggered as needed, each running for a period and producing clear artifacts as boundaries for collaboration. The planning task generates planning documents or constraint specifications; the implementation task advances code changes based on those documents and produces diffs; the review task, according to the established planning goals and audit criteria, performs staged reviews of the generated changes. By overlapping phases around artifacts, the originally strict sequential process is compressed into a workflow that more closely resembles team collaboration.
The value of splitting into multiple tasks is not parallel execution, but parallel cognition and clear collaboration boundaries.
While it’s technically possible to put multiple roles into a single task, this causes planning, implementation, and review to share the same context, which weakens role isolation and the auditability of results.
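Verdent manages workspaces for you, but a rough mental model is one branch plus one worktree per task. The sketch below uses plain Python and git to show the shape of that isolation; the task names, branch names, and paths are made up and are not Verdent’s internals.

```python
# Rough mental model of "one isolated workspace per task", approximated with
# plain git worktrees. Verdent manages this automatically; the task names,
# branch names, and paths below are made up for illustration.
import subprocess
from pathlib import Path

REPO = Path(".")  # assumed: run from the root of an existing git repository
TASKS = ["plan-navigation", "implement-layout", "review-content"]

def git(*args: str) -> None:
    """Run a git command in the repository and fail loudly on errors."""
    subprocess.run(["git", *args], cwd=REPO, check=True)

for task in TASKS:
    worktree_dir = REPO / ".worktrees" / task
    # Each task gets its own branch and working directory, so parallel edits
    # never share a checkout and can be diffed and merged independently.
    git("worktree", "add", "-b", f"task/{task}", str(worktree_dir))
```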
Beyond the workflow model itself, Verdent exposes a surprisingly rich set of configurable capabilities.
It allows users to customize MCP settings, define subagents with configurable prompts, and create reusable commands via slash (/) shortcuts. Personal rules can be written to influence agent behavior and response style, and command-level permissions can be configured to enforce basic security boundaries. Verdent also supports multiple mainstream foundation models, including GPT, Claude, Gemini, and K2. For users who prefer a lightweight coding experience without a full IDE, Verdent offers DiffLens as an alternative review-oriented interface. Both subscription-based and credit-based pricing models are supported.

That said, Verdent makes a clear set of trade-offs. It is not built around tab-based code completion, nor does it offer a plugin system. If it did, it would start to resemble a traditional IDE - which does not seem to be its goal. Verdent is not designed for direct, fine-grained code manipulation; most changes are mediated through conversational tasks and agent-driven edits. This makes the experience clean and focused, but it also means that for large, highly complex codebases, Verdent may function better as a complementary orchestration layer rather than a full-time development environment.
There are many AI-assisted coding tools emerging right now. Some focus on smarter editors, others on faster generation.
Verdent feels different because it focuses on orchestration, not just assistance.
It doesn’t try to replace your editor. It sits one level above, coordinating planning, execution, and review across multiple agents.
That makes it particularly suitable for exploratory work, refactoring, and early-stage design - exactly the kind of work I was doing on my website.
Using Verdent’s standalone app didn’t just speed things up. It changed how I structured work.
Instead of doing everything sequentially, I started thinking in parallel again - and letting the system support that way of thinking.
Verdent feels less like an AI feature and more like an environment that assumes AI is already part of how development happens.
For developers experimenting with AI-native workflows, that shift is worth paying attention to.
2025-12-31 18:02:01
The waves of technology keep evolving; only by actively embracing change can we continue to create value. In 2025, I chose to move from Cloud Native to AI Native—this year marked a key turning point for personal growth and system reinvention.
2025 was a turning point for me. This year, I not only changed my technical direction but also the way I approach problems. Moving from Cloud Native infrastructure to AI Native infrastructure was not just a migration of content, but an upgrade in mindset.

This year, I conducted a large-scale refactoring of the website and systematically organized the content. Beyond the technical improvements, I want to share my thoughts and changes throughout the year.
At the beginning of 2025, I made an important decision: to reposition myself from a Cloud Native Evangelist to an AI Infrastructure Architect. This was not just a change in title, but a strategic transformation after careful consideration.
As I witnessed the surge of AI technologies and the rise of Agent-based applications reshaping software, I realized that clinging to the boundaries of Cloud Native might mean missing an era. So, I systematically adjusted the website’s content structure, shifting the focus toward AI Native infrastructure.
This transformation was not about abandoning the past, but extending forward from the foundation of Cloud Native. Classic content like Kubernetes and Istio remains and is continuously updated, but new topics such as AI Agent and the AI OSS landscape have been added, forming a more complete knowledge map.
Agents represent a major evolution in software for the AI era. When I tried to understand Agent design principles, I found the information scattered and fragmented, with no systematic knowledge base to rely on.
So I created content that analyzes the Agent context lifecycle and control loop mechanisms, summarizing several proven architectural patterns. To make complex knowledge easier to digest, I organized it into logical sections so readers can learn step by step.
AI tools and frameworks are emerging rapidly, with new projects appearing daily. To help readers quickly grasp the ecosystem, I built a comprehensive AI OSS database.
This database covers everything from Agent frameworks to development tools and deployment services. I not only included active projects but also established an archive mechanism, preserving detailed information on over 150 historical projects. More importantly, I developed a scoring system to objectively evaluate projects across dimensions like quality and sustainability, helping readers decide which tools are worth investing time in.
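For readers curious about the shape of such a scoring system, a minimal sketch is a weighted sum over normalized signals. The dimensions, weights, and example values below are hypothetical rather than the exact formula the site uses.

```python
# A minimal sketch of a weighted project score; the dimensions, weights, and
# example values are hypothetical, not the site's actual scoring formula.
from dataclasses import dataclass

@dataclass
class ProjectSignals:
    quality: float         # e.g. code health, docs, test coverage, normalized to 0..1
    sustainability: float  # e.g. maintainer count, release cadence, normalized to 0..1
    adoption: float        # e.g. stars/downloads, normalized to 0..1

WEIGHTS = {"quality": 0.4, "sustainability": 0.35, "adoption": 0.25}

def score(p: ProjectSignals) -> float:
    """Combine normalized signals into a single 0-100 score."""
    raw = (
        WEIGHTS["quality"] * p.quality
        + WEIGHTS["sustainability"] * p.sustainability
        + WEIGHTS["adoption"] * p.adoption
    )
    return round(raw * 100, 1)

print(score(ProjectSignals(quality=0.8, sustainability=0.6, adoption=0.9)))  # 75.5
```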
In 2025, I wrote over 120 blog posts. Compared to previous years, these articles focused more on observing and reflecting on technology trends, rather than just technical tutorials.
I started paying attention to deeper questions: How will AI infrastructure evolve? What does Beijing’s open source initiative mean for the AI industry? What ripple effects might a tech acquisition trigger? These articles allowed me and my readers to not only see “what” technology is, but also “why” and “what’s next.”
No matter how good the content is, if it can’t be easily found and read, its value is greatly diminished. In 2025, I invested significant effort into website functionality, with one goal: to provide readers with a smoother reading experience.
As the volume of content grew, the original search function could no longer meet demand. I redesigned the search system to support fuzzy search and result scoring, and optimized index loading performance. More importantly, the new search interface is more user-friendly, supporting keyboard navigation and category filtering so users can find what they want faster.
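As a toy illustration of what fuzzy matching plus result scoring means (not the site’s actual search code), the Python standard library’s SequenceMatcher already captures the idea.

```python
# Toy sketch of fuzzy matching with result scoring and category filtering.
# The documents and fields below are invented for illustration.
from difflib import SequenceMatcher

DOCS = [
    {"title": "Kubernetes networking deep dive", "category": "cloud-native"},
    {"title": "AI Agent design patterns", "category": "ai"},
    {"title": "Istio traffic management", "category": "cloud-native"},
]

def fuzzy_score(query, text):
    """Similarity in [0, 1]; higher means a closer match."""
    return SequenceMatcher(None, query.lower(), text.lower()).ratio()

def search(query, category=None, limit=5):
    candidates = [d for d in DOCS if category is None or d["category"] == category]
    ranked = sorted(candidates, key=lambda d: fuzzy_score(query, d["title"]), reverse=True)
    return ranked[:limit]

# Misspelled queries still rank the intended document first.
print(search("agent desing", category="ai"))
```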
Mobile reading experience has improved significantly. I refactored the mobile navigation and table of contents, making reading on phones much smoother. Dark mode is now more refined, fixing several display issues and ensuring images and diagrams look good on dark backgrounds.
A major change was optimizing the WeChat Official Account publishing workflow. Previously, publishing website content to WeChat required manual handling of many details; now, it’s almost one-click export. This workflow automatically processes images, metadata, styles, and all details, reducing a half-hour task to just a few minutes.
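The real pipeline is specific to my site, but its shape is roughly: rewrite image references, strip site-specific shortcodes, then render inline-styled HTML. The sketch below is hypothetical; the paths, URL prefix, and shortcode syntax are assumptions.

```python
# Hypothetical sketch of the export pipeline's shape, not the site's real
# tooling: rewrite relative image paths so WeChat can fetch them, strip
# site-specific shortcodes, and hand the result to a Markdown-to-HTML step.
import re
from pathlib import Path

SITE_BASE = "https://example.com"  # assumed site/CDN prefix for images

def prepare_for_wechat(md_path: str) -> str:
    text = Path(md_path).read_text(encoding="utf-8")
    # 1. Make relative image references absolute.
    text = re.sub(r"!\[([^\]]*)\]\((/[^)]+)\)", rf"![\1]({SITE_BASE}\2)", text)
    # 2. Drop shortcodes that WeChat's editor cannot render (assumed Hugo-style syntax).
    text = re.sub(r"\{\{<[^>]*>\}\}", "", text)
    # 3. A real pipeline would now render Markdown to inline-styled HTML and
    #    upload images via the WeChat API; both steps are omitted here.
    return text

print(prepare_for_wechat("content/posts/example.md")[:200])
```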
Additionally, I added a glossary feature for technical term highlighting and tooltips; improved SEO and social sharing metadata; and cleaned up outdated content. These seemingly minor improvements quietly enhance the user experience.
Looking back at content creation in 2025, I found clear changes in several dimensions.
Early content leaned toward technical tutorials and practical guides, showing “how to do.” This year, I focused more on “why” and “what are the trends.” I wrote more technology trend analyses, ecosystem maps, and in-depth case studies. These may not directly teach you how to use an API, but they help you understand the direction of technological evolution.
AI is a global wave and cannot be limited to the Chinese-speaking world. In 2025, I wrote bilingual documentation for almost all new AI tools, and important blog posts also have English versions. This increased the workload, but allowed the content to reach a broader audience.
Text is efficient, but not all knowledge is best expressed in words. This year, I used many architecture and schematic diagrams to explain complex concepts, adding 59 new charts. These visual elements lower the barrier to understanding, making abstract concepts more intuitive. I also optimized image display in dark mode to ensure consistent visual experience.
2025 was not only a year of shifting content themes toward AI, but also a year of deep practice in AI-assisted programming.
I developed a VS Code plugin and created many prompts to automate repetitive tasks. I experimented with various AI programming tools and settled on a toolchain that suits me. I even migrated the website to Cloudflare Pages and used its edge computing services to develop a chatbot. These practices greatly improved development efficiency, giving me more time to focus on thinking and creating rather than mechanical coding.
This made me realize: AI will not replace developers, but developers who use AI well will replace those who do not. I also shared more insights to help others master AI-assisted programming.
Looking back at 2025, the site underwent a profound transformation—from a Cloud Native tech blog to an AI infrastructure knowledge base. But this is just the beginning, not the end.
Looking forward to 2026, I plan to continue deepening in several areas:
2025 was a year of change and growth. From Cloud Native to AI Native, from technical practice to ecosystem observation, both the content and functionality of the site have made qualitative leaps.
What makes me happiest is that this transformation allowed me and my readers to stand at the forefront of the technology wave. We are not just learning new technologies, but thinking about how technology changes the world and the way we write software.
The waves of technology keep evolving; only by actively embracing change can we continue to create value. Thank you to every reader for your companionship and support. I look forward to sharing more insights and practices in 2026.
2025-12-30 11:30:51
The success or failure of AI applications often lies not in the technology itself, but in the ability to scale delivery and create a closed loop.

On December 30, 2025, a piece of news went viral: Manus was acquired by Meta for billions of dollars (Manus Joins Meta for Next Era of Innovation). This startup, founded in China and under pressure from tech giants since its inception, completed a whirlwind journey in less than a year—from explosive growth, relocating to Singapore, to being acquired by a global giant.
According to Manus’s official statement, its products and subscriptions will continue to be available via the app and website, and the company will remain operational in Singapore. The team will join Meta to provide general Agent capabilities for Meta’s consumer and enterprise products (including Meta AI).
Rather than focusing on “who won,” I’m more interested in the chain reaction this event triggered: it activated completely opposite judgment systems among different groups, and this split is reshaping the growth paths and strategies for AI applications and startups.
After Manus was acquired, the mainstream sentiment in social circles was one of congratulations and excitement. Many saw it as a stellar example of a Chinese team going global—achieving remarkable results in the most competitive field in a very short time.
Meanwhile, the comment sections of public accounts became “venting valves for counter-narratives,” with skepticism centering on three main points:
This divergence isn’t about who understands AI better, but about different evaluation frameworks: social circles focus on “trajectory and outcome,” while comment sections focus on “legitimacy and worthiness.”
Many people’s impression of Manus comes from its marketing buzz and controversies, which naturally breeds skepticism. But if it truly reached $100M ARR, strictly measured, within 10 months, one fact is clear: its revenue does not depend on broad consensus; it comes from a highly concentrated group of global users with a strong willingness to pay.
Manus’s core user profile is closer to “individuals as production units,” including freelancers, indie developers, independent researchers, and the people responsible for delivery inside small and medium-sized businesses. They don’t care about debates over whether it is a “wrapper”; they care about whether they can deliver end-to-end tasks, and whether the tool lets them hire one less person, work fewer late nights, or stop juggling ten tools.
This leads to a counterintuitive phenomenon: those who discuss the most may not pay, while those who pay steadily are often silent.
For these users, tools are not identity badges—they are profit levers.
Based on the above, the Manus case offers three lessons for entrepreneurs:
Growth No Longer Equals Positive Reviews
AI applications can commercialize first and build consensus later. Public opinion can remain divided for a long time, but cash flow doesn’t wait for unified recognition.
“Heavy Marketing” Is Becoming a Capability, Not a Stigma
As foundational models and capabilities spread rapidly, differentiation is quickly erased. Being seen, understood, and paid for is itself part of the moat. Not all marketing deserves respect, but “distribution and mindshare” have become unavoidable battlegrounds for AI applications.
Globalization Is No Longer a Bonus, but May Be a Survival Strategy
From willingness to pay and compliance boundaries to talent density and valuation systems, the structure of the market means many teams “can only close the loop overseas.” It’s not romantic, but it’s reality.
As someone long engaged in cloud native and AI infrastructure, I’m used to evaluating products by their “technical barriers.” But cases like Manus remind me: at the AI application layer, barriers may not first appear in models or code, but often in organizational speed, productization capability, delivery loop, and distribution efficiency.
When a system can reliably turn “capability” into “results,” it has built a commercial moat—even if its tech stack doesn’t meet outsiders’ ideals of “purity.”
The biggest butterfly effect of Manus being acquired by Meta may not be the deal itself, but making more entrepreneurs realize: in the AI era, the winning move is shifting from “what model you use” to “whether you can deliver results at scale.”
The acquisition of Manus by Meta is not just a convergence of capital and technology, but also a microcosm of the changing growth paradigm in the AI application era. For entrepreneurs, understanding and mastering “user structure,” “distribution capability,” and “global closed loops” will be key to future competition.
2025-12-25 18:01:13
Institutionalized open source marks a new starting point for China’s AI Infra, but true breakthroughs and risks lie in the engineering and governance details.
Using the simultaneous release of open source ecosystem plans by Beijing and Shanghai as a lens, and drawing on China’s past foundation practices and international open source governance experience, this article explores the real opportunities, structural constraints, and potential risks as AI infrastructure (AI Infra) enters a new phase of institutionalized open source.

It is rare for me to write an article solely because of a local policy document. However, during Christmas, both Beijing and Shanghai’s Bureaus of Economy and Information Technology released their respective open source ecosystem construction plans:
This time, the fact that both cities released their plans on the same day sends a signal that deserves serious attention: China is attempting to advance open source in a more systematic and institutionalized way, especially for open source capabilities related to AI Infra.
If you only look at Beijing’s plan, it is easy to interpret it as a local industrial policy upgrade. But when you consider both Beijing and Shanghai’s plans together, it looks more like a clearly defined “dual-center structure.”
The question is no longer whether to develop open source, but:
In the AI era, what institutional forms, engineering paths, and governance models will open source take?
Both Beijing and Shanghai’s plans reflect a highly consistent judgment:
Open source is no longer seen as a spontaneous community activity, but as an industrial infrastructure capability that requires systematic construction.
This is especially evident in the field of AI Infra.
Issues such as computing power scheduling, model evaluation, toolchains, data elements, license compliance, and supply chain security—previously hidden in “engineering details”—are now systematically incorporated into policy language for the first time. This at least shows that decision-makers have realized:
In this respect, Beijing and Shanghai are highly aligned.
When we zoom in, the differences between the two plans become clear.
Beijing: “Foundation-Oriented” Open Source Path for AI Infra
Beijing’s plan focuses on:
This is a typical perspective of “treating AI as an infrastructure problem.”
It is less concerned with the number of projects or community size, and more with:
To some extent, Beijing is answering the question:
How can open source become a “governable, auditable, and scalable public capability”?
Shanghai: “Scale and Internationalization” Path for AI Platform
In contrast, Shanghai’s plan has a different focus:
Shanghai cares more about:
This is a path of “treating open source as a global digital product and platform capability.”
When viewed together, Beijing and Shanghai’s plans form a more complete picture:
Beijing is responsible for “making open source solid,” while Shanghai is responsible for “taking open source global.”
Structurally, this is a clear division of labor:
These two paths are not in conflict; in theory, they are even complementary. The real question is whether they can form positive feedback in practice, rather than operating in silos.
Precisely because both plans are so “systematic,” I am even more cautious.
The reason is simple: this is not China’s first attempt to promote open source through foundations, associations, or platforms.
Over the past decade, we have seen similar paths repeatedly, and recurring structural problems:
These problems will not disappear just because the plans are more comprehensive.
If we are to “listen to their words and watch their actions,” I would focus on the following four risks:
Will Metrics Hijack Engineering Reality?
When “internationally influential projects,” “star projects,” and “first-release projects” become hard metrics, will this induce packaging, migration, and short-term hype, rather than truly solving engineering problems?
Will It Slide Toward Platform Centralism?
The long-term pattern of AI Infra is closer to a model that prioritizes protocols, standards, and interoperability. If it eventually evolves into “a few platforms concentrating resources and discourse power,” it may be efficient in the short term but will suppress external participation and international collaboration in the long run.
Is Internationalization Underestimated as an “Operational Issue”?
True international collaboration is never just about language, sites, or events; it also involves governance structures, compliance boundaries, and supply chain trust.
Will Application Demonstrations Become One-Off Projects?
If “first plans” and “computing vouchers” are just procurement tactics without continuous iteration and community feedback mechanisms, the long-term benefit to the ecosystem will be very limited.
If we review the success of this round of institutionalized open source after three years, I would look for three types of results:
If I were to use a North Star metric to measure the success of these plans, it would be the emergence of several outstanding open source commercial companies rooted in China and serving the world.
The open source ecosystem plans of Beijing and Shanghai mark a new phase of institutionalization and engineering for AI Infra open source in China. Over the next three years, the real achievements will not be about meeting targets, but about forming sustainable engineering capabilities, de facto standards, and maintenance mechanisms. Only through continuous participation and practice can open source become the public foundation of AI infrastructure.
2025-12-24 22:59:11
In 2025, the core of software engineering is no longer just about code itself, but about runtime controllability and cost governance. This shift is fundamentally reshaping the industry’s underlying logic.
Looking back at 2025, I became increasingly aware that this year was not about “code becoming unimportant,” but rather that the value coordinates of engineering have shifted as a whole. For more than a decade, software engineering has focused on code quality, architectural evolution, and delivery efficiency. But starting in 2025, the key to system success is shifting—towards whether the runtime is controllable and whether costs are governable.
This is not just a slogan, but a conclusion repeatedly validated by my real-world experiences throughout the year.
In my annual review, I noted a clear change: I spent less time on “how to write a good system,” and more time on “how to keep the system running stably, reliably, and affordably.”
This shift in focus is a natural extension of a decade of cloud native evolution.
The following timeline diagram illustrates how my focus has changed over recent years:
My focus shifted from cloud native platform engineering to LLM application engineering, then to AI infrastructure, and finally to Agentic Runtime with governance and cost control.
When AI workloads truly enter business scenarios, the core challenges engineers face also change:
These issues go far beyond the code level.
By 2025, an industry consensus is emerging: AI is rewriting software engineering. But the real change is not happening in the IDE or code completion speed—it is reflected in the migration of engineering complexity.
Previously, complexity was concentrated in code and interfaces, and problems were solved through abstraction, refactoring, and testing.
Now, complexity has shifted to the runtime, resource, and cost layers, and must be addressed through scheduling, isolation, observability, and governance.
This is why the same AI tools:
AI tools amplify whether you truly understand how systems run in production.
In traditional cloud native systems, low CPU utilization is often just an efficiency issue; but in AI systems, low GPU utilization is often a cash flow problem.
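A back-of-the-envelope calculation with made-up numbers shows why.

```python
# Back-of-the-envelope unit economics with made-up numbers: how GPU utilization
# drives the cost per million generated tokens.
GPU_HOUR_COST = 3.0          # assumed price of one GPU-hour, in USD
PEAK_TOKENS_PER_SEC = 2500   # assumed per-GPU throughput at full load

def cost_per_million_tokens(utilization: float) -> float:
    """Effective cost per 1M output tokens at a given average utilization."""
    effective_tokens_per_hour = PEAK_TOKENS_PER_SEC * 3600 * utilization
    return GPU_HOUR_COST / effective_tokens_per_hour * 1_000_000

for util in (0.9, 0.5, 0.2):
    print(f"utilization {util:.0%}: ${cost_per_million_tokens(util):.2f} per 1M tokens")
# 90% -> ~$0.37, 50% -> ~$0.67, 20% -> ~$1.67: the same hardware, several times
# the unit cost, purely from idle capacity.
```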
In 2025, I repeatedly encountered scenarios like:
The root cause of these phenomena is not model selection, but the lack of a runtime and cost control plane tailored for AI workloads.
The following flowchart visually illustrates the cyclical relationship between GPU resources and cost pressures:
In AI systems, limited GPU supply leads to queuing and waiting, which causes throughput to drop. Attempts to solve this through blind scaling only increase unit costs and create budget pressure, ultimately forcing the adoption of finer scheduling and governance strategies.
Engineering problems ultimately manifest as cost issues.
In 2025, Agents became a hot topic; by 2026, they will enter the “can it actually run” stage.
The challenge for Agents has never been about “how smart they are,” but rather:
These capabilities form the outline of the Agentic Runtime that I have been trying to clarify throughout the year.
The following flowchart shows the core capability layers of Agentic Runtime:
Agentic Runtime builds from the foundation of Agents and workflows, connecting through orchestration and tool protocols, with the runtime managing state, memory, and evaluation. It provides secure execution environments (Sandbox and Policy), and ultimately implements a resource and cost control plane that unifies GPU, quota, and billing management.
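Reduced to its smallest useful form, the resource and cost control plane is a gate that every tool or model call must pass before it is allowed to spend money. The sketch below is purely illustrative; the tenant names, prices, and limits are invented.

```python
# Purely illustrative: the smallest possible "cost control plane" gate that an
# agent runtime might place in front of every tool or model call. All names,
# limits, and prices are made up.
from dataclasses import dataclass, field

@dataclass
class Budget:
    max_usd: float
    spent_usd: float = 0.0

@dataclass
class RuntimeGate:
    budgets: dict = field(default_factory=dict)

    def charge(self, tenant: str, estimated_usd: float) -> bool:
        """Admit the call only if the tenant's budget can absorb it."""
        budget = self.budgets[tenant]
        if budget.spent_usd + estimated_usd > budget.max_usd:
            return False  # reject, queue, or downgrade to a cheaper model/tool
        budget.spent_usd += estimated_usd
        return True

gate = RuntimeGate({"team-a": Budget(max_usd=50.0)})

def run_tool(tenant: str, tool: str, estimated_usd: float) -> str:
    if not gate.charge(tenant, estimated_usd):
        return f"{tool}: rejected, budget exhausted for {tenant}"
    return f"{tool}: executed ({estimated_usd:.2f} USD charged)"

print(run_tool("team-a", "web_search", 0.05))
```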
Without a runtime, an Agent is just a demo; without cost constraints, an Agent is just a risk amplifier.
Looking ahead to 2026, I remain cautiously optimistic.
I do not believe the future belongs to “those who write the best prompts,” but more likely to:
From 2025 onwards, software engineering is no longer code-centric, but runtime and cost-centric. This is not a regression, but a return: a return to being responsible for the whole system and for real-world constraints.
For me personally, this is both a year-end summary and the direction I will continue to invest in for the coming years.
In 2025, the focus of software engineering has shifted from code itself to runtime and cost governance. The rise of AI and Agents has not diminished the value of engineering, but has pushed complexity to a higher level. In the future, understanding runtime, managing compute and cost will become the new core competencies for engineers. I hope this year-end review provides some inspiration and reflection for fellow professionals.