2026-01-09 22:26:13
Over the Christmas period I spent some of my down time catching up on blog posts and open source projects that I had bookmarked to "read later". I have been interested in learning about opinionated workflows that use agentic tools like Kiro CLI to generate better output. One of those projects looked at how you can improve the output from agentic tools through something called player/coach (I think it might be similar to the actor/critic method that I used back in my DeepRacer reinforcement learning days). The basic gist of it is this: one agent (the player) generates output, which is then assessed and reviewed by another (the coach). The coach feeds its review back to the player, and thus a (hopefully) virtuous circle is formed.
I am a firm believer that the journey is more interesting than the destination, and I was sure I would encounter rough edges that would help me learn more about this area. As it turned out, I had been looking for an opportunity to try out Kiro CLI's new subagent capability, which allows Kiro to orchestrate subagents from your Kiro session. This blog post is a write-up of how that went, what I learned, and some of the sharp edges that I am still trying to figure out.
At the end of the post I share the code and resources so you can experiment for yourself. My hope is that some of you will be sufficiently interested to try this out and see what kind of workflows you might want to create for your own use cases.
So my plan was to create three custom agents in Kiro: an orchestrator, a player, and a coach. Each of these would have its own system prompt and context optimised for its role in the workflow.
I started by creating a new project workspace and the custom agents in Kiro CLI, each with their own system prompt.
.kiro
├── agents
│ ├── coach.json
│ ├── orch.json
│ └── player.json
└── shared
├── COACH.md
├── ORCH.md
└── PLAYER.md
Each custom agent had its own system prompt (in .kiro/shared), where I tuned the behaviour for the role it plays in this workflow. As I experimented with this approach, I did have to make changes. I don't think they are perfect, but I did not want to over-engineer this initial attempt.
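For illustration, here is roughly what one of those agent definitions could look like. Treat this as a hypothetical sketch: the exact fields Kiro CLI supports may differ from what I show here, so check the Kiro CLI documentation rather than copying it verbatim.

```json
{
  "name": "player",
  "description": "Generates first-pass output for the task (sketch only; field names here are assumptions, see the Kiro CLI docs)",
  "resources": ["file://.kiro/shared/PLAYER.md"],
  "tools": ["fs_read", "fs_write", "execute_bash"],
  "allowedTools": ["fs_read", "fs_write"]
}
```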
Once I had this set up, I started the orchestrator using "kiro-cli --agent orch", which took me to Kiro CLI's prompt, and from there I could just ask it to do anything and the workflow would kick in.
I spent around 2-3 hours setting this up and then probably another 2-3 hours testing and trying it out on different requests (all code related). I am going to see how this works on non-code tasks too, to get a feel for the use cases where this approach might be best suited. No answers or intuition yet.
Some of the sharp edges that I saw whilst I was using this approach were:
"hanging" - frequently the coach or the player would start a server/service, and progress would stall - it was waiting for something that was never coming. I have actually seen this quite a bit in the agentic tools I use, and typically this needs a manual intervention. When this is happening in a subagent, typically a CTRL+C will be enough to break out of the loop. What I did find though is that sometimes it will leave those services running in the background, and so as it then goes to retry it gets port conflicts. It is smart enough to work around this and move ports though. I was able to mitigate and improve this with better steering documents to provide specific guidance
creating files in unexpected places - another issue that came up was that occasionally my workflow would not be followed. For example, I asked for all updates to happen to files within a specific directory, but they would be created in other locations.
one-shot execution - on a few occasions the workflow decided that rather than break down the request into a series of tasks, it was going to do it as a one-shot execution.
quality of output - I didn't spend enough time improving the coach context files, so aside from it picking up on a few tasks that had not been completed correctly, I am not entirely sure how to evaluate whether the coach was improving the output or not. I have some ideas on how I could do this (for example, creating a baseline subagent that does not use the coach and then comparing the output between the two).
cost/efficiency - an iteration of this workflow, from request to completed task, consumed 4.3 credits in Kiro. When I tried to do this one-shot using standard vibe coding it was less than 1 credit, so I am not sure how cost effective this approach is.
subagent configuration - in the early versions of my subagents, I did not configure the tool permissions correctly, and so I was forever being prompted. Once I resolved this it was ok, but it did then lead me to discover a current constraint of subagent tool calling - currently only a subset of tools can be used by subagents automatically (which you can read about here). This means that those subagents can't explore MCP tools or make web calls, but I suspect this will change over time (especially as the Kiro CLI team are on fire, releasing updates faster than I can keep up).
As I was working through some examples, I did see some material improvements as I made changes. With approaches like this, it does take some time and experimentation to see how you can affect the output. Some of the things that had a big impact on the workflow were:
Evaluation - I am confident that going deeper with more specific criteria (in both depth and breadth) would generate more meaningful reviews. At the moment the setup feels more like player/reviewer than player/coach. I have seen some significant improvements when I have added items to the EVAL context, for example "After reviewing the submission from the player, review against Python PEP-8 and suggest the most impactful improvement they should make" (see the sketch below).
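To make that concrete, here is the sort of criteria block I mean; this is a hypothetical excerpt, not my actual COACH.md:

```markdown
## Evaluation criteria
1. Correctness: run the player's tests and list any that fail.
2. Style: review against Python PEP-8 and suggest the single most
   impactful improvement the player should make.
3. Completeness: flag any tasks from the plan that were not delivered.
```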
Steering and Context - I had some glitches during my experiments due to either contradictory context or a lack of it. I resolved these by tuning the custom agent resource configurations, which allow you to precisely control what is used as context. This had the biggest impact on changing the behaviour of the player, coach and orchestrator agents.
Creating a baseline - as mentioned above, the approach I started with lacks a way of understanding whether it generates better output. I have started experimenting with ideas such as creating a baseline output (player, without coach), but I am going to look at other mechanisms for understanding and evaluating the quality of the output.
It is also fair to say that I was just using the Kiro CLI tool "as is", operating within its capabilities. One thought that came out of this journey was that building my own tool using something like Strands Agents might give me more control and flexibility over how this workflow works. Something I will be looking into, so keep your eyes peeled for that one.
Whether the output generated using a player/coach model is better than a more traditional linear approach is hard for me to tell. I started off trying to explore and understand this model better, and after just a few hours' work I feel that I have some new ideas and approaches that might be useful going forward. I think that is the key thing for me at the moment: while agentic AI is still so new, exploring new ideas and approaches can sometimes lead to great outcomes. I didn't get a major aha moment today, but I still learned something, and I am happy with that.
I have shared the code in this GitHub repo where you can try this out for yourself. I have also put together a short video of this in action which you can see here.
In this player/coach workflow I asked it to create me a simple application.
You can get started with Kiro CLI today for free. Download it from this link. If you are new to Kiro CLI, I have created a workshop that will walk you through getting started with the terminal-based Kiro CLI tool, providing a comprehensive overview as well as advanced topics.
Finally, if you did find this post interesting, helpful, or useful, I would love to get your feedback. Please use this 30 second feedback form and I will be forever grateful.
Made with ♥ from DevRel
2026-01-09 22:25:38
Building CloudWise has given me a unique view into AWS spending patterns across hundreds of accounts.
Many teams overlook orphaned EBS volumes, leading to unnecessary charges.
After analyzing $2M in AWS costs, here's the biggest mistake we discovered: ignoring orphaned resources while optimizing compute costs.
This happens because most teams focus on compute optimization but overlook the "boring" stuff that adds up quickly.
Use the AWS CLI to identify and delete orphaned EBS volumes:
aws ec2 describe-volumes --filters Name=status,Values=available
aws ec2 delete-volume --volume-id <volume-id>
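If you want to sweep everything in one pass, here's a minimal sketch chaining the two commands. It assumes you have already confirmed the volumes are truly unneeded, since delete-volume is irreversible:

```bash
# List unattached (status=available) volumes, then delete each one.
# WARNING: irreversible; snapshot anything you might need first.
for vol in $(aws ec2 describe-volumes \
    --filters Name=status,Values=available \
    --query 'Volumes[].VolumeId' --output text); do
  aws ec2 delete-volume --volume-id "$vol"
done
```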
What patterns have you noticed in your AWS bills? Drop a comment - I love learning from other developers' experiences.
I'm building CloudWise to help developers get clarity on AWS costs. Always happy to share insights from our data analysis.
2026-01-09 22:17:01
The PDF file format is very complex and contains many features to boost interactivity. One such feature is the ability for PDF files to contain annotations which allow you to draw over, highlight, label, and comment on documents without modifying the existing content.
While this is a useful feature, you may sometimes want to remove annotations from a document, for example to strip text highlighting that was added as an annotation.
The PDF toolkit JPedal allows you to remove annotations from PDF files with only a few lines of code!
To do this you need to download a JPedal jar, then run the following code:
import java.io.File;
// plus the JPedal import for PdfManipulator from the jar you downloaded
final PdfManipulator pdf = new PdfManipulator();
pdf.loadDocument(new File("inputFile.pdf")); // open the source PDF
pdf.removeAnnotations(); // strip all annotations
pdf.apply(); // apply the pending changes
pdf.writeDocument(new File("outputFile.pdf")); // save the cleaned copy
pdf.closeDocument(); // release resources
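Assuming you wrap the snippet above in a class with a main method, compiling and running against the jar looks something like this (the jar and class names are placeholders for whatever you downloaded and created):

```
javac -cp jpedal.jar RemoveAnnotations.java
java -cp .:jpedal.jar RemoveAnnotations
```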
You may also want to consider other sanitization options to clean your documents.
You can download a trial jar from the website.
Learn more about the PDF Manipulator API.
As developers who have been working with the PDF format for more than two decades, we can help you better understand it!
2026-01-09 22:16:32
I want to make a social-media-type app for my school friends to connect on: a place where people can create college study groups and hold events, customizable for every school. I can do basic React Native for the front end, but I am really new to backend and real project work. I tried Firebase, but since the storage for pictures was paid I dropped it, and I find myself drifting too much from one thing to another and can't focus on one. I hear a lot of YouTube coders saying you don't need to learn syntax, which leaves me wondering: then what should I learn? What should I use AI for? I am confused and would be grateful to any experienced developer who can guide me. Thanks for viewing this and have a good day ahead.
2026-01-09 22:14:56
Last quarter we hit a production incident that looked “healthy” at first — until it wasn’t.
Traffic spiked from 100 to 1000 req/sec.
Kubernetes HPA did exactly what it was designed to do.
Our database did not.
The math no one notices under pressure:
15 pods × 50 connections = 750 required
Database capacity = 200
Result:
“FATAL: too many clients already”
CrashLoopBackOff across new pods.
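"FATAL: too many clients already" is PostgreSQL's error, so if that's your database you can check how close you are before it happens with standard queries:

```sql
-- Connections currently in use vs. the configured ceiling
SELECT count(*) FROM pg_stat_activity;
SHOW max_connections;
```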
HPA scales compute.
It does not understand downstream limits.
Databases don’t autoscale like pods.
Connection pools multiply silently.
By the time alerts fire, you’re already down.
When this happens, logs alone are not enough. You need to reconstruct the connection math above (pods, pools, and the database's real ceiling) while the system is down, and that triage is where most time is lost during incidents.
After seeing this failure pattern repeatedly, I stopped relying on memory and ad-hoc runbooks.
I ended up building CodeWeave, a DevOps copilot that forces a structured incident-response flow for production infrastructure.
For this class of incident, it makes the safe sequence of checks and actions explicit rather than leaving it to memory.
The important part isn’t automation — it’s making unsafe decisions harder under pressure.
The goal isn’t speed.
It’s reducing risk when systems are already unstable.
I’m curious how others handle this failure mode.
Do you cap HPA replicas based on database limits — or rely entirely on pooling layers?
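For reference, here's what a database-derived cap can look like. This is a minimal sketch assuming Postgres max_connections of 200, 50 connections reserved for other clients, and a per-pod pool of 50; swap in your own numbers:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 2
  # Derived from the database, not from compute:
  # floor((200 max_connections - 50 reserved) / 50 per-pod pool) = 3
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```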
If you're curious, I shared a short demo of this exact flow using CodeWeave: https://www.linkedin.com/posts/activity-7415388599883292672-Q05O?utm_source=share&utm_medium=member_desktop&rcm=ACoAAAqyZpkBTUNyc9y0g8Qnow5IZiIzJ9MbUGc
I’d mainly love feedback from other DevOps and SREs who’ve dealt with similar scaling failures.
2026-01-09 22:10:34
When building production AI systems, you often need the best of both worlds: the creativity and adaptability of LLM-based agents combined with the reliability and determinism of structured workflows. KaibanJS's structured output chaining feature makes this seamless.
In this article, we'll explore how to chain ReactChampionAgent (LLM-powered) with WorkflowDrivenAgent (workflow-based) to build robust AI systems using a real-world product review analysis use case.
Modern AI applications frequently require both the open-ended reasoning of LLM agents and deterministic, repeatable data processing. Traditional approaches to combining the two require manual data transformation, error-prone mappings, and brittle integration points. KaibanJS solves this with automatic structured output chaining.
Structured output chaining automatically passes validated, schema-constrained outputs from one agent to another within a team. When a ReactChampionAgent task has an outputSchema that matches a WorkflowDrivenAgent workflow's inputSchema, the system handles the data transfer seamlessly.
Let's build a complete review analysis system: a workflow-driven agent validates the raw reviews and computes metrics, a sentiment agent analyzes themes and patterns, and an insights agent turns both into business recommendations.
First, we establish our data structures using Zod:
import { z } from 'zod';
const reviewSchema = z.object({
product: z.string(),
rating: z.number().min(1).max(5),
text: z.string().min(1),
date: z.string().optional(),
author: z.string().optional(),
});
const processedDataSchema = z.object({
metrics: z.object({
averageRating: z.number(),
ratingDistribution: z.object({
1: z.number(),
2: z.number(),
3: z.number(),
4: z.number(),
5: z.number(),
}),
totalReviews: z.number(),
validReviews: z.number(),
invalidReviews: z.number(),
averageTextLength: z.number(),
commonKeywords: z.array(
z.object({
word: z.string(),
count: z.number(),
})
),
}),
reviews: z.array(reviewSchema),
summary: z.string(),
});
The WorkflowDrivenAgent handles deterministic data processing:
import { Agent, Task } from 'kaibanjs';
import { createStep, createWorkflow } from '@kaibanjs/workflow';
// Validation step
const validateReviewsStep = createStep({
id: 'validate-reviews',
inputSchema: z.object({
reviews: z.array(reviewSchema),
}),
outputSchema: z.object({
validReviews: z.array(reviewSchema),
invalidReviews: z.array(
z.object({
review: z.any(),
errors: z.array(z.string()),
})
),
totalCount: z.number(),
validCount: z.number(),
}),
execute: async ({ inputData }) => {
const { reviews } = inputData;
const validReviews = [];
const invalidReviews = [];
reviews.forEach((review) => {
const result = reviewSchema.safeParse(review);
if (result.success) {
validReviews.push(result.data);
} else {
invalidReviews.push({
review,
errors: result.error.errors.map(
(e) => `${e.path.join('.')}: ${e.message}`
),
});
}
});
return {
validReviews,
invalidReviews,
totalCount: reviews.length,
validCount: validReviews.length,
};
},
});
// Metrics extraction step
const extractMetricsStep = createStep({
id: 'extract-metrics',
inputSchema: z.object({
validReviews: z.array(reviewSchema),
invalidReviews: z.array(z.any()),
totalCount: z.number(),
validCount: z.number(),
}),
outputSchema: z.object({
metrics: z.object({
averageRating: z.number(),
ratingDistribution: z.object({
1: z.number(),
2: z.number(),
3: z.number(),
4: z.number(),
5: z.number(),
}),
totalReviews: z.number(),
validReviews: z.number(),
invalidReviews: z.number(),
averageTextLength: z.number(),
commonKeywords: z.array(
z.object({
word: z.string(),
count: z.number(),
})
),
}),
validReviews: z.array(reviewSchema),
}),
execute: async ({ inputData }) => {
const { validReviews, invalidReviews, totalCount, validCount } = inputData;
// Calculate metrics
const totalRating = validReviews.reduce(
(sum, review) => sum + review.rating,
0
);
const averageRating = validCount > 0 ? totalRating / validCount : 0;
const ratingDistribution = { 1: 0, 2: 0, 3: 0, 4: 0, 5: 0 };
validReviews.forEach((review) => {
ratingDistribution[review.rating.toString()]++;
});
const totalTextLength = validReviews.reduce(
(sum, review) => sum + review.text.length,
0
);
const averageTextLength = validCount > 0 ? totalTextLength / validCount : 0;
// Extract keywords
const wordCount = {};
validReviews.forEach((review) => {
const words = review.text
.toLowerCase()
.replace(/[^\w\s]/g, '')
.split(/\s+/)
.filter((word) => word.length > 3);
words.forEach((word) => {
wordCount[word] = (wordCount[word] || 0) + 1;
});
});
const commonKeywords = Object.entries(wordCount)
.map(([word, count]) => ({ word, count }))
.sort((a, b) => b.count - a.count)
.slice(0, 10);
return {
metrics: {
averageRating: Math.round(averageRating * 100) / 100,
ratingDistribution,
totalReviews: totalCount,
validReviews: validCount,
invalidReviews: invalidReviews.length,
averageTextLength: Math.round(averageTextLength),
commonKeywords,
},
validReviews,
};
},
});
// Data aggregation step
const aggregateDataStep = createStep({
id: 'aggregate-data',
inputSchema: z.object({
metrics: z.object({
averageRating: z.number(),
ratingDistribution: z.object({
1: z.number(),
2: z.number(),
3: z.number(),
4: z.number(),
5: z.number(),
}),
totalReviews: z.number(),
validReviews: z.number(),
invalidReviews: z.number(),
averageTextLength: z.number(),
commonKeywords: z.array(
z.object({
word: z.string(),
count: z.number(),
})
),
}),
validReviews: z.array(reviewSchema),
}),
outputSchema: processedDataSchema,
execute: async ({ inputData }) => {
const { metrics, validReviews } = inputData;
const summary = `Processed ${metrics.validReviews} valid reviews out of ${
metrics.totalReviews
} total.
Average rating: ${metrics.averageRating}/5.
Rating distribution: ${metrics.ratingDistribution['5']} five-star, ${
metrics.ratingDistribution['4']
} four-star, ${metrics.ratingDistribution['3']} three-star, ${
metrics.ratingDistribution['2']
} two-star, ${metrics.ratingDistribution['1']} one-star reviews.
Average review length: ${metrics.averageTextLength} characters.
Top keywords: ${metrics.commonKeywords
.slice(0, 5)
.map((k) => k.word)
.join(', ')}.`;
// Return the flat shape processedDataSchema expects (metrics, reviews,
// summary at the top level) so the workflow's outputSchema validation passes.
return {
metrics,
reviews: validReviews,
summary,
};
},
});
// Create and configure the workflow
const reviewProcessingWorkflow = createWorkflow({
id: 'review-processing-workflow',
inputSchema: z.object({
reviews: z.array(reviewSchema),
}),
outputSchema: processedDataSchema,
});
reviewProcessingWorkflow
.then(validateReviewsStep)
.then(extractMetricsStep)
.then(aggregateDataStep);
reviewProcessingWorkflow.commit();
// Create the workflow-driven agent
const reviewProcessorAgent = new Agent({
name: 'Review Processor',
type: 'WorkflowDrivenAgent',
workflow: reviewProcessingWorkflow,
});
Now we create LLM agents that receive the processed data and generate insights:
// Sentiment analyzer with structured output expectation
const sentimentAnalyzerAgent = new Agent({
name: 'Sentiment Analyzer',
role: 'Sentiment Analysis Expert',
goal: 'Analyze sentiment, themes, and patterns in product reviews',
background:
'Expert in natural language processing, sentiment analysis, and identifying patterns in customer feedback.',
type: 'ReactChampionAgent',
tools: [],
});
// Insights generator
const insightsGeneratorAgent = new Agent({
name: 'Insights Generator',
role: 'Business Insights Expert',
goal: 'Generate actionable insights and recommendations based on review analysis',
background:
'Expert in business analysis and strategic recommendations. Specialized in translating customer feedback into actionable business insights.',
type: 'ReactChampionAgent',
tools: [],
});
Here's where the magic happens - we define tasks that automatically chain:
// Task 1: Process reviews (WorkflowDrivenAgent)
const processReviewsTask = new Task({
description:
'Process and analyze the product reviews: {reviews}. Extract metrics, validate data, and calculate statistics.',
expectedOutput:
'Structured metrics including average rating, rating distribution, common keywords, and processed review data',
agent: reviewProcessorAgent,
});
// Task 2: Analyze sentiment (ReactChampionAgent)
// This automatically receives processedData from Task 1
const analyzeSentimentTask = new Task({
description: `Analyze the sentiment and themes in the processed reviews.
Focus on:
- Overall sentiment trends (positive, negative, neutral)
- Main themes and topics mentioned by customers
- Common pain points and complaints
- Positive aspects and strengths highlighted
- Emotional patterns across different rating levels
Use the processed metrics and review data to provide comprehensive sentiment analysis.`,
expectedOutput:
'Detailed sentiment analysis with themes, pain points, strengths, and emotional patterns identified in the reviews',
agent: sentimentAnalyzerAgent,
});
// Task 3: Generate insights (ReactChampionAgent)
// This automatically receives outputs from both Task 1 and Task 2
const generateInsightsTask = new Task({
description: `Generate actionable business insights and recommendations based on the review metrics and sentiment analysis.
Provide:
- Key findings and trends
- Priority areas for improvement
- Strengths to leverage
- Specific actionable recommendations
- Strategic suggestions for product development and customer satisfaction`,
expectedOutput:
'Comprehensive business insights with actionable recommendations and strategic suggestions for product improvement',
agent: insightsGeneratorAgent,
});
Put it all together:
import { Team } from 'kaibanjs';
const team = new Team({
name: 'Product Reviews Analysis Team',
agents: [
reviewProcessorAgent,
sentimentAnalyzerAgent,
insightsGeneratorAgent,
],
tasks: [processReviewsTask, analyzeSentimentTask, generateInsightsTask],
inputs: {
reviews: [
{
product: 'Smartphone XYZ Pro',
rating: 5,
text: 'Excellent product, very fast and great battery life. The camera is impressive and the screen looks incredible.',
date: '2024-01-15',
author: 'John P.',
},
// ... more reviews
],
},
env: { OPENAI_API_KEY: process.env.OPENAI_API_KEY },
});
// Execute the team
const result = await team.start();
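While iterating, it helps to inspect whatever start() resolves with; this is plain console logging, nothing KaibanJS-specific:

```js
// Pretty-print the resolved value to eyeball the final insights
// produced by Task 3 (the exact shape may vary by KaibanJS version).
console.log(JSON.stringify(result, null, 2));
```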
The automatic chaining happens through KaibanJS's task result system:
1. Task 1 completes: The WorkflowDrivenAgent processes reviews and returns structured data matching processedDataSchema.
2. Automatic extraction: The system extracts the result from Task 1 and stores it in the task store.
3. Task 2 receives data: When Task 2 (sentiment analysis) executes, it automatically receives Task 1's output as part of its context.
4. Type-safe access: The LLM in Task 2 can access the structured data via task result interpolation in the description.
5. Task 3 receives multiple results: Task 3 automatically receives results from both previous tasks.
┌─────────────────────────────────────┐
│ Task 1: Process Reviews │
│ Agent: WorkflowDrivenAgent │
│ ─────────────────────────────────── │
│ Input: { reviews: [...] } │
│ Output: { processedData: {...} } │
└──────────────┬──────────────────────┘
│
│ ═══════════════════════
│ AUTOMATIC SCHEMA CHAIN
│ ═══════════════════════
▼
┌─────────────────────────────────────┐
│ Task 2: Analyze Sentiment │
│ Agent: ReactChampionAgent │
│ ─────────────────────────────────── │
│ Input: Task 1 result (auto-injected)│
│ Output: Sentiment analysis │
└──────────────┬──────────────────────┘
│
│ ═══════════════════════
│ MULTIPLE RESULTS CHAIN
│ ═══════════════════════
▼
┌─────────────────────────────────────┐
│ Task 3: Generate Insights │
│ Agent: ReactChampionAgent │
│ ─────────────────────────────────── │
│ Input: Task 1 + Task 2 results │
│ Output: Business insights │
└─────────────────────────────────────┘
For even more control, you can use explicit outputSchema on LLM tasks:
const sentimentSchema = z.object({
overallSentiment: z.enum(['positive', 'neutral', 'negative']),
themes: z.array(z.string()),
painPoints: z.array(z.string()),
strengths: z.array(z.string()),
emotionalPatterns: z.record(z.string(), z.string()),
});
const analyzeSentimentTask = new Task({
description: 'Analyze sentiment in reviews...',
outputSchema: sentimentSchema, // Explicit output schema
agent: sentimentAnalyzerAgent,
});
// Now you can create a workflow that expects sentimentSchema
const insightsWorkflow = createWorkflow({
id: 'insights-workflow',
inputSchema: sentimentSchema, // Matches Task 2's outputSchema
// ... workflow steps
});
When schemas match, KaibanJS automatically passes the data at the root level. If they don't match, the data is still available but nested under the task ID.
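To make that concrete, the downstream input takes one of two shapes; this is a sketch of the idea with a hypothetical task ID, not exact KaibanJS internals:

```js
// Schemas match: the validated data arrives at the root level.
// { overallSentiment: 'positive', themes: [...], painPoints: [...] }

// Schemas don't match: the same data is still available,
// nested under the producing task's ID (hypothetical ID shown).
// { 'analyze-sentiment': { overallSentiment: 'positive', ... } }
```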
Structured output chaining in KaibanJS provides a powerful way to combine the flexibility of LLM agents with the reliability of deterministic workflows. By leveraging automatic schema-based data passing, you can build robust AI systems that are both powerful and maintainable.
The key advantages: automatic, schema-validated data passing between agents; no manual transformation or brittle glue code; and a clean separation between flexible LLM reasoning and deterministic processing.
Ready to build your own chained agent system? Check out the KaibanJS documentation and try the live example.