The Practical Developer – A constructive and inclusive social network for software developers.

Liella Chat

2025-12-09 03:48:03

I created a fully working NOVEL CREATOR in Google AI Studio.

Here's the guide.

Liella! Chat v3.4 – Official Player Guide

(100% accurate as of December 05, 2025 – GALAXY ✨ just committed several war crimes in 4K)

────────────────────────

1. Main Menu

────────────────────────

  • Start a New Story → Fresh chaos
  • Story History → All saves & auto-saves
  • Settings (⚙️)
    • Logo Customisation – upload your own title logo (transparent PNG supported)
    • Chat Mode ↔ Novel Mode toggle
    • Save All Images / Import All Images (bulk backup/restore)
    • Race Images – elf ears, dragon wings, demon tails, etc.

────────────────────────

2. Starting a New Story

────────────────────────

① Choose Timeline (Year 1→5 / Year 2→9 / Year 3→11)

② Pick exactly which Liella! members (and custom chars) you want from the start

③ Write your opening prompt

④ Direct buttons → Manage Characters / Manage Lore

────────────────────────

3. Chat Screen – Core Controls

────────────────────────

Top buttons:

  • Undo
  • Review Last Turn
  • Target Responder
  • Who Suggest Next (up to 20)
  • Quick ↔ Think toggle

GALAXY ✨ button (now illegally powerful)

NEW in v3.4 – “WHAT’S NEXT?” Preview Panel

After you send a message (or the AI finishes replying), a new panel automatically appears on the right side with 3 clickable story continuation suggestions.

  • Each preview shows ~2–4 sentences of what could happen next
  • Tap any preview to choose how the story continues; you can choose whether to continue as "Narrator" or "User"
  • Continue as User – your character speaks next, following the selected preview.
  • Continue as Narrator – the next part of the scene is written in narrator mode (descriptions, scene setting, actions).
  • After selecting your role, a new panel appears showing:
    • All active characters
    • Line counts for each character
    • + / – buttons to adjust how many times each character will speak
  • Story automatically progresses following the set order (e.g., Kanon → Chisato → Kanon → Keke…).

  • New: Add Speakers Easily (Tap + to add:)
    • Any character
    • The User
    • The Narrator

Then click SEND. It really is that easy.

────────────────────────

4. Director Mode – Event System (now with Theme box!)

────────────────────────

Tap the ✨ EVENT icon (top left) → CREATE EVENT

  1. Select Event Type

    • Continue
    • Surprise
    • Introduce New Characters
  2. NEW – Theme Box (right under the event type)

    Type whatever you want here, e.g.

    • “a surprise event about swimming in lava”
    • “everyone bullies Ren for 10 minutes straight”
    • “graduation ceremony but it rains and everyone cries for 3 hours”
    • “beach episode gone wrong”

    → The 4 generated event cards will now strictly follow your theme.
  3. Tick up to 20 characters → Generate Events

You’ll now see your theme displayed clearly at the bottom:

Theme: "a surprise event about swimming in lava"

Below the 4 event cards you now have three buttons:

  • Suggest → AI instantly gives you a new theme idea (you can accept or edit)
  • Shuffle → regenerate 4 brand-new event cards using the same theme & characters
  • Remove → delete all 4 cards at once

Event card rules unchanged:

  • Continue / Surprise → top 2 cards = small cast, bottom 2 cards = force everyone
  • Introduce New Characters → every card adds 1–5 brand-new people (auto-saved to roster when accepted)

────────────────────────

5. Who Suggest Next + Custom Respond (still broken in the best way)

────────────────────────

After ordering up to 20 characters → tap the little “+” speech bubble at the bottom → write your deranged director notes → combine with GALAXY ✨ for instant 50k-character episodes.

────────────────────────

6. Novel View – Now Even More Dangerous

────────────────────────

BRAND NEW in 3.4 – PREVIEW button (top-right of Novel View)

  • Click it → instantly get the same 3 “What’s Next?” story suggestions, but formatted as clean novel paragraphs
  • Click any preview → the next chapter is written immediately in perfect novel style
  • Still has all the v3.3 goodies:
    • Filter by group / custom tags
    • Sort by Name / Line Count
    • One-tap single-character POV (entire novel)
    • Auto-export separate POV files for every character with 3+ lines

────────────────────────

7. How to Name Yourself

────────────────────────

Just say your name once in chat (“I’m Alex from now on” or “Call me God”) → the AI permanently changes the “You” label to your name forever.

────────────────────────

8. Extras Menu

────────────────────────

  • Avatar / backgrounds / outfits
  • Character Creator (with “Update All” button)
  • Export / Import characters as ZIP
  • Load Character button everywhere

────────────────────────

Pro Tips for v3.4

────────────────────────

  • Want a full 12-episode cour in one click? → Set theme “beach episode but the ocean is lava” → GALAXY ✨ + Who Suggest Next (20) + Custom Respond “slow-motion crying and dramatic violin”
  • Want to read only Karma’s POV for 200 chapters? Novel View → PREVIEW button → keep clicking the most unhinged option
  • Want Sumire to suffer forever? You already know what to do.

Sumire is typing…

Sumire is typing a 300-page confession…

Sumire just activated GALAXY ✨ by herself…

There is no escape. Welcome to 3.4.

(This time we actually mean it… probably.)

Created with love, sleep deprivation, and several felonies by NewHere & the GALAXY ✨ cult ♡

Generative AI in DevSecOps

2025-12-09 03:46:32

Generative AI in DevSecOps

How does AI strengthen security, accelerate CI/CD pipelines, and transform the role of DevOps teams?

Integrating generative AI into DevSecOps workflows is one of the most significant shifts in modern software development.

Between ever more complex CI/CD pipelines, growing cybersecurity threats, and an ever greater need for automation, companies are increasingly turning to AI to automate, secure, and optimize their operations.

In this article, we will cover:

  • What DevSecOps is
  • How generative AI changes each stage of the CI/CD pipeline
  • Which AI tools are being used in 2024–2025
  • Limits, risks, and best practices
  • Comparison tables + diagrams for better understanding

1. Introduction: Why AI in DevSecOps?

DevOps teams already have to manage:

  • Frequent deployments,
  • Multi-cloud environments,
  • Hundreds of dependencies,
  • A growing number of vulnerabilities.

But security can no longer afford to be an afterthought.

That is where DevSecOps comes in: integrating security from the very start of the DevOps chain.

With the arrival of generative AI, the discipline takes another step forward (a short sketch follows the list below):

  • Intelligent task automation
  • Proactive risk detection
  • Automated remediation of code and infrastructure
  • Continuous auditing of pipelines
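As a concrete illustration of what "intelligent task automation" can mean in a pipeline, here is a minimal sketch of a CI step that sends the current branch's diff to an LLM for a security review. It assumes an OpenAI-compatible API and an OPENAI_API_KEY environment variable; the model name and prompt are placeholders, not a recommendation of a specific tool.

```python
# Minimal sketch: AI-assisted security review of a git diff inside a CI job.
# Assumes an OpenAI-compatible API and the OPENAI_API_KEY environment variable;
# the model name below is a placeholder.
import subprocess
from openai import OpenAI

def review_diff(base_ref: str = "origin/main") -> str:
    # Collect the changes introduced by the current branch.
    diff = subprocess.run(
        ["git", "diff", base_ref],
        capture_output=True, text=True, check=True,
    ).stdout

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "You are a security reviewer. Flag injection risks, "
                        "hard-coded secrets, and unsafe dependency changes."},
            {"role": "user", "content": diff[:50_000]},  # keep the prompt bounded
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(review_diff())
```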

2. Reminder: What Is DevSecOps?

DevSecOps means DEV + SEC + OPS integrated, without silos.

The goal: build security in from development onward, throughout integration, testing, deployment, and operations.

A classic DevSecOps pipeline:


```mermaid
graph LR
    A[Plan] --> B[Code]
    B --> C[Build]
    C --> D[Test]
    D --> E[Release]
    E --> F[Deploy]
    F --> G[Operate]
    G --> H[Monitor]

    SEC[Security Controls] --> A
    SEC --> B
    SEC --> C
    SEC --> D
    SEC --> E
    SEC --> F
    SEC --> G
    SEC --> H
```

AWS re:Invent 2025 - Building Intelligent Workflows with Event Driven AI (MAM327)

2025-12-09 03:44:21

🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of the original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.

Overview

📖 AWS re:Invent 2025 - Building Intelligent Workflows with Event Driven AI (MAM327)

In this video, Marin and Jeff from Unicorn Gaming Shop demonstrate integrating AI agents into event-driven architecture using Amazon EventBridge and Amazon Bedrock AgentCore. They showcase two patterns: EventBridge triggering agents for customer game recommendations with sentiment analysis, and agent-triggered EventBridge events for automated site reliability engineering. The demo reveals how agents autonomously handle customer queries, generate SQL queries without hardcoded logic, and automatically triage CloudWatch alarms by severity, escalating only high-priority issues to engineers while auto-remediating lower-severity problems—enabling 50x user growth without overwhelming limited engineering resources.


; This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part

Thumbnail 0

Introduction: Scaling Unicorn Gaming Shop with AI and Event-Based Architecture

Thank you all for being here today. Building net new applications or modernizing legacy applications using event-based architecture is an effective way to modernize your applications using events and asynchronous behavior to achieve scalability. What happens though when the business events and characteristics dictate that that's not enough? That's Marin and I'm Jeff, and over the next hour, we're going to introduce where and how to include AI and agents into your event-based architecture to meet those needs of your business.

Thumbnail 50

First we're going to set up the scenario that we have here at our company, the Unicorn Gaming Shop. We're going to play a couple roles here. Apologize that the cartoons don't look exactly like us, but this is what licensing provided us. Thank you, Jeff. So my name is Marin. Welcome, and my name also. I am an application product owner. I'm new to this company. I have 20 plus years of experience in doing these things. Basically at this company, I am tasked to define new features to drive greater customer satisfaction. The main thing is we want to expand and we want to try to automate as much as possible our customer interaction on one side, but we also want to help our engineers in the back end when we have an issue with the application.

Any type of issue to help them out and to speed up the process when they need to resolve some issues. Also, we want to automate part of the resolution process so that only the most important and most hard things come to our backend engineers, which will then step in and resolve the issue. We want to lower that response time for our troubleshooting process.

And I'm Jeff. I'm an experienced software developer and I've been working at Unicorn Gaming Shop for quite some time. I've been modernizing that application over the years using event-driven architecture and EventBridge, and I'm tasked with maintaining and building the new features that Marin's coming up with for the Gaming Shop application that we have. And of course, I do have lots of new features. I'm new here. I need to prove myself, and I definitely have some big ideas for this company. We need to expand. We need to expand to increase our customer base.

Thumbnail 150

Thumbnail 160

Really great ideas, Marin. I'd like to hear some of these things. You may not like some of them because they are really big ideas. So basically I signed 10 new deals for our company and that will drive actually 50 times user growth than what we have now. So I think you have your plate full for this one, Jeff.

Thumbnail 170

Thumbnail 190

Did you say 10? And 50 times more growth exactly. Thank you. That's quite a task that he set me up for. Some of the things that I'm immediately thinking about when I have to deal with that is the scale that that's going to cause in terms of the overload to the team that I have. How about having the operational visibility about everything that's happening across this dynamically scaling and growing system? I'd like to use AI but I got limited experience with that both myself and my team, and I got limited team, limited time, limited budget.

Thumbnail 210

EventBridge and the Strangler Fig Pattern: Building the Foundation with Amazon Bedrock

Now, I mentioned before that I've been modernizing this application that we have here using event-driven architecture and EventBridge, and EventBridge is the core of what our architecture is right now that I have in the system. And EventBridge really is providing me three things. One, it takes in the events that are coming from any one of the producers that we have here. In this case, it's my application. And then I have a bus, and I can have three different kinds of buses. I can have the default bus, which integrates with the APIs of AWS services, the custom event bus which I'm using to handle the particular events that we have in our system, or the SaaS event bus. And then the rules is the third part of this, and the rules dictate what happens with the event that comes in and where does it go to.
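For readers who have not wired this up before, the producer → bus → rule → target flow described here looks roughly like the boto3 sketch below. The bus, rule, and target names are made up for illustration, and the custom event bus is assumed to already exist.

```python
# Sketch of the producer -> custom bus -> rule -> target flow described above.
# Bus, rule, and function names are illustrative placeholders; the custom
# event bus is assumed to already exist.
import json
import boto3

events = boto3.client("events")

# A rule on the custom bus that matches chat messages from the shop application.
events.put_rule(
    Name="customer-chat-messages",
    EventBusName="unicorn-gaming-shop",
    EventPattern=json.dumps({
        "source": ["gaming.shop.web"],
        "detail-type": ["CustomerChatMessage"],
    }),
)

# Route matching events to a consumer (here, a Lambda function).
events.put_targets(
    Rule="customer-chat-messages",
    EventBusName="unicorn-gaming-shop",
    Targets=[{
        "Id": "chat-handler",
        "Arn": "arn:aws:lambda:us-east-1:123456789012:function:chat-handler",
    }],
)
```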

Thumbnail 260

Now I've been coupling that with using the strangler fig architecture pattern. The strangler fig architecture pattern is where I have a legacy system up here and I'm building net new capability in that system off to the side of it. I'm building that using EventBridge here to handle particular transactions that I have here that were business features that Marin came up with before, which is introducing an event-based chat capability to our customer assistants who could answer questions asynchronously with customers that we have who are purchasers of games through our Unicorn Gaming Shop.

Thumbnail 300

And it works well, but I'm concerned about how that's going to scale with the amount of growth that we're going to be having here based upon what Marin's telling me with 50 times more users.

Thumbnail 320

That's going to be overwhelming to our customer service assistant. And how am I going to maintain the operational observability and everything that's going on with all those logs being produced by all these new things that are happening? I could use my system as it is right now. Technically it'll work. It's serverless, it's event-based, and all the messages flowing through there are asynchronous. However, it only scales to the customer assistant. The customer assistant has to do the research to answer the questions that come in from the customer, and it's slow. It's slow because the customer assistant is doing a lot of work here, and they're going to get overwhelmed.

Thumbnail 360

I've been doing a lot of research, and I've learned that I can introduce AI to help me out here. And in the AWS world, AI really comes in what we call Amazon Bedrock. Bedrock is our service that provides API access to both AWS first-party models as well as third-party models from the leading providers. What I like about this is with this API layer, I can integrate that into my system without binding myself to any particular model, and then I can learn how my system is providing the answers that I want to my customer on the far end, and I can change the models as I want over time. I can also fine-tune my models to even get better quality in terms of the results that I'm getting out there. So I'm really interested in how I'm going to put this into my system to really start to drive the results that I want with the customer that I have.

Thumbnail 420

So I built a little chat that I can use, a research assistant chat for my customer assistants. My architecture is still serverless, it's still event-based, and it's still asynchronous. Now I have research augmentation here through the integration of Bedrock in there for that customer service, the customer assistance person, and it is faster than it was before. However, it's still not fast, and it still only scales to the customer assistant.

Thumbnail 450

Understanding Agents and Amazon Bedrock AgentCore: Autonomous Decision-Making at Scale

In doing more research, I got even more information about this idea of agents. And what really intrigues me about the agents is they are autonomous systems that can understand requests, make decisions, and act autonomously to do tasks. This sounds like a great way to augment that customer assistant and scale to the business that I have. And when I think about the agents, they come in a couple of different styles. First, I can be thinking about agents that do department-level tasks, which are things that are going to require data that's very near, typically things that aren't going to be terribly complicated, and maybe the security is not that strong in that area. Organization agents will work across the boundaries of my systems, maybe across the boundaries of my lines of business inside my company. Usually greater access to data, security is going to start to become a little bit more important here in terms of who can access what. And then external agents, which are going to be working outside the boundaries of my application, outside the boundaries of my company, interacting with third-party APIs and third-party data. I absolutely need a lot of control in what's happening in there.

Thumbnail 530

And as I think about this, I think about how am I going to manage all of this in this world and how am I going to run it. And that's where Amazon Bedrock AgentCore comes in, and it provides three things for me. First, it provides the tools and the memory that my agent needs, and the memory is really important because a lot of times when you're doing prototyping with agents, they're very short-term memory, meaning you give them a task, they do it, they forget about it when they're done, you ask them a subsequent question and they've got to go redo everything they did the first time. With this, there's memory built into this through AgentCore of that agent, so when you ask subsequent questions, they remember what was there before and you can chain everything together. Second, it provides the runtime environment and the identity management of the agent so that I can then control who can access that agent and what that agent is allowed to do. This is important to me because that agent that I have is going to be producing recommendations, is going to be interacting with a large set of data, and is going to be responding back externally.

And third, the observability. There's going to be a lot going on in the system as it scales, and it's going to be producing a lot of decisions. I need to have the observability not only into the logs that are getting produced from the transactions that are happening here, but I also need to be able to audit and observe

Thumbnail 620

how that agent made the decisions that it did so that I can train the model even further to make the decisions even more in line with what I want going forward. Now, as I think about bringing this all together in my application, there are two real patterns that I'm focusing on here with my existing application and how I would be integrating and incorporating agents for the decision making that I want to have in here. This is where everything comes together with EventBridge and the agent.

Pattern One: EventBridge Triggering Agents for Game Recommendations

First, the first pattern is EventBridge triggering agents. In this case, events are coming in from any number of sources. It could be S3, it could be a schedule, it could be Lambda. In my case, it's going to be this web application that I have that has a chat built into it. The messages come through as events into EventBridge, and EventBridge is going to invoke Bedrock to go do something. In the case that I have here, it's going to be to go do some research based upon what the customer is asking for and provide back a recommendation on what could serve the need of that particular customer.
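A minimal sketch of this first pattern: the Lambda target receives the chat event from EventBridge and asks a Bedrock model for a recommendation. The event shape and model ID are assumptions, and the talk's demo routes to an AgentCore-hosted agent rather than calling a model directly.

```python
# Pattern 1 sketch: EventBridge rule -> Lambda -> Bedrock model.
# The event payload shape and model ID are assumptions; the demo in the talk
# routes to an AgentCore-hosted agent instead of a bare model call.
import boto3

bedrock = boto3.client("bedrock-runtime")

def handler(event, context):
    message = event["detail"]["message"]  # assumed event payload shape

    response = bedrock.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",  # placeholder model ID
        messages=[{
            "role": "user",
            "content": [{"text": f"Recommend games for this customer request: {message}"}],
        }],
    )
    recommendation = response["output"]["message"]["content"][0]["text"]
    return {"recommendation": recommendation}
```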

Thumbnail 740

The second one is agent-triggered EventBridge events. This is kind of the opposite of it. This starts by the agent doing something, doing some research, making a decision, looking at something, and then it produces an event that gets put out onto EventBridge that can then work in a fan-out kind of pattern to say anyone who wants to do something based upon this event can now go do something based upon this event. Together, these two patterns are going to help me greatly in how I need to both handle the concerns that I have around producing this game recommendation feature that I want to have from my customer assistant, which is really going to be the first pattern, and then how am I going to handle all this operational observability that I have and maintaining the site reliability with everything that's going to be going on in my system.
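And the second pattern, inverted: whatever tool the agent calls at the end of its reasoning simply publishes an event, and EventBridge fans it out to any rule that cares. Source, detail-type, and bus names are invented for the sketch.

```python
# Pattern 2 sketch: the agent's final step emits an event that EventBridge
# fans out to downstream consumers. Names are illustrative placeholders.
import json
import boto3

events = boto3.client("events")

def publish_finding(severity: str, summary: str) -> None:
    """Tool the agent can call once it has made a decision."""
    events.put_events(Entries=[{
        "EventBusName": "unicorn-gaming-shop",
        "Source": "sre.agent",
        "DetailType": "AgentFinding",
        "Detail": json.dumps({"severity": severity, "summary": summary}),
    }])
```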

So first, agent-powered game recommendation. I have my system as it was before, and what I'm doing here is I'm introducing the agent that I built, which is going to act on the information that it receives from the event, which is the chat that the customer is having to say I'm interested in new types of games, I like these kinds of things, what would you recommend for me? The agent is going to get that, and the agent is going to do a couple things first before it even does the research. You can have the agent work to determine the sentiment of this person. If they're positive or maybe they're neutral, you probably don't need to have a lot of touch with them. You can have the agent kind of offline handle that themselves.

But maybe the person's calling to say your games stink, I want my money back, I don't like what's going on here. Here you can have the agent make a decision and say, I understand the sentiment, this person probably needs to go talk to another person, I'm going to route that to my customer service agent. However, we build good games, and we think that's going to be a small amount of the transactions. So when the event comes in, the agent can first determine what's the sentiment, who should I be routing this to? When it decides that it should be handling it, it can go do that research, gather the information, and provide the recommendations back out to my customer.
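One way to make that triage step concrete, sketched as a direct model call; in the demo this logic lives in the agent's instructions file rather than in explicit code, and the model ID is a placeholder.

```python
# Sketch of the sentiment-based routing decision described above. In the demo
# this is expressed in the agent's instruction prompt; it is written out here
# so the control flow is visible. Model ID is a placeholder.
import boto3

bedrock = boto3.client("bedrock-runtime")

def route_message(message: str) -> str:
    response = bedrock.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",  # placeholder
        messages=[{
            "role": "user",
            "content": [{"text": "Classify the sentiment of this customer message as "
                                 "POSITIVE, NEUTRAL, or NEGATIVE. Reply with one word.\n\n"
                                 + message}],
        }],
    )
    sentiment = response["output"]["message"]["content"][0]["text"].strip().upper()
    # Negative sentiment goes to a human assistant; everything else stays with the agent.
    return "customer-assistant" if "NEGATIVE" in sentiment else "agent"
```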

Now remember, because this is also built using AgentCore, which has the first thing which I said was the memory, there can be a chain set of conversation going back and forth here. And the agent's going to be consistently remembering what was before and building all the responses based upon what it knows and not having to redo all that work that it did the first time. One, that saves time. Two, it saves money because the agent's not incurring more cycles to go reproduce stuff that it already did analysis on. So this becomes a way that I can introduce that first pattern, that EventBridge triggering agents pattern, to solve for this problem that I have about building and scaling game recommendations for my customer. That solves one of my problems.

Thumbnail 880

Pattern Two: Agent-Triggered EventBridge for Site Reliability Engineering

My second problem is site reliability engineering. So I have my system and it's all instrumented with CloudWatch, and now that my scale's growing 50 times, I'm producing a lot of logs, and my engineer that I have, remember I've got a limited team, limited budget, he's going to become overwhelmed with having to understand what's happening in there and look for things that need to be dug into and where maybe something is trending badly, something needs to be fixed, something needs to be done.

So the CloudWatch logs are growing exponentially. As good as I build this system, guess what, errors occur. And sometimes they're minor, and sometimes they're major. Either way, they need to be looked at.

Thumbnail 930

Here using CloudWatch, I can set up CloudWatch alarms based upon particular things that could be happening erroneously in my system. Based upon those CloudWatch alarms, when something goes into an alarm, I can invoke my agent. I can invoke my SRE agent. That agent can do a couple things. First, it's going to be able to understand what's happening with that particular alarm, and it also understands the architecture of my system and the implementation of that system to say what could that possibly be the root cause of.

It can then determine the severity of that. Hopefully, most of my issues are low severity type of things, and it can be a quick fix kind of thing, but some might be high severity. And similar to what we had in the first pattern, it can make the decision that I want high severity ones to get dropped as an SNS message that my engineer is going to pick up, and hopefully there should be a small amount of those things. But for the ones that are low severity or medium severity, I want the agent to go fix it themselves, and maybe I can build a little something there that when I have that, the agent can go, you know, maybe redeploy some code for me somewhere else, maybe shut down a failing server.

It can do the things in my system that my engineer used to have to do, but now they can do it fully autonomously. Now this thing can scale out very large to that 50 times volume I want without overwhelming that engineer, and it'll do it tirelessly and it'll do it repetitively. So here's how I can introduce that agent triggered EventBridge pattern to solve for that problem. That combination of the agents in the system running on AgentCore, that memory and the runtime have allowed me to both scale to meet the need of the business that I have as well as scale to that operational support that I need to have, and it's providing wonderful results.
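A compressed sketch of the branch the SRE agent takes once severity has been determined from the alarm and its logs. The SNS topic, cluster, and service names are placeholders, and the remediation shown (forcing a new deployment) is just one example of the kind of action described.

```python
# Sketch of the triage branch described above: high severity escalates to a
# human via SNS, lower severities trigger an automated remediation.
# Topic ARN, cluster, and service names are placeholders.
import boto3

sns = boto3.client("sns")
ecs = boto3.client("ecs")

def act_on_alarm(severity: str, alarm_name: str, diagnosis: str) -> None:
    if severity == "HIGH":
        sns.publish(
            TopicArn="arn:aws:sns:us-east-1:123456789012:sre-escalations",
            Subject=f"[HIGH] {alarm_name}",
            Message=diagnosis,
        )
    else:
        # Example auto-remediation: roll the affected service's tasks.
        ecs.update_service(
            cluster="gaming-shop",
            service="game-catalog",
            forceNewDeployment=True,
        )
```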

Demo Part One: Customer-Facing Agent with Sentiment Analysis Using AWS Amplify

Now it's great to hear about all this stuff, but it's even better to see a demo of how this all gets built. And with that, I will hand it over to Marin for the demo. OK, I got it from the first time.

Thumbnail 1060

So as Jeff explained, and you can see this is a recording. There is a reason why this is recording. We did not trust the demo gods that everything would work on the stage, so just to make it safe and secure that it's actually going to run and show everything we want to show, we recorded the demo session. So I will go through it. It's about 22 minutes long, so please bear with me while we go through this.

We will show two cases. So as Jeff explained, the first case will be helping our customer agents to help our customers look for similar games that they want to play, to offer them some similar games that they already played, and to enable customers to interact with the agents. In this case, an AI agent that will give them some proposals, but an additional layer on top of that will be that they will be able to provide a sentiment or some kind of feeling. Do they like something or don't they like something, so the agent can then based on that, rephrase the proposal and give them maybe a different list of games to play with.

The second one, the second demo, which is a bit longer one because it has more components, will be the one including our engineer in the backend where we will see mostly stuff in the console and in the code that supports our engineer to resolve cases much more faster. So the logic is again, as explained, that we will have an alert triggered by a fake Lambda trigger which will go then to EventBridge, and then that alert will follow through AgentCore which will then go through, seek additional information about the problem, and then decide is that problem low, medium, or high severity. And based again on the sentiment of the problem, we'll send it either to some auto remediation, which can be restart the service, start something, stop something, run a code or whatever, or if it decides it's a high level problem or issue, send an email to the engineer to escalate further into the case.

Thumbnail 1200

So we'll go through this together, hopefully. Let's see. Can I hide this? Yes, so this is our folder with the project.

Thumbnail 1210

Let me show you the project folder structure. You can see multiple different files that we'll be using, including instructions and other configuration files. We'll go through those files later to show what they contain. Here we're activating the Python environment because we're using Python to communicate with the agent. We also need to install the requirements. As you can see, most of these requirements are already satisfied because we've run this environment previously.

Thumbnail 1240

Thumbnail 1280

Thumbnail 1290

Now I'll run the application itself. Let me stop briefly here to explain. The application is running AWS Amplify front end to be used as a user interface for the customer to interact with. I should mention that Amazon Bedrock AgentCore and Amplify are running locally. They're running locally but can run anywhere, of course. In this case, it's running locally. As you can see, the system is loading the instructions file. The instruction file contains prompt engineering for the agent to understand how to behave. Interestingly enough, as we'll see later, it also contains additional data about sentiment, specifically how to determine what type of sentiment the customer is producing within the system.

Thumbnail 1300

Thumbnail 1310

Thumbnail 1320

Thumbnail 1340

Let's deploy the front end. Amplify should launch the front end itself. The front end is based on React, though you can use whatever framework you want. This is our front end. It doesn't look particularly polished, but it serves our demonstration purposes. Our assistant is called Gus. Let me interact with Gus by asking, "Hi Gus, how are you?" He will respond with information about what he can do for us. Now I'm acting as a customer interacting with the agent, giving him information about what I like or what I used to play, specifically what type of games I used to play. I'm typing slowly, and there's also an intentional spelling mistake to make it look natural. However, Gus is very smart and will easily recognize what I actually need or what I'm asking.

Thumbnail 1360

This query goes to the database in the backend, where we'll see different tools that have been used in the code to help Gus understand what I'm actually asking. You'll notice that in the first run of this iteration, Gus didn't find anything initially. He didn't understand at first, but then he reiterated based on keywords, the database table structure, and what I was saying in my question. He went back, reran the code, recreated the SQL query needed to retrieve that information from the database, and then provided me with recommendations. As you can see, there are multiple different options that he gave me.

Thumbnail 1420

Thumbnail 1460

Now I'll give him some sentiment feedback. I don't like Resident Evil because I scare easily, so that's not something I enjoy. I want to tell him not to propose anything similar to Resident Evil. There's another spelling error here, but again, Gus is very smart and will easily understand that error. Here we can see what's happening in my interaction with the agent. You can see what I'm asking, how he's preparing the memory, and at the end, sentiment analysis is enabled. Interestingly, my first question doesn't have any sentiment. I'm just saying what I'm into, that I like adventure games and this type of game. There's no sentiment related to it. Sentiment comes when I say I don't like horror games or ones where I scare easily. You can see sentiment neutral, no sentiment here because this is my first question.

Thumbnail 1470

Thumbnail 1490

Thumbnail 1500

The agent will start to run the query. Interestingly, by using the tools, the agent is generating the SQL query needed to ask the backend database for what I want. There is no SQL query visible in the code. There's no hardcoded SQL query that tells Gus to do this. There are just instructions about what I want Gus to do. There's no embedded code in the query. Records found zero. You can see it created a query, select console and genre from the table and so on. Records found zero, it didn't find anything initially.

Thumbnail 1510

Thumbnail 1520

Thumbnail 1530

Thumbnail 1540

Thumbnail 1550

Now, the agent will reiterate and rerun the code. It will add additional genres like action or adventure, and then rerun the query to find 150 records this time. The agent will run it again, considering that this might not be enough to show, and it will find some additional records at the end. Not there, but in the next one. So you can see additional records returned, five in total. The agent doesn't just add records but basically reruns again and shortens the list. Now we have sentiment coming into play. When I mentioned that I'm scared of Resident Evil games, this sentiment is now being provided to the agent, telling it that this sentiment is negative or bad, I would put it that way. So the agent needs to avoid giving me any results containing similar games next time.

Thumbnail 1580

Thumbnail 1600

Thumbnail 1610

If you go into the code itself, you can see here those are the instructions that I mentioned. Within those instructions are guidelines on how the agent needs to behave, what the agent needs to do, how to communicate with me, and how to give me back the information that I need from the agent. On the bottom itself, you will also see sentiment monitoring. So as you can see, the agent continuously monitors conversations and then decides what type of sentiment it is. Is it a good sentiment or a bad sentiment? I could have said, for example, no, I like Resident Evil, give me more of that, or I don't like it as I did, don't give me more of that.

Thumbnail 1640

Thumbnail 1650

Thumbnail 1660

Thumbnail 1670

In the application file itself, which is Python code, we have lots of tools defined. We have tools that will go to the database and extract information, tools that help the agent generate the SQL query that will then bring me back the results. We also have other tools that will help with sentiment generation. Unfortunately, I cannot go through the whole code itself, and it's also very, very small here. We have different instructions again, as we just saw. Let's go to some definitions that basically define how the agent responds back to me. Again, in all of this code, you will not see any query. There is no SQL query saying what needs to be done, because then it wouldn't be agent-based. It would be just a SQL query based on some keywords which I provided in the entry point for the agent.
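The point that there is no hard-coded SQL is worth making concrete: the code exposes a tool that executes whatever query the agent composes, with guardrails around it. The demo's actual database backend and agent framework aren't shown, so the following stand-in uses sqlite3 and an assumed function shape.

```python
# Stand-in for the "query the games database" tool described above. The agent
# writes the SQL itself; this tool only executes it, restricted to read-only
# SELECT statements. sqlite3 and the path are placeholders for the demo backend.
import sqlite3

DB_PATH = "games.db"  # placeholder path

def run_query(sql: str, limit: int = 50) -> list:
    """Tool exposed to the agent: run a read-only query and return rows."""
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("Only SELECT statements are allowed")
    with sqlite3.connect(DB_PATH) as conn:
        cursor = conn.execute(sql)
        return cursor.fetchmany(limit)
```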

Thumbnail 1680

Thumbnail 1690

Thumbnail 1700

Thumbnail 1710

So basically, it is based on invocation, totally asynchronous as mentioned, which is the idea. I don't need to be waiting for something to happen. I need it to happen immediately, back and forth. Okay, session ID, user ID, and so on and so forth. Everything, of course, for this to work, the LLM is running in AWS, so you cannot run your LLM locally as you are currently running this agent core and Amplify. So you need to have your credentials. You need to have access to your backend where your database will be and the agent logic or LLM that you're actually questioning.

Thumbnail 1720

Thumbnail 1730

Thumbnail 1740

Thumbnail 1750

Sentiment analysis, so you can see it can analyze sentiment. It always analyzes sentiment as you enter something and interact with the agent itself. We can go to definitions on how to analyze sentiment. So how to analyze sentiment and how to output back to EventBridge. Once you have this in EventBridge, you can then do whatever you want, basically. But in this case, we are just providing feedback back to the customer on how to create it. I'm so sorry, this is like a silent session and this is my first time doing it. I used to have interaction with people asking questions, so it's weird having you like aliens with those red ones.

Thumbnail 1770

Thumbnail 1790

Thumbnail 1800

Yeah, cool. So the code basically shows the pattern and code that helps out in this interaction and defines how all this in the backend works. Again, the most important thing is to understand this is the EventBridge part where we basically send all that information, sentiment, confidence, and summary back to the user to understand that part. Okay, so I will stop here for a second.

The second part, as explained, builds on the first part. What can you do with this? There are so many agents that are pre-built, or you can just reuse them. This one is interesting because it's very easy to deploy. You can run it, test it locally, and then push it to AWS and run it there if you wish. So this is just a concept that proves the concept of basically asynchronous communication between the customer and the agent in the backend and reusing existing LLMs to provide additional data. This is super easy code, it's super short, it's not long, and it can be very easily tested. And this is with user interaction when I'm prompting questions and asking give me this, give me that, and so on and so forth, and this is all based of course on your existing knowledge that you possess in the backend of the database of all the games that you have.

Thumbnail 1880

Thumbnail 1900

Thumbnail 1910

Thumbnail 1920

Demo Part Two: Autonomous Alert Management with CloudWatch and AgentCore Runtime

The second part includes a bit more components, and the second part is, I would say, not as interactive because the idea is that we have a permanent watcher or alert-based system. This time it is through CloudWatch where we are basically generating an alert and then sending that alert through EventBridge to Amazon Bedrock AgentCore, which then decides the sentiment of that alert and then does some action, and this is something that is totally autonomous versus the first one where I need to input something, ask back and forth, and interact. Here we do have the alert itself. As you can see, the alert was set up in this case, maybe not for production, but every one event with one data point then raises the alert, and then I have a rule in EventBridge basically that is watching for those alerts, any change of state, and it has a target of Amazon Bedrock AgentCore to do something with that alert. So we'll see later on what that AgentCore actually does.
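The rule described here, watching for any change of alarm state, matches the standard CloudWatch event on the default bus. A sketch follows; the target is a placeholder Lambda standing in for the demo's path into AgentCore.

```python
# Sketch of the rule described above: match CloudWatch alarm state changes on
# the default bus and forward them to the agent's entry point (a placeholder
# Lambda here; the demo routes through a Lambda in front of AgentCore).
import json
import boto3

events = boto3.client("events")

events.put_rule(
    Name="alarm-state-changes",
    EventPattern=json.dumps({
        "source": ["aws.cloudwatch"],
        "detail-type": ["CloudWatch Alarm State Change"],
    }),
)
events.put_targets(
    Rule="alarm-state-changes",
    Targets=[{
        "Id": "sre-agent-entry",
        "Arn": "arn:aws:lambda:us-east-1:123456789012:function:sre-agent-entry",
    }],
)
```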

Thumbnail 1930

Thumbnail 1940

Thumbnail 1960

So the important thing is that I'm not just sending it, there is an alert. I'm sending it, okay, there is an alert. You need to go back to the CloudWatch Logs. You need to pick up some additional information about that alert, and you will see later in the code we have again additional tools that do that and then do something with it. So this is Amazon Bedrock AgentCore. I will stop quickly just here because Jeff also mentioned some of these. We are using Agent Runtime in this case, but there are additional options that you have like Built-in Tools and Gateways. We did some upgrades recently on Gateways in this tool. Memory also mentioned, Identity is super important, and we will see later on Observability also super important for this because you don't want this running wild, not knowing what is actually happening, how many invocations do you have, how many times your Lambda fired because something happened. You want to fine tune it either because of the resource usage, cost usage, or just optimization of the code. So we are currently using here Agent Runtime for this particular AgentCore.

Thumbnail 2040

So again, alert. I have a fake Lambda you will see later that will trigger a couple of alerts, again, nothing to use in production, just for demo purposes. That will go to EventBridge. EventBridge will send that to AgentCore. AgentCore has instructions. You will see again instructions files, a bit different one, and this one is not called something simple, it has a much more ominous name, but it will then look, oh, there is an alert. Let me go back to CloudWatch. Let me collect some additional information. Let me determine the sentiment. And let me then decide what to do with that information. So what this is doing is actually relieving our engineers that they don't have to be there for each and every event, but they can basically just get the top level, highest priority events and then act on them.

Thumbnail 2050

Thumbnail 2080

Thumbnail 2100

Okay, so this is the deep dive. Maybe one thing to notice here is that it's Version 7. I have run this multiple times, so once we come to the end and run it again, when we go through all the files, it will be Agent Version 9, I think, because we have run it multiple times. So there is versioning, of course, within the agent. We can see the invocation code itself; in this case, some parts are blurred because they contain sensitive information. It's very, very simple code. Multiple versions, currently 7. Once we actually run it again it will be 9. Additional information is about the endpoint itself. You can add tags if you want, of course, when it was created, status ready, and so on and so forth.

Thumbnail 2110

Thumbnail 2130

Thumbnail 2140

This part is super important: observability. This is the thing that you definitely want to have. You want to be able to have insights into the processes and what is actually happening inside. You can of course change the range when something is happening. Here I'm changing it for I think four weeks because I ran it for four weeks before. You can see how many endpoints, how many sessions, traces, error rates, and so on. To see this information, session invocations, invocations, and so on, you need to enable this in Bedrock.

Thumbnail 2160

Thumbnail 2170

Thumbnail 2180

There is a checkbox, basically a switch, that you turn on that then enables invocations. Then inside, we'll just go now into that part, into Bedrock, where you can see basically what types of invocations you want to log and where do you want to store that information. So going to Amazon Bedrock, in my case, it will be turned on because I already turned it on. So settings, and then here, let me stop quickly. Did I stop? No, I did not. Here, stopped.

So this needs to be turned on: model invocation logging. If you don't have that turned on, you will not see that data over there, which is not good. You want to have that data. What type do you want to include into the log? I included all text, image, embedding, and video. Where do you want to store that? S3, CloudWatch Logs only. I will just store it in CloudWatch Logs in this iteration.
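The console switch described here has an API equivalent. A rough sketch (log group and role ARN are placeholders; the exact key names are best checked against the current boto3 documentation):

```python
# Sketch of enabling Bedrock model invocation logging to CloudWatch Logs,
# matching the console switch described above. Log group and role ARN are
# placeholders; verify key names against current boto3 documentation.
import boto3

bedrock = boto3.client("bedrock")

bedrock.put_model_invocation_logging_configuration(
    loggingConfig={
        "cloudWatchConfig": {
            "logGroupName": "/bedrock/model-invocations",
            "roleArn": "arn:aws:iam::123456789012:role/bedrock-logging",
        },
        "textDataDeliveryEnabled": True,
        "imageDataDeliveryEnabled": True,
        "embeddingDataDeliveryEnabled": True,
    },
)
```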

Thumbnail 2220

Thumbnail 2230

Let's go to CloudFormation to see what was deployed. There were multiple resources deployed through CloudFormation for this. I will stop quickly. When it loads, let me hide this. I hate how it doesn't hide. Come on. Forget it.

So we have Lambda, EventBridge, and AgentCore. That's when it comes to EventBridge. EventBridge invokes Lambda that then talks to AgentCore and tells it there is an event to investigate. Lambda error generator, that's the fake generator for errors where I will click several times just to generate some errors. Error alarm itself, agent SNS topic which will be used to send an email to an engineer in case the severity or sentiment of the error is high. Then we have the role itself and some metadata on the bottom.

Thumbnail 2320

All of these have adequate permissions to access all these services, so you can really scope the permissions for each of these to have access only to what it needs to have access to. Now let's go to the project file. It is similar to the one that we already saw, so it's similar to the projects that we already saw. You see requirements, you see instructions. The instructions are of course different. It's not the same thing.

I'm instructing in this case, I'm instructing the agent to, when it receives an alert, when it receives the alert, to go back to CloudWatch to collect additional data about that alert, and then based on that to decide what level of sentiment it is. Also, I am using several, you'll see several different tools to get additional information about the tool and about the alerts, and those tools are defined in the code itself. So I want to get log group name query. I want to create CloudWatch Insights query, when it started, when it ended, and so on and so forth.

Thumbnail 2390

Thumbnail 2400

Then on the end, you'll see, I also instructed the agent to retry, to retry at least two times if it can do it. So get more information tool, notify high severity problem tool, those are all the tools that an agent uses once it gets more information and determines the sentiment of the problem itself. Everything in the instructions file, so the better you do this, the better results you will have. Again, some error handling. You should always have some error handling. It's not the best or smartest one, but at least it's there for the tool to know, for the agent to know what needs to be done.
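A sketch of what such a "get more information" tool can look like, using a CloudWatch Logs Insights query around the time of the alarm. The log group name, query text, and function signature are assumptions; the demo's exact tool definitions aren't shown.

```python
# Stand-in for the "collect additional information about the alarm" tool:
# run a CloudWatch Logs Insights query for the alarm's time window.
# Log group and query text are placeholders.
import time
import boto3

logs = boto3.client("logs")

def get_more_information(log_group: str, start_time: int, end_time: int) -> list:
    query_id = logs.start_query(
        logGroupName=log_group,
        startTime=start_time,
        endTime=end_time,
        queryString="fields @timestamp, @message | filter @message like /ERROR/ "
                    "| sort @timestamp desc | limit 20",
    )["queryId"]

    # Insights queries are asynchronous, so poll until the query finishes.
    while True:
        result = logs.get_query_results(queryId=query_id)
        if result["status"] in ("Complete", "Failed", "Cancelled"):
            return result["results"]
        time.sleep(1)
```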

Thumbnail 2410

Again, this is a Python application. This application basically defines how the agent starts once it's initiated, what tools are available, what instructions are there, and so on and so forth, so it preloads everything into the service. I've been using Amazon Bedrock AgentCore Runtime for this, which basically in the backend builds a container, deploys a Docker container to ECR, then deploys the agent in it and runs it through that infrastructure.

Thumbnail 2440

Thumbnail 2450

Thumbnail 2460

Thumbnail 2470

The tools themselves are not invoked immediately, of course, when the tool is deployed or the agent is deployed. They are invoked after something happens. So there are two parts to this, you may say. The first part is how to start the agent, what are the instructions, and what tools are available. The second part is when something happens in an event-driven manner, as we've been talking about all day, then what to do with it and what tools to use to construct the proper query. This includes collecting data from the logs and then basically deciding what level of severity the issue actually is.

Thumbnail 2490

Thumbnail 2500

Thumbnail 2510

All of this happens before we've even seen our engineer yet. This is standing in the background of the service, listening for those alerts and then triggering when needed. So these are the tools that I have mentioned, either tools for sentiment analysis, tools for generating inquiries, or tools for getting additional data on the log files, because I just cannot decide sentiment, or the agent cannot decide the sentiment based just on there being an alert. It needs to go back, collect the alert information, and then basically decide the sentiment and what will happen next. So those tools have arguments like log group name, query, start time, and end time.

Thumbnail 2530

Thumbnail 2550

Thumbnail 2560

Of course, when that happens, it triggers a Lambda function, which will then go to the agent and determine the next possible steps. This could involve ECS service, Lambda, RDS, and so on and so forth. So again, it can send a message, it can restart a service, it can run code, it can be whatever is needed. The question, of course, here is how much automation do you want to give it? Do you need to have somebody in the middle, like a person that will go through this and validate if something is wrong? How much power do you want to give to these kinds of services while they're restarting, rebooting, changing, modifying, or sending some commands to have something done?

Thumbnail 2630

So here we are deploying the agent core from the terminal. It will generate the Docker container and run it, using all the instructions in the application Python code and collecting all the tools that it needs to basically run the agent. There are two Lambda functions, as I said. We have one Lambda that is like a dummy Lambda for generating errors. That's the one on the top, the Lambda Error Generator, and then the second Lambda is basically to act as a middleman between EventBridge and the agent core itself, which then defines again all those tools and how to access that information.

Thumbnail 2650

Thumbnail 2670

This is just a simple error generator which just triggers a couple of errors once we go back into the console to see how that gets triggered and the result, which I'm sorry to say is not super spectacular, but it will show you what actually happens. So let's go back to Agent Runtime. Let's see what happened. As you can see, Version 9, so there was one version in between which we didn't see that was run for test purposes. So this is now Version 9.

Thumbnail 2690

Thumbnail 2700

If we go to Lambda functions, you will see those two Lambda functions that we are basically talking about. This one is to communicate with the agent core. No, this one is to generate events, right? Yes, for testing. So this one will generate, if I click it multiple times, it will just send some fake events.

Thumbnail 2720

Thumbnail 2730

Thumbnail 2740

Thumbnail 2750

We expected errors against demo gods, so we even recorded one. It sends to CloudWatch, and you will see a couple of events in CloudWatch. Let's see that part. As you can see, several events were generated from Lambda. In real life, of course, this would be something that happened actually in production. Again, the alarm that is triggered changed state. When the state is changed, it creates an EventBridge event, and then from there we are choosing to send this to AgentCore. That's one of the options where the AgentCore with all those instructions will decide the severity and everything else. But once you're here, you basically can do anything. This is the beauty of EventBridge from previous options that you had.

Thumbnail 2770

Thumbnail 2780

Thumbnail 2790

Thumbnail 2800

Thumbnail 2820

Here again, I see two events happening. Back to Lambda, for the other one, the Lambda that will actually talk to AgentCore. It has instructions to go back to CloudWatch and look for more information on the problem. Again, you will see here that it has been triggered. It's a small trigger, but it's there. You see that this has been triggered. Let's see the logs itself. This is what's happening. Let me stop quickly here. Here you can see the full log of what's actually happening. Lambda errors, and inside here you can even see that it's going back, instructing that it's going back to CloudWatch and basically saying it needs to collect additional information on this error and define the severity of this error.

Thumbnail 2870

Again, there is no query anywhere. There is no secret query. The query and all the queries are generated on the fly based on the information provided from the log. At the end, it's a JSON payload. As you can see, severity was determined to be high. So severity was determined to be high in this case. It could be low, medium, or high. Again, we decide what is low, medium, high. I mean, we decide what action happens if it's low, medium, or high. In this case, it's high, so a high severity notification was sent.

Let me stop here. As I said, not a very spectacular result, but this is what we sent to our Site Reliability Engineer. This is the email he got. You can make it prettier. It doesn't have to go to email. It can go to a ticketing system with a high priority assigned to somebody and so on and so forth. But this is basically the idea. The general idea, as Jeff explained, is to actually automate part of this in a super easy way with the technologies that were already there for most applications in event-driven architecture. We are just adding on top of it some smarts in a way that lets us reuse existing LLMs but give them on top some of our own knowledge from the logs that we're collecting, and then utilize that to offload work from our people, from engineers that need to resolve those issues, of course, at the end for the benefit of the customer and the application.

Thumbnail 2990

Conclusion: Enhancing Event-Driven Applications Without Starting from Scratch

I hope I managed to bring this a bit closer for all of you, to show what is possible and how to enhance your existing application. You don't need to build everything from the ground up. Jeff and I will close up now. Thank you. That was a good demo. Hopefully it demonstrated how, by starting with building your architecture in an event-based manner using EventBridge and then incorporating agents into that process, you can scale to deal with characteristics that you might have previously thought were beyond your business capabilities. With that, Marin and myself, we're happy with what we built. Our customers are happy, our engineers are happy, and our customer assistants are happy.

Thumbnail 3000

With that, we thank you very much for your time today. We thank you for coming to re:Invent, and we're going to be hanging around a little bit longer if you have any questions for us. Thank you very much.

; This article is entirely auto-generated using Amazon Bedrock.

AWS re:Invent 2025 - Launch web applications in seconds with Amazon ECS [Butterfly] (CNS379)

2025-12-09 03:44:19

🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of the original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.

Overview

📖 AWS re:Invent 2025 - Launch web applications in seconds with Amazon ECS Butterfly

In this video, Malcolm Featonby and Thomas demonstrate ECS Express Mode, a new feature that simplifies container orchestration on Amazon ECS. They show how Express Mode reduces configuration complexity from extensive CloudFormation templates to just three required parameters: container image, task execution role, and infrastructure role. Thomas provides a live demo of creating, updating, and deleting services through CLI and console, showcasing automatic provisioning of load balancers, security groups, auto-scaling policies, and canary deployments. The feature manages up to 25 services per load balancer, implements zero-downtime updates, and orchestrates resource lifecycle automatically while maintaining full ECS flexibility. Express Mode is available now at no additional charge across all AWS regions where ECS operates.


; This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part

Thumbnail 0

Introduction to ECS Express Mode: Simplifying Container Orchestration

Good afternoon, everybody. Thank you so much for coming out and seeing us. I hope you've been having a fantastic re:Invent. We're almost at the end of it, but we're loving the enthusiasm, so that's amazing. Thank you so much. My name is Malcolm Featonby. I'm a Senior Principal Engineer with the Serverless and Containers Organization, and with me on stage, I have the privilege of having Thomas. Thomas, do you want to introduce yourself?

Absolutely, thank you, Malcolm. My name is Thomas. I've been in the container space at AWS for close to six years now, across a few different services: first Elastic Beanstalk, then App Runner, and this last year I had the great privilege of working on ECS Express Mode. So we're very excited to share with you today just how simple Express Mode makes container orchestration on ECS.

Thumbnail 50

All right, so we're going to jump right in. One of the things that we wanted to make clear and just start off with is that this is ECS. We're talking about ECS Express, but ECS Express is a feature of ECS. And ECS is the product that, hopefully, you all have been using; it's the service that you've come to know and love. It's the service that we launched eleven years ago, a service that currently provisions around three billion tasks a week in all thirty-eight AWS regions in which it is hosted, so it's everywhere. And what we found is that, of the customers who come to AWS for the first time with containerized workloads, sixty-five percent start with ECS.

So really, our goal here is to make it clear that many customers, the logos of which are up there, are using this product. We launched over eighteen million tasks during Prime Day 2025. So it's a very reliable, trusted, established service. And the reason we wanted to make that clear is because some of the magic we're going to show you may make it look like it's different, but it's not. It's the ECS you've come to know and love.

Thumbnail 120

One of the things I find I love to do as a developer is solve the problem. And solving the problem is really about making sure that I can maximize the time I spend building, writing the code for my business logic. I want to offload as much as possible of the undifferentiated work, that heavy lift, that toil that's required in order for me to provision the load balancer, get the service deployed, et cetera. I really want to focus on the code because that's where the gold is. And we want to make sure that when you're using ECS and your developers are using ECS, you're getting that same value.

The whole idea is really to maximize your engineering cycles so that your engineers are focused on delivering value for your customers and value for your business, and to offload as much as possible of that undifferentiated toil, which is required and important, to AWS services. ECS is a prime example of where you can do that.

Thumbnail 180

Now ECS is a very rich ecosystem, and you'll see that if you've been using ECS, you'll see that it has a lot of moving parts. You know, there are task definitions, there's load balancers that you're configuring, there's certificates that you're managing, there's security groups, et cetera, et cetera. There's quite a lot to it, and there's a reason for that. In order for us to be able to make sure that we can meet you where you are and support your workloads, we want to be in a position where we can provide that rich platform for you so that we can support a heterogeneous workload type, effectively any workload that you bring to us.

But importantly, in many cases, certain types of workloads actually don't necessarily require that you do all of this customization, this configuration. And we wanted to offload a lot of that undifferentiated toil, that configuration, and again, get back to allowing your engineers to focus on that problem solving and not have to worry too much about this. This is what a typical web service application would look like if you were configuring it on ECS. All of the attributes that you see there, all of the configurations and content are what would go into comprising what's needed in order to get a web service up and running as an ECS service in your cluster running behind a load balancer.

Thumbnail 260

With ECS Express, it's about making sure that we can get you there as quickly as possible. And so with Express, what you'll see is in fact, we've cut down the number of things that you need to worry about as a developer to the absolute minimum. You need to have a container image because that's going to contain your application. But once you've got a container image and you bring to us a task execution role, which is the permissions that the application needs in order to run, and an infrastructure role which gives us permissions to be able to provision a bunch of that configuration on your behalf, you're kind of good to go, right? Super simple. That's all it takes, and we'll do the rest for you.

Thumbnail 290

Live Demo: Creating and Deploying Services with ECS Express Mode

Now don't take my word for it, right? We have Thomas here who has spent a bunch of time actually thinking about and building this. He is one of the technical leads on the program. So I'm going to hand it over to Thomas so that he can show you this. Thank you very much, Malcolm, and as the gentleman in the audience pointed out,

Thumbnail 320

let me show you just how simple Express Mode makes container orchestration. We'll start off in the CLI, and this is baked into the AWS CLI that you already know and are familiar with. We'll start off with a simple create express gateway service command, and as you see, all we require is that execution role, infrastructure role, your image, and we're already good to go. You get back all the configuration that Express Mode is defaulting and those best practices which are embedded on your behalf for you. We'll go into a little bit more detail on that configuration in a second.
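
For reference, a create call along the lines of what the demo describes might look roughly like the sketch below. The command and flag names are assumptions inferred from the talk rather than confirmed CLI syntax, and the service name, image, and role ARNs are placeholders.

```bash
# Hypothetical sketch only: command and flag names are inferred from the talk,
# not confirmed AWS CLI syntax; the image and ARNs are placeholders.
aws ecs create-express-gateway-service \
  --service-name do-it-live-1 \
  --image public.ecr.aws/nginx/nginx:latest \
  --execution-role-arn arn:aws:iam::123456789012:role/ecsTaskExecutionRole \
  --infrastructure-role-arn arn:aws:iam::123456789012:role/ecsInfrastructureRole
```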

Thumbnail 340

Thumbnail 360

Thumbnail 370

In the meantime, I want to go ahead and kick off our second express gateway service creation. This one will be a little bit more complicated, but you'll still be able to see the orchestration magic. I passed in this new parameter that we're offering, monitor resources, and this is available in the public AWS CLI. No third-party CLI required. You can see and tag along all the resources that Express Mode is provisioning on your behalf. You can see the cluster there that we use for the default, you get back immediately your service, and you can see the target service revision and then all of those downstream resources that we orchestrate and manage on your behalf.

Thumbnail 390

Thumbnail 400

Thumbnail 410

Importantly, you get your ingress path, that URL endpoint where your application will be accessible, and then you see your load balancer, target groups, security groups coming down to auto scaling. You'll see your scalable target is being registered as well as your auto scaling policy. The metric alarm is then used in case of rollbacks during deployment, so we will be able to monitor your metrics for 5XX and 4XX from your load balancer. Then the security group that we manage to ensure that we have least privileged access between your Application Load Balancer and your ECS service as well.

Thumbnail 420

Thumbnail 440

You're able to actually use this really cool monitoring feature during the create experience, update experience, as well as the delete experience. It's really the entire lifecycle of your application management. Now I'll hop out of that, and with those two services created, we'll go into the console here and refresh our cluster. As Malcolm was pointing to, this really is ECS at its heart. All the ECS bread and butter that you know and love, your cluster, your service, all the way down to your task definition is still available for you.

Thumbnail 460

Thumbnail 470

Thumbnail 480

Opening up our two services here, do it live 1 is that simple service we used just with the NGINX and the three required parameters. We can already see on our observability tab those metrics coming through, and this new load balancer metrics where we'll be able to live tail any request that gets sent to the load balancer. Coming to the resources tab, again you can see all those resources that Express Mode is managing on your behalf and follow along as they get provisioned. It looks like this service is very close to being done. Its deployment has nearly succeeded.

Thumbnail 490

Thumbnail 500

Thumbnail 510

Thumbnail 520

We'll take a look at do it live 2, that more complicated second application we created here, and similarly, we can follow along with the resource creation. Now, something interesting you may have noticed. If I open the load balancer here, this takes me to EC2 console. I can check out the load balancer. We have the listener here on 443 for HTTPS resolution, and we have six rules here. We're actually sharing the same load balancer now across six different Express Mode services, and we can scale up to 25 different services at a time that are using the same VPC network configuration for both public and private services. You benefit from the economies of scale where you're only paying for that one load balancer, and we're taking care of all the orchestration and scaling that load balancer and deprovisioning when necessary.

Thumbnail 550

Thumbnail 560

Thumbnail 570

Thumbnail 580

Coming back to here, it looks like our first deployment has succeeded, so our application is now available. Coming to the second one here, it looks like that application is still running. We have our task launched, and we're just wrapping up the deployment now. I'll go ahead and pop on to the URL that's vended by us, and there we go. In a matter of minutes, we have our scalable load balanced web application ready to serve traffic. That's a simple NGINX straightforward one. The second one that we created, which was a little bit more complicated, I actually passed in a task role to it, and that task role will allow the application to fetch live the number of tasks that are being used and served by ECS, so it's making calls to the ECS endpoints, as well as some environment variables.

Thumbnail 620

Thumbnail 630

This allows us to determine how many desired tasks there are, how many running tasks, and how many pending tasks there are. Additionally, we added in a custom scaling policy through that Express Mode API and we said, instead of the default of scaling with CPU, I want to scale by request count. So based on the number of requests, we're going to scale up or scale down the number of tasks that we have. And then we can open up this one here, and there we go. So we have our traffic visualizer running on port 8080 here.

Thumbnail 640

I'll go ahead and start sending requests now. So this is going to monitor the live number of requests that are being sent to this endpoint. We have 50 requests per second, and the auto scaling configuration was set to scale to 500 requests per target per minute. So we see we have one task desired and one task running for now. I'll pull this off to the side and we can come back to this a little bit later.
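
Under the hood, this kind of request-based scaling roughly corresponds to a target tracking policy on the service's desired count using the ALBRequestCountPerTarget metric, which Express Mode configures for you. As a sketch of the equivalent standalone Application Auto Scaling call, with placeholder cluster, service, and load balancer names:

```bash
# Equivalent standalone target-tracking policy; resource names are placeholders.
aws application-autoscaling put-scaling-policy \
  --service-namespace ecs \
  --resource-id service/default/do-it-live-2 \
  --scalable-dimension ecs:service:DesiredCount \
  --policy-name request-count-per-target \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration '{
    "TargetValue": 500.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ALBRequestCountPerTarget",
      "ResourceLabel": "app/my-alb/1234567890abcdef/targetgroup/my-tg/abcdef1234567890"
    }
  }'
```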

Thumbnail 670

Thumbnail 700

Complete Lifecycle Management: Updates, Auto Scaling, and Deletion

Now, it's not just about the creation experience, so it's not just about getting started, it's really as Malcolm was pointing to about that entire lifecycle of your application. So now I want to work through the update experience, and this is where things get really cool. When you think about an update for this simple application, we had NGINX, but let's say we move to that more complicated container image, which I just showed you, the traffic visualizer. We need to change some of the configuration, importantly, the image, and then the container port for this task and for the service itself.

Thumbnail 710

And just like that, in a single command, we're able to update it. But here's the thing: think about what's required behind the scenes to orchestrate this. If you were to update the container port for your service, you're going everywhere from the load balancer, where you're changing the security group egress path, to the target group, where you're changing the container port, to the task definition, where the container port changes again, and then to the security group that's associated with the service itself. So you need to orchestrate four or five different things all at once, and then you need to actually roll out the update in the deployment at the same time, all without any downtime at all. Express Mode takes that pain away from you and orchestrates it on your behalf; with a single command, we can get that update going.
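
As with the create call, the single-command update might look roughly like this sketch; the command and flag names are assumptions based on the talk, not confirmed CLI syntax, and the image is a placeholder.

```bash
# Hypothetical sketch only: command and flag names are inferred from the talk;
# the image URI is a placeholder.
aws ecs update-express-gateway-service \
  --service-name do-it-live-1 \
  --image public.ecr.aws/example/traffic-visualizer:latest \
  --container-port 8080
```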

Thumbnail 750

Thumbnail 760

Thumbnail 780

Down here in the Resources tab, which I showed before, you can not only see the resources that are being provisioned, but also the resources from the prior service revision that are now being deprovisioned. So we can see that we're reusing a lot of the same resources. Things like the metric alarm and scaling policy that stayed the same and that we didn't update are immediately available. Things like the target groups, which now need to be recreated because of the new container ports, will be created, and then we can see the old target groups being deprovisioned as well.

Thumbnail 810

So through this panel, we're able to really monitor those resources at any point in time across multiple different updates for your service and see what is active at any given time and what is going on beneath the Express Mode layer itself. So talking more about what's beneath the Express Mode layer, let's really talk about the flexibility that you have with Express Mode. It's not merely this single API that we have for create, update, and delete. As Malcolm said, you really have the full strength and power of ECS and the roots of it at your disposal.

Thumbnail 840

Thumbnail 850

So for instance, if you want to go in and fine tune some of the parameters in your task definition, you can go ahead and open that up, find your task definition here. Make the changes that you'd like, push another revision, update your service, and then you're able to integrate that back through Express Mode and you can continue to use that API and serve your service through Express Mode. So through that way, you really have the full flexibility of ECS at your disposal. What's going on now is you see for that service that we updated, we have two tasks running, and we can take a look at the Deployments tab to see what's going on there.

Thumbnail 860

Thumbnail 870

So we've launched that new deployment, and in this deployment you can see we're using a canary-based deployment strategy. So we're using state-of-the-art best practices for ECS deployments in order to ensure zero downtime here. We have bake time integrated into the deployment, and then we have monitoring as well, with an alarm watching the number of 4XX and 5XX responses being served by your application in case we need to roll back to the prior service revision, and all of that will be handled on your behalf.

So for this canary strategy, it will launch that second task, and then 5% of your traffic from your Application Load Balancer will be sent to that new task, while 95% remains behind. We'll bake for some time, and then we'll switch to full 100%.

If anything goes wrong, if your application has an issue and we need to roll back, we'll roll that back for you. Otherwise, you're good to go and you've completed your update entirely.

Thumbnail 930

Shifting back here now to the traffic visualizer, we see that this is starting to ramp up the number of requests. So in just a few short minutes, we've sent over 12,000 requests to this same service, and we've integrated with auto scaling in such a way that we require three metric points of data from auto scaling before auto scaling sends a trigger to ECS, and then we will scale up. This allows for any small perturbations in your traffic. Let's say you have an occasional spike that goes back down. You won't necessarily need to expand and scale up your fleet, but once you have shown a consistent increase in traffic, then we will begin to scale up that fleet accordingly.

Thumbnail 990

And just right there as you see, we've now gotten that trigger from auto scaling. ECS has begun to scale up those tasks, so we've transitioned from one task desired, one task running to now three tasks desired, three tasks running, all without any intervention on your behalf. We've made this repository image publicly available, so if you'd like, go ahead and take a picture of this and scan the QR code. That will send you to our public ECR repository where you can launch this application with one single command on ECS Express Mode.

Thumbnail 1010

Thumbnail 1020

Thumbnail 1030

Coming back again to our service, we'll go back to that more complicated application, and just to show you the final stepping stone of the complete lifecycle management of your application: how do you delete your Express Mode service? You see, we have all these downstream resources for you, but we've made this super simple to orchestrate on your behalf. Express Mode will identify which resources from your application are still needed and which ones are unique to this single service and can be deprovisioned.
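
The delete side of the lifecycle would then be a single call as well; again, the command name below is an assumption based on the talk rather than confirmed syntax.

```bash
# Hypothetical sketch only: command and flag names are inferred from the talk.
aws ecs delete-express-gateway-service --service-name do-it-live-2
```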

Thumbnail 1050

So as we look at this, something like your cluster, which is still being used by multiple different services, that will be retained. Your load balancer, which as we saw before is used by multiple different services, that will be retained as well. Your log group, of course, you probably don't want to delete your log groups. You want to retain those as well, so those have retention. But things like your target group, your scaling policy, the service itself, those will be going to draining and then those will deprovision.

Thumbnail 1070

Thumbnail 1080

So again, you can follow along in this resources panel here, which will refresh for you. You can see we've deleted that listener rule, that target group, et cetera, and if we were to access the endpoints, you can see this is no longer accessible. So that's really the full end-to-end lifecycle of support and orchestration that Express Mode offers for you as well. Now, I'll hand it back to Malcolm, who will wrap us up a little bit and share more about our complete integration.

Thumbnail 1100

ECS Express Mode Benefits: Full ECS Power with Simplified Experience

All right, thanks, Thomas. Claudio, there you go. I got my mic back. So one of the things that we just wanted to call out there is that was a live demo, right? Like that's, you know, tempting the demo gods, but we have that much faith in the solution. So kudos, Thomas. And really what we wanted to kind of make clear here is this is ECS. It's an ECS feature. It's not something different. It is ECS.

What I really loved about Thomas's demo there is he shows you that although it's a simplified experience, the end result is we're creating resources in your account. Those resources are available to you. You can go and have a look at the task definition. You can add sidecars if that's what you want to do. So it is still full-blown ECS with all of that richness. It's just a much simplified developer experience, and it's end-to-end. It's not just about when you create the application. It's about managing the full lifecycle of that application.

One of the things that kind of really brings that home for me is if you have a look on the left, that very long bar that you probably can't read, that's the CloudFormation template that's required in order to establish and build a web service application running on ECS previous to Express Mode. On the right-hand side is what you do now. And importantly, this is alive and available right now, right? It's available through all of the different mechanisms, the SDK, the CLI, CloudFormation, and the console. That's what you end up doing, a significantly simplified outcome, although you still end up getting all of those resources that you need.

All right, importantly, ECS Express is a feature of ECS. There is no additional charge. The only thing that you're going to be paying for is you're going to be paying for the resources that you consume, so your load balancer and your compute as you normally would. This is really just about a simplified developer experience. We have 40 seconds before we get thrown off stage, so we're not going to take any questions right now, but we will be over here on the side and we would love to speak to you. We really appreciate you taking the time to come and see us. Thank you so much and enjoy the replay tonight.

; This article is entirely auto-generated using Amazon Bedrock.

AWS re:Invent 2025 - Grupo Tress Internacional's .NET modernization with AWS Transform (MAM320)

2025-12-09 03:43:49

🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of the original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.

Overview

📖 AWS re:Invent 2025 - Grupo Tress Internacional's .NET modernization with AWS Transform (MAM320)

In this video, Grupo Tress Internacional shares their transformation journey from .NET Framework 4.6 to .NET 8 using AWS Transform for .NET. Armando Valenzuela, Head of Engineering, explains how they modernized a critical payroll stamping service processing 11.3 million documents daily for 4.4 million employees. The migration from Elastic Beanstalk to Lambda achieved 40% cost reduction, 70% reduction in development hours, and eliminated Windows licensing costs. The team used AWS Transform to automatically migrate 135,000 lines of code, cleaned 23 unnecessary NuGet packages, and leveraged Amazon Q Developer for manual fixes and validation. They deployed on Graviton-based Lambdas, achieving zero downtime during peak payroll periods. The presentation includes a live demonstration of AWS Transform in Visual Studio, showing the IDE integration, transformation process, and how Amazon Q assists with code modernization tasks like replacing Entity Framework with Entity Framework Core.


; This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part

Thumbnail 0

Introduction: Grupo Tress Internacional's AWS Transformation Journey

Great, thanks for taking the time to see our presentation. We are going to be having a conversation with Grupo Tress Internacional, particularly Armando Valenzuela, who is going to share with us their transformation process using AWS Transform for .NET. The service was launched a few months ago, so they started fairly early. Now at the event there are a lot of launches that you have already seen, so we are going to go through those as well. But let's hear what Grupo Tress's challenges are regarding their modernization efforts, right?

Thumbnail 50

So this is the agenda we're going to go through. A quick introduction, we're going to present Grupo Tress Internacional and what they do. Armando is going to dive deep into their challenges. He's going to share with us their transformation path, the transformation journey regarding all the key lessons learned that they got into their path while transforming their solution. And this is very interesting. He's going to dive deep into the architecture. As you know, this session is a level 200, so expect deep technical content and also the demonstration of the service and benefits that is going to be led by Thiago. And what's next in terms of lessons learned and what's the next adventure from Grupo Tress.

Thumbnail 90

Now who is Grupo Tress Internacional? So Grupo Tress is a leading Mexican company. They are focused on human resources management, payroll processing, and attendance control solutions. They have a long heritage. They have been building software since 1991, so you can imagine that they have a lot of legacy code out there. And their solution has been evolving during these years. They are headquartered in Tijuana, not so far from here, and by 2024, GTI, as we call them, reached 68% coverage of all Mexico's manufacturing employees. So imagine that you are receiving your payroll and you are working in the manufacturing sector. You may want to receive your payments week by week or every 15 days. So this is how it works for the manufacturing sector in Mexico in particular. And they reach more than 1200 customers and up to 4.4 million employees, so they have a pretty large impact out there.

So let me introduce Armando Valenzuela who is the head of engineering. Buenos dias, bom dia in Portuguese, right, Thiago, and good morning everyone. Let me just introduce myself really quickly. I have been working in Grupo Tress for more than 20 years, and I started working on legacy Windows on-premise applications. Yeah, I worked with Delphi in that time. No shame on that. Then I moved to working on cloud native serverless applications, and now I work as a part of the architecture enablement team, and we serve multiple product development teams as well.

Thumbnail 220

The Friday Crisis: Challenges of Legacy Infrastructure and Modernization Imperatives

So picture this. You have more than 4 million users trying to send requests to your app, right? They're trying to do something meaningful with your application on a Friday evening. Customers are calling technical support for help. The product teams are really stressed, and this application is running on more than 10 years of legacy code, on Windows, on Beanstalk, and it doesn't scale as you think or as you wish. Well, surely that is not a happy Friday for everyone, and that happened to us earlier this year, week by week, and a lot of people were involved in that, including the infrastructure team and support.

Thumbnail 300

And I mention this quote, "the leading sources of technical debt are architectural choices," because it's important to remark that migrating the code to a newer version of .NET is not the important issue here. You need to think broader so you don't end up with technical debt that you're going to hand down to the new employees of your company in the following years. Right.

This modernization approach is something that we think about in our organization because our mission is to experience the joy of improving lives. The lives that we are improving are not just our customers or the employees, but also our developers. With this modernization, we could do both things. One, deliver a better service to our end users, and also improve the developer experience for our developers.

Thumbnail 340

What challenges do we have? Well, basically more than challenges, these are the pillars that we follow in this migration. One is to stay customer-centric. We needed to meet scalability and performance needs. The second one is to stay cost-effective. In this way, we need to reduce the operational cost and simplify the modernization and maintenance efforts, and that's related to the developer experience. Also, zero downtime requirements. This was a crucial need when you migrate a critical production workload. You need to do it without disrupting payroll operations or customer SLAs.

Thumbnail 410

Understanding the Payroll Stamping Service: A Critical Mexican Business Application

Okay, let me explain the application context. This is a very Mexican thing because the Mexican government requires for every payroll slip to be notified to the Mexican Tax Administration Service. It's like the IRS but in Mexico. Basically, the process begins when the payroll manager or administrator, using our HR management suites, in this case we have Sistemares on-premises and Revolution as a software as a service that's running on AWS, they send all the payroll data through the payroll stamping services that we provide to them. Because, well, we have security guidelines, I'm not allowed to explain all the VPCs, all the integration with networking and such, but I tried to explain the features that payroll stamping services provide.

Then, after the payroll managers send all the data, the employee through our self-service HR application is able to download and get the payroll slip. That's a huge deal for the employees in Mexico, and I'm sure here in the US as well. But for legal reasons, the companies are obligated to report those payrolls on time. If that doesn't happen, the company could have some legal issues with their unions or with their employees directly.

What we have here is a legacy XML PDF engine that actually generates the PDF of the invoice. We replaced this service with a Lambda service, practically, and this Lambda has around 2 million requests daily. On peak days, we have up to 4 million requests, and for our services, this is the most used endpoint in our entire infrastructure.

Thumbnail 570

Okay, this is just an overall view of the monolith architecture. This is basically an Elastic Beanstalk containerized application that runs in two availability zones and multiple regions as well. I don't want to say that Beanstalk is not running well. Actually, it's running phenomenally. But the issue that we have with Beanstalk is that it doesn't scale the instances as we expect.

Thumbnail 620

Thumbnail 630

The Transformation Approach: From .NET 4.6 to Serverless Lambda Architecture

It doesn't have the same velocity as Lambda. So, using AWS Transform, what was our transformation approach? Well, I recommend being prepared. And I'm not talking about the product developers or engineers. I'm talking about your code. The first thing that I suggest doing is to analyze which projects are the most suitable or the best candidates to migrate. You surely need, after selecting the best project that matches your needs for this migration, to organize your legacy projects.

Thumbnail 690

Thumbnail 720

Talking about our experience, we had a lot of dependencies related to NuGet packages in our solutions, and I would like to ask you something. Please raise your hand if you are dealing with NuGet or package dependency issues lately. Yeah, just a few. Perhaps it was just us, but in our projects we had like 73 NuGet packages involved, some private and some public, and we removed around 23 unneeded packages. That's normal because developers actually add NuGet packages that they need, and with the inherited dependencies they add multiple and unneeded NuGet packages. That's common. Also, just to report that we had more than 135,000 lines of code. This is lines of code of the entire suite. You also need to have your unit tests in order, and if you don't have them, that's okay, you can create them later, but you need to have it in mind to compare the before and after.
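
For this kind of dependency cleanup, the standard dotnet CLI can already surface most of what needs attention before a migration. A minimal sketch, where the solution file name is a placeholder:

```bash
# Audit NuGet dependencies before migrating: outdated, deprecated,
# and transitively pulled-in packages (solution name is a placeholder).
dotnet list MySolution.sln package --outdated
dotnet list MySolution.sln package --deprecated
dotnet list MySolution.sln package --include-transitive
```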

Thumbnail 760

Thumbnail 780

Thumbnail 800

Thumbnail 820

Okay, with AWS Transform, we were able to automatically migrate from .NET 4.6 to .NET 8, and AWS Transform helped us to validate most of the code. Also, the packages were updated. But we still needed to take care of part of the code ourselves. It was not just a few lines, but with the help of Amazon Q, we were able to do the manual fixes. And we could run local and functional tests. With Amazon Q, now Amazon Q Developer, we were able to multiply our efforts and jump onto production rapidly. After the adjustments and running the local application, of course we needed to build a standalone application to test the transformation. We entered into multiple cycles of refactoring. We moved the packages to the correct artifact, and I'm not talking about the NuGet packages. We decoupled our application, our service, into multiple artifacts so as not to create a microservice with all the code involved.

Thumbnail 890

I was in a session this Monday, MAM402, there was a cool talk and they mentioned that the thing is not to create a shared monolith or a distributed monolith. So you need to think right and decouple your code before migrating to Lambda. Also, we were able to automate our builds via CDK and CodePipeline. And I'm going to go through our timeline in a more explainable way. We started this effort in April, and we were able to transform the code and do the manual fixes in less than two weeks. It took a little bit longer because we needed to decouple the service correctly.

Thumbnail 910

Thumbnail 930

During this migration, we were able to replace the legacy NuGet server to use CodeArtifact, CDK, and CodePipeline to integrate those. Then on the runtime, we didn't jump directly from Elastic Beanstalk to Lambda. We took a step with AWS Fargate for the microservices. First, we tried to run it on ECS Fargate and ran some tests. The behavior was okay with Fargate, but we didn't want to have more things to administrate or more things to do around ECS Fargate. That's not our traditional model. We tend to use more Lambdas for our applications, so we removed the containerization practically and changed the project to not use a Docker image for that, and we used a zip model instead.
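
A minimal sketch of the two pieces mentioned here, assuming a CodeArtifact domain and repository and an existing .NET 8 Lambda project; all names and paths are placeholders:

```bash
# Point NuGet/dotnet tooling at an AWS CodeArtifact repository
# (domain and repository names are placeholders).
aws codeartifact login --tool dotnet --domain my-domain --repository my-nuget-repo

# Package a .NET 8 Lambda project as a zip artifact instead of a container image,
# using the Amazon.Lambda.Tools global tool.
dotnet tool install -g Amazon.Lambda.Tools
dotnet lambda package --configuration Release --framework net8.0 \
  --output-package ./artifacts/document-generator.zip
```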

Thumbnail 1020

The most important thing here is to add X-Ray tracing, CloudWatch, and of course get metrics out of your running application and running service, because when you migrate your code, the operational environment is quite different. You need to keep track of what is not behaving as you expected. This is a high-level view of the solution, and it's as simple as that. It's a Lambda named Document Generator. It's almost the same, with all the same buckets that we're managing. It has its own application database, so we were able to decouple the database as well, and everything behaves and integrates well with the payroll stamping service. Our self-service application, named Mazorden, didn't have to change its endpoints or anything. Everything is managed inside this context of the application.

Thumbnail 1070

Measurable Benefits and Key Lessons: 70% Reduction in Development Time and 40% Cost Savings

So what were the benefits of this migration? Firstly, we could do this in a rapid way. When I started the conversation with you, I said that the product teams were really busy, so we had to help on that as an architecture and enabling team. We don't usually invest the time on developing new things. We help the teams to design, to organize, to train them, but we were able to help with this migration. We calculated almost 70% reduction in human hours. We saved like two months of the team on manual code testing and validation. This was not just migrating the code. We needed to also generate new tests, new integration tests, and such.

Another thing was the strategic refactoring. The team was focused on value-driving refactoring rather than tedious manual work. I'm not saying that because we had the time, we invested the time in architecture. No, it's the other way. Because with this product, we were able to not just react to the problem that we had, rapidly do the migration manually, and then go to production. No, we were able to think this broader and decouple all the monolith for further migrations, not just the document generation that I explained.

Also, cost savings. We were able to reduce the infrastructure cost by over 40% using Graviton-based Lambdas. What I'm trying to say is that we were not just going through the basic Lambda configuration. In this case, the AWS Mexico architects team helped us to go farther. They said, hey, you can go Graviton directly,

and our packages that we are using to generate PDFs are well tested on Windows environments, so we were worried about not getting the same results. With Amazon Q, we generated more than a couple of Python scripts that automatically help us validate A/B tests. Along the way, we also eliminated the Windows licensing costs. That's an indirect cost that we had on Beanstalk, so that's something that was good for us.

Regarding elasticity, on average we have 11.3 million payroll documents processed daily. We don't have an issue with that with Lambda right now. The same goes for manual scaling: of course we were not doing manual scaling in Beanstalk, but we had to be aware of what was happening on the peak days, and then the infrastructure team made some changes to it. That's something we don't need to worry about now with our Lambda. Of course you need to follow the metrics, and perhaps you can make some changes to your Lambda in case your demand increases, but for now we are calculating that with Lambda we're okay.

Thumbnail 1320

Then we had zero downtime during payroll spikes. This is just an example of what we saw running the A/B testing earlier this year. As you can see, the duration stays at around half a second. During peak invocations, Lambda actually has better results than when invocations are low, so that wasn't an issue for us.

Thumbnail 1360

Lessons learned. Well, I'm going to try to summarize this really quickly, as I said before. If you have a running legacy application, you have the privilege of comparing it with the new version. In case you can run a task more than once, like we do with the payroll slip, you can compare the results of your legacy version versus the new version. When I talk about the Python scripts that compare the PDFs, basically we did it in production. We deployed this so that at the same time we were delivering the PDF that we built in Beanstalk, at the back end we were generating the Lambda version. So we were able to compare the PDFs pixel by pixel and text by text.
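
The team used Python scripts for this; as an illustration of the same idea, a command-line sketch with poppler-utils and ImageMagick (all file names are placeholders) can compare both the rendered pages and the extracted text:

```bash
# Rasterize each PDF and compare pixel by pixel (AE = count of differing pixels,
# so 0 means the rendered pages match). File names are placeholders.
pdftoppm -r 150 -png legacy.pdf legacy
pdftoppm -r 150 -png lambda.pdf lambda
compare -metric AE legacy-1.png lambda-1.png diff-1.png

# Compare text by text.
pdftotext -layout legacy.pdf legacy.txt
pdftotext -layout lambda.pdf lambda.txt
diff legacy.txt lambda.txt
```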

In that way, we were confident that we could switch gradually to the new version with a canary release. So if you are in that case, I recommend doing this. In case your process cannot be executed more than once, of course you can always go canary and select your initial users as well. Either way, there is a win in this.

The other thing that we learned, in a sense, is to adapt. Modernization is not just code migration. So don't expect to just transform and be on the other side the next day. But also, you need to be aware if you're going to migrate your application with the help of Amazon Q to Lambda. You need to be aware of cold starts, provision concurrency, SnapStart, and package size optimizations, because you're going to migrate something that was running on a monolith and works perfectly because all your singletons, your connections are well oiled in the production environment, not with Lambda. With Lambda, you need to be aware of your connection pools, your memory setups, and your connection with other services, right?

So in case one of your internal developers says that something on Beanstalk was faster than Lambda, don't hesitate; look into it. There are probably issues with cold starts, and I just want to ask you a question: I don't know if you have many issues with cold starts on Lambda. Yeah, just be aware of that if you are migrating. Regarding automation, we achieved the fully automated validation that I mentioned before, the Python scripts and other kinds of scripts, with JMeter as well, to do critical benchmarking tests, right? So now let's get technical. Back to you, David.

Thumbnail 1590

Thumbnail 1600

AWS Transform for .NET: Technical Deep Dive into Automated Modernization

Thank you. Thank you very much, Armando. Thank you for sharing with us your case. It's pretty amazing what you have achieved in this time. Also, if you have any questions, you can approach the Grupo Tress team that is here by the end of this session so you can discuss it further. Now let's get into the AWS Transform and how it works.

Why modernize your applications? That's a pretty simple question. You may already know the answer, but there are a lot of factors in here. For example, Armando mentioned we improved response times of the services we moved from Elastic Beanstalk to Lambda. So from there we can deduce that we are going to have some cost optimizations. We will, as you can see, have performance improvements. And also the scalability, having Lambda for example instead of a monolith, will allow the teams to actually have more manageability and also more scalability in terms of what they are doing and what they are prospecting in terms of application growth.

Something interesting that Armando mentioned is that the AWS Transform service moves the monolith to actual microservices in containers. You can deploy, for example, transformed code to EC2 with Linux or ECS Fargate. But what Armando and the team did was migrate directly to a serverless architecture, and that's possible. That's something that maybe you don't see in AWS Transform right now, but with the integration of Kiro and other tools it is actually possible, and it took Grupo Tress two weeks to reach this stage of maturity.

Thumbnail 1700

So porting to cross-platform .NET is hard and it's slow, and you already know that. That's why you are here, right? The AWS Transform service will take your code, and as you can imagine, everything is agentic now. AWS Transform is agentic now. This week, the next generation of AWS Transform was launched, which will allow you to have multi-agent capabilities. What does this mean? You're going to be able to actually drive the modernization process the way you like it.

So through this agent that was released this week, you are able to actually drive the process of modernization. Maybe if you are looking toward changing the architecture of your services and only porting the code to this new architecture, you're going to be able to do so instead of actually depending entirely on the agent capacities to design this plan. So this is a huge improvement compared to what we had in the past. Also, detecting incompatibilities, as Armando mentioned before, they were able to clean their code and they found that 23 dependencies were just hanging around. They just removed them and the code actually became cleaner.

Then we port the code. The Transform agent, after analyzing and detecting incompatibilities, is going to design a modernization plan. And what's going on here? So the agent is going to design a plan based on your source code and projects that you want to migrate. As humans, we need to validate that process, so we need to be involved in this validation stage of this modernization plan.

Thumbnail 1830

After having the modernization plan and porting the code, we are going to be able to deploy directly to ECS Fargate or EC2 with Linux. Previously, this was very manual labor, as Armando mentioned; it would take months to do what they did in a couple of weeks, with a lot going on in terms of how teams manage the migrated services.

So what we are looking at here is how long projects can be made shorter, how licensing costs can be cut, and how to avoid suboptimal .NET porting quality. The transformation process is not only porting the code. It's going to find fixes that can be made within your code, and it's going to recommend code that brings improvements in the quality of the code being ported and also in terms of security. So this is very important to keep in mind.

Thumbnail 1880

And well, this is the introduction to AWS Transform for .NET, which has two experiences: the IDE, which is the one that we are going to show you today, and also the web experience. Both behave similarly in terms of functionality and the agents behind the service. For .NET porting in the IDE, which is the second picture here, you need the integration with the AWS Toolkit. From there, you can grab the AWS Transform component and start your modernization. For the web interface, you just need to connect your source code, which could be hosted, for example, in Azure DevOps, GitHub, GitLab, or Bitbucket, as in the case of Grupo Tress.

Thumbnail 1930

And once we have that, once we have connected our code and integrated Visual Studio, we can start the process of analyzing the code base. So what's going to happen here? These agents are going to index your code and process it. The code is moved to AWS for processing to actually understand what your code is, how it behaves, and the dependencies it has. And once we have completed this part, here comes the transformation process.

Thumbnail 2000

The transformation process, again, really depends on a transformation plan. And once we have this transformation plan, we can iterate here, and this is something new. You can interact with the agent to actually drive the transformation process. And finally, the validation, the human-in-the-loop part of every single agentic solution. The goal here, and this is what this slide presents, is how we can take these .NET Framework applications to new Linux-based .NET 8 applications, and now .NET 10.

This is a more deep dive. A component that I want to highlight here is this part, the dependencies. For example, the analysis process is a little bit more complex within the service. It really requires analyzing code for incompatibilities, identifying and generating replacement code that is going to substitute your actual code. AWS Transform, once the transformation is completed, is going to push this transformed code into a version system. It will use the version system you connected to, and once again, you need to generate a branch so the transformation process can deposit code into that branch so we can have full control of how the ported code is going to be placed.

And there are two cases for dependencies. If you develop your NuGet packages yourself, you can share those with the AWS Transform service so you keep that dependency management under control. If no NuGet packages are provided, or if the service does not find those NuGet packages in its knowledge base, the service is going to do its best to understand what those packages are doing.

So once we have done this, we apply code modifications. We can verify the code. If something didn't go as expected, we can get back and try again. So this is a process that is usually led by the developer or the architect.

Thumbnail 2110

Now, this slide is regarding the .NET transformation for .NET Framework. It will also transform MVC Razor interfaces. We have Web Forms to Blazor integration and also support for cross-platform .NET 8 and .NET 10. So there are a couple of targets for this transformation that are going to help us build modern .NET architectures. This also includes previous projects like Windows Forms and Windows Presentation Foundation for desktop projects, for example, and everything else here.

Thumbnail 2180

For large-scale modernization, you can take several projects at one single shot, or you can choose between all those .NET projects to start developing and transforming one by one, or you can grab all of your code base and start transforming from there. Now, the MVC Razor part is pretty interesting. It has been requested a lot, and it's going to take and transform your code and port it to ASP.NET Core. Also, Web Forms to Blazor transformation is supported as well, and it's shown here. It's going to take your Web Forms and transform them into Blazor. So this one's pretty new as well.

Thumbnail 2210

Thumbnail 2220

And what does it look like? Here, Thiago is going to share with us a demonstration of how this works and how to integrate the AWS Transform components into Visual Studio. And before diving deep, this is just a reminder of the connections for code. You can connect GitHub, GitLab, Bitbucket, Azure Repos, and also Amazon S3 for code analysis. If you don't want to connect your source code repositories directly, you can share them using Amazon S3.

Thumbnail 2270

Also, the view assessment is a summary report that you're going to be able to see when the transformation process is complete, and you will also be able to provide NuGet packages. These could be developed internally, or they could be from third parties. Now, the console experience is pretty much the same, but in this case, we are going to have full support of an agent and a chat console that you're going to be able to interact with. What's the advantage of this part? Having the capacity to actually drive the modernization efforts to a certain architecture, to a certain way of coding that you have in your companies, is crucial for driving the modernization efforts you are paying attention to, and this is very important. We are going to be reviewing this as well.

Thumbnail 2300

Thumbnail 2310

Thumbnail 2330

This is the integration. Amazon Q is going to help us, as Armando already mentioned. They used Amazon Q at the last stages. So what's the recommendation here? Start with AWS Transform. Once you have ported your code to .NET 8 or .NET 10, use Amazon Q to actually improve code, make fixes, or port to Lambda, for example, which was what Grupo Tress did. And the SQL Server transformation, maybe you are wondering what happens to the databases. Well, this is pretty straightforward. We can also support databases including SQL Server and manage dependencies within your code.

Thumbnail 2350

How do we do that? Here is the part before and after the transformation. So this is a full-stack Windows transformation, and you will have everything here starting from your application, your database layer, and your virtual machine running on Windows Server. The intention of this part of AWS Transform is to actually modernize the full stack, not just leave it with application transformation. You can focus on that as well if you want to keep your databases, but if you are planning to migrate everything to Linux, this is the way to go. So the target is to have a cross-platform .NET application, run your databases in Aurora PostgreSQL, and also have the infrastructure components regarding Amazon ECS and Amazon EC2 with Linux.

Thumbnail 2410

So let's get to the demo.

Live Demonstration: Modernizing .NET Framework Applications with AWS Transform and Amazon Q

Thank you, Thiago. Thank you, David. Good morning. What a great achievement Grupo Tress did with the modernization of their application. And today I will demonstrate how you can achieve the same. My name is Thiago Goncalves. I'm a Solutions Architect at AWS, and for the past 20 years, I have been developing and modernizing applications.

Thumbnail 2450

Thumbnail 2500

So when we talk about AWS Transform, as David mentioned, we have two options. One is the web experience. If you are in a DevOps team and you want to modernize applications in batch, you will use the web experience. But if you are a developer and you want to modernize one application at a time, you will use the IDE version of AWS Transform. So what I have here is Visual Studio, the IDE for .NET application development. And if you want to modernize your application using AWS Transform, the first step is to install the AWS Toolkit. So you go to Extensions and search for AWS Toolkit with Amazon Q, and this is the first step. You need to have the extension installed in the Visual Studio IDE.

Thumbnail 2530

With the plugin extension installed, you will have the option to enable AWS Transform in the IDE, and we have a few options here. If you want to use only AWS Transform, you can choose the first option, the option on the right side. And we have Amazon Q Developer, which is now the Q Pro subscription, where you have AWS Transform plus generative AI options to help you modernize your application. So in this case, for this demonstration, I will use the Amazon Q Developer subscription.

Thumbnail 2570

Thumbnail 2580

Okay, with the extension enabled in Visual Studio, on the right side I have one solution here. It's a .NET application. In here I have a project with .NET Framework. We can see here that it's the version .NET Framework 4.7. And as we know, the .NET Framework is no longer supported by Microsoft, and what I want to achieve is to modernize this application to .NET Core version 8. So how can I accomplish this? In the Solution Explorer, I right-click on the solution, and I have this option: Port solution with AWS Transform.

Thumbnail 2620

Thumbnail 2640

And it will ask me what the target is. Right now we only support .NET 8, but soon we will support .NET 10 as well. And I will start the transformation job. So as we can see here, the first thing is that your application needs to build successfully. Otherwise, the transformation job will not start.
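
A quick way to confirm this from the command line before kicking off the job is shown below; the solution name is a placeholder, and older non-SDK-style .NET Framework projects may need MSBuild rather than `dotnet build`.

```bash
# Verify the solution builds cleanly before starting the transformation job.
dotnet build MySolution.sln --configuration Release
# For older non-SDK-style .NET Framework projects, use MSBuild instead:
# msbuild MySolution.sln /p:Configuration=Release
```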

Thumbnail 2660

Thumbnail 2670

Okay, now my application is building, and what is going to happen right now is AWS Transform will package all your source code, the packages required for your application to execute, and this source code will be sent to an AWS account, a sandbox. It will create a secure connection with an AWS account. It will send your source code there, and the transformation happens in the AWS account. The transformation job doesn't happen on your local computer. So right now nothing changes in your solution. The source code will be analyzed and transformed in an AWS account.

Thumbnail 2720

Thumbnail 2730

This process will take about 15 to 20 minutes for this small solution, so I will not wait until the transformation job is done. I have one solution in here where I already completed this transformation. When your source code is transformed and the job is done, you will get the response in the IDE, and this is how it looks like.

Thumbnail 2750

Thumbnail 2780

In here, I have a summary of what has changed and the projects that I have. In this part here, at the bottom of the screen, I can see that new files were added to my solution. Before, I had a Web.config in .NET Framework, but I no longer need a Web.config. Now I have appsettings.json, so AWS Transform automatically added the files that were missing for a .NET Core application. All new files that I need to have this application running on .NET Core are now available. Files that are no longer needed, like Global.asax and other files, were removed or renamed.

Thumbnail 2810

Thumbnail 2820

Thumbnail 2830

I can see there are some changes in my source code, and I can review the changes that were made. In here, I can see some using statements were replaced and some small changes were made in my source code. I can also download the summary of everything that was changed in my application and see all the changes and the packages that were changed or need some attention. On the left side, I have this Linux readiness assessment, which shows that for this transformation, the job was not able to replace Entity Framework with Entity Framework Core, so this is something that I need to change manually if I want to make this application run on .NET Core. With this report, I have an overview of everything that was done in my solution.

Thumbnail 2870

Thumbnail 2890

Thumbnail 2900

As I said, right now, nothing has changed in my solution. So if I want to apply the changes that the job made to my source code, I need to select all the changes and apply the changes. Since the project file was changed, it's asking to reload the solution. Now I have all the changes applied in my solution. On the right side, I can see now that I have the appsettings.json, the Startup.cs, Program.cs, and all the new files required to have this application running on .NET Core.

Thumbnail 2930

If I look in here, the framework now is .NET 8, so all the changes required for this application to execute on .NET 8 are done now. This is the first step. What about the changes that I mentioned that I need to replace manually? My application is now ready to work on .NET 8. The AWS Transform job doesn't allow me to choose more interactions or to interact with the transformation, so it knows how the transformation works, what needs to be replaced, and what needs to be changed, but I don't have many more options to interact with my source code. The only mission or the only goal of AWS Transform is to make your application available in .NET 8.

Thumbnail 3000

If you need more interactions with your source code, we have another option here, which is Amazon Q Developer. This is the next step when you have your application transformed. As I mentioned, I have Entity Framework that needs to be replaced with Entity Framework Core.
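
For context, the package-level part of that swap, which Amazon Q automates in the demo, corresponds to something like the following manual dotnet CLI steps; the project path is a placeholder, and code-level changes such as updating the DbContext are still needed afterwards.

```bash
# Swap the legacy EntityFramework package for Entity Framework Core
# (project path is a placeholder; code-level changes still follow).
dotnet remove MyApp/MyApp.csproj package EntityFramework
dotnet add MyApp/MyApp.csproj package Microsoft.EntityFrameworkCore.SqlServer
```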

Thumbnail 3040

Thumbnail 3060

Thumbnail 3080

Thumbnail 3090

So what I'm going to do now is ask Amazon Q to replace Entity Framework with Entity Framework Core. This is very powerful because it can access my source code, the solution, understand what's happening with my application, read the files, and replace the files as needed. So it's asking to have access to my solution. I say yes.

Thumbnail 3100

Thumbnail 3120

Thumbnail 3140

So it knows that I need to execute some commands. It knows what commands need to be executed in my solution, and we start to replace the packages, the references. And we will change as well the source code, my files that need to be changed in order to have Entity Framework Core working in my solution. And I can see all the changes that it's making in my source code.

Thumbnail 3200

Thumbnail 3220

So with these two tools, AWS Transform and Amazon Q, I can interact with my source code, add new classes, ask to add new pages, add new routes in my application, all of this just typing what I want in plain English. So it's very powerful when we are modernizing, or if you want to understand, I'm a new developer and I want to understand what's going on with my application, how the connection with database is going on. I can debug my application, fix errors, or just type the errors or message that I want, the goal that I want to accomplish in the chat, all of this without leaving the IDE.

Thumbnail 3230

Thumbnail 3280

And this is the first step when we are talking about modernizing our application. It's not just moving from .NET Framework to .NET Core. After this, if you want to get more performance or reduce costs, the next step is to move our application to execute on Graviton. Now that we have our application able to run on Linux, we can take advantage of that and move it to Graviton. We did a performance test with one application: we compared the cost of this application running on a Windows machine, then converted it to run on AMD, and then compared it with Graviton. You can see that after modernizing your application and executing the same application on Graviton, the cost difference is huge.

And it's not only the costs: the performance of this application is much better as well. So modernizing from .NET Framework to .NET Core is not the end in itself; it gives you better cost and better performance when executing on a modern architecture.

So thank you, and now I'll hand over to Armando for a few more words.

Thumbnail 3320

Looking Ahead: Continuing the Modernization Roadmap with Strangler Fig Pattern

Oh, thank you. Thank you, Thiago, and thanks everyone for listening to us. What's next for us? Well, this was just the tip of the iceberg of our migration roadmap. Basically, we're going to continue modernizing our .NET Framework projects using the Strangler Fig pattern. But now, with what we have learned at re:Invent, we need to rethink this model. We need to go further with Amazon Q to accelerate the code development and modernization as well. Of course, we need to adopt a composable serverless architecture to speed delivery and boost efficiency. And of course, we need to enhance the other features that we have in our payroll stamping services. We need to enhance the SQL database integrations, so perhaps we're going to migrate from SQL Server to another database model.

I just want to end with this quote. It is not about migrating your code. You need to think broadly and involve key members of your team. This time we were able to work with Mauricio, Roberto, and Pedro's team, the architecture team, and they got really involved in this solutioning so we could go farther and deliver this on time and with the expected results. So don't leave your developers alone on this. And well, that's all from my side. Thank you.

; This article is entirely auto-generated using Amazon Bedrock.

AWS re:Invent 2025 - Kiro meets SaaS: Generating multi-tenant agentic applications with a GenAI IDE

2025-12-09 03:43:29

🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of the original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.

Overview

📖 AWS re:Invent 2025 - Kiro meets SaaS: Generating multi-tenant agentic applications with a GenAI IDE

In this video, Aman and Anthony demonstrate building a production-ready multi-tenant SaaS platform with embedded AI agents in just two and a half weeks using Kiro, AWS's AI-powered IDE assistant. They showcase how Kiro's spec-driven development, agent hooks, and MCP server integration enabled them to create a complete control plane and help desk application following the SaaS Well-Architected Lens principles and leveraging the SaaS Builder Toolkit. The session includes detailed demos of crawling AWS documentation with Crawl4AI to create steering documents, configuring MCP servers for AWS services and Strands, and the actual development workflow that produced seven independently deployable components with 226 test files and 327 documentation files. They share practical lessons learned about testing frameworks, AgentCore integration challenges, and the importance of clear requirements, ultimately demonstrating how one developer achieved a 1:4 test-to-production code ratio without writing a single line of code manually.


; This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part

Thumbnail 0

Introduction: Building Multi-Tenant SaaS with AI Agents

Good afternoon, everybody. My name is Aman. I'm a Cloud Architect at AWS. I work for Professional Services, and alongside me is Anthony, who's a Senior Solutions Architect at AWS. Together we'll be presenting to you SAS406, Kiro meets SaaS. Now this is listed as a 400-level talk track, so it's going to have a lot of video walkthroughs, but we do assume a level of SaaS knowledge as well as some AI tooling knowledge beforehand.

But before we begin, I want to ask a question, and I want to see a show of hands. How many of you here have ever been in a meeting before where someone said, let's add an AI agent to it? Can I see a show of hands? Good, lots of people. That's great. And how many of you have been knee-deep building a feature when the business suddenly asks, oh, can you make it multi-tenant while you're at it, just very casually? Anyone who's heard that one before? Fewer people than the first one, that's all right.

For any of you who haven't been to either of these two meetings before, congratulations, you are in that meeting today. And what we're actually going to do is we're going to say yes, let's do it. And not only that, let's do it the right way: tenant isolated, well-architected, multi-tenancy in mind, the whole nine yards.

Thumbnail 130

So as we begin, let's talk about what we actually committed to build. We committed to build a control plane on AWS, not a prototype, not a demo, the real deal: multi-tenant from the ground up, proper isolation, proper security, proper everything. We also committed to build a demo tenant application that is basically a service management product allowing tenants to create tickets, comment on them, or search knowledge bases. And what we wanted to do is bake AI agents into both of these applications: AI agents for the control plane allowing SaaS admins to manage tenant operations, and AI agents within the tenant app to allow tenants to actually communicate with the AI agent and raise tickets by natural language. And we decided to build all of this using the SaaS Builder Toolkit and the Well-Architected Framework, no shortcuts.

Thumbnail 170

Now, if you're thinking that sounds like a six-month project with a massive team, well, that's what we thought too. But what if I was to tell you that there was a different way? And that's what brings us to our session today, and this is the roadmap. For the next 50 to 55 minutes, this is going to be your map for traversing this session. We're going to tell you this exactly the way we experienced this adventure. We've talked about our goal; next we're going to dive deeper into the SaaS challenges, and we're going to talk about the tools we used: Kiro, the Well-Architected Lens, and the SaaS Builder Toolkit. Then we'll go into the actual build, the adventure as we like to call it, the real story of how we actually built this thing: the wins, the approaches that actually worked, as well as the lessons learned the hard way. And we'll wrap up with where we are today and what this means for building multi-tenant SaaS with AI. Sounds good? Let's jump in.

Thumbnail 220

The Challenge: Balancing Speed and Quality in SaaS Architecture

So why is this quest so challenging? Because anybody building SaaS today is facing a fundamental dilemma. We're pushed to ship features and innovate faster than ever while managing complex architectures, and at the same time we have to manage tenant isolation, data partitioning, and the works. We need to handle multi-tenancy, ensure bulletproof security, and maintain a very high quality without accumulating crippling technical debt. Balancing this act of speed versus quality is the fundamental challenge we all face as builders today.

Thumbnail 260

So let's jump into what we'll actually be building. On the left-hand side, you can see the control plane. We're going to start with the necessary control plane elements. We're going to need an entry point into our application. For us, that's going to be the SaaS admin portal. It's going to be a React application hosted in an S3 bucket fronted through CloudFront. The backend services are going to be behind an API authorizer using API Gateway, and that's going to be connected to an identity provider such as Amazon Cognito. The backend will also host services for tenant registration, deregistration, user management, as well as all other SaaS operations.

Thumbnail 300

Next, we'll need to embed some AI agents into it. So what we're going to do is create a Lambda proxy, and then we're going to use the Amazon Bedrock AgentCore runtime

Thumbnail 320

combined with Amazon Bedrock AgentCore short-term memory for session-level persistence. All of the container images go into our trusted Amazon ECR repository. That's all great. All of this can take a lot of tenant registration and user registration requests, but how do we actually provision for that? We need a provisioning engine. What we're going to do is connect our APIs through an event bus to a Step Functions workflow that triggers a CodeBuild job and runs CDK code to provision the tenant infrastructure. That is the snapshot of our control plane: services for tenant registration, deregistration, and user management with an AI agent baked in, connected through an event bus to a provisioning and deprovisioning service.
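
The session doesn't show the event contract itself, but the publish side of this flow is straightforward to sketch with boto3. The bus name, source, and detail fields below are hypothetical placeholders, and an EventBridge rule (not shown) is assumed to route the event to the Step Functions provisioning workflow.

```python
import json
import boto3

events = boto3.client("events")

def publish_tenant_onboarding(tenant_id: str, tier: str) -> None:
    """Emit a tenant-onboarding event; an EventBridge rule (not shown) targets the
    Step Functions workflow that runs CodeBuild/CDK to provision the tenant stack."""
    events.put_events(
        Entries=[
            {
                "EventBusName": "saas-control-plane",       # hypothetical bus name
                "Source": "controlplane.tenant-management",  # hypothetical source
                "DetailType": "TenantOnboardingRequested",   # hypothetical detail type
                "Detail": json.dumps({"tenantId": tenant_id, "tier": tier}),
            }
        ]
    )

publish_tenant_onboarding("tenant-123", "basic")
```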

Thumbnail 360

Next we need to think about our tenant application. For this demo, we wanted to build a service management product. We decided to use serverless tech to power the features for the demo, and we also added an AI agent to it. Similar to the control plane, we have a Lambda proxy that fronts the AgentCore runtime with AgentCore short-term memory, and we make sure that we have end-to-end observability through AgentCore observability. And finally, that is our quest for today: the left side covers all the services we need for the control plane, and the right side serves as our tenant application plane.

Thumbnail 400

Kiro's Core Capabilities: Spec-Driven Development and Automated Workflows

So how do we actually tackle all of this without a massive team and a six-month timeline? Well, that's where Kiro comes in. Here's the thing: building a to-do app with AI, lots of tools can do that. But building a secure multi-tenant SaaS platform with embedded AI agents, proper isolation, and well-architected principles, that's a very different game. Before we get to SaaS, let's dive into the capabilities of Kiro first. For those who haven't used Kiro, this is going to be a somewhat 200-to-400-level walkthrough, so we'll take you along on the journey.

Thumbnail 440

First, spec driven development. You describe what you want to build. Kiro generates a comprehensive spec with requirements and architecture. You iterate together until it's right, and then you build from that approved blueprint. It's like having a tiny obsessive project manager in your IDE. Everything's documented, everything's very well structured. The tasks have links back to the requirements, and nothing falls through the cracks. For the first time after using Kiro, I felt like it was an actual AI partner instead of a black box because I could see what it could see.

Thumbnail 480

Thumbnail 490

Second, agent hooks, automated workflows that can run in the background. Save code, tests get generated. Update components, documentation gets updated. Change some infrastructure and security tests run. Quality becomes automatic. And third, Kiro allows you to connect to all the MCP servers on the internet. What that allows you to do is actually expand Kiro's context and teach it things that it doesn't really know about. Spec driven planning, deep context, and automated quality, that's what makes Kiro so powerful.

Thumbnail 510

Thumbnail 520

Thumbnail 530

Live Demo: Exploring Kiro's Features from Steering Documents to Agent Hooks

So next we're going to jump into a demo. I'm going to move to the lectern to demo this to you. Let's begin with the capability overview. The ghost icon on the left provides a comprehensive menu of all Kiro features. As you can see, this includes specs, hooks, steering documents, and an advanced context management panel for MCP servers. We currently have the Fetch MCP server enabled, which allows Kiro to access any website on the internet and learn from it.

Thumbnail 560

Thumbnail 570

Thumbnail 580

The next feature that we're going to explore is agent steering. We're going to start with generating some steering docs. Now this is a brownfield project. When I click that button, Kiro starts reverse engineering the project's source code to create detailed markdown files. These documents provide essential context for specs and prompts that you'll be running for the entire project. With a single click, Kiro has automatically generated these files, and for this demo, what we're going to be using is a project that's based on HTML, Python, and SQLite database.

Thumbnail 600

The first file that Kiro generated is the product.md file. This document provides a clear, human-readable narrative of the source code's functionality, and as a developer, I can review this file to quickly grasp the application's purpose. The next important document is the tech.md file.

Thumbnail 620

Thumbnail 630

Thumbnail 650

This document summarizes all the technical components within the source code. It details the libraries, specific versions, and packages used, including the data layer, as well as the important commands you need to run the project. The final document that Kiro creates from the steering docs functionality is the structure.md file. This file acts as a roadmap for the AI, enabling it to understand the source code structure. When I request a new feature specification, Kiro keeps to the original project's structure and follows its conventions because of what's in this file.

Thumbnail 660

Thumbnail 680

Thumbnail 710

Users can also create custom steering files. Users can create custom steering files to enhance Kiro's functionality. It can be done manually by creating a new file and fiddling with it yourself, or you can ask Kiro to create one for you. In this case, I'm creating a style guide that's based on PEP 8, which is a very well-known standard for Python code. Now the next thing Kiro's going to do once I give it this prompt is it's going to use the Fetch MCP server to actually retrieve the information that it needs for this PEP 8 style guide. And then it'll ask me for approval before every external call.

Thumbnail 720

Thumbnail 730

Thumbnail 750

It'll then summarize this document and save it as a new style guide markdown file. As you can see, the steering file has a detailed understanding of the PEP 8 standard for Python, and it'll now be included in every prompt that you give to Kiro in this project. In the chat interface, developers can seamlessly switch between Spec for structured development and Vibe for more freeform interaction. For our spec-driven workflow, we'll start with a simple prompt to add a new feature to manage categories in this project. From that, Kiro will build out detailed requirements, a design document, and a project plan that you can iterate on with Kiro before writing a single line of code.

Thumbnail 770

Thumbnail 790

To accomplish this, Kiro ingests the steering documents that we created earlier. This includes the product document, the structure document, and the tech document, as well as the custom style guide we created earlier. It analyzes the source code and then begins generating the first part of the spec. This first part is called the requirements.md file, and it expands our initial prompt into a fully realized feature concept. It uses the EARS format, which stands for Easy Approach to Requirements Syntax. It's an industry standard, and you may have seen it before; there are clear acceptance criteria for each requirement.

Thumbnail 800

Thumbnail 810

Thumbnail 820

Thumbnail 830

Thumbnail 840

Once the requirements are set and we're happy with them, we move on to the design phase. Kiro translates the requirements into a technical implementation plan, the design.md file. This requires a deep understanding of the existing source code, and this is the file that I spend most of my time in. This file is very well suited to architects, as it outlines the architecture of what Kiro is about to build and the plan for building it, with examples. You can see it includes things like what the data model looks like for this category feature, what the error handling techniques are going to be, and all of the good stuff.

Thumbnail 860

Thumbnail 870

Once we have our design approved, we move on to our final step. This final step breaks down the requirements and the design document into comprehensive project tasks that Kiro can execute and keep track of. Each task includes sub-elements, a status tracker, and a clear link back to the specific requirement it fulfills. This is the obsessive project manager I mentioned earlier at work in your project: you can actually see all the tasks, and Kiro marks them as complete as it goes through them.

Thumbnail 890

Now let's conclude with a look at our final key feature, agent hooks. Hooks can easily be configured from the Kiro panel using a very simple form, and you can describe in natural language what you want the hook to do. In this example, I'm going to ask Kiro to monitor for any changes in any Python or HTML files, and when that file is saved, I want it to summarize the changes and append them to another file called changelog.md.

Thumbnail 910

What I want to do is have a tracking system to be able to log all my changes. From this natural language, Kiro generates the hook. The hook consists of a few key elements: an expanded, more detailed prompt, the trigger event, which in this case is on file save, and some file patterns to monitor, which in our case are going to be .py or HTML. This hook is actually placed into the .kiro folder within the IDE. It also includes a title, a description, and an easy toggle to turn it off and on. This is how you automate your workflows within Kiro.

Thumbnail 940

Thumbnail 960

New Kiro Releases: Property-Based Testing, CLI, and Kiro Powers

Kiro is amazing, but it's still growing. There are some new releases that deserve an honorable mention, and I'd like to share them with you. First, property-based testing. Traditional unit tests check individual examples: given this input, do I get this output? They're heavily biased by the developer building the feature, who can easily miss edge cases. Property-based testing is different. It works out which properties of a spec must hold true at all times, then uses a generator to produce hundreds or thousands of tests and runs through them. It makes systems more robust because if even a single generated case fails, Kiro has found a bug, and it will then ask you whether you'd like to change the source code or the spec.
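
Kiro runs this inside the IDE, so its internals aren't shown here, but the underlying idea is easy to illustrate in Python with the hypothesis library: instead of hand-picking examples, you state a property and let a generator hammer it with inputs. The function and bounds below are toy assumptions, not code from the session.

```python
from hypothesis import given, strategies as st

def apply_discount(price_cents: int, percent: int) -> int:
    """Toy function under test: apply a percentage discount to a price in cents."""
    return price_cents * (100 - percent) // 100

# Property: a discounted price is never negative and never exceeds the original price.
@given(price_cents=st.integers(min_value=0, max_value=10_000_000),
       percent=st.integers(min_value=0, max_value=100))
def test_discount_stays_within_bounds(price_cents, percent):
    discounted = apply_discount(price_cents, percent)
    assert 0 <= discounted <= price_cents
```

Run it with pytest and hypothesis will generate hundreds of input combinations; a single failing case means either the code or the stated property (the spec) is wrong.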

Thumbnail 1000

Second, Kiro CLI. Kiro CLI extends your AI agents from the IDE into the terminal using the same configuration and capabilities. Debug production issues, scaffold some infrastructure, or automate workflows without context switching. AI assistance where you actually need it. This is essentially a facelift of the old Amazon Q Developer CLI, now fully integrated into Kiro.

Thumbnail 1020

Third, when building complex systems, your code often spans multiple repositories. For SaaS, that's especially true: you're going to have a front-end repository, back-end repositories, specific libraries, infrastructure code. Kiro allows you to open all of these repositories within the Kiro workspace with support for multiple roots. That means there's a .kiro folder in each of these projects, and Kiro knows where you are contextually.

Thumbnail 1050

Thumbnail 1070

And finally, checkpoints. This has been a lifesaver. As a developer, I tend to go off the rails and get too knee-deep into a feature in a particular session, and checkpoints give me an auto-save point that I can revert back to within that session. And this next one is actually hot off the press. True story, I put this slide together yesterday, so I wanted to share it with you.

Kiro Powers is the newest addition to Kiro's capabilities. Even though we're not using this in the context of the presentation, we'll talk about it a little bit. Think of a power like Neo in The Matrix. What if you could download Kung Fu right away, use it, and forget about it once you're done with it? That's what Kiro Powers is all about. When you load a bunch of MCP servers into memory, that takes up a lot of context from your sessions, but Kiro Powers makes this dynamic. When you say something like "I want to work on the database", that triggers a power you might have installed; in this case it could be Supabase, and it will help you talk to that database with the MCP servers required. And when you say you want to deploy something or create a new agent using Strands, it will download the Strands power, and you'll have access to the steering files as well as the tools for Strands.

Thumbnail 1140

Thumbnail 1150

You can browse all available powers on kiro.dev/powers. There's already a wide range of powers listed, allowing for API testing with Postman and design-to-code with Figma. Supabase is also available to power up Supabase databases, and there's a new SaaS power that I was tinkering with last night which seems super promising. We haven't used it for this demo, but you should definitely go and play with it.

Thumbnail 1160

Thumbnail 1170

Here's how you see the powers option once you update your Kiro: a tiny ghost icon with Zeus's thunderbolt showing off its powers. I absolutely love this little ghost. You can simply one-click install any power and start using it today. Cool, we've shown you some of Kiro's capabilities. Powerful stuff.

Thumbnail 1180

Adding Wisdom: Integrating the SaaS Well-Architected Lens

But let's be honest, the risk with AI isn't that it cannot build stuff. The risk is that it will happily build the wrong thing really, really fast and really, really well. And in multi-tenant SaaS, wrong is expensive. A tenant isolation bug, that's not a quick hot fix, it's a security incident. It's customer trust and potentially your business's reputation. So we needed to give Kiro a little bit more than just technical capabilities.

Thumbnail 1220

We needed to give it wisdom, the accumulated knowledge of what makes SaaS architecture work in production. And for us we needed a North Star for that, and we chose the SaaS Well-Architected Lens.

Thumbnail 1250

The SaaS Lens addresses the unique challenges of multi-tenancy and gives you a common language for architectural decisions. Here's how it shaped our actual requirements. Operational excellence meant tenant-aware monitoring: we needed to understand the health and performance of each tenant, not just the system as a whole. Security is non-negotiable; the lens makes tenant isolation the absolute foundation. Any breach is catastrophic, and our design had to be bulletproof from day one.

Thumbnail 1260

Thumbnail 1270

Thumbnail 1280

Reliability meant blast radius containment. Preventing one tenant's issues from affecting others is super important. Nobody wants to be woken up at 2 AM because of a noisy neighbor. Performance efficiency guided us to implement tenant-based scaling, which is far more cost effective than one size fits all. And finally, cost optimization. We need to ensure that we're protecting the business's bottom line as we scale, and this part of the pillar helps us with that.

Thumbnail 1300

Thumbnail 1310

Web Crawling and MCP Integration: Teaching Kiro Best Practices

Now the next challenge we faced is that we needed to somehow get all of this into Kiro's context. The SaaS Lens is actually a collection of web pages on the internet, and that's where we figured we needed a crawler. To do this, we use an open-source web crawler with over 55,000 stars. Crawl4AI is my tool of choice for open-source web crawling, and I use it all the time.

Thumbnail 1320

Thumbnail 1330

Why do I like it so much? First, because it's markdown native. It'll scrape and process all the documents into a markdown file, which is great for LLMs because LLMs love markdown. Second, it's fast, efficient, and asynchronous: I can crawl multiple links at the same time and extract media tags such as images, audio, and video. It also handles the tricky stuff for you, such as infinite scroll or whatever other JavaScript magic is new on the block. And finally, my favorite, it's free and open source, so you can use it too.

Thumbnail 1360

Thumbnail 1370

Thumbnail 1380

So next we'll start in Vibe mode and ask Kiro to use Crawl4AI to crawl the SaaS Well-Architected Lens. We've given it the URL, and right now it's installing Python and pip dependencies, including Crawl4AI, on my laptop using its agentic features. Then it uses its understanding of the library to write a Python script that crawls the entire SaaS Well-Architected Lens. It quickly figures out that it needs to fetch the supporting links and not just the landing page, and it goes back and forth to ensure the completeness of the crawl.
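
The script Kiro writes is generated on the fly and isn't reproduced in the session, but a minimal Crawl4AI call looks roughly like this. The Lens URL below is illustrative, the real script also follows the supporting links, and the exact shape of the markdown result can vary slightly between Crawl4AI versions.

```python
import asyncio
from crawl4ai import AsyncWebCrawler

# Illustrative landing page; the generated script also crawls the supporting links.
LENS_URL = "https://docs.aws.amazon.com/wellarchitected/latest/saas-lens/saas-lens.html"

async def crawl_saas_lens() -> str:
    # Crawl4AI renders the page and returns an LLM-friendly markdown version of it.
    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(url=LENS_URL)
        return str(result.markdown)

if __name__ == "__main__":
    markdown = asyncio.run(crawl_saas_lens())
    # Kiro would then clean this output and save it as a steering document.
    print(markdown[:500])
```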

Thumbnail 1390

Thumbnail 1400

Thumbnail 1410

Thumbnail 1420

Next, we need to optimize the relevant content and convert it into a steering document. After the crawl is complete, Kiro automatically cleans the data, removing irrelevant text such as cookie preferences to ensure the content is usable. After a few iterations, we have our final steering document. This gives Kiro the first superpower we needed, the wisdom we were after. It lets us stand on the shoulders of giants and build from there instead of making the same mistakes again.

Thumbnail 1430

Thumbnail 1450

Thumbnail 1460

Now you can see the finished version of the crawled and optimized SaaS Well-Architected Lens as a steering document within Kiro. Next, we want to arm our IDE with some important tools. We're going to edit the MCP.json file, which is where all the MCP config lives. We'll attach the Strands MCP server, which has specific information on how to use Strands to build agents. We'll also add the AWS documentation server, which gives our IDE up-to-date information from the AWS docs. And finally, we'll add the Amazon Bedrock AgentCore MCP server, because this is a new service and we need it to get around the knowledge limitations of the foundation models.

Thumbnail 1500

And to complement this, I'll also give Kiro a prompt asking it to create a steering document that prioritizes the use of these MCP servers whenever it's not sure. This helps Kiro avoid hallucinating details: it will now go and look up the MCP servers to expand its context whenever it's unsure. As you can see, Kiro is now reading all the steering documents.

Thumbnail 1520

Thumbnail 1530

It's also reading the MCP.json file to understand all the MCP servers, and finally it creates the actual steering document for an MCP-first approach. This is the steering document that Kiro generated from that prompt, and it tells Kiro, every time it's trying to do something, to look up the MCP servers first to make sure the code is high quality and follows best practices. Cool, we've seen how to give Kiro a North Star through a steering document.

Thumbnail 1540

The SaaS Builder Toolkit: Accelerating Multi-Tenant Development

Next, we wanted to do something more. Knowing the principles is one thing, but it doesn't mean you should have to rebuild the same stuff over and over again. For this we needed an accelerator, and for that we used the SaaS Builder Toolkit. The SaaS Builder Toolkit is open-source infrastructure tooling from AWS that solves the problems common to every multi-tenant SaaS. Tenant onboarding, user management, isolation, metering, that stuff's all built into the toolkit. So instead of building all of that again, we want to build on top of it.

Thumbnail 1580

And this is what the SaaS Builder Toolkit architecture looks like. The control plane handles all the SaaS management operations such as tenant onboarding, user management, billing, and the works. You can interact with it through a CLI or an admin webpage. The application plane is your actual application code, and here's what's important: SBT is completely unopinionated about what you build. Any application code works as long as you can subscribe to the relevant control plane messages and keep to the necessary contracts.

These SBT utilities accelerate common tasks such as tenant provisioning, deprovisioning, and user management, and more is coming as the toolkit evolves. The core utils are where you add your application-specific configuration: tenant provisioning logic goes here, identity settings go here, authorization rules go here. And tying it all together is Amazon EventBridge. This is the event bus that carries messages from the control plane to the application plane and from the application plane back to the control plane, and it includes helpers that make it easy to publish and subscribe to these messages.

Thumbnail 1660

Now this architecture is powerful, but it's also complex. So the question becomes: how do we give Kiro deep enough context to understand SBT? As we finally teach Kiro about SBT, I'd like to repeat that these are prefabricated components. You can look at them as L3 constructs, basically collections of AWS services used to build out control plane services.

Thumbnail 1680

Thumbnail 1690

Thumbnail 1700

To do this, we're going to use another open-source tool, GitMCP. This is a tool that allows you to turn any Git repository on the internet into an MCP server. We generate a URL from GitMCP, and then we ask Kiro to add this URL, first understanding it and then attaching it to the MCP.json.

Kiro will fetch the information about the MCP link we've just provided, using its fetch ability to understand it. Then it updates the MCP configuration file and the steering documents we created earlier, first reading them into its context and then adding the new entries. It will also ask you for approval every time it tries to perform an operation; you can mark the servers as trusted if you don't want Kiro to constantly pester you for permissions.

Thumbnail 1750

Thumbnail 1760

And the final result is that SBT.AWS shows up as a new MCP server here, powered through the GitMCP connection we made, and the MCP steering document also has the required updates about the SaaS Builder Toolkit. Now Kiro knows about the Well-Architected principles and has an MCP server that can read the SBT documentation.

Thumbnail 1770

Thumbnail 1800

Secret MCP Sauce: Essential Servers for AWS and Development Workflows

Cool. Now that we're here, it almost feels unfair not to share with you wonderful folks this secret MCP sauce that I use all the time. Feel free to grab these at the end; I'm going to go through them really fast. You can attach the AWS Knowledge MCP server if you want your IDE to have AWS best practices and information from blogs and whitepapers. You can use the AWS Labs CDK MCP server if CDK is your tool of choice and you do not want any hallucinations in your code.

It gives you deep enough context about CDK and AWS to write more robust CDK code. If you're tired of hallucinated Terraform resource names, the Terraform MCP server is your friend. These are all servers available from AWS Labs, and there's also a resources section at the end with some links.

Thumbnail 1830

This is a bit of a secret: the AWS API MCP server gives Kiro the superpower to run CLI commands in the background and query your AWS account about everything deployed in there. You can use it to search logs, list S3 buckets, and whatever else you want to do. Next, Context7. If you're a developer like me and you love to build on Next.js, NestJS, FastAPI, or FastMCP, but you're tired of IDE agents constantly using an old version or hallucinating details, attach Context7 and you'll get access to up-to-date documentation.

Thumbnail 1880

Chrome DevTools? Well, this gives Kiro another superpower: it can open headless browsers and traverse and check what it has actually built for you. I also use it all the time to book really cheap flight tickets, by the way. And finally, the GitHub MCP server. No developer likes fixing merge conflicts, and this is where the GitHub MCP server is your friend. You tell it what you actually want to do, and it works through all the complex Git commands under the hood, commands that would otherwise take you hours of digging through Stack Overflow to find.

Thumbnail 1910

Anthony's Build Journey: From Planning to Implementation

So that concludes the first chapter of our story, setting the foundation. We've established our principles, gathered our tools, and charted our course. But planning the expedition is one thing, and actually climbing the mountain is another. And to be your guide for that climb, I'd like to share the hands-on story of what it was like to build this. I'd like to invite Anthony, my co-presenter. Thank you. Thank you, Aman, and hello everybody. Good to see you here today.

Thumbnail 1940

So with all that foundational context and tooling in place, the first task when creating a new project (not starting from a brownfield in this case, but from a greenfield) is to begin working directly with Kiro from its very basics and from the spec-driven development perspective. Now it's important to understand that we're not going to be able to just use simple prompts, asking a question and getting a response back. For a project of this complexity, we first need to set out the guidelines for the overall project and have a conversation with Kiro, essentially a collaborative planning session between me as the developer and Kiro as my AI assistant.

By doing this, I outlined to Kiro the vision for this complex multi-component SaaS application. Kiro responded first with a project plan, which is itself a spec document with requirements, design, and tasks. Those tasks were then executed to break the solution down into seven distinct, independently deployable components, which included the SBT infrastructure that Aman went over with you, the AI help desk that we wanted to create on the application plane, and all of the services such as tenant onboarding.

This meta-level planning was absolutely critical for a project of this scale and complexity. It allowed me to visualize the architecture with Kiro and to identify the integration points well before we actually started any coding. This upfront AI-assisted planning saved me an immense amount of time and rework as we went through the process.

Thumbnail 2070

But a plan is more than just a list of components. As part of this planning process, Kiro was also able to generate documents such as this detailed architectural diagram, which gave me an immediate visual blueprint of the entire system I was about to build. I want to point out a few key architectural decisions that we solidified during this stage. You can see the clear separation between the tenant management agent for our SaaS admins and the help desk agent for our tenants. We also explicitly defined the need for a tenant context layer.

The tenant context layer included a data isolation enforcer. This was a direct result, by the way, of Kiro reading through and understanding the Well-Architected SaaS Lens: it realized that tenant isolation in a SaaS application is a non-negotiable requirement that it needed to include. At the bottom, you can also see that we leveraged shared AI models through Amazon Bedrock throughout the application. Having this complete visual blueprint ensured that I, along with everyone reviewing it with me, was aligned with Kiro on what we were going to build.

Thumbnail 2150

So with a solid plan and a clear architecture outlined, and the requirements documents ready for us to start implementing, the next step in this journey was, of course, the actual day-to-day reality of development with this coding assistant, along with the lessons I learned and how I had to adapt as changes came about. As I moved from blueprint to implementation, the focus also shifted to ensuring quality and consistency at scale for the solution being built. This became a story of continuous learning and adaptation for both myself and my AI partner.

We're going to walk through how agent hooks were used to enforce version control, how we enhanced our requirements with more specific details as time went on and we discovered gaps, and most importantly, how we navigated the challenges around testing to forge a robust automated testing strategy for the project. It's important to note that this is where the human-in-the-loop philosophy was really put to the test and showed its importance.

Thumbnail 2230

Thumbnail 2240

So let's dive in first to the day-to-day development process with Kiro. First, we would start out by reviewing our specification documents. By thoroughly reviewing these with Kiro, we can see how that particular specification is outlined. We understand what the design and the requirements and the tasks are going to be before we execute them. We can also continue to refine this with Kiro at this stage as well if we need to add any additional requirements for that particular component.

Thumbnail 2260

Next, we start engaging with Kiro to have it run through the tasks in the task list one step at a time. This iterative approach allows us to focus on the specific component we're working with at that point and to ensure that each piece of the solution is built to the highest standards. Kiro, of course, plays a crucial role here as well, generating the code and providing guidance as I work through each of the tasks.

Thumbnail 2300

Next, we need to review the results as Kiro creates them. After completing each individual task, I went through and reviewed what Kiro had created, making sure it matched both the design and the requirements specified previously. This step included code reviews, running unit tests and other testing, and giving feedback to Kiro as well as hearing feedback from Kiro based on what I found. Kiro's insights here also helped identify potential issues and ultimately improved the final code quality.

Thumbnail 2340

Next, as part of that process, we run the builds and unit tests. Testing, of course, should be a critical part of the success of any project, especially one of this complexity. Kiro built and ran those tests to verify that each component functioned as we expected. It helped create the unit tests as well as functional tests, and it built up a broad set of test coverage for the project, as we'll see in a moment.

Thumbnail 2370

Thumbnail 2390

Of course, issues are going to arise, especially during the testing process, and as they did I worked with Kiro to collaboratively work through them. Once all of that was completed, we were ready to jump forward and start working on the next specification and its tasks. This iterative process continued until we had gone through all of the requirements and all of the tasks involved and the final SaaS solution was built.

Thumbnail 2430

Thumbnail 2440

The Results: A Production-Ready SaaS Solution Built in Two and a Half Weeks

One of the things I implemented as part of this process is that at certain milestones, I also worked with Kiro to have it generate CLI tools and deployment scripts for my AWS account so that I could actually test the end result in a real-world environment as we went along. So ultimately, what did we get out of this? Well, we have all of the code and the interactions that I had with Kiro, which we're going to look at in a moment, and then we have the running version of the application, which we'll see as well.

Thumbnail 2470

Thumbnail 2480

So let's start with what Kiro created for us. What we see here as we begin is my Kiro IDE. In the upper left we see the specification documents, then the agent hooks, the steering documents that I've worked with, and finally the MCP servers. One thing to note here is that there are quite a few more specification documents than the original seven that were created as part of the project plan, which you can actually see in that project plan requirements document here. That goes to show that as I learned more over time and needed to add additional requirements, Kiro was flexible enough to work with that.

Thumbnail 2490

Thumbnail 2500

Thumbnail 2510

Thumbnail 2520

Thumbnail 2530

As we scroll through the requirements, we can then move over to the task list. One thing I want to note on the tasks is that you can actually add additional information into the task document that acts as a sort of steering mechanism specifically for that requirements document and the tasks you're running. Whereas the main steering documents are global to the entire project, this gives you a little more flexibility on a per-spec basis. Here we're now opening another one of those: the requirements document for the SaaS admin dashboard, just to show another example, along with its tasks.

Thumbnail 2540

Thumbnail 2550

Now we did have two agent hooks that I put in place. The first one being opened here was for the creation of documentation: I asked Kiro to create documents as it went along and to thoroughly document not only what it was doing, but also information that would be valuable to other developers in the long run. And the second hook is one I created to have Kiro work with Git automatically for me, so that every time it finished a task, it would commit that into Git and I never accidentally lost anything, since everything was under source control.

Thumbnail 2580

And the agent steering documents also continued to grow over time as I learned more and more throughout the project. One thing that came up during testing, for instance, is that I needed Kiro to make sure it actually built the TypeScript before it ran the tests, and occasionally it would not do that for me. So I simply asked Kiro to create a steering document to provide that behavior. Since this was done very early in Kiro's life, Kiro has of course grown in capability and now has a lot of this TypeScript testing support built right in, but at that point in time it was certainly something I needed.

Thumbnail 2650

Next, if we flip over to the file side, we can see the documents that were created under the .kiro directory, similar to what Aman showed earlier, including the specification documents, the steering documents, and any agent hooks that have been created. Below that we can see the actual source code itself in this packages directory. Kiro and I worked together to create this overall directory structure for the project; it's a monorepo with all seven of these components built into the same repository. With the newer capabilities of Kiro, we might instead have broken this out into different repositories, and it could have worked with all of those together today, but in this case the monorepo worked very well.

Thumbnail 2660

Thumbnail 2690

So we've seen at this point what Kiro can output and how we can interact with it, and looking at the final product, it's actually pretty impressive. Before we jump into the demo of how the UI actually works, to show that this is a true product, a true solution we made and not smoke and mirrors, I want to go over some pretty impressive numbers with you. First of all, as I mentioned, this is a monorepo with seven packages built within it. This allowed us to maintain consistency across all of those different packages and solutions, although they are all independently deployable, which is one of the requirements that I asked Kiro to meet.

Also, as part of this repository, testing again was very important. Kiro created 226 different test files for this project. The project itself was many tens of thousands of lines of code, and I wanted to ensure that we had really valid and thorough testing for the solution. What we ended up with was a test-to-production code ratio of one line of test code for every four lines of production code. Now, I've done a lot of development in my days, but I do have to say that by working through this project with Kiro, I had the most thorough unit, functional, and component testing that I've probably ever had in any other project. It is very thorough.

Thumbnail 2790

The other part is that Kiro also created documentation, and not just documentation about what it was doing: it created API references, user guides for both the tenant admins and the help desk, and of course technical specifications that could be important for developers who come along after me and want to maintain or extend this project in the future. So 327 documents ultimately were created as part of that. Of course, we did deploy into an AWS account. For each component I had an individual CloudFormation stack that Kiro created on my behalf and, when I asked it to, would even deploy into AWS.

Thumbnail 2810

Now all of this is pretty impressive, but the most impressive thing of all is that this entire solution, including a full-fledged production-ready SaaS, a full tenant application that is essentially a help desk with ticket creation and all of the data interactions thereof, as well as thoroughly defined security including full tenant isolation, was created by one person in two and a half weeks. Beyond that, I never wrote a single line of code to do this; Kiro did it all for me. Was it always perfect? No, I had to work with Kiro and address things as they came up, but overall I was incredibly impressed with this capability: when creating features, we can take what took weeks down to days, and what took months down to weeks.

Thumbnail 2880

Thumbnail 2890

So let's take a look at what this actually looks like from a UX perspective. We're starting out here with the admin screen for the SaaS side of this, for SaaS admins I should say, looking specifically at how you would interact with the control plane as a SaaS admin. So we have a SaaS admin here. The SaaS admin is going to authenticate, and I'm using Cognito as part of this project to act as the IdP.

Thumbnail 2910

Here we see the dashboard that is presented for this user. Now this user is an admin, so they have more capabilities than other users would have, including a full dashboard support, the ability to introspect and find tenants, to edit those tenants, to create new tenants. Now one thing I will say about this dashboard that's pretty interesting, and I'll be honest, a lot of this actually isn't hooked up to anything. It's a little smoke and mirrors on some of these dashboard features. But Kiro created all of that automatically. I didn't ask it to do that.

Thumbnail 2940

Thumbnail 2960

It's giving suggestions on what a successful SaaS solution would look like and what it would have within it. Here we can see one of the tenants that I created, as well as the ability to create new ones. We also have a user management section here, and we can see the tenant admin for the tenant that I created as well as the administrative admin. Now if we were doing this in a different way, I wouldn't necessarily mix these two user bases together, but it demonstrates the capability, and again all of this is hooked into Cognito.

But we also had one extra requirement, and that was that we wanted to have a full AI assistant and an agent that could interact with the same APIs that we were creating for managing these tenants, doing onboarding, all of that. So we created that agent here with tooling to access those APIs and the communication protocols to go back and forth with it. So I can ask about tenants,

Thumbnail 3010

Thumbnail 3030

I could ask for information about a particular tenant. I could even ask the AI agent to create a new tenant or onboard a new tenant for me. And again, all of this was part of that initial two weeks. Now here we see the opposite side, from the tenant's perspective. This is the actual help desk interface that was created. We can see the tenant admin user that we saw earlier, and again, when I sign in, it's going to authenticate against Cognito, and we have a completely new and different interface.

Thumbnail 3050

Thumbnail 3060

Thumbnail 3070

Now Kiro, kind of out of the box when it's creating UX, likes to design things very similarly, so we have a left nav here and a dashboard. This dashboard actually is quite a bit more functional than the one that we saw inside of the admin, by the way. It's more hooked up, but as an individual user of this tenant, I can go in, I can see the help desk tickets that were created for the user. I can manage those tickets. I can submit and create a new ticket, as you can see on the screen here, and of course, ultimately, we can then put comments on those tickets, anything that you would expect from a help desk.

And we did add another agent here to interact with the help desk as well in a very similar manner to what we saw with the tenant administration. However, in this case you could ask for information about a ticket. You could create new tickets. If you were dealing with a knowledge base, for instance, we now have the structure in place that we could hook that agent up to a knowledge base and have that implemented as well as part of the ticketing solution. Pretty powerful.
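
The session doesn't show the agent code itself, but with the Strands Agents SDK (the same Strands the MCP server taught Kiro about earlier) a help desk agent with a ticket tool can be sketched roughly like this; the tool body and names are hypothetical stand-ins for the real tenant-scoped ticket API.

```python
from strands import Agent, tool

@tool
def create_ticket(title: str, description: str) -> str:
    """Create a help desk ticket for the current tenant and return its ID."""
    # Hypothetical: in the real app this would call the tenant-scoped ticket API
    # behind API Gateway, with the tenant ID taken from the caller's token.
    return "TICKET-123"

helpdesk_agent = Agent(
    system_prompt="You are a help desk assistant. Use your tools to manage tickets.",
    tools=[create_ticket],
)

# Natural-language request; the agent decides when to call create_ticket.
helpdesk_agent("My dashboard won't load, please open a ticket for me.")
```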

Thumbnail 3100

Thumbnail 3120

Thumbnail 3130

Here we can see the CloudFormation stacks. Now I mentioned that there were seven that were a part of this project, although there are nine listed here. That's because there were a couple that were already in my account at the time, but they are fully deployed. We're jumping over now into DynamoDB and we can see here the many tables that were created as a part of both the help desk application as well as the SaaS solution itself, and again all of this was created by Kiro, right? I never wrote a line of code in order to make any of this happen.

Thumbnail 3140

Thumbnail 3160

Thumbnail 3170

Thumbnail 3180

If we take a look at and explore one of the tables here, we'll open up the help desk ticketing table, and we can see that the two tickets we saw in the help desk application are listed here too. Notice that the tenant ID is the key for this table, to ensure that we're properly applying tenant isolation at the data level, since this is a pooled SaaS solution. Here we can see the Lambda functions that were created for this. There are 47 of them, and they cover all of the CRUD interfaces for creating and updating the various data types that we have in the solution. And finally, if we jump into Amazon Bedrock AgentCore, just to show again that we're not dealing with smoke and mirrors here, we can look at the agent runtimes and see our two agents listed, both the tenant agent as well as the help desk agent.
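
The table and attribute names below are hypothetical, but this is the shape of the pooled-model access pattern being described: every query is keyed on the tenant ID, so one tenant's request can never return another tenant's tickets.

```python
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("helpdesk-tickets")  # hypothetical table name

def list_tickets(tenant_id: str) -> list[dict]:
    """Return only the tickets that belong to the calling tenant."""
    response = table.query(
        KeyConditionExpression=Key("tenantId").eq(tenant_id)  # key on the tenant ID
    )
    return response["Items"]

# In the real solution the tenant ID comes from the authenticated user's token,
# never from client input, which is what makes the isolation trustworthy.
tickets = list_tickets("tenant-123")
```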

Thumbnail 3200

Lessons Learned: Challenges, Adaptations, and Key Takeaways

All right, so pretty powerful stuff and pretty amazing what you can accomplish in two weeks with one person. Oh, and one other thing, most of this time, by the way, was Kiro just running in the background because I have a day job as well, right? So it was doing most of this in the background while I was doing other things. But it was not all sunshine and roses, and with any project you're going to have issues that are going to come up. There are going to be trials and tribulations that you must encounter along the way, and it certainly was the case here too.

These challenges pushed me, in some cases even to the point of wondering what I had gotten myself into, but ultimately I was able to adapt, and they made me stronger and taught me more about how Kiro works. Today I want to share some of these trials so that maybe you can learn from what I went through and avoid the same issues as you work through your own projects. It also really solidifies that, although Kiro can do so much on its own, there still must be a human in the loop to validate and verify what's going on and to address challenges as they appear.

Thumbnail 3280

So as I navigated the development process, here are some of the challenges that I had to work through and that really pushed me to adapt. First, testing and debugging issues. When I first started the project, I made sure that testing was part of the requirements for the solution. However, if you're not super specific, Kiro (and this goes for pretty much any AI-assisted development, or vibe coding even) can sometimes go off on its own and do its own thing from one session to another. So in certain cases I would end up with two completely different test frameworks implemented at the same time.

Now I could have fixed that from the very beginning and made sure that didn't happen, either by implementing the appropriate steering documents to drive Kiro to use one centralized testing framework or by stating that explicitly in the requirements. But I didn't have it from the beginning, so it was something I had to work through. As you begin working in spec-driven development, my suggestion is to take the time at the beginning to really analyze and understand what's inside those requirements and help Kiro build toward your vision by being very explicit within them; then you won't necessarily run into this particular issue.

Thumbnail 3350

Thumbnail 3360

The second one I ran into was around AgentCore and Strands, specifically because when I first started this project, it was not only the very early days of Kiro but also the very early days of AgentCore and Strands. In one conversation, Kiro actually told me that AgentCore didn't even exist, because it had no idea that it did. So I had to go an extra step and actually teach Kiro about AgentCore and Strands using some of the techniques that Aman talked about earlier. Having those techniques in my back pocket really saved me, because I was able to simply show Kiro how to implement these solutions. In my case, there was a very early AgentCore getting-started repository, which is still out there by the way, and since it was on GitHub, I used GitMCP to create an MCP link and Kiro learned everything it needed from there. A pretty powerful way to get around it, but it was certainly a challenge to overcome.

Thumbnail 3420

And lastly, I'll reiterate missing requirements. This came up as I began working on the help desk part of the solution. I had focused so much of my time, effort, and energy on the creation of the SaaS solution, making sure it was production ready, that the security followed the best practices from the Well-Architected Lens, and that it followed the guidance of the SaaS Builder Toolkit, that I neglected some of the requirements I really needed inside the help desk application. So I had to go back, open a new Kiro session, make sure I was on the spec side of the house, and simply start working with Kiro to add new specifications to pick up that slack. That flexibility exists with Kiro no matter where you are in your project, even if it's towards the end like this was. You can always go back and work through those.

So if we're talking about key takeaways here, it's important to know that these challenges were not just obstacles for me to overcome. They were a learning opportunity for me personally to understand how best to work with Kiro. As you or your developers work on these kinds of solutions, I would challenge you to treat these challenges the same way: even though they can be very frustrating, take them on as a learning experience and keep moving forward.

Thumbnail 3520

Thumbnail 3540

All right, so we've come to the end of our journey here, kind of full circle. We had an ambitious goal, a mountain to climb so to speak: build a complete production-ready SaaS solution on AWS powered by a partnership with a GenAI IDE, in this case Kiro. As you've seen, we were successful. We built this solution with a comprehensive control plane and an agentic help desk with next-generation capabilities, but the product we built was only half the story. The other half is what we learned.

Thumbnail 3560

So first of all, we learned that quality is fundamental. We need to focus on quality as we're building these specifications. Second, and as part of this, clarity is the ultimate accelerator. The more clarity you give Kiro as you build your requirements, the better the solution Kiro will build for you. And lastly, this is truly a partnership between you as a developer and Kiro as your AI assistant. It's a partnership, not a magic button; Kiro can't magically make things appear for you.

Thumbnail 3590

Thumbnail 3610

The result, of course, is that we get a build that takes days, not weeks, and weeks, not months, which is pretty impressive in and of itself. And as we move along, it's important to note that the future is now, with agents inside our IDE. Oh, and one last thing: SaaS isn't dead. You may have heard that it just smells like GenTech now, and that's the last slide I'll leave you with here today. And I thank you for coming. We have many of those resources that Aman talked about earlier; please feel free to grab these and use them in your own solutions.

Thumbnail 3630

On behalf of Aman and myself, thank you all so very much for being here. Have a great day.

; This article is entirely auto-generated using Amazon Bedrock.