RSS preview of Blog of The Practical Developer

FastAPI from Zero: Writing Your First API Route

2026-01-12 08:59:36

FastAPI is one of the fastest Python web frameworks. If you are new to it, this post introduces FastAPI and walks you through writing your first route in less than 10 minutes.

Table of Contents

  • What is FastAPI
  • Building Blocks of FastAPI
  • Features of FastAPI
  • Your First FastAPI Route
    • Setting Up Environment on Linux
    • Setting Up Environment on Windows
    • Writing your First API route
    • Automatic Documentation

What is FastAPI?

FastAPI is a modern Python web framework used to build high-performance APIs (Application Programming Interfaces).

Building Blocks of FastAPI

FastAPI owes its place among the top-rated and fastest Python frameworks for API development to the two giants it combines: Starlette and Pydantic.

  1. Starlette is a lightweight ASGI framework that is ideal for running async web services in Python. Documentation
  2. Pydantic, on the other hand, is a library for automatic data validation and serialization: it validates incoming data against Python type hints and serializes Python objects for output (a minimal example follows this list). Documentation
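To see what Pydantic brings, here is a minimal, self-contained sketch assuming Pydantic v2 (the Item model and its fields are illustrative, not part of FastAPI itself):

from pydantic import BaseModel, ValidationError

class Item(BaseModel):
    name: str
    price: float
    in_stock: bool = True

# Validation: numeric strings are coerced to the declared type, bad data raises an error
item = Item(name="Keyboard", price="49.99")
print(item.price)         # 49.99 (float)

# Serialization: the model converts back to plain, JSON-ready Python data
print(item.model_dump())  # {'name': 'Keyboard', 'price': 49.99, 'in_stock': True}

try:
    Item(name="Mouse", price="not a number")
except ValidationError as exc:
    print(exc)            # explains that price must be a valid number

FastAPI applies exactly this validation and serialization to request bodies, query parameters, and responses automatically.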

Main Features of FastAPI

  • Fast: FastAPI is very fast thanks to features such as:
    • Automatic data validation
    • Automatic serialization, i.e. converting Python objects to JSON and vice versa
    • Native support for async (non-blocking) operations
    • Automatic documentation
  • Fewer bugs: reduces human (developer) induced errors by about 40%.
  • Intuitive: great editor support, auto-completion, and less time spent debugging.
  • Robust: get production-ready code with automatic interactive documentation.
  • Standards-based: FastAPI follows the open standards for APIs: OpenAPI and JSON Schema.

Your First FastAPI Route.

Setting Up Environment on Linux

  • Open your terminal and run the following commands
# Put your system up to date 
sudo apt update && sudo apt upgrade

# Install the packages globally
sudo apt install python3 python3-pip python3-venv

# Create a dedicated folder
mkdir intro_fastapi

# Navigate to the folder
cd intro_fastapi

# Create a virtual environment
python3 -m venv .venv

# Activate the virtual environment
source .venv/bin/activate

# Install the required packages to get started
pip install "fastapi[standard]" "uvicorn[standard]"

Setting Up Environment on Windows

  • Open PowerShell and run the following commands
# Check Python installation
python --version

# If Python is not installed, download it from:
# https://www.python.org/downloads/windows/
# (Make sure to check "Add Python to PATH" during installation)

# Create a dedicated folder
mkdir intro_fastapi

# Navigate to the folder
cd intro_fastapi

# Create a virtual environment
python -m venv .venv

# Activate the virtual environment
.venv\Scripts\activate

# Upgrade pip (recommended)
python -m pip install --upgrade pip

# Install the required packages to get started
pip install "fastapi[standard]" "uvicorn[standard]"

Writing Your First API route

  1. Make sure you are in the project root folder.
  2. Open the folder with your favourite editor.
  3. File structuring: create the following file structure
intro_fastapi/
└── app/
    └── main.py
  4. Open main.py and write the following code
from fastapi import FastAPI

# Create an instance of the FastAPI Class
app = FastAPI()

# register your app's first route using the app decorator
@app.get("/")
def welcome():
    return {"message": "My First API route"}
  5. Run the app (in a terminal)
uvicorn app.main:app --reload
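Once the server is running, you can verify the route from another terminal. A minimal check using httpx (which the fastapi[standard] install typically pulls in; otherwise pip install httpx), assuming the default host and port:

import httpx

# Query the root route we just registered (assumes uvicorn is running on 127.0.0.1:8000)
response = httpx.get("http://127.0.0.1:8000/")
print(response.status_code)  # 200
print(response.json())       # {'message': 'My First API route'}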

Website route browser result

Automatic Documentation

FastAPI provides automatic interactive documentation in two flavors: Swagger UI, served at /docs, and ReDoc, served at /redoc.

Amazon Bedrock AgentCore : MCP Server on AgentCore Runtime and AgentCore Gateway

2026-01-12 08:41:56

What is MCP? MCP (Model Context Protocol) is an open-source standard for connecting AI applications to external systems such as APIs, databases, and other software.

But how do you create and deploy an MCP server so that it can be accessed from an MCP client? Here is one way: AgentCore Runtime and AgentCore Gateway. This tutorial explains how to create and deploy a receipt-extraction MCP server on AgentCore Runtime and then integrate it with AgentCore Gateway.

This MCP server extracts information from a receipt photo, such as store name, purchase date, and total/amount, then writes the extracted information to an Amazon DynamoDB table and emails it using Amazon SNS. Finally, the MCP server is tested with LangChain.

REQUIREMENTS :

  1. AWS account (or AWS credentials), you can sign up/sign in here
  2. Google Gemini account, you can sign up/sign in here
  3. Langchain.

The AWS services used by this MCP server:

  • Amazon S3: upload files from local storage and download them for the receipt-extraction MCP server to process.
  • AWS Secrets Manager: store the Gemini API key used for LLM inference in the receipt-extraction MCP server.
  • AgentCore Runtime, AgentCore Gateway, and AgentCore Identity: the Amazon Bedrock AgentCore services that host, expose, and authenticate the MCP server.
  • AgentCore Starter Toolkit: quickly configure and deploy the MCP server along with several AWS services such as Amazon ECR, AWS CodeBuild, and AWS IAM.
  • Amazon Cognito: authentication and authorization for AgentCore Identity.
  • Amazon DynamoDB: write receipt-extraction results to a NoSQL database.
  • Amazon SNS: send email notifications with receipt-extraction results.

STEP-BY-STEP :

A. Creating the Amazon S3 bucket, storing the Gemini API key in AWS Secrets Manager, creating the Amazon DynamoDB table, and creating the Amazon SNS topic/subscription.

!pip install python-dotenv boto3

from google.colab import userdata
from dotenv import load_dotenv
import boto3
import json
import os

os.environ["AWS_ACCESS_KEY_ID"] = userdata.get('AWSACCESSKEY')
os.environ["AWS_SECRET_ACCESS_KEY"] = userdata.get('AWSSECRETKEY')

# Get Gemini API Key and email address
load_dotenv("geminiapikey.txt")
gemini = os.getenv("GEMINI_API_KEY")
gmail = os.getenv("EMAIL")
secret_name = "geminiapikey"
table_name = "receiptsExtraction"
topic_name = "receiptsExtractionEmail"
region = "us-west-2"

Use this code to create an Amazon S3 bucket.

s3 = boto3.client('s3', region)
s3.create_bucket(
    Bucket="receipts-extraction",
    CreateBucketConfiguration={
        'LocationConstraint': region
    }
)
print("This bucket is now available.")

S3 bucket

Use this code to store the Gemini API key in AWS Secrets Manager.

apikey = boto3.client('secretsmanager', region)
secret_dict = {"GEMINI_API_KEY": gemini}
response = apikey.create_secret(
    Name=secret_name,
    Description="Gemini API Key",
    SecretString=json.dumps(secret_dict)
)
print("Gemini API Key is now stored.")

Gemini API Key

Use this code to create the receipt-extraction results table in Amazon DynamoDB.

dynamodb = boto3.client('dynamodb', region)
table = dynamodb.create_table(
    TableName=table_name,
    KeySchema=[{'AttributeName': 'storeName', 'KeyType': 'HASH'}],
    AttributeDefinitions=[{'AttributeName': 'storeName', 'AttributeType': 'S'}],
    BillingMode='PAY_PER_REQUEST',
    OnDemandThroughput={'MaxReadRequestUnits': 200,'MaxWriteRequestUnits': 200}
)
print("Receipt extraction table is now available and save output from MCP server.")

DynamoDB

Use this code to set up email notifications with the receipt-extraction results using Amazon SNS.

sns = boto3.client('sns', region)
topic = sns.create_topic(Name=topic_name)
topic_arn = topic['TopicArn']
# Subscribe email to topic but must open your inbox email and click 'Confirm Subscription'
sns.subscribe(
    TopicArn=topic_arn,
    Protocol="email",
    Endpoint=gmail
)
print("Open your inbox with subject AWS Notification - Subscription Confirmation and click Confirm Subscription.")

SNS confirmation

SNS confirmed

SNS topic

B. MCP Server Development

This structure is very important when creating the MCP server. The MCP server's Python code exposes the following tools (a minimal registration sketch follows the list):

  • receiptExtraction : Extract the store name, purchase date, and total/amount from the receipt photo, with structured output from Google Gemini 2.5 Flash based on a ReceiptExtractionResult class in this format:
from pydantic import BaseModel, Field

class ReceiptExtractionResult(BaseModel):
    """Extracted receipt information."""
    storeName: str = Field(description="Name of the store. Must be uppercase.")
    purchaseDate: str = Field(description="Purchase date in \"DD-MM-YYYY\" format.")
    total: float = Field(description="Total or amount. Numbers and (.) only, without $ or any other currency symbol.")
  • writeOutput : Write the receipt-extraction results to the Amazon DynamoDB table.

  • sendEmail : Send an email notification with the receipt-extraction results using Amazon SNS.
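A minimal sketch of how these three tools could be registered with the MCP Python SDK's FastMCP helper, assuming a stateless streamable-HTTP server; the tool signatures and bodies below are illustrative placeholders, not the exact production code:

# Hypothetical sketch of mcp_server.py: tool names match the list above,
# parameters and bodies are placeholders.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("receipt-extraction", host="0.0.0.0", stateless_http=True)

@mcp.tool()
def receiptExtraction(s3_key: str) -> dict:
    """Download the receipt photo from S3 and extract fields with Gemini 2.5 Flash."""
    ...  # call Gemini with the ReceiptExtractionResult schema and return its fields

@mcp.tool()
def writeOutput(storeName: str, purchaseDate: str, total: float) -> str:
    """Write the extraction result to the DynamoDB table."""
    ...  # boto3 dynamodb.put_item(...)

@mcp.tool()
def sendEmail(storeName: str, purchaseDate: str, total: float) -> str:
    """Publish the result to the SNS topic so email subscribers are notified."""
    ...  # boto3 sns.publish(...)

if __name__ == "__main__":
    mcp.run(transport="streamable-http")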

Create an Amazon Cognito user pool, domain, resource server, and user pool client for AgentCore Gateway and AgentCore Runtime inbound authentication. This step lets a user/application access the agent/tool in AgentCore Runtime or AgentCore Gateway.

Cognito

Use this code to configure the AgentCore Runtime.

from bedrock_agentcore_starter_toolkit import Runtime
agentcore_runtime = Runtime()
region = "us-west-2"
agent_name = "gemini-mcp-server"

# runtime_cognito_client_id and runtime_cognito_pool_id come from the Cognito
# user pool and app client created in the previous step
runtime = agentcore_runtime.configure(
    entrypoint="mcp_server.py",
    auto_create_execution_role=True,
    auto_create_ecr=True,
    requirements_file="requirements.txt",
    region=region,
    agent_name=agent_name,
    protocol="MCP",
    authorizer_configuration={
        "customJWTAuthorizer": {
            "allowedClients": [runtime_cognito_client_id],
            "discoveryUrl": f"https://cognito-idp.{region}.amazonaws.com/{runtime_cognito_pool_id}/.well-known/openid-configuration",
        }
    }
)
runtime

Explaining the above code:

  • The first two lines import and initialize the AgentCore Runtime starter toolkit.
  • agentcore_runtime.configure configures the AgentCore Runtime with the entry point (the MCP server Python code), auto-creates the IAM role for the Runtime and the ECR repository, installs the requirements file, and sets the region, agent name, MCP protocol, and the Cognito JWT authorizer.

Use this code to launch the MCP server (the agent) on AgentCore Runtime. Wait up to one minute.

launch_result = agentcore_runtime.launch()

MCP Server

IAM Role

C. TROUBLESHOOTING / VERY IMPORTANT INFORMATION

After the AgentCore Runtime is available, invoking the MCP server may fail with an error like the screenshot below.

Error

Open CloudWatch Logs or AgentCore Observability to see what caused this error.

DynamoDB error

SNS error

Go to Amazon Bedrock AgentCore -> Agent runtime, then click the agent name you created. Click "Observability dashboard" or "CloudWatch logs" to see this error.

This error happens because DynamoDB and SNS actions are not allowed in the Runtime's IAM role.

Go to Amazon Bedrock AgentCore -> Agent runtime again, then click the agent name you created.

Click "Version 1", then click the IAM service role under Permissions (e.g. AmazonBedrockAgentCoreSDKRuntime-{region-name}-{random-number-letter}), as in the screenshot above.

IAM policy

Click the related IAM policy name (e.g. BedrockAgentCoreRuntimeExecutionPolicy-{your-agent-name}), as in the screenshot above.

AWS Secret Manager

Go to AWS Secrets Manager, click the secret name, then copy the Secret ARN of the Gemini API key.

Edit IAM policy

Add the Secret ARN of your Gemini API key to the Resource of the "secretsmanager:GetSecretValue" action, in this form:

arn:aws:secretsmanager:us-west-2:{aws_account_id}:secret:geminiapikey-{random-number-letter}

Then click Add new statement -> choose the AWS services (S3, DynamoDB and SNS) -> check All actions -> Add resource -> click Next, click Save, and view the IAM policy after the change.

OR edit the permissions in this IAM policy directly and add the statements below after the "Sid": "AwsJwtFederation" block -> click Next, click Save, and view the IAM policy after the change.

{
    "Sid": "DynamoDB",
    "Effect": "Allow",
        "Action": [
        "dynamodb:*"
    ],
    "Resource": [
        "arn:aws:dynamodb:us-west-2:{aws_account_id}:table/*"
    ]
},
{
    "Sid": "AmazonSNS",
    "Effect": "Allow",
    "Action": [
        "sns:*"
    ],
    "Resource": [
        "*"
    ]
},
{
    "Sid": "S3",
    "Effect": "Allow",
    "Action": [
        "s3:*"
    ],
    "Resource": [
        "*"
    ]
}

D. AgentCore Gateway

Based on the AgentCore Gateway documentation, the MCP server target type only supports OAuth (client credentials) or M2M (machine-to-machine) auth. Based on the AgentCore Identity documentation, using M2M auth requires creating a user pool, a resource server, client credentials, and a discovery URL configuration.
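As an illustration of that client-credentials (M2M) flow, here is a minimal sketch of fetching a bearer token from the Cognito domain created earlier; the domain, client ID/secret, and scope values are placeholders you would replace with your own:

import requests

# Hypothetical placeholders: use your Cognito domain, app client, and resource-server scope
TOKEN_URL = "https://YOUR-DOMAIN.auth.us-west-2.amazoncognito.com/oauth2/token"
CLIENT_ID = "your-app-client-id"
CLIENT_SECRET = "your-app-client-secret"
SCOPE = "your-resource-server/your-scope"

resp = requests.post(
    TOKEN_URL,
    data={"grant_type": "client_credentials", "scope": SCOPE},
    auth=(CLIENT_ID, CLIENT_SECRET),  # HTTP Basic auth with the client credentials
)
resp.raise_for_status()
access_token = resp.json()["access_token"]  # bearer token for calling the Gateway's MCP endpoint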

Create an AgentCore Identity for Runtime outbound auth. This step lets the agent/tool access Gateway targets such as an MCP server, an OpenAPI/REST API, API Gateway, or a Lambda function.

Identity

Create the AgentCore Gateway using the Gateway user pool. This step integrates the MCP server on AgentCore Runtime with AgentCore Gateway.

Gateway

Create the MCP server on AgentCore Runtime as an AgentCore Gateway target. This step allows invoking the MCP server on the Runtime via the Gateway target.

MCP Server target

Target detail

Try invoking the MCP server using the LangChain MCP client, as sketched below.
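A minimal sketch of that invocation, assuming the current langchain-mcp-adapters API; the Gateway MCP endpoint URL is a placeholder (copy yours from the Gateway details page), and the access token comes from the client-credentials sketch above:

import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient

# Hypothetical placeholders: your Gateway's real MCP URL and the Cognito bearer token
GATEWAY_MCP_URL = "https://your-gateway-id.gateway.bedrock-agentcore.us-west-2.amazonaws.com/mcp"
ACCESS_TOKEN = "paste-the-token-from-the-cognito-sketch-above"

async def main():
    client = MultiServerMCPClient({
        "receipts": {
            "transport": "streamable_http",
            "url": GATEWAY_MCP_URL,
            "headers": {"Authorization": f"Bearer {ACCESS_TOKEN}"},
        }
    })
    tools = await client.get_tools()  # should list receiptExtraction, writeOutput, sendEmail
    print([tool.name for tool in tools])
    # The returned LangChain tools can be passed to an agent or invoked directly, e.g.:
    # result = await tools[0].ainvoke({"s3_key": "receipts/example.jpg"})

asyncio.run(main())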

Invoke

DynamoDB table


CONCLUSION : Amazon Bedrock AgentCore Runtime can host an MCP server, and AgentCore Gateway can expose that MCP server as a gateway target.

DOCUMENTATION :

GITHUB REPOSITORY : https://github.com/budionosanai/amazon-bedrock-agentcore-one-to-one/tree/main/mcp-server

Thank you,
Budi :)

A Tier List for Company AI Strategies.

2026-01-12 08:36:24

These days, conversations inevitably turn to the topic of AI. We should expect this, since AI has transformed our world and will continue to do so, yet I can't help noting the disparate takes, understandings, and discourse on the matter, even among IT professionals. Most of us don't possess deep academic training on the subject (I've been more of a hobbyist for most of my engineering career) and have spent most of our lives in a world where AI was just a curiosity, so I always observe a certain juvenile quality to these conversations, as if the masses were thrust into carrying on an informed conversation about a subject that takes years and years to master.

And so I offer the reader this tier list of organizational AI strategies. These conversations help me realize that the majority of companies are still adjusting to this new world, with most making sincere efforts to appear informed yet still searching for the path to post-scarcity AI nirvana.

Aside from drawing parallels between colleagues and companies, the list frames the topic of AI as a natural progression: the nature of the change, the level of impact it can have, what to expect from the next level of transformation, and what will be needed to reach it.

I offer these five tiers, with a fun culinary spice to them. Tier I contains more points than any other tier; seeing how all organizations start their journey here, and most companies are still at this level, that is to be expected.

I: Sprinkle

Light opportunistic exploitation of AI solutions

  • characterized by 'sprinkling' AI on existing solutions and processes.
  • strategists generalize 'AI' and apply it vaguely, with little understanding of specific AI instances, appropriateness, cost, or value.
  • often, these 'AI' strategies are just applications of traditional IT/computer solutions mislabeled as AI.
  • Tier I strategies also claim AI utilization through curation of AI-labeled vendors, tooling, basic plugins, and integrations.
  • Tier I strategies are often broad yet shallow, and lack measurement or KPIs.
  • no formal strategy, training, or organizational mandates are provisioned or imposed.
  • Tier I strategies are spearheaded by leaders with very little fluency in AI, or perhaps in technology in general, who rely on an assumed instinctual understanding of AI and its value proposition (for example, being unable to contrast AI methodologies such as deep learning, neural networks, gen AI, and LLMs, and how these techniques differ in suitability for different problems).

II: Stir

Deliberate integration of AI into specific workflows and tools.

  • Strategists have at least a basic understanding of AI as a field.

  • Vendors and tooling undergo informed analysis before adoption; such adoptions feature custom integrations and configurations.

  • Organizational adoption features basic guidelines and training

  • The focus is on targeted productivity boosts and efficiency gains.

  • Changes are still additive rather than transformative

III: Simmer

Deep embedding of AI across multiple functions, with custom solutions and data feedback loops.

  • Organizational leaders either possess AI fluency or consult and delegate AI adoption and operations to informed strategists.
  • Fine-tuned models, internal AI platforms, agentic workflows (AI agents that plan and execute multi-step tasks), and RAG systems.
  • Cross-functional governance, data infrastructure investments, and measurable ROI tracking.
  • AI influences decisions, optimizes operations, and starts reshaping how work is done.
  • Legacy roles and titles are augmented and replaced by AI oriented skillsets.
  • The organization is "letting AI simmer" — changes are gradual but pervasive.

IV: Bake

AI is baked into core business processes and products; the company redesigns operations around AI capabilities.

  • Enterprise-wide platforms, autonomous agents/swarm systems, predictive analytics at scale, and AI-driven automation of complex workflows.
  • Significant talent hiring, ethical frameworks, and cultural shift toward AI fluency.
  • New revenue streams or cost structures emerge from AI (e.g., AI-powered products or services).
  • AI is no longer a layer — it's fundamental to how value is created.

V: Feast

AI-native transformation: the entire organization is built or rebuilt around AI as the primary driver.

  • The organization is now an industry leader thanks to an exponential advantage realized through AI

  • Continuous AI-human collaboration and evolution of proprietary AI models

Making data conversational: Building MCP Servers as API bridges

2026-01-12 08:34:33

At the Fort Wayne AI meetup on 09 January, I presented a pattern I've discovered while building Vibe Data: making data conversational by putting a Model Context Protocol (MCP) server on top of existing REST APIs. Your API provides your data, an MCP server provides access to the API for a desktop LLM client like Claude or ChatGPT, and the LLM client provides conversational access to your data.

This post captures what I learned building this architecture and what I shared with the developer community.

The situation: Two audiences, one backend

When you're building a data platform, you inevitably face a challenge: different users need different interfaces.

Developers want:

  • Programmatic API access
  • JSON they can transform
  • Full control for automation
  • Integration with their tools

End users want:

  • Quick answers to questions
  • No coding required
  • Natural language queries
  • Simple, intuitive interfaces

The traditional approach is to build two separate systems: a REST API for developers and dashboards or reports for end users. But this creates maintenance overhead, duplicate business logic, and architectural complexity.

The Solution: MCP as an API bridge

I've found a better pattern: build your REST API first, then add an MCP server as a thin conversational wrapper.

Here's the architecture:

          Data platform
                 │
            REST API
    (Business Logic | Auth | Rate Limiting)
                 │
        ┌────────┴────────┐
        │                 │
   Direct API        MCP Server
    Access            (~200 lines)
        │                 │
   Developers      Users + Claude
   (JSON/Code)     (Conversation)

The REST API remains your single source of truth. All business logic, authentication, rate limiting, and caching live here. The MCP server is just a formatting layer that:

  1. Receives structured tool calls from Claude.
  2. Translates them to API requests.
  3. Formats JSON responses as natural language.
  4. Returns conversational text.

Why this pattern works

1. Separation of concerns

Your API handles the hard stuff:

  • Database queries
  • Authentication and authorization
  • Rate limiting and caching
  • Business logic and validation

Your MCP server handles presentation:

  • Formatting JSON as readable text
  • Adding context and insights
  • Transforming data into conversation

When your API changes, both interfaces get the update automatically. No duplicate logic to maintain.

2. Opportunity for two products from one source

You can serve both audiences without building twice:

  • API tier: Developers get JSON, higher rate limits, programmatic access
  • Conversational tier: End users get Claude access, simpler pricing

Same backend. Different value propositions.

3. Progressive enhancement

You're not choosing between API or MCP. You're adding conversational access to an existing system:

  • Start with API (proven, understood, lots of tooling)
  • Add MCP layer when ready (thin wrapper, low risk)
  • Keep both interfaces running (serve more users)

What I learned building this

MCP works best as a bridge

Don't try to rebuild your entire backend in MCP. Don't put business logic in your MCP tools. Build a solid REST API first. That's your product. The MCP server should be ~200 lines of code that calls your API and formats responses.

// Excerpt from the MCP server: call the REST API, then format the JSON response as readable text
async getToolMetrics(toolId) {
    try {
      const response = await fetch(
        `${this.baseURL}/tools/${toolId}/metrics`
...
const toolName = data[0].name || toolId;
let output = `📈 ${toolName.toUpperCase()} - ${months} Month History\n\n`;
output += `**Download Trend:**\n`;
...

Different interfaces for different users

Developers want JSON they can transform however they need. They'll build custom dashboards, automate reports, integrate with other systems. Give them an API.

End users just want answers. They don't want to learn curl commands or read API documentation. They want to ask "What's Cursor's growth trend?" and get an answer. Give them Claude with MCP.

You can serve both without building duplicate systems.

The formatting layer is where MCP adds value

Your API returns data:

{"downloads": 8100000, "growth_pct": 55.8}

Your MCP server transforms it:

Cursor grew 55% over the quarter, reaching 8.1M monthly 
downloads, indicating strong developer adoption.

Same data. One is optimized for machines. One is optimized for humans.

This formatting layer, turning structured data into meaningful insights, is where conversational interfaces shine. It's not just passing through JSON; it's contextualizing and explaining it.

The Demo: Making data conversational

During the presentation, I demonstrated this with Vibe Data's adoption intelligence and Claude desktop:

Query 1: "What's Cursor's adoption trend over the last quarter?"

  • Claude calls get_tool_history tool
  • MCP server calls REST API
  • Returns formatted trend analysis with growth calculations

Query 2: "How does Cursor compare to GitHub Copilot?"

  • Claude calls compare_tools
  • Gets metrics for both
  • Synthesizes side-by-side comparison with key insights

Query 3: "Which AI coding tools are growing fastest right now?"

  • Claude calls get_trending_tools
  • Ranks by growth percentage
  • Presents as ordered list with context

The pattern works because Claude can compose these tools in ways I didn't pre-build. Ask "compare the top 3 trending tools" and Claude chains multiple calls automatically. A report would need that query to be pre-built and a dashboard might require the user to pick the appropriate options from menus.

Being honest about limitations

MCP isn't magic, and I told the audience that. Current challenges:

Discovery: Users don't automatically know what tools are available. They have to ask or explore. The ecosystem needs better tool discovery UIs.

Distribution: When you add new tools, users need to update locally. Cloud-hosted MCP servers would solve this with instant updates.

Anticipation: You still need to build specific tools for specific questions. MCP doesn't eliminate the need to think about what users need.

But even with these limitations, it's better:

  • Natural language beats clicking through filters.
  • Claude can compose tools dynamically.
  • Graceful degradation ("I don't have Reddit data") beats silent missing features or cryptic error codes.
  • Standardized protocol beats reinventing the wheel.

Understanding both strengths and friction points, not just evangelizing uncritically, helps drive real adoption.

Real-world use cases

This pattern works for any product with data exposed by APIs:

B2B SaaS:

  • API → Analytics platforms, customer dashboards
  • MCP → "How's our MRR trending?" "Which customers churned?"

E-commerce:

  • API → Inventory systems, order management
  • MCP → "What products are low stock?" "Show me returns this week"

Internal Tools:

  • API → Automated reports, integrations
  • MCP → "Find pending invoices" "Compare Q3 vs Q4 sales"

The pattern is universal: build API first, wrap with MCP, serve both technical and non-technical users.

The code: Open source educational implementation

I've open-sourced an educational MCP server that demonstrates this pattern: github.com/grzetich/ai-developer-tools-mcp

It uses sample data to show the architecture without requiring database access. The structure is identical to production:

src/
├── tools/          # MCP tool definitions
├── api/            # API client (THE BRIDGE)
├── data/           # Mock data (simulates database)
└── utils/          # Response formatters

In production, only api/client.js changes - from mock data to real fetch() calls. Everything else stays the same.

What's next

I'm continuing to refine this pattern at Vibe Data, where the MCP server serves production traffic alongside our REST API. The dual-interface approach lets us serve both developers building integrations and investors asking questions.

I'm also exploring how to solve the distribution and discovery challenges, potentially through cloud-hosted MCP servers that auto-update when new tools are deployed like my Pokémon MCP server.

If you're building with MCP or thinking about conversational interfaces for your data, I'd love to hear what patterns you're discovering. Reach out on GitHub or email.

Resources

🌈 Looking for help if possible: I’m Stuck on My TrackMyHRT App (Medication + Symptom Tracker)

2026-01-12 08:32:36

Hey devs — I’m working on a small, offline, privacy‑first desktop app called TrackMyHRT, and I’ve hit a point where I could really use some outside perspective.

TrackMyHRT is part of a larger vision, but right now I’m focusing on this one tool: a simple, calm interface for logging HRT doses, symptoms, mood, energy, libido, and notes, with local‑only storage and export options. It’s meant to help people understand their patterns and advocate for themselves without giving their data to a cloud service.

The core features work, but I’m stuck on the next steps.

🔧 Where I’m stuck (TrackMyHRT‑specific)

  1. Data model decisions

I’m unsure whether to:

keep the current JSON structure as‑is

refactor into something more future‑proof

or split entries into smaller components (meds, symptoms, metadata)

I don’t want to over‑engineer, but I also don’t want to paint myself into a corner.

  2. UI flow + structure

The current UI works, but I’m struggling with:

how to make the entry form feel smoother and less “form‑heavy”

whether the viewer should stay simple or become more detailed

how to keep everything accessible without clutter

I want it to feel calm and low‑pressure, not like filling out a medical form.

  3. Export formats + long‑term planning

Right now I support .jsonl, .json, .txt, and .md. I’m unsure whether:

this is too many

I should standardize around one

or I should add a more structured export for future integrations

  4. Migration logic

I have a small migration step for older JSONL → JSON storage. I’m not sure if I should:

keep supporting legacy formats

or simplify and drop old versions entirely

💬 What I’m looking for

Advice on structuring small but extensible data models

Thoughts on designing a calm, accessible UI for daily logging

Opinions on export formats and long‑term maintainability

Examples of similar trackers or patterns I could learn from

General “here’s how I’d approach this” insight

below is the github repo for the HRT Journey Tracker Suite where the TrackMyHRT tool is located
TrackMyHRT👈

I’m not looking for someone to rewrite the app — just some guidance from people who’ve built trackers, logging tools, or small desktop apps before.

TrackMyHRT is meant to support people navigating hormone therapy in a private, offline, non‑judgmental way. No accounts, no cloud sync, no analytics — just a simple tool that helps people understand their own journey.

I want to build this thoughtfully, but right now I’m too stuck in my head to see the next steps.

If you have experience with python, PySide6, data modeling, UX for logging tools, or just strong opinions about app structure, I’d love to hear from you.

Thanks for reading — and thanks in advance for any guidance.

Private & Fast: Building a Browser-Based Dermatology Screener with WebLLM and WebGPU

2026-01-12 08:30:00

In the world of health-tech, privacy is the ultimate feature. Nobody wants to upload sensitive photos of skin lesions to a mysterious cloud server just to get a preliminary health check. But what if we could bring the power of a Vision Transformer (ViT) directly to the user's browser?

Today, we are diving deep into the world of Edge AI and WebGPU acceleration. We’ll build a "Dermatology Initial Screener" that runs entirely client-side. By leveraging WebLLM, TVM Unity, and Transformers.js, we can perform complex lesion analysis with zero data latency and 100% privacy.

If you are interested in local inference, privacy-first AI, and the future of WebGPU-powered applications, you're in the right place!

The Architecture: Privacy by Design

The goal is simple: The user's photo never leaves their device. We use the browser's GPU to do the heavy lifting that used to require a Python backend with a massive NVIDIA card.

graph TD
    A[User Image Input] --> B[HTML5 Canvas / Pre-processing]
    B --> C{WebGPU Support?}
    C -- Yes --> D[Transformers.js / WebLLM Engine]
    C -- No --> E[WASM Fallback/Error]
    D --> F[Local ViT Model / Vision-Language Model]
    F --> G[Classification & Reasoning]
    G --> H[Instant UI Feedback]
    style F fill:#f96,stroke:#333,stroke-width:2px
    style G fill:#bbf,stroke:#333,stroke-width:2px

Tech Stack

  • WebGPU: The next-gen API for high-performance graphics and computation.
  • WebLLM: A high-performance in-browser LLM framework powered by TVM Unity.
  • Transformers.js: To run vision models (like ViT or MobileNet) natively in JS.
  • React/Vite: For a snappy frontend experience.

Step 1: Initializing the WebGPU Environment

Before we can run a model, we need to ensure the user's browser is ready for WebGPU. This is the secret sauce that makes in-browser AI run at near-native speeds.

async function initWebGPU() {
  if (!navigator.gpu) {
    throw new Error("WebGPU is not supported on this browser. Try Chrome Canary!");
  }
  const adapter = await navigator.gpu.requestAdapter();
  if (!adapter) {
    // requestAdapter() can resolve to null even when navigator.gpu exists
    throw new Error("No suitable GPU adapter found.");
  }
  const device = await adapter.requestDevice();
  console.log("🚀 WebGPU is ready to roar!");
  return device;
}

Step 2: Loading the Vision Transformer (ViT)

We’ll use Transformers.js to load a quantized version of a skin lesion classification model. By using a quantized model, we save on bandwidth while maintaining high accuracy.

import { pipeline } from '@xenova/transformers';

async function loadScreenerModel() {
  // Placeholder model ID: in production, swap in a ViT fine-tuned on a skin-lesion dataset such as HAM10000
  const classifier = await pipeline('image-classification', 'Xenova/vit-base-patch16-224', {
    device: 'webgpu', // Magic happens here!
  });
  return classifier;
}

Step 3: Local Reasoning with WebLLM

While a ViT can classify an image, WebLLM (via TVM Unity) allows us to add a "reasoning" layer. We can feed the classification result into a local LLM to explain the findings in plain English—all without a server!

import * as webllm from "@mlc-ai/web-llm";

async function getLocalReasoning(prediction) {
  const engine = new webllm.MLCEngine();
  await engine.reload("Llama-3-8B-Instruct-v0.1-q4f16_1-MLC");

  const prompt = `A skin scan detected a ${prediction.label} with ${prediction.score * 100}% confidence. 
                  Provide a brief, non-diagnostic disclaimer and advice for a dermatologist visit.`;

  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: prompt }]
  });
  return reply.choices[0].message.content;
}

The "Official" Way to Build Edge AI

While building a prototype is fun, scaling local AI to production requires a deeper understanding of memory management and model optimization. For more production-ready examples and advanced patterns regarding Edge AI and private data processing, I highly recommend checking out the WellAlly Official Blog.

They provide excellent deep-dives into how to optimize TVM Unity pipelines for enterprise health applications, ensuring your local models are as lean as possible.

Step 4: Putting it All Together (The UI)

In your React component, you'd handle the image upload and trigger the pipeline.

const analyzeSkin = async (imageElement) => {
  setLoading(true);
  try {
    const classifier = await loadScreenerModel();
    const results = await classifier(imageElement.src);

    // Get the top result
    const topResult = results[0];

    // Get local LLM reasoning
    const advice = await getLocalReasoning(topResult);

    setReport({ analysis: topResult, advice });
  } catch (err) {
    console.error("Inference failed", err);
  } finally {
    setLoading(false);
  }
};

Why This Matters (The "So What?")

  1. Zero Latency: No waiting for a 5MB high-res photo to upload to a server in Virginia.
  2. Privacy: Medical data is sensitive. Processing it on-device is the gold standard for HIPAA-compliant-ish user experiences.
  3. Offline Capability: This tool could work in remote areas with zero internet after the initial model download.

Conclusion

The browser is no longer just a document viewer; it's a powerful execution environment for Edge AI. By combining WebGPU, WebLLM, and Transformers.js, we can create life-changing tools that respect user privacy by default.

What do you think? Is the future of AI purely local, or will we always need the cloud for the "big" stuff? Let’s chat in the comments! 👇

Happy coding! If you enjoyed this "Learning in Public" journey, don't forget to ❤️ and bookmark! For more advanced AI architecture, visit wellally.tech/blog.