
RSS preview of the blog of The Practical Developer

I Built a Container Orchestrator in Rust Because Kubernetes Was Too Much and Coolify Wasn't Enough

2026-04-10 17:21:30

There's a gap in the container orchestration world that nobody talks about.

Docker Compose works for 1 server. Coolify and Dokploy give you a nice GUI but still cap at one node. Kubernetes handles 10,000 nodes but requires a team of platform engineers just to keep the lights on.

What if you have 2 to 20 servers, 20 to 100 services, and a team of 1 to 5 engineers who'd rather ship features than debug etcd quorum failures?

That's exactly where I was. So I built Orca.

Docker Compose ──> Coolify/Dokploy ──> Orca ──> Kubernetes
   (1 node)         (1 node, GUI)      (2-20)     (20-10k)

TL;DR

Orca is a single-binary container + WebAssembly orchestrator written in Rust. One 47MB executable replaces your control plane, container agent, CLI, reverse proxy with auto-TLS, and terminal dashboard. Deploy with TOML configs that fit on one screen — no YAML empires, no Helm charts, no CRDs.

GitHub: github.com/mighty840/orca
Install: cargo install mallorca

The Problem

I was running ~60 services across 3 servers for multiple projects — compliance platforms, trading bots, YouTube automation pipelines, chat servers, AI gateways. Coolify worked great at first, but then I needed:

  • Services on multiple nodes with DNS-based routing per node
  • Auto-TLS without manually configuring Caddy/Traefik per domain
  • Git push deploys that actually work across nodes
  • Rolling updates that don't take down the whole stack
  • Config as code, not clicking through a GUI

Kubernetes was the obvious answer, but for 3 nodes and a solo developer? That's like buying a Boeing 747 to commute to work.

What Orca Actually Does

Single Binary, Everything Included

cargo install mallorca
orca install-service          # systemd unit with auto port binding
sudo systemctl start orca

That one binary runs:

  • Control plane with Raft consensus (openraft + redb — no etcd)
  • Container runtime via Docker/bollard
  • WebAssembly runtime via wasmtime (5ms cold start, ~2MB per instance)
  • Reverse proxy with Host/path routing, WebSocket proxying, rate limiting
  • ACME client for automatic Let's Encrypt certificates
  • Secrets store with AES-256 encryption at rest
  • Health checker with liveness/readiness probes and auto-restart
  • AI assistant that diagnoses cluster issues in natural language

TOML Config That Humans Can Read

[[service]]
name = "api"
image = "myorg/api:latest"
port = 8080
domain = "api.example.com"
health = "/healthz"

[service.env]
DATABASE_URL = "${secrets.DB_URL}"
REDIS_URL = "redis://cache:6379"

[service.resources]
memory = "512Mi"
cpu = 1.0

[service.liveness]
path = "/healthz"
interval_secs = 30
failure_threshold = 3

Compare that to the equivalent Kubernetes YAML. I'll wait.
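A note on the ${secrets.X} syntax above: placeholders are resolved against the encrypted secrets store at deploy time. Here is a minimal sketch of that kind of interpolation, in illustrative Python (Orca itself is Rust, and its actual resolution logic may differ):

```python
import re

# Hypothetical in-memory secrets store; in Orca these values would be
# AES-256 encrypted at rest.
SECRETS = {"DB_URL": "postgres://user:pass@db:5432/app"}

def resolve(template: str) -> str:
    """Replace ${secrets.NAME} placeholders with their stored values."""
    def sub(match: re.Match) -> str:
        name = match.group(1)
        if name not in SECRETS:
            raise KeyError(f"unknown secret: {name}")
        return SECRETS[name]
    return re.sub(r"\$\{secrets\.([A-Za-z0-9_]+)\}", sub, template)

print(resolve("${secrets.DB_URL}"))
```

The point of keeping the placeholder syntax in the config file is that the file itself never contains the secret value, so it can safely live in git.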

Multi-Node in One Command

# On worker nodes:
orca install-service --leader 10.0.0.1:6880
sudo systemctl start orca-agent

The agent connects to the master via bidirectional WebSocket — no HTTP polling, no gRPC complexity. Deploy commands arrive instantly. When an agent reconnects after a network blip, the master sends the full desired state and the agent self-heals.

[service.placement]
node = "gpu-box"         # Pin to a specific node

GitOps Without a CI Runner

Orca has a built-in infra webhook. Point your git host at the orca API, and every push triggers git pull + full reconciliation:

# One-time setup:
curl -X POST http://localhost:6880/api/v1/webhooks \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"repo":"myorg/infra","service_name":"infra","branch":"main",
       "secret":"...","infra":true}'

Push a config change → orca pulls → deploys only what changed. No Jenkins, no GitHub Actions runner, no ArgoCD.

For image-only updates (CI pushes new :latest), register a per-service webhook and orca force-pulls + restarts.

How It Compares

| Feature | Coolify | Dokploy | Orca | K8s |
|---|---|---|---|---|
| Multi-node | No | No | Yes (Raft) | Yes (etcd) |
| Config format | GUI | GUI | TOML | YAML |
| Auto-TLS | Yes | Yes | Yes (ACME) | cert-manager |
| Secrets | GUI | GUI | AES-256, ${secrets.X} | etcd + RBAC |
| Rolling updates | Basic | Basic | Yes + canary | Yes |
| Health checks | Basic | Basic | Liveness + readiness | Yes |
| WebSocket proxy | Partial | Partial | Full | Ingress-dependent |
| Wasm support | No | No | Yes (wasmtime) | Krustlet (dead) |
| AI ops | No | No | Yes | No |
| GitOps webhook | Yes | Yes | Yes + infra webhook | ArgoCD/Flux |
| Self-update | No | Docker pull | orca update | Cluster upgrade |
| Lines of config per service | ~0 (GUI) | ~0 (GUI) | ~10 TOML | ~50-100 YAML |
| External dependencies | Docker, DB | Docker | Docker only | etcd, CoreDNS, ... |
| Binary size | Docker image | Docker image | 47MB | N/A |

The Smart Reconciler

One thing that drove me crazy with other orchestrators: redeploy a stack and everything restarts, even services that haven't changed.

Orca's reconciler compares the unresolved config templates (with ${secrets.X} intact), not the resolved values. If your OAuth token refreshed but your config didn't change, the container stays running. Only actual config changes trigger a rolling update.

orca deploy              # Reconcile all — skips unchanged services
orca deploy api          # Reconcile just one service
orca redeploy api        # Force pull image + restart (for :latest updates)
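The "compare unresolved templates" idea can be sketched in a few lines. This is illustrative Python, not Orca's actual Rust code, and the fingerprinting details are assumptions:

```python
import hashlib

def config_fingerprint(raw_template: str) -> str:
    # Hash the template with ${secrets.X} placeholders intact, so a
    # rotated secret *value* never changes the fingerprint.
    return hashlib.sha256(raw_template.encode()).hexdigest()

# Fingerprints of what is currently deployed, keyed by service name.
deployed = {
    "api": config_fingerprint('image = "myorg/api:v1"\nenv = "${secrets.DB_URL}"'),
}

def needs_redeploy(name: str, new_template: str) -> bool:
    return deployed.get(name) != config_fingerprint(new_template)

# Same template, even if the secret behind it rotated: no restart.
assert not needs_redeploy("api", 'image = "myorg/api:v1"\nenv = "${secrets.DB_URL}"')
# Image tag changed: trigger a rolling update.
assert needs_redeploy("api", 'image = "myorg/api:v2"\nenv = "${secrets.DB_URL}"')
```

Hashing the raw template rather than the resolved config is what lets a token refresh happen without touching running containers.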

What's Coming in v0.3

The roadmap is driven by what we actually need in production:

  • Remote log streaming — orca logs <service> for containers on any node, piped via WebSocket
  • Preview environments — orca env create pr-123 spins up an ephemeral copy of a project
  • Per-project secrets — ${secrets.X} resolves project scope first, then global
  • TUI webhook manager — add/edit/delete webhooks from the terminal dashboard
  • TUI backup dashboard — per-node backup status, manual trigger, restore
  • ARM64 builds — native binaries for Raspberry Pi / Graviton
  • Log forwarding — ship container logs to Loki, SigNoz, or any OpenTelemetry collector
  • Nixpacks integration — auto-detect and build without Dockerfiles

Architecture for the Curious

┌─────────────────────────────────────┐
│         CLI / TUI / API             │
└──────────────┬──────────────────────┘
               │
┌──────────────▼──────────────────────┐
│         Control Plane               │
│  Raft consensus (openraft + redb)   │
│  Scheduler (bin-packing + GPU)      │
│  API server (axum)                  │
│  Health checker + AI monitor        │
└──────────────┬──────────────────────┘
               │ WebSocket
    ┌──────────┼──────────┐
    ▼          ▼          ▼
┌────────┐ ┌────────┐ ┌────────┐
│ Node 1 │ │ Node 2 │ │ Node 3 │
│ Docker │ │ Docker │ │ Docker │
│ Wasm   │ │ Wasm   │ │ Wasm   │
│ Proxy  │ │ Proxy  │ │ Proxy  │
└────────┘ └────────┘ └────────┘

8 Rust crates, ~15k lines, 120+ tests. Every source file under 250 lines. The dependency flow is strict: core <- agent <- control <- cli. No circular deps, no god modules.

Want to Contribute?

Orca is open source (AGPL-3.0) and actively looking for contributors. The codebase is designed to be approachable:

  • Small files — 250 line max, split into clear submodules
  • Comprehensive tests — 120+ unit and integration tests
  • Architecture guide — CLAUDE.md documents every crate, convention, and design decision
  • Real issues — every open issue comes from production usage, not hypotheticals

Good first issues:

  1. ARM64 CI build — add a GitHub Actions matrix for aarch64
  2. TUI log viewer — stream container logs in a ratatui pane
  3. Backup --exclude — skip specific volumes from nightly backup
  4. Service templates — WordPress, Supabase, n8n one-click configs
git clone https://github.com/mighty840/orca.git
cd orca
cargo test        # 120+ tests
cargo build       # single binary

Links

Orca is built by developers running real production workloads on it — trading bots, compliance platforms, YouTube automation, AI gateways. Every feature exists because we needed it, every bug fix comes from a real 3 AM incident. If you're stuck between Coolify and Kubernetes, give it a shot.

Star the repo if this resonates. Open an issue if something's broken. PRs welcome.

How to Decide Which LMS Platforms Your Edtech Tool Should Support First in 2026

2026-04-10 17:17:51

Most edtech teams make the same mistake early: they try to support every LMS at once. Canvas, Moodle, Blackboard, Brightspace, Schoology — the list grows until the integration work buries the product work. Six months later, nothing works well anywhere.

You don't need to support every LMS. You need to support the right two, in the right order.

Start with your customer, not the LMS market share chart

The most common mistake is picking LMS platforms based on overall market share instead of where your actual customers are.

Canvas has the largest share in US higher education. Moodle dominates internationally and in K-12 outside the US. Brightspace is strong in Canadian K-12 and corporate learning. Blackboard is common in large US universities.

None of that matters if your first 10 customers are all on Moodle.

Before you write a line of integration code, answer two questions: who are the first schools or institutions you're selling to, and what LMS are they on? One conversation with a potential customer answers this faster than any market research report.

The decision framework

If you're targeting US higher education — start with Canvas. It has the largest share in that segment and its LTI implementation is clean and well-documented. Add Blackboard second if your target customers are large universities.

If you're targeting international schools or K-12 outside the US — start with Moodle. It's open source, widely deployed globally, and has an active developer community. The setup is more involved than Canvas but the audience is large.

If you're targeting corporate learning or Canadian institutions — start with Brightspace. It's the platform of choice in those segments and D2L has strong LTI 1.3 support.

If you're targeting US K-12 — this is the hardest segment. Canvas and Schoology split most of the market, but Google Classroom has roughly 24% share and doesn't support LTI at all. You'll need both an LTI integration and a Google Classroom API integration to cover this segment properly.

The platforms you can defer

Blackboard and Schoology are safe to defer unless a specific customer requires them. Both support LTI 1.3, so adding them later is straightforward once your core integration is solid.

Open edX is worth considering if you're targeting MOOCs or large-scale online courses — but it's a niche segment and the integration is more involved than standard LMS platforms.

One rule that saves months

Build your LTI integration cleanly against the spec, not against one LMS's quirks. Every LMS implements LTI slightly differently — Canvas handles deep linking differently from Moodle, and Blackboard has specific behaviors around grade passback. If you build for one LMS's implementation and assume it's universal, you'll rebuild every time you add a new platform.

Test against the 1EdTech reference implementation first. Then test against each LMS separately. Treat each platform's quirks as configuration, not code changes.
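One way to keep "quirks as configuration" concrete is a per-platform capability table that a single shared LTI code path consults. This is a hypothetical sketch; the flag names and values are illustrative, not taken from any platform's documentation:

```python
# Per-platform quirk configuration: one shared LTI 1.3 code path,
# with behavior differences expressed as data rather than branches
# scattered through the code. Flag names here are hypothetical.
PLATFORM_QUIRKS = {
    "canvas":     {"deep_linking_multi_item": True,  "grade_passback_delay_s": 0},
    "moodle":     {"deep_linking_multi_item": False, "grade_passback_delay_s": 0},
    "blackboard": {"deep_linking_multi_item": True,  "grade_passback_delay_s": 30},
}

# Plain spec-compliant behavior, used for any platform not listed above.
SPEC_DEFAULTS = {"deep_linking_multi_item": True, "grade_passback_delay_s": 0}

def quirks_for(platform: str) -> dict:
    """Return the quirk table for a platform, falling back to the spec."""
    return {**SPEC_DEFAULTS, **PLATFORM_QUIRKS.get(platform, {})}

assert quirks_for("moodle")["deep_linking_multi_item"] is False
assert quirks_for("unknown-lms") == SPEC_DEFAULTS
```

Adding a fourth platform then means adding one dictionary entry and a test run, not a code rewrite.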

FAQ

How long does adding a new LMS integration take?
If your LTI integration is built cleanly against the spec, adding a new LMS is mostly configuration and testing — typically 2 to 4 weeks. If your integration has platform-specific assumptions baked in, expect 6 to 10 weeks of rework per platform.

Should we support LTI 1.1 for older LMS deployments?
No. Build on LTI 1.3 only. LTI 1.1 is deprecated and major platforms are dropping support for new integrations. If a customer is on an older LMS version that only supports 1.1, that's a conversation about their upgrade timeline, not yours.

What if our first customer is on a platform we haven't supported yet?
Build it. One real customer is worth more than theoretical market share. Just build it cleanly so the second and third platforms don't require starting over.

Why Your App's Photos Look Weird: A Developer's Guide to Moiré Patterns

2026-04-10 17:14:42

You've probably seen it before — strange rainbow-colored waves rippling across a photo of a computer screen, or weird grid-like artifacts in a scanned document. That's called a moiré pattern, and if you're building any application that handles images, it's something you'll inevitably run into.

What Exactly Is a Moiré Pattern?

Moiré patterns occur when two repetitive patterns overlap at slightly different angles or scales. Think of it like this:

Pattern A:  | | | | | | | | | |
Pattern B:   | | | | | | | | | |
Result:     |||  |  |||  |  |||    ← interference pattern

In the physical world, this happens constantly:

  • Screen photography: Your phone camera's pixel grid interferes with the monitor's pixel grid → rainbow waves
  • Scanning printed material: The scanner's sampling grid clashes with the halftone dot pattern → wavy artifacts
  • Fabric photography: The camera sensor grid interacts with the weave pattern → visual noise
  • Video recording: Shooting someone wearing a striped shirt on camera → shimmering patterns

Why Developers Should Care

If you're building any of these, moiré will bite you:

1. Image Upload Platforms

Users upload photos of screens, scanned documents, and product images all the time. Moiré degrades image quality and makes OCR unreliable.

// Your OCR pipeline might fail on moiré-affected scans
const result = await tesseract.recognize(scannedImage);
// result.confidence: 45% 😬 — moiré confused the character recognition

2. E-commerce Product Photos

Photographing textured fabrics, mesh materials, or screens? Moiré makes products look defective. This directly impacts conversion rates.

3. Screen Capture & Recording Tools

Building a screen recording app? If users capture one screen with another device, moiré is guaranteed. Even screenshot tools can produce moiré when downscaling.

4. Document Scanning Apps

Any app that digitizes printed materials needs to handle the halftone-to-pixel conversion problem. Without descreening, your scanned PDFs look amateur.

The Math Behind Moiré

For the curious, moiré is an aliasing artifact — a fundamental concept in signal processing.

When you sample a signal (an image) at a rate lower than twice its highest frequency, you get aliasing. This is the Nyquist-Shannon sampling theorem in action:

f_moiré = |f₁ - f₂|

Where:
  f₁ = frequency of pattern 1 (e.g., screen pixel pitch)
  f₂ = frequency of pattern 2 (e.g., camera sensor pitch)

When f₁ and f₂ are close but not identical, you get a low-frequency interference pattern — that's your moiré.

This is the same principle behind:

  • Audio aliasing in digital music
  • The "wagon wheel effect" in video
  • Temporal aliasing in animation frame rates
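The |f₁ - f₂| relationship is easy to verify numerically: sum two sine gratings with nearby frequencies and the spectrum shows the two originals, whose superposition behaves like a carrier modulated by a slow envelope at the difference frequency. A quick NumPy check:

```python
import numpy as np

f1, f2 = 100.0, 96.0          # two nearby spatial frequencies (cycles/unit)
x = np.linspace(0, 1, 4096, endpoint=False)
combined = np.sin(2 * np.pi * f1 * x) + np.sin(2 * np.pi * f2 * x)

# The two dominant spectral peaks are the original frequencies; their
# sum is equivalent to a carrier at (f1 + f2)/2 with a beat envelope
# at |f1 - f2| = 4 — the low-frequency pattern you perceive as moiré.
spectrum = np.abs(np.fft.rfft(combined))
freqs = np.fft.rfftfreq(x.size, d=x[1] - x[0])
peaks = freqs[np.argsort(spectrum)[-2:]]
print(sorted(float(p) for p in peaks))  # [96.0, 100.0]
```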

How to Fix Moiré: The Technical Approaches

Approach 1: Gaussian Blur (The Brute Force Way)

import cv2
import numpy as np

def remove_moire_blur(image, kernel_size=5):
    """
    Simple but destructive — removes moiré by 
    low-pass filtering, but also kills detail.
    """
    return cv2.GaussianBlur(image, (kernel_size, kernel_size), 0)

Pros: Simple, fast

Cons: Destroys image detail. It's like fixing a headache with a sledgehammer.

Approach 2: Frequency Domain Filtering

def remove_moire_frequency(image):
    """
    Smarter approach: find moiré peaks in the frequency
    domain and notch them out. Expects a 2D (grayscale) array.
    """
    # Convert to frequency domain
    f_transform = np.fft.fft2(image)
    f_shift = np.fft.fftshift(f_transform)

    # Find and suppress anomalous frequency peaks
    # (frequencies identified by spectral analysis)
    magnitude = np.abs(f_shift)
    threshold = np.mean(magnitude) + 3 * np.std(magnitude)
    mask = magnitude < threshold

    # Never notch the low frequencies around the center (DC component):
    # they carry overall brightness and coarse structure, not moiré.
    r, c = image.shape[0] // 2, image.shape[1] // 2
    mask[r - 5:r + 6, c - 5:c + 6] = True

    # Apply filter and reconstruct
    filtered = f_shift * mask
    return np.abs(np.fft.ifft2(np.fft.ifftshift(filtered)))

Pros: Preserves more detail

Cons: Requires manual tuning per image, doesn't generalize well.

Approach 3: AI/Deep Learning (The Modern Way)

Modern neural networks can learn to separate moiré patterns from actual image content. This is where the field has moved — models trained on paired moiré/clean image datasets can selectively remove the interference while preserving detail.

The key architectures used:

  • U-Net variants — encoder-decoder with skip connections
  • Multi-scale approaches — process at different resolutions to catch moiré at various frequencies
  • GAN-based methods — adversarial training for more realistic restoration

For most developers, implementing this from scratch isn't practical. Tools like Moire Removal use AI models specifically trained for this, so you can integrate moiré removal into your workflow without building the ML pipeline yourself.

Practical Tips for Your Application

If you're dealing with moiré in your product, here's a decision tree:

Is moiré in your input images?
├── Yes, from screen photos
│   └── Consider: slightly defocus, angle the camera, 
│       or use AI post-processing
├── Yes, from scanned documents  
│   └── Use descreening (most scanner software has this)
│       or try specialized tools like descreening APIs
├── Yes, from fabric/product photos
│   └── Adjust camera distance/angle at capture time
│       or use AI removal in post-processing
└── Yes, from downscaling in your app
    └── Use proper anti-aliasing:
        CSS: image-rendering: auto; (not crisp-edges)
        Canvas: ctx.imageSmoothingEnabled = true;

Quick Win: Prevent Moiré in Canvas Downscaling

function downscaleWithAntiAlias(canvas, targetWidth, targetHeight) {
  // Step-down approach prevents moiré from aggressive downscaling
  const steps = Math.ceil(Math.log2(canvas.width / targetWidth));

  let currentCanvas = canvas;
  for (let i = 0; i < steps; i++) {
    const stepCanvas = document.createElement('canvas');
    stepCanvas.width = currentCanvas.width / 2;
    stepCanvas.height = currentCanvas.height / 2;

    const ctx = stepCanvas.getContext('2d');
    ctx.imageSmoothingEnabled = true;
    ctx.imageSmoothingQuality = 'high';
    ctx.drawImage(currentCanvas, 0, 0, stepCanvas.width, stepCanvas.height);

    currentCanvas = stepCanvas;
  }

  // Final resize to exact target
  const finalCanvas = document.createElement('canvas');
  finalCanvas.width = targetWidth;
  finalCanvas.height = targetHeight;
  const ctx = finalCanvas.getContext('2d');
  ctx.imageSmoothingEnabled = true;
  ctx.drawImage(currentCanvas, 0, 0, targetWidth, targetHeight);

  return finalCanvas;
}

Key Takeaways

  1. Moiré is physics, not a bug — it's aliasing from overlapping patterns
  2. Prevention > Cure — adjust capture conditions when possible
  3. AI removal is now practical — you don't need to implement FFT notch filters from scratch
  4. Think about it in your image pipeline — especially if you handle user-uploaded photos, scans, or screen captures

Have you dealt with moiré in your projects? I'd love to hear your approach in the comments.

Infrastructure Design for Credit Risk Modeling

2026-04-10 17:11:46

Common Problems Leading Institutions Face when Building Scorecards

One version of the truth. Two people asking the same question, or repeating the same exercise, should get the same answer. One way to achieve this is to share and reuse data sources, data extraction logic, conditions such as filters and segmentation logic, models, parameters, and variables, including the logic for derived ones.

Transparency and audit. Anyone who needs to see details on each phase of the development process should be able to do so easily. For example, how data is transformed to create aggregated and derived variables, the parameters chosen for model fitting, how variables entered the model, validation details, and other parameters should preferably be stored in graphical user interface (GUI) format for review.

Retention of corporate intellectual property (IP)/knowledge. Practices such as writing unique code for each project and keeping it on individual PCs make it harder to retain IP when key staff leave: departing staff take both their code and the knowledge embedded in it. To counter this, many financial institutions have shifted to GUI software to reduce the loss and to introduce standardization.

Integration across the model development tasks. Integration across the continuum of activities, from data set creation to validation, means the output of each phase is used seamlessly in the next.

Faster time to results. In many institutions it takes months to build a model and implement it, resulting in the use of inefficient or unstable models for longer than necessary.

WHAT IS THIS RUBBER DUCK FOR?

2026-04-10 17:08:09

This is a submission for the DEV April Fools Challenge

What I Built

The starting point was a brief moment in Harry Potter and the Chamber of Secrets, when Arthur Weasley asks Harry: "what exactly is the function of a rubber duck?"

[image: Arthur Weasley]

I took that essential question as the basis for the project and turned it into a ministerial interface for the formal examination of a yellow plastic duck.
So I built What Is This Duck For?, a web app that treats a rubber duck as a matter of public administration, emotional uncertainty, procedural concern, and possible domestic surveillance.

The app lets visitors submit the duck to five official interpretive modes:

  • Analyze the Duck — a mock-expert technical reading
  • Request Deeper Interpretation — overthinking, philosophical drift
  • Escalate to Ministry — paperwork, forms, bureaucratic misery
  • Trust the Duck — calm reassurance and quiet faith
  • Do Not Trust the Duck — bathroom paranoia and suspicious side-eye

Each mode generates a formal “ministerial report” with a hypothesis, classification, threat level, confidence score, and conclusion. The result is mildly concerning, far more official than necessary, and hopefully funny.

[image: home page]

Demo

You can try the live demo here:

Live Demo

The interface is designed to feel strangely official for something that should never have required official attention in the first place.

A typical session looks like this:

  • choose a mode
  • trigger the inquiry
  • watch the report compile itself
  • leave with fewer answers than you started with

Code

Source code is available here:

GitHub Repository

The project includes:

  • a FastAPI backend
  • Gemini 2.5 Flash
  • a custom frontend with a ministry-style interface
  • prompt-based persona logic
  • an About page explaining the investigation as if it were a real institutional service
  • custom duck persona assets for each official mode

How I Built It

I built the app with FastAPI, server-rendered templates, custom CSS, and JavaScript for the interactive report flow.

The biggest part of the work was not "building a useful product" but overbuilding a completely useless one with as much conviction as possible.

A few highlights:

  • I designed the homepage as a fake bureaucratic dashboard
  • I created five distinct duck-analysis personas with different tones and behaviors
  • I refined the generated writing so each mode felt readable, funny, and specific
  • I added custom duck visuals for each persona on the About page
  • I deployed the demo on Hugging Face Spaces

For the AI side, I worked with Gemini CLI from the beginning, using it throughout the project to build, refine, and steer the experience. On the generation side, I used Google Gemini to produce structured, persona-driven reports that stayed coherent, readable, and funny.

One of the most interesting challenges was building the five analysis personas so they produced outputs that felt distinct, coherent, and consistently funny without becoming repetitive. That required a lot of prompt iteration — especially around the relationship between hypothesis and conclusion — and a gradual shift away from pure absurdity toward a more structured comic logic, all with Gemini 2.5 Flash.

Agents Sessions

1 - This initial prompt defined the joke, the scope, and the rules of the project from the start.

what_is_the_duck_for — initial prompt
You

You are my coding assistant. I am building a deliberately silly challenge project called "What Is This Duck For?".

Context:
- This project is for a challenge whose spirit is: build something intentionally useless, funny, light, absurd, and simple.
- The concept: a very serious interface uses Google Gemini to analyze a yellow rubber duck, but never reaches a useful conclusion.
- This is NOT a serious product, NOT an MVP, NOT a scalable app.
- It must stay small, clean, readable, and fun.
- No overengineering.

Your role for this step:
Bootstrap the project environment from scratch inside the current workspace.

Current state:
- Workspace already exists.
- A Python virtual environment .venv already exists.
- Nothing else is installed yet.

Technical direction:
- Python backend with FastAPI
- Simple frontend with HTML/CSS/vanilla JS
- Gemini API integration later
- No database
- No authentication
- No React
- No Docker for now
- No testing setup yet unless strictly necessary
- Keep the architecture minimal

What I want you to do now:
1. Inspect the current workspace first.
2. Activate/use the existing .venv.
3. Install only the minimal dependencies needed for the initial project skeleton:
- fastapi
- uvicorn
- jinja2
- python-dotenv
- google-genai
4. Create a clean minimal project structure for this app.
5. Add a .gitignore.
6. Add a .env.example.
7. Add a minimal README.md that presents the project in the spirit of the challenge.
8. Add the minimal FastAPI app skeleton and the basic template/static folders, but do NOT build the full app yet.
9. Make sure the structure is simple and easy to continue step by step.

Recommended structure:
- app/
- main.py
- gemini_service.py
- prompts.py
- templates/
- index.html
- static/
- css/
- style.css
- js/
- app.js
- img/
- .gitignore
- .env.example
- README.md
- requirements.txt

Important constraints:
- Do not invent extra features.
- Do not add database, auth, admin panel, API complexity, or unnecessary abstractions.
- Do not implement the real Gemini logic yet beyond placeholder structure.
- Do not generate a complex UI yet.
- Keep names explicit and readable.
- Keep the codebase lightweight and hackathon-friendly.
- Prefer clarity over cleverness.

After you make the changes:
1. Show me the final file tree.
2. Explain briefly what you installed and created.
3. Tell me exactly what command I should run next to start the app locally.
4. Mention any point that still needs my manual input, especially the Gemini API key.

If something is ambiguous, make a reasonable minimal choice and state it clearly.

24 messages not shown
Agent

Providing the Final Response

The project "What Is This Duck For?" is now bootstrapped with a minimal FastAPI skeleton and a clean directory structure.

Final File Tree

.
├── app/
│   ├── static/
│   │   ├── css/
│   │   │   └── style.css
│   │   ├── img/
│   │   └── js/
│   │       └── app.js
│   ├── templates/
│   │   └── index.html
│   ├── gemini_service.py
│   ├── main.py
│   └── prompts.py
├── .env.example
├── .gitignore
├── README.md
└── requirements.txt

What was installed and created:

  1. Dependencies: fastapi, uvicorn, jinja2, python-dotenv, and google-genai were installed into your existing .venv. A requirements.txt was generated.
  2. Backend Skeleton: Created app/main.py with basic routing and static file serving.
  3. Gemini Service: Added app/gemini_service.py and app/prompts.py as placeholders for the AI logic.
  4. Frontend: Built a minimalist "Advanced Rubber Duck Analytics" interface using vanilla HTML, CSS, and JS.
  5. Project Docs: Added a README.md in the spirit of the challenge and a .env.example for configuration.

How to start the app locally:

  1. Prepare environment: cp .env.example .env
  2. Run the server: .venv/bin/uvicorn app.main:app --reload
  3. View the app: open your browser at http://localhost:8000.

Manual Input Needed:

  • Gemini API Key: You need to edit the .env file and provide a valid GEMINI_API_KEY.
  • Duck Image: I've created the app/static/img/ folder. You might want to drop a picture of a rubber duck there later to replace the CSS placeholder.
36 messages not shown
You

You are my coding assistant.

We are continuing the project:
WHAT IS THIS DUCK FOR?

Important:
Do not implement the Gemini engine yet.
Do not redesign the UI.
Do not add backend complexity.
This step is about documentation, planning, and locking the behavior of the 5 buttons.

PROJECT STATE

The interface now includes the 5 final action buttons:
1. ANALYZE THE DUCK
2. REQUEST DEEPER INTERPRETATION
3. ESCALATE TO MINISTRY
4. TRUST THE DUCK
5. DO NOT TRUST THE DUCK

We want to keep the project simple and organized.
So before implementing the engine, we want:
- a small docs folder
- a clear planning document
- a task checklist we can follow and update
- a mini-spec for the 5 button modes

YOUR TASK

  1. Create a docs/ folder at the project root.

  2. Inside docs/, create these files:

  • project_plan.md
  • task_checklist.md
  • button_modes.md
  3. Fill them with simple, practical, readable content.

CONTENT REQUIREMENTS

A) docs/project_plan.md
Create a short project plan that reflects the real order of work:

  • current state
  • what is already done
  • what comes next
  • what should come later

The order should be:

  1. Environment bootstrap
  2. Basic app skeleton
  3. UI base
  4. Final button set in the interface
  5. Lock the behavior of each button
  6. Implement Gemini interaction
  7. Polish the UI and visual identity
  8. Final cleanup / submission preparation

Keep it concise.
Do not write corporate/project-management jargon.
Write it like a practical internal roadmap.

B) docs/task_checklist.md
Create a checklist with markdown checkboxes.

Include:
- completed tasks
- current tasks
- upcoming tasks

Use clear checkboxes like:
- [x] ...
- [ ] ...

Include at least:
- project bootstrapped
- FastAPI skeleton created
- single-page UI created
- 5 final buttons added
- button behavior spec pending
- Gemini backend pending
- report panel dynamic update pending
- rubber duck visual improvement pending
- aesthetic polish pending
- README/submission refinement pending

C) docs/button_modes.md
This is the most important file for this step.

Write a mini-spec for the 5 buttons.

For each button, define clearly:
- button label
- internal mode name suggestion
- role
- tone
- type of absurdity / angle of error
- expected response style
- what kind of conclusion it should tend to produce

Use these 5 buttons exactly:

  1. Analyze the Duck — base pseudo-scientific analysis

  2. Request Deeper Interpretation — over-interpretation / intellectual overreach

  3. Escalate to Ministry — absurd bureaucracy / official classification

  4. Trust the Duck — confident mystic reading / symbolic authority

  5. Do Not Trust the Duck — suspicious administrative paranoia

IMPORTANT RULES

  • Keep all docs short and useful.
  • Do not overengineer.
  • Do not add product language.
  • Do not implement code logic yet unless absolutely necessary.
  • Do not change the backend or Gemini integration in this step.
  • If you need to make a very small frontend text adjustment for consistency, keep it minimal and list it explicitly.

OUTPUT REQUIRED

After making the changes, show me:

  1. the final file tree for the new docs/ folder
  2. the full contents of:
  • docs/project_plan.md
  • docs/task_checklist.md
  • docs/button_modes.md
  3. a short explanation of how these docs will help the next implementation step
  4. if you changed any file outside docs/, list it explicitly
Agent

Structuring Project Documentation / Outlining the Button Spec / Formulating Button Mode Details / Executing the Plan's Steps

I will create the docs/ directory and populate it with the project plan, task checklist, and button mode specifications.

7 messages not shown
Agent

I have created the requested documentation and planning folder.

Final File Tree for docs/:

docs/
├── button_modes.md
├── project_plan.md
└── task_checklist.md

Contents Summary:

  • project_plan.md: Provides a concise, practical roadmap from project bootstrap to final cleanup, focusing on building a "serious, useless" application.
  • task_checklist.md: Lists all completed, current, and upcoming tasks with markdown checkboxes, covering infrastructure, interface, and AI integration phases.
  • button_modes.md: A detailed specification for the five analysis buttons, defining their internal mode names, tones, absurdity angles, and expected response styles.

How these docs will help:

  • Consistency: button_modes.md ensures that as we implement the AI prompts, each mode remains distinct and follows its intended "absurdity angle" (e.g., dry science vs. mystic prophecy).
  • Focus: project_plan.md keeps the project scope narrow, preventing feature creep and maintaining the simple, single-page concept.
  • Organization: task_checklist.md allows us to track our progress accurately and ensures we don't skip the crucial "polish" steps at the end.

No files were changed outside of the docs/ folder in this step. We are now ready to implement the backend logic to handle these specific modes.

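In code, the button spec above boils down to a tiny label-to-mode mapping. This is a hypothetical sketch using the spec's suggested mode names (only "ministry" is confirmed later in the transcript; the other names are illustrative, not the project's actual code):

```python
# Hypothetical mapping from the five UI button labels to internal mode names,
# following the spec drafted in docs/button_modes.md (names are illustrative).
BUTTON_MODES = {
    "Analyze the Duck": "analysis",            # base pseudo-scientific analysis
    "Request Deeper Interpretation": "deep",   # over-interpretation / intellectual overreach
    "Escalate to Ministry": "ministry",        # absurd bureaucracy / official classification
    "Trust the Duck": "trust",                 # confident mystic reading / symbolic authority
    "Do Not Trust the Duck": "distrust",       # suspicious administrative paranoia
}

def mode_for(label: str) -> str:
    """Resolve a button label to its internal mode, defaulting to base analysis."""
    return BUTTON_MODES.get(label, "analysis")
```

Keeping the mapping in one dict means the frontend only ever sends a label (or mode string), and the backend stays free to evolve the persona prompts independently.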

2 - Persona Refinement | Making “Ministry” Funnier and More Concrete

You

We are doing a targeted correction pass on the "ministry" persona only.

Do not change frontend.
Do not change backend flow.
Do not change logging.
Do not change JSON schema.
Do not modify the other personas unless strictly necessary.

DIAGNOSIS

The other personas are now much closer to the desired result.
The main remaining failure is "ministry".

Current problem:
"ministry" still tends to generate bureaucratic worldbuilding instead of a clear comic theory of function.

It produces:
- too much administrative texture
- too many clauses, sections, and procedural layers
- too much fake-ministry prose
- not enough immediate sense

The result is coherent in tone, but not funny enough and not clear enough.

CORRECTION GOAL

The ministry persona must no longer feel like:
"a generic bureaucratic monologue"

It must feel like:
"a civil servant calmly explaining the duck’s specific administrative function and the ridiculous procedural consequences attached to it"

NEW RULES FOR MINISTRY

  1. The hypothesis must clearly state what the duck is administratively FOR.
    It must describe a concrete official or semi-official function.

  2. The function must be simple, specific, and readable.

  3. The conclusion must comment on the administrative implications of that function.
    It must not spiral into ministry lore.

  4. Keep the bureaucracy petty, local, and humiliating.
    Not epic, not sprawling.

  5. Prefer:
     - one form
     - one permit
     - one missing signature
     - one wrong classification
     - one absurd office rule
    over:
     - multiple clauses
     - layered subcommittees
     - long directive chains
     - abstract grievance systems

  6. The joke should be understandable even on first read.

  7. The duck must remain central.
    The bureaucracy must exist because of the duck’s function, not the other way around.

GOOD DIRECTION EXAMPLES

  • The duck serves as a floating compliance marker for baths already in progress.
  • The duck functions as an informal complaint receiver for minor bath-related grievances.
  • The duck is required as a witness object before a splash incident can be officially recorded.
  • The duck temporarily certifies that a bather is still emotionally eligible for continued soaking.

BAD DIRECTION EXAMPLES

  • abstract grievance manifestation object
  • sprawling directive chains
  • excessive clause stacking
  • vague ministry lore
  • classification replacing function

TONE TARGET

The humor should come from:
- procedural seriousness
- ridiculous administrative logic
- calm official language applied to a bath toy

It should be funny because it sounds annoyingly plausible,
not because it sounds huge or overbuilt.

TASK

Update only the ministry persona instructions in app/prompts.py.
Update docs only if needed for consistency.
Do not touch the other persona instructions unless required.

After changes, show:
1. the updated ministry prompt section
2. any doc changes if made
3. a short explanation of how you made ministry clearer, more concrete, and less lore-heavy
4. confirmation that JSON schema, frontend, backend flow, and logging were unchanged

Agent

Refining the Persona / Prioritizing Functionality

I will update the "ministry" persona in app/prompts.py to ensure it focuses on a clear, concrete administrative function for the duck, followed by a humorous interpretation of its procedural consequences. I'll then review and adjust docs/tone_charter.md and docs/button_modes.md to ensure they remain consistent with this more targeted, petty bureaucratic tone.

1 message not shown
Agent

I have updated the "ministry" persona in app/prompts.py to be more concrete and petty.

1. Updated Ministry Prompt Section

    "ministry": """
    PERSONA: THE PETTY CIVIL SERVANT
    Function Claim: Propose a simple, concrete administrative role for the duck (e.g., a witness for splash-claims, a floating compliance marker, a receiver for bath-grievances).
    Style: Calm, official, and annoyingly procedural. Focus on one specific hurdle: one missing permit, one wrong signature, or an expired bather-eligibility form.
    Avoid: Sprawling lore, multiple clauses, or epic bureaucracy. Keep it local, petty, and understandable.
    Conclusion: Explain the specific administrative consequences or jurisdictional headache caused by the hypothesis.
    """,

2. Documentation Changes

I have verified docs/tone_charter.md and docs/button_modes.md. They already align with this "petty bureaucrat" direction, so no further changes were required there.

3. Explanation of Changes

  • Concrete Function: The model is now explicitly told to propose a simple administrative role (e.g., "witness for splash-claims") instead of generating abstract "grievance manifestation" lore.
  • Petty Hurdles: I replaced the instruction for "rigid legalistic style" with a focus on specific, relatable administrative failures like "one missing permit" or "one wrong signature."
  • Reduced Lore: I banned "sprawling lore" and "epic bureaucracy" to ensure the joke remains grounded in the bathroom context and is immediately understandable.

4. Confirmations

  • JSON schema remains unchanged.
  • Frontend was not modified.
  • Backend flow remains exactly the same.
  • Logging is still active and unchanged.
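To see how a persona entry like the "ministry" one above actually gets used, here is a minimal sketch of prompt assembly. It assumes a `PERSONAS` dict shaped like `app/prompts.py` and a hypothetical `build_prompt` helper; the real backend may combine these differently:

```python
# Minimal sketch: combine a persona instruction block with the user's
# duck observation before sending it to the model.
# PERSONAS mirrors the shape of app/prompts.py; only "ministry" is shown,
# abbreviated from the prompt text quoted above.
PERSONAS = {
    "ministry": (
        "PERSONA: THE PETTY CIVIL SERVANT\n"
        "Function Claim: Propose a simple, concrete administrative role for the duck.\n"
        "Style: Calm, official, and annoyingly procedural.\n"
        "Avoid: Sprawling lore, multiple clauses, or epic bureaucracy.\n"
        "Conclusion: Explain the specific administrative consequences."
    ),
}

def build_prompt(mode: str, observation: str) -> str:
    """Prepend the persona instructions to the user's observation."""
    persona = PERSONAS.get(mode, "")
    return f"{persona}\n\nOBSERVATION:\n{observation}"

prompt = build_prompt("ministry", "A yellow rubber duck is floating in the bath.")
```

Because the persona text lives entirely in the prompt string, tuning passes like the one above never touch the JSON schema, the frontend, or the request flow, which is exactly what the correction prompt demanded.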

Prize Category

I’m primarily submitting this for Best Google AI Usage.

Gemini was a core part of the project, both in the building process and in the app itself. It helped turn a simple joke into a system of distinct personas, structured reports, and consistently readable comic outputs rather than generic AI text.

And if the Ministry happens to win over the public, I would gladly accept the consequences under Community Favorite.

Thanks for reading, and please proceed with caution.
In matters involving the yellow rubber duck, certainty remains premature.

Project Glasswing and the Mythos Moment: A Critical Examination of AI's Cybersecurity Crossroads

2026-04-10 17:06:08

On April 7, 2026, Anthropic announced Project Glasswing—a defensive cybersecurity initiative built around Claude Mythos Preview, a frontier AI model so capable at finding and exploiting vulnerabilities that Anthropic deems it too dangerous for general public release. Backed by $100 million in usage credits and a "coalition of the willing" including Amazon, Apple, Google, Microsoft, Nvidia, the Linux Foundation, CrowdStrike, Palo Alto Networks, and more, Project Glasswing aims to give defenders a head start before similar capabilities proliferate to adversarial actors.

The announcement arrived during a remarkable week for Anthropic: the company disclosed $30 billion in annualized revenue (tripling in months), sealed a multi-gigawatt compute deal with Google and Broadcom, and faces potential IPO considerations. This timing raises immediate questions about whether Glasswing represents a watershed moment for cybersecurity, a strategic business move, or both.

What follows is a deep investigation drawing on Anthropic's own documentation, independent press analysis, technical community response, and security expert perspectives to evaluate Project Glasswing—the claims, the risks, the business strategy, and what it means for the future of digital security.

The Capabilities: Something Remarkable, or Marketing Hyperbole?

What Anthropic Claims

According to Anthropic's comprehensive technical evaluation, Claude Mythos Preview demonstrates:

  • Autonomous discovery of thousands of zero-day vulnerabilities in every major operating system and web browser
  • Ability to develop working exploits without human intervention—in one case chaining together four distinct vulnerabilities to escape browser sandboxes
  • Spectacular benchmark results: 83.1% on CyberGym versus 66.6% for Claude Opus 4.6, and 93.9% on SWE-bench Verified

Particularly striking are specific examples:

  1. A 27-year-old vulnerability in OpenBSD, a security-focused OS, that allowed a remote attacker to crash the system merely by connecting
  2. A 16-year-old bug in FFmpeg's H.264 codec, surviving five million automated fuzzing attempts
  3. Autonomous local privilege escalation exploits on Linux by chaining multiple vulnerabilities

External Verification

FFmpeg maintainers have confirmed that patches were submitted, noting they "appear to be written by humans." Greg Kroah-Hartman, the Linux stable maintainer, has publicly stated: "Months ago, we were getting 'AI slop'... Something happened a month ago, and the world switched. Now we have real reports." Security teams across major open source projects report the same shift.

Forbes analyst Paulo Carvão notes that the evidence is "difficult to dismiss" given that Mythos can "chain together vulnerabilities that individually appear benign but collectively yield complete system compromise."

The Skeptical Community Response

On Hacker News, responses range from excitement about genuine advancement to bitter skepticism about relentless "doomer" marketing. One security professional noted they've already had success using existing models: "I've had these successes without scaffolding or really anything past Claude CLI and a small prompt as well? So like I'm in a weird place where this was already happening and Mythos is being sold like it wasn't good before?"

Others point out that we've heard dramatic breakthrough claims before. Anthropic's own CEO previously claimed 90% of code would be written by LLMs within 3-6 months—a timeline clearly not met. There's fatigue with each iteration being framed as world-endingly powerful.

Critical Assessment

This appears to be a genuine capability leap, not pure marketing. The technical documentation demonstrates stepwise exploit development that goes well beyond what was previously possible with autonomous AI. The jump from a 4% to an 85% Firefox exploit success rate (per Anthropic's internal comparisons between Opus 4.6 and Mythos) is substantial.

However, the implications are where hype and reality diverge. The capability is real. Whether it necessitates the dramatic response Anthropic has mounted—and whether Anthropic is the appropriate custodian—is less clear.

The Strategy: Controlled Release or Market Creation?

Anthropic's Stated Rationale

Anthropic makes a straightforward argument: Frontier AI cybersecurity capabilities are approaching (or have reached) a level that could fundamentally alter the security landscape. By limiting Mythos Preview to vetted defensive partners, they give defenders time to harden systems before similar capabilities become broadly available to adversaries.

This is framed as responsible AI governance—a model considered "too dangerous to release publicly" being deployed exclusively for defensive purposes.

Business and Competitive Dimensions

Forbes identifies five factors driving the invite-only rollout:

  1. Real capability jump (as discussed)
  2. Responsible AI governance positioning
  3. Strategic marketing through scarcity—a narrative that generates enormous press
  4. Capacity constraints—Anthropic is throttling usage; the model is compute-intensive
  5. Premium pricing—$25/$125 per million input/output tokens (versus $5/$25 for Opus), positioning Mythos as a luxury security product
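The premium-pricing point is easy to make concrete with the per-million-token prices quoted above. A hypothetical back-of-the-envelope comparison (the token counts are invented for illustration):

```python
# Cost comparison using the quoted per-million-token prices:
# Mythos Preview $25 in / $125 out, Opus $5 in / $25 out.
def job_cost(input_tokens: int, output_tokens: int,
             in_price: float, out_price: float) -> float:
    """Dollar cost of one job, given per-million-token prices."""
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

# Hypothetical vulnerability-hunting run: 2M input tokens, 0.5M output tokens.
mythos = job_cost(2_000_000, 500_000, 25, 125)  # 2*25 + 0.5*125 = 112.5
opus = job_cost(2_000_000, 500_000, 5, 25)      # 2*5  + 0.5*25  = 22.5
```

At these list prices, the same workload costs 5x more on Mythos than on Opus, which is what positions it as a luxury security product rather than a general-purpose model.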

VentureBeat adds crucial context: The same day Glasswing launched, Anthropic disclosed $30B in revenue and sealed the Google-Broadcom compute deal. The timing intersects with IPO speculation. A "high-profile, government-adjacent cybersecurity initiative with blue-chip partners is exactly the kind of program that burnishes an IPO narrative."

Who Actually Gains Access?

The coalition structure creates an interesting dynamic. Tech competitors (Google vs. Microsoft) are both included. Smaller organizations and open-source maintainers are granted access via programs like "Claude for Open Source," with $4M in direct donations to open-source security organizations.

But critics note this creates new forms of exclusion. As one Hacker News commenter observed: "The fact that you won't be able to produce secure software without access to one of these models. Good for them $."

Whether the goal is truly defense for all, or defense for those who can afford/partner with Anthropic, is genuinely unclear.

The Risks: Defense, Offense, and the Zero-Day Explosion

The Core Paradox

The fundamental challenge Mythos presents is that the same capabilities used by defenders to find and fix vulnerabilities can be used by attackers to find and exploit them. Anthropic acknowledges this explicitly but argues that "the advantage will belong to the side that can get the most out of these tools."

In the short term, Anthropic warns, attackers who gain access to similar capabilities first could have a decisive advantage. In the long term, they expect defenders to prevail due to their ability to direct more resources and fix bugs before code ships.

The "transitional period" could be tumultuous.

What Happens When Adversaries Get Similar Models?

Malware News reports serious concern within the intelligence community. Analysts are "casually chatting" about the Mythos release. Multiple officials note that U.S. agencies both defend networks and conduct offensive operations—and stockpile zero-days for future use.

Hayden Smith of Hunted Labs calls the news "scary and ominous" because the offensive potential is unclear. "Even with deep vetting, the odds of Mythos flowing into the wrong hands is barely a hypothetical given the landscape of current attacks on the open source ecosystem."

The concern isn't just state actors. As one executive at a cyber investment firm asked: "How is anyone supposed to defend against all of this at once?"

The Patching Problem

Perhaps the most overlooked risk is the downstream impact of discovering thousands of vulnerabilities simultaneously. As Anthropic itself notes in its Red Team blog, "over 99% of the vulnerabilities we've found have not yet been patched."

Flooding maintainers—many of whom are unpaid volunteers—with critical vulnerabilities at scale could overwhelm the very processes needed to fix them. Anthropic has built a triage pipeline to manually validate reports before submission, but bottlenecks seem inevitable.

The 45-day coordinated disclosure window assumes maintainers can produce, test, and ship complex patches within that time—a presumption that may not hold for kernel-level vulnerabilities in critical systems.

Geopolitical Implications: AI as an Arms Race Component

The U.S. Government Relationship

Morgan Adamski, former executive director at U.S. Cyber Command, notes that "there's obviously a huge potential there from an adversarial perspective" for offensive use. She highlights an "equity conversation": if the U.S. exploits something in an adversarial network, it must also defend against that same vulnerability in its own infrastructure.

Anthropic has briefed senior officials across the U.S. government on Mythos's capabilities, including both offensive and defensive applications. This comes after contentious disputes with the Pentagon over military uses of Claude, which saw Anthropic designated a "supply chain risk" before securing a preliminary injunction.

Leah Siskind of the Foundation for Defense of Democracies argues that the government "needs to make amends with Anthropic and help them and Glasswing members maintain the American lead on AI by preventing Chinese model theft."

The International Dimension

As Project Glasswing proceeds, other nations (particularly China, Russia, and U.S. adversaries) will almost certainly develop or acquire similar capabilities. Mythos-level models will eventually proliferate. The question isn't whether, but when—and whether the defensive advantages gained during the controlled rollout period will be durable.

One concern: By making Mythos capabilities known while restricting access, Anthropic may have inadvertently created a roadmap for other AI labs to target. The technical specifications described in the system card provide a benchmark to aim for.

Trust and Irony: The Custodian Problem

Anthropic's Security Track Record

It is rich irony that Anthropic—asking governments and Fortune 500 companies to trust it with a model capable of autonomously exploiting Linux kernels—has suffered notable security lapses:

  1. A draft Mythos blog post was left in an unsecured, publicly searchable data store in March 2026, exposing roughly 3,000 internal assets
  2. For approximately three hours in March 2026, anyone running npm install on Claude Code pulled down 512,000 lines of Anthropic's source code due to a packaging error

Nicholas Carlini of Anthropic distinguishes these as "human errors in publishing tooling" rather than breaches of core security architecture—accurate as far as it goes, but a distinction that may not reassure stakeholders.

The Boy Who Cried Wolf?

There is legitimate concern about alarm fatigue. As Hacker News commenters note, every model is framed as revolutionizing everything, predicting doom if mishandled. When the next genuinely concerning capability arrives, will security practitioners—and the public—still be listening?

Conversely, as others pointed out: "Tuning out completely because of the existence of false positives is not a good choice." The villagers may tire of the boy crying wolf, but wolves do eventually arrive.

Pros and Cons: A Critical Summary

Pros

| Aspect | Assessment |
| --- | --- |
| Genuine capability improvement | The demonstrated ability to autonomously find and chain vulnerabilities is a real step forward |
| Proactive defense | Finding bugs before adversaries do is fundamentally sound strategy |
| Open-source support | $4M in donations to OSS security addresses real asymmetries in resources |
| Responsible disclosure pipeline | Triage and human validation demonstrate awareness of maintenance bottlenecks |
| Transparency | Detailed technical documentation with cryptographic commitments shows seriousness |
| Coalition approach | Bringing competitors together on security reduces fragmentation |

Cons

| Aspect | Assessment |
| --- | --- |
| Exclusionary access | Creates dependency on Anthropic; smaller actors may be left behind |
| FOMO and coercion | Organizations may join not out of belief but fear of seeming negligent |
| Overwhelmed maintainers | Even with triage, the scale of findings risks swamping patching capacity |
| Verification limited | Access restrictions make independent verification of claims difficult |
| Business opportunism | Timing with IPO and revenue milestones suggests mixed motives |
| Geopolitical escalation | Demonstrating capabilities may accelerate adversarial AI development |
| Trust issues | Anthropic's security lapses undermine its credibility as gatekeeper |

Critical Opinions from Multiple Perspectives

The Security Community

On Hacker News, security professionals express a range of views:

  • Skeptical: "This looks more like another lobby group...The 'urgency' is very likely mostly appreciated to drive policy."
  • Concerned: "How is anyone supposed to defend against all of this at once?"
  • Measured: "I side with you but on the other hand: this is how it works to get attention by those who aren't affiliated with computer science and AI."
  • Optimistic: "At launch, a technology is considered dangerous for being too powerful. 3 months later, you are an absolute idiot to still be using that useless model."

Greg Kroah-Hartman's quote—about the "world switched" from AI slop to real reports—stands out as evidence from a respected figure in Linux development.

Industry Analysts

Paulo Carvão at Forbes takes a nuanced view, noting both genuine capability and strategic positioning: "This announcement cannot be understood in isolation" from Anthropic's revenue growth and compute deals. The restricted rollout serves multiple purposes.

Michael Nuñez at VentureBeat focuses on the fundamental wager: "Anthropic is, in essence, betting that transparency can outrun proliferation."

Intelligence and Government Concerns

Morgan Adamski emphasizes the offense-defense equivalence: "If cyberintelligence analysts find a novel vulnerability in an enemy computer network, it's possible a U.S. system might have the same vulnerability, too."

The intelligence community's "casual" discussions and serious concern about adversarial acquisition mirror the stakes: this isn't just a cybersecurity issue; it's a national security issue.

The Open-Source Perspective

Jim Zemlin, CEO of the Linux Foundation, provides perhaps the most compelling endorsement: "In the past, security expertise has been a luxury reserved for organizations with large security teams. Open-source maintainers—whose software underpins much of the world's critical infrastructure—have historically been left to figure out security on their own." Project Glasswing, he says, "offers a credible path to changing that equation."

This gets at a real problem: the asymmetry between well-resourced corporations and the volunteer-maintained projects that form software's foundation.

Conclusion: A Necessary Step, But A Flawed One?

Project Glasswing represents a genuinely significant moment in AI development. The technical capabilities of Claude Mythos Preview appear real enough that Anthropic—not a company known for understatement—is willing to frame them as too dangerous for public release. The decision to limit access to defensive partners and invest in open-source security is, in principle, defensible.

But the initiative is also deeply problematic:

  1. It concentrates power in Anthropic's hands during a transition period that will be contested globally
  2. It markets through scarcity, creating artificial urgency that serves business interests
  3. It may overwhelm the very maintenance processes needed to address discovered vulnerabilities
  4. It invites escalation, as other labs rush to match or exceed demonstrated capabilities
  5. It suffers from trust deficits, given Anthropic's own security history and the incentives of a company on an IPO trajectory

The core question—whether Project Glasswing genuinely makes the world more secure, or merely reshapes advantage within existing power structures—has no clear answer yet. The only certainty is that the age of AI-augmented cyberconflict has begun in earnest. The glasswing's transparent wings hide vulnerabilities well. But in seeking to reveal those vulnerabilities to defenders first, Anthropic may have revealed something else: just how quickly the ground beneath cybersecurity's feet is shifting.

In the coming months—before the next frontier lab announces its own game-changing model, before adversarial access reaches Mythos-equivalent levels, before the inevitable disclosure of vulnerabilities that even Anthropic cannot contain—we will learn whether controlled releases like Project Glasswing can genuinely preserve a defensive advantage, or whether the fundamental symmetries of offense and defense make this a game of diminishing returns.

The wolf may or may not have arrived. But when it does, the villages that invested in defenses during the calm will have a better chance. Whether Anthropic should be the one selling those defenses is the question that remains.