HackerNoon

We are an open and international community of 45,000+ contributing writers publishing stories and expertise for 4+ million curious and insightful monthly readers.

RSS preview of Blog of HackerNoon

What Is Volume Shadow Copy, and Why Does Windows 11 Depend on It?

2026-02-02 05:48:10

This article explains how Volume Shadow Copy and the Volume Shadow Copy Service work in Windows 11, covering snapshot behavior, copy-on-write storage, dependent services, and key administrative commands for managing system recovery.

LLMs as Integration Endpoints: Building Apache Camel Routes with LangChain4j Chat

2026-02-02 05:19:47

Building AI features is straightforward until you need to integrate them into a production system.

In production, a simple 'call an LLM' often becomes:

  • wiring HTTP/JMS/Kafka/file inputs,
  • enforcing timeouts, retries, fallbacks,
  • stitching context obtained from internal data sources,
  • protecting secrets,
  • making it testable without burning API credits,
  • and ensuring the system is observable when issues occur overnight.

This is where Apache Camel excels. With Camel 4.5 and later, you can handle LLM calls as standard integration endpoints using camel-langchain4j-chat, powered by LangChain4j.

In this tutorial, you will build a Java 21 and Gradle project that demonstrates:

  1. Single-message chat (CHAT_SINGLE_MESSAGE)
  2. Prompt templates + variables (CHAT_SINGLE_MESSAGE_WITH_PROMPT)
  3. Chat history (CHAT_MULTIPLE_MESSAGES)
  4. RAG with Camel’s Content Enricher (EIP + aggregator strategy)
  5. RAG via headers (simple “inject context” approach)
  6. A mock mode so everything executes in CI without API keys.

You’ll finish with a project you can reuse as a foundation for real integration flows, without turning your codebase into a “prompt spaghetti factory”.

What We’re Building

A small runnable CLI app that boots Camel Main (no Spring required) and runs five demos:

  • single
  • prompt
  • history
  • rag-enrich
  • rag-header

You can run them all or one at a time from the command line.

Prerequisites

  • Java 21
  • Basic Apache Camel familiarity (routes, direct: endpoints, ProducerTemplate)
  • Optional: OpenAI API key (the tutorial runs fully without it)

Architecture in One Picture

Here’s the mental model:

A message enters a Camel route from any endpoint, and the LLM call is just another endpoint in that route. For RAG, Camel enriches the exchange with retrieved context before calling the LLM.

Project Setup

Tech Stack

  • Java 21
  • Apache Camel 4.17.0
  • LangChain4j 1.10.0
  • Gradle 8.x
  • OpenAI GPT-4o-mini (optional)

Folder Structure

camel-langchain4j-chat-demo/
├── build.gradle
├── settings.gradle
├── src/
│   ├── main/
│   │   ├── java/com/example/langchain4j/
│   │   │   ├── App.java
│   │   │   ├── ChatRoutes.java
│   │   │   ├── MockChatModel.java
│   │   │   └── ModelFactory.java
│   │   └── resources/
│   │       ├── application.properties
│   │       └── logback.xml
│   └── test/
│       └── java/com/example/langchain4j/
│           └── ChatRoutesTest.java
└── README.md

Step 1: Gradle Setup

Create build.gradle:

plugins {
    id 'java'
    id 'application'
}

group = 'com.example'
version = '1.0.0'

java {
    toolchain {
        languageVersion = JavaLanguageVersion.of(21)
    }
}

application {
    mainClass = 'com.example.langchain4j.App'
}

repositories {
    mavenCentral()
}

dependencies {
    implementation 'org.apache.camel:camel-core:4.17.0'
    implementation 'org.apache.camel:camel-main:4.17.0'
    implementation 'org.apache.camel:camel-langchain4j-chat:4.17.0'
    implementation 'dev.langchain4j:langchain4j-core:1.10.0'
    implementation 'dev.langchain4j:langchain4j-open-ai:1.10.0'
    implementation 'ch.qos.logback:logback-classic:1.5.15'
    testImplementation 'org.junit.jupiter:junit-jupiter:5.11.4'
    testImplementation 'org.apache.camel:camel-test-junit5:4.17.0'
    testImplementation 'org.assertj:assertj-core:3.27.3'
}

test {
    useJUnitPlatform()
}

And settings.gradle:

rootProject.name = 'camel-langchain4j-chat-demo' 

Step 2: Configuration and Logging

application.properties

# Application Mode (mock or openai); mock runs everything without an API key
app.mode=mock

# OpenAI Configuration (only used when app.mode=openai)
openai.apiKey=sk-*****
openai.modelName=gpt-4o-mini
openai.temperature=0.3

# Apache Camel Configuration
camel.main.name=camel-langchain4j-chat-demo
camel.main.duration-max-seconds=0
camel.main.shutdown-timeout=30

# Logging Configuration
logging.level.root=INFO
logging.level.org.apache.camel=INFO
logging.level.com.example.langchain4j=DEBUG

logback.xml

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
        </encoder>
    </appender>

    <root level="INFO">
        <appender-ref ref="CONSOLE"/>
    </root>

    <logger name="com.example.langchain4j" level="DEBUG"/>
    <logger name="org.apache.camel" level="INFO"/>
</configuration>

Step 3: Production-Friendly “Mock Mode” (No API Key Required)

If you want this to be more than a toy demo, you need a way to run it without external dependencies.

That’s why we implement a deterministic MockChatModel:

MockChatModel.java

package com.example.langchain4j;

import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.data.message.ChatMessage;
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.response.ChatResponse;

/**
 * Mock implementation of ChatModel for testing and demo without API keys.
 */
public class MockChatModel implements ChatModel {

    @Override
    public ChatResponse chat(ChatRequest request) {
        StringBuilder response = new StringBuilder("[MOCK] Responding to: ");

        if (request != null && request.messages() != null && !request.messages().isEmpty()) {
            ChatMessage lastMessage = request.messages().get(request.messages().size() - 1);
            // ChatMessage subtypes expose their text differently; try text() reflectively and fall back to toString()
            String userText = "";
            try {
                userText = (String) lastMessage.getClass().getMethod("text").invoke(lastMessage);
            } catch (Exception e) {
                userText = lastMessage.toString();
            }

            // Generate deterministic response based on content
            if (userText.contains("recipe") || userText.contains("dish")) {
                response.append("Here's a delicious recipe with your requested ingredients!");
            } else if (userText.contains("Apache Camel")) {
                response.append("Apache Camel is a powerful integration framework!");
            } else if (userText.contains("capital")) {
                response.append("Paris is the capital of France.");
            } else {
                response.append("I understand your question about: ").append(userText.substring(0, Math.min(50, userText.length())));
            }
        } else {
            response.append("Hello! I'm a mock AI assistant.");
        }

        return ChatResponse.builder()
                .aiMessage(new AiMessage(response.toString()))
                .build();
    }
}

Why this matters in production:

  • CI/CD runs don’t require an LLM.
  • Unit tests are stable.
  • Engineers can work offline.
  • You can validate flow logic (routing, RAG, templates) before spending money.

Step 4: Choosing a Model at Runtime (OpenAI or Mock)

ModelFactory.java

package com.example.langchain4j;

import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.openai.OpenAiChatModel;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.time.Duration;

/**
 * Factory to create ChatModel instances based on configuration.
 */
public class ModelFactory {
    private static final Logger log = LoggerFactory.getLogger(ModelFactory.class);

    public static ChatModel createChatModel(String mode, String apiKey, String modelName, Double temperature) {
        if ("openai".equalsIgnoreCase(mode) && apiKey != null && !apiKey.isEmpty()) {
            log.info("Creating OpenAI chat model with model: {}", modelName);
            return OpenAiChatModel.builder()
                    .apiKey(apiKey)
                    .modelName(modelName)
                    .temperature(temperature)
                    .timeout(Duration.ofSeconds(60))
                    .build();
        } else {
            log.info("Creating Mock chat model (API key not provided or mode is mock)");
            return new MockChatModel();
        }
    }
}

Step 5: The Camel Routes (Where AI Becomes “Just Another Endpoint”)

The Camel component URI looks like:

langchain4j-chat:chatId?chatModel=#beanName&chatOperation=OPERATION

  • chatId is just an identifier
  • chatModel=#chatModel references a bean in the Camel registry
  • chatOperation controls the behavior

ChatRoutes.java

package com.example.langchain4j;

import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.data.message.ChatMessage;
import dev.langchain4j.data.message.SystemMessage;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.rag.content.Content;
import org.apache.camel.AggregationStrategy;
import org.apache.camel.Exchange;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.component.langchain4j.chat.LangChain4jChat;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/**
 * Apache Camel routes demonstrating camel-langchain4j-chat component usage.
 */
public class ChatRoutes extends RouteBuilder {

    @Override
    public void configure() throws Exception {

        // Demo 1: CHAT_SINGLE_MESSAGE - Simple question/answer
        from("direct:single")
            .log("Demo 1: Single message - Input: ${body}")
            .to("langchain4j-chat:demo?chatModel=#chatModel&chatOperation=CHAT_SINGLE_MESSAGE")
            .log("Demo 1: Response: ${body}");

        // Demo 2: CHAT_SINGLE_MESSAGE_WITH_PROMPT - Using prompt template with variables
        from("direct:prompt")
            .log("Demo 2: Prompt template - Variables: ${body}")
            .process(exchange -> {
                // Set the prompt template in header
                String template = "Create a recipe for a {{dishType}} with these ingredients: {{ingredients}}";
                exchange.getIn().setHeader("CamelLangChain4jChatPromptTemplate", template);
            })
            .to("langchain4j-chat:demo?chatModel=#chatModel&chatOperation=CHAT_SINGLE_MESSAGE_WITH_PROMPT")
            .log("Demo 2: Response: ${body}");

        // Demo 3: CHAT_MULTIPLE_MESSAGES - Chat with history/context
        from("direct:history")
            .log("Demo 3: Multiple messages with history")
            .process(exchange -> {
                // Build a conversation with system message, previous context, and user question
                List<ChatMessage> messages = new ArrayList<>();
                messages.add(new SystemMessage("You are a helpful AI assistant specialized in Apache Camel."));
                messages.add(new UserMessage("What is Apache Camel?"));
                messages.add(new AiMessage("Apache Camel is an open-source integration framework based on enterprise integration patterns."));
                messages.add(new UserMessage("What are some key features?"));

                exchange.getIn().setBody(messages);
            })
            .to("langchain4j-chat:demo?chatModel=#chatModel&chatOperation=CHAT_MULTIPLE_MESSAGES")
            .log("Demo 3: Response: ${body}");

        // Demo 4: RAG using Content Enricher pattern with LangChain4jRagAggregatorStrategy
        from("direct:rag-enrich")
            .log("Demo 4: RAG with Content Enricher - Question: ${body}")
            .enrich("direct:rag-source", new LangChain4jRagAggregatorStrategy())
            .to("langchain4j-chat:demo?chatModel=#chatModel&chatOperation=CHAT_SINGLE_MESSAGE")
            .log("Demo 4: Response: ${body}");

        // RAG knowledge source route
        from("direct:rag-source")
            .log("Fetching RAG knowledge...")
            .process(exchange -> {
                // Simulate fetching relevant documents/snippets
                List<String> knowledgeSnippets = new ArrayList<>();
                knowledgeSnippets.add("Apache Camel 4.x introduced the concept of lightweight mode for faster startup.");
                knowledgeSnippets.add("The camel-langchain4j-chat component supports multiple chat operations including single message, prompt templates, and chat history.");
                knowledgeSnippets.add("LangChain4j integration allows Camel routes to interact with various LLM providers like OpenAI, Azure OpenAI, and more.");

                exchange.getIn().setBody(knowledgeSnippets);
            });

        // Demo 5: RAG using CamelLangChain4jChatAugmentedData header
        from("direct:rag-header")
            .log("Demo 5: RAG with header - Question: ${body}")
            .process(exchange -> {
                String question = exchange.getIn().getBody(String.class);

                // Create augmented data content
                List<Content> augmentedData = new ArrayList<>();
                augmentedData.add(Content.from("Apache Camel version 4.0 was released in 2023 with major improvements."));
                augmentedData.add(Content.from("The LangChain4j component enables AI-powered integration patterns in Camel routes."));

                // Set augmented data in header
                exchange.getIn().setHeader("CamelLangChain4jChatAugmentedData", augmentedData);

                // Reset body to the question
                exchange.getIn().setBody(question);
            })
            .to("langchain4j-chat:demo?chatModel=#chatModel&chatOperation=CHAT_SINGLE_MESSAGE")
            .log("Demo 5: Response: ${body}");
    }

    /**
     * Custom aggregation strategy for RAG pattern using Content Enricher.
     */
    private static class LangChain4jRagAggregatorStrategy implements AggregationStrategy {
        @Override
        public Exchange aggregate(Exchange original, Exchange resource) {
            String question = original.getIn().getBody(String.class);
            List<String> knowledgeSnippets = resource.getIn().getBody(List.class);

            // Build augmented prompt with context
            StringBuilder augmentedPrompt = new StringBuilder();
            augmentedPrompt.append("Context:\n");
            for (String snippet : knowledgeSnippets) {
                augmentedPrompt.append("- ").append(snippet).append("\n");
            }
            augmentedPrompt.append("\nQuestion: ").append(question);

            original.getIn().setBody(augmentedPrompt.toString());
            return original;
        }
    }
}

Why Camel’s approach is useful

Camel gives you EIPs (Enterprise Integration Patterns) that apply beautifully to AI:

  • Content Enricher → RAG
  • Circuit breaker → LLM resilience
  • Throttling → cost control
  • Dead letter channel → failure routing
  • Idempotency → don’t process the same request twice

You don’t need a new architecture. You reuse proven integration patterns.
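
As a sketch of how two of those patterns wrap the same chat endpoint used in this tutorial (the requestId header and the error route name are illustrative assumptions, not part of the demo project):

// Minimal sketch: dead letter channel + idempotent consumer around the LLM call.
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.support.processor.idempotent.MemoryIdempotentRepository;

public class GuardedChatRoutes extends RouteBuilder {
    @Override
    public void configure() {
        // Failed exchanges end up in a dead letter endpoint instead of being lost.
        errorHandler(deadLetterChannel("direct:llm-errors")
                .maximumRedeliveries(2)
                .redeliveryDelay(1000));

        from("direct:guarded-chat")
            // Skip duplicates so the same request isn't sent (and billed) twice.
            .idempotentConsumer(header("requestId"),
                    MemoryIdempotentRepository.memoryIdempotentRepository(1000))
            .to("langchain4j-chat:demo?chatModel=#chatModel&chatOperation=CHAT_SINGLE_MESSAGE");

        from("direct:llm-errors")
            .log("LLM call failed after retries: ${exception.message}");
    }
}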

Step 6: Bootstrapping Camel Main + Running Demos

App.java

package com.example.langchain4j;

import dev.langchain4j.model.chat.ChatModel;
import org.apache.camel.CamelContext;
import org.apache.camel.ProducerTemplate;
import org.apache.camel.main.Main;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.util.HashMap;
import java.util.Map;

/**
 * Main application class demonstrating Apache Camel LangChain4j Chat component.
 */
public class App {
    private static final Logger log = LoggerFactory.getLogger(App.class);

    public static void main(String[] args) throws Exception {
        Main main = new Main();

        // Configure Camel Main
        main.configure().addRoutesBuilder(new ChatRoutes());

        // Initialize Camel context first (without starting routes)
        main.init();

        CamelContext camelContext = main.getCamelContext();

        // Read configuration from properties
        String mode = camelContext.resolvePropertyPlaceholders("{{app.mode}}");
        String apiKey = System.getenv("OPENAI_API_KEY");
        if (apiKey == null || apiKey.isEmpty()) {
            apiKey = camelContext.resolvePropertyPlaceholders("{{openai.apiKey}}");
        }
        String modelName = camelContext.resolvePropertyPlaceholders("{{openai.modelName}}");
        Double temperature = Double.parseDouble(camelContext.resolvePropertyPlaceholders("{{openai.temperature}}"));

        // Create and register ChatModel BEFORE starting routes
        ChatModel chatModel = ModelFactory.createChatModel(mode, apiKey, modelName, temperature);
        camelContext.getRegistry().bind("chatModel", chatModel);

        log.info("=".repeat(80));
        log.info("Apache Camel LangChain4j Chat Demo");
        log.info("Mode: {}", mode);
        log.info("=".repeat(80));

        // Now start Camel
        main.start();

        try {
            // Determine which demo(s) to run
            String demoMode = args.length > 0 ? args[0] : "all";

            ProducerTemplate template = camelContext.createProducerTemplate();

            switch (demoMode.toLowerCase()) {
                case "single":
                    runSingleMessageDemo(template);
                    break;
                case "prompt":
                    runPromptTemplateDemo(template);
                    break;
                case "history":
                    runChatHistoryDemo(template);
                    break;
                case "rag-enrich":
                    runRagEnrichDemo(template);
                    break;
                case "rag-header":
                    runRagHeaderDemo(template);
                    break;
                case "all":
                default:
                    runAllDemos(template);
                    break;
            }

            log.info("=".repeat(80));
            log.info("Demo completed! Shutting down...");
            log.info("=".repeat(80));

        } finally {
            // Shutdown after demos complete
            main.stop();
        }
    }

    private static void runAllDemos(ProducerTemplate template) {
        runSingleMessageDemo(template);
        runPromptTemplateDemo(template);
        runChatHistoryDemo(template);
        runRagEnrichDemo(template);
        runRagHeaderDemo(template);
    }

    private static void runSingleMessageDemo(ProducerTemplate template) {
        log.info("\n" + "=".repeat(80));
        log.info("DEMO 1: CHAT_SINGLE_MESSAGE");
        log.info("=".repeat(80));

        String question = "What is the capital of France?";
        String response = template.requestBody("direct:single", question, String.class);

        log.info("Question: {}", question);
        log.info("Answer: {}", response);
    }

    private static void runPromptTemplateDemo(ProducerTemplate template) {
        log.info("\n" + "=".repeat(80));
        log.info("DEMO 2: CHAT_SINGLE_MESSAGE_WITH_PROMPT (Prompt Template)");
        log.info("=".repeat(80));

        Map<String, Object> variables = new HashMap<>();
        variables.put("dishType", "pasta");
        variables.put("ingredients", "tomatoes, garlic, basil, olive oil");

        String response = template.requestBody("direct:prompt", variables, String.class);

        log.info("Template: Create a recipe for a {{dishType}} with these ingredients: {{ingredients}}");
        log.info("Variables: {}", variables);
        log.info("Answer: {}", response);
    }

    private static void runChatHistoryDemo(ProducerTemplate template) {
        log.info("\n" + "=".repeat(80));
        log.info("DEMO 3: CHAT_MULTIPLE_MESSAGES (Chat History)");
        log.info("=".repeat(80));

        log.info("Building conversation with context...");
        String response = template.requestBody("direct:history", null, String.class);

        log.info("Final Answer: {}", response);
    }

    private static void runRagEnrichDemo(ProducerTemplate template) {
        log.info("\n" + "=".repeat(80));
        log.info("DEMO 4: RAG with Content Enricher Pattern");
        log.info("=".repeat(80));

        String question = "What's new in Apache Camel 4.x?";
        String response = template.requestBody("direct:rag-enrich", question, String.class);

        log.info("Question: {}", question);
        log.info("Answer (with RAG context): {}", response);
    }

    private static void runRagHeaderDemo(ProducerTemplate template) {
        log.info("\n" + "=".repeat(80));
        log.info("DEMO 5: RAG with CamelLangChain4jChatAugmentedData Header");
        log.info("=".repeat(80));

        String question = "Tell me about the LangChain4j integration in Camel";
        String response = template.requestBody("direct:rag-header", question, String.class);

        log.info("Question: {}", question);
        log.info("Answer (with augmented data): {}", response);
    }
}

Running It

Build + run all demos (mock mode)

./gradlew clean test
./gradlew run

Run one demo

./gradlew run --args="prompt"
./gradlew run --args="rag-enrich"

Enable OpenAI mode

Option A (properties):

app.mode=openai
openai.apiKey=sk-...

Option B (recommended): environment variable

Windows (PowerShell)

setx OPENAI_API_KEY "sk-..."

Then in application.properties you can use a placeholder:

openai.apiKey=${OPENAI_API_KEY:}

Example Output (Mock Mode)

You’ll see logs like:

=== DEMO 1: CHAT_SINGLE_MESSAGE ===
Q: What is the capital of France?
A: [MOCK] Responding to: Paris is the capital of France.

Best Practices (The Stuff You’ll Appreciate Later)

1) Never hardcode API keys

Use environment variables or a secrets manager. Even in demos, set the pattern.

openai.apiKey=${OPENAI_API_KEY:}

2) Add timeouts and resilience early

LLMs are network calls. Treat them like any dependency: timeouts, retries, fallback.

In Camel, this typically becomes:

  • timeout() / circuitBreaker() (Resilience4j)
  • onException() for controlled failure paths
  • throttle() to protect budget and upstream limits
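
For instance, a minimal sketch of the circuit-breaker variant around the same chat endpoint (assuming the camel-resilience4j dependency is on the classpath; the fallback text is illustrative):

// Inside a RouteBuilder.configure(); requires org.apache.camel:camel-resilience4j.
from("direct:chat-resilient")
    .circuitBreaker()
        // Fail fast instead of hanging on a slow LLM call.
        .resilience4jConfiguration()
            .timeoutEnabled(true)
            .timeoutDuration(10000)
        .end()
        .to("langchain4j-chat:demo?chatModel=#chatModel&chatOperation=CHAT_SINGLE_MESSAGE")
    .onFallback()
        // Controlled degradation when the call fails, times out, or the breaker is open.
        .setBody(constant("The assistant is temporarily unavailable. Please try again later."))
    .end();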

3) Keep prompts versioned and externalized

Prompts are “business logic”. Don’t bury them as strings in random methods.

At minimum: store templates in resources/ and load them, or manage them as versioned assets.
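
For the demo above, that could look like this inside the direct:prompt route (the file path prompts/recipe.txt is an assumption; the processor needs java.io.InputStream and java.nio.charset.StandardCharsets imports):

.process(exchange -> {
    // Load the versioned prompt template from the classpath instead of hardcoding the string.
    try (InputStream in = getClass().getResourceAsStream("/prompts/recipe.txt")) {
        String template = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        exchange.getIn().setHeader("CamelLangChain4jChatPromptTemplate", template);
    }
})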

4) Use mock mode + tests to protect your pipeline

Your LLM code should be testable without network calls. That’s what MockChatModel gives you.

Unit Testing the Routes

ChatRoutesTest.java

package com.example.langchain4j;

import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.response.ChatResponse;
import org.apache.camel.CamelContext;
import org.apache.camel.ProducerTemplate;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.test.junit5.CamelTestSupport;
import org.junit.jupiter.api.Test;

import java.util.HashMap;
import java.util.List;
import java.util.Map;

import static org.assertj.core.api.Assertions.assertThat;

/**
 * Unit tests for ChatRoutes using MockChatModel.
 */
public class ChatRoutesTest extends CamelTestSupport {

    @Override
    protected CamelContext createCamelContext() throws Exception {
        CamelContext context = super.createCamelContext();

        // Register MockChatModel for testing
        context.getRegistry().bind("chatModel", new MockChatModel());

        return context;
    }

    @Override
    protected RouteBuilder createRouteBuilder() {
        return new ChatRoutes();
    }

    @Test
    public void testSingleMessage() {
        ProducerTemplate template = context.createProducerTemplate();

        String question = "What is the capital of France?";
        String response = template.requestBody("direct:single", question, String.class);

        assertThat(response)
            .isNotNull()
            .isNotEmpty()
            .contains("MOCK")
            .contains("capital");
    }

    @Test
    public void testPromptTemplate() {
        ProducerTemplate template = context.createProducerTemplate();

        Map<String, Object> variables = new HashMap<>();
        variables.put("dishType", "pasta");
        variables.put("ingredients", "tomatoes, garlic, basil");

        String response = template.requestBody("direct:prompt", variables, String.class);

        assertThat(response)
            .isNotNull()
            .isNotEmpty()
            .contains("MOCK")
            .containsAnyOf("recipe", "dish");
    }

    @Test
    public void testChatHistory() {
        ProducerTemplate template = context.createProducerTemplate();

        String response = template.requestBody("direct:history", null, String.class);

        assertThat(response)
            .isNotNull()
            .isNotEmpty()
            .contains("MOCK");
    }

    @Test
    public void testRagEnrich() {
        ProducerTemplate template = context.createProducerTemplate();

        String question = "What's new in Apache Camel 4.x?";
        String response = template.requestBody("direct:rag-enrich", question, String.class);

        assertThat(response)
            .isNotNull()
            .isNotEmpty()
            .contains("MOCK")
            .containsAnyOf("Camel", "Apache");
    }

    @Test
    public void testRagHeader() {
        ProducerTemplate template = context.createProducerTemplate();

        String question = "Tell me about LangChain4j in Camel";
        String response = template.requestBody("direct:rag-header", question, String.class);

        assertThat(response)
            .isNotNull()
            .isNotEmpty()
            .contains("MOCK");
    }

    @Test
    public void testMockChatModel() {
        MockChatModel mockModel = new MockChatModel();

        ChatRequest request = ChatRequest.builder()
            .messages(List.of(new UserMessage("What is Apache Camel?")))
            .build();

        ChatResponse response = mockModel.chat(request);

        assertThat(response).isNotNull();
        assertThat(response.aiMessage()).isNotNull();
        assertThat(response.aiMessage().text())
            .contains("MOCK")
            .contains("Apache Camel");
    }
}

RAG: Two Approaches (When to Use Which)

| Approach | How it works | Best for | Tradeoff |
|----|----|----|----|
| Content Enricher (enrich) | Camel pulls context from one or more routes and merges it | Dynamic retrieval, multiple sources | Slightly more code (strategy) |
| Header-based (AUGMENTED_DATA) | You directly attach known context as structured content | Simple static context or already-retrieved context | Less flexible for complex retrieval logic |

In real projects, RAG often evolves like this:

  1. Start with header-based (fast, easy, predictable)
  2. Move to enrich() when retrieval becomes a workflow (vector DB, permissions, reranking)

Real-World Use Cases (Where This Pattern Fits)

Once you’re comfortable with camel-langchain4j-chat, the same structure powers:

  • Customer support routing (tickets → context → response)
  • Document pipelines (extract → summarize → classify → route)
  • Data enrichment (incoming events → explain/label → persist)
  • Ops assistants (logs/metrics → RAG → remediation steps)

The key is Camel: it already knows how to connect and orchestrate everything around the AI call.

Conclusion: Treat LLM Calls Like Integrations, Not “Special Snowflakes”

The biggest mindset shift is this:

An LLM is not an app. It’s a dependency—like a database, a queue, or an API.

And if you treat it that way, you’ll naturally build:

  • safer prompts,
  • better error handling,
  • testability,
  • and integration flows that can scale beyond demos.


Liquidity-Focused Stablecoin Looping, Part 1: Opening and Closing Costs

2026-02-02 05:11:32

This article explores a conservative looping strategy on Fluid using USDe/USDT on Arbitrum. With ~8.9% APY and a liquidation buffer of ~12%, the strategy prioritizes liquidity and risk-adjusted returns over maximum yield. Break-even is reached after roughly two months.

Assets

Use reliable stablecoins from trusted issuers, where the probability of a depeg is very low. The final choice is always yours.

Personally, I’d start with:

  • USDC
  • USDT
  • USDe

Later, this list of reliable stablecoins can be expanded.

Fluid strategy: USDe / USDT

Fluid is a relatively new protocol, but it’s built on a robust system under the hood. Its oracle doesn’t rely on spot exchange prices; instead, it values assets based on their underlying collateral, which is important for more consistent pricing. This strategy currently offers ~8.93% APY on Arbitrum. USDe and USDT are “direct” stablecoins—neither wrapped tokens nor yield-bearing/staked variants. In general, deeper liquidity can help reduce slippage and may support more stable pricing during stressed market conditions.


:::info Disclaimer: This article is for informational purposes only and does not constitute financial, investment, or legal advice. DYOR!

:::

Strategy payback period

In looping strategies, it is crucial to estimate how long it takes to recover the costs of opening and closing a position.

Due to leverage, multiple swap operations are required, and it may turn out that the strategy needs to run for at least 2–3 weeks before it becomes profitable.

During this period, your stablecoins are effectively locked in the strategy, and exiting early may result in a net loss.

Gas costs are negligible compared to swap-related execution costs.

Basic Cost Estimation

  • Position size:

    $10,000 × 6 = $60,000 (6× leverage)

  • Price impact per swap:

    0.02%

  • Number of swaps:

    6 swaps to enter + 6 swaps to exit

  • Total swap cost per side:

    6 × 0.02% = 0.12%

  • Cost per entry or exit:

    0.12% of $60,000 = $72

  • Total round-trip cost (entry + exit):

    2 × $72 = $144

  • Cost relative to initial capital:

    $144 / $10,000 ≈ 1.44%

Profitability breakdown

  • Final APY: ~8.93%

  • Annual yield: $893

  • Break-even time:

    ~$144 is recovered in ~59 days
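
Spelling out that break-even estimate from the figures above:

$$
\frac{\$893}{365\ \text{days}} \approx \$2.45/\text{day}, \qquad
\frac{\$144}{\$2.45/\text{day}} \approx 59\ \text{days}
$$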

After the first two months, the strategy starts generating net positive returns.

Gas costs

Opening or closing a complex leveraged position typically consumes 600k–1M gas.

Typical values on Arbitrum:

L2 gas price: 0.1–0.3 gwei

ETH price: ~$3,300

At these levels:

1M gas ≈ $0.30–$0.90

To stay conservative, it’s reasonable to assume:

Entry / Exit gas cost: $1–2

Risk Assessment

As shown in the interface, with 6× leverage, the position remains in the safe zone.

Liquidation would only begin if the price of the assets deviates by approximately 12.16%, which is a significant move for stablecoins.

On Fluid, liquidations are gradual rather than instantaneous, meaning the position is reduced step by step instead of being fully liquidated at once.

The liquidation penalty is up to 2%.

While a 12% deviation is unlikely under normal conditions, extreme market stress or systemic stablecoin events may cause temporary dislocations. As with any leveraged strategy, tail risks cannot be fully eliminated.

Conclusion

This strategy is suitable for users comfortable with leveraged DeFi positions and who do not require immediate liquidity. It is not suitable for short-term capital or risk-averse investors.

A strategy like this typically turns net positive only after roughly two months. During this period, your stablecoins are effectively tied up in the position and not readily available.

Recommendation: consider opening the position during “quiet” market conditions, when execution costs (slippage/price impact and other entry costs) are significantly lower.

The strategy favors capital preservation and predictable returns over aggressive yield optimization.


What a 1960s Philosophy Book Taught Me About Shipping Production AI

2026-02-02 05:00:20

A book first published in 1962—Thomas Kuhn’s The Structure of Scientific Revolutions—isn’t where you’d expect to find guidance for shipping production AI. But Kuhn’s core point about paradigms clicked instantly for me: reliability starts when you make your “world” explicit—what exists, what’s allowed, what counts as evidence—and then enforce it. That’s exactly what I needed to migrate 750k+ CRM objects and build a post-hangup VoIP pipeline where transcripts, summaries, and 1–10 scoring don’t drift into inconsistent interpretations.

I’m what real developers might politely call a “vibe-coder.”

I can spend hours unpacking geopolitical risk or debating why the EU defaults to caution. But if you ask me about Python memory management, I’ll assume you mean my ability to remember what I named a variable last week.

I’m not a Software Engineer. I’m a Strategic Product Lead and a General Manager. I define the what and the why, align stakeholders, and I’m accountable for whether a system behaves predictably once it leaves the sandbox. I also lecture in Business Problem Solving, so I tend to approach technology as a governance problem: clarify the objective, specify the constraints, then build something that fails safely.

Two years ago, I started using AI to prototype quickly—mostly to test whether an idea had legs. It worked until the project stopped being a prototype.

Because the requirement wasn’t “build a custom CRM.” It was:

  • migrate 122,000 contacts
  • migrate 160,000 leads
  • migrate ~500,000 notes and tasks
  • integrate our VoIP center so that every call, within seconds after hangup, produces:
      • a complete transcript
      • a summary
      • a sentiment score (1–10)
      • five agent-performance scores (1–10): solution effectiveness, professionalism, script adherence, procedure adherence, clarity
      • an urgency level for escalation

At this size, ambiguity doesn’t stay local. A small mapping inconsistency becomes thousands of questionable rows. A slightly inconsistent output format becomes a long-term reporting problem. The cost isn’t just technical—it’s organizational: people stop trusting the CRM.

AI sped up prototyping. Production reliability still requires boring engineering discipline: contracts, validation, idempotency, and constraints.

The point of this article is not that I “solved AI.” The point is that I stopped treating model output as helpful text and started treating it as untrusted input—governed by explicit definitions.

For clarity: “we” refers to my workflow—me and three LLMs used in parallel for critique and cross-checking. Final design decisions and enforcement mechanisms remained human-owned.


Ontology, in the International Relations sense (Kuhn + a quick example)

In International Relations, it helps to borrow Thomas Kuhn’s idea of a paradigm: a shared framework that tells a community what counts as a legitimate problem, what counts as evidence, and what a “good” explanation looks like.

An ontology is the paradigm’s inventory and rulebook: what entities exist in your analysis, what relationships matter, and what properties are allowed to vary. If you don’t make that explicit, people can use the same words and still analyze different phenomena.

A quick example: take the same event—say, a border crisis.

  • Under a realist paradigm, the ontology centers on states as primary actors, material capabilities, balance of power, and credible threats. The analysis asks: Who has leverage? What are the incentives? What capabilities change the payoff structure?
  • Under a constructivist paradigm, the ontology expands to identities, norms, shared meanings, and legitimacy. The analysis asks: How are interests constructed? What narratives make escalation acceptable—or taboo? Which norms constrain action?

Same event. Different ontology. Different variables. Different “explanations.”

Tech analogy: realism vs constructivism is like two incompatible schemas. Same input, different fields, different allowed states—so you should expect different outputs.

That’s exactly what happens in production AI and data migration. If you don’t explicitly define the entities, relationships, and admissible states in your system, you don’t get one consistent dataset—you get parallel interpretations that collide later. The model (and sometimes the humans) will quietly invent meanings.

So we made the system’s world explicit. Then we enforced it.


The reliability pattern: contract → gate → constraints

Nothing here is novel engineering. An experienced engineer will recognize familiar patterns: contract-first design, defensive validation, and strict constraints.

What was personally novel (to me) is that an IR habit—obsession with definitions and enforcement—got me to the boring, correct patterns faster than “prompting harder” ever could.

The pattern we implemented was simple:

  • Output contract: define exactly what the model is allowed to output
  • Schema gate: validate model output before it can touch the database
  • Database constraints: enforce the same rules at rest, so invalid data can’t accumulate

It’s not “trust the model.” It’s “verify the payload.”


Architecture: post-hangup, then enforce

We designed the pipeline around one clean boundary condition: hangup.

The call ends, the audio artifact is finalized, and only then do we run transcription and scoring. Operationally, a call is only considered “closed” when its insight record is persisted (idempotent retries handle transient failures).

Figure 1: Post-hangup processing. Insights are generated after hangup and must pass a schema gate (required fields + 1–10 checks) before persistence.

High-level flow:

  • hangup event triggers processing (idempotent by call_id)
  • speech-to-text produces a complete transcript
  • transcript goes to the insights engine with the policy/protocol (the output contract and scoring rules)
  • the model returns a structured payload
  • a schema gate validates required fields, types, and 1–10 ranges
  • only validated payloads get stored; non-compliant payloads go to retry/DLQ

This is where “AI” stops being an assistant and becomes a component in a production system: the output becomes a transaction, not a suggestion.


Define the world: the output contract

Instead of asking for “sentiment,” we defined an explicit output contract:

  • transcript_text (text)
  • summary_text (text)
  • sentiment_score (integer 1..10)
  • agent_solution_score (integer 1..10)
  • agent_professionalism (integer 1..10)
  • agent_script_adherence (integer 1..10)
  • agent_procedure_adherence (integer 1..10)
  • agent_clarity (integer 1..10)
  • urgency_level (enum: Low / Medium / High)
  • key_phrases (json)
  • confidence_score (optional float 0..1)
  • analyzed_at (timestamp)
  • model_version, schema_version (strings)

Anything outside the contract is invalid. Not “close enough.” Invalid.

This is the core idea: you don’t make model outputs reliable by asking nicely. You make them reliable by treating them as untrusted input until they pass a gate.
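
To make that concrete, here is a minimal illustration of such a gate (the article doesn't show its implementation; this Java sketch covers only a few of the contract fields above, and the type and method names are hypothetical):

// Minimal schema-gate sketch: reject anything outside the contract before persistence.
record CallInsights(String transcriptText, String summaryText,
                    int sentimentScore, int agentClarity, String urgencyLevel) {}

final class SchemaGate {
    private static final java.util.Set<String> URGENCY = java.util.Set.of("Low", "Medium", "High");

    static void validate(CallInsights p) {
        require(p.transcriptText() != null && !p.transcriptText().isBlank(), "transcript_text missing");
        require(p.summaryText() != null && !p.summaryText().isBlank(), "summary_text missing");
        requireRange(p.sentimentScore(), "sentiment_score");
        requireRange(p.agentClarity(), "agent_clarity");
        require(URGENCY.contains(p.urgencyLevel()), "urgency_level not in enum");
    }

    private static void requireRange(int v, String field) {
        require(v >= 1 && v <= 10, field + " out of 1..10 range");
    }

    private static void require(boolean ok, String reason) {
        // A rejected payload goes to retry/DLQ instead of the database.
        if (!ok) throw new IllegalArgumentException("Schema gate rejected payload: " + reason);
    }
}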


Scale: your database needs to be strict, not optimistic

At 50 rows, “we’ll fix it later” is a workflow. At 750,000+ objects, it’s a myth.

So we added a second enforcement layer: the database. Not as a backup plan—as a principle. If something doesn’t fit, the system should refuse it rather than store it and let it rot.

We enforced two invariants.

1) Enforce 1:1 mechanically

If you claim “every call gets insights,” make it a database invariant:

CALL_INSIGHTS.call_id is both PK and FK to CALL_LOG.call_id

This is not a philosophical statement. It’s a mechanical guarantee.

2) Constrain scoring at rest

We store:

  • sentiment score as CHECK 1..10
  • each agent metric as CHECK 1..10
  • urgency as a controlled enum/level
  • optional confidence for auditing
  • model_version and schema_version so the system stays explainable over time

Figure 2: Storage-level guardrails. Scores are constrained integers (1–10) and call insights are enforced 1:1 (PK=FK), so invalid payloads cannot silently accumulate.

We don’t treat a 1–10 score as ground truth. It’s anchored to evidence (transcript + summary) and audited against downstream reality (lead evolution, resolution outcomes, escalation events). When the signal disagrees with outcomes, it’s traceable and correctable.

In IR terms: the “measurement” only makes sense because the ontology is explicit. You know what the object is (the call), what its properties are (scores, urgency), and what counts as a change of state (lead evolution, escalation, closure). Without that, a number is just a vibe.


Migration without semantic corruption

The VoIP pipeline was only half the story. The other half was moving the dataset into a custom CRM without importing legacy ambiguity.

The biggest risk in CRM migration isn’t missing data. It’s meaning drift—what I call semantic corruption:

  • two systems use the same label for different concepts
  • different teams treat the same field as different things
  • free-text “stages” become permanent, un-auditable taxonomy

We handled migration with the same ontology-first logic.

The schema of truth

Before scripts, we defined the target data model as a schema of truth:

  • explicit entities (Contact, Lead, Call, Call Insights, Notes/Tasks)
  • explicit relations (FKs, cardinality, invariants)
  • explicit admissible states (enums and allowed transitions)

This is ontology in the IR sense: a declared set of entities, relations, and admissible states—so your mapping can’t quietly change the phenomenon you think you are measuring.
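
For instance, one way to make "admissible states and allowed transitions" mechanical rather than tribal knowledge (a Java sketch; the stage names are hypothetical, not the author's actual pipeline):

import java.util.Map;
import java.util.Set;

// Admissible lead stages and the transitions an importer or API is allowed to make.
enum LeadStage {
    NEW, QUALIFIED, NEGOTIATION, WON, LOST;

    private static final Map<LeadStage, Set<LeadStage>> ALLOWED = Map.of(
            NEW, Set.of(QUALIFIED, LOST),
            QUALIFIED, Set.of(NEGOTIATION, LOST),
            NEGOTIATION, Set.of(WON, LOST),
            WON, Set.of(),
            LOST, Set.of());

    // A transition outside this table is rejected, not silently stored.
    boolean canTransitionTo(LeadStage next) {
        return ALLOWED.get(this).contains(next);
    }
}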

Canonical IDs and idempotent imports

A migration without canonical identifiers is a one-time import you can’t reproduce.

We enforced stable identity:

  • legacy_system + legacy_id per object
  • uniqueness constraints
  • replay-safe (idempotent) writes

That allowed reprocessing without duplicates and enabled corrections without manual surgery.

Quarantine, don’t “best effort”

Ugly data exists. The question is whether you hide it.

We split incoming records into:

  • valid → import
  • repairable → normalize + log
  • invalid → quarantine with explicit reasons

At scale, rejecting invalid data isn’t being “harsh.” It’s preventing silent rot.
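
A tiny sketch of that triage step (Java; the field names and rules are illustrative, not the actual migration code):

import java.util.Map;

// Classify one incoming legacy record into exactly one bucket; never "best effort" it in.
final class ImportTriage {
    enum Outcome { VALID, REPAIRABLE, INVALID }
    record Decision(Outcome outcome, String reason) {}

    static Decision classify(Map<String, String> record) {
        String legacyId = record.get("legacy_id");
        if (legacyId == null || legacyId.isBlank()) {
            return new Decision(Outcome.INVALID, "missing legacy_id");            // quarantine with a reason
        }
        String email = record.get("email");
        if (email != null && !email.equals(email.trim().toLowerCase())) {
            return new Decision(Outcome.REPAIRABLE, "email needs normalization");  // normalize + log
        }
        return new Decision(Outcome.VALID, "ok");                                  // safe to import
    }
}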


Tooling note (brief and honest)

We use OpenAI in production, selected after comparative testing against Gemini about five months ago. This isn’t a lab write-up—we’re writing after the pipeline has been running long enough for us to validate operational usefulness.

The transferable idea is the pattern: contract + gate + database constraints, not the vendor.


:::info A quick note on privacy (non-negotiable): Call transcripts are sensitive data. We apply role-based access, retention policies, and redaction where needed. The technical architecture is only useful if governance around data access is equally strict.

:::


Takeaways (the practical checklist)

If you’re building AI features into a system people will rely on:

  • treat model output as untrusted input
  • define an output contract (fields, types, ranges)
  • validate at the boundary (schema gate)
  • enforce at rest (database constraints)
  • version definitions (schema/model versions) for explainability
  • design around a clean boundary condition (for us: post-hangup)
  • keep evidence (transcript/summary) so scores are traceable and correctable

You don’t need to become a data scientist to do this. But you do need to stop thinking like a prompt writer and start thinking like a system designer.

Reference(s)

Kuhn, T. S. The Structure of Scientific Revolutions (originally published 1962). University of Chicago Press. https://doi.org/10.7208/chicago/9780226458106.001.0001


Seven Operational Practices That Reduce Downtime in Large-Scale Cloud Incidents

2026-02-02 04:53:58

High-scale systems fail in many unexpected ways that you would never have designed for. Over the course of the last 14 years, I have navigated the layers of physical and virtual networking. I started out as an individual contributor writing code for data plane services and transitioned to leading global teams managing highly distributed services owning millions of hosts. I have seen a wide range of incidents, such as multi-service impact, single-service impact, cascading failures, single customer issues, service failure during incident recovery, service failure post-recovery, and the inability for services to auto-recover. The list goes on and on. I have studied the root causes of major outages across the industry’s cloud leaders. There are common failure patterns across the industry. While these events are inevitable, based on my experience, adhering to the below best practices for managing failures will greatly improve your ability to handle them. These are the seven best practices I recommend to keep teams efficient during large-scale incidents and help reduce the impact time.

1. Mitigate First, Root Cause Later

During an outage, the natural tendency from engineering teams would be to find the underlying cause of the issue. However, you should always drive the discussion towards how to mitigate the issue. I’ve seen teams spend a lot of critical time debugging code while customer impact is still ongoing. Most of the time, you don’t need to know the root cause in order to mitigate the issue or to execute the recovery steps. If an incident correlates with an ongoing deployment causing a spike in 5xx errors, you should roll back the deployment immediately instead of debugging the code to identify the bug. If a single host is experiencing failure, remove it from the fleet immediately. You can perform the deep-dive analysis later once the impact is mitigated. If you have enough hands on deck, you can divide and conquer by tasking one group with immediate mitigation, and another with the root cause investigation.

2. Don’t Be a Hero: Ask for Help When You Need It

Earlier in my career, I mistakenly thought reaching out for help would be seen as a sign of weakness. I was always tempted to try to solve every incident myself in order to prove my technical and operational ability. But over time, I have realized this is often an incorrect approach since it creates a single point of failure, delaying the mitigation. The rule of thumb is simple: escalate the moment you are blocked. A peer who is a domain expert or senior tech lead brings more experience to the table by correlating with previous outages, something you won’t be able to do easily under pressure or when you’re blocked.

3. Test Your Tools Regularly

Reliable tooling is critical and required to handle an operational incident. Teams often rely on scripts or automation that are only used during rare events, and they will likely fail since they aren’t exercised regularly. Not having the tools working during an incident when you need them the most will further delay the mitigation and create more pressure on the teams to use manual, untested approaches during incidents. Using an untested approach can always result in errors which may increase the impact or delay the mitigation further. You should treat your operational tools with the same rigor and quality as your production code. One way to do this is by having unit tests and component-level tests run every time the tools are updated or when their dependencies change in a pre-production environment. By catching software changes that break the tools immediately and not during a large-scale event, you will increase your team’s effectiveness and operational posture for handling service outages.

4. Verify and Validate

When making production changes in response to an ongoing incident, it is better to be slow and safe than to rush and break things. There are many examples of someone running a wrong command or making a manual change to a system, service, or database during an event that breaks production. Always verify production changes via tests, additional reviews, and approvals before executing them. This approach can be enforced by mandating every production change goes through a formal peer review or an "over the shoulder" second pair of eyes review. Taking an extra 30 seconds to verify your work will potentially avert errors that can cause more impact during incidents. After execution of a command or change, it is equally important to validate the result by querying or inspecting if the change behaved as expected.

5. Avoid the "Context Tax"

One of the most common patterns in large-scale event handling is not having a common understanding of the issue at hand. Every time a new person joins the bridge and asks the same questions about the event, it results in operators context switching from mitigation to explaining what they know about the issue. This pattern can be avoided by having a clear summary of the event written down in the event tracker and deferring questions to it. A good summary must include a start time, nature of impact (latency vs. errors), magnitude of impact, scope (partition vs. zonal vs. regional), recovery metrics to track, and active threads with owners and estimated time of completion. This approach helps avoid losing valuable time and allows operators to stay focused on mitigation. Another element of good operational hygiene is to always post a blurb explaining the relevance of a graph, instead of just posting the graph without any context. This tells everyone what the data is showing so they can support you more effectively.

6. Aggressively Filter Distractions

It is important to stay focused on mitigation during a large-scale event. Given large-scale complex events have many participants, many of them will have their own theory of what the issue might be. While it’s good to hear different perspectives and think of various possibilities, it is often counter-productive and can make the call go in circles for hours. This is usually the case because often these theories are not backed up by data or evidence. An incident manager must keep the discussion on a logical, data-driven path and track associated investigation threads in a visible document. If a participant offers a new theory that isn’t backed by data, it should be moved to a backlog of pending action items or should be investigated separately from the main threads.

7. Drills Over Documentation

Teams typically use documents, training videos, and standard operating procedures guides to onboard new members to on-call rotations. While this sounds like a reasonable approach, I have found that new members are more effective if they are provided hands-on exposure. You can achieve this by shifting your onboarding process to include operational drills in addition to training material. The operational drills can be simulated in a pre-production environment. During these drills, you can have your new on-calls mitigate the issue by using the tools, following the SOPs and executing the escalation process, similar to how they would handle production events. Being well-prepared through drills will help operators stay calm and be better equipped to handle real events.

Final Thoughts

Networking components and large-scale distributed systems relying on cloud infrastructure have become critical, foundational components for many software companies. It is essential to ensure high availability and resilience for these components. We have read about many cloud outages that can disrupt day-to-day operations, impacting several sectors of the industry that rely on cloud companies. As the complexities of these systems increase over time, it is important to build discipline in operational hygiene to manage them. Especially given the rise in AI adoption, the interdependencies between services are multiplying at a rapid pace. While it is not possible to completely avoid failures, it is critical to have processes and a culture in place to recover quickly and ensure minimal disruption for use cases dependent on cloud technologies. By prioritizing the best practices mentioned above, we move from a reactive mode to a proactive, well-prepared, and more disciplined operational culture.


I Built a Zero-Dependency ngrok Alternative in Go

2026-02-02 04:45:45

I discovered nport (https://github.com/tuanngocptn/nport) - a fantastic ngrok alternative built in Node.js. It’s free, open-source, and uses Cloudflare’s infrastructure. But I wanted something with:

  • Smaller footprint - Single binary, no Node.js runtime
  • Faster startup - Go’s compilation speed
  • Better concurrency - Native goroutines
  • Learning opportunity - Deep dive into tunneling tech

So I decided to build something myself. Below, I’ll walk you through what I built, interrogating some of my core decisions along the way.

Why Go?

Performance Comparison:

Binary size

  • Node.js (nport): ~50MB + Node.js runtime
  • Go (golocalport): ~10MB standalone

Startup time

  • Node.js (nport): ~500ms
  • Go (golocalport): ~50ms

Memory usage

  • Node.js (nport): ~30MB
  • Go (golocalport): ~5MB

Concurrency

  • Node.js (nport): Event loop
  • Go (golocalport): Native goroutines

Dependencies

  • Node.js (nport): Many npm packages
  • Go (golocalport): Zero external (stdlib only)

Architecture Overview

The system is built with clean separation of concerns:

Core Components:

  1. CLI Interface - Flag parsing, user interaction
  2. API Client - Communicates with backend
  3. Binary Manager - Downloads/manages cloudflared
  4. Tunnel Orchestrator - Lifecycle management
  5. State Manager - Thread-safe runtime state
  6. UI Display - Pretty terminal output

Implementation Journey

Phase 1: Project Setup (15 minutes)

Started with the basics:

go mod init github.com/devshark/golocalport

Created clean project structure:

golocalport/
├── cmd/golocalport/main.go       # Entry point
├── internal/
│   ├── api/                 # Backend client
│   ├── binary/              # Cloudflared manager
│   ├── config/              # Configuration
│   ├── state/               # State management
│   ├── tunnel/              # Orchestrator
│   └── ui/                  # Display
└── server/                  # Backend API

Phase 2: Core Infrastructure (30 minutes)

Config Package - Dead simple constants:

const (
    Version        = "0.1.0"
    DefaultPort    = 8080
    DefaultBackend = "https://api.golocalport.link"
    TunnelTimeout  = 4 * time.Hour
)

State Manager - Thread-safe with mutex:

type State struct {
    mu          sync.RWMutex
    TunnelID    string
    Subdomain   string
    Port        int
    Process     *exec.Cmd
    StartTime   time.Time
}

Phase 3: API Client (20 minutes)

Simple HTTP client for backend communication:

func (c *Client) CreateTunnel(subdomain, backendURL string) (*CreateResponse, error) {
    body, _ := json.Marshal(map[string]string{"subdomain": subdomain})
    resp, err := c.httpClient.Post(backendURL, "application/json", bytes.NewBuffer(body))
    // ... handle response
}

Phase 4: Binary Manager (45 minutes)

Challenge: macOS cloudflared comes as .tgz, not raw binary.

Solution: Detect file type and extract:

func Download(binPath string) error {
    url := getDownloadURL()
    resp, err := http.Get(url)
    if err != nil {
        return err
    }
    defer resp.Body.Close()

    // Handle .tgz files for macOS
    if filepath.Ext(url) == ".tgz" {
        return extractTgz(resp.Body, binPath)
    }

    // Direct binary for Linux/Windows
    // ...
}

Cross-platform URL mapping:

urls := map[string]string{
    "darwin-amd64":  baseURL + "/cloudflared-darwin-amd64.tgz",
    "darwin-arm64":  baseURL + "/cloudflared-darwin-amd64.tgz",
    "linux-amd64":   baseURL + "/cloudflared-linux-amd64",
    "windows-amd64": baseURL + "/cloudflared-windows-amd64.exe",
}

Phase 5: Tunnel Orchestrator (30 minutes)

Coordinates everything:

func Start(cfg *config.Config) error {
    // 1. Ensure binary exists
    if !binary.Exists(config.BinPath) {
        binary.Download(config.BinPath)
    }

    // 2. Create tunnel via API
    resp, err := client.CreateTunnel(cfg.Subdomain, cfg.BackendURL)

    // 3. Start cloudflared process
    cmd, err := binary.Spawn(config.BinPath, resp.TunnelToken, cfg.Port)

    // 4. Setup timeout & signal handling
    timer := time.AfterFunc(config.TunnelTimeout, Cleanup)
    signal.Notify(sigChan, os.Interrupt, syscall.SIGTERM)
    <-sigChan
}

Phase 6: CLI Interface (15 minutes)

Standard library flag package - no dependencies needed:

subdomain := flag.String("s", "", "Custom subdomain")
backend := flag.String("b", "", "Backend URL")
version := flag.Bool("v", false, "Show version")
flag.Parse()

port := config.DefaultPort
if flag.NArg() > 0 {
    port, _ = strconv.Atoi(flag.Arg(0))
}

Phase 7: Backend Server (45 minutes)

Built a minimal Go server instead of using Cloudflare Workers:

Why?

  • Full control
  • Easy to self-host
  • No vendor lock-in
  • Can run anywhere

Implementation:

func handleCreate(w http.ResponseWriter, r *http.Request) {
    // 1. Create Cloudflare Tunnel
    tunnelID, token, err := createCloudflaredTunnel(subdomain)

    // 2. Create DNS CNAME record
    fullDomain := fmt.Sprintf("%s.%s", subdomain, cfDomain)
    cnameTarget := fmt.Sprintf("%s.cfargotunnel.com", tunnelID)
    createDNSRecord(fullDomain, cnameTarget)

    // 3. Return credentials
    json.NewEncoder(w).Encode(CreateResponse{
        Success:     true,
        TunnelID:    tunnelID,
        TunnelToken: token,
        URL:         fmt.Sprintf("https://%s", fullDomain),
    })
}

Cloudflare API integration (~100 lines):

func cfRequest(method, url string, body interface{}) (json.RawMessage, error) {
    req, _ := http.NewRequest(method, url, reqBody)
    req.Header.Set("Authorization", "Bearer "+cfAPIToken)
    req.Header.Set("Content-Type", "application/json")
    // ... handle response
}


Final Stats

Client (GoLocalPort CLI)

  • Files: 7 Go files
  • Lines of Code: ~600
  • Dependencies: 0 external (stdlib only)
  • Binary Size: ~8MB
  • Build Time: ~2 seconds

Server (Backend API)

  • Files: 2 Go files
  • Lines of Code: ~200
  • Dependencies: 0 external (stdlib only)
  • Deployment: Fly.io, Railway, Docker, VPS

Total Development Time

  • Planning & Analysis: 30 minutes
  • Client Implementation: 2 hours
  • Server Implementation: 45 minutes
  • Documentation: 30 minutes
  • Total: ~3.5 hours


How It Works

The flow is straightforward:

  1. You run golocalport 3000 -s myapp

  2. GoLocalPort creates a Cloudflare Tunnel via the backend API

  3. DNS record is created: myapp.golocalport.link → Cloudflare Edge

  4. Cloudflared connects your localhost:3000 to Cloudflare

  5. Traffic flows through Cloudflare’s network to your machine

  6. On exit (Ctrl+C), tunnel and DNS are cleaned up

    Internet → Cloudflare Edge → Cloudflare Tunnel → Your localhost:3000 (https://myapp.golocalport.link)

Usage

Client:

# Build
go build -o golocalport cmd/golocalport/main.go

# Run with random subdomain
./golocalport 3000

# Run with custom subdomain
./golocalport 3000 -s myapp
# Creates: https://myapp.yourdomain.com

Server:

# Deploy to Fly.io (free)
cd server
fly launch
fly secrets set CF_ACCOUNT_ID=xxx CF_ZONE_ID=xxx CF_API_TOKEN=xxx CF_DOMAIN=yourdomain.com
fly deploy

Key Learnings

1. Go’s Stdlib is Powerful

No external dependencies needed for:

  • HTTP client/server
  • JSON parsing
  • Tar/gzip extraction
  • Process management
  • Signal handling

2. Cloudflare Tunnels are Amazing

  • Free tier is generous
  • Global edge network
  • Automatic HTTPS
  • No port forwarding needed
  • Works behind NAT/firewalls

3. Minimal Code is Better

  • Easier to maintain
  • Faster to understand
  • Fewer bugs
  • Better performance

4. Cross-Platform is Tricky

Different binary formats per OS:

  • macOS: .tgz archive
  • Linux: raw binary
  • Windows: .exe

Solution: Runtime detection + extraction logic

Challenges & Solutions

Challenge 1: Binary Format Differences

  • Problem: macOS cloudflared is .tgz, not raw binary
  • Solution: Detect extension, extract tar.gz on-the-fly

Challenge 2: Thread Safety

  • Problem: Multiple goroutines accessing state
  • Solution: sync.RWMutex for safe concurrent access

Challenge 3: Graceful Shutdown

  • Problem: Cleanup on Ctrl+C
  • Solution: Signal handling + defer cleanup

Challenge 4: Backend Hosting

  • Problem: Need somewhere to run backend
  • Solution: Multiple options - Fly.io (free), Railway, Docker, VPS

What’s Next?

Planned Features

  • Update checking
  • Config file support
  • Traffic inspection/logging
  • Custom domains (not just subdomains)
  • TUI interface
  • Homebrew formula

Potential Improvements

  • Add tests (unit + integration)
  • Performance benchmarks
  • Windows/Linux testing

nport vs golocalport

Language

  • nport: JavaScript
  • golocalport: Go

Runtime

  • nport: Node.js required
  • golocalport: Standalone binary

Binary size

  • nport: ~50MB + runtime
  • golocalport: ~8MB

Startup

  • nport: ~500ms
  • golocalport: ~50ms

Memory

  • nport: ~30MB
  • golocalport: ~5MB

Dependencies

  • nport: Many npm packages
  • golocalport: Zero (stdlib)

Backend

  • nport: Cloudflare Worker
  • golocalport: Go server (self-host)

Lines of code

  • nport: ~1000
  • golocalport: ~800

Concurrency

  • nport: Event loop
  • golocalport: Goroutines

Conclusion

Building GoLocalPort was a fantastic learning experience. In just a few hours, I created a production-ready tunnel service that:

  • Works on macOS, Linux, Windows
  • Has zero external dependencies
  • Produces a tiny binary
  • Starts instantly
  • Uses minimal memory
  • Includes both client and server
  • Is fully open-source

Go proved to be the perfect choice for this type of system tool. The standard library had everything needed, and the resulting binary is small, fast, and portable.

Try It Yourself

# Clone the repo
git clone https://github.com/devshark/golocalport.git
cd golocalport

# Build
go build -o golocalport cmd/golocalport/main.go

# Run
./golocalport 3000

Visit https://www.golocalport.link/ for installation instructions and documentation.

Questions? Feedback? Open an issue on GitHub or reach out!

Made with ❤️ using Go

\