2026-04-01 11:16:39
I created an npm module that strips YAML files down to size. It can delete specific elements or replace them with empty values.
Published here:
The word "scraper" means something like "to scrape off." It even has a Travis CI badge!
Given a YAML file like this:
sample:
  hoge: I want to delete this
  fuga: I want to empty this
  hogehoge:
    hoge: I want to delete this too
    fuga: I want to empty this too
  hogera: Leave this as-is
  piyopiyo:
    piyo: I want to delete the parent of this
The library can:
- delete hoge
- keep fuga, but replace its value with an empty string
- delete the parent of piyo
The resulting output is:
sample:
  fuga: ''
  hogehoge:
    fuga: ''
  hogera: Leave this as-is
Unnecessary elements are stripped away, making the YAML file much lighter. It works at any nesting depth!
I use Swagger to generate API documentation sites. When I tried to import a Swagger file into Amazon API Gateway, I got a size limit error.
To work around that, I built this library to strip out unnecessary elements from the YAML.
Node.js is required.
If Node.js is installed, run:
npm install yaml-scraper
Here's the sample code for the operations described above:
// Load libraries
const fs = require('fs');
const scraper = require('yaml-scraper');
// Read the YAML file
const input = fs.readFileSync('./sample.yaml', 'utf8');
// Delete 'hoge', empty 'fuga', delete parent of 'piyo'
const output = scraper(input)
.delete('hoge')
.empty('fuga')
.deleteParent('piyo')
.toString();
// Print the result
console.log(output);
As you can see, delete, empty, and deleteParent are exposed as a chainable API.
The following articles were helpful during development — thank you for the clear explanations!
2026-04-01 11:15:56
I was frustrated.
Every LLM API provider logs your prompts.
OpenAI. Anthropic. Google. All of them.
For teams building on sensitive data —
healthcare, fintech, legal — this is a blocker.
So this morning I built NullLog.
A private LLM inference API with zero data retention.
Not a policy. Architecture. Nothing is ever written to storage.
Here's exactly how I built it in a few hours.
Total infra cost to run: near zero.
Customer pays Stripe
→ Webhook fires to Cloudflare Worker
→ Worker generates API key
→ Key stored in KV (email + tier only, no usage logs)
→ Customer gets key by email in 60 seconds
→ They hit /v1/chat/completions
→ Worker routes to inference
→ Response returned
→ Nothing written anywhere
export default {
  async fetch(request, env) {
    const url = new URL(request.url)
    if (url.pathname === '/v1/chat/completions') {
      const apiKey = request.headers
        .get('Authorization')
        ?.replace('Bearer ', '')
      // Validate the key; reject missing or inactive keys
      const keyData = apiKey
        ? await env.KEYS.get(apiKey, 'json')
        : null
      if (!keyData?.active) {
        return Response.json(
          { error: 'Invalid API key' },
          { status: 401 }
        )
      }
      // Route to inference — nothing logged
      const body = await request.json()
      const response = await env.AI.run(
        '@cf/meta/llama-4-scout-17b-16e-instruct',
        { messages: body.messages }
      )
      return Response.json({
        choices: [{
          message: {
            role: 'assistant',
            content: response.response
          }
        }]
      })
    }
    // Unknown paths get an explicit response instead of undefined
    return new Response('Not found', { status: 404 })
  }
}
Zero database writes in the inference path.
The only thing stored is whether your API key is valid.
Running the latest models on Cloudflare's edge network.
One line change:
from openai import OpenAI
client = OpenAI(
api_key="your-nulllog-key",
base_url="https://api.sparsitron.com/v1"
)
response = client.chat.completions.create(
model="kimi-k2.5",
messages=[{"role": "user", "content": "Hello"}]
)
Works with LangChain, LlamaIndex, any OpenAI SDK integration.
Zero logging is an architecture decision, not a policy.
Most providers say "we don't train on your data" —
but they still log. Logging and training are separate things.
True privacy means nothing written to persistent storage
anywhere in the request path.
Compliance unlocks enterprise.
GDPR, HIPAA, SOC2 — these aren't just checkboxes.
They're why enterprises can't use OpenAI directly.
Private inference is a real $B market that's mostly unsolved.
Cloudflare Workers AI is surprisingly powerful.
Running frontier models at the edge with near-zero
infra cost. The credit system is generous for early products.
Live at api.sparsitron.com
Free trial with code PHLAUNCH at
api.sparsitron.com/redeem
Would love feedback from the dev.to community —
especially on the zero-log architecture approach
and whether this solves a real pain you've faced.
I'm also building IntelliCortex — a novel neural
architecture to replace transformers. Sparsitron™
is our sparse computation approach. NullLog is
the infra layer we built to run our own experiments
privately. Patent filed.
2026-04-01 11:14:33
I've been experimenting with infinite scroll gallery interactions —
the kind you see on Awwwards-winning sites — and built two
components I wanted to share.
An omnidirectional infinite grid. Scroll or drag in any direction
and the cards wrap seamlessly. Each card has position-based inner
parallax, and clicking any card triggers a fullscreen expansion
with animated border radius.
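The seamless wrap comes down to modular arithmetic on each axis. A minimal sketch of the idea (function and parameter names are illustrative, not from the actual component):

```javascript
// Wrap a card's offset into [-itemSize, viewport) so cards that scroll
// off one edge re-enter from the opposite edge.
function wrapOffset(offset, viewport, itemSize) {
  const range = viewport + itemSize;
  // Double-modulo keeps the result correct for negative offsets
  return (((offset + itemSize) % range + range) % range) - itemSize;
}

// Applied on both axes, then rendered with translate3d for GPU compositing
function cardTransform(x, y, viewport, itemSize) {
  const wx = wrapOffset(x, viewport.width, itemSize);
  const wy = wrapOffset(y, viewport.height, itemSize);
  return `translate3d(${wx}px, ${wy}px, 0)`;
}
```

Because the wrap is a pure function of the accumulated scroll offset, dragging in any direction for any distance always lands every card back in the visible window.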
Live Demo: https://infinite-2d-grid-scroll.netlify.app
Key technical details:
translate3d with modular wrapping on both axes.

The second component is a Three.js-powered gallery with curved depth. Three rows scroll at different speeds. The interesting part: cards expand to fullscreen via a custom vertex shader.
Live Demo: https://infinite-scroll-horizontal-gallery.netlify.app
The shader uses a uCorners vec4 uniform — each corner (TL, TR,
BL, BR) animates independently from 0 to 1, with randomized
stagger. The vertex shader interpolates between the card's curved
state and a fullscreen state using bilinear interpolation:
float cornersProgress = mix(
mix(uCorners.z, uCorners.w, uv.x),
mix(uCorners.x, uCorners.y, uv.x),
uv.y
);
gl_Position = projectionMatrix * viewMatrix *
mix(defaultState, fullScreenState, cornersProgress);
This means every time you click a card, the expansion looks
slightly different.
I packaged these as a product on
Gumroad
for anyone who wants to use them in their projects.
Would love to hear what you think about the interactions!
2026-04-01 11:14:22
Quick one-liner: Create a qcow2 disk image with qemu-img, install Alpine Linux into it, and boot from disk — so your VM survives container restarts.
Post #2 proved that KVM hardware acceleration is fast. But there's a catch: every time the container stops, the VM state vanishes. The Alpine ISO is read-only — any changes you make inside the VM exist only in RAM. Stop the container and they're gone.
That's fine for a boot-speed demo, but it's not a real VM. A real VM has a disk that persists between runs. The disk lives on the host filesystem, the container is just the runtime, and the two are completely independent. Stop and restart the container as many times as you want — the disk doesn't care.
This post adds that layer. You'll create a qcow2 disk image, boot from ISO + disk to run the Alpine installer, then boot from disk alone to confirm it survived.
You'll need:
- The qemu:base image from Post #1
- The Alpine ISO at ~/Downloads/alpine-standard-3.23.3-x86_64.iso
- A ~/vm directory (you'll create it below)

First, create a dedicated directory for your VM disk images:
$ mkdir -p ~/vm
Then create the disk image:
$ podman run --rm \
-v ~/vm:/vm:z \
qemu:base \
qemu-img create -f qcow2 /vm/alpine.qcow2 8G
You should see:
Formatting '/vm/alpine.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=8589934592 lazy_refcounts=off refcount_bits=16
What's qcow2? It stands for QEMU Copy-On-Write version 2. The key property is thin provisioning: the file on your host starts tiny (a few hundred KB) and only grows as the VM actually writes data. Specifying 8G sets the maximum size the VM sees, not the space it consumes on disk immediately.
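You can see the same thin-provisioning behavior with a plain sparse file. This sketch uses truncate rather than qemu-img so it runs anywhere, even without the container:

```shell
# Create an 8 GiB sparse file -- same thin-provisioning idea as qcow2
truncate -s 8G /tmp/sparse-demo.img

# Apparent size: the full 8 GiB the "VM" would see
ls -l /tmp/sparse-demo.img

# Actual allocation: almost nothing until data is written
du -h /tmp/sparse-demo.img
```

A freshly created alpine.qcow2 behaves the same way: qemu-img info reports an 8 GiB virtual size while the disk size stays tiny until the VM actually writes data.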
Now boot with both the ISO and the disk attached. The -boot d flag tells QEMU to boot from the CD-ROM first:
$ podman run --rm -it \
--device /dev/kvm \
-v ~/vm:/vm:z \
-v ~/Downloads:/iso:z \
qemu:base \
qemu-system-x86_64 \
-enable-kvm -cpu host \
-nographic \
-m 512 \
-cdrom /iso/alpine-standard-3.23.3-x86_64.iso \
-drive file=/vm/alpine.qcow2,format=qcow2 \
-boot d
Alpine will boot from the ISO into a live environment. Log in as root — no password required.
Once you're at the shell, run the Alpine installer:
# setup-alpine
Work through the prompts. Most defaults are fine. The ones that matter:
- Hostname: alpine
- Network: eth0, DHCP
- Proxy: none
- SSH server: openssh or none — your call
- NTP client: none
- Disk: sda — this is your qcow2 image
- Disk mode: sys — full system install to disk
- Erase the disk and continue: y
When the installer finishes, power off:
# poweroff
The container exits. The alpine.qcow2 file on your host now contains a complete Alpine installation.
Drop the ISO flags entirely. The disk knows how to boot now:
$ podman run --rm -it \
--device /dev/kvm \
-v ~/vm:/vm:z \
qemu:base \
qemu-system-x86_64 \
-enable-kvm -cpu host \
-nographic \
-m 512 \
-drive file=/vm/alpine.qcow2,format=qcow2
Alpine boots from the installed disk. Log in with the username you created during setup. Now write a file to prove the disk persists:
$ echo "hello from install" > ~/persistence-test.txt
$ cat ~/persistence-test.txt
hello from install
$ su -
# poweroff
The container exits. Run the exact same boot command again:
$ podman run --rm -it \
--device /dev/kvm \
-v ~/vm:/vm:z \
qemu:base \
qemu-system-x86_64 \
-enable-kvm -cpu host \
-nographic \
-m 512 \
-drive file=/vm/alpine.qcow2,format=qcow2
Log in and check:
$ cat ~/persistence-test.txt
hello from install
The file survived. The container was destroyed and recreated, but the disk image on your host never changed. That's persistence.
The container is ephemeral — --rm means Podman deletes it the moment QEMU exits. But the disk image at ~/vm/alpine.qcow2 lives on your host filesystem, completely outside the container lifecycle.
The bind mount (-v ~/vm:/vm:z) is just a path into the host. Writing to /vm/alpine.qcow2 inside the container is writing to ~/vm/alpine.qcow2 on the host. When the container is gone, the file remains.
| Flag | Where | What It Does |
|---|---|---|
| -drive file=/vm/alpine.qcow2,format=qcow2 | QEMU | Attaches the disk image as a block device (sda inside the VM) |
| -boot d | QEMU | Sets boot order to CD-ROM first; needed during install so Alpine boots from ISO, not the blank disk |
| format=qcow2 | QEMU -drive option | Tells QEMU the image format explicitly; avoids format auto-detection warnings |
| -v ~/vm:/vm:z | Podman | Bind-mounts the host ~/vm directory; the disk image lives here, not inside the container |
| -v ~/Downloads:/iso:z | Podman | Bind-mounts the ISO directory; only needed during the install step |
That install took a few minutes of interactive prompts. Every time you want a new Alpine VM, you'd repeat it from scratch.
Post #4: We'll skip the installer entirely by using a cloud image — a pre-built disk image ready to boot in seconds.
This guide is Part 3 of the KVM Virtual Machines on Podman series.
Part 1: Build a KVM-Ready Container Image from Scratch
Part 2: KVM Acceleration in a Rootless Podman Container
Coming up in Part 4: Cloud Images — Skip the Installer, Boot in Seconds
Found this helpful?
Published: 6 Apr 2026
Author: David Tio
Tags: KVM, QEMU, Podman, Virtualization, Containers, Alpine Linux, qcow2, Linux, Tutorial
Series: KVM Virtual Machines on Podman
2026-04-01 11:11:49
In the real world, API responses aren't always uniform. Sometimes, a single endpoint might return data in wildly different shapes depending on a status field, or an error might contain different details than a successful payload. This heterogeneity can quickly lead to runtime errors if not handled carefully. TypeScript's discriminated unions, combined with type guards, provide an elegant and robust solution to this common challenge, ensuring type safety even when dealing with dynamic data.
This article dives into how to define and use discriminated unions to confidently process API responses that vary in structure. We'll move beyond high-level concepts and explore practical code examples, demonstrating how to leverage TypeScript's type narrowing capabilities for more reliable applications.
At its core, a discriminated union is a type that consists of several other types (a union), where each member of the union shares a common, literal property (the 'discriminant'). This property acts as a tag, allowing TypeScript to narrow down the specific type within the union at runtime based on its value.
Consider an API that returns either a success payload with user data or an error payload with a message. Both responses might share a status field, but its value ('success' or 'error') dictates the presence of other fields.
Let's start by defining the possible shapes of our API responses using interfaces and then combining them into a discriminated union type. We'll use a status property as our discriminant.
interface UserData {
id: string;
name: string;
email: string;
}
interface SuccessApiResponse {
status: 'success';
data: UserData;
}
interface ErrorApiResponse {
status: 'error';
message: string;
errorCode?: number; // Optional error code
}
// Our discriminated union type
type ApiResponse = SuccessApiResponse | ErrorApiResponse;
Here, UserData defines the structure of a successful payload. SuccessApiResponse and ErrorApiResponse each declare a status property with a literal string type ('success' or 'error'). This literal type is crucial for the discriminant. Finally, ApiResponse is a union of these two interfaces. Notice how SuccessApiResponse has a data property, while ErrorApiResponse has message and an optional errorCode.
Once we have our discriminated union, the next step is to write code that can intelligently determine which specific type it's currently dealing with. This is where type guards come in. A type guard is a runtime check that guarantees a type within a certain scope. For discriminated unions, checking the discriminant property acts as a powerful type guard.
Let's create a function that processes an ApiResponse:
function processApiResponse(response: ApiResponse): void {
if (response.status === 'success') {
// TypeScript now knows 'response' is a SuccessApiResponse
console.log(`User ID: ${response.data.id}, Name: ${response.data.name}`);
// console.log(response.message); // This would cause a compile-time error
} else {
// TypeScript now knows 'response' is an ErrorApiResponse
console.error(`Error (${response.errorCode || 'unknown'}): ${response.message}`);
// console.log(response.data); // This would also cause a compile-time error
}
}
// Example usage:
const successRes: ApiResponse = {
status: 'success',
data: { id: '123', name: 'Alice', email: '[email protected]' }
};
const errorRes: ApiResponse = {
status: 'error',
message: 'User not found',
errorCode: 404
};
processApiResponse(successRes);
processApiResponse(errorRes);
In processApiResponse, the if (response.status === 'success') statement acts as a type guard. Inside the if block, TypeScript automatically narrows the type of response to SuccessApiResponse. This means you can safely access response.data without any type assertions. Conversely, in the else block, TypeScript narrows response to ErrorApiResponse, allowing access to response.message and response.errorCode.
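The same discriminant check can also be packaged as a reusable, user-defined type guard. The r is SuccessApiResponse return type is what tells the compiler that a true result narrows the union (types repeated here so the snippet is self-contained):

```typescript
interface UserData { id: string; name: string; email: string; }
interface SuccessApiResponse { status: 'success'; data: UserData; }
interface ErrorApiResponse { status: 'error'; message: string; errorCode?: number; }
type ApiResponse = SuccessApiResponse | ErrorApiResponse;

// A user-defined type guard: returns true only for success responses,
// and tells the compiler so via the `r is SuccessApiResponse` predicate.
function isSuccess(r: ApiResponse): r is SuccessApiResponse {
  return r.status === 'success';
}

const responses: ApiResponse[] = [
  { status: 'success', data: { id: '1', name: 'Alice', email: '[email protected]' } },
  { status: 'error', message: 'User not found', errorCode: 404 },
];

// filter() with a type-guard callback yields SuccessApiResponse[],
// so .map(r => r.data) type-checks without any assertions.
const users: UserData[] = responses.filter(isSuccess).map(r => r.data);
```

This is particularly handy with Array.prototype.filter, which propagates the guard into the element type of the result.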
Discriminated unions truly shine when you need to process an array or collection of items that could be any of the union members. Imagine a scenario where you've made multiple API calls, and you want to log both successful results and errors.
const apiResponses: ApiResponse[] = [
{ status: 'success', data: { id: 'u1', name: 'Bob', email: '[email protected]' } },
{ status: 'error', message: 'Unauthorized', errorCode: 401 },
{ status: 'success', data: { id: 'u2', name: 'Charlie', email: '[email protected]' } }
];
apiResponses.forEach(response => {
switch (response.status) {
case 'success':
console.log(`Fetched user: ${response.data.name} (ID: ${response.data.id})`);
break;
case 'error':
console.error(`API Error: ${response.message} (Code: ${response.errorCode || 'N/A'})`);
break;
default:
// This branch should theoretically be unreachable if all cases are handled.
// 'never' type helps ensure exhaustiveness.
const _exhaustiveCheck: never = response;
return _exhaustiveCheck;
}
});
Here, we iterate through an array of ApiResponse objects. The switch (response.status) statement provides clear branching logic. For each case, TypeScript intelligently narrows the response type, allowing direct and safe access to the specific properties of SuccessApiResponse or ErrorApiResponse. The default case with const _exhaustiveCheck: never = response; is a powerful TypeScript pattern. If you were to add a new variant to ApiResponse (e.g., LoadingApiResponse) but forgot to add a corresponding case in the switch, TypeScript would issue a compile-time error, reminding you to handle all possibilities. This ensures exhaustiveness and prevents unexpected runtime behavior.
While powerful, discriminated unions have a few pitfalls to be aware of:
- The discriminant must be a literal type ('success', not string). If it's optional (status?: 'success') or a broad string type, TypeScript cannot reliably use it for narrowing.
- typeof for object differentiation often doesn't work as expected, as typeof someObject will always return 'object'. Always use the literal discriminant property for narrowing discriminated unions.
- Non-exhaustive if/else if/else chains or switch statements can lead to runtime errors if new union members are introduced. The never type check in a default case is a strong guard against this.
- Discriminant values must not overlap between members. If SuccessApiResponse and ErrorApiResponse could both have status: 'pending', TypeScript wouldn't be able to narrow them effectively based on that property.

Discriminated unions are a cornerstone of advanced TypeScript usage, providing a robust mechanism to handle data with varying structures. By explicitly defining the possible shapes and using a common literal property as a discriminant, you enable TypeScript's static analysis to perform precise type narrowing. This leads to code that is safer, easier to reason about, and simpler to maintain.
Mastering discriminated unions and type guards is an essential skill for any intermediate TypeScript developer. They empower you to write more resilient and maintainable applications, especially when interacting with external systems like APIs that may return dynamic data. Experiment with these patterns in your own projects to experience the full benefits of TypeScript's type system in action. Your future self (and your teammates) will thank you for the clarity and safety they provide.
2026-04-01 11:10:30
RAG Web Browser is an Apify actor that fetches any web page, strips out all the noise, and returns clean markdown text that AI models can actually read and use. It bridges the gap between the static knowledge inside a language model and the live, changing web — letting Claude, GPT-4, or any other LLM answer questions based on real-time data instead of guessing from training data that may be months or years old.
The actor is available at https://apify.com/tugelbay/rag-web-browser and runs on Apify's Pay Per Event pricing at $3 per 1,000 requests.
Before diving into the tool itself, it helps to understand the problem it solves.
RAG stands for Retrieval Augmented Generation. It is a technique for making AI models more accurate by giving them relevant, current information at the moment they generate a response — rather than relying solely on what they memorized during training.
Here is the core problem with language models like Claude or GPT-4: they are trained on a snapshot of the internet from a specific point in time. After that cutoff, they know nothing about what happened. Ask a model about a product released last month, a competitor's updated pricing, or today's news, and you will get either a confident wrong answer (a hallucination) or an admission that it does not know.
RAG solves this by adding a retrieval step before generation: fetch the documents relevant to the question, insert them into the prompt as context, and have the model generate its answer grounded in that material.
The quality of this process depends entirely on what you retrieve and how clean it is. A raw HTML page dumped into a context window is full of navigation menus, cookie banners, JavaScript code, and footer links — none of which help the model answer anything. What you need is clean, structured text. That is exactly what RAG Web Browser produces.
RAG Web Browser takes a URL as input and returns clean markdown as output. That is the entire job, and it does it well.
The pipeline inside the actor works in four stages:
1. Fetch — The actor loads the target URL using a headless browser. This matters because a huge portion of the modern web is rendered by JavaScript. A simple HTTP request will not see the content on pages built with React, Vue, Angular, or any other client-side framework. The headless browser executes the JavaScript and waits for the page to fully render before reading the content.
2. Parse — Once the page is loaded, the actor identifies and extracts the main content. It distinguishes between body text and structural clutter: navigation bars, sidebars, cookie consent dialogs, social sharing buttons, ad blocks, related article carousels, comment sections, and site-wide footers are all identified and removed.
3. Convert — The cleaned content is converted to markdown. Headers become # and ##. Lists stay as lists. Tables are formatted properly. Links are preserved where they add context. The output is readable by both humans and language models.
4. Return — The clean markdown is returned as a structured JSON response, ready to be consumed by any application or API call.
The result is a version of the web page that contains the signal without the noise — exactly what a language model needs to reason accurately about the page's content.
Language models are powerful reasoners but poor browsers. They cannot open a URL, render JavaScript, scroll a page, or handle paywalls. When you paste a URL into a prompt and ask a model to analyze a web page, the model is working from its training data about what that page used to look like — or hallucinating what it thinks the page should contain.
Even when LLMs are given tool-use capabilities and can make HTTP requests, the raw output of most web pages is overwhelming. A typical e-commerce product page contains thousands of tokens of navigation, tracking scripts, and boilerplate for every hundred tokens of actual product information. Feeding that into a context window wastes space, increases cost, and degrades output quality.
RAG Web Browser solves both problems: it renders JavaScript-heavy pages in a real headless browser, and it strips the boilerplate so only clean, relevant markdown reaches the context window.
The practical result is that your AI assistant answers questions about live web pages accurately, efficiently, and without burning context on junk.
AI assistants with real-time web search — The most direct application. When a user asks your AI assistant about a company, a product, a news story, or any topic where freshness matters, the assistant fetches the relevant pages through RAG Web Browser and uses the clean markdown to generate a grounded, accurate response. No hallucinations about outdated information, no admissions of ignorance.
Automated research pipelines — Research workflows that need to process dozens or hundreds of web pages benefit enormously from automated content extraction. A pipeline that monitors competitor pricing pages, tracks industry news, or aggregates product reviews can run RAG Web Browser at each URL and feed the clean output directly into a summarization or classification model.
Content freshness for AI-generated articles — When building content automation systems, accuracy requires current source material. RAG Web Browser can pull the latest data from authoritative sources, statistics pages, or original research papers, giving your content generation model factual grounding for every claim it makes.
Claude and ChatGPT plugins — Both Claude and ChatGPT support tool use and function calling. RAG Web Browser can be wrapped as a callable tool, so the model can request a web page fetch mid-conversation and incorporate the result into its next response. This creates AI assistants that are genuinely connected to the live web rather than pretending to be.
Competitive intelligence automation — Marketing and product teams that track competitors can automate the collection of competitor content: pricing pages, feature announcements, job postings, changelog entries. By running RAG Web Browser on a list of competitor URLs on a schedule, teams get a continuous feed of clean, AI-readable competitor content without any manual browsing.
RAG Web Browser runs on Apify's Pay Per Event model. The cost is $3 per 1,000 requests.
For most use cases, this is extremely affordable.
There are no subscription fees, no minimum commitments, and no seat licenses. You pay for what you run. If you already have an Apify account with credits, RAG Web Browser draws from the same balance as any other actor.
For context on the broader Apify platform and how actors are priced, see the Apify web scraping platform overview.
The integration pattern is the same regardless of which language model you use.
Step 1: Call the actor with a URL
Send a request to the Apify API with the target URL as input. The actor runs, fetches the page, cleans the content, and returns a JSON response. The key field in the response is the markdown content of the page.
Step 2: Include the markdown in your prompt
Take the returned markdown and insert it into your LLM prompt as context. The structure looks like this:
You are a research assistant. Use the following web page content to answer the user's question accurately.
--- Web Page Content ---
[markdown from RAG Web Browser]
--- End of Content ---
User question: [question here]
Step 3: Let the model reason over clean content
The model now has a structured, readable version of the web page in its context window. It can extract specific facts, summarize the content, compare information across multiple pages, or answer direct questions — all grounded in the actual current content of the page rather than its training data.
For Claude specifically, this pattern works natively with the Messages API. Pass the markdown as a user turn or as a system context block. Claude handles long markdown well and will cite specific sections when answering questions.
For GPT-4 and other OpenAI-compatible models, the same approach works with the chat completions API. The markdown can be passed as a system message or as part of the user message, depending on your preferred prompting structure.
For automated pipelines, the Apify JavaScript and Python SDKs let you call the actor programmatically, collect the output, and pass it to your LLM in a single function. This makes it straightforward to build loops that process multiple URLs and aggregate the results.
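As a concrete sketch of that pipeline: the following assumes the apify-client Python package, and the input field "url" and output field "markdown" are assumptions on my part — check the actor's input schema at apify.com/tugelbay/rag-web-browser before relying on them:

```python
def fetch_markdown(url: str, token: str) -> str:
    """Step 1: run the actor and return the page's markdown.
    Field names ("url", "markdown") are assumed, not confirmed."""
    from apify_client import ApifyClient  # pip install apify-client
    client = ApifyClient(token)
    run = client.actor("tugelbay/rag-web-browser").call(run_input={"url": url})
    items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
    return items[0]["markdown"]


def build_prompt(markdown: str, question: str) -> str:
    """Step 2: wrap the markdown in the context template from above."""
    return (
        "You are a research assistant. Use the following web page content "
        "to answer the user's question accurately.\n\n"
        "--- Web Page Content ---\n"
        f"{markdown}\n"
        "--- End of Content ---\n\n"
        f"User question: {question}"
    )
```

Step 3 is just passing build_prompt(...) to your LLM client of choice; looping over a URL list and aggregating the answers is a few more lines.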
Several tools solve parts of the same problem. Here is how they compare.
RAG Web Browser vs. Firecrawl — Firecrawl is a dedicated web-to-markdown API that works well for clean, static pages. It is faster for straightforward content but handles JavaScript-heavy pages less reliably than RAG Web Browser's headless browser approach. Firecrawl requires a separate subscription; RAG Web Browser runs on existing Apify credits if you already use the platform. For teams already in the Apify ecosystem, RAG Web Browser has zero additional overhead.
RAG Web Browser vs. Browserbase — Browserbase provides full remote browser infrastructure for complex browser automation. It is more powerful and more expensive, aimed at use cases that require actual interaction: clicking buttons, filling forms, navigating multi-step flows. RAG Web Browser is purpose-built for read-only content extraction and is significantly simpler to integrate for pure RAG use cases. If you only need the content of a page, not the ability to interact with it, Browserbase is overkill.
RAG Web Browser vs. raw HTTP requests — Raw HTTP requests with libraries like requests, httpx, or axios cannot execute JavaScript. A growing majority of web content is loaded by JavaScript after the initial HTML response, which means raw requests often return empty or incomplete pages. They also return raw HTML that your code must parse, which requires building and maintaining custom extraction logic for each domain. RAG Web Browser handles both problems out of the box, across any website, without custom parsers.
RAG Web Browser vs. LLM built-in web search — Models like GPT-4 with browsing and Claude with web search tools can retrieve content natively. However, these capabilities are gated by the model provider's implementation, limited to their specific interface, and not available through the API in the same way. RAG Web Browser gives you programmatic, API-level control over exactly which pages get fetched and how the content is processed — essential for production pipelines where you cannot rely on a conversational interface.
JavaScript rendering — The actor uses a full headless browser (Chromium-based) that executes JavaScript exactly as a real browser would. Pages built on any modern JavaScript framework — React, Vue, Angular, Next.js, Nuxt — are fully rendered before content extraction begins.
Content isolation — The extraction algorithm targets the primary content region of each page. It uses structural signals (heading hierarchy, text density, semantic HTML tags like <article> and <main>) to identify what is content versus what is navigation, advertising, or boilerplate. This works across diverse site layouts without requiring site-specific configuration.
Markdown output quality — The markdown output preserves the logical structure of the source content: headings, lists, bold text, tables, and inline links. This structure is meaningful for LLMs — a model reading a well-formatted table in markdown can reason about it correctly, whereas the same data in raw HTML is significantly harder to parse.
Scale and concurrency — Because it runs on Apify's cloud infrastructure, RAG Web Browser can process multiple URLs concurrently. A pipeline with 500 pages to process does not need to wait for them sequentially. Apify handles the infrastructure, scaling, and browser pool management transparently.
Error handling — Pages that fail to load, return errors, or are blocked return structured error responses rather than crashing the pipeline. This makes it safe to use in automated workflows where some percentage of URLs may be unavailable.
The actor is at https://apify.com/tugelbay/rag-web-browser. You need an Apify account to run it — the free tier includes enough credits to test any use case before committing to production scale.
The input is straightforward: provide a URL (or a list of URLs), configure any optional parameters like wait times or content selectors, and run the actor. The output is available immediately in the Apify dataset, accessible via API or direct download.
For teams building AI applications where accuracy and freshness matter, RAG Web Browser removes one of the most common failure modes: the model reasoning from stale or absent information. At $3 per 1,000 requests, the cost of giving your AI real-time web access is low enough that it is hard to justify not using it.
Originally published at https://konabayev.com/blog/rag-web-browser/