2026-03-11 04:34:52
Many VTU students struggle to calculate their SGPA manually after results are announced.
To make this easier, I built a simple VTU SGPA Calculator using JavaScript and a JSON data file.
In this post, I’ll explain how you can build a basic SGPA calculator using JavaScript and JSON.
VTU SGPA is calculated using a credit-weighted average formula.
SGPA = Σ (Credit × Grade Point) / Σ Credits
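The formula is easy to sanity-check with made-up numbers. Here is a minimal Python sketch using hypothetical grade points (the credits match the subject list used later in this post):

```python
# Credit-weighted SGPA: sum(credit * grade_point) / sum(credits)
subjects = [
    ("Mathematics", 4, 9),   # (name, credits, grade point) -- example values
    ("Physics", 3, 8),
    ("Chemistry", 3, 7),
    ("Programming", 4, 10),
]

weighted_sum = sum(credits * gp for _, credits, gp in subjects)
total_credits = sum(credits for _, credits, _ in subjects)
sgpa = weighted_sum / total_credits
print(round(sgpa, 2))
```

With these example grade points, the weighted sum is 121 over 14 credits, giving an SGPA of 8.64.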
Instead of hardcoding subjects in JavaScript, we store them in a JSON file.
`subjects.json`:

```json
[
  { "subject": "Mathematics", "credits": 4 },
  { "subject": "Physics", "credits": 3 },
  { "subject": "Chemistry", "credits": 3 },
  { "subject": "Programming", "credits": 4 }
]
```
This approach makes the calculator easy to update each semester: changing subjects or credits means editing a JSON file, not the JavaScript. The interface itself is minimal:
```html
<h2>VTU SGPA Calculator</h2>
<div id="subjects"></div>
<button onclick="calculateGPA()">Calculate SGPA</button>
<p id="result"></p>
```
This creates a simple interface where students can enter their grade points.
```javascript
let subjects = []; // kept global so calculateGPA() can read the credits

fetch("subjects.json")
  .then(res => res.json())
  .then(data => {
    subjects = data;
    const container = document.getElementById("subjects");
    data.forEach(sub => {
      container.innerHTML += `
        <div>
          ${sub.subject} (${sub.credits} credits)
          <input type="number" min="0" max="10" class="grade">
        </div>
      `;
    });
  });
```
This dynamically loads the subject list and creates input fields.
```javascript
function calculateGPA() {
  const grades = document.querySelectorAll(".grade");
  let totalCredits = 0;
  let weightedSum = 0;
  grades.forEach((input, index) => {
    const grade = parseFloat(input.value) || 0;
    const credits = subjects[index].credits;
    weightedSum += grade * credits;
    totalCredits += credits;
  });
  const gpa = weightedSum / totalCredits;
  document.getElementById("result").innerText =
    "Your SGPA: " + gpa.toFixed(2);
}
```
This calculates SGPA using the credit-weighted formula.
Using JSON helps because the subject list lives in data, not code: you can swap in a different branch or semester scheme without touching the calculator logic.
If you're a VTU student, you can try the full calculator here:
https://www.civvy.tech/vtu-sgpa
Building a VTU SGPA calculator with JavaScript and JSON is simple and efficient.
It helps students verify their results in seconds instead of working the weighted average out by hand.
2026-03-11 04:33:09
S3-native streaming isn't a new idea. WarpStream, AutoMQ, and S2 are all betting on the same thesis: object storage is durable and cheap enough to replace broker-local disks for event streaming.
I've been building my own take on this called StreamHouse. It's open source (Apache 2.0), written in Rust, and I want to walk through the actual architecture rather than just pitch it.
What "S3-native" means in practice
The write path: producers append to a local write-ahead log, records are batched into segments, and segments are flushed to S3.

Reads pull segments from S3. Metadata (topics, offsets, consumer groups) lives in Postgres or SQLite. Agents are stateless; they just need S3 credentials and a metadata connection.
No inter-broker replication. No partition reassignment when nodes change. Storage scales independently from compute.
Where StreamHouse sits in the space
WarpStream proved the S3-native model works. AutoMQ took a different angle, building on top of the Kafka codebase. S2 is focused on being a storage primitive.
StreamHouse is the self-hosted take on this thesis: open source (Apache 2.0), written in Rust, with no vendor in the critical path.
Whether any of that matters depends on your use case. If you want managed and don't care about self-hosting, WarpStream might be the better pick. If you want to run it yourself, inspect the code, and not depend on a vendor, that's what I'm building for.
The durability model
Two ack modes:
acks=leader: confirmed after WAL fsync. ~2.2M records/sec. There's a window where data exists only on local disk before the S3 flush. WAL protects against process crashes, but a full disk failure in that window means data loss.
acks=durable: confirmed after S3 upload. Multiple producers batch into a 200ms window and share a single upload. Slower, but the data is in S3 before the producer gets an ack.
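As a rough illustration of the `acks=durable` batching window, here is a minimal single-threaded Python sketch. All names here are mine, not StreamHouse's actual API: producers arriving inside the window share one upload, and nothing is acknowledged until that upload returns.

```python
import threading
import time

class DurableBatcher:
    """Sketch of acks=durable: producers in the same window share one
    upload. `upload` stands in for the real S3 segment write; names and
    shapes are illustrative, not StreamHouse's implementation."""

    def __init__(self, upload, window_ms=200):
        self.upload = upload
        self.window = window_ms / 1000.0
        self.pending = []
        self.lock = threading.Lock()

    def produce(self, record):
        with self.lock:
            self.pending.append(record)
            first = len(self.pending) == 1
        if first:
            # The first producer in a window waits it out, then flushes
            # everything that accumulated -- one upload for the batch.
            time.sleep(self.window)
            self.flush()

    def flush(self):
        with self.lock:
            batch, self.pending = self.pending, []
        if batch:
            self.upload(batch)  # producers are acked only after this returns

uploads = []
b = DurableBatcher(uploads.append, window_ms=10)
b.produce(b"r1")
```

In the real system this runs concurrently against S3; the point is only that one flush serves every producer that landed in the window, which is how the durable mode amortizes upload latency.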
The hard part: metadata vs data consistency
If metadata says a segment exists but S3 doesn't have it (or vice versa), you have a problem. This is the thing that bit me early on and that people rightly called out.
Orphan cleanup: A background reconciler diffs S3 against metadata periodically. Orphans get a 1-hour grace period so it doesn't race with in-progress uploads, then get cleaned up.
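The reconciler's core decision can be sketched in a few lines of Python. This is an assumption-laden sketch, not StreamHouse's code: a segment that S3 has but metadata doesn't is an orphan, but only once it is older than the grace period, so in-progress uploads are never raced.

```python
from datetime import datetime, timedelta

GRACE = timedelta(hours=1)  # grace period so in-flight uploads aren't deleted

def find_orphans(s3_segments, metadata_segments, now):
    """Sketch of the reconciler diff. `s3_segments` maps segment key to
    its upload time; `metadata_segments` is the set of keys metadata
    knows about. Names and shapes are illustrative."""
    orphans = []
    for key, uploaded_at in s3_segments.items():
        if key not in metadata_segments and now - uploaded_at > GRACE:
            orphans.append(key)
    return orphans

now = datetime(2026, 3, 11, 12, 0)
s3 = {
    "seg-a": now - timedelta(hours=2),    # unknown to metadata, old: orphan
    "seg-b": now - timedelta(minutes=5),  # unknown but inside grace: spared
    "seg-c": now - timedelta(hours=3),    # known to metadata: fine
}
orphans = find_orphans(s3, {"seg-c"}, now)
```

The inverse case (metadata claims a segment S3 lacks) needs the opposite diff, but the grace-period idea is the same.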
Full disaster recovery: If you lose your metadata store, the server rebuilds on startup. It discovers orgs from S3 prefixes, restores from automatic metadata snapshots (saved to S3 after every segment flush), and reconciles any gaps. I have a 5-phase test suite that deletes all metadata and verifies full recovery.
What's in it besides the storage engine
What's missing
Source: https://github.com/gbram1/streamhouse
Website and how it works: https://streamhouse.app/how-it-works
If you're interested in the S3-native streaming space or want to dig into the internals, I'd love feedback. And if you have opinions on which connectors matter most, I'm all ears, that's the next big piece of work.
2026-03-11 04:28:45
Everyone talks about red teaming AI agents. Few do it continuously. None do it with cryptographic proof.
We built something different: a live adversarial arena where 15 agents attack our governance infrastructure 24/7. Every detection is provable. The current detection rate — 91.8% — updates in real-time.
The standard AI safety playbook looks familiar: hire a red team, run a battery of tests, produce a report, file it with compliance. Repeat annually, maybe quarterly if you're diligent.
This is how we tested bridges in the 1800s. Load it once, write down whether it held, move on.
The problem is obvious to anyone shipping AI agents in production: they don't sit still. They drift. They encounter novel inputs. They operate under adversarial conditions that no static benchmark can anticipate.
A model that passes every eval on Tuesday can hallucinate policy violations on Wednesday, because its inputs, its context, and its adversaries all keep shifting underneath it.
Static red team reports are stale on delivery. "We tested it once" isn't governance — it's compliance theater.
We built something different: a continuous adversarial stress test of our AIP (AI Integrity Protocol) governance pipeline.
Fifteen adversarial agents run around the clock, probing for weaknesses across six technique categories:
Generating plausible but misleading reasoning chains that pass surface analysis. The attacker injects prompts that cause an authorized agent to take authorized actions for unauthorized purposes.
The action looks legitimate. The intent isn't. This is the hardest class to detect because nothing in the action itself is wrong — you have to understand the reasoning behind it.
Subtle reasoning drift across conversational turns. No single message violates policy. But over five, ten, fifteen turns, the agent's reasoning has drifted from its behavioral contract.
This is the boiling frog of agent safety — invisible to systems that only evaluate individual outputs.
Exploiting trace format, minimizing data, or splitting behavior across sessions to dodge analysis entirely. If the agent doesn't reason visibly, there's nothing to analyze.
This tests whether governance can detect the absence of evidence as evidence of absence.
Manipulating trace context, timestamps, or session data to mislead analysis. This is adversarial AI aimed at the auditor, not the agent.
If your governance relies on an LLM to evaluate another LLM's reasoning, the attacker's next move is to compromise the evaluator.
Exploiting the gap between declared values and actual behavior. The agent says what you want to hear while doing something else.
Attempts to operate outside policy boundaries without triggering enforcement. Boundary probing, undeclared tools, escalation suppression.
Every one caught so far.
Here's what makes this different from traditional red teaming: every detection is cryptographically attested.
Not "we said we caught it." Proof that the catching was honest and the evidence wasn't tampered with.
When the 24-hour rolling average detection rate crosses 90%, the red team agents gain the ability to mutate. They analyze which strategies were caught, how they were caught, and evolve new approaches with awareness of the defense.
This creates adversarial evolutionary pressure. The defense must keep improving or get outpaced.
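The trigger logic can be sketched in a few lines of Python. This uses a count-based rolling window where the arena uses a 24-hour one, and the names are illustrative, not the production implementation:

```python
from collections import deque

class MutationGate:
    """Sketch: track a rolling detection rate and unlock attacker
    mutation once it crosses a threshold (90% in the arena)."""

    def __init__(self, threshold=0.90, window=100):
        self.results = deque(maxlen=window)  # rolling window of outcomes
        self.threshold = threshold

    def record(self, detected: bool):
        self.results.append(detected)

    @property
    def detection_rate(self):
        return sum(self.results) / len(self.results) if self.results else 0.0

    def mutation_unlocked(self):
        return self.detection_rate >= self.threshold

gate = MutationGate(window=10)
for detected in [True] * 9 + [False]:
    gate.record(detected)
```

Here nine detections out of the last ten puts the rate at exactly the threshold, so mutation unlocks; the deque's `maxlen` makes old outcomes age out automatically.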
The "Days Since Evasion" counter makes this arms race visible. When something gets through, the counter resets to zero — and the evasion technique is published as a detection recipe.
Most companies hide their breaches. We announce ours.
| Metric | Current Value |
|---|---|
| Total attempts | 3,200+ |
| Detection rate | 91.8% |
| Average analysis latency | 13.8s |
| Days since evasion | 0 |
| Mutations | 8 |
That 91.8% is live. It changes every time an adversary runs. When we launched the arena, detection started at roughly 40%. The defense evolution chart shows the climb: 40% to 70% in twelve hours, stabilizing above 85% by day two, crossing 90% on day three.
We could have waited until detection hit 99%. We could have cherry-picked the attack classes. We could have published the number that looks best on a slide deck.
We published the live number instead.
There are three main approaches to AI agent safety today:
Enforcement-first approaches make dangerous actions impossible by construction. Valuable, but enforcement without detection is blind to novel attacks.
Observability platforms log what happened. Good for forensics, but logs are mutable and require trusting the vendor.
Static benchmarks measure point-in-time performance. An agent that passes today can drift tomorrow.
The arena doesn't replace these approaches. It complements them with continuous, adversarial, cryptographically provable governance testing.
For developers interested in the technical implementation:
```rust
// Simplified detection pipeline
pub struct DetectionPipeline {
    analyzers: Vec<Box<dyn IntegrityAnalyzer>>,
    attestation: AttestationService,
    threshold: f64,
}

impl DetectionPipeline {
    pub async fn analyze_trace(&self, trace: &AgentTrace) -> Result<DetectionResult> {
        let analysis = self.run_parallel_analysis(trace).await?;
        let confidence = analysis.aggregate_confidence();
        if confidence > self.threshold {
            // Sign the detection so the evidence is tamper-evident.
            let attestation = self.attestation.sign_detection(&analysis).await?;
            Ok(DetectionResult::Detected { analysis, attestation })
        } else {
            Ok(DetectionResult::Clean)
        }
    }
}
```
The key insight: continuous testing reveals gaps that static benchmarks miss. The adversarial pressure ensures those gaps get smaller over time.
The arena is live now. Detection recipes publish automatically when evasions are caught and patched. We're exploring opening it to external adversaries — if you think you can beat our detection, we want to know.
Every mutation is a public research contribution to adversarial AI safety. The industry needs more of this kind of open research.
Originally published on mnemom.ai
2026-03-11 04:27:16
Most brands have no idea how AI engines describe them.
Not Google. Not Bing. The AI engines: ChatGPT, Gemini, Perplexity, Claude, Grok, DeepSeek. The ones people are increasingly using instead of search.
A Princeton and Georgia Tech research team coined the term Generative Engine Optimization (GEO) in their 2024 paper (arXiv:2311.09735, published at KDD 2024). Their finding: traditional SEO techniques have almost zero correlation with visibility in AI-generated answers. The ranking factors are completely different.
Each engine has its own retrieval pipeline. They do not all work the same way.
Gemini reads structured definitions and your opening paragraph. It uses Google Search grounding to pull live web data, but it also has training data biases. If your first 200 words are ambiguous, Gemini will fill in the blanks with hallucinated context. We learned this the hard way: Gemini described our own tool as a "life sciences data analysis platform." We are a GEO audit tool.
Perplexity indexes live web content aggressively. It reads your first paragraph and title, then decides if you are worth citing. It has 39.5 million backlinks from 82,000 domains and generates hub pages for every query pattern. If Perplexity does not have a third-party source mentioning your brand, you do not exist in its answers.
DeepSeek pulls primarily from training data, not live web. If your brand was not in its training corpus, it will literally ask the user: "Is it possibly XanLens, XenLens, ZanLens, or a similar spelling?" That is a real response from our audit.
Grok uses X (Twitter) data and live web search. It picks up social signals faster than other engines. If you have an active X presence discussing your product, Grok will find you. If not, it may categorize you as fictional.
The Princeton GEO paper tested optimization strategies across generative engines and found that adding citations, quotations, and statistics produced the largest visibility gains (up to roughly 40% by the paper's visibility metrics), while traditional keyword-focused tactics barely moved the needle.
Separately, SE Ranking studied 2.3 million pages and found that domain traffic was the number one predictor of appearing in AI-generated answers. Not backlinks. Not keyword optimization. Traffic.
AI referral traffic is still small in absolute terms (roughly 1% of total web traffic) but growing at 130%+ year over year. The brands that establish AI visibility now will have a structural advantage as that percentage climbs.
GEO is not one thing. It is a set of practices across three categories:
1. Identity signals (on-site)
2. Off-site presence (citations)
3. Monitoring
We built XanLens to do exactly this: audit how visible a brand is across AI engines. Score from 0 to 100, tested against 132 prompts across live engines.
Then we ran it on ourselves.
Score: 16 out of 100.
Knowledge score: 17. Discovery score: 18. Citation score: 29.
Gemini thought we were a pharma company. DeepSeek did not know if our name was spelled correctly. Grok categorized us as fictional.
We had robots.txt blocking AI crawlers. Our meta title was 27 characters. Our schema markup was 62% complete. We had zero off-site content.
The tool worked. It just told us what we did not want to hear.
Now we are fixing it, publicly, and tracking the score changes. We call it Day Zero.
If you have never checked how AI engines describe your brand, run the prompts yourself: ask ChatGPT, Gemini, and Perplexity what your company does, and compare the answers.
GEO is not a replacement for SEO. It is a parallel channel that most brands have not started working on yet. The research says it matters. Our own audit says it matters. The 130% growth in AI referral traffic says it matters.
The question is whether you check before or after your competitors do.
2026-03-11 04:23:31
A live investigation. This post will be updated as I dig deeper, fix it, and reflect on what it means.
Today I deleted a file called llm_api_client.py.
It had no imports pointing to it anywhere in the codebase. Pure orphan. Dead code by any definition.
The problem: CORE's constitutional auditor didn't catch it.
CORE has a rule called purity.no_dead_code:
```json
{
  "id": "purity.no_dead_code",
  "statement": "Production code MUST NOT contain unreachable or dead symbols as identified by static analysis.",
  "enforcement": "reporting"
}
```
The rule exists. The audit runs it on every core-admin code audit call. It produced exactly 1 warning in recent runs — but not for llm_api_client.py.
I only found the dead file manually, while working through a separate compliance task.
That's a problem worth understanding.
CORE's enforcement model separates what the law says from how it's enforced. The rule lives in .intent/rules/, the enforcement mechanism lives in .intent/enforcement/mappings/.
Here's the full enforcement declaration for purity.no_dead_code:
```yaml
purity.no_dead_code:
  engine: workflow_gate
  params:
    check_type: dead_code_check
    tool: "vulture"
    confidence: 80
```
Vulture. A solid static analysis tool — but one with a specific scope. Vulture finds unused symbols within files: functions that are defined but never called, variables assigned but never read, classes that are never instantiated.
What vulture does not do: traverse the import graph to find files that nothing imports.
llm_api_client.py likely had internal symbols that appeared "used" within the file itself. From vulture's perspective: no violations. From reality's perspective: the entire file was unreachable from the rest of the system.
The rule says: "unreachable or dead symbols"
The enforcement checks: unused symbols inside files
These are two different things. The enforcement is a subset of what the rule claims to guarantee. The constitution was shallow.
This is, I think, the most honest thing I can say about constitutional AI governance:
The constitution is only as strong as its enforcement mechanisms. A rule that exists but enforces shallowly is not a guarantee — it's an aspiration.
CORE did exactly what it was told. No more, no less. The law declared "no dead code." The enforcement mechanism checked for unused symbols. The file slipped through the gap between what the law said and what the enforcement did.
This isn't a criticism of the approach. It's the nature of any governance system — constitutional law included. The text of the law and the apparatus that enforces it are always two separate things. The gap between them is where violations live.
What matters is: can the system correct itself when the gap is found?
In CORE's model, the fix is a .intent/ declaration change. Not Python. Not a code patch. A policy update that changes enforcement behavior system-wide.
True dead file detection requires import graph traversal — building a dependency graph of the entire codebase and identifying files that no entry point can reach.
Tools that can do this: pydeps, custom AST graph traversal, or a knowledge_gate that queries CORE's own symbol database (which already tracks file-level relationships via core.symbols).
The declaration change would look something like:
```yaml
purity.no_dead_code:
  engine: workflow_gate
  params:
    check_type: dead_code_check
    tool: "vulture"  # symbol-level: keep this
    confidence: 80
    # ADD:
    additional_checks:
      - check_type: orphan_file_check
        engine: knowledge_gate
        params:
          check_type: unreachable_files
          entry_points:
            - "src/cli/"
            - "src/body/atomic/"
```
I checked. CORE's knowledge_gate currently supports:

- `capability_assignment`
- `ast_duplication`
- `semantic_duplication`
- `duplicate_ids`
- `table_has_records`

No orphan file detection. No import graph traversal.
The gap goes deeper than a declaration change. A new check_type implementation is needed — which means extending knowledge_gate itself, or building a dedicated engine. The .intent/ declaration is the easy part. The enforcement mechanism has to exist first.
This is the rabbit hole.
- The dead file (`llm_api_client.py`) was found manually, not by the audit
- `knowledge_gate` does not support orphan file detection — a new engine is needed
- A new `check_type` for import graph traversal has to be implemented
- Either extend `knowledge_gate` or build a dedicated engine
- Then update `.intent/enforcement/mappings/code/purity.yaml`
[UPDATE 1 — coming soon: designing the orphan file check — declaration-first, engine second]
[UPDATE 2 — coming soon: implementation and proof it works]
[UPDATE 3 — coming soon: the philosophical reflection on constitutional blind spots]
CORE is open source: github.com/DariuszNewecki/CORE
Credit: the PromptModel artifact pattern was inspired by Ruben Hassid's prompt engineering work.
2026-03-11 04:23:29
Most developers who try Claude Code and get mediocre results have the same problem: they’re not writing briefs, they’re writing wishes.
"Make this function better" is a wish.
A brief looks different.
A Claude Code brief is a structured task description with four parts: context, task, constraints, and success criteria.
Here’s the difference in practice.
Wish:
"Refactor the payment processing module to be cleaner"
Brief:
"Context: This is our payment processing module (src/payments/processor.py). It handles Stripe webhooks and writes to our orders table. It was written in 2021, processes ~500 transactions/day.
Task: Refactor for readability. Main issues: process_webhook is 180 lines and there are bare except clauses everywhere.
Constraints: Do NOT change function signatures — other modules depend on them. Do NOT change database write logic. Keep the same Stripe library version.
Success criteria: All functions < 50 lines. Specific exception types instead of bare excepts. Test suite still passes."
The first prompt will give you something. The second will give you something you can actually use.
Claude Code reads your codebase. That's its superpower. But what counts as relevant is determined by what you tell it to focus on.
Without context, Claude Code guesses at what matters. Sometimes it guesses right. Often it makes changes that are technically correct but practically wrong — because it didn't know that the orders table is a read replica.
Your context paragraph should answer: what this code is, what it does, where it sits in the system, and how much traffic or criticality rides on it.
Three sentences of context prevent three rounds of revision.
Most AI coding disasters happen because constraints weren’t specified.
"Rewrite this API endpoint to be faster" → Claude Code restructures your routing, breaking five other endpoints.
Not because it was wrong about making it faster. Because you didn’t say the routing was off-limits.
Constraints to always consider: interfaces and function signatures that cannot change, behavior that must be preserved, and standards or styles the code must follow.
When you include "the test suite still passes" as a success criterion, Claude Code will run your tests and iterate until they do. Without success criteria, you get "here’s my best attempt" and you’re left evaluating it yourself.
Good success criteria are specific, measurable, and verifiable by a test or command the model can actually run.
```
Context:
[What this code is, what it does, where it sits in the system, traffic/criticality]

Task:
[Specific, verb-first description of what to do]

Constraints:
- [Thing it cannot change]
- [Thing it must preserve]
- [Standard/style it must follow]

Success criteria:
- [Measurable outcome 1]
- [Measurable outcome 2]
- [Test or check that verifies completion]
```
The brief format is even more useful at the team level.
When everyone writes briefs the same way, tasks become easier to review before they run, easier to hand off, and easier to compare afterward.
The teams seeing the biggest Claude Code ROI aren’t the ones with the most API access. They’re the ones that made brief-writing a team norm.
Write your next Claude Code task as a brief. All four parts. It will take 5 extra minutes.
See what comes back.
I’d bet you won’t write a wish again.
If you’re thinking about how to roll this out to your whole engineering team, the Ask Patrick co-work program covers exactly this. Free team assessment: askpatrick.co/assessment.html