2026-04-25 11:15:00
This is Part 3 of a three-part series on AI governance architecture. In Part 1, we explored the negative proof problem: why signed receipts can't prove that unauthorized actions didn't happen. In Part 2, we examined pre-execution gates that evaluate policy before execution occurs. Today, we'll build a complete reference architecture showing exactly how these components fit together in a production system.
Note: This series explores architectural patterns for AI governance based on regulatory requirements and cryptographic best practices. The layered architecture and code examples presented are conceptual frameworks for educational purposes, adaptable across different tech stacks and deployment environments.
We've established the conceptual foundation for pre-execution governance: evaluate policy before execution rather than after; create denial proofs that demonstrate prevention rather than just detection; and maintain deterministic policy evaluation to enable replay verification. But understanding the pattern conceptually is different from implementing it in a production system where reliability, performance, and maintainability all matter.
The gap between "this makes sense architecturally" and "this works in production" is where most governance initiatives stall out. You start with good intentions, build a proof of concept that validates the core ideas, then hit the messy reality of integrating with existing systems, handling edge cases, managing policy evolution, and operating the whole stack at scale. What you need is a clear architectural blueprint that shows not just what components to build, but how they interact, what each layer is responsible for, and how to evolve the system as requirements change.
This reference architecture represents patterns that work across different tech stacks and deployment environments. The specific implementation details will vary depending on whether you're running on AWS, Azure, GCP, or on-premises infrastructure, but the layered structure remains the same. Each layer has a specific responsibility, clear boundaries with adjacent layers, and well-defined interfaces that make testing and evolution manageable.
Every request into your AI system passes through a single entry point with no bypass paths. This is architecturally similar to how API gateways work in microservices architectures—you enforce that all traffic flows through one place so you can apply cross-cutting concerns consistently. In this case, the cross-cutting concern is governance evaluation.
The execution router's job is deceptively simple: receive requests, determine which governance pipeline applies based on tenant and folder context, and route to the appropriate evaluation flow. But that simplicity is load-bearing. If there are multiple entry points into your AI execution layer, or if developers can bypass the router by calling model APIs directly, your governance guarantees collapse. The router is only effective if it's mandatory and non-bypassable.
In practice, making the router mandatory means using your infrastructure's access control systems to enforce it. If you're running on AWS, that means IAM policies that prevent Lambda functions from calling Bedrock directly—they have to go through the router. If you're running on Azure, it means managed identities that only grant the router function permission to invoke AI services. If you're running on-premises with direct model access, it means network segmentation that prevents application servers from reaching model APIs without passing through the governance layer.
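As a concrete illustration of that enforcement on AWS, a service control policy can deny direct Bedrock invocation to every principal except the router's execution role. This is a sketch of the pattern, not a drop-in policy: the account ID and role name are placeholders, and your organization's policy structure will differ.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyDirectModelInvocation",
      "Effect": "Deny",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": "*",
      "Condition": {
        "StringNotEquals": {
          "aws:PrincipalArn": "arn:aws:iam::123456789012:role/governance-execution-router"
        }
      }
    }
  ]
}
```

With a policy like this attached at the organization level, an application Lambda that tries to call Bedrock directly gets an explicit deny, while the router role proceeds normally.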
The router also handles authentication and initial context resolution. Before any governance evaluation happens, you need to know who's making the request and what organizational boundaries it belongs to. That typically means validating JWT tokens, resolving tenant identifiers from user claims, and loading the folder context that determines which policies apply. This context becomes the foundation for all subsequent policy evaluation.
Here's what that looks like structurally:
class ExecutionRouter:
    """
    Single entry point for all AI requests. No bypass paths allowed.
    Infrastructure access controls enforce that all model invocations
    must flow through this router.
    """
    async def route_request(self, request):
        # Step 1: Authentication - who's making this request?
        caller = await self.authenticate(request)

        # Step 2: Context resolution - which tenant/folder?
        context = await self.resolve_context(caller, request)

        # Step 3: Route to appropriate governance pipeline
        # Different tenants or folders might have different policy engines
        pipeline = self.get_pipeline(context.tenant_id, context.folder_id)

        # Step 4: Execute governance evaluation
        # This is where we call Layer 2 (Policy Engine)
        decision = await pipeline.evaluate(request, context)

        # Step 5: Handle the decision
        if decision.verdict == 'DENY':
            return self.handle_denial(decision)
        else:
            return await self.execute_and_receipt(request, decision)
The router is stateless and horizontally scalable. Each request is independent, and all the state needed for governance evaluation gets loaded from durable storage systems. This means you can run multiple router instances behind a load balancer without coordination between them, which is essential for handling production-scale traffic.
The policy engine's responsibility is evaluating requests against governance rules and returning an enforcement decision. This is where the actual governance logic lives—all the rules about folder isolation, data classification restrictions, tool access controls, budget limits, and compliance requirements.
The key architectural constraint for this layer is that policy evaluation must be deterministic and fast. As we discussed in Part 2, deterministic evaluation enables replay verification, which is how you prove to auditors that denial decisions were legitimate. Fast evaluation means you can run this synchronously on every request without adding unacceptable latency.
To achieve both determinism and speed, the policy engine operates on a snapshot of the policy that's loaded once and cached in memory. When a request comes in for evaluation, the engine doesn't query a database to find out what rules apply—it already has the rules loaded. This eliminates network latency and ensures that the evaluation is deterministic because it's using a fixed policy version rather than potentially fetching different rules on subsequent evaluations.
Policy snapshots are versioned immutably. When you update a policy, you create a new version with a new hash. The old version remains available indefinitely so that denial proofs can be replayed against the exact policy that was in effect when the original decision was made. This versioning is what enables the replay verification workflow that auditors rely on.
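One way to make those versions content-addressed is to hash a canonical serialization of the policy, so that identical rules always yield the identical version hash. This is an illustrative sketch; the function and field names are mine, not part of any particular engine.

```python
import hashlib
import json

def snapshot_hash(policy_doc: dict) -> str:
    """Content-address a policy snapshot: identical rule sets produce
    the identical version hash, regardless of dict key order."""
    canonical = json.dumps(policy_doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Two serializations of the same policy, with keys in different order
v1 = {"default_action": "DENY",
      "rules": [{"id": "r1", "action": "ALLOW", "match": {"folder": "public"}}]}
v2 = {"rules": [{"match": {"folder": "public"}, "action": "ALLOW", "id": "r1"}],
      "default_action": "DENY"}

print(snapshot_hash(v1) == snapshot_hash(v2))  # True: same content, same hash
```

Because the hash depends only on content, replaying a denial against the archived snapshot lets a verifier confirm the snapshot is byte-for-byte the policy that was in effect.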
The engine evaluates rules in a defined sequence. Some governance frameworks call this a policy decision point, but the concept is straightforward: you have an ordered list of rules, you evaluate them one by one, and the first rule that fires determines the outcome. This sequential evaluation is important because it makes policy behavior predictable and debuggable. You can trace through exactly which rule fired and why, which is essential for both policy development and compliance documentation.
class PolicyEngine:
    """
    Deterministic policy evaluation with immutable versioning.
    """
    def __init__(self, policy_snapshot):
        # Load immutable policy snapshot into memory
        self.policy = policy_snapshot
        self.version_hash = policy_snapshot.hash

    def evaluate(self, request, context):
        # Evaluate rules sequentially until one fires
        for rule in self.policy.rules:
            if rule.condition_matches(request, context):
                # First matching rule determines the decision
                return Decision(
                    verdict=rule.action,  # ALLOW or DENY
                    rule_id=rule.id,
                    policy_version=self.version_hash,
                    reason=rule.reason_template.format(**context),
                    regulatory_basis=rule.citations
                )
        # No explicit rule fired, use default policy
        return Decision(
            verdict=self.policy.default_action,
            policy_version=self.version_hash
        )
When you're designing policies for this engine, you need to think carefully about what belongs here versus what belongs in Layer 5 analytics. The policy engine should enforce simple, explicit rules that can be evaluated quickly: folder boundaries, data classification checks, budget gates, allowlists of permitted tools. It should not run machine learning models to detect anomalies, query external APIs that might be slow or unreliable, or implement complex heuristics that might produce different results on subsequent evaluations.
Once the policy engine returns a decision, that decision needs to be captured in a tamper-evident format with cryptographic guarantees. This is where Layer 3 comes in. Its job is to take the decision from Layer 2, add cryptographic signing via a key management service, and store the signed artifact in immutable storage.
The signing step is critical because it's what prevents someone from fabricating denial proofs after the fact. When you use AWS KMS, Azure Key Vault, or Google Cloud KMS for signing, you're leveraging a hardware security module that's designed to make forging signatures computationally infeasible. The governance system calls the signing API with the decision payload, gets back a signature, and bundles them together into the signed proof artifact.
The immutability step is equally critical because it prevents tampering with the audit trail. If you store denial proofs in a regular database where administrators can delete records, an auditor can't trust that the absence of a denial proof means no denial occurred—it could mean the proof was deleted. But if you store denial proofs in S3 with Object Lock in compliance mode, or in Azure Blob Storage with immutable blob retention policies, those proofs become undeletable even by privileged administrators. The only way to "delete" them is to wait for the retention period to expire, which might be seven years for HIPAA data or even longer for other regulatory frameworks.
Batching denial proofs into Merkle trees adds an additional layer of verification efficiency. Instead of requiring auditors to verify thousands of individual signatures, you can batch decisions into hourly or daily trees, compute a root hash, sign that root with KMS, and anchor it to immutable storage. Then auditors can verify the root signature once and use the Merkle proof structure to verify that individual decisions are included in the tree. This pattern scales much better than individual signature verification when you're dealing with high-volume AI systems.
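The Merkle mechanics can be sketched in a few lines. This is a minimal illustration of batching and inclusion proofs (duplicating the last node on odd levels); production implementations typically follow a standardized leaf/node hashing scheme such as RFC 6962's, which this sketch does not.

```python
import hashlib

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Fold a batch of decision payloads up to a single root hash."""
    level = [_h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:                 # odd count: duplicate last node
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def inclusion_proof(leaves, index):
    """Sibling hashes needed to recompute the root from one leaf."""
    proof, level = [], [_h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # (hash, sibling-is-left)
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify_inclusion(leaf, proof, root):
    node = _h(leaf)
    for sibling, is_left in proof:
        node = _h(sibling + node) if is_left else _h(node + sibling)
    return node == root

decisions = [f"decision-{i}".encode() for i in range(5)]
root = merkle_root(decisions)              # sign this one root with KMS
proof = inclusion_proof(decisions, 3)      # ~log2(n) hashes per decision
print(verify_inclusion(decisions[3], proof, root))  # True
```

An auditor who trusts the signed root needs only the handful of sibling hashes in the proof to confirm a specific decision was in the batch, rather than verifying one KMS signature per decision.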
class ProofStorage:
    """
    Cryptographically sign decisions and store immutably.
    """
    async def store_denial(self, decision, request_hash):
        # Create denial proof payload
        proof = DenialProof(
            decision_id=generate_id(),
            request_hash=request_hash,
            verdict='DENY',
            rule_id=decision.rule_id,
            policy_version=decision.policy_version,
            timestamp=utcnow(),
            reason=decision.reason
        )

        # Sign with KMS to prevent forgery
        signature = await kms_client.sign(
            key_id=GOVERNANCE_SIGNING_KEY,
            message=proof.canonical_bytes(),
            algorithm='RSASSA_PKCS1_V1_5_SHA_256'
        )

        # Bundle into signed proof
        signed_proof = SignedDenialProof(
            proof=proof,
            signature=signature,
            key_id=GOVERNANCE_SIGNING_KEY
        )

        # Store in immutable WORM storage
        await s3_client.put_object(
            Bucket=WORM_BUCKET,
            Key=f'denials/{proof.decision_id}.json',
            Body=signed_proof.to_json(),
            ObjectLockMode='COMPLIANCE',
            ObjectLockRetainUntilDate=utcnow() + timedelta(days=2555)  # 7 years
        )

        # Queue for Merkle batching
        await sqs_client.send_message(
            QueueUrl=MERKLE_BATCH_QUEUE,
            MessageBody=proof.decision_id
        )
        return signed_proof
The combination of cryptographic signing and immutable storage creates what compliance frameworks call non-repudiation. The organization that generated the denial proof cannot later claim that the proof was fabricated or tampered with, because the KMS signature proves authenticity and the WORM storage proves the proof hasn't been modified since creation.
Having signed denial proofs in immutable storage is valuable, but only if auditors can independently verify them without needing privileged access to your production systems. That's what Layer 4 provides: a public verification endpoint that anyone with a denial proof identifier can use to validate authenticity.
The verification endpoint accepts a denial proof ID, retrieves the corresponding proof from storage, and performs several checks. First, it verifies the KMS signature to confirm the proof hasn't been tampered with. Second, it checks that the proof is actually stored in the WORM bucket with retention policy intact. Third, if the proof is part of a Merkle batch, it verifies the Merkle inclusion proof showing that the decision is included in a sealed batch. Fourth, it offers a replay endpoint where someone can re-evaluate the decision using the archived policy snapshot to confirm the decision would still be DENY.
This verification endpoint is intentionally designed to work without requiring authentication. Any auditor, regulator, or customer who has a denial proof ID can verify it independently. This is similar to how blockchain verification works—you don't need to trust the organization that created the record, you can verify it yourself using public cryptographic proofs. For compliance purposes, this independent verifiability is what makes denial proofs compelling evidence rather than just self-reported logs.
class VerificationEndpoint:
    """
    Public endpoint for independent verification of denial proofs.
    No authentication required - verification is based on cryptography.
    """
    async def verify_denial(self, proof_id):
        # Retrieve proof from WORM storage
        proof = await self.get_proof(proof_id)

        # Check 1: Verify KMS signature
        signature_valid = await kms_client.verify(
            key_id=proof.key_id,
            message=proof.canonical_bytes(),
            signature=proof.signature,
            algorithm='RSASSA_PKCS1_V1_5_SHA_256'
        )

        # Check 2: Verify WORM retention is intact
        retention_active = await self.verify_worm_retention(proof_id)

        # Check 3: Verify Merkle inclusion if batched
        merkle_valid = await self.verify_merkle_inclusion(proof)

        # Check 4: Offer replay verification
        replay_endpoint = f'/verify/{proof_id}/replay'

        return VerificationResult(
            proof_id=proof_id,
            signature_valid=signature_valid,
            worm_retention_active=retention_active,
            merkle_inclusion_valid=merkle_valid,
            replay_endpoint=replay_endpoint
        )
The replay endpoint deserves special attention because it's what makes deterministic policy evaluation valuable in practice. An auditor can call the replay endpoint with the original request hash and the policy version from the denial proof. The verification system retrieves the immutably stored policy snapshot, re-runs the policy evaluation, and confirms that the outcome is still DENY. If the replay produces a different result, that's a red flag that either the policy was mutated after the fact or the policy engine is non-deterministic, both of which undermine the integrity of your governance system.
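The replay check can be demonstrated end to end with a toy policy. Everything here is illustrative (a single-field match rule, my own field names); the point is the shape of the check: verify the archived snapshot's hash, re-evaluate deterministically, compare verdicts.

```python
import hashlib
import json

def snapshot_hash(policy: dict) -> str:
    return hashlib.sha256(json.dumps(policy, sort_keys=True).encode()).hexdigest()

def evaluate(policy, request):
    """Deterministic first-match evaluation, as in Layer 2."""
    for rule in policy["rules"]:
        if rule["match"] == request["folder"]:
            return {"verdict": rule["action"], "rule_id": rule["id"]}
    return {"verdict": policy["default_action"], "rule_id": None}

def replay(proof, archived_policy):
    """An auditor's replay: the archived snapshot must hash to the version
    recorded in the proof, and re-evaluation must still yield DENY."""
    if snapshot_hash(archived_policy) != proof["policy_version"]:
        return False  # snapshot was mutated after the fact
    decision = evaluate(archived_policy, proof["request"])
    return decision["verdict"] == "DENY" and decision["rule_id"] == proof["rule_id"]

policy = {"default_action": "ALLOW",
          "rules": [{"id": "no-phi-export", "action": "DENY", "match": "phi"}]}
proof = {"policy_version": snapshot_hash(policy),
         "request": {"folder": "phi"},
         "rule_id": "no-phi-export"}

print(replay(proof, policy))  # True: same policy, same request, same verdict
```

If anyone swaps the archived policy, the hash check fails before re-evaluation even runs, which is exactly the red flag described above.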
The first four layers focus on enforcement and proof generation. Layer 5 is where you add the observability and analytics that make the governance system operationally manageable. This is where you aggregate decisions to build dashboards showing denial patterns, detect anomalies that might indicate policy gaps or system attacks, surface frequently denied rules that might need policy adjustment, and track compliance metrics for internal reporting.
Critically, Layer 5 is optional in the sense that the core governance enforcement works without it. You can have a fully functional pre-execution gate system with just Layers 1 through 4. Layer 5 adds operational visibility and helps you evolve policies over time, but it's not required for basic prevention and proof generation. This is an important architectural separation because it means you can start with enforcement-first and add analytics later as operational needs emerge.
The analytics layer operates on the same denial proofs and receipts that Layer 3 generates, but it processes them asynchronously after the fact rather than inline during request handling. This separation keeps the enforcement path fast and simple while allowing the analytics path to be as complex and slow as necessary. You might run machine learning models to detect unusual denial patterns, query external threat intelligence feeds to identify potentially malicious request sources, or generate compliance reports that require aggregating data across thousands of decisions.
One pattern that works well is using the analytics layer to detect when policies need updating. If you see a spike in denials for a particular rule, that might indicate a legitimate use case that your current policy doesn't account for. If you see a pattern of denials followed by successful requests with slightly modified parameters, that might indicate someone is probing your governance boundaries. The analytics layer surfaces these patterns so your security team can investigate and adjust policies as needed.
Now that we've built out the full five-layer architecture, it's worth stepping back and honestly assessing when you don't need all this complexity. Not every AI system requires pre-execution gates. If your compliance requirements focus on auditability and transparency rather than prevention, if you're operating in environments where the cost of a governance failure is low, or if you're in early-stage development where shipping velocity matters more than production hardening, receipts alone may be sufficient.
The decision tree is straightforward. If your regulatory framework uses prevention language—HIPAA's "prevent unauthorized access," PCI DSS's "prevent access beyond need-to-know," GDPR's "prevent processing beyond original purpose"—then you need pre-execution gates because receipts fundamentally cannot demonstrate prevention. But if your framework focuses on auditability and disclosure—demonstrating that you have policies, that you applied them consistently, that you can produce records on demand—then receipts provide the evidence you need without the architectural overhead of gates.
Similarly, if you're operating in regulated verticals where negative proofs matter—healthcare, financial services, government systems—pre-execution gates become table stakes because auditors will ask questions that only gates can answer. But if you're running internal analytics tools used by trusted operators in controlled environments, the prevention requirement is less stringent and the detection that receipts provide may be adequate.
The other consideration is operational maturity. Pre-execution gates require that your policies be well-defined, deterministic, and tested before you enable enforcement mode. If you're still figuring out what your governance policies should be, starting with receipt-based observability while you iterate on policy design makes more sense than trying to enforce policies that might change dramatically as you learn more about your system's actual behavior.
If you've made it this far through the series, you understand the core architectural patterns for building prevention-first AI governance. You know why signed receipts alone can't solve the negative proof problem, how pre-execution gates create denial proofs that demonstrate prevention, what deterministic policy evaluation means and why it matters, and how to structure a complete governance stack across five architectural layers.
The hard part isn't understanding these patterns—it's implementing them in your specific environment with your specific constraints and requirements. Every organization has legacy systems to integrate with, existing security controls that need to interoperate with the governance layer, and operational teams whose workflows change when you add mandatory governance gates.
The approach that tends to work is starting with Layer 1 and Layer 2 in observer mode. Build the execution router and policy engine, but configure them to always return ALLOW while logging what the decision would have been if enforcement was enabled. This lets you validate that your policies are working correctly, that performance is acceptable, and that you're not about to break production workflows. Once you have confidence in observer mode, you can start enabling selective enforcement on high-risk surfaces where the security benefit justifies the risk of blocking something incorrectly.
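Observer mode can be as simple as a wrapper that records what the engine would have denied while always returning ALLOW. The class and field names below are illustrative, with a stub engine standing in for Layer 2.

```python
from collections import namedtuple

Decision = namedtuple("Decision", ["verdict", "rule_id"])

class StubEngine:
    """Stand-in for the real policy engine: deny anything touching 'phi'."""
    def evaluate(self, request, context):
        if request.get("folder") == "phi":
            return Decision("DENY", "no-phi-access")
        return Decision("ALLOW", None)

class ObserverModeEngine:
    """Evaluate and record every verdict, but only block when enforce=True."""
    def __init__(self, engine, enforce=False):
        self.engine, self.enforce = engine, enforce
        self.shadow_denials = []  # rules that *would* have fired

    def evaluate(self, request, context):
        decision = self.engine.evaluate(request, context)
        if decision.verdict == "DENY" and not self.enforce:
            self.shadow_denials.append(decision.rule_id)
            return Decision("ALLOW", decision.rule_id)  # observe, don't block
        return decision

gate = ObserverModeEngine(StubEngine(), enforce=False)
print(gate.evaluate({"folder": "phi"}, {}).verdict)  # ALLOW (shadow mode)
print(gate.shadow_denials)                           # ['no-phi-access']
```

Flipping `enforce=True` turns the same wrapper into the real gate, which is what makes the observer-to-enforcement migration low-risk: the evaluation path doesn't change, only whether the verdict is honored.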
From there, you add Layers 3 and 4 to start generating verifiable denial proofs and providing independent verification endpoints. Finally, Layer 5 gives you the operational visibility to maintain and evolve the system over time. This incremental rollout reduces risk while letting you build the governance capabilities you need for compliance.
The AI governance landscape is maturing rapidly. What started as optional nice-to-have tooling is becoming mandatory infrastructure as AI systems move into regulated production environments. Auditors are asking harder questions, regulators are writing more specific requirements, and the organizations that solve prevention-first governance early will have a significant advantage over those still relying on detection-only approaches.
If you're building AI systems that handle sensitive data, operate in regulated industries, or face compliance requirements with prevention language, the time to start thinking about pre-execution governance architecture is now. The patterns are well-understood, the implementation approaches are proven, and the compliance benefits are clear. What's needed is the commitment to build governance as infrastructure rather than treating it as an afterthought.
Read Part 1: The Negative Proof Problem in AI Governance
Read Part 2: Pre-Execution Gates: How to Block Before You Execute
2026-04-25 11:10:01
V4 Pro launched April 24, 2026. Been running it on production agents since.
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key="<NVIDIA_NIM_KEY>"
)
response = client.chat.completions.create(
    model="deepseek-ai/deepseek-v4-pro",
    messages=[...]
)
| Model | Input ($/1M tokens) | Output ($/1M tokens) |
|---|---|---|
| DeepSeek V4 Pro | $1.74 | $3.48 |
| Claude Sonnet 4.6 | $3.00 | $15.00 |
| GPT-4o | $2.50 | $10.00 |
For agent workloads (lots of input, structured output), V4 Pro is the new sweet spot.
My agent automation guides updated for V4: https://yanmiay.gumroad.com
2026-04-25 11:07:59
Measure the height of every adult in your city.
Plot how many people are at each height. Short on the left, tall on the right, count of people on the vertical axis.
You get a bell. Narrow at the extremes, wide in the middle. Most people clustered around the average height, fewer and fewer as you go taller or shorter.
Now measure reaction times in a psychology study. Plot them.
Bell.
Measure the weight of apples coming off a production line. Plot them.
Bell.
Measure the errors in any careful scientific measurement. Plot them.
Bell.
This keeps happening. The same shape, over and over, in completely unrelated domains. It is not a coincidence. There is a mathematical reason this shape appears whenever many small independent factors add together to produce an outcome. That reason is what this post is about.
import numpy as np
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
np.random.seed(42)
heights = np.random.normal(loc=170, scale=10, size=10000)
plt.figure(figsize=(10, 5))
plt.hist(heights, bins=60, edgecolor='black', color='steelblue', alpha=0.7)
plt.axvline(heights.mean(), color='red', linewidth=2, label=f'Mean: {heights.mean():.1f}')
plt.xlabel('Height (cm)')
plt.ylabel('Count')
plt.title('Distribution of heights (10,000 people)')
plt.legend()
plt.savefig('normal_dist.png', dpi=100, bbox_inches='tight')
plt.close()
print(f"Mean: {heights.mean():.2f} cm")
print(f"Std: {heights.std():.2f} cm")
print(f"Min: {heights.min():.2f} cm")
print(f"Max: {heights.max():.2f} cm")
Output:
Mean: 169.98 cm
Std: 10.03 cm
Min: 131.74 cm
Max: 209.85 cm
np.random.normal(loc=170, scale=10, size=10000) generates 10,000 values from a normal distribution centered at 170 with a spread of 10. The histogram you get from this is a bell curve.
loc is the mean: the center of the bell. scale is the standard deviation: how wide the bell is.
Change scale to 2 and the bell gets narrow and tall. Change it to 30 and it gets wide and flat. Same center, different spread.
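You can confirm that claim directly: same `loc`, different `scale`, and the measured spread tracks the parameter.

```python
import numpy as np

np.random.seed(0)
narrow = np.random.normal(loc=170, scale=2, size=100000)
wide = np.random.normal(loc=170, scale=30, size=100000)

# Same center, very different spread
print(f"narrow: mean={narrow.mean():.1f} std={narrow.std():.2f}")
print(f"wide:   mean={wide.mean():.1f} std={wide.std():.2f}")
```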
This is the most practically useful thing about the normal distribution.
mean = 170
std = 10
within_1_std = (mean - std, mean + std)
within_2_std = (mean - 2*std, mean + 2*std)
within_3_std = (mean - 3*std, mean + 3*std)
sample = np.random.normal(mean, std, 100000)
pct_1 = np.mean((sample >= within_1_std[0]) & (sample <= within_1_std[1])) * 100
pct_2 = np.mean((sample >= within_2_std[0]) & (sample <= within_2_std[1])) * 100
pct_3 = np.mean((sample >= within_3_std[0]) & (sample <= within_3_std[1])) * 100
print(f"Within 1 std ({within_1_std[0]} to {within_1_std[1]}): {pct_1:.1f}%")
print(f"Within 2 std ({within_2_std[0]} to {within_2_std[1]}): {pct_2:.1f}%")
print(f"Within 3 std ({within_3_std[0]} to {within_3_std[1]}): {pct_3:.1f}%")
Output:
Within 1 std (160 to 180): 68.3%
Within 2 std (150 to 190): 95.4%
Within 3 std (140 to 200): 99.7%
68% of the data falls within one standard deviation of the mean.
95% within two.
99.7% within three.
The remaining 0.3% lies beyond three standard deviations. These are your outliers. The anomalies. The things worth investigating.
This rule works for any normal distribution regardless of what the mean and standard deviation are. The percentages stay the same. Only the actual values change.
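A quick check of that invariance: two normal distributions with completely different parameters both put about 68% of their mass within one standard deviation.

```python
import numpy as np

np.random.seed(0)
pcts = []
for mean, std in [(0, 1), (500, 75)]:
    sample = np.random.normal(mean, std, 100000)
    pct = np.mean(np.abs(sample - mean) <= std) * 100
    pcts.append(pct)
    print(f"mean={mean}, std={std}: {pct:.1f}% within 1 std")
```

Both lines print roughly 68.3%, even though the distributions cover wildly different ranges of values.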
Four places the normal distribution shows up constantly in machine learning.
Weight initialization. When you create a neural network, its weights cannot all start at zero. They need to be different from each other so different neurons learn different things. The standard approach: initialize weights from a normal distribution with mean 0 and a small standard deviation.
layer_weights = np.random.normal(loc=0, scale=0.01, size=(256, 128))
print(f"Weight matrix shape: {layer_weights.shape}")
print(f"Mean of weights: {layer_weights.mean():.6f}")
print(f"Std of weights: {layer_weights.std():.6f}")
Output:
Weight matrix shape: (256, 128)
Mean of weights: 0.000023
Std of weights: 0.010001
Random, small, centered at zero, normally distributed. This is how every neural network starts its life.
Feature distributions. Many real-world features are approximately normally distributed. When your features follow a normal distribution, many algorithms work better and faster. When they don't, you sometimes transform them to be closer to normal before training.
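A common version of that transform is taking the log of a right-skewed feature. Here is a small demonstration with synthetic income-like data (lognormal by construction, so the log transform recovers a normal shape):

```python
import numpy as np

np.random.seed(0)
incomes = np.random.lognormal(mean=10, sigma=0.8, size=5000)  # heavily right-skewed

logged = np.log(incomes)  # log-transform pulls the long tail back in

# Skew check from earlier in this post: mean vs median gap
print(f"raw:    mean={incomes.mean():.0f}  median={np.median(incomes):.0f}")
print(f"logged: mean={logged.mean():.2f}  median={np.median(logged):.2f}")
```

On the raw data the mean sits well above the median (the tail drags it right); after the log transform the two nearly coincide, the signature of an approximately symmetric, normal-ish feature.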
Residuals in regression. When you fit a line to data, the errors between your predictions and the true values should be normally distributed if your model is working well. If they're not, something is wrong with your model assumptions.
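You can see this with a quick fit: generate data from a known line plus normal noise, fit with `np.polyfit`, and inspect the residuals.

```python
import numpy as np

np.random.seed(0)
x = np.linspace(0, 10, 500)
y = 3 * x + 7 + np.random.normal(0, 2, size=500)  # true line + gaussian noise

slope, intercept = np.polyfit(x, y, 1)            # least-squares line fit
residuals = y - (slope * x + intercept)

# Healthy model: residuals centered at 0, roughly 95% within 2 std
pct = np.mean(np.abs(residuals) < 2 * residuals.std()) * 100
print(f"residual mean: {residuals.mean():.4f}")
print(f"within 2 std:  {pct:.1f}%")
```

If the residuals were skewed or heavy-tailed instead, that would be a hint that the linear model is missing something about the data.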
Anomaly detection. Values more than two or three standard deviations from the mean are rare under a normal distribution. Mark them as anomalies.
sensor_readings = np.array([
23.1, 22.8, 23.4, 22.9, 23.2, 23.0,
22.7, 23.3, 22.6, 87.4, 23.1, 22.9
])
mean = sensor_readings.mean()
std = sensor_readings.std()
print(f"Mean: {mean:.2f}, Std: {std:.2f}\n")
for i, reading in enumerate(sensor_readings):
    z = (reading - mean) / std
    status = "ANOMALY" if abs(z) > 2 else "normal"
    print(f"Reading {i+1:2d}: {reading:6.1f} z={z:6.2f} {status}")
Output:
Mean: 28.37, Std: 17.80
Reading  1:   23.1 z= -0.30 normal
Reading  2:   22.8 z= -0.31 normal
Reading  3:   23.4 z= -0.28 normal
Reading  4:   22.9 z= -0.31 normal
Reading  5:   23.2 z= -0.29 normal
Reading  6:   23.0 z= -0.30 normal
Reading  7:   22.7 z= -0.32 normal
Reading  8:   23.3 z= -0.28 normal
Reading  9:   22.6 z= -0.32 normal
Reading 10:   87.4 z=  3.32 ANOMALY
Reading 11:   23.1 z= -0.30 normal
Reading 12:   22.9 z= -0.31 normal
One sensor reading spiked to 87.4. Everything else was between 22 and 24. The z-score of 3.32 flags it immediately, even though the outlier itself inflates the mean and standard deviation it is measured against.
Real data is often not perfectly normal. It is skewed, has heavy tails, or has multiple peaks. Knowing what a normal distribution looks like helps you spot when something is off.
normal_data = np.random.normal(100, 15, 5000)
skewed_data = np.random.exponential(scale=50, size=5000)
print("Normal data:")
print(f" Mean: {normal_data.mean():.1f}")
print(f" Median: {np.median(normal_data):.1f}")
print(f" Diff: {abs(normal_data.mean() - np.median(normal_data)):.1f}")
print("\nSkewed data:")
print(f" Mean: {skewed_data.mean():.1f}")
print(f" Median: {np.median(skewed_data):.1f}")
print(f" Diff: {abs(skewed_data.mean() - np.median(skewed_data)):.1f}")
Output:
Normal data:
Mean: 99.9
Median: 100.0
Diff: 0.1
Skewed data:
Mean: 49.8
Median: 34.3
Diff: 15.5
When mean and median are close, data is likely symmetric and possibly normal. When they diverge significantly, the distribution is skewed. Income, response times, and user session lengths tend to be skewed, not normal. Always check before assuming.
Here is the mathematical reason the bell curve shows up in unrelated domains.
Take any distribution. Roll a die. Draw from it randomly. Average several draws together. Repeat this many times. Plot the distribution of those averages.
Normal distribution. Every time. Regardless of the original distribution.
np.random.seed(42)
die_rolls_single = np.random.randint(1, 7, size=10000)
sample_means = []
for _ in range(10000):
    sample = np.random.randint(1, 7, size=30)
    sample_means.append(sample.mean())
sample_means = np.array(sample_means)
print("Single die roll:")
print(f" Mean: {die_rolls_single.mean():.2f}")
print(f" Std: {die_rolls_single.std():.2f}")
print(f" Shape: roughly uniform (1 through 6)")
print("\nAverage of 30 die rolls (10,000 experiments):")
print(f" Mean: {sample_means.mean():.2f}")
print(f" Std: {sample_means.std():.2f}")
print(f" Shape: bell curve, centered at 3.5")
Output:
Single die roll:
Mean: 3.50
Std: 1.71
Shape: roughly uniform (1 through 6)
Average of 30 die rolls (10,000 experiments):
Mean: 3.50
Std: 0.31
Shape: bell curve, centered at 3.5
A single die roll is uniformly distributed. Flat. Every outcome equally likely. But average 30 rolls together and suddenly you have a bell curve.
Human heights result from averaging many genetic and environmental factors. Measurement errors average out many tiny random disturbances. Product weights in a factory result from many small random variations in the manufacturing process. Averages of many independent things follow the normal distribution. That is why the bell shows up everywhere.
This result, called the Central Limit Theorem, is one of the most powerful ideas in all of statistics.
Create normal_distribution_practice.py.
Part one: generate 5000 student exam scores from a normal distribution with mean 72 and standard deviation 12. Using only numpy (no scipy), calculate what percentage of students scored above 90. What percentage scored below 50. What percentage scored between 60 and 85.
Then verify using the 68-95-99.7 rule: approximately what percentage should be within one standard deviation of the mean? Count how many actually are and compare.
Part two: you have this real dataset of daily temperatures:
temps = np.array([
24, 26, 23, 25, 28, 24, 27, 25, 26, 24,
23, 26, 25, 27, 24, 26, 23, 25, 42, 24,
26, 25, 27, 24, 23, 26, 25, 28, 24, 26
])
Calculate mean and standard deviation. Find any temperatures more than 2 standard deviations from the mean. Remove those outliers and recalculate statistics. How much did things change?
Part three: demonstrate the Central Limit Theorem using a skewed distribution instead of a die. Use np.random.exponential(scale=10, size=...). Take samples of size 50 and compute their means 5000 times. Print the mean and standard deviation of your sample means. Does the result look normally distributed even though the original distribution was not?
Phase 2 is almost done. One post left: all of this math running as real code using NumPy. No theory. Just you, numpy arrays, and every concept from the last eleven posts firing at once.
After that, Phase 3. The actual data tools. NumPy, Pandas, visualization. The stuff you will use every single day.
2026-04-25 11:06:31
Most CMS platforms ask you to make technical decisions that have nothing to do with your content. Which SEO plugin handles canonical tags better. Whether the sitemap plugin conflicts with the multilingual plugin. How to configure Open Graph images without another plugin. Why the favicon only shows in the browser tab but not in Google's search results.
After years of watching the same problems repeat on site after site, the natural question becomes: what if the CMS just did all of that itself?
AliothPress is a self-hosted CMS built around that question. This post walks through what it automates, so that anyone who keeps hitting the same problems can recognise whether it fits.
Installing a self-hosted CMS usually sounds intimidating: upload files to a server, edit configuration by hand, run commands in a terminal. AliothPress works differently, and understanding how it works removes most of the anxiety.
When you create a new Ubuntu server at a cloud provider, there is a field during setup labelled "user-data" or "cloud-init script". You paste a short script into that field before you click create. From that point, everything happens on its own. The server prepares itself, installs what it needs, downloads the latest version of the CMS from the update service, and configures itself. You never touch the server directly. There is nothing to upload, nothing to install manually, no commands to run.
After a few minutes, you open the server's IP address in your browser. A setup wizard walks you through the rest: admin account, database, basic site settings. The whole first-time setup takes around five minutes, and a step-by-step installation guide with screenshots is included for anyone who prefers a visual walkthrough.
Once your domain's DNS points to the server, you activate a free Let's Encrypt SSL certificate from the admin panel with a single click. The certificate covers both your domain and its www version. Automatic renewal is set up in the background. The redirect rules are configured correctly so visitors always land on the canonical version of the site without unnecessary hops through different URL variants.
In the admin panel, you fill in normal fields — title, content, a short description, featured image, keywords. That is all the author does. Everything a search engine needs is generated from those fields by the CMS itself, in the background, in the page's source code.
This includes the title tag, meta description, canonical URL, Open Graph tags (for Facebook and LinkedIn previews), Twitter Card tags, and hreflang (which tells Google that the same content exists in different languages).
The XML sitemap — the file search engines use to discover every page on a site — is generated from published content and stays current as pages are added, edited, or removed. The robots.txt file, which controls how search engines and AI systems are allowed to crawl the site, is managed from the admin panel and comes with sensible defaults, including permissions for AI crawlers (GPTBot, Google-Extended, ClaudeBot) for sites that want to be reachable by language models.
URL paths are cleaned up on input and checked for duplicates across the whole site. When you change the address of a page, a redirect is created automatically from the old address to the new one, so existing external links keep working. When a page, image, file, or form is deleted, every reference to it across the site's content is cleaned up on its own. No broken links appearing in Google Search Console a week later.
Answer Engine Optimisation is the counterpart to SEO for AI-driven search. When someone asks a language model a question, the model draws on training data and, increasingly, on live content from the web. Content structured for AEO is easier for those systems to understand and to quote accurately.
This structured layer is generated automatically, again from the regular content fields.
The admin panel and the public site are available in 31 languages. Each team member picks their own language in their profile. The public site serves content in the visitor's language, linked across translations with a single click.
The technical parts are handled the same way as SEO: hreflang tags for search engines, RSS feeds per language, menus that adapt to the visitor's language. None of it requires plugins or additional configuration.
Every uploaded image is automatically optimised. The CMS generates modern formats (WebP and AVIF) that load faster than traditional JPEG or PNG, plus multiple sizes for different screen widths so mobile phones don't have to download desktop-sized images. Dedicated crops are produced for social previews at the correct proportions. The dominant colour of the image is extracted for smooth loading placeholders. EXIF metadata is removed for privacy, SVG files are cleaned for security. Transparency is preserved across all generated variants, so logos and icons with transparent backgrounds stay transparent instead of picking up a white fill along the way.
Every image is served from a short, clean URL at the root of the domain — for example, yoursite.com/product-hero.png rather than a deep path through system folders. Search engines prefer these short paths for image indexing.
Favicons follow the same principle. You upload a single source image and the system generates the complete set: the classic favicon.ico in multiple sizes, PNGs for browsers and Android, an Apple Touch icon for iPhones, an SVG version, and a web manifest for progressive web apps. All served from standard paths that search engines can actually find — which is where many favicon tools fail, ending up visible only in the browser tab and not in the search results.
The automation above is the technical foundation, but the CMS itself is a complete content platform. A visual page builder lets you assemble pages from blocks — headings, text, images, columns, galleries, sliders, forms, and more — without writing HTML. A form builder covers contact forms, surveys, and subscription forms with thirteen field types, including GDPR consent and star ratings. A newsletter module handles campaigns, subscriber lists, and double opt-in confirmation, so you don't need an external email service. A file manager serves downloads like PDFs or documents with clean URLs and proper metadata.
None of these are separate plugins. They share the same interface, the same visual style, and the same automation layer underneath — so an image selected in the page builder comes from the same optimised library as images in posts, a form created in the form builder uses the same translation groups as a page, a newsletter signup can draw subscribers from any form on the site.
The parts that require judgement stay with the user: what to write, how to structure the site, which of the fifteen built-in designer themes to choose (each with dark and light mode, selected with a single click in the admin panel), how the menu is organised, when to send newsletters. An optional AI assistant is available for content generation, translation, and filling SEO fields — it works with your own API key from Anthropic, DeepSeek, or Google Gemini — but the final decisions belong to the author.
Backups are created with a single button and downloaded as a ZIP. Updates arrive through the admin panel when available; the user reviews and approves them rather than having updates applied silently.
A self-hosted CMS means you own your server, your data, your domain. There is no subscription, no vendor lock-in, no surprise pricing change. The trade-off is that someone has to maintain the server — but with browser-based setup, one-click SSL, and in-panel updates, the maintenance surface is small.
AliothPress is free for personal and non-commercial use, with a commercial licence required for business sites. You can find it at aliothpress.com — the same site runs on the engine it describes.
2026-04-25 11:02:47
Managed WordPress hosting sounds like a great deal — until it isn't. This is the story of migrating a WooCommerce + WPML site off a major managed host, the chaos that followed immediately after, and the hard lessons learned about what managed hosting was silently doing for us that we didn't fully appreciate until it was gone.
The decision wasn't dramatic. It came down to three compounding frustrations:
Cost vs. control. Managed WordPress hosting at the enterprise tier isn't cheap. As traffic and complexity grew, so did the invoice. But the control stayed locked down — no custom server config, limited cache tuning, no ability to see what was actually hitting the server at a low level.
Performance ceiling. A WooCommerce store with WPML (multilingual) generates a lot of unique URLs — filtered shop pages, language variants, paginated archives. The managed host's caching layer was a black box. When performance degraded, the answer was always "upgrade your plan." There was no way to diagnose what was actually happening underneath.
Visibility. When something went wrong, we couldn't see access logs in real time, couldn't inspect PHP worker counts, couldn't adjust server-level settings. Everything was abstracted. The host was the gatekeeper between us and the actual machine.
The move to a self-managed VPS with RunCloud and OpenLiteSpeed (OLS) promised full visibility and control. It delivered on that promise — but it also immediately exposed us to everything the managed host had been silently absorbing on our behalf.
The stack: a self-managed VPS provisioned through RunCloud, OpenLiteSpeed with LSCache, Redis for object caching, and the existing WordPress + WooCommerce + WPML install on top.
The migration itself was technically straightforward: export, import, update DNS. What wasn't straightforward was what we discovered in the database afterward.
During migration, the site temporarily lived on a staging domain (something like mysite.staging.temphost.link). The standard search-replace after migration should catch all references to the old domain and replace them with the new one.
It didn't catch everything.
Several plugins store data in WordPress's wp_options table as serialized PHP arrays. A normal string search-replace on serialized data corrupts it — because serialized strings encode their own length, and changing the URL changes the string length without updating the length prefix.
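To make the failure mode concrete, here is a toy Python sketch (hypothetical URLs, not the real plugin data) of why a plain string replace corrupts PHP-serialized values, and what a length-aware replace has to do instead:

```python
# PHP's serialize() stores each string with an explicit byte length:
#   s:18:"http://old.example";
# A naive replace changes the string but not the length prefix, so
# PHP's unserialize() fails on the row afterwards.
import re

old = "http://old.example"
new = "https://new.example.com"

serialized = f's:{len(old)}:"{old}";'
assert serialized == 's:18:"http://old.example";'

naive = serialized.replace(old, new)
print(naive)  # s:18:"https://new.example.com"; -> length prefix is now wrong

# A length-aware replace (conceptually what --precise does): rewrite the
# prefix to match the new byte length at the same time.
def replace_serialized(s: str, old: str, new: str) -> str:
    pattern = re.compile(r's:(\d+):"(.*?)";', re.DOTALL)
    def fix(m: re.Match) -> str:
        value = m.group(2).replace(old, new)
        return f's:{len(value.encode())}:"{value}";'
    return pattern.sub(fix, s)

print(replace_serialized(serialized, old, new))  # s:23:"https://new.example.com";
```

This is only an illustration of the encoding problem; for real migrations, use WP-CLI as shown below rather than hand-rolled regexes.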
The plugins that caused contamination:
- Several plugin settings rows in wp_options, all with the temp domain baked in for image paths
- Elementor's generated Google Fonts CSS under wp-content/uploads/elementor/google-fonts/css/, with http:// URLs hardcoded and not regenerated after migration

The fix required using WP-CLI's search-replace with the --precise flag, which handles serialized data correctly:
wp search-replace 'mysite.staging.temphost.link' 'mysite.com' --precise --all-tables
For the Elementor font CSS files on disk, a direct find + sed was needed since WP-CLI doesn't touch files:
find /var/www/mysite/wp-content/uploads/elementor/google-fonts/css/ -name "*.css" \
-exec sed -i 's|http://mysite.staging.temphost.link|https://mysite.com|g' {} \;
Despite the site running on HTTPS, Elementor kept generating http:// asset URLs. The reason was that wp-config.php was missing two lines that tell WordPress it's behind HTTPS:
$_SERVER['HTTPS'] = 'on';
define('FORCE_SSL_ADMIN', true);
Without these, WordPress doesn't know the request came in over SSL (especially behind a proxy or load balancer), so dynamic URLs default to http://. A subtle issue that caused a surprisingly large number of mixed content problems.
One of the installed security plugins had added a .htaccess rule that was blocking all PHP file access:
RewriteRule ^.*\.php$ - [F,L,NC]
This rule was intended to block direct PHP file execution in upload directories. But it was placed at the wrong level — it blocked wp-admin, wp-login.php, and every other PHP file on the site. The admin panel was completely inaccessible. The fix was removing that rule from .htaccess manually via SSH before anything else could be done.
The migration was complete. The site was live. And then within hours, the server was at 700%+ CPU — that's 7 full cores pinned on a machine that should have been comfortably handling the traffic.
On managed hosting, this had never happened. Not because the traffic wasn't there — but because the managed host was absorbing it silently. Now it was hitting our VPS directly.
Access log analysis revealed two suspicious IPs hammering the site:
grep "111.88.x.x" /var/log/ols/mysite.com_access.log | awk '{print $7}' | sort | uniq -c | sort -rn | head -30
Both IPs were sending dozens of requests per minute to wp-admin/admin-ajax.php, WooCommerce AJAX endpoints, and Contact Form 7 REST endpoints — the fingerprint of bots probing for vulnerabilities and scraping data.
Fix: blocked in .htaccess at the top of the file, before any WordPress rules:
# Block malicious IPs
<RequireAll>
Require all granted
Require not ip 111.88.x.x
Require not ip 45.77.x.x
</RequireAll>
Blocking the malicious IPs brought load down — but not to safe levels. Something else was still hammering the server. Back to the access logs:
grep "meta-externalagent" /var/log/ols/mysite.com_access.log | awk '{print $7}' | sort | uniq -c | sort -rn | head -30
Facebook's Meta crawler (meta-externalagent/1.1) was systematically crawling every combination of the WooCommerce shop's filter URLs:
/shop/?filter_color=red&filter_size=small
/shop/?filter_color=red&filter_size=medium
/shop/?filter_color=blue&filter_size=small
... hundreds of unique combinations
Here's why this is catastrophic for a WooCommerce + WPML site:
Every unique URL is a cache miss. LSCache serves cached pages instantly with zero PHP. But a cached page is keyed by URL. Each filter combination is a different URL — so each one bypasses the cache entirely, boots WordPress, boots WooCommerce, boots WPML, runs a database query, and renders a response. The crawler was generating thousands of cache misses per hour.
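The combinatorics are easy to underestimate. A toy Python illustration (hypothetical filter values, not the actual store's) of why a crawler walking filter URLs defeats a URL-keyed cache:

```python
from itertools import product

colors = ["red", "blue", "green", "black", "white"]
sizes = ["small", "medium", "large"]
pages = range(1, 5)

cache = {}   # page cache keyed by the full URL, like LSCache
misses = 0
for c, s, p in product(colors, sizes, pages):
    url = f"/shop/?filter_color={c}&filter_size={s}&paged={p}"
    if url not in cache:
        misses += 1           # cache miss: full WordPress + WooCommerce boot
        cache[url] = "rendered page"

print(misses)  # 60 — every combination of just two filters and pagination
```

Two small filters and four pages already yield 60 distinct cache keys; a real store with more attributes multiplies this into the thousands.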
The fix — immediate:
# Block Meta crawler
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} meta-externalagent [NC]
RewriteRule ^ - [F,L]
After adding this rule and restarting OLS:
load average: 0.94 — 4 lsphp processes
From 700%+ CPU to essentially idle. The Meta crawler was the primary load driver the entire time.
The longer-term fix: add the shop's filter URL pattern to robots.txt so crawlers stop attempting them:
User-agent: meta-externalagent
Disallow: /
User-agent: *
Disallow: /shop/?*
This is the part that changed how we think about managed hosting.
The managed host had several layers we never thought about: bot filtering, DDoS mitigation, and crawler management, all running silently in front of the site.
None of this was documented prominently. It was just... happening. Moving to a raw VPS removed all of it at once. The site went from being behind a shield to being fully exposed to the internet with only .htaccess and OLS between it and every bot on the planet.
The visibility was exactly what we wanted — we could finally see everything. But we also now had to handle everything ourselves.
Fixing bots with .htaccess rules is whack-a-mole. Block one IP, another appears. Block one user agent, it rotates. The real fix is a layer in front of the server that handles this at scale before it ever reaches OLS.
Cloudflare's free plan provides bot filtering, rate limiting, and a CDN cache at the edge.
The architecture after adding Cloudflare:
Internet → Cloudflare edge (bot filtering, rate limiting, CDN cache)
→ VPS / OpenLiteSpeed (LSCache, Redis)
→ WordPress / WooCommerce / WPML
Bad traffic is rejected at Cloudflare before it touches the server. The PHP worker pool stays available for real users.
This is what managed hosting was providing implicitly. Cloudflare makes it explicit, configurable, and visible — and the free tier handles the vast majority of what a typical WooCommerce store needs.
With the immediate crisis resolved, we locked down the remaining attack surface:
Security headers via OLS:
X-Frame-Options: SAMEORIGIN
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Strict-Transport-Security: max-age=31536000; includeSubDomains
PHP worker limits — OLS was configured with a hard cap on concurrent PHP workers:
maxConns 8
PHP_LSAPI_CHILDREN=8
Without this cap, a flood of requests spawns unlimited PHP workers and exhausts RAM. With the cap, excess requests queue rather than spawning new processes.
Snapshot immediately after stabilization — once the server was clean and stable, we took a VPS snapshot as a known-good baseline. If anything goes wrong in future, rollback is one click.
1. Managed hosting hides its value until it's gone.
The bot filtering, DDoS mitigation, and crawler management that managed hosts provide are rarely documented but deeply valuable. Budget for replacing that capability explicitly when moving to a VPS.
2. Always use --precise for search-replace on migrated WordPress sites.
Standard search-replace corrupts serialized data. The --precise flag in WP-CLI handles it correctly. Make it a default step in every migration checklist.
3. Cloudflare is not optional for a self-managed WooCommerce store.
Put it in front on day one. Not after you've been attacked. The free plan covers the essentials, and the visibility alone is worth it.
4. WooCommerce filter URLs are a crawler trap.
Any WooCommerce store with faceted filtering generates effectively infinite unique URLs. Configure LSCache to ignore query strings on shop pages, and disallow filter URL patterns in robots.txt before crawlers index them.
5. Access logs are your best friend on a VPS.
The moment something goes wrong, SSH in and read the access log. The answer is almost always there — which IPs, which user agents, which URLs, how many requests per minute. On managed hosting, you often can't do this. On a VPS, it's the first thing you reach for.
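For anyone who prefers a script over remembering the pipeline, here is a rough Python equivalent of the grep/awk one-liners used earlier. It assumes a common combined-log format where the request path is the seventh whitespace-separated field; adjust the index if your server logs differently:

```python
from collections import Counter

def top_paths(log_path: str, needle: str, n: int = 30):
    """Count request paths on log lines containing `needle` (an IP or user agent)."""
    counts = Counter()
    with open(log_path, errors="replace") as f:
        for line in f:
            if needle in line:
                parts = line.split()
                if len(parts) > 6:
                    counts[parts[6]] += 1  # 7th field: the request path
    return counts.most_common(n)

# Example usage against the OLS access log:
# for path, hits in top_paths("/var/log/ols/mysite.com_access.log",
#                             "meta-externalagent"):
#     print(hits, path)
```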
The site runs more reliably now on the VPS than it ever did on managed hosting — and at significantly lower cost. Load averages stay under 1.0 under normal traffic. The Cloudflare layer handles bot traffic before it reaches the server. LSCache and Redis handle the WordPress-level caching. And when something goes wrong, we can actually see it.
The migration pain was real. But it was a one-time cost that permanently increased visibility, control, and resilience. The managed host was comfortable — but comfort was masking problems we couldn't see or fix.
2026-04-25 10:53:42
The first time I really tried to understand sizing in Flexbox, I got stuck. Not because I didn’t understand width and height. Not because I didn’t know flex-grow or flex-shrink. But because everything I thought I knew about sizing suddenly stopped applying.
In normal layout, things feel predictable:
- width and height behave like firm instructions

Then Flexbox shows up… and starts bending those rules.
I remember setting width on a flex item and expecting it to behave normally. It didn’t. Then I tried adjusting height. Still weird.
Then I added align-items, and suddenly things stretched in ways I didn’t expect.
At that point, it felt like:
“Flexbox doesn’t respect width and height.”
But that’s not true.
Flexbox does respect them — just not in the way we’re used to.
Flexbox doesn’t think in horizontal vs vertical
It thinks in main axis vs cross axis
And once I accepted that, things started making sense.
Let’s start simple.
<div style="display: flex;"></div>
That div is still a block element.
display: flex does not change how the container behaves outside.
It only changes how its children are laid out.
This was my first misconception.
Flexbox introduces a new idea: axes.
| flex-direction | Main Axis | Cross Axis |
|---|---|---|
| row | horizontal | vertical |
| column | vertical | horizontal |
This one property controls everything.
And I mean everything.
Once axes are defined, sizing becomes a game between a few properties:

- flex-basis: starting size (main axis)
- flex-grow: how items expand
- flex-shrink: how items shrink
- align-items: cross-axis behavior
- width / height: physical dimensions (sometimes ignored!)

.container {
  display: flex;
  flex-direction: row;
}
This is the main axis now.

So Flexbox asks:

- Is width set? Yes, use it as a starting point
- Is flex-basis set? Yes, use that instead

Then:

- flex-grow distributes extra space
- flex-shrink reduces items when space runs out

The cross axis (height, in a row) is handled differently:

- If height is set, Flexbox uses it
- If not, align-items: stretch (the default) fills the container height
.container {
display: flex;
flex-direction: column;
}
Now:
Everything flips.
Now Flexbox treats height the way it treated width before.
It asks:

- Is height set?
- Is flex-basis set?

Then:

- flex-grow distributes extra space
- flex-shrink reduces items

Now width behaves like height did before:

- If width is set, use it
- If not, align-items: stretch fills the container width
After going back and forth enough times, I realized:
Flexbox doesn’t randomly ignore properties
It just prioritizes based on the axis
Main axis, controlled by flex properties
Cross axis, controlled by alignment
And:
width and height only matter when they align with the axis being calculated
I used to think:
“flex-basis is the minimum size”
That’s wrong.
It’s actually:
The starting size before growing or shrinking
That small misunderstanding caused a lot of confusion for me.
Whenever something looks off, I ask:

- Which axis am I working on?
- What sets the main-axis size (flex-basis / width / height)?
- What controls the cross axis (align-items)?

And suddenly, it’s no longer magic.
Flexbox feels confusing because it asks you to stop thinking in fixed dimensions and start thinking in relationships.
Once you make that mental switch, Flexbox stops fighting you…
and starts working with you.