2026-02-10 12:55:52
Or: How I spent a Saturday building MindfulMapper instead of doing literally anything else
Picture this: You're running a small cafe. You've got your menu in an Excel spreadsheet (because that's what everyone uses, right?). Now you need to get that data into MongoDB for your new web app.
Your options are the usual ones, and none of them are fun.
What I actually wanted: talk to Claude like a human and have it just... work.
That's where MCP (Model Context Protocol) comes in.
MCP is basically a way to give Claude (or any AI) superpowers. Instead of Claude just answering questions, it can actually do things - like reading files, calling APIs, or in my case, importing Excel data into databases.
Think of it like this:
The catch? You have to build the "server" that does the actual work.
I wanted to be able to say:
"Hey Claude, import menu.xlsx into my products collection. Map 'Name (EN)' to name.en and 'Name (TH)' to name.th. Oh, and auto-generate IDs with prefix 'spb'."
And have it... just work.
Pain Point #1: The Dotenv Disaster
My first version used dotenv to load environment variables. Seemed innocent enough:
import dotenv from 'dotenv';
dotenv.config();
Turns out, dotenv prints a helpful message to stdout:
[dotenv@x.x.x] injecting env (4) from .env
Claude Desktop saw this message, tried to parse it as JSON (because MCP uses JSON-RPC), and promptly died. Took me WAY too long to figure this out.
Solution: Either suppress the message or just hardcode the env vars in the Claude Desktop config. I went with option 2.
Pain Point #2: SDK Version Hell
The MCP SDK is evolving fast. Like, really fast. Version 1.26.0 uses completely different syntax than what's in the examples online.
What the examples showed:
server.addTool({
  name: "my_tool",
  description: "Does a thing",
  parameters: z.object({...}),
  execute: async ({...}) => {...}
});
What actually works (v1.26.0):
const server = new Server(
  { name: "my-server", version: "1.0.0" },
  { capabilities: { tools: {} } }
);

server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [...]
}));

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  // Handle tool calls
});
Yeah. Completely different. Spent hours on this one.
Pain Point #3: Auto-Generating IDs
I wanted IDs like spb-0001, spb-0002, etc. Seems simple, right?
The trick is maintaining a counter in MongoDB:
async function getNextId(prefix = 'spb') {
  const counterCollection = db.collection('counters');
  const result = await counterCollection.findOneAndUpdate(
    { _id: 'item_id' },
    { $inc: { seq: 1 } },
    { upsert: true, returnDocument: 'after' }
  );
  const num = result.seq || 1;
  return `${prefix}-${String(num).padStart(4, '0')}`;
}
This ensures IDs stay unique and sequential: the $inc update is atomic, so two concurrent imports can never grab the same number.
Want to map Excel columns to nested MongoDB objects? Easy:
// Excel columns: "Name (EN)", "Name (TH)"
// Mapping: { "name.en": "Name (EN)", "name.th": "Name (TH)" }
// Result in MongoDB:
{
  id: "spb-0001",
  name: {
    en: "Americano",
    th: "อเมริกาโน่"
  }
}
The mapper handles the dot notation automatically.
This is the magic part. Instead of writing code every time, I just tell Claude:
"Import menu.xlsx into products collection. Use prefix 'spb'. Clear existing data."
Claude translates that into the right MCP tool call with the right parameters. It's like having a very patient assistant who never gets tired of your data imports.
I'm using this with MongoDB Atlas (cloud) for a real cafe menu system. The fact that it works reliably enough for production use still surprises me.
If you want to try it yourself:
git clone https://github.com/kie-sp/mindful-mapper.git
cd mindful-mapper
npm install
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "mindful-mapper": {
      "command": "node",
      "args": ["/full/path/to/mindful-mapper/upload-excel.js"],
      "env": {
        "MONGODB_URI": "your-mongodb-connection-string",
        "MONGODB_DB_NAME": "your_db",
        "ID_PREFIX": "spb"
      }
    }
  }
}
Restart Claude Desktop, then:
"Import /path/to/menu.xlsx into products collection. Use prefix 'spb'."
That's it.
The SDK is changing fast. Code from 2 months ago might not work today. Always check the version you're using.
Your server crashes silently. No stack traces in Claude. Your only friend is:
node your-server.js 2>&1
And the logs at:
~/Library/Logs/Claude/mcp*.log
Once it works, it's magic. I went from dreading data imports to actually enjoying them (well, not dreading them at least).
There are still limitations I want to fix.
But honestly? For 90% of my use cases, it works perfectly as-is.
If you build something cool with it, let me know! Or if you hit the same pain points I did, at least now you know you're not alone.
Building MCP servers is weird. It's not quite backend development, not quite AI engineering. It's this new thing where you're building tools for an AI to use on your behalf.
But when it works? When you can just casually tell Claude to handle your data imports while you go make coffee? That's pretty cool.
If you end up using this or building something similar, I'd love to hear about it! Feel free to open an issue on GitHub or reach out.
Happy importing! 🎉
Built with: Node.js, MongoDB, MCP SDK, and an unreasonable amount of trial and error
2026-02-10 12:49:44
You did everything right.
Split the database into 16 shards. Distributed users evenly by user_id hash. Each shard handles 6.25% of traffic. Perfect balance.
Then Black Friday happened.
One celebrity with 50 million followers posted about your product. The celebrity's data lives on... shard 7. And all 50 million followers just showed up to read it.
Shard 7 is now handling 80% of your traffic. The other 15 shards are idle. Shard 7 is melting.
Welcome to the Hot Partition Problem.
Hash-based sharding looks perfect on paper:
def get_shard(user_id):
    return hash(user_id) % num_shards
Uniform distribution. Simple logic. What could go wrong?
Everything. Because real-world access patterns don't care about your hash function.
Scenario 1: Celebrity Effect
A viral post from one user means millions of reads on that user's shard. Followers are distributed across shards, but the content they're accessing isn't.
Scenario 2: Time-Based Clustering
Users who signed up on the same day often have sequential IDs. They also often have similar usage patterns. Your "random" distribution isn't random at all.
Scenario 3: Geographic Hotspots
Morning in Tokyo means heavy traffic from Japanese users. If your sharding key correlates with geography, one shard gets hammered while others sleep.
You can't fix what you can't see.
Monitor per-shard metrics:
Shard 1: CPU 15% | QPS 1,200 | Latency P99 45ms
Shard 2: CPU 12% | QPS 1,100 | Latency P99 42ms
Shard 7: CPU 94% | QPS 18,500 | Latency P99 890ms ← PROBLEM
Shard 8: CPU 18% | QPS 1,400 | Latency P99 51ms
Set up alerts for any shard whose CPU, QPS, or P99 latency runs several times above the fleet median.
Track hot keys:
Log the most frequently accessed keys per shard. The top 1% of keys often cause 50% of load.
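A low-tech version of that tracking can be a sampled in-process counter. This is a sketch only, assuming you can hook your query path; swap in Redis or your database's own statistics for anything serious:

import time
from collections import Counter

# Rolling counter of (shard, key) accesses within a short window.
hot_key_counts = Counter()
WINDOW_SECONDS = 60
window_start = time.time()

def record_access(shard_id, key):
    """Call from the query path, ideally sampled (e.g. 1 request in 100)."""
    global window_start
    hot_key_counts[(shard_id, key)] += 1
    if time.time() - window_start > WINDOW_SECONDS:
        report_hot_keys()
        hot_key_counts.clear()
        window_start = time.time()

def report_hot_keys(top_n=20):
    total = sum(hot_key_counts.values()) or 1
    for (shard_id, key), count in hot_key_counts.most_common(top_n):
        share = 100 * count / total
        print(f"shard={shard_id} key={key} -> {share:.1f}% of sampled traffic")

Even a crude report like this will surface the handful of keys that deserve special treatment.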
For keys you know will be hot, add a random suffix:
def get_shard_for_post(post_id, is_viral=False):
    if is_viral:
        # Spread across multiple shards: writes fan out to every suffix,
        # reads pick one suffix at random.
        random_suffix = random.randint(0, 9)
        return hash(f"{post_id}:{random_suffix}") % num_shards
    else:
        return hash(post_id) % num_shards
A viral post now spreads across 10 shards instead of 1. Reads are distributed. Writes need to fan out, but that's usually acceptable.
The tricky part: knowing which keys will be hot before they're hot.
Accept that some data is special. Give it special treatment.
HOT_USERS = {"celebrity_1", "celebrity_2", "viral_brand"}

def get_shard(user_id):
    if user_id in HOT_USERS:
        return HOT_SHARD_CLUSTER  # Separate, beefier infrastructure
    return hash(user_id) % num_shards
The hot shard cluster has more replicas, more CPU, more memory. It's designed to handle disproportionate load.
Update the HOT_USERS list dynamically based on follower count or recent engagement metrics.
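That refresh can be as boring as a timer. Here's a sketch where fetch_hot_accounts() is a placeholder for whatever metrics store you actually query:

import threading

HOT_USERS = set()
REFRESH_SECONDS = 300

def fetch_hot_accounts():
    # Placeholder: query your analytics store for accounts whose follower
    # count or recent request rate crosses whatever threshold you pick.
    return {"celebrity_1", "celebrity_2", "viral_brand"}

def refresh_hot_users():
    global HOT_USERS
    HOT_USERS = fetch_hot_accounts()
    threading.Timer(REFRESH_SECONDS, refresh_hot_users).start()  # re-arm the timer

refresh_hot_users()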
Don't let hot reads hit the database at all.
def get_post(post_id):
    # Check cache first
    cached = redis.get(f"post:{post_id}")
    if cached:
        return cached

    # Cache miss - hit database
    post = database.query(post_id)

    # Cache with TTL based on hotness
    ttl = 60 if is_hot(post_id) else 300
    redis.setex(f"post:{post_id}", ttl, post)
    return post
For viral content, a 60-second cache means the database sees 1 query per minute instead of 10,000 queries per second.
Shorter TTL for hot content sounds counterintuitive, but it ensures fresher data for content people actually care about.
Scale reads horizontally within each shard:
Shard 7 Primary (writes)
├── Replica 7a (reads)
├── Replica 7b (reads)
├── Replica 7c (reads)
└── Replica 7d (reads)
When shard 7 gets hot, spin up more read replicas for that specific shard. Other shards stay lean.
This works well for read-heavy hotspots. Write-heavy hotspots need different solutions.
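The routing side is straightforward. Here's a sketch with a hypothetical per-shard connection registry; real code would hold driver connections rather than host strings:

import random

# Hypothetical registry: one primary plus read replicas per shard.
SHARDS = {
    7: {
        "primary": "shard7-primary:5432",
        "replicas": ["shard7-a:5432", "shard7-b:5432",
                     "shard7-c:5432", "shard7-d:5432"],
    },
    # ... the other shards, usually with fewer replicas
}

def endpoint_for(shard_id, for_write=False):
    shard = SHARDS[shard_id]
    if for_write or not shard["replicas"]:
        return shard["primary"]               # writes always hit the primary
    return random.choice(shard["replicas"])   # reads spread across replicas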
Don't shard on a single dimension:
# Bad: Single key sharding
shard = hash(user_id) % num_shards
# Better: Composite key
shard = hash(f"{user_id}:{content_type}:{date}") % num_shards
Composite keys add entropy. A celebrity's posts are now spread across shards by date, not concentrated in one place.
The trade-off: queries that span multiple values need to hit multiple shards. Design your access patterns accordingly.
When a partition gets hot, split it:
Before:
Shard 7 handles hash range [0.4375, 0.5000]
After split:
Shard 7a handles [0.4375, 0.4688]
Shard 7b handles [0.4688, 0.5000]
Modern distributed databases like CockroachDB and TiDB do this automatically. If you're running your own sharding, you'll need to build this logic.
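If you do roll your own, the range bookkeeping at the core of a split is small. This simplified sketch handles only the math, not the data movement:

# Each shard owns a half-open slice [low, high) of a hash space in [0, 1).
shard_ranges = {"shard_7": (0.4375, 0.5000)}

def split_shard(name, ranges):
    low, high = ranges.pop(name)
    mid = (low + high) / 2
    ranges[f"{name}a"] = (low, mid)
    ranges[f"{name}b"] = (mid, high)
    return f"{name}a", f"{name}b"

def shard_for(key_hash, ranges):
    # key_hash is the key's hash normalized into [0, 1).
    for name, (low, high) in ranges.items():
        if low <= key_hash < high:
            return name
    raise KeyError("no shard owns this hash value")

split_shard("shard_7", shard_ranges)
print(shard_for(0.47, shard_ranges))   # -> shard_7b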
Key considerations: migrating the data without downtime, updating routing atomically, and keeping reads consistent during the cutover.
Before your next traffic spike:
1. Know your hot keys
Run analytics on access patterns. Which users, which content, which time periods drive disproportionate load?
2. Design for celebrities
If your product could have viral users, plan for them. Don't wait until you have one.
3. Monitor per-shard, not just aggregate
Average latency across 16 shards hides the shard that's dying. Track each one individually.
4. Test with realistic skew
Load tests with uniform distribution prove nothing. Simulate 80% of traffic hitting 5% of keys (a generator for that kind of skew is sketched after this list).
5. Have a manual override
When detection fails, you need a way to manually mark keys as hot and reroute them.
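For point 4, a skewed key generator is all it takes to make a load test honest. A minimal sketch; feed it into whatever load tool you already use:

import random

def skewed_key_stream(num_keys=100_000, hot_fraction=0.05, hot_traffic=0.80):
    """Yield keys where ~80% of requests hit the hottest ~5% of the keyspace."""
    hot_cutoff = int(num_keys * hot_fraction)
    while True:
        if random.random() < hot_traffic:
            yield f"user_{random.randrange(hot_cutoff)}"            # hot 5%
        else:
            yield f"user_{random.randrange(hot_cutoff, num_keys)}"  # cold 95%

# Drive your load generator with this instead of uniformly random keys.
keys = skewed_key_stream()
print([next(keys) for _ in range(5)])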
Perfect distribution doesn't exist in production.
Users don't behave uniformly. Content doesn't go viral uniformly. Time zones don't align uniformly.
Your sharding strategy needs to handle the 99th percentile, not the average. One hot partition can take down your entire system while 15 other shards sit idle.
Design for imbalance. Monitor for hotspots. Have a plan before the celebrity tweets.
For comprehensive patterns on building resilient distributed databases—including sharding strategies, replication topologies, and connection management for high-traffic platforms:
→ Enterprise Distributed Systems Architecture Guide
16 shards. Perfect hashing. One celebrity. One fire.
2026-02-10 12:45:04
Building agentic workflows from scratch is hard.
You have to manage context across turns, orchestrate tools and commands, route between models, integrate MCP servers, and think through permissions, safety boundaries, and failure modes. Even before you reach your actual product logic, you've already built a small platform.
Most teams don't want to build a platform. They want to build a product.
This is the core problem both GitHub Copilot SDK and Azure AI Foundry Agents solve — but they solve it for different audiences, in different contexts, with different trade-offs.
After talking to customers and internal teams, one theme keeps coming up:
"We want to build AI agents. Do we use the GitHub Copilot SDK or Azure AI Foundry? What's the difference? Can we use both?"
The short answer: they solve different problems. But the overlap in marketing language ("agents," "AI," "SDK") makes it murky. Let's fix that.
The GitHub Copilot SDK (now in technical preview) removes the burden of building your own agent infrastructure. It lets you take the same Copilot agentic core that powers GitHub Copilot CLI and embed it in any application.
What the SDK handles for you:
| You used to build this yourself | Copilot SDK handles it |
|---|---|
| Context management across turns | ✅ Maintained automatically |
| Tool/command orchestration | ✅ Automated invocation and chaining |
| MCP server integration | ✅ Built-in protocol support |
| Model routing (GPT-4.1, etc.) | ✅ Dynamic, policy-based routing |
| Planning and execution loops | ✅ Multi-step reasoning out of the box |
| Permissions and safety boundaries | ✅ Enforced by the runtime |
| Streaming responses | ✅ First-class support |
| Auth and session lifecycle | ✅ Managed under the hood |
What you focus on: Your domain logic. Your tools. Your product.
Available in: TypeScript/Node.js, Python, Go, .NET
Designed for: Developer-facing applications — Copilot Extensions, dev portals, CLI tools, code review bots, internal productivity tools.
Without the SDK, you're stitching together an LLM API, a context window manager, a tool registry, an execution loop, error handling, and auth — before writing a single line of product code. The SDK collapses all of that into an import.
Your App
│
├── Your product logic + custom tools
├── Copilot SDK (agentic core)
│     │
│     └──── handles context, orchestration,
│           MCP, model routing, safety
│               │
│               └──── HTTPS ──▶ GitHub Cloud (inference)
│
└── Your UI / API / Extension
Azure AI Foundry is a full platform for building, deploying, and governing enterprise AI agents for any business domain — not just developer workflows.
What Foundry gives you: multi-agent orchestration, built-in data connectors (SharePoint, SQL, M365, Logic Apps), bring-your-own-model flexibility, and full Azure governance with network isolation.
Designed for: Business-wide automation — customer support agents, document processing pipelines, HR/finance/IT workflows, knowledge management with RAG.
(On the Copilot SDK side, an @my-tool extension for Copilot Chat is a first-class use case.)
┌──────────────────────────────────────────────┐
│ ENTERPRISE AI STACK │
│ │
│ ┌───────────────────┐ ┌──────────────────┐ │
│ │ Developer Layer │ │ Business Layer │ │
│ │ │ │ │ │
│ │ Copilot SDK │ │ Azure AI │ │
│ │ • @extensions │ │ Foundry Agents │ │
│ │ • Dev portals │ │ • Product apps │ │
│ │ • CLI agents │ │ • Biz pipelines │ │
│ │ • Ops workflows │ │ • Ops workflows │ │
│ └───────────────────┘ └──────────────────┘│
│ │
│ Shared: Microsoft identity, models, trust │
└──────────────────────────────────────────────┘
| Dimension | Copilot SDK | Azure AI Foundry |
|---|---|---|
| Time to first agent | Fast — import SDK, register tools, go | Slower — more config, but more control |
| Agentic core included? | ✅ Yes — same engine as Copilot CLI | You build or configure your own agent logic |
| Model choice | GitHub-hosted models (GPT-5.3, etc.) + BYO Model | BYO model, Azure OpenAI, open-source, frontier |
| MCP support | ✅ Built-in | Supported via configuration |
| Multi-agent orchestration | Early / evolving | First-class, production-ready |
| Data connectors | You build your own (via custom tools) | Built-in (SharePoint, SQL, M365, Logic Apps) |
| Deployment surface | Your app, VS Code, GitHub.com | Teams, M365 Copilot, web apps, containers |
| Network isolation (VNet) | Not available | ✅ Full VNet / private endpoint support |
| Customer-managed keys | Microsoft-managed | ✅ Azure Key Vault |
| Infrastructure ownership | GitHub-managed (less to operate) | Your Azure subscription (more control, more ops) |
| Billing model | Per Copilot seat | Azure consumption-based |
| Best for | Developer tools and workflows | Business-wide automation at scale |
There's a misconception that GitHub is only "good enough" for compliance compared to Azure. That's not accurate.
GitHub is an enterprise-grade platform trusted by the world's largest companies and governments. The compliance posture is strong and getting stronger:
| Certification / Control | GitHub (incl. Copilot) |
|---|---|
| SOC 2 Type II | ✅ Available (Copilot Business & Enterprise in scope) |
| SOC 1 Type II | ✅ Available |
| ISO/IEC 27001:2022 | ✅ Copilot in scope |
| FedRAMP Moderate | 🔄 Actively pursuing |
| IP Indemnification | ✅ Included for enterprise plans |
| No training on your code | ✅ Copilot Business/Enterprise data is never used for model training |
| Duplicate detection filtering | ✅ Available to reduce IP risk |
GitHub's story is the end-to-end developer platform — code, security, CI/CD, and now AI — all enterprise-ready under one roof. The Copilot SDK extends that story into your own applications.
On data residency specifically: yes, Azure AI Foundry offers more granular region-bound controls (VNet isolation, customer-managed keys, explicit region pinning). This matters for certain regulated workloads. But for many enterprises — especially in the US and Canada — data-in-transit concerns with GitHub Copilot are well-addressed by existing encryption, privacy controls, and contractual terms. Data residency is worth understanding, but it shouldn't be the sole deciding factor. Evaluate it alongside your actual regulatory requirements, not as a blanket blocker.
What are you building?
│
├── A developer tool, extension, or dev/ops workflow?
│   └── ✅ GitHub Copilot SDK
│       • You get a production-tested agentic core out of the box
│       • Ship fast, stay in the GitHub ecosystem
│       • Enterprise-ready compliance
│
├── A product, or a business/ops/customer-facing agent?
│   └── ✅ Azure AI Foundry
│       • Multi-agent orchestration, data connectors, BYO model
│       • Full Azure governance and network isolation
│
├── Both?
│   └── ✅ Use both — they're complementary
│
└── Not sure yet?
    └── Start with the user:
        • Developers or internal teams? → Copilot SDK
        • Customer-facing or non-developer users? → Foundry
The Copilot SDK removes the hardest part of building agents. Context, orchestration, MCP, model routing, safety — it's all handled. You focus on your product.
Foundry is for the business. When your agents power a user-facing product, orchestrate across departments, or serve non-developer users, Foundry is the right tool.
GitHub is enterprise-ready. SOC 2, ISO 27001, IP indemnification, no training on your data.
They're complementary, not competing. Copilot SDK for the developer/ops layer. Foundry for the product/business layer. Same Microsoft ecosystem underneath.
Start with the user, not the technology. Who is the agent serving? That answer picks your tool.
I’m Ve Sharma, a Solution Engineer at Microsoft focusing on Cloud & AI working on GitHub Copilot. I help developers become AI-native developers and optimize the SDLC for teams. I also make great memes. Find me on LinkedIn or GitHub.
2026-02-10 12:40:27
Picture this: It's late 2025, and I'm trying to plan my time off for 2026. Simple task, right? Just look up the bank holidays and mark my calendar.
Except it wasn't simple at all.
Every website I visited was a minefield of popups, banners, and ads stacked on ads.
I literally just needed a list of dates. That's it. No fluff, no fuss, no "Sign up for our premium calendar experience!"
After closing my tenth ad popup, I had that moment every developer knows well: "I could build this better in an afternoon."
So I did. Meet HolidaysCalendar.net — a straightforward, clean holidays calendar that respects your time and your eyeballs.
I didn't need anything fancy here. The goal was speed and simplicity.
One thing was non-negotiable: no ads. The whole point was escaping ad hell, so there are none. Not now, not ever.
I'm a dark mode enthusiast, so this was a must-have. Using Tailwind's dark mode utilities, I built a theme that's easy on the eyes whether you're planning vacation days at midnight or during your lunch break.
The implementation is native to the system preferences — if your device is in dark mode, the site follows automatically. No toggle needed (though I might add one later for the rebels who want light mode at 2 AM).
I'm pretty proud of this one: the site scores perfect 100s across the board.
It loads fast, it's accessible, and it does exactly what it says on the tin.
Honestly? Because I think we need more of this on the internet.
Not every website needs to be monetized to death. Sometimes a tool is just a tool — something useful you build because it solves a real problem you (and probably others) have.
If you've ever rage-closed a tab because you couldn't find simple information through the ad chaos, this site is for you.
Right now, it covers US and UK bank holidays. I'm thinking about adding more countries and a few small extras, but I'm keeping it lean. The whole point is simplicity.
Head over to HolidaysCalendar.net and let me know what you think — especially about the dark mode! I'd love feedback from fellow developers and regular users alike.
And if you're planning your 2026 PTO right now, you're welcome. 😊
Have thoughts on the dark mode implementation or ideas for features? I'm all ears. This is built for real people with real planning needs — your feedback makes it better.
2026-02-10 12:31:00
In this tutorial, we’ll build a Python application that:
Lets users monitor folders.
Categorizes files by size.
Organizes files into folders automatically.
Provides a live file preview with filtering and sorting.
We’ll be using Tkinter, watchdog, threading, and tkinterdnd2 for drag-and-drop support, plus sv_ttk for theming.
Step 1: Import Required Libraries
First, we need to import the libraries for GUI, file handling, and background monitoring.
import os
import json
import shutil
import tkinter as tk
from tkinter import ttk, filedialog, messagebox
from tkinterdnd2 import DND_FILES, TkinterDnD
from threading import Thread
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler
import sv_ttk
Explanation:
os, shutil → for interacting with the file system.
json → to save/load folder configurations.
tkinter, ttk → GUI elements.
tkinterdnd2 → drag-and-drop functionality.
watchdog → monitors folders in real-time.
threading → allows long-running tasks without freezing the GUI.
sv_ttk → applies the modern Sun Valley theme to ttk widgets.
Step 2: Configuration & Helper Functions
We’ll store monitored folders in a JSON file and create some helper functions.
CONFIG_FILE = "folders_data.json"
def load_folders():
    if os.path.exists(CONFIG_FILE):
        with open(CONFIG_FILE, "r", encoding="utf-8") as f:
            return json.load(f)
    return []

def save_folders(folders_data):
    with open(CONFIG_FILE, "w", encoding="utf-8") as f:
        json.dump(folders_data, f, ensure_ascii=False, indent=4)
Explanation:
load_folders() → reads previously monitored folders from JSON.
save_folders() → writes current folders to JSON for persistence.
We also need some utility functions for file handling:
def sanitize_folder_name(name):
    return "".join(c for c in name if c not in r'<>:"/\|?*')

def format_size(size_bytes):
    if size_bytes < 1024:
        return f"{size_bytes} B"
    elif size_bytes < 1024**2:
        return f"{size_bytes/1024:.2f} KB"
    elif size_bytes < 1024**3:
        return f"{size_bytes/1024**2:.2f} MB"
    else:
        return f"{size_bytes/1024**3:.2f} GB"

def categorize_file(size):
    if size < 1024*1024:
        return "Small_1MB"
    elif size < 10*1024*1024:
        return "Medium_1-10MB"
    else:
        return "Large_10MB_plus"
Explanation:
sanitize_folder_name() → removes illegal characters for folder names.
format_size() → converts bytes into human-readable format.
categorize_file() → groups files by size into small, medium, or large.
Step 3: Create the Main Application Window
We use TkinterDnD for drag-and-drop support.
root = TkinterDnD.Tk()
root.title("📂 File Size Organizer Pro - Filter & Sort")
root.geometry("1300x620")
Explanation:
TkinterDnD.Tk() → initializes the main window with drag-and-drop support.
title and geometry → set the window title and size.
Step 4: Global Variables
We’ll use some global variables for storing folder data, filtered files, and observer state.
folders_data = load_folders()
last_operation = []
combined_files = []
filtered_files = []
observer = None
current_filter = tk.StringVar(value="All")
current_sort = tk.StringVar(value="Name")
watcher_paused = False
Explanation:
folders_data → all folders being monitored.
combined_files → all files scanned from these folders.
filtered_files → filtered/sorted files for display.
observer → watchdog observer for real-time monitoring.
current_filter & current_sort → track GUI dropdown selections.
last_operation → records the most recent batch of file moves so they can be undone.
watcher_paused → temporarily silences the watcher while files are being moved.
Step 5: Add & Remove Folders
We’ll create functions to add or remove folders.
def add_folder(path):
    path = path.strip()
    if not path:
        messagebox.showwarning("Invalid Folder", "Please select or enter a folder path.")
        return
    path = os.path.abspath(path)
    if not os.path.isdir(path):
        messagebox.showwarning("Invalid Folder", "The selected path is not a valid folder.")
        return
    if path in folders_data:
        messagebox.showinfo("Already Added", "This folder is already being monitored.")
        return
    folders_data.append(path)
    save_folders(folders_data)
    start_watcher(path)
    refresh_combined_preview()
Explanation:
Validates the folder path.
Prevents duplicates.
Saves to JSON and starts a watcher for real-time updates.
def remove_folder(path):
    path = os.path.abspath(path.strip())
    if path in folders_data:
        folders_data.remove(path)
        save_folders(folders_data)
        refresh_combined_preview()
        set_status(f"Removed folder: {path}")
    else:
        messagebox.showwarning("Not Found", "Folder not found in the list.")
Explanation:
Removes a folder from monitoring and refreshes the preview.
Step 6: Scan Folders & Refresh Preview
def scan_folder(folder_path):
    files = []
    for f in os.listdir(folder_path):
        path = os.path.join(folder_path, f)
        if os.path.isfile(path):
            try:
                size = os.path.getsize(path)
            except (FileNotFoundError, PermissionError):
                continue
            category = sanitize_folder_name(categorize_file(size))
            files.append((folder_path, f, size, category))
    return files

def refresh_combined_preview():
    global combined_files
    combined_files = []
    for folder in folders_data:
        combined_files.extend(scan_folder(folder))
    apply_filter_sort()
Explanation:
scan_folder() → reads files in a folder and categorizes them.
refresh_combined_preview() → combines files from all folders and applies filter/sort.
Step 7: Filter & Sort Files
def apply_filter_sort():
    global filtered_files
    if current_filter.get() == "All":
        filtered_files = combined_files.copy()
    else:
        filtered_files = [f for f in combined_files if f[3] == current_filter.get()]
    sort_key = current_sort.get()
    if sort_key == "Name":
        filtered_files.sort(key=lambda x: x[1].lower())
    elif sort_key == "Size":
        filtered_files.sort(key=lambda x: x[2])
    elif sort_key == "Folder":
        filtered_files.sort(key=lambda x: x[0].lower())
    elif sort_key == "Category":
        filtered_files.sort(key=lambda x: x[3].lower())
    update_file_tree()
Explanation:
Filters files based on category.
Sorts files based on Name, Size, Folder, or Category.
Step 8: Display Files in GUI
def update_file_tree():
    file_tree.delete(*file_tree.get_children())
    counts = {"Small_1MB": 0, "Medium_1-10MB": 0, "Large_10MB_plus": 0}
    for folder, f, size, category in filtered_files:
        file_tree.insert("", "end", values=(folder, f, format_size(size), category))
        counts[category] += 1
    total_files_var.set(
        f"Total Files: {len(filtered_files)} | Small: {counts['Small_1MB']} | "
        f"Medium: {counts['Medium_1-10MB']} | Large: {counts['Large_10MB_plus']}"
    )
Explanation:
Updates the Treeview widget with current files.
Shows counts by category.
Step 9: Organize & Undo Files
def organize_files_thread():
    global watcher_paused
    watcher_paused = True
    last_operation.clear()  # remember this run's moves so it can be undone
    for folder, f, size, category in combined_files:
        src = os.path.join(folder, f)
        dst_folder = os.path.join(folder, category)
        os.makedirs(dst_folder, exist_ok=True)
        dst = os.path.join(dst_folder, f)
        try:
            shutil.move(src, dst)
            last_operation.append((src, dst))
        except Exception:
            pass
    watcher_paused = False
    refresh_combined_preview()
Explanation:
Moves files into categorized folders.
Records each (src, dst) move in last_operation so the operation can be undone.
Pauses the watcher to avoid triggering real-time updates during the move.
def undo_last_operation():
    # Reverse the last organize operation
    pass  # Implemented below using the moves recorded in last_operation
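For completeness, here's one way to implement it. It relies on the (src, dst) pairs that organize_files_thread() records in last_operation, and on the set_status() helper defined in the GUI layout step:

def undo_last_operation():
    global watcher_paused
    watcher_paused = True
    # Walk the recorded moves in reverse and put every file back.
    for src, dst in reversed(last_operation):
        try:
            shutil.move(dst, src)
        except Exception:
            pass
    last_operation.clear()
    watcher_paused = False
    refresh_combined_preview()
    set_status("Last operation undone.")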
Step 10: Real-time Folder Watching
class FolderEventHandler(FileSystemEventHandler):
    def on_any_event(self, event):
        if watcher_paused:
            return
        root.after(200, refresh_combined_preview)

def start_watcher(folder_path):
    global observer
    if observer is None:
        observer = Observer()
        observer.start()
    handler = FolderEventHandler()
    observer.schedule(handler, folder_path, recursive=True)
Explanation:
Uses watchdog to detect any file changes.
Updates GUI automatically.
Step 11: Drag & Drop Support
def drop(event):
    paths = root.tk.splitlist(event.data)
    for path in paths:
        if os.path.isdir(path):
            add_folder(path)

root.drop_target_register(DND_FILES)
root.dnd_bind('<<Drop>>', drop)
Explanation:
Enables dragging folders into the app.
Automatically adds dropped folders.
Step 12: GUI Layout
Finally, set up the GUI frames, buttons, Treeview, filters, progress bar, and status bar. The full layout is too long to walk through line by line, so the version below is trimmed to the essentials.
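Here's a minimal sketch that wires up everything the earlier steps rely on (file_tree, total_files_var, and set_status()). The progress bar and a per-folder list are left out; add them the same way:

# --- Controls -------------------------------------------------------------
top = ttk.Frame(root, padding=10)
top.pack(fill="x")

ttk.Button(top, text="Add Folder",
           command=lambda: add_folder(filedialog.askdirectory())).pack(side="left")
ttk.Button(top, text="Organize Files",
           command=lambda: Thread(target=organize_files_thread, daemon=True).start()
           ).pack(side="left", padx=5)
ttk.Button(top, text="Undo", command=undo_last_operation).pack(side="left")

ttk.Label(top, text="Filter:").pack(side="left", padx=(20, 2))
ttk.Combobox(top, textvariable=current_filter, state="readonly", width=16,
             values=["All", "Small_1MB", "Medium_1-10MB", "Large_10MB_plus"]).pack(side="left")
ttk.Label(top, text="Sort by:").pack(side="left", padx=(10, 2))
ttk.Combobox(top, textvariable=current_sort, state="readonly", width=10,
             values=["Name", "Size", "Folder", "Category"]).pack(side="left")
current_filter.trace_add("write", lambda *args: apply_filter_sort())
current_sort.trace_add("write", lambda *args: apply_filter_sort())

# --- File preview ---------------------------------------------------------
columns = ("Folder", "File", "Size", "Category")
file_tree = ttk.Treeview(root, columns=columns, show="headings")
for col in columns:
    file_tree.heading(col, text=col)
file_tree.pack(fill="both", expand=True, padx=10, pady=10)

# --- Status bar -----------------------------------------------------------
total_files_var = tk.StringVar(value="Total Files: 0")
status_var = tk.StringVar(value="Ready")
ttk.Label(root, textvariable=total_files_var).pack(anchor="w", padx=10)
ttk.Label(root, textvariable=status_var).pack(anchor="w", padx=10, pady=(0, 10))

def set_status(message):
    status_var.set(message)

# --- Start up -------------------------------------------------------------
sv_ttk.set_theme("dark")          # apply the Sun Valley theme imported earlier
for folder in folders_data:       # watch folders loaded from the JSON config
    start_watcher(folder)
refresh_combined_preview()
root.mainloop()

Explanation:
Creates the Treeview (file_tree), the counters label (total_files_var), and the status bar behind set_status().
Wires the filter and sort dropdowns to apply_filter_sort().
Starts watchers for previously saved folders, applies the theme, and hands control to mainloop().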
2026-02-10 12:22:43
The "AHA" moment for me wasn't when I first used an AI coder. It was when I realized I was fragmented.
I’d do a deep architectural brainstorm in Claude, switch to Cursor to implement, and then jump into Windsurf to use its agentic flow. But Claude didn't know what Cursor did, and Cursor had no idea about the architectural epiphany I just had in Claude.
I was manually copy-pasting my own "brain" across tabs.
Nucleus is an MCP (Model Context Protocol) Recursive Aggregator.
Instead of treating MCP servers as individual plugins, Nucleus treats them as a Unified Control Plane. It creates a local-first memory layer (we call them Engrams) that stays on your hardware.
When I teach Claude something, it writes to the Nucleus ledger. When I open Cursor, Cursor reads that same ledger. The context is no longer session-bound; it's persistent and sovereign.
We’ve all seen the security warnings about giving agents full filesystem access. Nucleus solves this with a Hypervisor layer.
We just open-sourced the v1.0.1 (Sovereign) release on GitHub.
If you're tired of being a "context-courier" between agents, come check it out.
👉 Nucleus on GitHub
👉 PyPI: nucleus-mcp
👉 See it in action (58s demo): 
Let’s stop building silos and start building a shared brain. 🚀🌕