2026-04-13 01:03:44
Like any Linux nerd I have been into ricing for a while, stalking the good folks over on r/unixporn and making my own Hyprland rice.
But for the past 6 months or so I've been using KDE.
And for the past 2 weeks I've gone back from Wayland to X11 (it's just more stable).
This just so happened to coincide with me discovering this community:
https://www.reddit.com/r/MoeDesktop/
It's like r/unixporn but for classic 2005 otaku desktops, evoking the feeling of Moe (pronounced mo-eh).

Which just means cuteness (or something like that) in Japanese
So I thought I would try my hand at a Moe rice.
My first stop was this KDE Aero Plasma theme, since I'm a big big fan of the Vista aesthetic.
I installed it in one Paru command and it worked 🥳 My desktop does look like Vista now!!
I'm using this wallpaper from Nekopara:

It's really cute!
There are things you can do in Python to analyse the image's colours using ImageMagick and match your colour scheme to it.
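I never wired this up properly, but the idea can be sketched in a few lines. Assumptions: ImageMagick's `magick` binary is on your PATH, and the palette is just an example, not my actual colour scheme:

```python
# Sketch: get the wallpaper's average colour via ImageMagick, then snap
# it to the nearest entry in a base16-style palette.
import re
import subprocess

def average_colour(image_path: str) -> str:
    """Shrink the image to 1x1 and read the resulting pixel's hex code."""
    out = subprocess.run(
        ["magick", image_path, "-resize", "1x1!", "txt:-"],
        capture_output=True, text=True, check=True,
    ).stdout
    return re.search(r"#([0-9A-Fa-f]{6})", out).group(1).lower()

def closest(hex_colour: str, palette: list[str]) -> str:
    """Return the palette entry nearest to hex_colour (squared RGB distance)."""
    def rgb(h: str) -> list[int]:
        return [int(h[i:i + 2], 16) for i in (0, 2, 4)]
    target = rgb(hex_colour)
    return min(palette, key=lambda p: sum((a - b) ** 2
                                          for a, b in zip(rgb(p), target)))
```

From there it's just templating the snapped colours into your theme config.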

So after Aero and my wallpaper I did this!
In KDE you can edit your cursor super easily
I went into the KDE cursor store and got my Miku cursor!

Like any good Otaku and doujinsoft fan I needed widgets.

Remember how cool it was in Vista!?
So I looked at the defaults.
I added the Photo Gallery widget

Because I take so many screenshots of anime / visual novels I just pointed it at my folder.
In this game her eyes reminded me of Oshi No Ko!

Second widget is weather!

I live in the UK
Somedays it's super sunny, the next it's raining.
Last week I went to work in a coat. Everyone else was in cute dresses and skirts.... I was the odd one out :/
Never again since I added this widget!
But I had the thirst for more. MORE MOE WIDGETS!
So like any good 2026 coder I asked my AI bot to build them for me lol
Like any late 20s white girl that visited Japan, I miss it dearly
I told AI to build me a widget where once a day it'll show me a location from Japan, a photo of it, the Wikipedia description and let me click on it to go to Wikipedia:
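The widget itself is a Plasma thing, but the data half is simple. A sketch in Python, assuming the (real) Wikipedia REST summary endpoint; the curated place list here is a made-up example, not my actual list:

```python
# Daily-Japan-location widget, data half: pick one place per day and
# fetch its Wikipedia summary (title, description, photo, page link).
import json
import random
import urllib.parse
import urllib.request

PLACES = ["Suginami", "Nakano, Tokyo", "Kichijoji", "Koenji"]  # example list

def pick_place(day_number: int) -> str:
    """Deterministically pick one place per day (seed = days since epoch)."""
    return random.Random(day_number).choice(PLACES)

def fetch_summary(title: str) -> dict:
    """Fetch the Wikipedia page summary for a place."""
    url = ("https://en.wikipedia.org/api/rest_v1/page/summary/"
           + urllib.parse.quote(title))
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    return {
        "title": data["title"],
        "description": data.get("extract", ""),
        "image": (data.get("thumbnail") or {}).get("source"),
        "link": data["content_urls"]["desktop"]["page"],
    }
```

Clicking through to the article is then just opening the returned link.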

Works quite well!
I even asked it to make curated lists, so there's one for the West of Tokyo (I lived in Suginami-ku, so I miss that part in particular)

Every day (well, every hour) it picks a random word from my Anki deck and shows it to me.
No Anki? No problem!

We can change it to Local only and see a local word list of curated, fun words :)

This is prettier than my Anki cards anyway, which are.... full of HTML and ugly...

I built this from a GitHub repo called Moe_Counter or something
It's just a clock, pretty cool!
Someone did this on Windows already so I just made it a KDE widget :)
I wanna see:
So I vibe coded this!

The moon is not visible in Tokyo rn

This is my current desktop!
2026-03-25 15:27:24

5000 words.
I hit 4000 words 87 days ago.
That's 11.49 mature words a day.
Stats since 4k words:

10k average daily characters.
2 hours of visual novels a day on average.


Since I hit 4k words in December:
800k chars read, 128 hours spent reading.

I've averaged 1 hour a day of reading (work....) and an average of 9.1k chars a day.

I've come across 15.5k unique words, and 5.8k of those I had never seen before.
In that time I read:

In terms of Anki:






332 hours studying Japanese since 5k words.
I've been doing Migaku Memory Japanese Course for like 3 months now for funsies.



First, for GameSentenceMiner I added these things:
For Yomitan:

I also made an Anki addon that turns your mature words into a freq dict so you can combine both of them.
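For the curious, a Yomitan frequency dictionary is just a zip containing an `index.json` and `term_meta_bank_*.json` files where each entry looks like `[term, "freq", value]`. A hedged sketch of the general shape (my actual addon reads the words out of Anki and does more bookkeeping):

```python
# Turn an ordered word list into a minimal Yomitan frequency dictionary.
import json
import zipfile

def build_freq_dict(words: list[str], out_path: str, title: str) -> None:
    # Rank = position in the list; lower rank = more familiar
    entries = [[w, "freq", i + 1] for i, w in enumerate(words)]
    index = {"title": title, "format": 3, "revision": "1"}
    with zipfile.ZipFile(out_path, "w") as zf:
        zf.writestr("index.json", json.dumps(index, ensure_ascii=False))
        zf.writestr("term_meta_bank_1.json",
                    json.dumps(entries, ensure_ascii=False))
```

Import the resulting zip into Yomitan like any other dictionary.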
For Manabitan, a fork of Yomitan, I:

OpenAI awarded me free Codex for my contributions to the above.

I also made Bee's Character Dictionary:

To make name dicts.
And also Bee's Custom Dict Maker for easy plaintext Yomitan dicts:
Since December I've been working at a FAANG company, so I don't have much time to immerse anymore; still, it's the most important thing to me.
I currently do 10 new Anki cards a day, some number of Migaku cards if I can be bothered, and I just read a VN to finish it off.
I don't watch anime all that much, maybe 6 hours total since December.
As for manga, maybe 15 minutes...
I just read visual novels :P
My plan is to just continue this. I'm making progress, albeit slowly.
2026-02-27 23:58:15
Generate auto-updating character name dictionaries for Yomitan based on what you play/watch/read
2026-02-01 18:40:38
A simple character-based heuristic outperforms fixed timers and requires zero configuration. The formula threshold = max(5, min(chars × 1.2, 120)) handles 80%+ of cases without any user tuning, while an adaptive variant using exponential moving averages can personalize detection after a brief warm-up period. For applications requiring maximum robustness, the Modified Z-Score method using Median Absolute Deviation provides statistically rigorous outlier detection that remains stable even with contaminated data.
The core problem with fixed AFK timers like the current 60-second approach is that they ignore text length entirely. A 4-character line like 「ああ...」 legitimately takes 2-3 seconds to read, making a 60-second threshold absurdly generous: it would count 57 seconds of idle time toward reading statistics. Conversely, a 200-character passage might genuinely require 90+ seconds for a learner, yet would be incorrectly flagged as AFK.
Each approach below solves the same problem with increasing sophistication. Choose based on your implementation constraints and accuracy requirements.
Algorithm 1: Multi-Tier Character Heuristic requires no historical data and works immediately. Algorithm 2: EMA Adaptive Baseline learns individual reading speeds after 5-10 text boxes. Algorithm 3: Modified Z-Score with MAD provides the most statistically robust detection but requires maintaining a rolling history window.
| Approach | Lines of Code | Accuracy | Adapts to User | Cold Start |
|---|---|---|---|---|
| Character Heuristic | ~10 | Good (80%) | No | Instant |
| EMA Adaptive | ~40 | Very Good (90%) | Yes | 5-10 samples |
| Modified Z-Score | ~60 | Excellent (95%) | Yes | 10-20 samples |
This approach requires zero configuration and no warm-up period. It works by scaling the AFK threshold proportionally to text length, bounded by sensible minimum and maximum values.
```python
def is_afk(time_seconds: float, char_count: int) -> bool:
    """
    Simple heuristic that works without any learning.
    Returns True if the reading time indicates user was likely AFK.
    """
    # Minimum threshold: even "ああ" needs reaction time
    MIN_THRESHOLD = 5
    # Maximum threshold: beyond this is definitely AFK
    MAX_THRESHOLD = 120
    # Time allowance per character (accounts for reading + processing)
    # 1.2 sec/char ≈ learner reading at 50 char/min + thinking time
    SECONDS_PER_CHAR = 1.2

    threshold = max(MIN_THRESHOLD, min(char_count * SECONDS_PER_CHAR, MAX_THRESHOLD))
    return time_seconds > threshold
```
Why these specific values? Japanese reading speeds vary dramatically: native speakers read 500-1,200 characters per minute (8-20 char/sec), while intermediate learners read 150-300 char/min (2.5-5 char/sec). The 1.2 seconds per character accommodates the slowest learners (~50 char/min) while including a 3× multiplier for dictionary lookups, re-reading, and processing time. The 5-second minimum handles reaction time for clicking through dialogue, while the 120-second cap prevents absurdly long thresholds for text walls.
Edge case behavior:
This approach learns the user's personal reading speed over time, providing increasingly accurate detection as more data accumulates. It falls back to the simple heuristic during the warm-up period.
```python
class AdaptiveAFKDetector:
    def __init__(self):
        self.alpha = 0.2               # EMA smoothing factor
        self.ema_time_per_char = None  # Learned baseline
        self.sample_count = 0
        # Warm-up settings
        self.MIN_SAMPLES = 5
        self.FALLBACK_TIME_PER_CHAR = 1.2
        # Detection settings
        self.ANOMALY_MULTIPLIER = 3.0
        self.ABSOLUTE_MIN = 5
        self.ABSOLUTE_MAX = 180

    def record_reading(self, time_seconds: float, char_count: int) -> None:
        """Call after user advances to next line (confirmed not AFK)."""
        if char_count < 2:  # Skip very short lines
            return
        time_per_char = time_seconds / char_count
        # Clamp extreme values to avoid polluting baseline
        time_per_char = max(0.1, min(time_per_char, 5.0))
        if self.ema_time_per_char is None:
            self.ema_time_per_char = time_per_char
        else:
            # EMA: new = α × current + (1-α) × old
            self.ema_time_per_char = (
                self.alpha * time_per_char +
                (1 - self.alpha) * self.ema_time_per_char
            )
        self.sample_count += 1

    def is_afk(self, time_seconds: float, char_count: int) -> bool:
        """Returns True if reading time indicates AFK."""
        if self.sample_count < self.MIN_SAMPLES:
            # Warm-up: use generous fallback
            base = self.FALLBACK_TIME_PER_CHAR
        else:
            base = self.ema_time_per_char
        threshold = char_count * base * self.ANOMALY_MULTIPLIER
        threshold = max(self.ABSOLUTE_MIN, min(threshold, self.ABSOLUTE_MAX))
        return time_seconds > threshold
```
Why EMA over simple moving average? EMA adapts faster to changes in reading speed (user improving over time or switching between easy/hard games), requires no fixed-size buffer, and uses a single recursive formula. The α=0.2 value means recent readings have ~20% weight while the accumulated baseline has ~80%, providing stability while still responding to sustained speed changes.
Batch calculation variant: For after-the-fact analysis where all readings are available, first filter out obvious outliers using the simple heuristic, then compute the EMA baseline from the remaining "clean" readings:
```python
def batch_detect_afk(readings: list[tuple[float, int]]) -> list[bool]:
    """
    Batch AFK detection for after-the-fact analysis.
    readings: list of (time_seconds, char_count) tuples
    """
    # First pass: rough filter using simple heuristic
    def rough_filter(time, chars):
        return time <= max(5, min(chars * 2.0, 180))

    clean_readings = [(t, c) for t, c in readings if rough_filter(t, c) and c >= 2]
    if len(clean_readings) < 5:
        # Not enough clean data, use simple heuristic
        return [time > max(5, min(chars * 1.2, 120)) for time, chars in readings]

    # Compute baseline from clean readings
    time_per_char_values = [t / c for t, c in clean_readings]
    baseline = sum(time_per_char_values) / len(time_per_char_values)

    # Second pass: detect outliers
    results = []
    for time, chars in readings:
        threshold = max(5, min(chars * baseline * 3.0, 180))
        results.append(time > threshold)
    return results
```
This method provides the most statistically rigorous outlier detection. Unlike standard Z-scores (which assume normal distributions and are sensitive to outliers), the Modified Z-Score uses medians throughout, making it robust to the right-skewed distribution typical of reading times.
```python
from collections import deque
import statistics


class RobustAFKDetector:
    def __init__(self, window_size: int = 20):
        self.window_size = window_size
        self.time_per_char_history = deque(maxlen=window_size)
        # Modified Z-score threshold (Iglewicz & Hoaglin recommend 3.5)
        self.THRESHOLD = 3.5
        self.K = 0.6745  # Scaling constant for MAD
        self.ABSOLUTE_MIN = 5
        self.ABSOLUTE_MAX = 180
        self.FALLBACK_TIME_PER_CHAR = 1.2

    def record_reading(self, time_seconds: float, char_count: int) -> None:
        """Record a confirmed reading (not AFK)."""
        if char_count < 2:
            return
        time_per_char = max(0.1, min(time_seconds / char_count, 5.0))
        self.time_per_char_history.append(time_per_char)

    def is_afk(self, time_seconds: float, char_count: int) -> bool:
        """Detect if current reading time is anomalous."""
        if char_count < 1:
            return time_seconds > self.ABSOLUTE_MIN
        # Hard limit check
        if time_seconds > self.ABSOLUTE_MAX:
            return True
        # Need minimum samples for statistical detection
        if len(self.time_per_char_history) < 5:
            threshold = char_count * self.FALLBACK_TIME_PER_CHAR * 3
            return time_seconds > max(self.ABSOLUTE_MIN, threshold)
        # Calculate MAD-based detection
        data = list(self.time_per_char_history)
        median = statistics.median(data)
        abs_deviations = [abs(x - median) for x in data]
        mad = statistics.median(abs_deviations)
        # Handle edge case: MAD = 0 (all values nearly identical)
        if mad < 0.01:
            mad = 0.1
        # Modified Z-score: M = 0.6745 × (x - median) / MAD
        time_per_char = time_seconds / char_count
        modified_z = self.K * (time_per_char - median) / mad
        return modified_z > self.THRESHOLD
```
Why Modified Z-Score? Reading time distributions are right-skewed—most readings cluster near the normal speed, with a long tail of increasingly rare AFK events. Standard Z-scores using mean and standard deviation are pulled by these outliers, causing "masking" where extreme values inflate the baseline and prevent detection of moderate outliers. The Modified Z-Score using median and MAD has a 50% breakdown point, meaning it remains accurate even if half the data are outliers.
The 0.6745 constant makes the Modified Z-Score comparable to standard Z-scores under normal distributions (σ ≈ 1.4826 × MAD). The 3.5 threshold is the academic standard from Iglewicz and Hoaglin's 1993 research on robust outlier detection.
| Parameter | Recommended Value | Justification |
|---|---|---|
| MIN_THRESHOLD | 5 seconds | Reaction time floor; handles clicking through very short dialogue |
| MAX_THRESHOLD | 120-180 seconds | Beyond this is definitively AFK; 2-3 minutes is generous |
| SECONDS_PER_CHAR | 1.2 for heuristic | Accommodates 50 char/min readers with 3× processing buffer |
| EMA_ALPHA | 0.2 | 80/20 split between stability and responsiveness |
| ANOMALY_MULTIPLIER | 3.0 | Approximately 3 standard deviations from baseline |
| MODIFIED_Z_THRESHOLD | 3.5 | Academic standard for MAD-based outlier detection |
| WARM_UP_SAMPLES | 5-10 | Minimum for stable baseline estimation |
| WINDOW_SIZE | 20 | Rolling window captures recent reading patterns |
Very short sentences (「ああ...」「はい」「うん」): These 2-5 character lines legitimately take 1-3 seconds. The MIN_THRESHOLD of 5 seconds provides a generous floor while still being much better than a 60-second fixed timer. Consider flagging times under 0.3× expected as "skipped" rather than read.
Very long passages (200+ characters): The MAX_THRESHOLD cap prevents unreasonable thresholds. Even slow learners shouldn't need more than 2-3 minutes for a single text box. If they do, they're likely AFK or the game has unusually long passages that should be segmented.
Dialogue choices: When the game presents multiple options, users pause to consider choices. If detectable (multiple text options, menu state), multiply threshold by 1.5-2×.
Voice-over pacing: When audio is playing, the minimum reading time equals audio duration—users can't advance faster than the voice. If audio duration is available: threshold = max(audio_duration × 1.5, normal_threshold).
Cold start / new game: During warm-up when adaptive methods lack data, the simple heuristic provides reasonable defaults. Store per-game baselines to accelerate future sessions with the same title.
"Too fast" detection: Times significantly below expected (less than 0.3× expected) indicate the user clicked through without reading. This is relevant for reading statistics accuracy but orthogonal to AFK detection.
```python
def classify_reading(time_seconds: float, char_count: int, baseline_per_char: float) -> str:
    expected = char_count * baseline_per_char
    ratio = time_seconds / expected if expected > 0 else 0
    if ratio < 0.3:
        return "skipped"
    elif ratio > 3.0:
        return "afk"
    else:
        return "normal"
```
Start with Algorithm 1 (character heuristic). It requires approximately 10 lines of code, zero configuration, no warm-up period, and handles the majority of cases correctly. The formula max(5, min(chars × 1.2, 120)) eliminates the fundamental problem of fixed timers ignoring text length.
Add Algorithm 2 (EMA adaptive) if users report inaccurate detection after extended use. This requires storing a single floating-point baseline per game and updating it after each valid reading. The warm-up period is brief (5-10 text boxes), and the improvement in accuracy is substantial for users whose reading speed differs significantly from the assumed default.
Consider Algorithm 3 (Modified Z-Score) only if you observe systematic accuracy problems with EMA—for instance, if users frequently have contaminated sessions where they were AFK multiple times, polluting the baseline. The MAD-based approach handles this gracefully but adds implementation complexity and requires maintaining a rolling window of historical readings.
For batch/after-the-fact calculation as specified in the requirements, the two-pass batch detection variant of Algorithm 2 is ideal: use the simple heuristic to identify clean readings, compute a baseline from those, then classify all readings against that baseline. This approach combines the robustness of having all data available with the simplicity of the character-based method.
2026-01-26 15:08:49
Do new cards on Ankidroid.
Sync with Anki Desktop.
Anki Desktop says you have not done any new cards.
Usually the issue is caused by reorder add-ons: if you reorder before syncing, it breaks. Reorder after syncing instead.
2026-01-26 14:50:09
For real-time Japanese visual novel translation requiring sub-3-second responses at minimal cost, Claude 3 Haiku emerges as the optimal choice, delivering the best balance of speed, price, and translation quality. Gemini 2.0 Flash offers an even cheaper alternative with faster responses but notably lower Japanese accuracy, while GPT-4o-mini provides superior translation quality at borderline acceptable latency. DeepSeek V3—despite excellent translation benchmarks—is unsuitable due to its 7-19 second time-to-first-token, far exceeding your latency requirement.
Based on your specific requirements (~1000 characters input, sub-3-second response, budget-focused, "good enough" quality), here are the optimal models:
| Rank | Model | Speed (300 tokens) | Cost (Input/Output per 1M) | JP Quality (VNTL) | Verdict |
|---|---|---|---|---|---|
| 1 | Claude 3 Haiku | ~2.8s ✅ | $0.25 / $1.25 | 68.9% | Best overall balance |
| 2 | Gemini 2.0 Flash | ~2.3s ✅ | $0.15 / $0.60 | ~66% | Cheapest reliable option |
| 3 | GPT-4o-mini | ~3.6-4.1s ⚠️ | $0.15 / $0.60 | 72.2% | Best quality, borderline speed |
| 4 | Gemini 2.5 Flash-Lite | ~1.1s ✅ | $0.10 / $0.40 | ~66% | Fastest, lower quality |
| 5 | Qwen 2.5 32B | ~2.5-3s ✅ | $0.20 / $0.60 | 70.7% | Best Asian language specialist |
Claude 3 Haiku achieves ~2.8 seconds for a typical 300-token translation response, comfortably under your 3-second threshold. At $0.25 per million input tokens and $1.25 per million output tokens, a typical VN translation request (1000 characters ≈ 500 tokens input, ~150 tokens output) costs approximately $0.0003 per line, meaning you could translate 10,000 lines for roughly $3.
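The per-line arithmetic is easy to check yourself, using the prices and token counts quoted above:

```python
# Rough per-line cost for an LLM translation request.
# Prices are quoted per million tokens.
def cost_per_line(input_tokens: int, output_tokens: int,
                  in_price: float, out_price: float) -> float:
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Claude 3 Haiku: $0.25 in / $1.25 out, 500 tokens in, ~150 tokens out
haiku = cost_per_line(500, 150, 0.25, 1.25)
# 500*0.25/1e6 + 150*1.25/1e6 = 0.0003125, i.e. about $0.0003 per line
print(f"${haiku:.7f} per line, ${haiku * 10_000:.2f} per 10,000 lines")
```

Swapping in Gemini 2.0 Flash's $0.15/$0.60 pricing gives about $0.0002 per line, matching the cost table below.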
The Visual Novel Translation Leaderboard (VNTL) ranks Claude 3 Haiku at 68.9% accuracy, which significantly outperforms traditional machine translation tools like Sugoi Translator (60.9%) and Google Translate (53.9%). Community feedback indicates Claude models excel at capturing "tone, style, and nuance" in dialogue, critical for visual novel content with casual speech patterns, honorifics, and implied subjects.
If cost is your primary concern and you can tolerate slightly rougher translations, Gemini 2.0 Flash delivers responses in ~2.3 seconds at just $0.15/$0.60 per million tokens, roughly half the cost of Claude 3 Haiku. For extreme budget optimization, Gemini 2.0 Flash Experimental is currently free on OpenRouter with a 1.05 million token context window.
The tradeoff is meaningful: Gemini Flash models score around 66% on VNTL benchmarks versus Claude Haiku's 68.9%. For casual reading where you just need the gist, this difference is acceptable. For dialogue-heavy games with nuanced character interactions, you'll notice more awkward phrasing and occasional mishandled honorifics.
GPT-4o-mini achieves 72.2% VNTL accuracy—the highest among budget models and only 3% behind flagship GPT-4o (75.2%). This makes it objectively the best "good enough" translator in terms of output quality. The catch: its 85-97 tokens/second generation speed produces total response times of 3.6-4.1 seconds, slightly exceeding your 3-second requirement.
If you can tolerate occasional 4-second responses, GPT-4o-mini at $0.15/$0.60 offers the best quality-per-dollar. Enabling streaming significantly improves perceived latency: text appears as it generates, so you'll see the translation building rather than waiting for the full response.
DeepSeek V3 scores an impressive 74.2% on VNTL, competitive with flagship models, but its 7.5-19 second time-to-first-token makes it completely unsuitable for real-time use. This latency occurs because DeepSeek's infrastructure prioritizes throughput over latency, and reasoning-focused models like DeepSeek R1 can take even longer.
Mistral models (including Mistral 7B and Mistral Small) receive mixed community feedback for Japanese translation, with reports of "old OPUS-MT-level issues" on nuance and honorifics. Llama models without Japanese-specific fine-tuning also underperform Asian-focused models like Qwen on this task.
For your use case (1000 characters input ≈ 500 tokens, ~150 tokens output per request):
| Model | Cost per Request | Cost per 1,000 Lines | Cost per Full VN (~50,000 lines) |
|---|---|---|---|
| Claude 3 Haiku | $0.0003 | $0.31 | ~$15 |
| Gemini 2.0 Flash | $0.0002 | $0.17 | ~$8 |
| GPT-4o-mini | $0.0002 | $0.17 | ~$8 |
| Gemini 2.0 Flash Exp | FREE | FREE | FREE (rate limited) |
OpenRouter adds approximately 25ms of gateway overhead with its edge-based architecture, negligible for your use case. Enable these optimizations for best results:
Use the :nitro suffix on model slugs, or sort by "latency", to prioritize fast providers.
Based on community best practices, use this configuration:

```
Temperature: 0.0 (for consistent translations)
System prompt: "You are translating a Japanese visual novel to English.
Preserve the original tone and speaking style. Translate naturally
without over-explaining. Keep honorifics where appropriate."
Context: Include 10-15 previous lines for dialogue continuity
```
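Putting the configuration together, a hedged sketch of a request against OpenRouter's OpenAI-compatible endpoint (the endpoint and model slug are real; `OPENROUTER_API_KEY` is assumed to be set in your environment, and error handling is omitted):

```python
# Minimal OpenRouter translation call: system prompt, recent context
# lines, then the current line, at temperature 0.0.
import json
import os
import urllib.request

SYSTEM_PROMPT = (
    "You are translating a Japanese visual novel to English. "
    "Preserve the original tone and speaking style. Translate naturally "
    "without over-explaining. Keep honorifics where appropriate."
)

def build_payload(line: str, context: list[str]) -> dict:
    """Assemble the request: system prompt, last 15 context lines, line."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    messages += [{"role": "user", "content": prev} for prev in context[-15:]]
    messages.append({"role": "user", "content": line})
    return {
        "model": "anthropic/claude-3-haiku",
        "temperature": 0.0,
        "messages": messages,
    }

def translate(line: str, context: list[str]) -> str:
    req = urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=json.dumps(build_payload(line, context)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

For production use you would add streaming (`"stream": true` with SSE parsing) to get the perceived-latency win described above.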
For real-time visual novel translation prioritizing the speed-cost-quality balance, Claude 3 Haiku is the clear winner—fast enough (2.8s), affordable (~$0.0003/line), and good enough quality (68.9% VNTL). Choose Gemini 2.0 Flash if you need to minimize costs further and can accept rougher translations. Choose GPT-4o-mini if translation quality matters most and you can tolerate occasional 4-second delays with streaming enabled. All three models dramatically outperform traditional machine translation while remaining affordable for high-volume visual novel content.