2026-01-13 04:26:59

Step 1: Transcribe with parakeet-mlx.
By the time I started transcribing AppStories and MacStories Unwind three years ago, I had wanted to do so for years, but the tools had always been either too inaccurate or too expensive. That turned a corner with OpenAI’s Whisper, an open-source speech-to-text model that blew away the other readily available options.
Still, the results weren’t good enough to publish those transcripts anywhere. Instead, I kept them as text-searchable archives to make it easier to find and link to old episodes.
Since then, a cottage industry of apps has arisen around Whisper transcription. Some of those tools do a very good job with what is now an aging model, but I have never been satisfied with their accuracy or speed. However, when we began publishing our podcasts as videos, I knew it was finally time to start generating transcripts because as inaccurate as Whisper is, YouTube’s automatically generated transcripts are far worse.

VidCap in action.
My first stab at video transcription was to use apps like VidCap and MacWhisper. After a transcript was generated, I’d run it through MassReplaceIt, a Mac app that lets you build a huge dictionary of spelling corrections and apply it in a single bulk find-and-replace operation. As I spotted errors while manually skimming the AI transcriptions, I’d add corrections for them to my dictionary. The transcriptions improved over time as a result, but it was a cumbersome process that relied on me catching mistakes, and I didn’t have time to do more than scan through each transcript quickly.
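The core of that trick is simple enough to sketch in a few lines of Python. This isn’t MassReplaceIt’s implementation, just a minimal illustration of a dictionary-driven find-and-replace pass, and the file names are hypothetical placeholders.

```python
# Minimal sketch of a bulk find-and-replace pass over a transcript.
# "corrections.json" maps misspellings to fixes, e.g. {"App Storys": "AppStories"}
# (a hypothetical entry). File names here are placeholders.
import json
import re
from pathlib import Path

def apply_corrections(text: str, corrections: dict[str, str]) -> str:
    """Replace each known misspelling with its correction, whole words only."""
    for wrong, right in corrections.items():
        text = re.sub(rf"\b{re.escape(wrong)}\b", right, text)
    return text

corrections = json.loads(Path("corrections.json").read_text())
transcript = Path("transcript.txt").read_text()
Path("transcript-fixed.txt").write_text(apply_corrections(transcript, corrections))
```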
That’s why I was so enthusiastic about the speech APIs that Apple introduced last year at WWDC. The accuracy wasn’t any better than Whisper’s, and in some circumstances it was worse, but the framework was fast, which I appreciated given the many steps needed to get a YouTube video published.
The process was sped up considerably when Claude Skills were released. A skill can combine a script with instructions to create a hybrid automation with both the deterministic outcome of scripting and the fuzzy analysis of LLMs.

Transcribing with yap.
I’d run yap, a command-line tool that transcribes videos using Apple’s speech-to-text framework. Next, I’d open the Claude app, attach the resulting transcript, and run a skill that executed a script to replace known spelling errors. Then, Claude would analyze the text against its knowledge base, looking for other likely misspellings. When it found one, Claude would reply with some textual context, asking whether the proposed change should be made. After I responded, Claude would apply the approved fixes, and I’d tell Claude which of its suggestions to add to the script’s dictionary, improving the results a little each time I used the skill.
Over the holidays, I refined my skill further and moved it from the Claude app to the Terminal. The first change was to move to parakeet-mlx, an Apple silicon-optimized version of NVIDIA’s Parakeet model that was released last summer. Parakeet isn’t as fast as Apple’s speech APIs, but it’s more accurate, and crucially, its mistakes are closer to the right answers phonetically than the ones made by Apple’s tools. Consequently, Claude is more likely to find mistakes that aren’t in my dictionary of misspellings in its final review.
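If you want to try Parakeet yourself, the basic transcription step only takes a few lines. This is a rough sketch based on parakeet-mlx’s documented Python API; treat the model name and call signatures as assumptions to verify against the project’s README.

```python
# Rough sketch of transcribing with parakeet-mlx's Python API.
# The model id and call signatures are assumptions; verify them
# against the parakeet-mlx README before relying on this.
from parakeet_mlx import from_pretrained

model = from_pretrained("mlx-community/parakeet-tdt-0.6b-v2")  # assumed model id
result = model.transcribe("episode.wav")  # audio extracted from the video beforehand
print(result.text)  # plain transcript; the result also carries timing information
```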

Managing the built-in corrections dictionary.
With Claude Opus 4.5’s assistance, I rebuilt the Python script at the heart of my Claude skill to run videos through parakeet-mlx, saving the results as an .srt file, a .txt file, or both in the same location as the original file, with “CLEANED TRANSCRIPT” prepended to the filename. Because Claude Code can run scripts and access local files from Terminal, the transition to the final fuzzy pass for errors is seamless. Claude asks permission to access the cleaned transcript file that the script creates and then generates a report with suggested changes.
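I haven’t published the script itself, but the deterministic half of the workflow is straightforward to sketch. This combines the hypothetical corrections dictionary from above with the assumed parakeet-mlx API; it’s an illustration of the shape of the pipeline, not my actual code.

```python
# Rough sketch of the deterministic half of the workflow: transcribe,
# apply the corrections dictionary, and save the result next to the
# original file with "CLEANED TRANSCRIPT" prepended to the name.
# Not the actual script; the parakeet-mlx API and file names are assumptions.
import json
import re
from pathlib import Path

from parakeet_mlx import from_pretrained  # assumed API; see the project README

def clean_transcript(source: Path, corrections: dict[str, str]) -> Path:
    model = from_pretrained("mlx-community/parakeet-tdt-0.6b-v2")  # assumed model id
    text = model.transcribe(str(source)).text  # assumes the file is accepted directly
    for wrong, right in corrections.items():
        text = re.sub(rf"\b{re.escape(wrong)}\b", right, text)
    out = source.with_name(f"CLEANED TRANSCRIPT {source.stem}.txt")
    out.write_text(text)
    return out

corrections = json.loads(Path("corrections.json").read_text())
print(clean_transcript(Path("episode.mp4"), corrections))  # hypothetical file name
```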

A list of obscure words Claude suggested changing. Every one was correct.
The last step is for me to confirm which suggested changes should be made and which should be added to the dictionary of corrections. The whole process takes just a couple of minutes, and it’s worth the effort. For the last episode of AppStories, the script found and corrected 27 errors, many of which were misspellings of our names, our podcasts, and MacStories. The final pass by Claude caught seven more issues, everything from a misspelling of the band name Deftones to Susvara, a model of headphones, and Bazzite, an open-source SteamOS project. Those are far from everyday words, but now, their misspellings are not only fixed in the latest episode of AppStories; they’re also in the dictionary, where they will always be corrected whether Claude’s analysis catches them or not.
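Persisting those confirmed fixes is the simplest part of the whole system; here’s a sketch, again using the hypothetical corrections.json file from above.

```python
# Sketch of persisting a confirmed fix so future runs apply it
# automatically. The misspelling shown is a hypothetical example.
import json
from pathlib import Path

def remember_correction(wrong: str, right: str,
                        path: Path = Path("corrections.json")) -> None:
    corrections = json.loads(path.read_text()) if path.exists() else {}
    corrections[wrong] = right
    path.write_text(json.dumps(corrections, indent=2, sort_keys=True,
                               ensure_ascii=False))

remember_correction("Susvera", "Susvara")  # hypothetical misspelling of the headphones
```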

Claude even figured out “goti” was a reference to GOTY (Game of the Year).
I’ve used this same pattern over and over again. I have Claude build me a reliable, deterministic script that helps me work more efficiently; then, I layer in a bit of generative analysis to improve the script in ways that would be impossible or incredibly complex to code deterministically. Here, that generative “extra” looks for spelling errors. Elsewhere, I use it to do things like rank items in a database based on a natural language prompt. It’s an additional pass that elevates the performance of the workflow beyond what was possible when I was using a find-and-replace app and later a simple dictionary check that I manually added items to. The idea behind my transcription cleanup workflow has been the same since the beginning, but boy, have the tools improved the results since I first used Whisper three years ago.
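For the curious, that generative layer doesn’t have to run through Claude Code, either. The same fuzzy pass could be scripted directly against Anthropic’s Python SDK; in this sketch, the model name, prompt, and file path are illustrative stand-ins, not the skill’s actual internals.

```python
# Sketch of the generative "extra" as a direct API call: ask Claude to
# review a cleaned transcript for likely proper-noun misspellings.
# Model name, prompt, and path are illustrative assumptions.
from pathlib import Path

import anthropic

transcript = Path("CLEANED TRANSCRIPT episode.txt").read_text()  # hypothetical path
prompt = (
    "Review this podcast transcript for likely misspellings of proper "
    "nouns (names, products, bands). List each suspect word with the "
    f"surrounding sentence and your suggested fix.\n\n{transcript}"
)

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
message = client.messages.create(
    model="claude-opus-4-5",  # assumed model name
    max_tokens=2048,
    messages=[{"role": "user", "content": prompt}],
)
print(message.content[0].text)  # report of suggested changes to review
```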
2026-01-13 02:54:14
Last Friday, basketball fans in the Los Angeles Lakers market got their first glimpse of an immersive live game when the Lakers faced the Milwaukee Bucks on Spectrum Front Row on Apple Vision Pro. While that experience was limited geographically and only available to Spectrum customers via the Spectrum SportsNet app, the game replay is now available widely and for free in the NBA app. Vision Pro users in eligible regions outside Lakers territory can download the app, sign up for an NBA ID, and stream the game replay and highlights today. The full schedule and availability of immersive Lakers games were announced last week.
Being from Arkansas and not California, I missed out on the live premiere, but I was able to check out the game replay on my Vision Pro yesterday, and the experience was fantastic. Most of the game was shown from a front-row courtside perspective, which meant I was literally turning my head from side to side as the teams moved up and down the court. It was very different from the bird’s-eye view I’m used to watching televised sports from, and it really gave me the impression of being in the arena. At one point, when a Lakers player scored, I felt the urge to start clapping as if they could hear me, even though I was sitting in my bedroom, not at the game.
There were several other camera angles that the broadcast cut to from time to time. The behind-the-basket view was a fun way to take in the action when someone was about to score, and there was a roaming camera that brought you onto the court itself before the game and during halftime as well. The cuts were sparing, which made the whole experience feel less jarring than some of the immersive sports highlights we’ve seen on Vision Pro before, but the combination of immersive video and multiple angles offered the best of both worlds. It felt like I was actually there taking in the game, and no matter what was happening, I always got to see it from the best angle.
Even if you’re not a big fan of basketball or the Lakers, it’s worth checking out the replay to see what the experience is like. Right now, broadcasting a game in this way is a big undertaking, but I have a feeling it will only become more and more common with time. If this concept eventually expands to other sports and live experiences like concerts, theatrical performances, and more, it would make a really compelling case for the Vision Pro and the sorts of capabilities only visionOS can offer.
2026-01-12 23:56:59
Apple has confirmed to CNBC that it has entered into a multi-year partnership with Google to use the search giant’s models and cloud technology for its own AI efforts. According to an unnamed Apple spokesperson:
After careful evaluation, we determined that Google’s technology provides the most capable foundation for Apple Foundation Models and we’re excited about the innovative new experiences it will unlock for our users.
The report still leaves many questions unanswered, including how Gemini fits in with Apple’s own Foundation Models and whether and to what extent Apple will rely on Google hardware. However, after months of speculation and reports from Mark Gurman at Bloomberg that Apple and Google were negotiating, it looks like we’re on the cusp of Apple’s AI strategy coming into better focus.
UPDATE:
Following Apple’s statement to CNBC, Apple and Google released a slightly more detailed joint statement, which Google published on X:
Apple and Google have entered into a multi-year collaboration under which the next generation of Apple Foundation Models will be based on Google’s Gemini models and cloud technology. These models will help power future Apple Intelligence features, including a more personalized Siri coming this year.
After careful evaluation, Apple determined that Google’s AI technology provides the most capable foundation for Apple Foundation Models and is excited about the innovative new experiences it will unlock for Apple users. Apple Intelligence will continue to run on Apple devices and Private Cloud Compute, while maintaining Apple’s industry-leading privacy standards.
So, while the Apple Foundation Models that power Apple Intelligence will be based on Gemini and unspecified cloud technology, Apple Intelligence features themselves, including a more personalized Siri, will continue to run locally on Apple devices and on Apple’s Private Cloud Compute to maintain user privacy.
2026-01-12 21:22:05
Copilot Money, the only personal finance app to win an Apple Editors’ Choice award, now gives you a seamless, cross-device way to manage your money across iPhone, iPad, Mac, and Web.
With Copilot’s beautifully designed interface and powerful financial insights, you can see your spending, budgets, investments, and net worth all in one place. Everything syncs automatically across devices, so whether you’re at your desk or on the go, you always have an up-to-date picture of your finances.
Copilot is built to help you go deeper without feeling overwhelmed. Transactions are intelligently categorized, trends surface automatically, and insights are presented in a way that feels intuitive and confidence-building, not stressful.
Whether you’re tracking monthly spending, planning ahead, or working toward long-term goals, Copilot’s unified dashboard helps you feel clear, calm, and in control of your money.
Copilot Money is offering MacStories readers an extended two-month free trial with code MACSTORIES. Plus, for one more week, you can save 26% on your first year through the link below.
👉 Visit Copilot Money to explore the app and start your extended free trial today. Offer available to new users only, exclusively through this link.
Our thanks to Copilot Money for sponsoring MacStories this week.
2026-01-10 04:50:32
Enjoy the latest episodes from MacStories’ family of podcasts:
Last week was the annual predictions episode! The gang reflected on their predictions from 2025 and then made their Guaranteed To Be Correct Predictions for 2026. No boring predictions here; we started at Pro predictions and went all the way to Pro Max.
Last week’s Cozy Zone had everyone discussing the tricky business of streaming music and how we actually get artists paid.
And this week, Matt needs some help figuring out what browser to use, Niléane has a new game show, and Chris challenges the gang to clean up their desk area.
On this week’s Cozy Zone, the gang discusses their tech white whales. If they had unlimited funds, what would they buy? A nice camera? A beefy computer? A whole company?!
This week, Federico and John share how they unwound during their holiday break, John has a report on CES 2026, Federico recommends Avatar: Fire and Ash, and John does a Parks and Rec rewatch and has a superhero movie deal for listeners.
How would you have done our challenges? How would you answer the question at the end of the show? Let us know!
For even more from the Comfort Zone crew, you can subscribe to Cozy Zone. Cozy Zone is a weekly bonus episode of Comfort Zone where Matt, Niléane, and Chris invite listeners to join them in the Cozy Zone where they’ll cover extra topics, invent wilder challenges and games, and share all their great (and not so great) takes on tech. You can subscribe to Cozy Zone for $5 per month here or $50 per year here.
We deliver MacStories Unwind+ to Club MacStories subscribers ad-free with high bitrate audio every week. To learn more about the benefits of a Club MacStories subscription, visit our Plans page.
MacStories launched its first podcast, AppStories, in 2017. Since then, the lineup has expanded into a family of weekly shows that also includes MacStories Unwind, Magic Rays of Light, Comfort Zone, NPC: Next Portable Console, and First, Last, Everything, which collectively cover a broad swath of the modern media world, from Apple’s streaming service and videogame hardware to apps, for a growing audience that appreciates our thoughtful, in-depth approach to media.
If you’re interested in advertising on our shows, you can learn more here or by contacting our Managing Editor, John Voorhees.
2026-01-07 23:38:32
It’s CES time again, which means another edition of our annual roundup of the most eye-catching gadgets seasoned with a helping of weird and wonderful tech. I’m sure it will come as no surprise that robots, AI, and TVs are some of the most prominent themes at CES in 2026, but there’s a lot more, so buckle in for a tour of what to expect from the gadget world in the coming months.

Viture encourages customers to both unleash and embrace The Beast. Source: Viture.
I first tried Xreal AR glasses shortly before the Vision Pro was released. The experience at the time wasn’t great, but you could see the potential for what has turned out to be one of the Vision Pro’s greatest strengths: working on a huge virtual display. There’s also a lot of potential for gaming.
It looks like the tech behind AR glasses is finally getting to a point where I may dip in again this year. Xreal updated and reduced the price of its entry-level 1S glasses, which will make the category accessible to more people.
The company also introduced the Neo dock, a 10,000 mAh battery that doubles as a hub for connecting a game console or other device to its AR glasses. Notably, the Neo is compatible with the Nintendo Switch 2, which caught my eye immediately.
For its part, Viture is releasing The Beast next month. The $549 AR glasses offer a 58-degree field of view, electrochromic tint with nine adjustment levels, and a built-in camera, and they weigh only 96 grams. Unfortunately, The Beast does not include built-in nearsightedness correction, so you may need prescription lenses, too.

The Rokid Style. Source: Rokid.
Rokid is going more Meta with the Style, a pair of glasses that supports prescription lenses and uses a built-in Qualcomm chip for AI and for capturing photos and videos with a 12MP camera. At $300, the glasses, which will be out in just under two weeks, are much more affordable than Xreal’s and Viture’s, but they’re also serving a very different audience.
I’m not totally sold on AR glasses yet, but Xreal and Viture have made strides that have me intrigued again. Their functionality is limited, but there’s a lot to be said for a giant virtual screen that you can take with you.

The Aqara U400 lock. Source: Aqara.
Just like last year, Aqara is showing off a bunch of new smart home devices, including the U400, the first smart lock to feature Apple’s ultra-wideband functionality. With UWB built in, the $269.99 U400 will be able to sense when you approach your door from the outside and unlock it for you when you’re within six feet of the door. This is tech that I’ve been impatiently waiting for since I moved into our new house, and I can’t wait to get my hands on it.
Aqara also showed off the FP400, an update to the company’s mmWave wired presence sensor; a thermostat that supports Wi-Fi, Thread, and Zigbee; the company’s first Matter-compatible camera, which looks like a rabbit; and the P100 Spatial Multi-State Sensor, a high-precision sensor with 9-axis sensing and built-in algorithms that can sense things like door openings, knocks, and more.

I’ll be shocked if this SwitchBot robot ships and, if it does, can do half of what they claim. Source: SwitchBot.
SwitchBot throws a lot at the wall every year at CES, and 2026 is no different. The company, best known for gizmos that press buttons and flip switches, is showing off a slate of new devices, including the ambitious robot pictured above.
UGREEN is expanding into the smart home this year, too, with indoor, outdoor, and doorbell cameras. There’s no word on the outdoor camera’s resolution, but the indoor models are 4K, as is the doorbell, and all of the cameras will work with UGREEN’s SynCare Smart Display D500 and store data locally using UGREEN NASync. The company hasn’t said which smart home technologies will be supported, nor has it announced pricing or availability. My guess is that these products are a long way off still but worth keeping an eye on.
Narwal introduced the Flow 2 robot vacuum and mop at CES this year, which adds a couple of 1080p cameras to the device. That’s good to see, and I’ll be interested to find out how well it works. As I noted in my review of the Freo X10 Pro, one of that model’s few downsides compared to my older Roomba is that its reliance on LiDAR means it doesn’t see cords and cables as well.

8BitDo’s FlipPad. Source: 8BitDo.
As we recently predicted on NPC XL, we’ll see a lot more of AMD’s Strix Halo chips in gaming handhelds in the future. To date, adoption has been limited because the chips run so hot that some manufacturers have resorted to separate battery packs to manage heat.
However, with the new lower-end Ryzen AI Max Plus chips (the 392 has 12 cores, and the 388 has 8 cores), AMD is specifically targeting handhelds. Although they have fewer CPU cores, the chips have the same graphics compute units as the higher-end chip in my mini PC, which should allow them to push a lot of pixels, albeit in devices that will undoubtedly cost over $1,000.
NVIDIA also announced that its game streaming service, GeForce NOW, is coming to Linux and Fire TV. I can’t say I’m too excited about the Fire TV, but the service is built into my LG TV, which does give it a certain Netflix-like convenience of being connected to all my screens. But what’s really interesting is the Linux introduction. That strikes me as preparation for Valve’s upcoming Steam Machine. It wasn’t that long ago that NVIDIA added official GeForce NOW support for the Steam Deck, so it seems likely we’ll see the app on the Steam Machine too.
I’m also keen to try 8BitDo’s FlipPad iPhone controller accessory, which attaches via USB-C and folds up to lie flat against your iPhone, looking a little like a Game Boy. There have been previous attempts at something similar, with more in the works, all of which we’ve covered on NPC, but 8BitDo’s track record of producing good controllers makes me optimistic that this might be the best of the bunch for emulating old systems in portrait mode. Sadly, it won’t be out until this summer.
Can you guess what this is? Source: Eli Health via The Verge.
I have to hand it to Dell. Computer monitors don’t usually make my weird and wonderful list, but a 52” 6K display that costs $2,900 fits the bill. My neck hurts just looking at the photos. It’s also a Thunderbolt hub and KVM switch. It doesn’t really have the kind of specs you’d expect from a gaming display, but you could always watch TV from your couch and then KVM over to a mini PC to show off your Excel skills, I suppose.

Agibot A2 Ultra: A Well-Dressed Bot. Source: Agibot.
There are robots all over CES this year, but Agibot’s can dance, so that’s the one I want. It also comes in two sizes in case you prefer hobbit-sized bots.
Mui Board is a plank of wood you can use to control your smart home. I like the natural wood look, but what am I supposed to do with this? Mount it on the wall with a dangling cord, as you can see from the photo in this story from The Verge, or leave a random piece of wood lying on the coffee table like a carpenter came by and left it behind? Knock on wood that this thing actually gets released someday after showing up at CES for the past seven years.
Image via The Verge.
Do you remember the “nib nib” cat that chewed on your fingers, the headless cat pillow, or the pillow that breathed? Well, the company that brought you all of those CES classics is back with Baby FuFu, a faceless bear that blows air at babies. Like all of Yukai Engineering’s CES reveals, it’s vaguely creepy and seems designed to attract little fingers. My kids are too big, so maybe I’ll get one for Myke to test – you know, “for the content.”
Of everything shown off at CES so far, I’m most immediately excited about Aqara’s U400 deadbolt lock. Based on Jennifer Pattison Tuohy’s hands-on story for The Verge, it sounds like a big improvement over older smart lock technology.
I’m also very interested in what handheld companies will do with AMD’s cheaper Strix Halo chipset. To date, Strix Halo handhelds have been too big and too expensive for my taste, and while I’m sure devices with the new chips will still carry a premium price, it should be a step in the right direction toward balancing size, power, and cost.
It’s good to have gadgets back. There was a long dry spell when everything was reduced to apps. Don’t get me wrong: I’m a fan of apps, but gadgets are great, too, even when they’re mostly pie-in-the-sky vaporware or just plain weird. CES is just getting started. Tune in to the next episode of MacStories Unwind for more of my finds.