

MacParakeet Is Now Free and Open-Source

MacParakeet — fast, private voice dictation and transcription for Mac — is now GPL-3.0 open-source. Why I built a paid app and then gave it away.

MacParakeet is now open-source under the GPL-3.0 license. You can download the app from macparakeet.com or clone the repo and build it yourself.

There is no paid tier. No “community edition” with features stripped out. The app you download is the app I develop. Everything is included.

What MacParakeet does

MacParakeet is a voice app for Mac with two modes.

System-wide dictation. Press a hotkey — Fn by default — speak, and text appears wherever your cursor is. Hold to talk, release to paste. Double-tap for persistent recording. It works in any app because it operates at the system level: capture audio, transcribe locally, simulate a paste.

File transcription. Drag an audio file, video file, or YouTube URL onto the app. Get a full transcript with word-level timestamps, speaker labels, and export to seven formats — TXT, Markdown, SRT, VTT, JSON, DOCX, PDF.

Everything runs locally on the Neural Engine. No cloud. No accounts. No internet required after the initial model download.

Why I’m giving it away

Trust

Every voice app asks for microphone access. Every one of them tells you your audio stays on your device. But when the app is closed-source, you’re taking the vendor’s word for it.

That has always felt insufficient to me. This is software that listens to you. Not occasionally — continuously, whenever you activate it. It hears your meetings, your drafts, your half-formed thoughts spoken aloud. The privacy stakes are as high as they get for desktop software.

“100% local” is easy to claim. Reading the source code is how you confirm it. Open-sourcing the codebase means anyone can audit the network calls (there aren’t any), inspect the data flow, and verify that no audio ever leaves the machine. For software that literally listens to everything you say, that transparency isn’t a nice-to-have. It’s the baseline.

The training data problem

This is the harder one to articulate, but it’s the one that kept nagging at me.

MacParakeet was built primarily with AI coding tools — Claude Code doing most of the implementation. The speech models it runs were trained on enormous datasets. And here’s the uncomfortable truth: not a single large language model available today, open or closed, can claim a fully ethical training pipeline. They all trained on a mixture of open data, copyrighted works, and private content that the creators never consented to have ingested.

Suchir Balaji, a former OpenAI researcher, wrote about this before his death in 2024. His argument was precise: when AI outputs compete with and substitute for the works the model was trained on, the fair use defense collapses. The scale of data collection transforms what might be acceptable individual learning into something qualitatively different. The value flows from millions of creators to a handful of AI companies, with no consent and no compensation.

I’ve spent a lot of time thinking about this. Reading Balaji’s writing. Reading the discussions on LessWrong. Sitting with the question of what it means to build commercial software on top of tools whose own foundations are ethically contested.

And I kept arriving at the same conclusion: what ownership can I really claim? I used these tools. I benefited enormously from them. The code is good, but I can’t pretend it’s entirely “mine” in some pure sense. It was written in collaboration with models that learned from the collective output of millions of developers who never opted in.

Open-sourcing MacParakeet is the honest response to that reality. I benefited from the commons — open-source libraries, community-contributed training data, shared knowledge that accumulated over decades. The least I can do is contribute something back.

Infrastructure, not product

The local speech recognition stack is now good enough that it should be infrastructure, not a product. FluidAudio and Parakeet TDT did the hard work — model conversion, Neural Engine optimization, a clean Swift SDK. MacParakeet is an application layer on top of that infrastructure. Charging $49 for it started to feel like charging for a well-designed frontend to technology that wants to be free.

Voice input on macOS is bad. Apple’s built-in dictation requires Siri, sends audio to iCloud, and produces mediocre results. The third-party alternatives are either cloud-dependent, subscription-priced, or both. Making MacParakeet free and open-source means anyone with a Mac can have fast, private voice input without paying for it or trusting a server with their audio.

Why GPL-3.0

The license choice was deliberate.

GPL-3.0 means anyone can download, use, modify, and redistribute MacParakeet. Anyone can read the source to verify it does what I claim. And anyone who distributes a modified version must also release their source under GPL-3.0.

That last point is why I chose GPL over MIT or Apache. FluidAudio — the speech engine underneath — is Apache 2.0, which is appropriate for an SDK that other developers embed in their own apps. MacParakeet is an end-user application. If someone improves it, those improvements should be available to everyone, not locked inside a proprietary fork.

If you’re a developer who wants to use FluidAudio in a closed-source app, you absolutely can — FluidAudio’s Apache 2.0 license permits that. The GPL applies to MacParakeet’s application code, not to its dependencies.

What’s inside

For developers who want to look under the hood or contribute:

Stack. Swift 6, SwiftUI, macOS 14.2+, Apple Silicon only. SQLite via GRDB for persistence. Parakeet TDT 0.6B-v3 via FluidAudio CoreML for speech recognition. No Electron, no web views, no Python subprocesses.

Architecture. ~125 source files across four targets: the main app, a CLI tool, a shared core library (no UI dependencies), and a separate ViewModel layer. The core library handles transcription, dictation, database, export, and text processing. The GUI imports the core and ViewModels. The CLI imports the core directly.
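The four-target layout could be expressed in a SwiftPM manifest roughly like this. This is an illustrative sketch, not the repo's actual Package.swift; the target names are hypothetical.

```swift
// swift-tools-version:6.0
// Hypothetical sketch of the four-target split described above.
// Target names are illustrative, not necessarily the repo's actual ones.
import PackageDescription

let package = Package(
    name: "MacParakeet",
    targets: [
        // Shared core: transcription, dictation, database, export — no UI dependencies
        .target(name: "MacParakeetCore"),
        // ViewModel layer sits between the core and the GUI
        .target(name: "MacParakeetViewModels", dependencies: ["MacParakeetCore"]),
        // GUI imports the core and the ViewModels
        .executableTarget(name: "MacParakeetApp",
                          dependencies: ["MacParakeetCore", "MacParakeetViewModels"]),
        // CLI imports the core directly, skipping the ViewModel layer
        .executableTarget(name: "macparakeet-cli", dependencies: ["MacParakeetCore"])
    ]
)
```

The point of the split is that anything UI-free lives in the core, so both the GUI and the CLI can share it without dragging in SwiftUI.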

Testing. 825 tests. Unit tests for text processing and models, database tests against in-memory SQLite, integration tests for service boundaries. swift test runs the full suite.

Text processing. A deterministic pipeline — filler word removal, custom word replacements, text snippet expansion, whitespace normalization — that runs after transcription. No LLM required. Predictable output every time.
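A minimal sketch of what such a deterministic pipeline looks like. The type name, property names, and default filler list here are illustrative, not MacParakeet's actual API; the point is that each stage is a plain string transformation with no model in the loop.

```swift
import Foundation

// Hypothetical sketch of a deterministic post-transcription text pipeline.
// Names and defaults are illustrative, not MacParakeet's actual implementation.
struct TextPipeline {
    var fillerWords: Set<String> = ["um", "uh", "like"]
    var replacements: [String: String] = [:]  // e.g. "mac parakeet" -> "MacParakeet"
    var snippets: [String: String] = [:]      // e.g. "sig" -> a full signature block

    func process(_ raw: String) -> String {
        // 1. Filler word removal (case-insensitive, whole words)
        var words = raw.split(separator: " ").map(String.init)
        words.removeAll { fillerWords.contains($0.lowercased()) }
        var text = words.joined(separator: " ")

        // 2. Custom word replacements
        for (pattern, replacement) in replacements {
            text = text.replacingOccurrences(of: pattern, with: replacement,
                                             options: .caseInsensitive)
        }

        // 3. Text snippet expansion
        for (trigger, expansion) in snippets {
            text = text.replacingOccurrences(of: trigger, with: expansion)
        }

        // 4. Whitespace normalization
        text = text.replacingOccurrences(of: "\\s+", with: " ",
                                         options: .regularExpression)
        return text.trimmingCharacters(in: .whitespaces)
    }
}
```

Because every stage is pure string manipulation, the same input always yields the same output, which is what makes the pipeline trivially unit-testable.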

Speaker diarization. Identifies who spoke when, using FluidAudio’s offline diarization pipeline. Available for file transcription.

AI features. Optional cloud LLM integration for transcript summarization and chat. Bring your own API key (OpenAI, Anthropic, Ollama, OpenRouter). Entirely opt-in — the app works completely offline without it.

The app was built primarily with AI coding agents — Claude Code doing most of the implementation, guided by a structured spec kernel. I wrote about the methodology if you’re curious about that process.

How to get it

Download the app: macparakeet.com — notarized DMG, drag to Applications, done. Auto-updates via Sparkle.

Build from source:

git clone https://github.com/moona3k/macparakeet
cd macparakeet
swift test              # verify everything passes
scripts/dev/run_app.sh  # build and launch

The first launch downloads the Parakeet TDT model (~6 GB) to ~/Library/Application Support/MacParakeet/models/. After that, everything runs offline.

This is not abandonware

MacParakeet is my daily driver. I use it for dictation in every app, for transcribing meetings, for turning voice memos into text. It’s the first thing I’d reinstall on a fresh Mac. There’s something quietly satisfying about knowing other people are running software you built and use yourself every day.

That daily use matters because open-source projects without active maintainers rot. This one won’t. I notice when something breaks because it breaks for me first, and I fix it. The app is at v0.4 — feature-complete for dictation and transcription, stable for daily use, still being actively polished.

The GitHub issues are open. If something doesn’t work, file a bug. If something should work differently, say so. The codebase is documented — CLAUDE.md at the root has the full project context, architecture decisions, and contribution patterns.

Local voice input on Mac should be fast, private, and free. Now it is.