VoiceNote is a free, open-source speech-to-text app built with Tauri. It runs locally on your device, keeps your voice private, and gives you optional AI features only when you choose to use them.
Three pieces working in sync: a local ASR engine for raw transcription, an optional model for polish, and a voice bar that stays out of the way until you need it.
Whisper‑family & Parakeet models tuned for CPU. No upload, no waiting, no cloud bill.
Streaming decoder starts producing tokens before you finish your sentence.
Audit every byte, swap any model, contribute upstream. Free, forever.
Plug in Anthropic, OpenAI, Gemini or a local LLM to summarize, format or translate.
Default mode runs entirely on‑device. No upload, no account, no API call. The difference between cloud dictation and VoiceNote, side by side:
Your audio is streamed to a server, transcribed remotely, and stored — often with a login, retention policy, and a per‑minute bill attached.
A local ASR engine handles transcription on your CPU. Nothing leaves the machine. Nothing is logged. No account is created.
Flip on AI mode and your raw transcript can be cleaned, summarized, reformatted or translated, using a model you choose, with a key you own. We never proxy your traffic.
Headings, bullets, code fences, punctuation. The AI tidies the shape without changing your words.
Drop a 30‑minute standup transcript in, get the decisions, blockers and action items out — formatted however you like.
A pass that removes filler, false starts and verbal tics — keeping your voice, losing the hesitation.
Dictate in any of the 25 supported languages and get the transcript in your target language: instant cross‑lingual notes.
No copy‑paste dance. Press the shortcut anywhere — browser, IDE, terminal, chat — and VoiceNote types straight into the active text field.
Tested on 60+ apps. Custom shortcut per AI tool · push‑to‑talk or hands‑free · respects focused field.
No hidden defaults, no nudges towards a paid model. Local or AI is a single switch — and the rest of the app is just preferences.
Push‑to‑talk shortcuts for different AI modes and styles: prompting, clean‑up, translation and more.
Stay offline by default. Let AI kick in only for specific apps, or only when you press a modifier.
Swap whisper‑small for whisper‑medium when accuracy matters, or for NVIDIA Parakeet V3 when you want multilingual transcription with automatic language detection.
Your key, your billing, your provider. No proxy, no logs, no markup on tokens.
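The "offline by default, AI only for specific apps or on a modifier" rule can be pictured as a small decision function. This is a minimal sketch under stated assumptions, not VoiceNote's actual settings code: `AiPolicy`, `ai_apps`, `modifier_enabled` and `should_use_ai` are hypothetical names chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AiPolicy:
    """Hypothetical settings object: everything stays local unless opted in."""

    # Apps in which AI mode is allowed to run automatically (empty = none).
    ai_apps: set[str] = field(default_factory=set)
    # If True, holding the modifier key turns AI mode on for that dictation.
    modifier_enabled: bool = False


def should_use_ai(policy: AiPolicy, active_app: str, modifier_held: bool) -> bool:
    """Return True only when the user has explicitly opted in for this dictation."""
    if active_app in policy.ai_apps:
        return True
    return policy.modifier_enabled and modifier_held


# With default settings nothing ever triggers AI mode.
default_policy = AiPolicy()
print(should_use_ai(default_policy, "Terminal", modifier_held=True))
```

With a default `AiPolicy()` the function always returns `False`, which mirrors the "stay offline by default" promise: AI mode needs an explicit app entry or an enabled modifier.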
Five rough shapes of how teams put VoiceNote to work. Yours will look different.
VoiceNote sits in the background. Speak the change and get a clean commit message, generated locally so proprietary code never touches a cloud.
Long Zendesk replies become a 30‑second voice memo. AI mode formats them into a polite, structured response.
Record a 90‑minute lecture, get a marked‑up transcript with definitions surfaced. Works in Danish, English, German and 22 others.
Walk‑and‑dictate first drafts. AI cleanup removes filler without rewriting your voice — the cadence stays yours.
Speak the client call summary while it's fresh. AI turns it into a scope doc you can send before the kettle's boiled.
We add real workflows to the docs as testers share them. If yours is novel, we'll feature it (with permission).
Every feature is free — local transcription and AI features alike. No paywall, no Pro tier, no upsell. VoiceNote stays alive because people who can afford to chip in, do. And because a small number of values‑aligned partners help us keep the lights on.
The whole app. No tiers, no gated features, no trial countdown.
no credit card · no account · MIT licensed
Two ways the project stays alive — both optional, both transparent.
VoiceNote started as a weekend script to dump a long voice memo into clean markdown. It grew into a small, opinionated tool: capture audio, transcribe it locally, optionally polish it with an LLM you control, then get out of the way.
No accounts, no telemetry, no subscription. The desktop app is one binary. Everything else lives in a public repository.
Yes. Once a language model is downloaded (≈40 MB per language), transcription runs entirely on your CPU. Airplane mode, terminal, train — VoiceNote keeps working. AI mode is the only feature that needs a connection, and only when you actively trigger it.
No. Audio buffers live in RAM for the duration of a transcription, then they're discarded. Nothing is written to disk, nothing is sent over the network. If you enable AI mode, only the resulting text transcript is sent to the provider you chose — never the audio.
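The buffer lifecycle described here can be sketched in a few lines. This is an illustration under stated assumptions, not the app's real pipeline: `dictate`, `transcribe` and `polish` are hypothetical names, with `transcribe` standing in for the local ASR engine and `polish` for the optional AI provider call.

```python
from typing import Callable, Optional


def dictate(
    audio: bytearray,
    transcribe: Callable[[bytearray], str],
    polish: Optional[Callable[[str], str]] = None,
) -> str:
    """Transcribe an in-memory audio buffer, then discard the audio."""
    try:
        # The local engine reads the buffer straight from RAM.
        text = transcribe(audio)
    finally:
        # Drop the audio as soon as transcription ends, even on error.
        audio.clear()
    if polish is not None:
        # Only the text transcript crosses this boundary, never the audio.
        text = polish(text)
    return text


# Usage with stand-in callables (no real ASR engine or provider involved).
buffer = bytearray(b"\x00\x01\x02\x03")
result = dictate(buffer, transcribe=lambda a: "um hello", polish=lambda t: t.replace("um ", ""))
print(result, len(buffer))
```

The `finally` clause is the design point: the buffer is emptied whether or not transcription succeeds, and the optional `polish` step only ever receives a string.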
No. Download the binary, run it, dictate. There's no sign‑up flow, no email gate, no telemetry ping.
Yes — that's how AI mode is designed. Paste an Anthropic, OpenAI, Mistral, or local LLM endpoint into Settings. Traffic flows directly from your machine to the provider; we never proxy, never see your tokens, never mark them up.
25 European languages including English, Danish, German, French, Spanish, Italian, Dutch, Portuguese, Polish, Swedish, Norwegian, Finnish, Czech, Slovak, Hungarian, Romanian, Greek, Bulgarian, Croatian, Serbian, Slovenian, Estonian, Latvian, Lithuanian and Ukrainian. Each ships as a separate ~40 MB model — install only the ones you use.
Windows 10/11, macOS 12+ (Apple Silicon and Intel), and major Linux distros (Ubuntu, Fedora, Arch). One binary per platform, no installer dependencies.
Yes, MIT licensed. The local app, the model‑downloader and the workflow runtime all live in the same public repository. Fork it, audit it, ship a derivative — that's the point.
Get the install link the day v1.0 ships. No marketing, no drip — one email, one download.