VoiceNote
warming up local model
v0.4 · open beta now live

Speak Freely
Stay In Control.

VoiceNote is a free, open-source speech-to-text app built with Tauri. It runs Freely on your device, keeps your voice private, and gives you optional AI features only when you choose to use them.

Join the waitlistWatch the 60s demo
100% offline ASR Windows · macOS · Linux Bring your own API key
Listening…
whisper‑local · 0.21s latency
EN ▾
How it works

Speak naturally.
Get text that's already useful.

Three pieces working in sync: a local ASR engine for raw transcription, an optional model for polish, and a voice bar that stays out of the way until you need it.

~/voicenote/languages
25 EUROPEAN LANGUAGES · LOCAL
ENDEFRESITNLPTPLSVNODAFICSSKHUROELBGHRSRSLETLVLTUK
↑ tap any tile to swap. models stream in ~40 mb each.
live · session_412.txt
00:01So the next steps are to ship the beta build,
00:04make sure the local model handles Polish properly,
00:08and then write a blog post about why we went
00:11open source from day one
settings · ai
AI POLISH (OPTIONAL)
providerAnthropic ▾
modelclaude‑haiku‑4‑5
api keysk‑ant‑···‑a4f2 ✓
actionsummarize + format
your key, your billing. we never proxy.
01 · LOCAL

On‑device ASR

Whisper‑family & Parakeet models tuned for CPU. No upload, no waiting, no cloud bill.

02 · FAST

~200ms to first word

Streaming decoder starts producing tokens before you finish your sentence.

03 · OPEN

MIT, forkable

Audit every byte, swap any model, contribute upstream. Free, forever.

04 · BYOK

Bring your own AI

Plug in Anthropic, OpenAI, Gemini or a local LLM to summarize, format or translate.

Local‑first privacy

Your voice never leaves your machine.

Default mode runs entirely on‑device. No upload, no account, no API call. The difference between cloud dictation and VoiceNote, side by side:

Most voice tools

Speech goes to the cloud.

Your audio is streamed to a server, transcribed remotely, and stored — often with a login, retention policy, and a per‑minute bill attached.

miccloud servertext
VoiceNote · default

Speech stays on your device.

A local ASR engine handles transcription on your CPU. Nothing leaves the machine. Nothing is logged. No account is created.

micyour machinetext
Optional AI features

When you want polish, plug in your own AI.

Flip on AI mode and your raw transcript can be cleaned, summarised, reformatted or translated — using a model you choose, with a key you own. We never proxy your traffic.

01 · Format

Turn rambles into structure.

Headings, bullets, code fences, punctuation. The AI tidies the shape without changing your words.

Rawso the three things we need to fix are auth the empty state and the slow query on dashboard
PolishedThree things to fix:
· Auth
· Empty state
· Slow dashboard query
02 · Summarise

Long meetings, short notes.

Drop a 30‑minute standup transcript in, get the decisions, blockers and action items out — formatted however you like.

12 min talk"…and then Lena said the API was rate‑limited, Petra wants to push the release, and we agreed to ship Tuesday…"
SummaryDecided: ship Tuesday.
Blocker: API rate‑limit (Lena).
Next: Petra to review.
03 · Tone cleanup

Strip the "um, like, you know".

A pass that removes filler, false starts and verbal tics — keeping your voice, losing the hesitation.

Spokenum so like, I think we should probably, you know, push the launch by a week or so
CleanI think we should push the launch by a week.
04 · Translate

Speak Danish, write English.

Dictate in any of the 25 supported languages, get the transcript in your target — instant cross‑lingual notes.

DAVi skal nok få det færdigt inden fredag, men jeg vil gerne tjekke med Anne først.
ENWe'll get it done before Friday, but I'd like to check with Anne first.
Works where you work

Types into whatever's focused.

No copy‑paste dance. Press the shortcut anywhere — browser, IDE, terminal, chat — and VoiceNote types straight into the active text field.

Tested on 60+ apps. Custom shortcut per AI tool · push‑to‑talk or hands‑free · respects focused field.

Built for control

Every knob, surfaced.

No hidden defaults, no nudges towards a paid model. Local or AI is a single switch — and the rest of the app is just preferences.

  • Custom shortcuts, per AI tool

    Push‑to‑talk shortcuts, for different AI modes/styles, Prompting, Clean up, Translate, etc.

  • Local · AI · Auto modes

    Stay offline by default. Let AI kick in only for specific apps, or only when you press a modifier.

  • Model choices

    Swap whisper‑small for whisper‑medium when accuracy matters. Or Nvidia Parakeet V3 when u want multi-lang with Auto Detect

  • API‑key control

    Your key, your billing, your provider. No proxy, no logs, no markup on tokens.

Use cases

For people who think out loud.

Five rough shapes of how teams put VoiceNote to work. Yours will look different.

DV
Developers

Dictate commits & PR descriptions.

VoiceNote sits in the background. Speak the change, get a clean commit message — Freely, so proprietary code never touches a cloud.

"Refactored the auth middleware to drop the legacy token path. Adds tests for the rotation case."
IT
IT support

Type tickets at speaking speed.

Long Zendesk replies become a 30‑second voice memo. AI mode formats them into a polite, structured response.

From a 25‑second memo to a four‑paragraph customer response, with steps numbered and tone neutralised.
ST
Students

Lecture notes, on‑device.

Record a 90‑minute lecture, get a marked‑up transcript with definitions surfaced. Works in Danish, English, German and 22 others.

"Photosynthesis → light‑dependent reactions → ATP." Auto‑bulleted, searchable, offline.
WR
Writers

Draft at the speed of thought.

Walk‑and‑dictate first drafts. AI cleanup removes filler without rewriting your voice — the cadence stays yours.

3,000 spoken words → a 2,400‑word draft with paragraphs, retaining sentence rhythm.
FL
Freelancers

Briefs, invoices, follow‑ups.

Speak the client call summary while it's fresh. AI turns it into a scope doc you can send before the kettle's boiled.

Discovery call → bullet summary → estimate → email draft, in one continuous voice session.
YOU
Your team

Drop us a use case.

We add real workflows to the docs as testers share them. If yours is novel, we'll feature it (with permission).

"I dictate every standup, every commit message, half my emails. It's the most invisible piece of software I use."
Pricing

Free, forever. Funded by donations & partners.

Every feature is free — local transcription and AI features alike. No paywall, no Pro tier, no upsell. VoiceNote stays alive because people who can afford to chip in, do. And because a small number of values‑aligned partners help us keep the lights on.

Free, all of it

VoiceNote

The whole app. No tiers, no gated features, no trial countdown.

$0/ forever

no credit card · no account · MIT licensed

  • On‑device ASR · 25 European languages
  • AI polish, summarise, tone cleanup, translate
  • Custom shortcuts, per AI tool
  • Bring your own API key — no markup
  • Works fully offline · open‑source · forkable
About

Built by people who hate typing meetings into Notion.

VoiceNote started as a weekend script to dump a long voice memo into clean markdown. It grew into a small, opinionated tool: capture audio, transcribe it Freely, optionally polish it with an LLM you control — then get out of the way.

No accounts, no telemetry, no subscription. The desktop app is one binary. Everything else lives in a public repository.

"I dictate every standup, every commit message, half my emails. The voice bar floats over whatever I'm doing and I just press a key. It's the most invisible piece of software I use."
— Petra K., early tester · Berlin
25langs
European languages, all running on‑device
0
Forever. No paywall, no "pro" tier, no upsell.
~200ms
From the moment you speak to text on screen
FAQ

Questions, answered honestly.

Yes. Once a language model is downloaded (≈40 MB per language), transcription runs entirely on your CPU. Airplane mode, terminal, train — VoiceNote keeps working. AI mode is the only feature that needs a connection, and only when you actively trigger it.

No. Audio buffers live in RAM for the duration of a transcription, then they're discarded. Nothing is written to disk, nothing is sent over the network. If you enable AI mode, only the resulting text transcript is sent to the provider you chose — never the audio.

No. Download the binary, run it, dictate. There's no sign‑up flow, no email gate, no telemetry ping. Pro is unlocked by a license key you can buy without creating an account — we email it once, you paste it in.

Yes — that's how AI mode is designed. Paste an Anthropic, OpenAI, Mistral, or local LLM endpoint into Settings. Traffic flows directly from your machine to the provider; we never proxy, never see your tokens, never mark them up.

25 European languages including English, Danish, German, French, Spanish, Italian, Dutch, Portuguese, Polish, Swedish, Norwegian, Finnish, Czech, Slovak, Hungarian, Romanian, Greek, Bulgarian, Croatian, Serbian, Slovenian, Estonian, Latvian, Lithuanian and Ukrainian. Each ships as a separate ~40 MB model — install only the ones you use.

Windows 10/11, macOS 12+ (Apple Silicon and Intel), and major Linux distros (Ubuntu, Fedora, Arch). One binary per platform, no installer dependencies.

Yes, MIT licensed. The local app, the model‑downloader and the workflow runtime all live in the same public repository. Fork it, audit it, ship a derivative — that's the point.

Waitlist · v1.0

Talk less to your keyboard.

Get the install link the day v1.0 ships. No marketing, no drip — one email, one download.

2,180 already in·shipping Q3 2026