Changelog

Major product updates across Windows and macOS.

This page tracks the bigger FluidVox milestones pulled from the git history so we have a durable release record we can keep extending over time.

Windows latest April 2, 2026

Natural style zero-network mode, hands-free hotkey customization, and text injection reliability.

macOS latest March 22, 2026

Local file transcription — transcribe imported audio files using on-device models, free for all users.

April 2, 2026

Natural style zero-network mode, hands-free hotkey customization, and injection fix (v2.6.1–2.6.3)

Natural transcription style now runs entirely on-device with zero network calls — no cloud cleanup, no quota usage, and no BYO key required.
Hands-free recording hotkey is now fully customizable in Settings, with the same modifier+key picker used for the dictation and command hotkeys.
Three-way hotkey conflict detection prevents assigning the same key combo to dictation, command, and hands-free hotkeys.
Text injection spacing logic was simplified — a space is now only prepended when actual punctuation is detected immediately before the cursor, reducing false positives.
The target window is now focused before reading the cursor position, fixing intermittent spacing failures in apps like Notepad.

April 1, 2026

WASAPI audio fixes, multi-language auto-detect, and default style improvements (v2.6.0)

WASAPI microphone device now stays alive for 5 minutes between recordings, eliminating broken warm restarts that caused silent captures.
The RMS silence threshold was lowered from 0.02 to 0.01, fixing false speech rejection from quiet wireless microphones.
Multi-language auto-detect mode lets you select target languages for Whisper, using native probability scoring to pick the best candidate — especially helpful for similar languages like Urdu and Hindi.
Default per-app style categories now seed with Natural (follow global setting) instead of specific overrides, giving new users a cleaner starting experience.
Fixed a false "Performance restored" power notification that appeared when already on AC power.

March 27, 2026

Custom styles, system audio muting, and audio capture upgrade (v2.4.11–2.4.13)

Custom Styles let you create your own transcription style with a free-text prompt — translate to another language, rewrite as a haiku, add annotations, or anything else you can describe. Up to 5 custom styles per user, assignable per-app.
Custom styles now work correctly with personal Gemini API keys — translation and annotation prompts are no longer overridden by built-in language constraints.
A new "Mute System Audio on Record" toggle silences other apps while you dictate, so background noise doesn't interfere with transcription.
Microphone capture migrated from WaveIn to WASAPI, giving instant mic release after recording ends — other apps regain mic access immediately.
English-only Whisper models (Tiny, Base, Small) no longer show Auto Detect or non-English languages in the language picker, preventing misconfiguration.
Win key combo hotkey now stops cleanly on key release, fixing edge cases where the hotkey could get stuck.
Device identification migrated to a more reliable hardware fingerprint.

March 26, 2026

Whisper language detection fix and text injection reliability (v2.4.10)

Fixed Whisper auto-detect language not working correctly — the processor was configured with an empty language string instead of "auto", causing unreliable language detection for multilingual users.
Text injection now releases any stuck modifier keys (Alt, Shift, Win) before pasting, preventing the destination app's menu bar from activating instead of receiving the paste.
Alt key release uses dummy scancode padding to avoid triggering Windows menu bar activation when clearing stuck state.

March 26, 2026

Power state resilience and performance scaling (v2.4.9)

The app now automatically recovers when switching between AC power and battery — no restart required.
Audio devices, hotkeys, and efficiency mode settings are revalidated after every power transition.
Local transcription models (Parakeet and Whisper) are automatically reinitialized when the power source changes.
Whisper CUDA sessions gracefully fall back to CPU if the GPU becomes unavailable on battery, and restore GPU when AC power returns.
Parakeet thread count now scales dynamically based on your CPU cores and power state for optimal performance.
The system tray icon automatically recovers if Windows Explorer restarts.
Fixed an issue where holding the Alt hotkey for long recordings could trigger the Windows menu bar.

March 25, 2026

Streaming local transcription and microphone improvements (v2.4.3–2.4.5)

Local transcription now uses real-time streaming instead of file-based recording, significantly reducing latency.
Recording starts instantly with no delay between pressing the hotkey and audio capture beginning.
The selected microphone is now correctly used across all recording modes, including Help Mode voice input.
Automatic gain control with fade-in improves audio quality, especially for quiet microphones.
Empty audio with no speech detected is now skipped instead of producing a blank transcription.
Alt-key combo hotkeys no longer trigger Word's KeyTips ribbon after recording stops.

March 24, 2026

CUDA auto-detection and transcription reliability (v2.4.0–2.4.2)

The app now auto-detects your GPU and CUDA availability, with a built-in GPU setup guide for optimal local transcription performance.
Cold-start model warmup eliminates the first-transcription delay after launching the app.
Fixed garbled transcription caused by an incorrect sample rate mismatch detection.
Fixed a crash that could occur when recording very short silence-only audio with Whisper.
Improved spacing between consecutive dictations.

March 22, 2026

Local file transcription (v2.3.0)

Imported audio and video files can now be transcribed using on-device Whisper or Parakeet models — no cloud access required.
A new Local/Cloud toggle in the Audio Transcriptions tab lets you choose between on-device and cloud transcription for files.
Notes now save correctly for Local tier users.

March 21, 2026

Local Monthly subscription plan (v2.2.0)

A new Local Monthly plan is available for users who want local-only transcription on a monthly billing cycle.
Device management support was added for multi-device licensing.

March 12, 2026

Help Mode: real-time screen-sharing guidance (v2.1.0)

Help Mode lets you share your screen and get spoken step-by-step guidance in real time, powered by Gemini Live API.
Local transcription quality improved with proper audio resampling and automatic gain control.
Fixed an issue where the update dialog could appear in a loop on launch.
Fixed a race condition where Alt-key hotkeys could leave the recording stuck.

March 11, 2026

One-click support diagnostics (v2.0.1)

A new "Copy Support Info" button in Settings copies a full diagnostic snapshot (account, hardware, settings) to the clipboard for easy pasting into support messages.
Internal analytics and telemetry improvements for better reliability.

March 10, 2026

Parakeet local transcription — fast, accurate, no GPU required (v2.0.0)

Parakeet CPU-based transcription models are now available as a second local engine alongside Whisper, powered by sherpa-onnx.
Two Parakeet models ship: a fast English-only model and a multilingual model supporting 25 languages — both run efficiently on CPU with no GPU needed.
The onboarding hardware check now recommends Parakeet for machines without a capable GPU, so every user gets a good local transcription experience out of the box.
The Manage Local Models dialog shows both Parakeet and Whisper models with hardware-aware "Recommended" badges and descriptions explaining when each engine works best.
Active Parakeet models are preloaded in the background at app startup to eliminate cold-start latency on the first transcription.
Model extraction was rebuilt to handle large archives reliably, fixing a rare issue where downloaded model files could be silently truncated.

March 10, 2026

Local Whisper language detection and Gemini response filtering (v1.3.3)

Local Whisper transcription now detects the spoken language in auto-detect mode and passes it to Gemini for correct cleanup.
The "Auto" transcription engine option was removed — users now choose Cloud or Local directly, simplifying settings.
Gemini meta-responses (e.g. "Please provide the audio file...") are now filtered out instead of being injected as text.

March 9, 2026

Rich plan dialog, media pause, Microsoft OAuth, and hotkey improvements (v1.3.0)

A new Choose Your Plan dialog matches the macOS design with rich plan cards and feature breakdowns.
Media now automatically pauses when you start recording and resumes when you stop.
Microsoft OAuth was added for Calendar and Tasks integration alongside Google.
Configurable Win key hotkeys and additional hotkey combinations were added in Settings.
Text injection was improved with streaming and progressive insertion for faster results.

March 8, 2026

Smoother Google Sign-In and smarter model recommendations (v1.2.9)

Google Sign-In no longer triggers the "unverified app" warning — only basic profile scopes are requested at login.
Calendar and Tasks permissions are now requested incrementally, only when the user first uses those features.
The Whisper model management dialog now shows a GPU-aware "Recommended" badge next to the best model for your hardware.

March 7, 2026

Windows agent file handling got a major usability pass

The file-management tool now supports fuzzy file search, better directory filtering, and optional search parameters.
PowerShell output is being surfaced more clearly, including deferred display for larger results.
The Vox result window can now render open-file and open-folder actions directly from agent results.
Agent prompt rules were tightened so tool output is repeated back to the user instead of being treated as visible by default.
Math and markdown rendering was updated to support the new file-action links inside results.

March 6, 2026

Windows moved into the 1.2.x release line

Windows 1.2.4 through 1.2.6 shipped with Gemini key UX fixes and a Windows version bump.
Large local Whisper model crashes were addressed with VRAM detection and recovery logic.
Bring-your-own Gemini key routing was wired into the Windows product flow.
RDP clipboard handling was fixed for remote desktop users.

March 5, 2026

Update prompts, verification flow, and toolbar behavior improved

Windows 1.2.0 added an update dialog on launch so new builds are easier to discover.
Email verification for password sign-up shipped, then moved onto the Resend-based verification flow.
The floating toolbar became draggable.
Speculative miss handling was optimized to keep the interaction loop tighter.

March 2-4, 2026

Vox Agent, auto-updates, and a broader Windows UI overhaul landed

A major Windows feature update shipped with hotkey fixes, Vox Agent support, and UI overhaul work.
Velopack auto-update support was integrated into the Windows app.
Tray menu behavior and per-app style handling were improved.
Cloud function prompt issues were fixed as the Windows agent stack matured.

February 23, 2026

Recording flow and hotkey behavior got much tighter

Instant recording landed, with better start and stop sound timing.
Alt-key recording issues were fixed, including menu activation on key release.
Short silent recordings now skip unnecessary fallbacks instead of creating noisy failures.
System output is muted during recording to reduce interference.

February 23, 2026

Update delivery and release infrastructure were added

A visible Check for Updates button was added to the Windows app.
The auto-update channel was corrected to the Windows release stream.
Release pipeline work shipped with code signing and packaging improvements.
License binding and auto-start behavior were hardened for real installs.

February 21-23, 2026

Vox actions expanded beyond plain dictation

Hey Vox gained screenshot capture support for visual tasks.
Notepad fallback handling improved text insertion reliability.
Audio transcription UI work shipped so longer-form audio flows are visible in the app.

February 20-22, 2026

Security, onboarding, and subscription checks were upgraded

Security hardening landed across the Windows app and release flow.
Local Whisper support was added as part of the on-device transcription path.
Email verification gating shipped for new accounts.
Trial dates were added to new user profiles so account state is clearer.

February 10-11, 2026

The Windows app launched and quickly closed the first UX gaps

The initial WinUI 3 Windows app foundation shipped.
Dashboard UI parity work brought the Windows experience closer to macOS.
Tray icon context menu, text injection, microphone selection, and latency issues were fixed early.

March 22, 2026

Local file transcription for all users (v2.5.0)

Imported audio and video files can now be transcribed using on-device Whisper or Parakeet models — no cloud access or Pro subscription required.
A new Local/Cloud toggle in the Audio Transcriptions tab lets you choose between on-device and cloud transcription.
Free users with a downloaded local model now have full access to the file transcription feature.
The audio language picker automatically filters to languages supported by the active local engine.
Model loading happens on demand — if the selected model isn't in memory, it loads automatically before transcription begins.

March 9, 2026

Mouse button hotkeys, web integrations, and agent tool upgrades

Any mouse button (M4/M5 side buttons, middle click, etc.) can now be assigned as the recording trigger — hold for walkie-talkie, tap for hands-free.
Mouse button events are intercepted and suppressed so they don't fire their default action (e.g. Back/Forward) in other apps.
Web integration services were added for Gmail, Google Calendar, and Microsoft services via OAuth.
The Vox Agent gained enhanced calendar, reminders, email, and AppleScript tools for richer voice-driven actions.
A device switch confirmation sheet was added for smoother multi-device handling.

March 3-4, 2026

Core stability work focused on freezes and agent recovery

Main-thread freezes caused by mouse monitoring and audio-level storms were fixed.
Engine health checks now include a startup grace period to avoid false positives.
The Vox Agent loop can now detect and recover from malformed Gemini tool calls.
Audio engine race conditions in the correction monitor were addressed.

February 28-March 1, 2026

Plans, API key flexibility, and usage messaging were refined

Pro, Pro+, Local, and trial users got clearer plan management paths with a View Plans button.
Usage messaging switched from exact token estimates to simpler relative usage labels.
Pro, Pro+, and trial users can optionally bring their own Gemini API key.
Pricing and token budgets were simplified around the lower Pro price point.

February 28, 2026

Vox Agent became much more capable and less noisy

New agent tools shipped alongside better PDF and presentation creation flows.
Math rendering and cloud backend support expanded the output quality of generated documents.
Brief action confirmations moved to toast notifications instead of heavier interruptions.
The agent stopped opening unnecessary browser tabs for simple knowledge questions.
Screenshot sending in iMessage was fixed and content safety rules were added.

February 26-27, 2026

Document, spreadsheet, and style workflows expanded fast

Live spreadsheet editing shipped for Numbers and Excel, including chart support.
Presentation and PDF editing tools were added, followed by real file creation workflows.
Per-app writing style was applied directly to Vox Agent composition.
Style switcher chips were added to the floating toolbar and synced with the dashboard.
Email reply workflows and AppleScript feedback loops were fixed.

February 24-25, 2026

Screenshots, image generation, and subscription packaging matured

Screenshot capture moved to ScreenCaptureKit for more reliable visual context.
Full-page capture and read-screen fallback tooling were added for the agent.
Generated images can now render inline in the Vox Result window with a copy action.
Image generation switched to Gemini's native image API.
Pro+ and unified token-budget subscription routing were introduced.

February 23, 2026

Vox Agent rolled out in phases and became a headline feature

The initial Vox Agent release shipped with multi-step tools and confirmation handling.
Conversation memory, prompt improvements, and new tool phases followed the same day.
Document-generation tools, command hotkeys, file reading, and language support expanded the feature set.
Screenshot behavior, errors, hotkeys, and sound handling were tightened immediately after launch.

February 8-19, 2026

The macOS app foundation filled in updates, local models, and dictation polish

Sparkle auto-updates, release pipeline work, and download packaging shipped.
Bring-your-own Gemini key support and local transcription models were added.
Automatic language detection, style matching improvements, and correction learning kept improving output quality.
Custom hotkeys, hide-from-dock behavior, Hey Vox voice commands, and audio transcription all landed in this stretch.