v8v
Orchestrates native on-device speech-to-text into local app actions, cross-app MCP commands, or remote webhook workflows; offline-first, multilingual, privacy-respecting, with unified single-package API.
An open-source, cross-platform voice orchestration framework built with Kotlin Multiplatform. Uses native on-device speech-to-text to turn spoken language into local app actions, cross-app commands via MCP, or remote workflows via webhooks — offline-first, multilingual, and privacy-respecting.
Microphone → Native STT → Transcript → Intent Resolver → Action Router
├── LOCAL (in-app lambda)
├── MCP (local cross-app)
└── REMOTE (n8n webhook)
No audio upload by default. Everything runs on-device unless explicitly configured otherwise.
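As a rough illustration of the routing step in the pipeline above, the sketch below dispatches a resolved intent to a handler registered under one of the three scopes. The names `Scope`, `ResolvedIntent`, `Route`, and `SketchRouter` are hypothetical and are not the library's `ActionRouter` API; the handlers only return strings instead of performing real work.

```kotlin
// Hypothetical types for illustration only; not the V8V API.
enum class Scope { LOCAL, MCP, REMOTE }

data class ResolvedIntent(val name: String, val extractedText: String)

// A route pairs a scope with the handler that runs for it.
class Route(val scope: Scope, val handler: (ResolvedIntent) -> String)

class SketchRouter {
    private val routes = mutableMapOf<String, Route>()

    fun register(intent: String, scope: Scope, handler: (ResolvedIntent) -> String) {
        routes[intent] = Route(scope, handler)
    }

    // Returns the scope-tagged handler result, or null when no route is registered.
    fun dispatch(intent: ResolvedIntent): String? {
        val route = routes[intent.name] ?: return null
        return "${route.scope}: ${route.handler(intent)}"
    }
}

fun main() {
    val router = SketchRouter()
    router.register("todo.add", Scope.LOCAL) { "added '${it.extractedText}'" }
    router.register("notify.team", Scope.REMOTE) { "would POST '${it.extractedText}' to a webhook" }

    println(router.dispatch(ResolvedIntent("todo.add", "milk")))
    println(router.dispatch(ResolvedIntent("unknown", "")))
}
```

An unmatched intent returns `null` here; in the real framework that case is reported through the unhandled callback.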
Single package everywhere — one dependency per platform, includes LOCAL + MCP + REMOTE support:
graph TD
subgraph core ["core (single package)"]
VA[VoiceAgent] --> IR[IntentResolver]
VA --> AR[ActionRouter]
VA --> SE[SpeechRecognitionEngine]
VA --> CB[VoiceAgentCallbacks]
AR --> LH[LocalActionHandler]
AR --> MH[McpActionHandler]
AR --> WH[WebhookActionHandler]
end
subgraph engines ["Platform Engines"]
SE --> AndroidEng[AndroidSpeechEngine]
SE --> IosEng[IosSpeechEngine]
SE --> MacosEng[MacosSpeechEngine]
SE --> WebEng[WebSpeechEngine]
end
1. Add dependency (Gradle):
// settings.gradle.kts
dependencyResolutionManagement {
repositories {
mavenCentral()
}
}
// build.gradle.kts
dependencies {
implementation("io.github.alimomin1998:core-android:0.3.0")
}
2. Use VoiceAgentCallbacks (same callback API as iOS/macOS/Web):
Try: "create task prepare Q3 budget draft".
Advanced: If you prefer Kotlin Flows over callbacks, use VoiceAgent directly; it exposes transcript, actionResults, errors, state, and audioLevel as Flows.
1. Add via Swift Package Manager:
In Xcode: File > Add Package Dependencies > paste this repo URL.
Or add to Package.swift:
.package(url: "https://github.com/alimomin1998/v8v.git", from: "0.3.0")
2. Use VoiceAgentCallbacks from Swift (same callback API as Android/Web):
Why VoiceAgentCallbacks? Kotlin Flows cannot be directly observed from Swift.
VoiceAgentCallbacks internally collects all Flows and invokes simple callbacks.
Requirements:
- iOS: NSMicrophoneUsageDescription and NSSpeechRecognitionUsageDescription in Info.plist
- macOS: com.apple.security.device.audio-input entitlement + NSSpeechRecognitionUsageDescription in Info.plist

1. Install:
npm install @v8v/core
2. Load a single script (no bundler needed):
<script src="node_modules/@v8v/core/v8v-core.bundle.js"></script>
3. Use VoiceAgentJs (same callback API as Android/iOS/macOS):
V8V works in React web apps using the same @v8v/core npm package. Create a custom hook to wrap VoiceAgentJs:
1. Install:
npm install @v8v/core
2. Load one script in public/index.html (before your React bundle):
<script src="node_modules/@v8v/core/v8v-core.bundle.js"></script>
3. Create a custom hook (useVoiceAgent.js):
4. Use in a component:
Note: @v8v/core uses the Web Speech API under the hood, so React web apps get the same browser-based speech recognition.
1. Add dependency (pubspec.yaml):
dependencies:
v8v_flutter: ^0.3.0
2. Use in a widget:
import 'package:v8v_flutter/v8v_flutter.dart';
final agent = V8VVoiceAgent(language: 'en-US', continuous: true);
// Callbacks
agent.onTranscript.listen((text) => print('Heard: $text'));
agent.onIntent.listen((intent) => print('${intent.name}: ${intent.message}'));
agent.onError.listen((msg) => print('Error: $msg'));
agent.onStateChange.listen((state) => print('State: $state'));
agent.onAudioLevel.listen((level) => print('Level: $level'));
// LOCAL action
agent.registerAction('todo.add', {'en-US': ['add *', 'add * to todo']});
// MCP action
agent.registerMcpAction('task.create',
{'en-US': ['create task *']}, 'http://localhost:3001/mcp', 'create_task');
// REMOTE action
agent.registerWebhookAction('notify.team',
{'en-US': ['notify *']}, 'https://n8n.example.com/webhook/voice');
agent.start();
Native bridge via platform channels. All voice processing runs natively — no Dart speech plugins needed.
React Native and Expo support is available as a separate standalone package: @v8v/react-native.
It is a pure TypeScript reimplementation of the V8V voice agent framework (no dependency on the Kotlin Multiplatform core), with thin native modules only for the MCP embedded HTTP server.
npm install @v8v/react-native
npx expo install expo-speech-recognition # or: npm install @react-native-voice/voice
See the @v8v/react-native README for full usage instructions.
VoiceAgent: the main entry point. It wires a speech engine, intent resolver, and action router together.
VoiceAgentJs: a JavaScript-friendly facade, available as V8V.VoiceAgentJs after loading the bundle.
VoiceAgentCallbacks: a callback-based facade available on all platforms (it lives in commonMain). It provides a uniform API whether you're in Kotlin, Swift, or any KMP target.
When App A sends an MCP command to App B, App B may not be running. V8V solves this with delivery strategies — platform-specific fallbacks that guarantee command delivery even when the HTTP server is down.
All delivery strategies use the same MCP protocol (JSON-RPC 2.0). The strategy only changes the transport — how the JSON-RPC message reaches App B. The McpRequestRouter on App B processes the request identically regardless of how it arrived:
┌─────────────────────────────────┐
│ App B (Server) │
│ │
┌───── HTTP ──────────►│ Ktor HTTP Server │
│ │ │ │
│ │ ▼ │
App A ── MCP JSON-RPC ──►│ McpRequestRouter.handleRequest()│
│ │ │ │
│ │ ▼ │
└── ContentProvider ──►│ McpContentProvider.call() │
(Android IPC) │ │ │
│ ▼ │
│ Tool handler (create_task, etc) │
└─────────────────────────────────┘
Both paths produce an MCP McpToolResult and return it as a JSON-RPC response. ContentProvider is not a replacement for MCP — it is MCP delivered over Android's Binder IPC instead of HTTP.
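To make the shared protocol concrete, here is a minimal sketch of the JSON-RPC 2.0 envelope an MCP client might send for a tool call, whichever transport carries it. The `tools/call` method and the `name`/`arguments` params follow the MCP specification; the helper names `jsonEscape` and `toolCallRequest` and the `text` argument key are assumptions for illustration, and the library's own McpClient may build its requests differently. The sketch targets the JVM because it uses `String.format`.

```kotlin
// Escapes the characters that must be escaped inside a JSON string literal.
fun jsonEscape(s: String): String = buildString {
    for (c in s) when (c) {
        '"' -> append("\\\"")
        '\\' -> append("\\\\")
        '\n' -> append("\\n")
        '\r' -> append("\\r")
        '\t' -> append("\\t")
        else -> if (c < ' ') append("\\u%04x".format(c.code)) else append(c)
    }
}

// Builds a JSON-RPC 2.0 "tools/call" request with string-valued arguments.
fun toolCallRequest(id: Int, tool: String, args: Map<String, String>): String {
    val arguments = args.entries.joinToString(",") { (k, v) ->
        "\"${jsonEscape(k)}\":\"${jsonEscape(v)}\""
    }
    return "{\"jsonrpc\":\"2.0\",\"id\":$id,\"method\":\"tools/call\"," +
        "\"params\":{\"name\":\"${jsonEscape(tool)}\",\"arguments\":{$arguments}}}"
}

fun main() {
    // "text" is an assumed argument key, not one confirmed by the source.
    println(toolCallRequest(1, "create_task", mapOf("text" to "prepare Q3 budget draft")))
}
```

The same string could be posted over HTTP or handed to a ContentProvider call; only the transport differs.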
Auto is the recommended strategy. It always tries HTTP first and only falls back when the HTTP call fails:
- If contentProviderAuthority is set: delivers MCP via ContentProvider (Android)

Android (Auto with ContentProvider fallback):
agent.registerMcpAction(
"task.create", phrases, url, "create_task",
strategy = McpDeliveryStrategy.Auto(
contentProviderAuthority = "io.v8v.mcp.taskapp",
),
)
macOS — Launch and Retry:
agent.registerMcpAction(
"task.create", phrases, url, "create_task",
strategy = McpDeliveryStrategy.LaunchAndRetry(
bundleId = "io.v8v.example.server",
),
)
iOS — Queue and Notify:
agent.registerMcpAction(
intent: "task.create", phrases: phrases, serverUrl: url, toolName: "create_task",
strategy: McpDeliveryStrategy.QueueAndNotify(
notificationTitle: "Voice Command Pending",
retryOnForeground: true
)
)
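The Auto behavior described above (HTTP first, a platform fallback only when the HTTP call fails) can be sketched as follows. `Delivery`, `Transport`, and `deliverAuto` are hypothetical names for illustration and do not reflect how McpDeliveryStrategy.Auto is implemented; the transports here are stubs that simply report success or failure.

```kotlin
// Hypothetical result type and transports; not the V8V API.
sealed interface Delivery {
    data class Delivered(val via: String) : Delivery
    data class Failed(val reason: String) : Delivery
}

// A transport takes the JSON-RPC payload and reports whether it got through.
fun interface Transport {
    fun send(payload: String): Boolean
}

// Try HTTP first; use the fallback only when HTTP fails and a fallback exists.
fun deliverAuto(payload: String, http: Transport, fallback: Pair<String, Transport>?): Delivery {
    if (http.send(payload)) return Delivery.Delivered("http")
    if (fallback == null) return Delivery.Failed("http failed, no fallback configured")
    val (name, transport) = fallback
    return if (transport.send(payload)) Delivery.Delivered(name)
    else Delivery.Failed("http and $name both failed")
}

fun main() {
    val serverDown = Transport { false }   // simulates App B's HTTP server being down
    val binderIpc = Transport { true }     // simulates a ContentProvider call succeeding
    println(deliverAuto("{}", serverDown, "contentProvider" to binderIpc))
    println(deliverAuto("{}", serverDown, null))
}
```

Because the payload is passed through unchanged, the receiving side can process it identically regardless of which transport delivered it.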
Extend McpContentProvider and declare it in AndroidManifest. The ContentProvider reuses the same McpRequestRouter and tool handlers as the HTTP server:
class TaskMcpContentProvider : McpContentProvider() {
    override fun createRouter(): McpRequestRouter {
        val router = McpRequestRouter("my-app")
        router.registerTool("create_task", "Create a task") { args ->
            // The argument key and the fallback expression are missing from the source listing.
            val text = args[] ?:
            TaskRepository.addTask(text)
            mcpSuccess()
        }
        return router
    }
}
<provider android:name=".TaskMcpContentProvider"
android:authorities="io.v8v.mcp.taskapp" android:exported="true" />
1. Context initialization — call V8VMcp.init(context) once before using MCP reliability features. The best place is Application.onCreate():
class MyApp : Application() {
override fun onCreate() {
super.onCreate()
V8VMcp.init(this)
}
}
2. Package visibility (Android 11+) — App A must declare App B's ContentProvider in <queries> to access it. Without this, ContentResolver.call() silently fails on API 30+:
<!-- In App A's AndroidManifest.xml -->
<queries>
<provider android:authorities="io.v8v.mcp.taskapp" />
</queries>
MCP servers can advertise themselves on the local network:
// Server side (App B)
server.start(advertise = true)
// Client side (App A)
agent.discoverMcpServers { service ->
agent.registerMcpAction("task.create", phrases, service.url, "create_task")
}
Supported on Android (NsdManager), iOS/macOS (Network framework), and JVM (jmDNS). No-op on Web.
Register * wildcard patterns in any language:
agent.registerAction(
intent = "task.create",
phrases = mapOf(
"en-US" to listOf("create task *", "add task *"),
"hi-IN" to listOf("* task banao"),
"es" to listOf("crear tarea *"),
),
) { /* ... */ }
Pass 1 — Wildcard regex: Pattern create task * becomes regex ^create task (.+)$. Exact match gives confidence 1.0.
Pass 2 — Fuzzy (Dice similarity): When fuzzyThreshold > 0 and exact matching fails:
Dice = (2 * |intersection|) / (|A| + |B|)
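The two passes can be sketched in plain Kotlin as follows. This is an illustration under stated assumptions, not the library's IntentResolver: the names `patternToRegex`, `dice`, `Match`, and `resolve` are hypothetical, literal parts of a pattern are regex-quoted, matching is case-insensitive, and the Dice sets A and B are taken to be whitespace-separated word sets (the source does not say whether it compares words or character n-grams).

```kotlin
// Turns a wildcard pattern such as "create task *" into an anchored regex.
fun patternToRegex(pattern: String): Regex {
    val body = pattern.trim().split("*").joinToString("(.+)") { Regex.escape(it) }
    return Regex("^${body}\$", RegexOption.IGNORE_CASE)
}

// Dice = (2 * |intersection|) / (|A| + |B|), over word sets (an assumption).
fun dice(a: String, b: String): Double {
    val setA = a.lowercase().split(Regex("\\s+")).filter { it.isNotEmpty() }.toSet()
    val setB = b.lowercase().split(Regex("\\s+")).filter { it.isNotEmpty() }.toSet()
    if (setA.isEmpty() && setB.isEmpty()) return 0.0
    return 2.0 * (setA intersect setB).size / (setA.size + setB.size)
}

data class Match(val intent: String, val extractedText: String, val confidence: Double)

fun resolve(transcript: String, patterns: Map<String, List<String>>, fuzzyThreshold: Double): Match? {
    val text = transcript.trim()
    // Pass 1: wildcard regex; a match gets confidence 1.0.
    for ((intent, phrases) in patterns) {
        for (phrase in phrases) {
            val m = patternToRegex(phrase).find(text) ?: continue
            return Match(intent, m.groupValues.getOrElse(1) { "" }, 1.0)
        }
    }
    // Pass 2: fuzzy matching, only when a threshold above zero is configured.
    if (fuzzyThreshold <= 0.0) return null
    return patterns
        .flatMap { (intent, phrases) -> phrases.map { intent to dice(it.replace("*", ""), text) } }
        .filter { it.second >= fuzzyThreshold }
        .maxByOrNull { it.second }
        ?.let { Match(it.first, text, it.second) }
}

fun main() {
    val patterns = mapOf("task.create" to listOf("create task *", "add task *"))
    println(resolve("create task buy milk", patterns, 0.0))
    println(resolve("please create a task buy milk", patterns, 0.3))
}
```

In the second call the exact pass fails because of the leading "please", so the fuzzy pass scores each pattern's words against the transcript and keeps the best score at or above the threshold.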
# JVM + JS compilation
./gradlew :core:compileKotlinJvm :core:compileKotlinJs
# Run tests (all merged into core)
./gradlew :core:jvmTest
# Android example
./gradlew :example-android:assembleDebug
# Build XCFramework (iOS + macOS)
./gradlew :core:assembleV8VCoreReleaseXCFramework
# Lint check (ktlint)
./gradlew ktlintCheck
Six distribution channels are published via GitHub Actions CI or locally:
# Local release (core only — Maven Central + @v8v/core npm)
./scripts/release.sh 0.3.0
# All packages are published via GitHub Actions on tag push.
# See PUBLISHING.md for full details.
./gradlew :example-android:installDebug
Uses io.github.alimomin1998:core-android:0.3.0 from Maven Central.
Try: "add project status update" (LOCAL), "create task schedule review" (MCP), "notify team build is ready" (REMOTE)
cd example-web
npm install # installs @v8v/core from npm
npm start # serves on http://localhost:5174
Open in Chrome. Set MCP URL to http://localhost:3001/mcp in Settings (optional).
./gradlew :core:assembleV8VCoreReleaseXCFramework
open example-ios/V8VPhone.xcodeproj
Requires a real device — the iOS Simulator does not support speech recognition.
./gradlew :core:assembleV8VCoreReleaseXCFramework
open example-macos/V8VMac.xcodeproj
Each platform has a matching server example. The server runs on port 3001 and
exposes create_task, list_tasks, and complete_task MCP tools:
# Node.js (simplest for quick testing)
cd example-server-web && npm install && npm start # port 3001
# Android (Ktor CIO embedded server)
./gradlew :example-server-android:installDebug
# Flutter
cd example-server-flutter && flutter run
# React (Node.js + dashboard)
cd example-server-react && npm install && npm start # port 3001
iOS and macOS server examples use Xcode — open the .xcodeproj and run on device.
cd packages/flutter/example && flutter run
Copyright 2026 V8V Contributors
Licensed under the Apache License, Version 2.0
See LICENSE for the full text.
| Platform | Package | Install |
|---|---|---|
| Android / Kotlin | io.github.alimomin1998:core-android | Maven Central |
| iOS / macOS | V8VCore.xcframework | SPM / CocoaPods |
| Web / JS / TS | @v8v/core | npm |
| React (Web) | @v8v/react | npm |
| Flutter | v8v_flutter | pub.dev |
| React Native / Expo | @v8v/react-native | npm (separate repo) |
| JVM | io.github.alimomin1998:core-jvm | Maven Central |
| Platform | Status | Engine | Distribution |
|---|---|---|---|
| Android | Available | android.speech.SpeechRecognizer | Maven Central |
| iOS | Available | SFSpeechRecognizer + AVAudioEngine | XCFramework / SPM |
| macOS | Available | SFSpeechRecognizer + AVAudioEngine | XCFramework / SPM |
| Web | Available | Web Speech API | npm (@v8v/core) |
| React (Web) | Available | Web Speech API | npm (@v8v/react) |
| Flutter | Available | Native bridge (Android + iOS) | pub.dev (v8v_flutter) |
| JVM (Desktop) | Core only | Bring your own engine | Maven Central |
| React Native / Expo | Available | Native STT (via adapters) | npm (@v8v/react-native) — separate repo |
| Windows | Planned | — | — |
| Linux | Planned | — | — |
| Dependency | Minimum | Tested |
|---|---|---|
| Android SDK | API 24 (Android 7.0) | API 35 (Android 15) |
| iOS | 16.0 | 17+ |
| macOS | 13.0 (Ventura) | 14+ (Sonoma) |
| Web Browser | Chrome 33+ / Edge 79+ | Chrome 120+ |
| Safari (Web) | Not supported (no Web Speech API) | — |
| Firefox (Web) | Not supported (no Web Speech API) | — |
| JDK | 17 | 17 |
| Kotlin | 2.1.20 | 2.1.20 |
| Gradle | 8.0 | 8.7+ |
| Xcode | 15.0 | 15+ |
| Ktor | 3.0.3 | 3.0.3 |
| Node.js (MCP server) | 18+ | 20+ |
v8v/
├── core/ # Single KMP module: VoiceAgent + MCP + Webhooks
│ ├── commonMain/ # Shared: VoiceAgent, IntentResolver, ActionRouter,
│ │ # McpClient, McpActionHandler, McpDeliveryStrategy,
│ │ # McpCommandQueue, PlatformMcpDelivery (expect),
│ │ # McpServiceAdvertiser/Browser (expect)
│ ├── androidMain/ # Android SpeechRecognizer + McpEmbeddedServer (Ktor CIO)
│ │ # + McpContentProvider + NsdManager discovery
│ ├── iosMain/ # iOS SFSpeechRecognizer + UNNotification queue
│ ├── macosMain/ # macOS SFSpeechRecognizer + McpEmbeddedServer (Ktor CIO)
│ │ # + NSWorkspace app launcher
│ ├── jsMain/ # Web Speech API + @JsExport facade (VoiceAgentJs)
│ ├── appleMain/ # Shared Apple: Ktor Darwin engine + NWBrowser discovery
│ └── jvmMain/ # JVM McpEmbeddedServer + jmDNS discovery
│
├── packages/ # Platform wrapper packages
│ ├── flutter/ # v8v_flutter — pub.dev (Android + iOS)
│ └── react/ # @v8v/react — npm (Web)
│
├── example-android/ # Android client — LOCAL + MCP + REMOTE
├── example-ios/ # iOS SwiftUI client — all 3 scopes + settings
├── example-macos/ # macOS SwiftUI client — all 3 scopes + settings
├── example-web/ # Web client — HTML + vanilla JS (no bundler)
│
├── example-server-android/ # Android MCP server example
├── example-server-ios/ # iOS MCP server example (NWListener)
├── example-server-macos/ # macOS MCP server example (Ktor CIO)
├── example-server-web/ # Node.js MCP server example
├── example-server-flutter/ # Flutter MCP server example
├── example-server-react/ # React + Node.js MCP server example
│
├── Package.swift # Swift Package Manager manifest
└── scripts/ # Release automation scripts
val agent = VoiceAgentCallbacks(
engine = createPlatformEngine(context),
config = VoiceAgentConfig(
language = "en-US",
continuous = true,
partialResults = true,
fuzzyThreshold = 0.3f,
silenceTimeoutMs = 1500L,
),
)
// Callbacks — identical pattern across all platforms
agent.onTranscript { text -> println("Heard: $text") }
agent.onIntent { intent, message -> println("$intent: $message") }
agent.onError { msg -> println("Error: $msg") }
// LOCAL action
agent.registerLocalAction(
intent = "task.create",
phrases = mapOf("en-US" to listOf("create task *", "add task *")),
) { resolved ->
taskService.createTask(resolved.extractedText)
}
// MCP action (cross-app via local MCP server)
agent.registerMcpAction(
intent = "task.sync",
phrases = mapOf("en-US" to listOf("sync task *")),
serverUrl = "http://localhost:3001/mcp",
toolName = "create_task",
)
// REMOTE action (webhook)
agent.registerWebhookAction(
intent = "notify.team",
phrases = mapOf("en-US" to listOf("notify *")),
webhookUrl = "https://n8n.example.com/webhook/voice",
)
agent.start()
import V8VCore
let agent = VoiceAgentCallbacks(
engine: MacosSpeechEngine(), // or IosSpeechEngine() on iOS
config: VoiceAgentConfig(
language: "en-US",
continuous: true,
partialResults: true,
fuzzyThreshold: 0.3,
silenceTimeoutMs: 1500
),
permissionHelper: MacosPermissionHelper() // or IosPermissionHelper()
)
agent.onTranscript { text in print("Heard: \(text)") }
agent.onIntent { intent, message in print("\(intent): \(message)") }
agent.onError { msg in print("Error: \(msg)") }
// LOCAL action
agent.registerLocalAction(
intent: "task.create",
phrases: ["en-US": ["create task *", "add task *"]],
handler: { resolved in
print("Create task: \(resolved.extractedText)")
}
)
// MCP action
agent.registerMcpAction(
intent: "task.sync",
phrases: ["en-US": ["sync task *"]],
serverUrl: "http://localhost:3001/mcp",
toolName: "create_task"
)
// REMOTE action
agent.registerWebhookAction(
intent: "notify.team",
phrases: ["en-US": ["notify *"]],
webhookUrl: "https://n8n.example.com/webhook/voice"
)
agent.start()
const agent = new V8V.VoiceAgentJs('en');
agent.onTranscript(text => console.log('Heard:', text));
agent.onIntent((intent, message) => console.log(intent, message));
agent.onError(msg => console.error(msg));
// LOCAL action
agent.registerPhrase('todo.add', 'en', 'add *');
// MCP action (via library's built-in Ktor-based McpClient)
agent.registerMcpAction('task.create', 'en',
['create task *', 'new task *'],
'http://localhost:3001/mcp', 'create_task');
// REMOTE action (via library's built-in Ktor-based WebhookActionHandler)
agent.registerWebhookAction('notify.team', 'en',
['notify *', 'send notification *'],
'https://n8n.example.com/webhook/voice');
// Request mic permission first (must be in click handler)
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
stream.getTracks().forEach(t => t.stop());
agent.start();
import { useState, useEffect, useRef, useCallback } from 'react';
export function useVoiceAgent(language = 'en') {
const agentRef = useRef(null);
const [transcript, setTranscript] = useState('');
const [error, setError] = useState('');
const [listening, setListening] = useState(false);
useEffect(() => {
if (!globalThis.V8V) return;
const agent = new V8V.VoiceAgentJs(language);
agent.onTranscript(text => setTranscript(text));
agent.onError(msg => setError(msg));
agentRef.current = agent;
return () => agent.destroy();
}, [language]);
const registerPhrase = useCallback((intent, lang, phrase) => {
agentRef.current?.registerPhrase(intent, lang, phrase);
}, []);
const registerMcpAction = useCallback((intent, lang, phrases, url, tool) => {
agentRef.current?.registerMcpAction(intent, lang, phrases, url, tool);
}, []);
const registerWebhookAction = useCallback((intent, lang, phrases, url) => {
agentRef.current?.registerWebhookAction(intent, lang, phrases, url);
}, []);
const start = useCallback(async () => {
try {
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
stream.getTracks().forEach(t => t.stop());
} catch (e) { /* proceed anyway */ }
agentRef.current?.start();
setListening(true);
}, []);
const stop = useCallback(() => {
agentRef.current?.stop();
setListening(false);
}, []);
return {
transcript, error, listening,
registerPhrase, registerMcpAction, registerWebhookAction,
onIntent: (cb) => agentRef.current?.onIntent(cb),
start, stop,
};
}
import { useState, useEffect } from 'react';
import { useVoiceAgent } from './useVoiceAgent';

function VoiceApp() {
  const { transcript, error, listening, registerPhrase, start, stop } = useVoiceAgent('en');
  const [todos, setTodos] = useState([]);

  // Register the LOCAL phrase after the hook has created the agent.
  useEffect(() => {
    registerPhrase('todo.add', 'en', 'add *');
  }, [registerPhrase]);

  return (
    <div>
      <button onClick={listening ? stop : start}>
        {listening ? 'Stop' : 'Start Listening'}
      </button>
      <p>Heard: {transcript}</p>
      {error && <p style={{ color: 'red' }}>{error}</p>}
    </div>
  );
}
VoiceAgent methods:

| Method | Description |
|---|---|
| registerAction(intent, phrases, handler) | Register a voice command |
| start() | Begin listening |
| stop() | Stop listening |
| updateConfig(config) | Change language, continuous mode, fuzzy threshold at runtime |
| destroy() | Release all resources |
VoiceAgent flows:

| Flow / State | Type | Description |
|---|---|---|
| state | `StateFlow<AgentState>` | IDLE, LISTENING, PROCESSING |
| transcript | `SharedFlow<String>` | Every final (or partial) transcript |
| errors | `SharedFlow<VoiceAgentError>` | Structured errors (permission, engine, action) |
| actionResults | `SharedFlow<ActionResult>` | Success/Error from dispatched actions |
| audioLevel | `StateFlow<Float>` | Normalized 0.0-1.0 mic volume |
VoiceAgentJs methods:

| Method | Description |
|---|---|
| registerPhrase(intent, lang, pattern) | Register a LOCAL voice command |
| registerMcpAction(intent, lang, phrases[], serverUrl, toolName) | Register MCP action (Ktor HTTP) |
| registerWebhookAction(intent, lang, phrases[], webhookUrl) | Register webhook action (Ktor HTTP) |
| onTranscript(callback) | Called on each transcript |
| onIntent(callback) | Called with (intentName, message) |
| onError(callback) | Called on errors |
| onUnhandled(callback) | Called when no intent matched |
| start() / stop() / destroy() | Lifecycle |
VoiceAgentCallbacks methods:

| Method | Description |
|---|---|
| onTranscript(callback) | Called on each transcript |
| onIntent(callback) | Called with (intentName, message) on success |
| onError(callback) | Called on errors (engine + action failures) |
| onUnhandled(callback) | Called when no intent matched |
| onStateChange(callback) | Called on IDLE/LISTENING/PROCESSING |
| onAudioLevel(callback) | Called with mic volume (0.0-1.0) |
| registerLocalAction(intent, phrases, handler) | Register a LOCAL action |
| registerMcpAction(intent, phrases, serverUrl, toolName) | Register an MCP action |
| registerMcpAction(intent, phrases, serverUrl, toolName, strategy) | Register an MCP action with delivery strategy |
| registerWebhookAction(intent, phrases, webhookUrl) | Register a webhook action |
| onQueued(callback) | Called when an MCP command is queued (iOS) |
| discoverMcpServers(onFound) | Browse for MCP servers via mDNS/Bonjour |
| stopDiscovery() | Stop mDNS browsing |
| retryQueuedCommands(onResult) | Retry queued MCP commands |
VoiceAgentConfig properties:

| Property | Type | Default | Description |
|---|---|---|---|
| language | String | "en" | BCP-47 language tag |
| continuous | Boolean | true | Auto-restart after each utterance |
| partialResults | Boolean | false | Forward partial transcripts |
| fuzzyThreshold | Float | 0.0 | Dice similarity threshold (0 = exact only) |
| silenceTimeoutMs | Long | 1500 | Auto-promote partial to final after silence (ms) |
Action scopes:

| Scope | Handler | Use Case |
|---|---|---|
| LOCAL | LocalActionHandler | In-app actions, offline, default |
| MCP | McpActionHandler | Cross-app via local MCP server |
| REMOTE | WebhookActionHandler | Cloud workflows via n8n/Zapier |
MCP delivery strategies:

| Strategy | Platform | Behavior |
|---|---|---|
| Http (default) | All | Single HTTP call, fail on error |
| ContentProvider(authority) | Android | MCP over ContentProvider; wakes App B's process |
| LaunchAndRetry(bundleId) | macOS | Launches App B via NSWorkspace, retries MCP over HTTP |
| QueueAndNotify() | iOS | Queues MCP command, shows local notification |
| Auto(...) | All | Tries HTTP first, picks the right fallback per platform |
Auto fallback conditions (macOS and iOS):

- If macOsBundleId is set: launches App B, retries MCP over HTTP (macOS)
- If iosQueueEnabled is true: queues command, shows notification (iOS)

Distribution channels:

| Channel | Package | Tool |
|---|---|---|
| Maven Central | io.github.alimomin1998:core-* | Gradle |
| npm | @v8v/core | npm |
| npm | @v8v/react | npm |
| pub.dev | v8v_flutter | dart pub publish |
| GitHub Releases / SPM | V8VCore.xcframework.zip | gh release create |