V8V

An open-source, cross-platform voice orchestration framework built with Kotlin Multiplatform. It uses native on-device speech-to-text to turn spoken language into local app actions, cross-app commands via MCP, or remote workflows via webhooks. It is offline-first, multilingual, and privacy-respecting.

Microphone → Native STT → Transcript → Intent Resolver → Action Router
                                                           ├── LOCAL  (in-app lambda)
                                                           ├── MCP    (local cross-app)
                                                           └── REMOTE (n8n webhook)

No audio upload by default. Everything runs on-device unless explicitly configured otherwise.

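Single package everywhere: one dependency per platform, each including LOCAL, MCP, and REMOTE support. The package for each platform is listed in the platform/package/registry table near the end of this document.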

Platform Support

Compatibility Matrix

Architecture

graph TD
    subgraph core ["core (single package)"]
        VA[VoiceAgent] --> IR[IntentResolver]
        VA --> AR[ActionRouter]
        VA --> SE[SpeechRecognitionEngine]
        VA --> CB[VoiceAgentCallbacks]
        AR --> LH[LocalActionHandler]
        AR --> MH[McpActionHandler]
        AR --> WH[WebhookActionHandler]
    end

    subgraph engines ["Platform Engines"]
        SE --> AndroidEng[AndroidSpeechEngine]
        SE --> IosEng[IosSpeechEngine]
        SE --> MacosEng[MacosSpeechEngine]
        SE --> WebEng[WebSpeechEngine]
    end
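The routing edge of the graph (ActionRouter to one of three handlers) can be pictured as a dispatch on the action scope. The sketch below is illustrative only: `Scope`, `ResolvedIntent`, and `ScopeRouter` are hypothetical names chosen for this example, not the library's classes, and the handlers are plain lambdas standing in for the real local, MCP, and webhook handlers.

```kotlin
// Hypothetical types for illustration; not the V8V API.
enum class Scope { LOCAL, MCP, REMOTE }

/** An intent after resolution: its name, where it should run, and the captured text. */
data class ResolvedIntent(val name: String, val scope: Scope, val text: String)

/** Forwards each resolved intent to exactly one handler, chosen by scope. */
class ScopeRouter(
    private val local: (ResolvedIntent) -> String,
    private val mcp: (ResolvedIntent) -> String,
    private val remote: (ResolvedIntent) -> String,
) {
    fun dispatch(intent: ResolvedIntent): String = when (intent.scope) {
        Scope.LOCAL -> local(intent)    // in-app lambda
        Scope.MCP -> mcp(intent)        // local cross-app call
        Scope.REMOTE -> remote(intent)  // webhook call
    }
}
```

Because the scope is fixed when an action is registered, the same transcript always takes the same path, which keeps LOCAL actions free of any network dependency.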

Project Structure

Quick Start

Android / Kotlin

1. Add dependency (Gradle):

// settings.gradle.kts
dependencyResolutionManagement {
    repositories {
        mavenCentral()
    }
}

// build.gradle.kts
dependencies {
    implementation("io.github.alimomin1998:core-android:0.3.0")
}

2. Use VoiceAgentCallbacks (same callback API as iOS/macOS/Web):

Try: "create task prepare Q3 budget draft".

Advanced: If you prefer Kotlin Flows over callbacks, use VoiceAgent directly — it exposes transcript, actionResults, errors, state, and audioLevel as Flows.

iOS / macOS (Swift)

1. Add via Swift Package Manager:

In Xcode: File > Add Package Dependencies > paste this repo URL.

Or add to Package.swift:

.package(url: "https://github.com/alimomin1998/v8v.git", from: "0.3.0")

2. Use VoiceAgentCallbacks from Swift (same callback API as Android/Web):

Why VoiceAgentCallbacks? Kotlin Flows cannot be directly observed from Swift. VoiceAgentCallbacks internally collects all Flows and invokes simple callbacks.

Requirements:

iOS: NSMicrophoneUsageDescription and NSSpeechRecognitionUsageDescription in Info.plist
macOS: com.apple.security.device.audio-input entitlement + NSSpeechRecognitionUsageDescription in Info.plist
iOS requires a real device (simulator does not support speech recognition)

Web (JavaScript / TypeScript)

1. Install:

npm install @v8v/core

2. Load a single script (no bundler needed):

<script src="node_modules/@v8v/core/v8v-core.bundle.js"></script>

3. Use VoiceAgentJs (same callback API as Android/iOS/macOS):

React (Web)

V8V works in React web apps using the same @v8v/core npm package. Create a custom hook to wrap VoiceAgentJs:

1. Install:

npm install @v8v/core

2. Load one script in public/index.html (before your React bundle):

<script src="node_modules/@v8v/core/v8v-core.bundle.js"></script>

3. Create a custom hook (useVoiceAgent.js):

4. Use in a component:

Note: @v8v/core uses the Web Speech API under the hood, so React web apps get the same browser-based speech recognition.

Flutter

1. Add dependency (pubspec.yaml):

dependencies:
  v8v_flutter: ^0.3.0

2. Use in a widget:

import 'package:v8v_flutter/v8v_flutter.dart';

final agent = V8VVoiceAgent(language: 'en-US', continuous: true);

// Callbacks
agent.onTranscript.listen((text) => print('Heard: $text'));
agent.onIntent.listen((intent) => print('${intent.name}: ${intent.message}'));
agent.onError.listen((msg) => print('Error: $msg'));
agent.onStateChange.listen((state) => print('State: $state'));
agent.onAudioLevel.listen((level) => print('Level: $level'));

// LOCAL action
agent.registerAction('todo.add', {'en-US': ['add *', 'add * to todo']});

// MCP action
agent.registerMcpAction('task.create',
    {'en-US': ['create task *']}, 'http://localhost:3001/mcp', 'create_task');

// REMOTE action
agent.registerWebhookAction('notify.team',
    {'en-US': ['notify *']}, 'https://n8n.example.com/webhook/voice');

agent.start();

Native bridge via platform channels. All voice processing runs natively — no Dart speech plugins needed.

React Native / Expo

React Native and Expo support is available as a separate standalone package: @v8v/react-native.

It is a pure TypeScript reimplementation of the V8V voice agent framework (no dependency on the Kotlin Multiplatform core), with thin native modules only for the MCP embedded HTTP server.

npm install @v8v/react-native
npx expo install expo-speech-recognition   # or: npm install @react-native-voice/voice

See the @v8v/react-native README for full usage instructions.

Core API

VoiceAgent

The main entry point. Wires a speech engine, intent resolver, and action router together.

VoiceAgentJs (Web)

JavaScript-friendly facade. Available as V8V.VoiceAgentJs after loading the bundle:

VoiceAgentCallbacks (Android / iOS / macOS)

Callback-based facade available on all platforms (lives in commonMain). Provides a uniform API whether you're in Kotlin, Swift, or any KMP target:

VoiceAgentConfig

Action Scopes

MCP Reliability (v1.0)

When App A sends an MCP command to App B, App B may not be running. V8V addresses this with delivery strategies: platform-specific fallbacks intended to deliver the command even when App B's HTTP server is down.

How it works

All delivery strategies use the same MCP protocol (JSON-RPC 2.0). The strategy only changes the transport — how the JSON-RPC message reaches App B. The McpRequestRouter on App B processes the request identically regardless of how it arrived:

                         ┌─────────────────────────────────┐
                         │           App B (Server)         │
                         │                                  │
  ┌───── HTTP ──────────►│  Ktor HTTP Server                │
  │                      │       │                          │
  │                      │       ▼                          │
App A ── MCP JSON-RPC ──►│  McpRequestRouter.handleRequest()│
  │                      │       │                          │
  │                      │       ▼                          │
  └── ContentProvider ──►│  McpContentProvider.call()       │
       (Android IPC)     │       │                          │
                         │       ▼                          │
                         │  Tool handler (create_task, etc) │
                         └─────────────────────────────────┘

Both paths produce a McpToolResult and return it as a JSON-RPC response. ContentProvider is not a replacement for MCP; it is MCP delivered over Android's Binder IPC instead of HTTP.
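The idea that only the transport changes can be sketched with one request handler reached through two entry points. Everything here is a stand-in written for this example: `toolsCallRequest`, `handleRequest`, `viaHttp`, and `viaBinder` are hypothetical names, the JSON is handled as plain strings without escaping, and the result is simplified to a bare string rather than a full MCP tool result.

```kotlin
// Illustrative stand-ins only; none of these names are part of the V8V API.

/** Builds a JSON-RPC 2.0 request for the MCP `tools/call` method (string arguments only). */
fun toolsCallRequest(id: Int, tool: String, args: Map<String, String>): String {
    val arguments = args.entries.joinToString(",") { (key, value) -> "\"$key\":\"$value\"" }
    return "{\"jsonrpc\":\"2.0\",\"id\":$id,\"method\":\"tools/call\"," +
        "\"params\":{\"name\":\"$tool\",\"arguments\":{$arguments}}}"
}

/** The single processing path: look up the tool and wrap its output in a JSON-RPC response. */
fun handleRequest(json: String, tools: Map<String, () -> String>): String {
    val id = Regex("\"id\":(\\d+)").find(json)?.groupValues?.get(1) ?: "null"
    val name = Regex("\"name\":\"([^\"]+)\"").find(json)?.groupValues?.get(1)
    val tool = name?.let { tools[it] }
        // -32601 is JSON-RPC's "Method not found"; using it for an unknown tool is a choice made here.
        ?: return "{\"jsonrpc\":\"2.0\",\"id\":$id,\"error\":{\"code\":-32601,\"message\":\"Method not found\"}}"
    return "{\"jsonrpc\":\"2.0\",\"id\":$id,\"result\":\"${tool()}\"}"
}

/** HTTP-style entry point: the message arrives as a request body. */
fun viaHttp(body: String, tools: Map<String, () -> String>): String =
    handleRequest(body, tools)

/** IPC-style entry point: the message arrives inside a key/value bundle. */
fun viaBinder(extras: Map<String, String>, tools: Map<String, () -> String>): String =
    handleRequest(extras.getValue("request"), tools)
```

Both entry points return the same response for the same message, which is the property the diagram describes.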

Delivery strategies

Auto mode decision flow

Auto is the recommended strategy. It always tries HTTP first and only falls back when the HTTP call fails:

HTTP succeeds — returns the result immediately (fastest path)
HTTP fails + contentProviderAuthority is set — delivers MCP via ContentProvider (Android)
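The two cases above amount to "try HTTP, then fall back only if a fallback is configured". A minimal sketch of that control flow, assuming each transport is a function that returns the response or `null` on failure; `deliverAuto` and `DeliveryOutcome` are hypothetical names for this example and do not describe how the library implements `Auto`.

```kotlin
// Hypothetical names for illustration; not the library's implementation of Auto.

/** Which transport delivered the request ("http", "content-provider", or "none") and its response. */
data class DeliveryOutcome(val transport: String, val response: String?)

/**
 * Tries HTTP first. Falls back to the ContentProvider transport only when the
 * HTTP call fails and a fallback was configured (non-null).
 */
fun deliverAuto(
    request: String,
    http: (String) -> String?,
    contentProvider: ((String) -> String?)?,
): DeliveryOutcome {
    // Fastest path: HTTP answered, so no fallback is attempted.
    http(request)?.let { return DeliveryOutcome("http", it) }

    // HTTP failed and no authority was configured: nothing else to try.
    val fallback = contentProvider ?: return DeliveryOutcome("none", null)

    // HTTP failed but a fallback exists: deliver the same request over it.
    val response = fallback(request)
    return if (response != null) DeliveryOutcome("content-provider", response)
    else DeliveryOutcome("none", null)
}
```

Passing the fallback as a nullable parameter mirrors the idea that the ContentProvider path is opt-in: without an authority, a failed HTTP call is simply a failure.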

Usage

Android — Auto with ContentProvider fallback:

agent.registerMcpAction(
    "task.create", phrases, url, "create_task",
    strategy = McpDeliveryStrategy.Auto(
        contentProviderAuthority = "io.v8v.mcp.taskapp",
    ),
)

macOS — Launch and Retry:

agent.registerMcpAction(
    "task.create", phrases, url, "create_task",
    strategy = McpDeliveryStrategy.LaunchAndRetry(
        bundleId = "io.v8v.example.server",
    ),
)

iOS — Queue and Notify:

agent.registerMcpAction(
    intent: "task.create", phrases: phrases, serverUrl: url, toolName: "create_task",
    strategy: McpDeliveryStrategy.QueueAndNotify(
        notificationTitle: "Voice Command Pending",
        retryOnForeground: true
    )
)

App B setup (Android ContentProvider)

Extend McpContentProvider and declare it in AndroidManifest. The ContentProvider reuses the same McpRequestRouter and tool handlers as the HTTP server:

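class TaskMcpContentProvider : McpContentProvider() {
    override fun createRouter(): McpRequestRouter {
        val router = McpRequestRouter("my-app")
        router.registerTool("create_task", "Create a task") { args ->
            // Assumption: the argument key ("text") and the fallback value are
            // placeholders; adjust both to match your tool's arguments.
            val text = args["text"] ?: "Untitled task"
            TaskRepository.addTask(text)
            mcpSuccess()
        }
        return router
    }
}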

<provider android:name=".TaskMcpContentProvider"
    android:authorities="io.v8v.mcp.taskapp" android:exported="true" />

Android requirements

1. Context initialization — call V8VMcp.init(context) once before using MCP reliability features. The best place is Application.onCreate():

class MyApp : Application() {
    override fun onCreate() {
        super.onCreate()
        V8VMcp.init(this)
    }
}

2. Package visibility (Android 11+) — App A must declare App B's ContentProvider in <queries> to access it. Without this, ContentResolver.call() silently fails on API 30+:

<!-- In App A's AndroidManifest.xml -->
<queries>
    <provider android:authorities="io.v8v.mcp.taskapp" />
</queries>

mDNS / Bonjour Discovery

MCP servers can advertise themselves on the local network:

// Server side (App B)
server.start(advertise = true)

// Client side (App A)
agent.discoverMcpServers { service ->
    agent.registerMcpAction("task.create", phrases, service.url, "create_task")
}

Supported on Android (NsdManager), iOS/macOS (Network framework), and JVM (jmDNS). No-op on Web.

Intent Matching

agent.registerAction(
    intent = "task.create",
    phrases = mapOf(
        "en-US" to listOf("create task *", "add task *"),
        "hi-IN" to listOf("* task banao"),
        "es" to listOf("crear tarea *"),
    ),
) { /* ... */ }

Pass 1 — Wildcard regex: Pattern create task * becomes regex ^create task (.+)$. Exact match gives confidence 1.0.
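The wildcard-to-regex step can be sketched as follows. This is an illustration under stated assumptions rather than the library's resolver: `patternToRegex` and `matchWildcard` are hypothetical names, literal text is escaped with `Regex.escape`, and case-insensitive matching is an assumption the source does not state.

```kotlin
/**
 * Converts a phrase pattern such as "create task *" into an anchored regex
 * equivalent to ^create task (.+)$. Each "*" becomes a capturing group.
 */
fun patternToRegex(pattern: String): Regex {
    val body = pattern.trim().split("*").joinToString("(.+)") { literal ->
        if (literal.isEmpty()) "" else Regex.escape(literal)
    }
    // Assumption: transcripts are matched case-insensitively.
    return Regex("^$body$", RegexOption.IGNORE_CASE)
}

/** Returns the captured wildcard values, or null when the transcript does not match. */
fun matchWildcard(pattern: String, transcript: String): List<String>? =
    patternToRegex(pattern).matchEntire(transcript.trim())?.groupValues?.drop(1)
```

For example, `matchWildcard("create task *", "create task prepare Q3 budget draft")` captures `"prepare Q3 budget draft"`, and a pattern with a leading wildcard such as `"* task banao"` captures the text before the fixed suffix.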

Pass 2 — Fuzzy (Dice similarity): When fuzzyThreshold > 0 and exact matching fails:

Dice = (2 * |intersection|) / (|A| + |B|)
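The formula can be made concrete with a short sketch. The source does not say what the sets A and B contain, so this example assumes lower-cased character bigrams; `bigrams`, `dice`, and `isFuzzyMatch` are hypothetical names, and the library may tokenize differently.

```kotlin
/** Character bigrams of the lower-cased text, as a set (assumed tokenization). */
fun bigrams(text: String): Set<String> =
    text.lowercase().windowed(size = 2).toSet()

/** Dice = (2 * |A ∩ B|) / (|A| + |B|), where A and B are the bigram sets of the inputs. */
fun dice(a: String, b: String): Double {
    val setA = bigrams(a)
    val setB = bigrams(b)
    // Strings shorter than two characters have no bigrams; compare them directly.
    if (setA.isEmpty() && setB.isEmpty()) return if (a.equals(b, ignoreCase = true)) 1.0 else 0.0
    return 2.0 * (setA intersect setB).size / (setA.size + setB.size)
}

/** Fuzzy matching applies only when the threshold is above zero. */
fun isFuzzyMatch(transcript: String, phrase: String, fuzzyThreshold: Double): Boolean =
    fuzzyThreshold > 0.0 && dice(transcript, phrase) >= fuzzyThreshold
```

With this tokenization, "night" and "nacht" share one bigram ("ht") out of four each, giving a score of 0.25, while identical strings score 1.0.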

Building from Source

Prerequisites

JDK 17+
Android SDK 35
Xcode 15+ (for Apple targets)

Build & Test

# JVM + JS compilation
./gradlew :core:compileKotlinJvm :core:compileKotlinJs

# Run tests (all merged into core)
./gradlew :core:jvmTest

# Android example
./gradlew :example-android:assembleDebug

# Build XCFramework (iOS + macOS)
./gradlew :core:assembleV8VCoreReleaseXCFramework

# Lint check (ktlint)
./gradlew ktlintCheck

Publishing

Six distribution channels are published via GitHub Actions CI or locally:

# Local release (core only — Maven Central + @v8v/core npm)
./scripts/release.sh 0.3.0

# All packages are published via GitHub Actions on tag push.
# See PUBLISHING.md for full details.

Running Examples

Android

./gradlew :example-android:installDebug

Uses io.github.alimomin1998:core-android:0.3.0 from Maven Central.

Try: "add project status update" (LOCAL), "create task schedule review" (MCP), "notify team build is ready" (REMOTE)

Web

cd example-web
npm install    # installs @v8v/core from npm
npm start      # serves on http://localhost:5174

Open in Chrome. Set MCP URL to http://localhost:3001/mcp in Settings (optional).

iOS (SwiftUI)

./gradlew :core:assembleV8VCoreReleaseXCFramework
open example-ios/V8VPhone.xcodeproj

Requires a real device — the iOS Simulator does not support speech recognition.

macOS (SwiftUI)

./gradlew :core:assembleV8VCoreReleaseXCFramework
open example-macos/V8VMac.xcodeproj

MCP Server Examples (for testing)

Each platform has a matching server example. The server runs on port 3001 and exposes create_task, list_tasks, and complete_task MCP tools:

# Node.js (simplest for quick testing)
cd example-server-web && npm install && npm start    # port 3001

# Android (Ktor CIO embedded server)
./gradlew :example-server-android:installDebug

# Flutter
cd example-server-flutter && flutter run

# React (Node.js + dashboard)
cd example-server-react && npm install && npm start   # port 3001

iOS and macOS server examples use Xcode — open the .xcodeproj and run on device.

Flutter

cd packages/flutter/example && flutter run

License

Copyright 2026 V8V Contributors
Licensed under the Apache License, Version 2.0

See LICENSE for the full text.


Table: Compatibility Matrix
Component	Minimum	Recommended
Android SDK	API 24 (Android 7.0)	API 35 (Android 15)
iOS	16.0	17+
macOS	13.0 (Ventura)	14+ (Sonoma)
Web Browser	Chrome 33+ / Edge 79+	Chrome 120+
Safari (Web)	Not supported (no Web Speech API)	—
Firefox (Web)	Not supported (no Web Speech API)	—
JDK	17	17
Kotlin	2.1.20	2.1.20
Gradle	8.0	8.7+
Xcode	15.0	15+
Ktor	3.0.3	3.0.3
Node.js (MCP server)	18+	20+

Table: VoiceAgent methods
Method	Description
`registerAction(intent, phrases, handler)`	Register a voice command
`start()`	Begin listening
`stop()`	Stop listening
`updateConfig(config)`	Change language, continuous mode, fuzzy threshold at runtime
`destroy()`	Release all resources

Table: VoiceAgent flows
Property	Type	Description
`state`	`StateFlow<AgentState>`	`IDLE`, `LISTENING`, `PROCESSING`
`transcript`	`SharedFlow<String>`	Every final (or partial) transcript
`errors`	`SharedFlow<VoiceAgentError>`	Structured errors (permission, engine, action)
`actionResults`	`SharedFlow<ActionResult>`	Success/Error from dispatched actions
`audioLevel`	`StateFlow<Float>`	Normalized 0.0-1.0 mic volume

Table: VoiceAgentJs (Web) methods
Method	Description
`registerPhrase(intent, lang, pattern)`	Register a LOCAL voice command
`registerMcpAction(intent, lang, phrases[], serverUrl, toolName)`	Register MCP action (Ktor HTTP)
`registerWebhookAction(intent, lang, phrases[], webhookUrl)`	Register webhook action (Ktor HTTP)
`onTranscript(callback)`	Called on each transcript
`onIntent(callback)`	Called with `(intentName, message)`
`onError(callback)`	Called on errors
`onUnhandled(callback)`	Called when no intent matched
`start()` / `stop()` / `destroy()`	Lifecycle

Table: VoiceAgentCallbacks methods
Method	Description
`onTranscript(callback)`	Called on each transcript
`onIntent(callback)`	Called with `(intentName, message)` on success
`onError(callback)`	Called on errors (engine + action failures)
`onUnhandled(callback)`	Called when no intent matched
`onStateChange(callback)`	Called on IDLE/LISTENING/PROCESSING
`onAudioLevel(callback)`	Called with mic volume (0.0-1.0)
`registerLocalAction(intent, phrases, handler)`	Register a LOCAL action
`registerMcpAction(intent, phrases, serverUrl, toolName)`	Register an MCP action
`registerMcpAction(intent, phrases, serverUrl, toolName, strategy)`	Register an MCP action with delivery strategy
`registerWebhookAction(intent, phrases, webhookUrl)`	Register a webhook action
`onQueued(callback)`	Called when an MCP command is queued (iOS)
`discoverMcpServers(onFound)`	Browse for MCP servers via mDNS/Bonjour
`stopDiscovery()`	Stop mDNS browsing
`retryQueuedCommands(onResult)`	Retry queued MCP commands

Table: Packages per platform
Platform	Package	Registry
Android / Kotlin	`io.github.alimomin1998:core-android`	Maven Central
iOS / macOS	`V8VCore.xcframework`	SPM / CocoaPods
Web / JS / TS	`@v8v/core`	npm
React (Web)	`@v8v/react`	npm
Flutter	`v8v_flutter`	pub.dev
React Native / Expo	`@v8v/react-native`	npm (separate repo)
JVM	`io.github.alimomin1998:core-jvm`	Maven Central

Table: Platform Support
Platform	Status	Speech engine	Distribution
Android	Available	`android.speech.SpeechRecognizer`	Maven Central
iOS	Available	`SFSpeechRecognizer` + `AVAudioEngine`	XCFramework / SPM
macOS	Available	`SFSpeechRecognizer` + `AVAudioEngine`	XCFramework / SPM
Web	Available	Web Speech API	npm (`@v8v/core`)
React (Web)	Available	Web Speech API	npm (`@v8v/react`)
Flutter	Available	Native bridge (Android + iOS)	pub.dev (`v8v_flutter`)
JVM (Desktop)	Core only	Bring your own engine	Maven Central
React Native / Expo	Available	Native STT (via adapters)	npm (`@v8v/react-native`) — separate repo
Windows	Planned	—	—
Linux	Planned	—	—

Table: VoiceAgentConfig
Property	Type	Default	Description
`language`	`String`	`"en"`	BCP-47 language tag
`continuous`	`Boolean`	`true`	Auto-restart after each utterance
`partialResults`	`Boolean`	`false`	Forward partial transcripts
`fuzzyThreshold`	`Float`	`0.0`	Dice similarity threshold (0 = exact only)
`silenceTimeoutMs`	`Long`	`1500`	Auto-promote partial to final after silence (ms)

Table: Action Scopes
Scope	Handler	Use case
`LOCAL`	`LocalActionHandler`	In-app actions, offline, default
`MCP`	`McpActionHandler`	Cross-app via local MCP server
`REMOTE`	`WebhookActionHandler`	Cloud workflows via n8n/Zapier

Table: MCP delivery strategies
Strategy	Platform	Behavior
`Http` (default)	All	Single HTTP call, fail on error
`ContentProvider(authority)`	Android	MCP over ContentProvider — wakes App B's process
`LaunchAndRetry(bundleId)`	macOS	Launches App B via NSWorkspace, retries MCP over HTTP
`QueueAndNotify()`	iOS	Queues MCP command, shows local notification
`Auto(...)`	All	Tries HTTP first, picks the right fallback per platform

Table: Distribution channels
Channel	Artifact	Published with
Maven Central	`io.github.alimomin1998:core-*`	Gradle
npm	`@v8v/core`	npm
npm	`@v8v/react`	npm
pub.dev	`v8v_flutter`	`dart pub publish`
GitHub Releases / SPM	`V8VCore.xcframework.zip`	`gh release create`