react-native-fluidaudio

React Native wrapper for FluidAudio - a Swift library for ASR, VAD, Speaker Diarization, and TTS on Apple platforms.

Features

ASR (Automatic Speech Recognition) - High-quality speech-to-text using Parakeet TDT models
Streaming ASR - Real-time transcription from microphone or system audio
VAD (Voice Activity Detection) - Detect speech segments in audio
Speaker Diarization - Identify and track different speakers
TTS (Text-to-Speech) - Natural voice synthesis using Kokoro TTS

Requirements

iOS 17.0+
React Native 0.71+ or Expo SDK 50+

Installation

Expo (Recommended)

npm install @fluidinference/react-native-fluidaudio
npx expo run:ios

Expo automatically handles native dependencies - no manual pod install needed.

Note: Expo Go is not supported - native modules require a development build.

React Native CLI

npm install @fluidinference/react-native-fluidaudio
cd ios && pod install

Usage

Basic Transcription

import { ASRManager, onModelLoadProgress } from '@fluidinference/react-native-fluidaudio';

// Monitor model loading progress
const subscription = onModelLoadProgress((event) => {
  console.log(`Model loading: ${event.status} (${event.progress}%)`);
});

// Initialize ASR (downloads models on first run)
const asr = new ASRManager();
await asr.initialize();

// Transcribe an audio file
const result = await asr.transcribeFile('/path/to/audio.wav');
console.log(result.text);
console.log(`Confidence: ${result.confidence}`);
console.log(`Processing speed: ${result.rtfx}x realtime`);

// Clean up
subscription.remove();

Streaming Transcription

import { StreamingASRManager } from '@fluidinference/react-native-fluidaudio';

const streaming = new StreamingASRManager();

// Start streaming with update callback
await streaming.start({ source: 'microphone' }, (update) => {
  console.log('Confirmed:', update.confirmed);
  console.log('Volatile:', update.volatile);
});

// Feed audio data (16-bit PCM, 16 kHz, base64-encoded)
await streaming.feedAudio(base64AudioChunk);

// Stop and get final result
const result = await streaming.stop();
console.log('Final transcription:', result.text);
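
The `feedAudio` call above expects base64-encoded 16-bit PCM at 16 kHz. If your capture pipeline produces floating-point samples instead, a conversion along the following lines may help. This is a minimal sketch: `floatToPcm16Base64` and `bytesToBase64` are hypothetical helpers, not part of this package, and the code assumes mono, little-endian samples that have already been resampled to 16 kHz.

```typescript
const BASE64_ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Encode raw bytes as base64 without relying on Buffer or btoa,
// which may not exist in every React Native runtime.
function bytesToBase64(bytes: Uint8Array): string {
  let out = '';
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i] ?? 0;
    const b1 = bytes[i + 1] ?? 0;
    const b2 = bytes[i + 2] ?? 0;
    out += BASE64_ALPHABET.charAt(b0 >> 2);
    out += BASE64_ALPHABET.charAt(((b0 & 0x03) << 4) | (b1 >> 4));
    // Pad with '=' when the final group has fewer than three bytes.
    out += i + 1 < bytes.length
      ? BASE64_ALPHABET.charAt(((b1 & 0x0f) << 2) | (b2 >> 6))
      : '=';
    out += i + 2 < bytes.length ? BASE64_ALPHABET.charAt(b2 & 0x3f) : '=';
  }
  return out;
}

// Convert float samples in [-1, 1] to little-endian 16-bit PCM, then base64.
// Assumption: input is mono and already at 16 kHz.
function floatToPcm16Base64(samples: Float32Array): string {
  const buffer = new ArrayBuffer(samples.length * 2);
  const view = new DataView(buffer);
  for (let i = 0; i < samples.length; i++) {
    const clamped = Math.max(-1, Math.min(1, samples[i] ?? 0));
    const value = clamped < 0
      ? Math.round(clamped * 32768)
      : Math.round(clamped * 32767);
    view.setInt16(i * 2, value, true); // true = little-endian
  }
  return bytesToBase64(new Uint8Array(buffer));
}
```

The returned string could then be passed to `streaming.feedAudio(...)` one chunk at a time.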

Voice Activity Detection

import { VADManager } from '@fluidinference/react-native-fluidaudio';

const vad = new VADManager();
await vad.initialize({ threshold: 0.85 });

// Process audio file
const result = await vad.processFile('/path/to/audio.wav');

// Get speech segments
const segments = vad.getSpeechSegments(result);
segments.forEach((seg) => {
  console.log(`Speech from ${seg.start}s to ${seg.end}s`);
});
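
VAD output often contains many short segments separated by brief pauses. The sketch below shows one way to post-process them, merging segments whose gap is below a threshold and summing the total speech time. `SpeechSegment`, `mergeCloseSegments`, and `totalSpeechSeconds` are hypothetical names for illustration; the only assumption about the library is that each segment has `start` and `end` in seconds, as in the example above.

```typescript
// Assumed shape, mirroring the `start`/`end` fields used above.
interface SpeechSegment {
  start: number; // seconds
  end: number; // seconds
}

// Merge segments separated by a gap of at most `maxGapSeconds`.
// The input array and its elements are left unmodified.
function mergeCloseSegments(
  segments: SpeechSegment[],
  maxGapSeconds: number,
): SpeechSegment[] {
  const sorted = [...segments].sort((a, b) => a.start - b.start);
  const merged: SpeechSegment[] = [];
  for (const seg of sorted) {
    const last = merged[merged.length - 1];
    if (last !== undefined && seg.start - last.end <= maxGapSeconds) {
      // Close enough to the previous segment: extend it.
      last.end = Math.max(last.end, seg.end);
    } else {
      merged.push({ ...seg });
    }
  }
  return merged;
}

// Sum the duration of all segments, in seconds.
function totalSpeechSeconds(segments: SpeechSegment[]): number {
  return segments.reduce((sum, seg) => sum + (seg.end - seg.start), 0);
}
```

For instance, `mergeCloseSegments(segments, 0.3)` would join segments separated by pauses of 0.3 seconds or less; the right threshold depends on your audio.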

Speaker Diarization

import { DiarizationManager } from '@fluidinference/react-native-fluidaudio';

const diarizer = new DiarizationManager();
await diarizer.initialize({
  clusteringThreshold: 0.7,
  numClusters: -1, // Auto-detect number of speakers
});

// Diarize audio file
const result = await diarizer.diarizeFile('/path/to/meeting.wav');

// Get speaker information
const speakers = diarizer.getUniqueSpeakers(result);
const speakingTime = diarizer.getSpeakingTime(result);

result.segments.forEach((segment) => {
  console.log(`${segment.speakerId}: ${segment.startTime}s - ${segment.endTime}s`);
});

// Pre-register known speakers for identification
await diarizer.setKnownSpeakers([
  { id: 'alice', name: 'Alice', embedding: aliceEmbedding },
  { id: 'bob', name: 'Bob', embedding: bobEmbedding },
]);
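
For display purposes it can be convenient to collapse consecutive segments from the same speaker into a single turn. The following is a minimal sketch of that idea. `SpeakerSegment`, `toSpeakerTurns`, and `formatTurn` are hypothetical names, and the code assumes only the `speakerId`, `startTime`, and `endTime` fields shown above, with `speakerId` taken to be a string.

```typescript
// Assumed shape, mirroring the fields logged in the example above.
interface SpeakerSegment {
  speakerId: string;
  startTime: number; // seconds
  endTime: number; // seconds
}

// Collapse back-to-back segments from the same speaker into one turn.
// The input array and its elements are left unmodified.
function toSpeakerTurns(segments: SpeakerSegment[]): SpeakerSegment[] {
  const sorted = [...segments].sort((a, b) => a.startTime - b.startTime);
  const turns: SpeakerSegment[] = [];
  for (const seg of sorted) {
    const last = turns[turns.length - 1];
    if (last !== undefined && last.speakerId === seg.speakerId) {
      last.endTime = Math.max(last.endTime, seg.endTime);
    } else {
      turns.push({ ...seg });
    }
  }
  return turns;
}

// Render a turn as a single transcript-style line.
function formatTurn(turn: SpeakerSegment): string {
  return `[${turn.startTime.toFixed(1)}s - ${turn.endTime.toFixed(1)}s] ${turn.speakerId}`;
}
```

Calling `toSpeakerTurns(result.segments).map(formatTurn)` would then yield one line per speaker turn.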

Text-to-Speech

import { TTSManager } from '@fluidinference/react-native-fluidaudio';

const tts = new TTSManager();
await tts.initialize({ variant: 'fiveSecond' });

// Synthesize to audio data
const result = await tts.synthesize('Hello, world!');
console.log(`Audio duration: ${result.duration}s`);
// result.audioData is base64-encoded 16-bit PCM

// Synthesize directly to file
await tts.synthesizeToFile('Hello, world!', '/path/to/output.wav');
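
`synthesizeToFile` already produces a WAV file, but if you need a WAV container in memory (for example, to hand the output of `synthesize` to a player), the raw PCM has to be wrapped in a header. The sketch below shows the standard 44-byte RIFF/WAVE header for 16-bit PCM. `pcm16ToWav` and `pcm16DurationSeconds` are hypothetical helpers, not part of this package. The sample rate is left as a parameter because this README does not state the TTS output rate, and decoding `result.audioData` from base64 into bytes is left to the caller.

```typescript
// Wrap raw little-endian 16-bit PCM bytes in a minimal WAV (RIFF) container.
// Assumption: `sampleRate` and `channels` match how the PCM was produced.
function pcm16ToWav(
  pcm: Uint8Array,
  sampleRate: number,
  channels: number = 1,
): Uint8Array {
  const bytesPerSample = 2;
  const blockAlign = channels * bytesPerSample;
  const wav = new Uint8Array(44 + pcm.length);
  const view = new DataView(wav.buffer);

  const writeTag = (offset: number, tag: string): void => {
    for (let i = 0; i < tag.length; i++) {
      view.setUint8(offset + i, tag.charCodeAt(i));
    }
  };

  writeTag(0, 'RIFF');
  view.setUint32(4, 36 + pcm.length, true); // file size minus 8 bytes
  writeTag(8, 'WAVE');
  writeTag(12, 'fmt ');
  view.setUint32(16, 16, true); // fmt chunk size for PCM
  view.setUint16(20, 1, true); // audio format 1 = uncompressed PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * blockAlign, true); // byte rate
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, 16, true); // bits per sample
  writeTag(36, 'data');
  view.setUint32(40, pcm.length, true); // data chunk size
  wav.set(pcm, 44);
  return wav;
}

// Duration implied by a PCM byte count, useful as a sanity check
// against the reported `result.duration`.
function pcm16DurationSeconds(
  byteLength: number,
  sampleRate: number,
  channels: number = 1,
): number {
  return byteLength / (2 * channels * sampleRate);
}
```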

System Information

import { getSystemInfo } from '@fluidinference/react-native-fluidaudio';

const info = await getSystemInfo();
console.log(info.summary);
// e.g., "Apple A17 Pro, iOS 17.0"

Cleanup

import { cleanup } from '@fluidinference/react-native-fluidaudio';

// Clean up all resources when done
await cleanup();

API Reference

Managers

| Manager | Description |
| --- | --- |
| `ASRManager` | Speech-to-text transcription |
| `StreamingASRManager` | Real-time streaming transcription |
| `VADManager` | Voice activity detection |
| `DiarizationManager` | Speaker identification |
| `TTSManager` | Text-to-speech synthesis |

Events

| Event | Description |
| --- | --- |
| `onStreamingUpdate` | Streaming transcription updates |
| `onModelLoadProgress` | Model download/compilation progress |
| `onTranscriptionError` | Transcription errors |

Types

See src/types.ts for complete TypeScript definitions.

Notes

Model Loading

The first initialization downloads and compiles the ML models (~500 MB total). Expect this to take roughly 20-30 seconds while the models are compiled for Apple's Neural Engine. Subsequent loads reuse the cached compilation and take about 1 second.
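
Because the first initialization is slow, it may be worth ensuring that it runs only once even when several parts of an app request it at the same time. The sketch below shows a generic way to share a single in-flight initialization. `once` is a hypothetical helper, not part of this package, and whether the managers already guard against repeated initialization is not something this README states.

```typescript
// Wrap an async initializer so concurrent callers share one in-flight promise.
// If initialization fails, the cached promise is cleared so a later call can retry.
function once<T>(init: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return (): Promise<T> => {
    if (pending === undefined) {
      pending = init().catch((err) => {
        pending = undefined; // allow a retry after failure
        throw err;
      });
    }
    return pending;
  };
}
```

A possible use would be `const ensureAsrReady = once(() => asr.initialize());`, with each screen awaiting `ensureAsrReady()` before transcribing.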

TTS License

The TTS module uses ESpeakNG, which is GPL-licensed. Check that this is compatible with your project's license before shipping.

License

MIT
