The MicVAD API is for recording user audio in the browser and running callbacks on speech segments and related events.

| Package | Supported |
| --- | --- |
| @ricky0123/vad-web | Yes |
| @ricky0123/vad-node | No |
| @ricky0123/vad-react | No, use the useMicVAD hook |

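import { MicVAD } from "@ricky0123/vad-web"

// MicVAD.new is an async static method, so await the instance
const myvad = await MicVAD.new({
  onSpeechEnd: (audio) => {
    // audio is a Float32Array of samples between -1 and 1, sample rate 16000
  },
})

// Start listening to mic input
myvad.start()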
New instances of MicVAD are created by calling the async static method MicVAD.new(options). The options object can contain the following fields (all are optional).

| Option | Type | Description |
| --- | --- | --- |
| additionalAudioConstraints | | Constraints to pass to getUserMedia via the audio field |
| onFrameProcessed | (probabilities: {isSpeech: number; notSpeech: number}) => any | Callback to run after each frame |
| onVADMisfire | () => any | Callback to run if speech start was detected but onSpeechEnd will not be run because the audio segment is smaller than minSpeechFrames |
| onSpeechStart | () => any | Callback to run when speech start is detected |
| onSpeechEnd | (audio: Float32Array) => any | Callback to run when speech end is detected. Takes as its argument a Float32Array of audio samples between -1 and 1, sample rate 16000. This will not run if the audio segment is smaller than minSpeechFrames |
| positiveSpeechThreshold | number | See algorithm configuration |
| negativeSpeechThreshold | number | See algorithm configuration |
| redemptionFrames | number | See algorithm configuration |
| frameSamples | number | See algorithm configuration |
| preSpeechPadFrames | number | See algorithm configuration |
| minSpeechFrames | number | See algorithm configuration |

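Since onSpeechEnd hands you raw samples between -1 and 1 at a sample rate of 16000, a common next step is to measure the segment or convert it to 16-bit PCM before encoding or uploading it. The sketch below is one way to do that; the names VAD_SAMPLE_RATE, segmentDurationSeconds, and floatTo16BitPCM are hypothetical helpers, not part of the library.

```typescript
// Sample rate of the audio passed to onSpeechEnd, as stated in the options table.
const VAD_SAMPLE_RATE = 16000

// Hypothetical helper: duration of a speech segment in seconds.
function segmentDurationSeconds(audio: Float32Array): number {
  return audio.length / VAD_SAMPLE_RATE
}

// Hypothetical helper: convert samples in [-1, 1] to signed 16-bit PCM.
// Values outside [-1, 1] are clamped before scaling.
function floatTo16BitPCM(audio: Float32Array): Int16Array {
  const pcm = new Int16Array(audio.length)
  for (let i = 0; i < audio.length; i++) {
    const s = Math.max(-1, Math.min(1, audio[i]))
    // Negative samples scale to -32768, positive samples to 32767.
    pcm[i] = Math.round(s < 0 ? s * 0x8000 : s * 0x7fff)
  }
  return pcm
}
```

Inside an onSpeechEnd callback you could call floatTo16BitPCM(audio) and pass the result to whatever encoder or transport you use.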

| Attribute | Type | Description |
| --- | --- | --- |
| listening | boolean | Is the VAD listening to mic input or is it paused? |
| pause | () => void | Stop listening to mic input |
| start | () => void | Start listening to mic input |

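The three attributes above are enough to build a mute-style toggle. The following is a minimal sketch; PausableVAD and toggleListening are hypothetical names, and the interface only mirrors the attributes listed in the table rather than the library's full type.

```typescript
// Structural type covering only the attributes listed in the table (an assumption,
// not the library's exported type).
interface PausableVAD {
  listening: boolean
  start: () => void
  pause: () => void
}

// Hypothetical helper: pause the VAD if it is listening, otherwise start it.
// Returns the listening state reported by the VAD after the call.
function toggleListening(vad: PausableVAD): boolean {
  if (vad.listening) {
    vad.pause()
  } else {
    vad.start()
  }
  return vad.listening
}
```

A MicVAD instance could be passed directly, for example from a button click handler.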
The NonRealTimeVAD API is for identifying segments of user speech if you already have a Float32Array of audio samples.

| Package | Supported |
| --- | --- |
| @ricky0123/vad-web | Yes |
| @ricky0123/vad-node | Yes |
| @ricky0123/vad-react | No |

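import * as vad from "@ricky0123/vad-node"

// All fields of the options object are optional
const options: Partial<vad.NonRealTimeVADOptions> = {}
const myvad = await vad.NonRealTimeVAD.new(options)

const audioFileData = ... // Float32Array of audio samples
const nativeSampleRate = ... // sample rate of audioFileData
for await (const { audio, start, end } of myvad.run(audioFileData, nativeSampleRate)) {
  // Each iteration yields one detected speech segment
}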
New instances of NonRealTimeVAD are created by calling the async static method NonRealTimeVAD.new(options). The options object is a Partial<NonRealTimeVADOptions>, so all of its fields are optional.

| Attribute | Type | Description |
| --- | --- | --- |
| run | async function* (inputAudio: Float32Array, sampleRate: number): AsyncGenerator | Run the VAD model on your audio |

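Because run is an async generator, segments arrive one at a time. If you would rather work with a plain array, a small wrapper can drain the generator. This is a sketch under assumptions: SpeechSegment, SegmentSource, and collectSegments are hypothetical names, and the numeric types of start and end are assumed from the destructuring in the example above.

```typescript
// Shape of each yielded item, based on the { audio, start, end } destructuring
// in the usage example. Treating start and end as numbers is an assumption.
interface SpeechSegment {
  audio: Float32Array
  start: number
  end: number
}

// Structural type for anything exposing a run method like the one in the table.
interface SegmentSource {
  run(inputAudio: Float32Array, sampleRate: number): AsyncGenerator<SpeechSegment>
}

// Hypothetical helper: drain the async generator into an array of segments.
async function collectSegments(
  vad: SegmentSource,
  inputAudio: Float32Array,
  sampleRate: number,
): Promise<SpeechSegment[]> {
  const segments: SpeechSegment[] = []
  for await (const segment of vad.run(inputAudio, sampleRate)) {
    segments.push(segment)
  }
  return segments
}
```

Collecting everything up front is convenient for short clips; for long recordings, iterating with for await directly avoids holding every segment in memory.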
useMicVAD is a React hook wrapper for MicVAD. Use this if you want to run the VAD model on mic input in a React application.

| Package | Supported |
| --- | --- |
| @ricky0123/vad-web | No, use MicVAD |
| @ricky0123/vad-node | No |
| @ricky0123/vad-react | Yes |

import { useMicVAD } from "@ricky0123/vad-react"

const MyComponent = () => {
  const vad = useMicVAD({
    startOnLoad: true,
    onSpeechEnd: (audio) => {
      console.log("User stopped talking")
    },
  })
  return <div>{vad.userSpeaking && "User is speaking"}</div>
}
The useMicVAD hook takes an options object with the following fields (all optional).

| Option | Type | Description |
| --- | --- | --- |
| startOnLoad | boolean | Should the VAD start listening to mic input when it finishes loading? |
| additionalAudioConstraints | | Constraints to pass to getUserMedia via the audio field |
| onFrameProcessed | (probabilities: {isSpeech: number; notSpeech: number}) => any | Callback to run after each frame |
| onVADMisfire | () => any | Callback to run if speech start was detected but onSpeechEnd will not be run because the audio segment is smaller than minSpeechFrames |
| onSpeechStart | () => any | Callback to run when speech start is detected |
| onSpeechEnd | (audio: Float32Array) => any | Callback to run when speech end is detected. Takes as its argument a Float32Array of audio samples between -1 and 1, sample rate 16000. This will not run if the audio segment is smaller than minSpeechFrames |
| positiveSpeechThreshold | number | See algorithm configuration |
| negativeSpeechThreshold | number | See algorithm configuration |
| redemptionFrames | number | See algorithm configuration |
| frameSamples | number | See algorithm configuration |
| preSpeechPadFrames | number | See algorithm configuration |
| minSpeechFrames | number | See algorithm configuration |

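The onSpeechStart, onSpeechEnd, and onVADMisfire callbacks together describe what happened to each detected speech start: it either produced a segment or was dropped as a misfire. One way to observe that during tuning is a small set of counting callbacks. This is only a sketch; SpeechStats and createStatsCallbacks are hypothetical names, and the 16000 divisor comes from the sample rate stated for onSpeechEnd.

```typescript
// Running totals for the three speech-related callbacks.
interface SpeechStats {
  started: number
  completed: number
  misfires: number
  speechSeconds: number
}

// Hypothetical helper: build callbacks that update a shared stats object.
function createStatsCallbacks() {
  const stats: SpeechStats = { started: 0, completed: 0, misfires: 0, speechSeconds: 0 }
  const callbacks = {
    onSpeechStart: () => {
      stats.started += 1
    },
    onSpeechEnd: (audio: Float32Array) => {
      stats.completed += 1
      // Segment length in seconds at the documented sample rate of 16000.
      stats.speechSeconds += audio.length / 16000
    },
    onVADMisfire: () => {
      stats.misfires += 1
    },
  }
  return { stats, callbacks }
}
```

The callbacks object can be spread into the options passed to useMicVAD or MicVAD.new. In a React component you would likely want to create it once (for example in a ref) so the counts survive re-renders.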

| Attribute | Type | Description |
| --- | --- | --- |
| listening | boolean | Is the VAD currently listening to mic input? |
| errored | false \| { message: string } | Did the VAD fail to load? |
| loading | boolean | Is the VAD still loading? |
| userSpeaking | boolean | Is the user speaking? |
| pause | () => void | Stop the VAD from running on mic input |
| start | () => void | Start running the VAD on mic input |

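The state attributes returned by the hook overlap (for instance, the VAD cannot be listening while it is still loading), so a component usually checks them in a fixed order. The sketch below shows one possible ordering; MicVADState and describeVADState are hypothetical names, and the interface mirrors only the attributes in the table above.

```typescript
// Mirrors the state attributes in the table; not the library's exported type.
interface MicVADState {
  listening: boolean
  errored: false | { message: string }
  loading: boolean
  userSpeaking: boolean
}

// Hypothetical helper: map the hook state to a short status label.
// Errors take priority, then loading, then paused, then speech activity.
function describeVADState(state: MicVADState): string {
  if (state.errored) return `Error: ${state.errored.message}`
  if (state.loading) return "Loading"
  if (!state.listening) return "Paused"
  return state.userSpeaking ? "User is speaking" : "Listening"
}
```

In a component, the object returned by useMicVAD could be passed to describeVADState and the result rendered as text.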