node-personal-wakeword

Based on https://medium.com/snips-ai/machine-learning-on-voice-a-gentle-introduction-with-snips-personal-wake-word-detector-133bd6fb568e

Installation

npm i @mathquis/node-personal-wakeword

Usage

const WakewordDetector = require('@mathquis/node-personal-wakeword')
const Recorder = require('mic')
const Stream = require('stream')

async function main() {
	// Create a new wakeword detection engine
	const detector = new WakewordDetector({
		/*
		sampleRate: 16000,
		bitLength: 16,
		frameShiftMS: 10.0,
		frameLengthMS: 30.0, // Must be a multiple of frameShiftMS
		vadMode: WakewordDetector.VadMode.AGGRESSIVE, // See node-vad modes
		vadDebounceTime: 500,
		band: 5, // DTW window width
		ref: 0.22, // See Snips paper for explanation about this parameter
		preEmphasisCoefficient: 0.97, // Pre-emphasis ratio
		*/
		threshold: 0.5 // Default value
	})

	// *****

	// KEYWORD MANAGEMENT

	// Add a new keyword using multiple "templates"
	await detector.addKeyword('alexa', [
		// WAV templates (trimmed with no noise!)
		'./keywords/alexa1.wav',
		'./keywords/alexa2.wav',
		'./keywords/alexa3.wav'
	], {
		// Options
		disableAveraging: true, // Defaults to false; set to true to skip template averaging (note that resource consumption will increase)
		threshold: 0.52 // Per keyword threshold
	})

	// Keywords can be enabled/disabled at runtime
	detector.disableKeyword('alexa')
	detector.enableKeyword('alexa')

	// *****

	// EVENTS

	// The detector will emit a "ready" event when its internal audio frame buffer is filled
	detector.on('ready', () => {
		console.log('listening...')
	})

	// The detector will emit an "error" event when it encounters an error (VAD, feature extraction, etc.)
	detector.on('error', err => {
		console.error(err.stack)
	})

	// The detector will emit a "keyword" event when it has detected a keyword in the audio stream
	/* The event payload is:
		{
			"keyword"     : "alexa", // The detected keyword
			"score"       : 0.56878768987, // The detection score
			"threshold"   : 0.5, // The detection threshold used (global or keyword)
			"frames"      : 89, // The number of audio frames used in the detection
			"timestamp"   : 1592574404789, // The detection timestamp (ms)
			"audioData"   : <Buffer> // The utterance audio data (can be written to a file for debugging)
		}
	*/
	detector.on('keyword', ({keyword, score, threshold, timestamp}) => {
		console.log(`Detected "${keyword}" with score ${score} / ${threshold}`)
	})

	// Note that as the detector is a transform stream the standard "data" event also works...
	// I just added the "keyword" event for clarity :)

	// *****

	// STREAMS

	// As an alternative to events, the detector is a transform stream that takes audio buffers in and outputs keyword detection payloads
	const detectionStream = new Stream.Writable({
		objectMode: true,
		write: (data, enc, done) => {
			// `data` is the same payload as the one passed to the "keyword" and "data" events
			console.log(data)
			done()
		}
	})

	detector.pipe(detectionStream)

	// *****

	// Create an audio stream from an audio recorder (arecord, sox, etc.)
	const recorder = new Recorder({
		channels      : detector.channels, // Defaults to 1
		rate          : detector.sampleRate, // Defaults to 16000
		bitwidth      : detector.bitLength // Defaults to 16
	})

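	// Pipe the recorder's audio stream to the wakeword detector
	// (the "mic" package exposes its output through getAudioStream())
	recorder.getAudioStream().pipe(detector)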

	recorder.start()
}

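// Report failures such as an unreadable template file instead of leaving the rejection unhandled
main().catch(err => {
	console.error(err)
	process.exit(1)
})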
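
The "keyword" payload above notes that `audioData` can be written to a file for debugging. The sketch below shows one way to do that. It assumes `audioData` holds raw little-endian integer PCM that matches the detector's `sampleRate`, `bitLength` and channel count, which this README does not state, so check it against what the detector actually emits. `pcmToWav` is a hypothetical helper, not part of the library.

```javascript
const fs = require('fs')

// Hypothetical helper: wrap raw PCM samples in a 44-byte WAV (RIFF) header.
// Assumption: `pcm` is little-endian integer PCM with the given format.
function pcmToWav(pcm, {sampleRate = 16000, bitLength = 16, channels = 1} = {}) {
	const blockAlign = channels * bitLength / 8 // Bytes per sample frame
	const header = Buffer.alloc(44)
	header.write('RIFF', 0, 'ascii')
	header.writeUInt32LE(36 + pcm.length, 4) // File size minus the first 8 bytes
	header.write('WAVE', 8, 'ascii')
	header.write('fmt ', 12, 'ascii')
	header.writeUInt32LE(16, 16) // Size of the "fmt " chunk
	header.writeUInt16LE(1, 20) // Audio format 1 = integer PCM
	header.writeUInt16LE(channels, 22)
	header.writeUInt32LE(sampleRate, 24)
	header.writeUInt32LE(sampleRate * blockAlign, 28) // Byte rate
	header.writeUInt16LE(blockAlign, 32)
	header.writeUInt16LE(bitLength, 34)
	header.write('data', 36, 'ascii')
	header.writeUInt32LE(pcm.length, 40) // Size of the sample data
	return Buffer.concat([header, pcm])
}

// Possible use inside the "keyword" handler (the file name is arbitrary):
// detector.on('keyword', ({keyword, timestamp, audioData}) => {
// 	fs.writeFileSync(`${keyword}-${timestamp}.wav`, pcmToWav(audioData))
// })
```

Listening to the saved utterance is a quick way to see whether a false detection came from noise, a clipped recording, or a threshold that is too low.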

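The `band` option in the constructor above is described as the DTW window width. To illustrate what such a window does, here is a generic dynamic time warping sketch that only visits cells within `band` steps of the diagonal (often called a Sakoe-Chiba band). It is not the library's implementation: `dtwDistance`, the Euclidean frame cost and the lack of score normalization are all assumptions made for the example.

```javascript
// Euclidean distance between two equal-length feature vectors
function euclidean(a, b) {
	let sum = 0
	for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2
	return Math.sqrt(sum)
}

// Generic DTW cost between two sequences of feature vectors.
// Cells further than `band` steps from the diagonal are never visited.
function dtwDistance(x, y, band = 5) {
	const n = x.length
	const m = y.length
	// Widen the band when the lengths differ so the end cell stays reachable
	const w = Math.max(band, Math.abs(n - m))
	const cost = Array.from({length: n + 1}, () => new Array(m + 1).fill(Infinity))
	cost[0][0] = 0
	for (let i = 1; i <= n; i++) {
		const from = Math.max(1, i - w)
		const to = Math.min(m, i + w)
		for (let j = from; j <= to; j++) {
			const d = euclidean(x[i - 1], y[j - 1])
			// Extend the cheapest of the three neighbouring alignments
			cost[i][j] = d + Math.min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
		}
	}
	return cost[n][m]
}
```

A narrower band costs less to compute but tolerates less difference in speaking rate between a template and the incoming utterance.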