
govpr's Introduction

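Voiceprint Recognition

From Alibaba Jaq's (阿里聚安全) introduction to voiceprint recognition: "Exploring a Powerful Tool for Identity Authentication: Voiceprint Recognition"

Introduction

govpr is a speaker recognition (voiceprint recognition) engine written in Go and based on GMM-UBM. It can be used for voice verification and identity recognition. For now it supports only speech consisting of Chinese digits, and the audio must be in WAV format (16000 Hz sample rate, 16-bit, mono).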
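The format requirement above can be checked before audio is handed to the engine. The following is a minimal sketch, not part of govpr: the names wavFormat, parseWavHeader, checkFormat and buildHeader are hypothetical, and the code assumes a canonical 44-byte PCM header in which the "fmt " chunk directly follows the RIFF header.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// wavFormat holds the header fields relevant to the stated format.
// (Hypothetical type, not part of govpr.)
type wavFormat struct {
	Channels      uint16
	SampleRate    uint32
	BitsPerSample uint16
}

// parseWavHeader reads the format fields from a canonical 44-byte header.
// Assumption: the "fmt " chunk immediately follows "RIFF....WAVE".
func parseWavHeader(b []byte) (wavFormat, error) {
	if len(b) < 44 {
		return wavFormat{}, errors.New("wav: header shorter than 44 bytes")
	}
	if string(b[0:4]) != "RIFF" || string(b[8:12]) != "WAVE" || string(b[12:16]) != "fmt " {
		return wavFormat{}, errors.New("wav: not a canonical RIFF/WAVE header")
	}
	return wavFormat{
		Channels:      binary.LittleEndian.Uint16(b[22:24]),
		SampleRate:    binary.LittleEndian.Uint32(b[24:28]),
		BitsPerSample: binary.LittleEndian.Uint16(b[34:36]),
	}, nil
}

// checkFormat returns an error unless f is 16000 Hz, 16-bit, mono.
func checkFormat(f wavFormat) error {
	if f.SampleRate != 16000 || f.BitsPerSample != 16 || f.Channels != 1 {
		return fmt.Errorf("wav: unsupported format %+v, want 16000 Hz, 16-bit, mono", f)
	}
	return nil
}

// buildHeader fabricates a minimal header for the demonstration below.
func buildHeader(channels uint16, rate uint32, bits uint16) []byte {
	b := make([]byte, 44)
	copy(b[0:4], "RIFF")
	copy(b[8:12], "WAVE")
	copy(b[12:16], "fmt ")
	binary.LittleEndian.PutUint32(b[16:20], 16) // size of the fmt chunk
	binary.LittleEndian.PutUint16(b[20:22], 1)  // audio format 1 = PCM
	binary.LittleEndian.PutUint16(b[22:24], channels)
	binary.LittleEndian.PutUint32(b[24:28], rate)
	binary.LittleEndian.PutUint16(b[34:36], bits)
	copy(b[36:40], "data")
	return b
}

func main() {
	f, err := parseWavHeader(buildHeader(1, 16000, 16))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("format ok:", checkFormat(f) == nil)
}
```

In a real program the first 44 bytes of the file would be passed to parseWavHeader in place of the fabricated header.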

Installation

go get -v -u github.com/liuxp0827/govpr

cd $GOPATH/src/github.com/liuxp0827/govpr/example

go run main.go

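Example

Below is a simple example; see the example directory for a more detailed one. The speech in the example consists purely of 8-digit numbers. Verification yields a score, and a threshold can be set to decide whether the verification speech belongs to the enrolled speaker. In the example the threshold is preset to 1.0: a score >= 1.0 is accepted as the enrolled speaker, and a score < 1.0 is rejected.

Score

(Note: the threshold of 1.0 is not an optimal value; it is only an example. In addition, female voiceprints tend to score lower, so in theory different thresholds should be used for different genders. govpr cannot yet distinguish gender from the voice; this feature is planned for later development.)

Note

The example trains and verifies with five sets of speech whose content is entirely different. In practice, however, govpr is better suited to text-dependent speaker recognition: using five sets of training and verification speech with the same content gives better recognition results.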
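As a rough sketch of the per-gender thresholds suggested in the note, the decision could be looked up by a group label that the caller supplies, since govpr does not detect gender itself. The group names, the accept function and the value 0.8 are placeholders for illustration, not values tuned for or shipped with govpr.

```go
package main

import "fmt"

// thresholds maps a caller-supplied group label to an acceptance threshold.
// Labels and values are hypothetical placeholders; real values would have
// to be tuned on labelled recordings.
var thresholds = map[string]float64{
	"default": 1.0, // the example threshold from the text
	"female":  0.8, // placeholder only, not a measured value
}

// accept reports whether score passes the threshold for group,
// falling back to the default threshold for unknown groups.
func accept(score float64, group string) bool {
	t, ok := thresholds[group]
	if !ok {
		t = thresholds["default"]
	}
	return score >= t
}

func main() {
	fmt.Println(accept(1.2, "default"))
	fmt.Println(accept(0.9, "default"))
	fmt.Println(accept(0.9, "female"))
}
```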

package main

import (
	"github.com/liuxp0827/govpr"
	"github.com/liuxp0827/govpr/log"
	"github.com/liuxp0827/govpr/waveIO"
	"io/ioutil"
)

type engine struct {
	vprEngine *govpr.VPREngine
}

func NewEngine(sampleRate, delSilRange int, ubmFile, userModelFile string) (*engine, error) {
	vprEngine, err := govpr.NewVPREngine(sampleRate, delSilRange, false, ubmFile, userModelFile)
	if err != nil {
		return nil, err
	}
	return &engine{vprEngine: vprEngine}, nil
}

func (this *engine) DestroyEngine() {
	this.vprEngine = nil
}

func (this *engine) TrainSpeech(buffers [][]byte) error {

	var err error
	count := len(buffers)
	for i := 0; i < count; i++ {
		err = this.vprEngine.AddTrainBuffer(buffers[i])
		if err != nil {
			log.Error(err)
			return err
		}
	}

	defer this.vprEngine.ClearTrainBuffer()
	defer this.vprEngine.ClearAllBuffer()

	err = this.vprEngine.TrainModel()
	if err != nil {
		log.Error(err)
		return err
	}

	return nil
}

func (this *engine) RecSpeech(buffer []byte) (float64, error) {

	err := this.vprEngine.AddVerifyBuffer(buffer)
	defer this.vprEngine.ClearVerifyBuffer()
	if err != nil {
		log.Error(err)
		return -1.0, err
	}

	err = this.vprEngine.VerifyModel()
	if err != nil {
		log.Error(err)
		return -1.0, err
	}

	return this.vprEngine.GetScore(), nil
}

func main() {
	log.SetLevel(log.LevelDebug)

	vprEngine, err := NewEngine(16000, 50, "../ubm/ubm", "model/test.dat")
	if err != nil {
		log.Fatal(err)
	}

	trainlist := []string{
		"wav/train/01_32468975.wav",
		"wav/train/02_58769423.wav",
		"wav/train/03_59682734.wav",
		"wav/train/04_64958273.wav",
		"wav/train/05_65432978.wav",
	}

	trainBuffer := make([][]byte, 0)

	for _, file := range trainlist {
		buf, err := loadWaveData(file)
		if err != nil {
			log.Error(err)
			return
		}
		trainBuffer = append(trainBuffer, buf)
	}

	err = vprEngine.TrainSpeech(trainBuffer)
	if err != nil {
		log.Fatal(err)
	}

	var threshold float64 = 1.0

	selfverifyBuffer, err := waveIO.WaveLoad("wav/verify/self_34986527.wav")
	if err != nil {
		log.Fatal(err)
	}

	self_score, err := vprEngine.RecSpeech(selfverifyBuffer)
	if err != nil {
		log.Fatal(err)
	}

	log.Infof("self score %f, pass? %v", self_score, self_score >= threshold)

	otherverifyBuffer, err := waveIO.WaveLoad("wav/verify/other_38974652.wav")
	if err != nil {
		log.Fatal(err)
	}

	other_score, err := vprEngine.RecSpeech(otherverifyBuffer)
	if err != nil {
		log.Fatal(err)
	}

	log.Infof("other score %f, pass? %v", other_score, other_score >= threshold)
}

func loadWaveData(file string) ([]byte, error) {
	data, err := ioutil.ReadFile(file)
	if err != nil {
		return nil, err
	}
	// skip the canonical 44-byte .wav header and keep only the PCM samples
	data = data[44:]
	return data, nil
}
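loadWaveData assumes the header is exactly 44 bytes long. WAV files that carry extra chunks (for example a LIST chunk) have a longer header, so a more defensive variant could walk the RIFF chunks and return the payload of the "data" chunk. The following is a sketch under that assumption; pcmData, chunk and demoFile are hypothetical helpers, not part of govpr or its waveIO package.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// pcmData returns the payload of the "data" chunk of a RIFF/WAVE file.
func pcmData(wav []byte) ([]byte, error) {
	if len(wav) < 12 || string(wav[0:4]) != "RIFF" || string(wav[8:12]) != "WAVE" {
		return nil, errors.New("wav: missing RIFF/WAVE header")
	}
	pos := 12
	for pos+8 <= len(wav) {
		id := string(wav[pos : pos+4])
		size := int(binary.LittleEndian.Uint32(wav[pos+4 : pos+8]))
		body := pos + 8
		if size < 0 || size > len(wav)-body {
			if id == "data" {
				// Declared size exceeds the file; take what is there.
				return wav[body:], nil
			}
			return nil, errors.New("wav: chunk size exceeds file length")
		}
		if id == "data" {
			return wav[body : body+size], nil
		}
		// Chunks are word-aligned: an odd-sized chunk has one pad byte.
		pos = body + size + size%2
	}
	return nil, errors.New("wav: no data chunk found")
}

// chunk encodes one RIFF chunk (id, little-endian size, payload, padding).
func chunk(id string, payload []byte) []byte {
	out := make([]byte, 8, 8+len(payload)+1)
	copy(out[0:4], id)
	binary.LittleEndian.PutUint32(out[4:8], uint32(len(payload)))
	out = append(out, payload...)
	if len(payload)%2 == 1 {
		out = append(out, 0)
	}
	return out
}

// demoFile builds a tiny WAV-like file whose header is longer than 44 bytes.
func demoFile() []byte {
	wav := []byte("RIFF\x00\x00\x00\x00WAVE") // RIFF size left at 0 for brevity
	wav = append(wav, chunk("fmt ", make([]byte, 16))...)
	wav = append(wav, chunk("LIST", []byte("abc"))...) // extra odd-sized chunk
	wav = append(wav, chunk("data", []byte{1, 2, 3, 4})...)
	return wav
}

func main() {
	data, err := pcmData(demoFile())
	fmt.Println(data, err)
}
```

For the demo file a fixed 44-byte offset would land inside the header, whereas the chunk walk returns only the four sample bytes.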

govpr's People

Contributors

liuxp0827


govpr's Issues

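Poor discrimination for female voiceprints

Hello. While running voiceprint recognition experiments with govpr, I found that the scores for female voices differ very little, so identity verification is inaccurate. What can be done to improve this?

(The experiments used the ubm file bundled with the project, and the test program follows the program in example exactly.)

I currently have no idea how to solve this. Any hints would be appreciated, thanks!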

How to generate UBM file?

Hi,
Thanks for your project.
In the example, the UBM file ("../ubm/ubm") is required, and it seems you created it yourself.

vprEngine, err := NewEngine(16000, 50, "../ubm/ubm", "model/test.dat")

Can you please tell me how to generate the UBM file by myself?
Thank you

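A question

How do I use the waveIO.WaveSave() method?

Error in example

Running the example fails with:
2017/03/04 21:29:18 [engine.go:59] [E] gzip: invalid header
2017/03/04 21:29:18 [main.go:73] [F] model load failed: gzip: invalid header
Changing the NewVPRFile function in file/vprfile.go so that it does not use gzip, and using the commented-out code instead, resolves the problem.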
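The "gzip: invalid header" error is what compress/gzip returns when the input does not start with the gzip magic bytes. One way to accept both compressed and uncompressed files is to peek at the first two bytes before choosing a reader. This is only a sketch of that idea, not the code in file/vprfile.go; openMaybeGzip, gzipBytes and readAll are hypothetical names.

```go
package main

import (
	"bufio"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// openMaybeGzip wraps r in a gzip reader only when the stream starts with
// the gzip magic bytes 0x1f 0x8b; otherwise it returns the data unchanged.
func openMaybeGzip(r io.Reader) (io.Reader, error) {
	br := bufio.NewReader(r)
	magic, err := br.Peek(2) // Peek does not consume the bytes
	if err == nil && magic[0] == 0x1f && magic[1] == 0x8b {
		zr, err := gzip.NewReader(br)
		if err != nil {
			return nil, err
		}
		return zr, nil
	}
	// Shorter than two bytes or not gzip: read it as plain data.
	return br, nil
}

// gzipBytes compresses p in memory (writes to a bytes.Buffer cannot fail).
func gzipBytes(p []byte) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write(p)
	zw.Close()
	return buf.Bytes()
}

// readAll decodes src through openMaybeGzip and returns the content.
func readAll(src []byte) (string, error) {
	r, err := openMaybeGzip(bytes.NewReader(src))
	if err != nil {
		return "", err
	}
	out, err := io.ReadAll(r)
	return string(out), err
}

func main() {
	plain := []byte("model bytes")
	a, _ := readAll(plain)
	b, _ := readAll(gzipBytes(plain))
	fmt.Println(a == b)
}
```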

Index out of range error

I replaced the wav files in example/wav/train with my own recordings and hit an error when I rebuilt and ran the example program.

./example
panic: runtime error: index out of range [62415] with length 62415

goroutine 1 [running]:
github.com/liuxp0827/govpr.(*VPREngine).AddTrainBuffer(0xc000098000, 0xc0000d402c, 0xf3cf, 0xf5cf, 0x0, 0x0)
	/Users/ferdi/GOPATH/src/github.com/liuxp0827/govpr/engine.go:163 +0x241
main.(*engine).TrainSpeech(0xc00000e038, 0xc000120000, 0x5, 0x8, 0x0, 0x0)
	/Users/ferdi/GOPATH/src/github.com/liuxp0827/govpr/example/main.go:31 +0x9d
main.main()
	/Users/ferdi/GOPATH/src/github.com/liuxp0827/govpr/example/main.go:95 +0x210

It's caused by a wrong bound in a for loop: ii < length should be ii < length - 1, because the loop body reads int16(buf[ii+1]), which is out of range when length is odd.

	for ii := 0; ii < length; ii += 2 {
		cBuff16 := int16(buf[ii])
		cBuff16 |= int16(buf[ii+1]) << 8
		sBuff = append(sBuff, cBuff16)
	}

Please have a look at the issue and I'm willing to send a PR.
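The bound proposed in this issue can be illustrated with a standalone conversion function. This is a sketch, not the code in engine.go: bytesToInt16 is a hypothetical name, and dropping a trailing odd byte is one possible policy (returning an error would be another).

```go
package main

import "fmt"

// bytesToInt16 converts little-endian 16-bit PCM bytes to samples.
// The condition i+1 < len(buf) guarantees buf[i+1] is in range, so a
// buffer with an odd length no longer panics; its last byte is ignored.
func bytesToInt16(buf []byte) []int16 {
	samples := make([]int16, 0, len(buf)/2)
	for i := 0; i+1 < len(buf); i += 2 {
		s := int16(buf[i]) | int16(buf[i+1])<<8
		samples = append(samples, s)
	}
	return samples
}

func main() {
	// Five bytes: two complete samples plus one stray byte.
	fmt.Println(bytesToInt16([]byte{0x01, 0x00, 0xff, 0xff, 0x7f}))
}
```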
