goxr3plus / java-google-speech-api


🙊 Speech Recognition, Text To Speech, Google Translate

Home Page: https://github.com/goxr3plus/java-google-speech-api

License: GNU General Public License v3.0

Java 100.00%
speechrecognition text-to-speech google-translate

java-google-speech-api's Introduction


THIS LIBRARY IS NO LONGER ACTIVELY SUPPORTED BY ME, feel free to contribute :)


Java Google Speech Api ( Library )

🎤

This project is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.



Google has released its official library for Google Speech Recognition. Check this issue for the official Google Speech library code solution -> #4

Add it to your project using JitPack :

https://jitpack.io/private#goxr3plus/java-google-speech-api

Step 1. Add the JitPack repository to your build file

<repositories>
	<repository>
		<id>jitpack.io</id>
		<url>https://jitpack.io</url>
	</repository>
</repositories>

Step 2. Add the dependency

<dependency>
   <groupId>com.github.goxr3plus</groupId>
   <artifactId>java-google-speech-api</artifactId>
   <version>8.0.0</version> 
</dependency>

Java Google Speech API

Warning: The default secret key I was using no longer works (because ... I would have to pay, lol), so you have to create your own. Check the tutorials :)


Features

This project is split into 3 parts:

1) Google Speech Recognition based on Chromium Speech API (which is free with restrictions for commercial applications) through GSpeechDuplex.java

 - Microphone Capture API is used (Wrapped around the current Java API for simplicity)
 - Converts WAVE files from microphone input to FLAC (using existing API, see CREDITS)
 - Retrieves Response from Google, including confidence score and text

Keep in mind that it does not currently support the new official Google Cloud Speech API (which is also free, but only for a certain amount of words).

Update 2/7/2018

Check this issue for Official Google Speech Library code solution -> #4

The new Google Cloud Speech API is not supported yet, but Google provides an official alpha library for it.

Tutorials: 1) Create Google Cloud Account, 2) Generate Speech Recognition Private API Keys

2) Google translate full support through GoogleTranslate.java

- A translator using Google Translate (courtesy of Skylion's Google Toolkit)

3) Text to Speech , Audio Synthesizer through SynthesiserV2.java

- Retrieves synthesized text in an InputStream (MP3 data ready to be played)

The program supports dozens of languages and even has the ability to auto-detect languages!

Maven Build

Maven Clean Package [ With Javadocs produced ]

mvn clean package

Maven Clean Package [ No Javadocs produced ]

mvn -Dmaven.javadoc.skip=true clean package

Java Swing speech recognition example using GSpeechDuplex.java

package Try_Google_Speech_Recognition_Simple;

import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.io.IOException;

import javax.swing.BoxLayout;
import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JPanel;
import javax.swing.JScrollPane;
import javax.swing.JTextArea;

import com.darkprograms.speech.microphone.Microphone;
import com.darkprograms.speech.recognizer.GSpeechDuplex;
import com.darkprograms.speech.recognizer.GSpeechResponseListener;
import com.darkprograms.speech.recognizer.GoogleResponse;

import net.sourceforge.javaflacencoder.FLACFileWriter;

public class TryGoogleSpeechRecognitionSimple implements GSpeechResponseListener {
	
	public static void main(String[] args) throws IOException {
		final Microphone mic = new Microphone(FLACFileWriter.FLAC);
		// You have to make your own GOOGLE_API_KEY 
		GSpeechDuplex duplex = new GSpeechDuplex("GOOGLE_API_KEY");
		
		duplex.setLanguage("en");
		
		JFrame frame = new JFrame("Jarvis Speech API DEMO");
		frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
		JTextArea response = new JTextArea();
		response.setEditable(false);
		response.setWrapStyleWord(true);
		response.setLineWrap(true);
		
		final JButton record = new JButton("Record");
		final JButton stop = new JButton("Stop");
		stop.setEnabled(false);
		
		record.addActionListener(new ActionListener() {
			public void actionPerformed(ActionEvent evt) {
				new Thread(() -> {
					try {
						duplex.recognize(mic.getTargetDataLine(), mic.getAudioFormat());
					} catch (Exception ex) {
						ex.printStackTrace();
					}
					
				}).start();
				record.setEnabled(false);
				stop.setEnabled(true);
			}
		});
		stop.addActionListener(new ActionListener() {
			public void actionPerformed(ActionEvent arg0) {
				mic.close();
				duplex.stopSpeechRecognition();
				record.setEnabled(true);
				stop.setEnabled(false);
			}
		});
		JLabel infoText = new JLabel(
				"<html><div style=\"text-align: center;\">Just hit record and watch your voice be translated into text.<br>Only English is supported by this demo, but the full API supports dozens of languages.</div></html>",
				JLabel.CENTER);
		frame.getContentPane().add(infoText);
		infoText.setAlignmentX(0.5F);
		JScrollPane scroll = new JScrollPane(response);
		frame.getContentPane().setLayout(new BoxLayout(frame.getContentPane(), BoxLayout.Y_AXIS));
		frame.getContentPane().add(scroll);
		JPanel recordBar = new JPanel();
		frame.getContentPane().add(recordBar);
		recordBar.setLayout(new BoxLayout(recordBar, BoxLayout.X_AXIS));
		recordBar.add(record);
		recordBar.add(stop);
		// Lay out and position the window before showing it
		frame.pack();
		frame.setSize(500, 500);
		frame.setLocationRelativeTo(null);
		frame.setVisible(true);
		
		duplex.addResponseListener(new GSpeechResponseListener() {
			String old_text = "";
			
			public void onResponse(GoogleResponse gr) {
				String output = gr.getResponse();
				// A null response marks the end of an utterance: commit the current text as a finished line
				if (output == null) {
					this.old_text = response.getText();
					if (this.old_text.contains("(")) {
						this.old_text = this.old_text.substring(0, this.old_text.indexOf('('));
					}
					System.out.println("Paragraph Line Added");
					this.old_text = ( response.getText() + "\n" );
					this.old_text = this.old_text.replace(")", "").replace("( ", "");
					response.setText(this.old_text);
					return;
				}
				if (output.contains("(")) {
					output = output.substring(0, output.indexOf('('));
				}
				if (!gr.getOtherPossibleResponses().isEmpty()) {
					output = output + " (" + (String) gr.getOtherPossibleResponses().get(0) + ")";
				}
				System.out.println(output);
				response.setText("");
				response.append(this.old_text);
				response.append(output);
			}
		});
	}
	
	@Override
	public void onResponse(GoogleResponse paramGoogleResponse) {
		// Unused: responses are handled by the anonymous listener registered in main()
		
	}
}
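
The listener in the example mixes text formatting with Swing updates. A minimal sketch of the formatting step on its own, assuming the same display convention as the example (the current response followed by the first alternative in parentheses); `TranscriptFormatter` and its method names are hypothetical and not part of this library:

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of java-google-speech-api. */
final class TranscriptFormatter {

    private TranscriptFormatter() {
    }

    /** Drops everything from the first '(' onwards, i.e. a trailing "(alternative)" suffix. */
    static String stripAlternative(String text) {
        int open = text.indexOf('(');
        return open >= 0 ? text.substring(0, open) : text;
    }

    /**
     * Builds the text to display: the already committed text, then the current
     * response and, when available, the first alternative in parentheses.
     */
    static String render(String committed, String response, List<String> alternatives) {
        String current = stripAlternative(response);
        if (!alternatives.isEmpty()) {
            current = current + " (" + alternatives.get(0) + ")";
        }
        return committed + current;
    }

    public static void main(String[] args) {
        // Prints "hello" on one line and "world (word)" on the next
        System.out.println(render("hello\n", "world", Arrays.asList("word")));
    }
}
```

Since the callback may arrive on a non-UI thread, the result would typically be applied with `SwingUtilities.invokeLater(() -> response.setText(text))`, because Swing components should only be modified on the event dispatch thread.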

java-google-speech-api's People

Contributors

dependabot[bot], goxr3plus, vinhtq115


java-google-speech-api's Issues

Help with using GoogleTranslate.java program

Hi, I am getting an exception when I run the program given below. The input file is a text file containing the details of 500 tweets. JavaGoogleSpeechAPISample is a class which is the same as this: java-google-speech-api/src/main/java/com/darkprograms/speech/translator/GoogleTranslate.java . When I executed the program the first time, it managed to translate 19 tweets but when I tried executing it again, it immediately gave me this exception without translating any tweets. Please help me fix this issue.

Exception in thread "main" java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.google.com/sorry/index?continue=http://translate.google.com/translate_a/single%3Fclient%3Dwebapp%26hl%3Den%26sl%3Dauto%26tl%3Den%26q%3DRT%26multires%3D1%26otf%3D0%26pc%3D0%26trs%3D1%26ssel%3D0%26tsel%3D0%26kc%3D1%26dt%3Dt%26ie%3DUTF-8%26oe%3DUTF-8%26tk%3D922019.533213&hl=en&q=EgR1wEmUGN2RsNsFIhkA8aeDSx_FqW2PwJSg7Q9x4dKugT-g9ylxMgFy
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1894)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
	at gettingtweets3.JavaGoogleSpeechAPISample.urlToText(JavaGoogleSpeechAPISample.java:173)
	at gettingtweets3.JavaGoogleSpeechAPISample.translate(JavaGoogleSpeechAPISample.java:149)
	at gettingtweets3.JavaGoogleSpeechAPISample.translate(JavaGoogleSpeechAPISample.java:130)
	at gettingtweets3.TranslatingTweets4.main(TranslatingTweets4.java:45)
package gettingtweets3;

import java.util.*;
import java.util.regex.*;
import java.io.*;
import java.io.IOException;

public class TranslatingTweets4 {
    public static void main(String[] args) throws FileNotFoundException, IOException {
        Scanner console = new Scanner(System.in);
        Scanner input = checkFileName(console);
        PrintStream output = outputFile(console);
        System.out.println();
        
        while (input.hasNextLine()) {
            String line = input.nextLine();
            Pattern pattern = Pattern.compile("^(.*?said:\\s*)(.*)$");
            Matcher matcher = pattern.matcher(line);
            
            if(matcher.find()) {
                //Prints the prefix of the line, up to and including "said:", unchanged.
                output.print(matcher.group(1));
                String tweet = matcher.group(2);
                
//                //Translate tweet by tweet
//                output.println(JavaGoogleSpeechAPISample.translate("en", tweet));
                
                //Translate word by word
                Scanner tweetScan = new Scanner(tweet);
                while (tweetScan.hasNext()) {
                    String word = tweetScan.next();
                    output.print(JavaGoogleSpeechAPISample.translate("en", word) + " ");                          
                }
                output.println(); //for translating word by word
            }
        }
    }
    
    //Uses the given Scanner object to prompt the user for an input file name
    //until the user provides a valid input file name. Then returns a scanner
    //to read the contents of the input file.
    //Parameter needed:
    // console = a user-input scanner which prompts the user for an input file name
    //           and reads the user's input
    public static Scanner checkFileName(Scanner console) throws FileNotFoundException {
       System.out.print("Input file name: ");
       File inputFile = new File(console.nextLine());
       while (!inputFile.canRead()) {
          System.out.print("File not found. Try again: ");
          inputFile = new File(console.nextLine());
       }
       System.out.println();
       return new Scanner(inputFile);
    }
   
    //Uses the given Scanner object to prompt the user for an output file name.
    //Then returns a PrintStream to write the translated tweets into the output file.
    //Parameter needed:
    // console = a user-input scanner which prompts the user for an output file name
    //           and reads the user's input
    public static PrintStream outputFile(Scanner console) throws FileNotFoundException {
       System.out.print("Output file name: ");
       return new PrintStream(new File(console.nextLine()));
    }
}
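
A 503 that redirects to `google.com/sorry` usually indicates that Google is rate limiting the client. The program above sends one request per word, so 500 tweets turn into thousands of requests. A minimal sketch of one mitigation, assuming the block is temporary: translate whole tweets and retry failed calls with exponential backoff. `RetryWithBackoff` is a hypothetical helper, and the translation call is passed in as a `Callable`, so nothing about the library's API is assumed:

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration; not part of java-google-speech-api. */
final class RetryWithBackoff {

    private RetryWithBackoff() {
    }

    /**
     * Runs the task, retrying on IOException up to maxAttempts times in total.
     * The wait between attempts starts at initialDelayMillis and doubles each time.
     */
    static <T> T call(Callable<T> task, int maxAttempts, long initialDelayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1;; attempt++) {
            try {
                return task.call();
            } catch (IOException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and surface the last failure
                }
                Thread.sleep(delay);
                delay *= 2;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated request that fails twice with a 503 before succeeding
        String result = call(() -> {
            if (++calls[0] < 3) {
                throw new IOException("HTTP 503");
            }
            return "ok";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In the program above this could wrap the per-tweet call, for example `RetryWithBackoff.call(() -> JavaGoogleSpeechAPISample.translate("en", tweet), 5, 2000)`. Backoff only spaces the requests out; it does not lift whatever limit Google applies to this endpoint.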


Connection abort after a short time

Hello,

I would like to use the Speech API for a home system. That's why the connection needs to be alive for a long time. Unfortunately, at the moment the connection aborts after a short time, so that the message "Finished write on down stream..." appears.

What could be the reason for this error? Doesn't Google allow a permanent connection or is it the fault of the HttpsURLConnection created by Java?
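
Another issue on this page reports the stream stopping after about 65 seconds and attributes it to Google. If the limit is on the server side, one workaround is to start a new session whenever the previous one ends. A minimal sketch, assuming each session is a `Runnable` that blocks until its connection closes (whether the library's `recognize` call behaves this way is not confirmed here); `SessionRestarter` is a hypothetical name:

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical helper for illustration; not part of java-google-speech-api. */
final class SessionRestarter {

    private final AtomicBoolean running = new AtomicBoolean(true);

    /**
     * Runs the session over and over until stop() is called.
     * session.run() is assumed to block until the connection ends.
     */
    void runUntilStopped(Runnable session, long pauseMillis) throws InterruptedException {
        while (running.get()) {
            try {
                session.run();
            } catch (RuntimeException e) {
                System.err.println("Session failed, restarting: " + e);
            }
            if (running.get()) {
                Thread.sleep(pauseMillis); // short pause so failures do not cause a tight loop
            }
        }
    }

    /** Requests that no further sessions are started. */
    void stop() {
        running.set(false);
    }

    public static void main(String[] args) throws InterruptedException {
        SessionRestarter restarter = new SessionRestarter();
        int[] sessions = {0};
        // Simulated sessions that end immediately; stop after the third one
        restarter.runUntilStopped(() -> {
            System.out.println("session " + (++sessions[0]));
            if (sessions[0] == 3) {
                restarter.stop();
            }
        }, 10);
    }
}
```

Audio spoken while one session is closing and the next is opening may be lost, so this suits command-style input better than continuous dictation.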

Official Google Cloud Speech API Support implementation .

This project is based on the Chromium Speech API key. That API has much stricter limits than the new Speech API on Google Cloud (which is also free).

We have to add support for the official Google Cloud Speech API. I don't know whether this will be hard, but I know it should be done.

Google is releasing its own library for that, though it is still in a very early alpha stage.

Maven and import (user) problem

I have a pom.xml file which includes the repository and dependency given in the readme.md file. It seems to work.

        <dependency>
            <groupId>com.github.goxr3plus</groupId>
            <artifactId>java-google-speech-api</artifactId>
            <version>8.0.0</version>
        </dependency>

But I cannot

import com.goxr3plus.speech.util.Complex;

only

import com.darkprograms.speech.util.Complex;

To me, it looks like I'm not getting the files of this repository, but probably those of an upstream library. What am I doing wrong?

Second issue, unrelated or not: Using the Maven dependency, I can only get version 8.0.0, not version V2.1.
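
The README example itself imports `com.darkprograms.speech...`, which suggests the published artifact keeps the original package names. A minimal sketch for checking which of the two class names from this issue is actually on the classpath; `ClasspathCheck` is a hypothetical helper:

```java
/** Hypothetical helper for illustration; not part of java-google-speech-api. */
final class ClasspathCheck {

    private ClasspathCheck() {
    }

    /** Returns true if the named class can be located, without initializing it. */
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Both names are taken from the issue text above
        String[] candidates = {
            "com.darkprograms.speech.util.Complex",
            "com.goxr3plus.speech.util.Complex"
        };
        for (String name : candidates) {
            System.out.println(name + " -> " + isPresent(name));
        }
    }
}
```

Run it with the Maven dependency on the classpath; the name that reports `true` is the one to import.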

stopSpeechRecognition() is undefined

I have this error when compiling the given sample:
The method stopSpeechRecognition() is undefined for the type GSpeechDuplex
at line
duplex.stopSpeechRecognition();

Please advise.

Btw, is this the same source code as the SpeechDemo.jar released here ?

Is this API still working?

I tried your API with the example code from the README and got a 401 error in GSpeechDuplex.class.
I realize that no Google API key is added to one of the URLs. Below is the code block used in GSpeechDuplex.class:

        String API_DOWN_URL = "https://www.google.com/speech-api/full-duplex/v1/down?maxresults=1&pair=" + PAIR;
        String API_UP_URL = "https://www.google.com/speech-api/full-duplex/v1/up?lang=" + this.language + "&lm=dictation&client=chromium&pair=" + PAIR + "&key=" + this.API_KEY + "&continuous=true&interim=true";
        Thread downChannel = this.downChannel(API_DOWN_URL);
        Thread upChannel = this.upChannel(API_UP_URL, tl, af);

There is no google api key in API_DOWN_URL
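
In the snippet quoted above, the key is sent only on the up channel, and the two requests are tied together by the shared `pair` value. A minimal sketch that rebuilds both URLs with URL-encoded parameters, using only the query parameters visible in that snippet; `DuplexUrls` is a hypothetical helper, and whether the missing key on the down channel is what causes the 401 is not established here (the README warns that the default key no longer works):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper for illustration; not part of java-google-speech-api. */
final class DuplexUrls {

    private static final String BASE = "https://www.google.com/speech-api/full-duplex/v1/";

    private DuplexUrls() {
    }

    /** URL-encodes a single query parameter value as UTF-8. */
    static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Down channel: results are read from here; no key parameter in the quoted code. */
    static String downUrl(String pair) {
        return BASE + "down?maxresults=1&pair=" + enc(pair);
    }

    /** Up channel: audio is posted here together with the language and API key. */
    static String upUrl(String language, String pair, String apiKey) {
        return BASE + "up?lang=" + enc(language)
                + "&lm=dictation&client=chromium&pair=" + enc(pair)
                + "&key=" + enc(apiKey)
                + "&continuous=true&interim=true";
    }

    public static void main(String[] args) {
        System.out.println(downUrl("abc123"));
        System.out.println(upUrl("en", "abc123", "YOUR_API_KEY"));
    }
}
```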

Official Google Cloud Speech API code

@DeathStrokeAlpha @twister21 Hello my friends, I am working with the Google Cloud Speech library, so:

Here is working code for the official Google Cloud Speech library.

If you have any problems setting the credentials, check the Stack Overflow question I asked about it.

For some reason it has the same problem as this library: it stops after 65 seconds. Google has made it like this. I am going to look for a workaround soon.

Check this -> googleapis/google-cloud-java#3188

package googleSpeech;

import java.io.IOException;
import java.sql.Date;
import java.time.LocalDate;
import java.util.Arrays;
import java.util.HashMap;

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.Line;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.Mixer;
import javax.sound.sampled.TargetDataLine;

import com.google.api.gax.rpc.ClientStream;
import com.google.api.gax.rpc.ResponseObserver;
import com.google.api.gax.rpc.StreamController;
import com.google.auth.oauth2.AccessToken;
import com.google.auth.oauth2.GoogleCredentials;
import com.google.cloud.speech.v1.RecognitionConfig;
import com.google.cloud.speech.v1.SpeechClient;
import com.google.cloud.speech.v1.StreamingRecognitionConfig;
import com.google.cloud.speech.v1.StreamingRecognizeRequest;
import com.google.cloud.speech.v1.StreamingRecognizeResponse;
import com.google.protobuf.ByteString;

public class GoogleSpeechTest {
	
	public GoogleSpeechTest() {
		
		//Set credentials?
		//	GoogleCredentials credentials = GoogleCredentials.create(new AccessToken("YOUR_ACCESS_TOKEN", Date.valueOf(LocalDate.now())));
		//	System.out.print(credentials.getAccessToken());
		
		//Target data line
		TargetDataLine microphone;
		AudioInputStream audio = null;
		
		//Check if Microphone is Supported
		checkMicrophoneAvailability();
		
		//Print available mixers
		//printAvailableMixers();
		
		//Capture Microphone Audio Data
		try {
			
			// Signed PCM AudioFormat with 16kHz, 16 bit sample size, mono
			AudioFormat format = new AudioFormat(16000, 16, 1, true, false);
			DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
			
			//Check if Microphone is Supported
			if (!AudioSystem.isLineSupported(info)) {
				System.out.println("Microphone is not available");
				System.exit(0);
			}
			
			//Get the target data line
			microphone = (TargetDataLine) AudioSystem.getLine(info);
			microphone.open(format);
			microphone.start();
			
			//Audio Input Stream
			audio = new AudioInputStream(microphone);
			
		} catch (Exception ex) {
			ex.printStackTrace();
		}
		
		//Send audio from Microphone to Google Servers and return Text
		try (SpeechClient client = SpeechClient.create()) {
			
			ResponseObserver<StreamingRecognizeResponse> responseObserver = new ResponseObserver<StreamingRecognizeResponse>() {
				
				public void onStart(StreamController controller) {
					System.out.println("Started....");
				}
				
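				public void onResponse(StreamingRecognizeResponse response) {
					// A response may carry no results, so check before reading the first one
					if (response.getResultsCount() > 0) {
						System.out.println(response.getResults(0));
					}
				}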
				
				public void onComplete() {
					System.out.println("Complete");
				}
				
				public void onError(Throwable t) {
					System.err.println(t);
				}
			};
			
			ClientStream<StreamingRecognizeRequest> clientStream = client.streamingRecognizeCallable().splitCall(responseObserver);
			
			RecognitionConfig recConfig = RecognitionConfig.newBuilder().setEncoding(RecognitionConfig.AudioEncoding.LINEAR16).setLanguageCode("en-US").setSampleRateHertz(16000)
					.build();
			StreamingRecognitionConfig config = StreamingRecognitionConfig.newBuilder().setConfig(recConfig).build();
			
			StreamingRecognizeRequest request = StreamingRecognizeRequest.newBuilder().setStreamingConfig(config).build(); // The first request in a streaming call has to be a config
			
			clientStream.send(request);
			
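			// Stream microphone audio to Google until the line is closed
			byte[] data = new byte[6400]; // 200 ms of 16 kHz, 16-bit mono audio
			while (true) {
				int bytesRead = audio.read(data);
				if (bytesRead < 0) {
					break; // end of stream
				}
				// Send only the bytes actually read in this iteration
				request = StreamingRecognizeRequest.newBuilder().setAudioContent(ByteString.copyFrom(data, 0, bytesRead)).build();
				clientStream.send(request);
			}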
		} catch (Exception e) {
			System.out.println(e);
		}
		
	}
	
	/**
	 * Checks if the Microphone is available
	 */
	public static void checkMicrophoneAvailability() {
		enumerateMicrophones().forEach((string , info) -> {
			System.out.println("Name :" + string);
		});
	}
	
	/**
	 * Generates a hashmap to simplify the microphone selection process. The keyset is the name of the audio device's Mixer The value is the first
	 * lineInfo from that Mixer.
	 * 
	 * @author Aaron Gokaslan (Skylion)
	 * @return The generated hashmap
	 */
	public static HashMap<String,Line.Info> enumerateMicrophones() {
		HashMap<String,Line.Info> out = new HashMap<String,Line.Info>();
		Mixer.Info[] mixerInfos = AudioSystem.getMixerInfo();
		for (Mixer.Info info : mixerInfos) {
			Mixer m = AudioSystem.getMixer(info);
			Line.Info[] lineInfos = m.getTargetLineInfo();
			if (lineInfos.length >= 1 && lineInfos[0].getLineClass().equals(TargetDataLine.class))//Only adds to hashmap if it is audio input device
				out.put(info.getName(), lineInfos[0]);//Please enjoy my pun
		}
		return out;
	}
	
	/**
	 * Print available mixers
	 */
	public void printAvailableMixers() {
		
		//Get available Mixers
		Mixer.Info[] mixerInfos = AudioSystem.getMixerInfo();
		
		//Print available Mixers
		Arrays.asList(mixerInfos).forEach(info -> {
			System.err.println("\n-----------Mixer--------------");
			
			Mixer mixer = AudioSystem.getMixer(info);
			
			System.err.println("\nSource Lines");
			
			//SourceLines
			Arrays.asList(mixer.getSourceLineInfo()).forEach(lineInfo -> {
				//Line Name
				System.out.println(info.getName() + "---" + lineInfo);
				Line line = null;
				try {
					line = mixer.getLine(lineInfo);
				} catch (LineUnavailableException e) {
					// TODO Auto-generated catch block
					e.printStackTrace();
				}
				System.out.println("\t-----" + line);
			});
			
			System.err.println("\nTarget Lines");
			//TargetLines
			Arrays.asList(mixer.getTargetLineInfo()).forEach(lineInfo -> {
				
				//Line Name
				System.out.println(mixer + "---" + lineInfo);
				Line line = null;
				try {
					line = mixer.getLine(lineInfo);
				} catch (LineUnavailableException e) {
					// TODO Auto-generated catch block
					e.printStackTrace();
				}
				System.out.println("\t-----" + line);
				
			});
			
		});
	}
	
	public static void main(String[] args) {
		new GoogleSpeechTest();
	}
	
}
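
The streaming loop above reads from the microphone in fixed-size chunks. A minimal sketch for deriving the chunk size from a duration instead of hard-coding a byte count, which keeps every read aligned to whole audio frames; `AudioChunks` is a hypothetical helper, and the 100 ms figure is only an example value:

```java
import javax.sound.sampled.AudioFormat;

/** Hypothetical helper for illustration; not part of any Google library. */
final class AudioChunks {

    private AudioChunks() {
    }

    /** Returns the number of bytes that hold the given duration of audio, in whole frames. */
    static int chunkBytes(AudioFormat format, int millis) {
        int frames = (int) (format.getFrameRate() * millis / 1000);
        return frames * format.getFrameSize();
    }

    public static void main(String[] args) {
        // Same format as the example above: signed PCM, 16 kHz, 16-bit, mono, little-endian
        AudioFormat format = new AudioFormat(16000, 16, 1, true, false);
        System.out.println(chunkBytes(format, 100)); // prints 3200
    }
}
```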
