Comments (7)
Missing files:
- CommonVoice-Data/names.py (needs data from an Italian source and must be patched so it can parse Italian data) #4
- CommonVoice-Data/libretheatre.py (needs data from an Italian source; may have to be rewritten entirely) #5
- CommonVoice-Data/wikipedia.py (done)
- CommonVoice-Data/wikisource.py (needs an Italian translation of the book "Les Forceurs de blocus"; has to be rewritten to scrape the Italian book) #5
- CommonVoice-Data/framabook.py #5
- CommonVoice-Data/utils.py (needs to be adapted to the Italian language)
from deepspeech-italian-model.
Just an update: the Dockerfile is already set up for Italian, so if you want to contribute:
- Read the Dockerfile even if you are not familiar with Docker, because it is the entry point for all the scripts
- Read run.sh
- Read run_fr.sh to understand the order of the scripts
- Migrate every file to use Italian parameters or Italian files
For any other info, we are on Telegram [at]mozitabot, in the Developers channel.
You can also open a PR for a single file if you are short on time; other people will start working on the others.
from deepspeech-italian-model.
Already started working on it #3
YEAH!
from deepspeech-italian-model.
After talking with @lissyx: libretheatre and framabook can be removed and replaced with something else, and for wikisource we can use another book.
The point of this script is to have resources for testing the generated model, so we can evaluate what to use.
So let's focus on names.py
from deepspeech-italian-model.
One of the things we have to do is generate the model and upload it: https://github.com/MozillaItalia/commonvoice-it/blob/master/DeepSpeech/build_lm.sh#L12
If the model doesn't exist, this script automatically downloads one that is already available.
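For anyone porting that step, here is a rough sketch of the "download only if missing" behaviour in Python. It is not the content of build_lm.sh; the function name `ensure_file`, the `fetch` parameter and the example URL are made up for illustration.

```python
from pathlib import Path
from typing import Callable, Optional
from urllib.request import urlretrieve


def _download(url: str, dest: Path) -> None:
    # Default fetcher: plain HTTP(S) download with the standard library.
    urlretrieve(url, str(dest))


def ensure_file(path: Path, url: str,
                fetch: Optional[Callable[[str, Path], None]] = None) -> bool:
    """Make sure `path` exists, downloading it from `url` only when missing.

    Returns True if a download happened, False if the file was already there.
    `fetch` is a hypothetical hook so the download step can be swapped out.
    """
    if path.exists():
        return False  # reuse the file that is already on disk
    path.parent.mkdir(parents=True, exist_ok=True)
    (fetch or _download)(url, path)
    return True
```

Calling it would look like `ensure_file(Path("lm/lm.binary"), "https://example.invalid/lm.binary")`, where both the path and the URL are placeholders and not the real location of the model.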
from deepspeech-italian-model.
We now have 2 new tickets to better track what we have to do: #4 and #5
from deepspeech-italian-model.
We can remove trainingspeech because it is a French project and I don't think a similar one exists for Italian: https://gitlab.com/nicolaspanel/TrainingSpeech
Done
from deepspeech-italian-model.