Comments (12)
Hi Bera,
This is indeed very doable!
Question for you -- it isn't enough to have a dictionary for another
language. It needs to be a ranked dictionary, where you have some sense of
common (a, the, banana) vs uncommon (adolescent, optometrist). Did you come
across such a list for Portuguese? It'd be fun to build your own too with
e.g. nltk and a nice corpus.
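As a rough sketch of the "build your own" idea: the snippet below counts words in a plain-text corpus using only the Python standard library (rather than nltk) and returns them ordered from most to least common. The function name `ranked_words`, the tokenizing regex, and the sample sentence are illustrative assumptions, not part of zxcvbn.

```python
import re
from collections import Counter

# Hypothetical tokenizer: runs of letters only (no digits or underscores).
WORD_RE = re.compile(r"[^\W\d_]+", re.UNICODE)

def ranked_words(text, min_count=1):
    """Return (word, count) pairs, most common first; ties sorted alphabetically."""
    counts = Counter(WORD_RE.findall(text.lower()))
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(word, count) for word, count in ordered if count >= min_count]

# Tiny stand-in for a real corpus (assumption: real input would be far larger).
sample = "o gato e o cachorro e o gato"
for word, count in ranked_words(sample):
    print(word, count)
```

With a real corpus, raising `min_count` is one simple way to drop rare tokens and typos before using the list.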
Recommended steps:
- Install the zxcvbn dependencies: coffeescript, java, python, and the simplejson python module.
- Clone zxcvbn and confirm these steps work for you:
    cd zxcvbn/scripts
    python build_frequency_lists.py
    cd ..
- Adapt build_frequency_lists.py to add your Portuguese lists and (optionally) remove the English lists. I recommend doing this by adding your data source to zxcvbn/data and making as minor a change as possible to build_frequency_lists.py to read it in.
Hope that helps, let me know if you have any other questions.
Dan
On 13 May 2014 06:49, Bera [email protected] wrote:
Hi,
First of all, I want to congratulate everyone involved in the development of this project.
I'm interested in improving this library's usefulness for the Portuguese language. To do that, I guess I need to research common words and the most popular password words in this language in order to build a more accurate bad-password list.
Is there any information describing how to change the code to provide a dictionary of words in another language, or maybe some simple approach for using this lib with a bad-password list and a list of disallowed passwords in another language too?
Thanks!
from zxcvbn.
Google n-grams might be an acceptable source for ranked dictionaries. Freely available at
http://commondatastorage.googleapis.com/books/syntactic-ngrams/index.html
I suppose tweet word frequency lists could be an even better estimate for casually spelled and keyboard-based word usage, but I have no source for those.
EDIT:
I gave a bad (English-only) link to Google's n-gram data. This one covers more languages, but still does not include Portuguese: http://storage.googleapis.com/books/ngrams/books/datasetsv2.html
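A minimal sketch of turning such n-gram data into a ranked word list, assuming the 1-gram files use the tab-separated layout described on the datasets page (ngram, year, match_count, volume_count); that layout should be checked against the actual download. The function name and the sample rows are made up for illustration.

```python
from collections import defaultdict

def aggregate_ngram_counts(lines):
    """Sum match counts per word across all years and return words, most common first.

    Assumes each line looks like: ngram<TAB>year<TAB>match_count<TAB>volume_count
    """
    totals = defaultdict(int)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip malformed lines
        ngram, match_count = fields[0], fields[2]
        if "_" in ngram:
            continue  # skip part-of-speech tagged variants such as 'casa_NOUN'
        totals[ngram.lower()] += int(match_count)
    return sorted(totals, key=lambda word: (-totals[word], word))

# Made-up sample rows, not real dataset values.
sample = [
    "casa\t1990\t120\t40\n",
    "casa\t1991\t130\t42\n",
    "Casa\t1990\t10\t5\n",
    "casa_NOUN\t1990\t120\t40\n",
    "libro\t1990\t200\t60\n",
]
print(aggregate_ngram_counts(sample))
```

Lowercasing merges "Casa" and "casa" into one entry, which seems reasonable for password matching but is a design choice rather than a requirement.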
from zxcvbn.
Good idea!
On 13 May 2014 14:44, Björn Stein [email protected] wrote:
Google n-grams might be an acceptable source for ranked dictionaries.
Freely available at
http://commondatastorage.googleapis.com/books/syntactic-ngrams/index.html
I suppose tweet word frequency lists could be an even better estimate for
casually spelled and keyboard-based word usage, but I have no source for
those.
from zxcvbn.
Thank you guys! I'll try and let you know soon.
from zxcvbn.
@BERA did you ever have any success with this?
from zxcvbn.
Hi Erik,
Unfortunately I gave up on it this year, but I'm planning to add it to my
to-do list for next year.
It will be a great experience for sure. Maybe I can port this to golang
and make an API for it.
Thanks for getting in touch, and I apologize for leaving this issue abandoned for
now.
On Thu, 5 Nov 2015 at 17:40, Erik Beeson [email protected] wrote:
@BERA https://github.com/Bera did you ever have any success with this?
from zxcvbn.
Hello, I would like to add an Italian dictionary to this library. As a first step, I added my dictionary file to the data folder and modified build_frequency_lists.py:
    DICTIONARIES = dict(
        us_tv_and_film=30000,
        english_wikipedia=30000,
        passwords=30000,
        surnames=10000,
        male_names=None,
        female_names=None,
        italian_dictionary=None,
    )
I added the last line before the ")". Unfortunately, I could not get build_frequency_lists.py to run. Would you please help me figure out how to do it? Thank you for your time.
from zxcvbn.
I forked this repository and made some adjustments for Dutch. I added first and last names, and I added words from the Dutch Wikipedia using the same method as for English.
Repository here: https://github.com/pepve/zxcvbn-nl
Relevant commit here: pepve/zxcvbn-nl@30fad91
from zxcvbn.
maybe http://letterfrequency.org/letter-frequency-by-language/ could assist in porting the library to other languages...
from zxcvbn.
To anyone interested: I'm working on adding Italian words and names to zxcvbn here, based on Wikipedia entries and common Italian names.
from zxcvbn.
To help anyone who, like me, has no Python experience at all:
- Add your files to the data folder.
- Change build_frequency_lists.py so it includes your file name in the DICTIONARIES dict.
- Run python build_frequency_lists.py ../data ../src/frequency_lists.coffee
- Run npm install
That created a new .js file which contains the new dictionaries!
Good luck to anyone having this issue.
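To make the idea of a ranked dictionary with an optional size cap (the 30000 / None values in DICTIONARIES) more concrete, here is a simplified illustration. It is not the code in build_frequency_lists.py; it assumes a data file with one token per line, most common first, optionally followed by a count. The name `build_rank_table` and the sample lines are hypothetical.

```python
def build_rank_table(lines, limit=None):
    """Map each token to its 1-based rank.

    Assumes lines are ordered most common first and that the token is the
    first whitespace-separated field. `limit` caps the number of tokens kept;
    None keeps everything.
    """
    ranks = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        token = parts[0].lower()
        if token in ranks:
            continue  # keep the best (earliest) rank for duplicates
        if limit is not None and len(ranks) >= limit:
            break
        ranks[token] = len(ranks) + 1
    return ranks

# Made-up sample data in the assumed "token count" layout.
sample_lines = ["ciao 900", "amore 750", "", "Ciao 20", "pizza 500"]
print(build_rank_table(sample_lines, limit=2))
```

A lower rank means a more common word, which is what lets the estimator treat "ciao" as a much weaker password component than a rare word.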
from zxcvbn.
And you need Python 2, because some things were removed in Python 3. For example, dict.iteritems() is now dict.items().
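If running under Python 3 is preferred, one possible workaround is a small compatibility helper like the sketch below. The helper name `iter_items` is hypothetical, and whether this alone is enough to run the script on Python 3 is not confirmed here; other Python 2 idioms may need similar treatment.

```python
def iter_items(mapping):
    """Iterate over (key, value) pairs on both Python 2 and Python 3."""
    if hasattr(mapping, "iteritems"):
        # Python 2 dicts provide a lazy iteritems().
        return mapping.iteritems()
    # Python 3 dicts only provide items(), which returns a view.
    return iter(mapping.items())

frequencies = {"ciao": 1, "amore": 2}
for token, rank in iter_items(frequencies):
    print(token, rank)
```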
from zxcvbn.