crate-ci / typos
Source code spell checker
License: Apache License 2.0
Some hex numbers can look like words, and those words can in turn look like typos. We should follow scspell's lead and ignore them.
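As a rough sketch of what ignoring hex could look like: the helper below, looks_like_hex, is a hypothetical name, and the rule it applies (a 0x prefix, or all hex digits with at least one decimal digit) is an assumption for illustration, not necessarily what scspell does.

```rust
/// Returns true when `token` is plausibly a hex number rather than a word.
/// Assumed rule (not taken from scspell): either a `0x`/`0X` prefix followed
/// by hex digits, or all hex digits with at least one decimal digit.
fn looks_like_hex(token: &str) -> bool {
    let prefixed = token
        .strip_prefix("0x")
        .or_else(|| token.strip_prefix("0X"));
    if let Some(digits) = prefixed {
        return !digits.is_empty() && digits.bytes().all(|b| b.is_ascii_hexdigit());
    }
    // Without a prefix, require a decimal digit so plain words made of
    // the letters a-f (like "cafe") are still spell checked.
    token.bytes().all(|b| b.is_ascii_hexdigit()) && token.bytes().any(|b| b.is_ascii_digit())
}
```

Requiring a digit in the unprefixed case is a trade-off: it keeps real words checkable at the cost of still flagging hex strings that happen to contain only letters.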
For large projects, it can be helpful to support layered configs.
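One possible shape for layering, sketched under assumptions: the Config struct, its fields, and the update / resolve functions are all hypothetical names. Each layer leaves unset values as None, and later layers override earlier ones.

```rust
/// Hypothetical config: every field is optional so a layer can leave it unset.
#[derive(Debug, Default, Clone, PartialEq)]
struct Config {
    check_filenames: Option<bool>,
    locale: Option<String>,
}

impl Config {
    /// Overlay `other` on top of `self`: values set in `other` win.
    fn update(&mut self, other: &Config) {
        if let Some(value) = other.check_filenames {
            self.check_filenames = Some(value);
        }
        if let Some(value) = &other.locale {
            self.locale = Some(value.clone());
        }
    }
}

/// Merge layers from lowest to highest priority, e.g. repo root first,
/// then a sub-directory config.
fn resolve(layers: &[Config]) -> Config {
    let mut merged = Config::default();
    for layer in layers {
        merged.update(layer);
    }
    merged
}
```

With a root layer that sets both fields and a sub-directory layer that only sets locale, the merged result keeps the root's check_filenames and takes the sub-directory's locale.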
These would exist to help debug configurations.
Right now we proactively split the input into lines and then parse within each line. What if we instead found the line number afterwards, by counting the newlines before the match? This puts the cost on the typo case, which should be rare, rather than on every line we parse.
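A minimal sketch of the idea, assuming the parser reports a byte offset into the whole buffer for each typo; line_and_column is a hypothetical helper name.

```rust
/// Compute the 1-based line number and 0-based byte column of `offset`
/// by scanning only the bytes before it. Panics if `offset` is past the
/// end of `buffer`.
fn line_and_column(buffer: &[u8], offset: usize) -> (usize, usize) {
    let before = &buffer[..offset];
    // Each newline before the offset starts a new line.
    let line = before.iter().filter(|&&b| b == b'\n').count() + 1;
    // The current line starts right after the last newline, if any.
    let line_start = before
        .iter()
        .rposition(|&b| b == b'\n')
        .map_or(0, |i| i + 1);
    (line, offset - line_start)
}
```

This only runs when a typo is found, so clean files never pay for line tracking.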
See https://github.com/myint/scspell/blob/master/scspell/__init__.py#L363 for how scspell does it.
I'm thinking this should be one of the programmatic messages sent to stdout rather than using a logging crate.
Currently we load the entire file into memory and search all of it for a null byte.
See #29 for how other implementations speed this up.
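One cheap option, sketched here as an assumption rather than as what #29 settles on, is to inspect only the start of the file. looks_binary and its limit parameter are hypothetical names.

```rust
use std::io::Read;

/// Guess whether `reader` holds binary data by looking for a NUL byte in
/// at most the first `limit` bytes. This is an assumed heuristic; a NUL
/// byte later in the file is deliberately not detected.
fn looks_binary<R: Read>(reader: R, limit: u64) -> std::io::Result<bool> {
    let mut head = Vec::new();
    // `take` caps how much is read, so large files are not loaded in full.
    reader.take(limit).read_to_end(&mut head)?;
    Ok(head.contains(&0))
}
```

Any reader works, including an open std::fs::File, so the check can run before deciding whether to load the rest of the file.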
Seems like some people might like this as an optional feature
See https://github.com/LDNOOBW/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words
split_ident allocates into a Vec rather than lazily returning values as an iterator.
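A lazy version could look roughly like the sketch below. Words and split_ident_lazy are hypothetical names, and the splitting rule is simplified to `_`, `-`, and lower-to-upper case boundaries, which is an assumption and not the real split_ident logic.

```rust
/// Lazily yields the words of an identifier without allocating.
struct Words<'s> {
    rest: &'s str,
}

fn split_ident_lazy(ident: &str) -> Words<'_> {
    Words { rest: ident }
}

impl<'s> Iterator for Words<'s> {
    type Item = &'s str;

    fn next(&mut self) -> Option<&'s str> {
        // Skip any separators left over from the previous word.
        self.rest = self.rest.trim_start_matches(|c: char| c == '_' || c == '-');
        if self.rest.is_empty() {
            return None;
        }
        let mut end = self.rest.len();
        let mut prev: Option<char> = None;
        for (i, c) in self.rest.char_indices() {
            let at_separator = c == '_' || c == '-';
            let at_case_boundary =
                prev.map_or(false, |p| p.is_lowercase() && c.is_uppercase());
            if at_separator || at_case_boundary {
                end = i;
                break;
            }
            prev = Some(c);
        }
        // Hand back a borrowed slice and remember where to resume.
        let (word, rest) = self.rest.split_at(end);
        self.rest = rest;
        Some(word)
    }
}
```

Callers that really need a Vec can still call collect(), while the common path of checking each word once avoids the allocation.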
See https://docs.rs/exit-code/1.0.0/exit_code/
Including broken pipe
Codes of interest
Source
Include
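To make the broken pipe case concrete, here is a sketch under stated assumptions. The constants follow the BSD sysexits.h convention instead of the linked crate's API, exit_code_for is a hypothetical helper, and treating a closed pipe as a clean exit is one possible policy, not a decision recorded here.

```rust
use std::io;

// Assumed values following the BSD sysexits.h convention;
// not taken from the linked crate.
const EXIT_OK: i32 = 0;
const EXIT_IOERR: i32 = 74; // EX_IOERR

/// Hypothetical mapping from the result of writing output to an exit code.
fn exit_code_for(result: &io::Result<()>) -> i32 {
    match result {
        Ok(()) => EXIT_OK,
        // The reader went away (e.g. output piped into `head`); treated
        // here as a clean exit, which is an assumption.
        Err(err) if err.kind() == io::ErrorKind::BrokenPipe => EXIT_OK,
        Err(_) => EXIT_IOERR,
    }
}
```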
Scspell will parse the file assuming escape characters exist but lets you opt out.
See https://github.com/myint/scspell/blob/master/scspell/__init__.py#L78
Escape character support could instead be a file type setting, like dictionary values
KStringCow has the following states:
Box<str>
'static str
's str
If we add a From to it, we can possibly detect when the inline string can be used and write straight to it, avoiding the allocation when case correcting.
In addition, we'd be dropping from 4 machine words to 3 machine words iirc.
This would be relatively easy while improving the scannability of the results.
At least this is how ripgrep does it.
We'll need to define file types and the traits each file type should have: specialized dictionaries, whether _ / - count as identifier characters, and whether escape sequences are supported (#3).
This can then be extended into a config file that works with custom dictionaries (#9) to allow the user to override existing file type definitions or add their own.
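A file type definition might be sketched like this. FileType, its fields, the FILE_TYPES table, and file_type_for are all hypothetical, and the two entries are illustrative values only.

```rust
use std::path::Path;

/// Hypothetical description of a file type and the traits the tokenizer
/// would honor for it.
#[derive(Debug, Clone, PartialEq)]
struct FileType {
    name: &'static str,
    extensions: &'static [&'static str],
    /// Extra characters treated as part of an identifier, e.g. `_` or `-`.
    identifier_chars: &'static [char],
    /// Whether sequences like `\n` are skipped before spell checking (#3).
    escape_sequences: bool,
}

// Illustrative built-in table; a config file could override or extend it.
static FILE_TYPES: &[FileType] = &[
    FileType {
        name: "rust",
        extensions: &["rs"],
        identifier_chars: &['_'],
        escape_sequences: true,
    },
    FileType {
        name: "text",
        extensions: &["txt"],
        identifier_chars: &[],
        escape_sequences: false,
    },
];

/// Look up a file type by extension; returns None for unknown files.
fn file_type_for(path: &Path) -> Option<&'static FileType> {
    let ext = path.extension()?.to_str()?;
    FILE_TYPES
        .iter()
        .find(|file_type| file_type.extensions.iter().any(|e| *e == ext))
}
```

Keeping the definitions as plain data is what would let a config file add or replace entries later.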
We're developing a lot of flags. It'd be good if we added a config file so people can easily get a consistent experience
Copy the flags from ripgrep: https://github.com/BurntSushi/ripgrep/blob/master/src/app.rs#L745
Exposed as https://github.com/BurntSushi/ripgrep/blob/master/src/args.rs#L902
Idea comes from codespell
https://github.com/codespell-project/codespell
Sometimes files should be ignored just for spell checking while still being picked up by everything else.
Right now, --diff and --write-changes only take effect after everything else is done. Not showing results as we progress can confuse the user.
The API has gone through some churn. We should audit it before 1.0 to make sure it's something we want.
Possibly steal ripgrep's cases
Compare to scspell, the Go one that we took the list from, and some kind of baseline search, like ripgrep
foo\nbar will look like foo and nbar without special handling.
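The effect is easy to reproduce with a toy tokenizer. The words function below is a hypothetical sketch, not our parser; with escapes enabled it assumes every escape is a backslash plus exactly one character, so longer forms such as \x41 are not handled.

```rust
/// Split `text` into alphabetic words. When `escapes` is true, a backslash
/// and the single character after it are skipped instead of being read as
/// part of a word.
fn words(text: &str, escapes: bool) -> Vec<String> {
    let mut out = Vec::new();
    let mut current = String::new();
    let mut chars = text.chars();
    while let Some(c) = chars.next() {
        if escapes && c == '\\' {
            // Drop the escaped character and end the current word.
            chars.next();
            if !current.is_empty() {
                out.push(std::mem::take(&mut current));
            }
        } else if c.is_alphabetic() {
            current.push(c);
        } else if !current.is_empty() {
            // Any other character ends the current word.
            out.push(std::mem::take(&mut current));
        }
    }
    if !current.is_empty() {
        out.push(current);
    }
    out
}
```

Calling words(r"foo\nbar", false) yields foo and nbar, while words(r"foo\nbar", true) yields foo and bar.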
With #14, we're going to have special handling for different file types, but one file isn't always a single type.
`` as generic code, and code-fences as the specified language
Currently, all corrections are forced into a single English dialect. This will cause a lot more failures in CI. We should support any dialect.