canonical / praecepta

License: Creative Commons Attribution Share Alike 4.0 International

praecepta's Introduction

Canonical documentation style guide

This repository contains the documentation and the Vale rules for the documentation style guide.

The style guide itself is written in Markdown and contained in the en directory.

It is published online at: docs.ubuntu.com/en/styleguide

The Vale rules

The Vale linter operates from a series of rules. These are defined in individual YAML files, grouped into 'Styles'. This repository contains the Canonical set of rules, or the Canonical Style.

Manual check

To manually check your documentation with the Vale rules, follow these steps:

1. Install Vale.
2. Clone this repository.
3. Run Vale against your documentation source files, using the configuration file vale.ini from this repository:
```
vale --config ~/praecepta/vale.ini ~/product/docs/
```

For automation, see the Canonical Style GitHub action.

Adding to the rules

Anyone is welcome to submit a PR to add additional rules. However, no additions will be considered unless they are part of the Canonical Style Guide as found at the website above.

For a reference on rule syntax, see the Vale documentation on Styles.

If you are completely new to developing Vale rules, see this introductory guide.

Using the rules

The Vale rules are published here so that they can be used in any workflow anywhere. You can run Vale locally, as part of CI or in a GitHub workflow - all you need is Vale, a configuration file (which you can also copy from this repository) and the Canonical Styles. Two common scenarios are also catered for more directly here, as detailed below.

The Canonical Style GitHub action

This repository also includes a file, action.yml, which is the basis of a GitHub action to automatically run Vale checks on incoming pull requests.

Using the GitHub action in a workflow

Your repository can use the action in a workflow. An example workflow is included in this repository and reproduced here. Note that the style guide action is only one part of a functioning workflow.

```yaml
on: [pull_request]

jobs:
  vale:
    name: Style checker
    runs-on: ubuntu-22.04
    defaults:
        run:
            shell: bash
            working-directory: .
    steps:
        - name: Checkout repo to runner
          uses: actions/checkout@v3
        - name: Install styles
          uses: canonical/praecepta@main
        - name: Run Vale tests
          uses: errata-ai/vale-action@reviewdog
          with:
            files: ./docs
            fail_on_error: true
```

In the example above, the workflow is organised as a single job. This is important because the actions rely on state persisting between steps within the run. There are three job steps:

- The actions/checkout action: this fetches the code from the repository calling the workflow.
- The style guide action: this fetches the styles and, if one is not already present, a default configuration file for Vale. The example fetches from the main branch because the rules are currently under active development.
- The errata-ai/vale-action action: this runs Vale using reviewdog to insert comments into the pull request.

This workflow uses reviewdog to insert Vale's output into review comments on any changes. This method has two advantages:

- the checks only apply to new material (that is, content added in the current PR)
- the output is surfaced directly where it will be noticed

praecepta's People

evilnick evildmp akcano k-dimple tang-mm ru-fu pmatulis a-velasco krzysiekwie rkratky secondskoll keirthana yhontyk alexvonme izmalk medubelko

praecepta's Issues

Rule 11 - Stacked headings: False positives

The following errors:

```
 /home/izmalk/test/docs/izmalk/kafka-k8s-operator/docs/tutorial/t-enable-encryption.md
 24:1  error  Avoid stacked headings. There   Canonical.011-Headings-not-followed-by-heading
              should be content for each
              heading.
 29:1  error  Avoid stacked headings. There   Canonical.011-Headings-not-followed-by-heading
              should be content for each
              heading.
```

For the following source fragment:

```shell
Model     Controller  Cloud/Region        Version  SLA          Timestamp
tutorial  microk8s    microk8s/localhost  3.1.5    unsupported  21:32:35+02:00

App                        Version  Status  Scale  Charm                      Channel    Rev  Exposed  Message
...
self-signed-certificates            active      1  self-signed-certificates   stable     72   no
...

Unit                          Workload  Agent  Address    Ports  Message
...
self-signed-certificates/0*   active    idle   10.1.36.91
...
```

The source contains a code block that conflicts with Markdown syntax on GitHub, so the exact fragment is available here: https://pastebin.canonical.com/p/xHgWcSjXSt/

Feature request - Rule to detect deprecated commands or features

I'd like to see an existence-style rule with a list of deprecated or otherwise banned commands and even features. The main intent is to detect them being used in code blocks and warn about the usage of a deprecated feature.

For example, juju run-action.
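
A minimal sketch of what such a rule might look like, assuming Vale's `existence` rule type together with `scope: raw` so that the contents of code blocks are not skipped. The file name, message, and level are hypothetical; `juju run-action` is the only token taken from this request.

```yaml
# Hypothetical file: styles/Canonical/Deprecated-commands.yml
extends: existence
message: "'%s' is deprecated. Check the current documentation for its replacement."
level: warning
# The raw scope matches against the unprocessed markup, so text inside
# code blocks is checked as well.
scope: raw
tokens:
  - juju run-action
```

Because the raw scope covers the whole file, this sketch would also flag the command when it appears in ordinary prose, which may or may not be desirable.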

Repeated words rule gives false positives too often

This happens particularly with web pages.

For example, `<a href="something/plop">Plop</a>` will trigger this rule.

Rule 16 uncaptured cases and false positives

Please see the comments from @ru-fu in #52, and I've added a few additional cases.

Cases include:

- Capturing all MyST directives that contain #
- Not capturing code blocks declared by indentation (MD)
- Not capturing code blocks delimited by :: (RST)
- Not capturing indented code blocks, as the initial delimiter has a start of new line requirement (MD)
- Will capture content with # contained inline (e.g. URLs with anchors or other random strings)
- Not capturing MyST colon fenced code blocks
- Unable to parse included code files, or excerpts from code - likely out of scope, would require an additional scoping value and additional rules.

Rule 13 - Spell numbers: make exclusions work again

I'm trying to add an exclusion to Rule #13 so that Day 0, Day 1, and Day 2 can be written with numerals, as in software development lifecycle terminology.

Most of my attempts so far were unsuccessful:

- adding the strings to the accept.txt file directly (see documentation);
- adding the exceptions section to the rule:

```yaml
exceptions:
  - Day 0
  - Day 1
  - Day 2
```

I managed to achieve the desired effect by modifying the regular expression in the Rule:

extends: substitution
message: "Consider using '%s' instead of '%s'"
link: https://docs.ubuntu.com/styleguide/en#numbers
scope:
  - heading
  - list
  - sentence
  - table.header
  - table.cell
level: warning
swap:
  (?<![\-\.\,:\/*-+]|day\s)0(?![\-\.\,:\/*-+]): zero
  (?<![\-\.\,:\/*-+]|day\s)1(?![\-\.\,:\/*-+])(?! Jan(?!uary)?| Feb(?!ruary)?| Mar(?!ch)?| Apr(?!il)?| May| Jun(?:e)?| Jul(?:y)?| Aug(?:ust)?| Sep(?:tember)?| Oct(?:ober)?| Nov(?:ember)?| Dec(?:ember)?): one
  (?<![\-\.\,:\/*-+]|day\s)2(?![\-\.\,:\/*-+])(?! Jan(?!uary)?| Feb(?!ruary)?| Mar(?!ch)?| Apr(?!il)?| May| Jun(?:e)?| Jul(?:y)?| Aug(?:ust)?| Sep(?:tember)?| Oct(?:ober)?| Nov(?:ember)?| Dec(?:ember)?): two
  (?<![\-\.\,:\/*-+])3(?![\-\.\,:\/*-+])(?! Jan(?!uary)?| Feb(?!ruary)?| Mar(?!ch)?| Apr(?!il)?| May| Jun(?:e)?| Jul(?:y)?| Aug(?:ust)?| Sep(?:tember)?| Oct(?:ober)?| Nov(?:ember)?| Dec(?:ember)?): three
  (?<![\-\.\,:\/*-+])4(?![\-\.\,:\/*-+])(?! Jan(?!uary)?| Feb(?!ruary)?| Mar(?!ch)?| Apr(?!il)?| May| Jun(?:e)?| Jul(?:y)?| Aug(?:ust)?| Sep(?:tember)?| Oct(?:ober)?| Nov(?:ember)?| Dec(?:ember)?): four
  (?<![\-\.\,:\/*-+])5(?![\-\.\,:\/*-+])(?! Jan(?!uary)?| Feb(?!ruary)?| Mar(?!ch)?| Apr(?!il)?| May| Jun(?:e)?| Jul(?:y)?| Aug(?:ust)?| Sep(?:tember)?| Oct(?:ober)?| Nov(?:ember)?| Dec(?:ember)?): five
  (?<![\-\.\,:\/*-+])6(?![\-\.\,:\/*-+])(?! Jan(?!uary)?| Feb(?!ruary)?| Mar(?!ch)?| Apr(?!il)?| May| Jun(?:e)?| Jul(?:y)?| Aug(?:ust)?| Sep(?:tember)?| Oct(?:ober)?| Nov(?:ember)?| Dec(?:ember)?): six
  (?<![\-\.\,:\/*-+])7(?![\-\.\,:\/*-+])(?! Jan(?!uary)?| Feb(?!ruary)?| Mar(?!ch)?| Apr(?!il)?| May| Jun(?:e)?| Jul(?:y)?| Aug(?:ust)?| Sep(?:tember)?| Oct(?:ober)?| Nov(?:ember)?| Dec(?:ember)?): seven
  (?<![\-\.\,:\/*-+])8(?![\-\.\,:\/*-+])(?! Jan(?!uary)?| Feb(?!ruary)?| Mar(?!ch)?| Apr(?!il)?| May| Jun(?:e)?| Jul(?:y)?| Aug(?:ust)?| Sep(?:tember)?| Oct(?:ober)?| Nov(?:ember)?| Dec(?:ember)?): eight
  (?<![\-\.\,:\/*-+])9(?![\-\.\,:\/*-+])(?! Jan(?!uary)?| Feb(?!ruary)?| Mar(?!ch)?| Apr(?!il)?| May| Jun(?:e)?| Jul(?:y)?| Aug(?:ust)?| Sep(?:tember)?| Oct(?:ober)?| Nov(?:ember)?| Dec(?:ember)?): nine

But I feel this is too hacky, and I'd prefer the exceptions section or accept.txt to work for this narrow case.

Rule 13 (spell out numbers below 10) should ignore numbers that contain punctuation

Hello!

Rule 13 currently generates warnings for single digits that are preceded and/or followed by punctuation, such as in:

comma separated numbers (like the '5' in '5,000');
numbers preceded with a currency sign (like the '3' in '£3.50');
numbers with a decimal place (like the '5' in '5.15'); and
version numbers (like some/all of the digits in '6.6.0' or the final digit in '3-1-2')

The warnings generated also sometimes differ if the number is at the start and/or end of a line. For example, Vale warns on each digit in the line "Test 6.6.0\n", but only the last two digits in the line "6.6.0\n".

I'm happy to help with preparing a PR or testing as needed, if that would be helpful.

Needs page on images

The 000 rule numbering doesn't scale

Using 000 for a specific rule means there is no space left for adding 'common sense' error checking rules.

Proposal - Renumber these to begin at 500

Add `vale-linter-style` tag

Hello :)

Could you add the vale-linter-style tag to the repo? There are a few links around documentation and other resources that gather all Vale-compatible styles together for people to find.

Add favicon

It would be nice to have a favicon. I see that all other doc sets under docs.ubuntu.com have one, so I imagine it should be possible.

Some of the 'British spelling' suggestions target ambiguous words

For example, 'checker' triggers a suggestion of 'chequer', but that spelling is only valid for the game. Other uses of the word 'checker' are valid and much more common in docs.

Add file specific and project specific exceptions

Implement a way to add project-wide and file-wide exceptions for Vale style checks.

For example:

```
 /home/izmalk/test/docs/izmalk/kafka-k8s-operator/docs/reference/r-statuses.md
 12:28  error    Did you really mean 'params'?   Vale.Spelling
 22:5   warning  Avoid the phrase 'Terminated'   Canonical.020-Cliche-words-and-phrases
```

In the above output, there is an error for 'params', which is quoted from the exact output produced by a program, and a warning for 'Terminated', which is the name of a status and a correct technical term in this context.

I'd like to have a way to set project-specific and, ideally, file-specific exceptions for Vale checks.
File-specific exceptions are great for specific stuff that should not affect the whole project, like quotes of the output or status names.

Make improvements to Vocabulary list

@izmalk made a great suggestion in #79: make lower-case vocabulary entries also match upper-case usage (for use at the start of sentences). This made me think that plurals are also not accounted for, and terms that can be pluralised should also be captured.

- Edit all lower-case vocabulary terms to accept an upper-case initial using selective syntax, e.g. [Uu]ppercase
  - Full case insensitivity can be a bad idea, as it will allow CAPSLOCK/PascalCase/camelCase usage.
- Allow all terms that can be pluralised to accept the plural versions as well, e.g. [Nn]amespace(?:s)?
  - It would be ideal to use non-capturing groups for plurals, as it would be best to use common syntax, and they will speed up processing compared to capturing groups.
  - It could also be a good idea to accept other parts of speech as well.

Rule #13 - Spell out numbers - False positive

The following warnings:

```
 /home/izmalk/test/docs/izmalk/kafka-k8s-operator/docs/reference/r-requirements.md
 10:5    warning  Use 'one' instead of '1'        Canonical.013-Spell-out-numbers-below-10
 10:126  warning  Use 'two' instead of '2'        Canonical.013-Spell-out-numbers-below-10
```

Shown for the following source line:

* 3.1.6+ (due to issues with Juju secrets in previous versions, see [#1](https://bugs.launchpad.net/juju/+bug/2029285) and [#2](https://bugs.launchpad.net/juju/+bug/2029282))

As far as I understand, the regular expression specifically excludes numbers preceded or followed by a dot (see the YAML file), so the warning for 10:5 should not be triggered.

Adding single quotes around the RegEx or removing the escape symbols didn't help.

I'm not sure why the 10:126 warning (for the #2) is triggered but the 10:71 (for the #1) is not.
If I add the # to the negated Lookbehind pattern, both 10:5 and 10:126 warnings go away:

(?<![\-\.\,:\/*-+#])0(?![\-\.\,:\/*-+]): zero

Could it be a bug that Vale shows the 10:5 warning when it should have shown one at 10:71 (for the #1) instead?

False positive Vale.Spelling results

Testing the Kafka-k8s charm docs produces a large number of false-positive errors, mostly due to terminology.

For example:

```
 /home/izmalk/test/docs/izmalk/kafka-k8s-operator/docs/tutorial/t-cleanup-environment.md
 4:94   error  Did you really mean             Vale.Spelling
               'Multipass'?
 6:119  error  Did you really mean             Vale.Spelling
               'Multipass'?
 16:25  error  Did you really mean 'VMs'?      Vale.Spelling
```

Vale reports a spelling error for Multipass, but Multipass is included in the default Accept.txt vocabulary file. As far as I understand, that should be enough.

As another example, Vale keeps flagging Grafana, even if I add it to the Accept.txt file:

```
 /home/izmalk/test/docs/izmalk/kafka-k8s-operator/docs/how-to/h-integrate-alerts-dashboards.md
 38:133  error  Did you really mean 'Grafana'?  Vale.Spelling
 67:56   error  Did you really mean 'Grafana'?  Vale.Spelling
```

If my assumption is wrong, or simply as an alternative, we can try the following method to ignore some words: Ignoring non-dictionary words.

Rule number 4: the string parameters in the message are swapped.

According to the docs for the substitution rule type the first string parameter is used for the correct version of the term.

For example, Consider using '%s' instead of '%s' as opposed to our usage: "Did you really mean '%s' instead of '%s'?".

That leads to the following message: Did you really mean 'Juju' instead of 'juju'?, which is the opposite of what we want to say in the message.
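
The parameter order described above can be sketched as follows. This is an illustrative rule only, not the actual content of rule 4: the swap entry is a hypothetical example, and the message wording is the one quoted from the docs above.

```yaml
# Hypothetical substitution rule; the swap entry is illustrative only.
extends: substitution
# First %s: the suggested replacement. Second %s: the text that was matched.
message: "Consider using '%s' instead of '%s'"
level: warning
swap:
  juju: Juju
```

Under that ordering, a lowercase 'juju' in the text would be reported as: Consider using 'Juju' instead of 'juju'.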

Add License

I would suggest MIT to maximize compatibility with other Vale configurations as well as Vale itself.

Edit: I was mistaken, there are a lot of vale configs without any explicit license.
Edit2: Also, was veeeeeery wrong on how these worked, apparently. Though, my realization was also saddening.

Rule #13 - Optimisation of RegEx

It seems we can optimize the regular expressions used in the rule. See this comment for details.

Needs page on language/spelling

Rule 16 breaks MD parsing

Not entirely sure why, need to have a better look into it, but rule 16 is currently causing Vale to hang when processing MD documents.

My Vale installation was hanging (on both Ubuntu 24.04 and Windows WSL running 20.04 in WSL2), so I started removing the most recently added rules to see which one might be causing the issue.

Targeting *.rst it was functioning fine, but * and *.md were both hanging. Removing rule 16 from the set fixed the issue.

Will do some investigation when I have time later this week.

Add TODO to an avoidance list as a warning

Add the ToDo comment marker to the list of terms to avoid, reported as a warning.
It should be case-insensitive (to cover all the variants: todo, TODO, ToDo).
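
A minimal sketch of such a rule, assuming Vale's `existence` rule type with `ignorecase` enabled. The file name and message text are hypothetical.

```yaml
# Hypothetical file: styles/Canonical/Avoid-todo.yml
extends: existence
message: "Avoid leaving '%s' in published documentation."
level: warning
# Match todo, TODO, ToDo, and any other capitalisation.
ignorecase: true
tokens:
  - todo
```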