l10nregistry-rs's Introduction

l10nregistry-rs's Issues

Handle edge-case deep-fallback scenarios

We have a number of bugs in Gecko filed against a scenario where a large L10nContext is loaded in a setup which has two sets of two sources in one locale, and then a message is requested which doesn't exist in any of them. See: https://bugzilla.mozilla.org/show_bug.cgi?id=1642415

In such a scenario right now, according to the logic, we will generate hundreds of thousands of permutations and Bundles.

I provided a minimized benchmark in #11.

My initial idea for a solution is to introduce a meta-source concept. With it, we'd separate langpack-en-US with its two sources from packaged-en-US with its two sources, and only permute within each of them.
As a result we'd have three levels of iteration (locales -> 2 x meta-sources -> 2 x sources) instead of two (locales -> 4 x sources).

That should reduce the number of Bundles in that scenario from 194 000 to 441*2=882.

That's still more conservative than what we'll face in reality, because the scenario in the solver assumes both toolkit and browser sources contain all available files. In reality the number of overlaps is very slim, so we should end up with ~20-30 permutations in production.
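To make the difference between the two iteration shapes concrete, here is a minimal counting sketch. It assumes a deliberately simplified model in which every resource independently picks one source; the function names are hypothetical, and the sketch does not try to reproduce the benchmark figures quoted above, which depend on details of the solver.

```rust
/// Flat iteration (locales -> sources): every resource may come from any
/// of `sources` sources, so the assignments multiply per resource.
fn flat_candidates(sources: u64, resources: u32) -> u64 {
    sources.pow(resources)
}

/// Grouped iteration (locales -> meta-sources -> sources): a candidate
/// never mixes sources from different meta-sources, so we only permute
/// inside each group and then add the groups up.
fn grouped_candidates(meta_sources: u64, sources_per_meta: u64, resources: u32) -> u64 {
    meta_sources * sources_per_meta.pow(resources)
}
```

Under this model, 8 resources over 4 flat sources give 65536 assignments, while 2 meta-sources of 2 sources each give 512.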

I'm open to other ideas on how to evaluate it, and there's at least one more, suggested by Axel, in the Bugzilla bug.

Bring back prefetching

The JS implementation of L10nRegistry offers the ability to "prefetch" resources, basically triggering the initial load as soon as the list of resources is known.

This allows the l10n resources to be fetched and parsed in parallel with document parsing, instead of fetching starting only after the document has been parsed.

With the migration to Rust we can do better: we can allow single-resource triggers (prefetch_resource) that fire as soon as a <link/> is read.
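A rough sketch of what a single-resource trigger could look like, using only the standard library. The `Prefetcher` type, its `get` method, and the `load` placeholder are hypothetical stand-ins, not the crate's API; only the name `prefetch_resource` comes from the text above. A real implementation would hook into the registry's own loading and caching.

```rust
use std::collections::HashMap;
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for a registry that can start loads early.
struct Prefetcher {
    pending: HashMap<String, JoinHandle<String>>,
}

impl Prefetcher {
    fn new() -> Self {
        Self { pending: HashMap::new() }
    }

    /// Start loading a single resource as soon as its path is known,
    /// e.g. when a <link/> is read. Repeated calls are no-ops.
    fn prefetch_resource(&mut self, path: &str) {
        if self.pending.contains_key(path) {
            return;
        }
        let owned = path.to_string();
        let handle = thread::spawn(move || load(&owned));
        self.pending.insert(path.to_string(), handle);
    }

    /// Return the resource, reusing the prefetch if one was started.
    fn get(&mut self, path: &str) -> String {
        match self.pending.remove(path) {
            Some(handle) => handle.join().expect("loader thread panicked"),
            None => load(path),
        }
    }
}

/// Placeholder loader; a real one would read and parse an FTL file.
fn load(path: &str) -> String {
    format!("# contents of {}", path)
}
```

The point of the sketch is only the ordering: the load starts when the path is first seen, and the later request joins the in-flight work instead of starting from scratch.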

Add source management capabilities

At the moment the FileSource list is stored in a RefCell<ChunkyVec<>> in https://github.com/zbraniecki/l10nregistry-rs/blob/master/src/registry/mod.rs#L25

This makes it hard to remove or clear sources, even if we allow updating them: djg/chunky-vec#2

We need to decide what to do here. @djg's suggestion is to use RefCell<ChunkyVec<Option<FileSource>>> and, if a source is to be removed, just replace its value with None.

This may complicate the code in several ways, as we'll have to iterate over an ever-growing matrix of resource/source permutations as sources are added and removed. The loopback on them will be fast (source missing -> invalid candidate), but the logic around it becomes non-trivial.
It also becomes more problematic when we want to consider #8, because then a missing file may indicate either a missing source or a present source with a missing file, and the two cases should impact candidate viability differently.
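For reference, the Option-based idea can be sketched as follows. This is an illustration only: a plain `Vec` stands in for `ChunkyVec`, and the `Registry` and `FileSource` types here are simplified, hypothetical versions of the real ones.

```rust
use std::cell::RefCell;

#[derive(Debug, Clone, PartialEq)]
struct FileSource {
    name: String,
}

/// Simplified registry; `Vec` stands in for `ChunkyVec` in this sketch.
struct Registry {
    sources: RefCell<Vec<Option<FileSource>>>,
}

impl Registry {
    fn new() -> Self {
        Self { sources: RefCell::new(Vec::new()) }
    }

    /// Append a source and return its slot index, which stays stable.
    fn add_source(&self, name: &str) -> usize {
        let mut sources = self.sources.borrow_mut();
        sources.push(Some(FileSource { name: name.to_string() }));
        sources.len() - 1
    }

    /// "Remove" a source by replacing its slot with None. Other sources
    /// keep their indices, but the slot itself is never reclaimed.
    fn remove_source(&self, index: usize) -> Option<FileSource> {
        let mut sources = self.sources.borrow_mut();
        sources.get_mut(index).and_then(|slot| slot.take())
    }

    /// Indices of sources that are still present; the solver would have
    /// to skip the None slots like this on every iteration.
    fn live_indices(&self) -> Vec<usize> {
        self.sources
            .borrow()
            .iter()
            .enumerate()
            .filter_map(|(i, slot)| slot.as_ref().map(|_| i))
            .collect()
    }
}
```

The sketch shows the cost mentioned above: indices stay valid, but every consumer has to filter out emptied slots, and the list only ever grows.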

I'd like to see if there's any other way to approach it, even a performance-suboptimal one.

@djg @Manishearth - do you think that Option<FileSource> is the only possibility here?

Re-add IndexedFileSource

The JS version of L10nRegistry has a concept of IndexedFileSource. We don't use it yet, but it allows us to offload the evaluation of the available files in a given source to build time.
Such a source provides a list of present files at construction and, as a result, can provide the necessary information to the ProblemSolver without hitting I/O.

This should vastly speed up scenarios where we have a language pack or some other partial source that we want to use as the top choice for files that are present in it, and skip for all missing ones.

This will require our FileSource cache to have ResourceStatus::Available next to ResourceStatus::Missing. We will probably also want such a file source to react differently to hasFile: instead of returning None to indicate "I don't know if I have this file", this type of FileSource would return true only for files that are either Loading/Loaded or Available.
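A minimal sketch of that behavior, assuming a cache keyed by path. Only the `ResourceStatus::Available` and `ResourceStatus::Missing` names come from the text; the other variants, the `indexed` flag, the constructors, and the `has_file` signature are assumptions made for illustration.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum ResourceStatus {
    /// Known to be absent from this source.
    Missing,
    /// Listed in the build-time index, but not loaded yet.
    Available,
    /// A load is in flight.
    Loading,
    /// Loaded; the payload stands in for a parsed resource.
    Loaded(String),
}

struct FileSource {
    /// True when the full file list was supplied at construction.
    indexed: bool,
    cache: HashMap<String, ResourceStatus>,
}

impl FileSource {
    /// Regular source: nothing is known until I/O is attempted.
    fn new() -> Self {
        Self { indexed: false, cache: HashMap::new() }
    }

    /// Indexed source: every listed file starts out as Available.
    fn with_index(files: &[&str]) -> Self {
        let cache = files
            .iter()
            .map(|f| (f.to_string(), ResourceStatus::Available))
            .collect();
        Self { indexed: true, cache }
    }

    /// None means "I don't know"; an indexed source never returns None.
    fn has_file(&self, path: &str) -> Option<bool> {
        match self.cache.get(path) {
            Some(ResourceStatus::Missing) => Some(false),
            Some(_) => Some(true),
            None if self.indexed => Some(false),
            None => None,
        }
    }
}
```

With this shape the solver can rule out a language pack for a missing file without touching the disk, while a regular source still answers "unknown" until a load has been tried.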

Evaluate improved sync-load-during-async-load strategy

As per bug https://bugzilla.mozilla.org/show_bug.cgi?id=1723191, we have scenarios where a sync load is triggered in the middle of an async load. We should consider options for handling such a scenario better than the current approach in https://github.com/mozilla/l10nregistry-rs/blob/master/src/source/mod.rs#L239-L255

With that approach, we just load the one-off synchronously and return it, while the async load also runs to completion, and we cache only the async result.

Maybe we could hook into the async load, discard it, supply data from the sync load and cache that?
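One way to picture that alternative is a cache where the sync result wins and the late async result is dropped. The sketch below models only the bookkeeping, with no real async runtime; the `Cache` and `Entry` types and all method names are hypothetical.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Entry {
    /// An async load has started and has not finished yet.
    Loading,
    /// Final cached value.
    Loaded(String),
}

struct Cache {
    entries: HashMap<String, Entry>,
}

impl Cache {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Record that an async load is in flight (keeps an existing entry).
    fn start_async(&mut self, path: &str) {
        self.entries.entry(path.to_string()).or_insert(Entry::Loading);
    }

    /// Sync request: reuse a finished load, otherwise load now and cache
    /// the sync result, replacing any Loading marker.
    fn fetch_sync(&mut self, path: &str, load: impl FnOnce() -> String) -> String {
        if let Some(Entry::Loaded(text)) = self.entries.get(path) {
            return text.clone();
        }
        let text = load();
        self.entries.insert(path.to_string(), Entry::Loaded(text.clone()));
        text
    }

    /// Called when the async load completes. Returns false when a sync
    /// load already supplied the data and the async result is discarded.
    fn finish_async(&mut self, path: &str, text: String) -> bool {
        if matches!(self.entries.get(path), Some(Entry::Loaded(_))) {
            return false;
        }
        self.entries.insert(path.to_string(), Entry::Loaded(text));
        true
    }
}
```

Whether the in-flight async work can actually be cancelled, rather than just ignored on completion, is a separate question that this sketch does not address.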

Allow for custom candidate viability considerations

The current JS-based L10nRegistry rejects any candidate that has any file missing. This is a blunt strategy, and it results in user-perceivable experience degradation in cases where the user is loading a UI that has a lot of sources and one is missing (because it's new).

In such a case L10nRegistry will reject the bundle and potentially flip to the next language, which means that 1 missing file out of 13 results in 100% rejection of the locale.

One avenue to improve that is to fine-tune our problem solver to be more subtle about deciding when to reject a candidate. For example, the solver could consider a candidate viable if the number of resources is larger than 3, and the number of missing resources is lower than 2.

This would allow for a single missing file in L10nRegistry not to reject the whole candidate.
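The example thresholds above can be written down directly. The function name is hypothetical, and the fallback for candidates with 3 or fewer resources (keep today's strict rule) is an assumption, since the text does not say how small candidates should be treated.

```rust
/// Heuristic from the example: with more than 3 resources, tolerate
/// fewer than 2 missing ones. Smaller candidates keep the strict rule
/// (nothing may be missing), which is an assumption of this sketch.
fn is_viable(total_resources: usize, missing_resources: usize) -> bool {
    if total_resources > 3 {
        missing_resources < 2
    } else {
        missing_resources == 0
    }
}
```

With these thresholds the 1-of-13 case from above stays viable, while two missing files still reject the candidate.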

Alternatively, we could somehow mark which resources in a bundle are "critical" and bail when one of them is missing, while accepting missing non-critical resources. Or the reverse: mark resources as "optional".
But that approach seems more manual and requires more maintenance over time.

My hope is that some simple heuristic will give us a good experience.

Should allow for marking a resource as optional

Some experiments on mozilla-central are initially intended only for English, and their resources are not translated. As experienced in bug 1732676, this makes it a bit too easy for related issues to have an outsized impact.

To address this, we need to make it possible to mark a localization resource as optional. The intent here is that if such a resource file is completely missing, it would be handled the same as an empty file (i.e. use fallback only for the missing strings), rather than forcing the whole experience to use the fallback locale.

At least initially, this "optional" option should be default-false, to match the current practical reality, but allow for later review to change its default.
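As a sketch of the intended semantics, assuming a hypothetical `ResourceRequest` type that carries the flag (the type, its constructors, and `locale_is_usable` are illustrative names, not the crate's API):

```rust
#[derive(Debug, Clone, PartialEq)]
struct ResourceRequest {
    path: String,
    /// Default-false, matching the current behavior.
    optional: bool,
}

impl ResourceRequest {
    /// A required resource: its absence rejects the locale.
    fn new(path: &str) -> Self {
        Self { path: path.to_string(), optional: false }
    }

    /// An optional resource: its absence is treated like an empty file.
    fn new_optional(path: &str) -> Self {
        Self { path: path.to_string(), optional: true }
    }
}

/// A locale stays usable as long as every non-optional resource is
/// present; missing optional resources only cause per-string fallback.
fn locale_is_usable(requests: &[ResourceRequest], present: &[&str]) -> bool {
    requests
        .iter()
        .all(|r| r.optional || present.contains(&r.path.as_str()))
}
```

Here `present` stands in for whatever the sources report as available for the locale being evaluated.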

Allow for shared intl memoizer

At the moment each new FluentBundle gets a new IntlLangMemoizer and, as a result, a new PluralRules instance.

This has been floated around for a bit in projectfluent/fluent.js#218 and https://bugzilla.mozilla.org/show_bug.cgi?id=1475356

The Rust intl-memoizer has the concept of an IntlMemoizer that stores its lang-memoizers internally. We could use it by having each L10nRegistry instance own an IntlMemoizer, which returns a reference to the IntlLangMemoizer for a given lang, which is then used by each FluentBundle.

We'd likely need to add a threadsafe version of that pan-lang memoizer.
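The sharing pattern itself can be sketched with standard-library types only. `SharedMemoizer` and `LangMemoizer` below are hypothetical stand-ins for the intl-memoizer types, kept simple so that the sketch does not depend on that crate's exact API.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Stand-in for a per-language memoizer, which would hold expensive
/// objects such as PluralRules instances.
#[derive(Debug)]
struct LangMemoizer {
    lang: String,
}

/// Stand-in for a pan-language memoizer owned by the registry.
#[derive(Default)]
struct SharedMemoizer {
    by_lang: HashMap<String, Rc<LangMemoizer>>,
}

impl SharedMemoizer {
    /// Return the shared memoizer for `lang`, creating it on first use.
    /// Every bundle for the same language gets the same instance.
    fn get_for_lang(&mut self, lang: &str) -> Rc<LangMemoizer> {
        let entry = self
            .by_lang
            .entry(lang.to_string())
            .or_insert_with(|| Rc::new(LangMemoizer { lang: lang.to_string() }));
        Rc::clone(entry)
    }
}
```

A thread-safe variant would presumably swap `Rc` for `Arc` and guard the map with a `Mutex` or similar, at the cost of locking on every lookup.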
