Comments (31)
The migration path is CJS => ESM. Whatever needs to be implemented to ease this transition is important. In this case, implementing loader hooks will help moving code from ESM => CJS. Neat, but not terribly important, IMO.
I would prefer using our resources elsewhere.
Actually, having loader hooks only in ESM may make the transition go faster! (people will move to ESM just to have all the goodies that loader hooks will bring...)
from modules.
ftr, I'm totally fine with getters and setters on module.exports
not working - I find that an absurd pattern and nothing like it should be possible in ESM anyways. CJS mocking is certainly achievable without getters/setters.
from modules.
I think this is a must have. We should provide a simple enough mechanism from APMs to hook in both esm and cjs.
cc @watson
from modules.
I do not see a short-term future where a Node.js application will not a have a commonjs module in its dependency tree. Having support for cjs loader hooks will provide a safe transition to ESM.
Most enterprise Node.js deployments include an APM solution, which is currently based on monkey patching.
Without cjs hooks, any apm vendor should have two solutions: one for cjs and one for esm. In fact, not supporting this will slow the adoption curve.
from modules.
I think if itโs important enough to warrant first-class support in ESM, itโs important enough to have it in CJS. I also think the inverse is true.
from modules.
@mcollina - wouldn't the existing solutions they have work with their existing CJS modules? If they do, and they should, why bother migrating to another, albeit newer and better, solution?
Having said that, it would be nice to have a solution that works similarly and is configured similarly. But I would say that it's a nice to have, not a must have. If, in terms of code, it's not a big thing, then, yes. But if it requires significant work, I would prefer investing energy elsewhere.
Also, could you explain what "APM" means? ๐ณ
from modules.
I see two glaring omission in the recommended implementation for loader hooks
A directory containing package.json will have a termination when crossing the directory.
This assumes that the package boundaries will be exposed by each import statement. ESM doesn't support this functionality due to the nature of facades.
Use Case 1:
rxjs/operators/index.js
export { flat } from './flat';
export { flatMap } from './flatMap';
rxjs/index.js
export { flat, flatMap } from 'rxjs/operators';
app.js
import { flat, flatMap } from 'rxjs';
Which would fire, entry, root, or operators hook; if more than one fire, then in what order?
Use Case 2:
At some point, ESM will have full native support in browsers and Node, eliminating the need to transpile down to ES5. Despite that fact, bundling will still be in wide use where code size and number of requests matters.
Take for instance the following article:
https://medium.com/friendship-dot-js/i-peeked-into-my-node-modules-directory-and-you-wont-believe-what-happened-next-b89f63d21558
It's not uncommon for users of NPM packages to express their disdain for the overly bloated and unnecessarily deep dependency trees that come with some packages. One solution to this approach is for a lib to provide optimized production bundles that tree shake any and all unnecessary code.
If/when ESM is reaches full compatibility between Node and browsers, it'll be more common to see FE workflows bleed into universal modules. Assuming a directory/package structure will remain static may work for CJS workflows but fails to take into consideration how ESM will be used.
I don't really have any solutions to these problems. ESM supports some features that favor new approaches that aren't backward-compatible with CJS.
From what I've seen in the front-end ecosystem, the facade design pattern provides a lot of benefits for library development (ie abstracting API from structure, hiding internals) over the traditional way of doing things (ie deep linking). It's already starting to show up in commonly used libraries.
from modules.
@evanplaice I don't really understand your first point since it is laid out depending on package boundaries, but no package.json
files are in your minimal case. Per execution order, it is fairly well defined in nodejs/node#18233 .
Which would fire, entry, root, or operators hook; if more than one fire, then in what order?
This appears to be having some names that are missing in the example given.
I'm going to rewrite the question as "What hooks fire when, assuming all the given files have a package boundary in the directory they are located in, and app.js
is the entry point?". I'm going to put a leading /
in all the paths for my clarity on specifier vs path.
/package.json
resolve('rxjs', '/app.js')
I assume ends up resolving to /rxjs/index.js
given the example
/rxjs/package.json
resolve('./index.js', '/rxjs')
I assume this leaves the path unchanged. That ends up the first resolve and now the loader gets to loading rxjs/index.js
.
/rxjs/package.json
resolve('rxjs/operators', '/rxjs/index.js')
I assume ends up resolving to /rxjs/operators/index.js
given the example
/rxjs/operators/package.json
resolve('./index.js', '/rxjs/operators/')
I assume this leaves the path unchanged. That ends up the second resolve and now the loader gets to loading ./flat
and ./flatMap
.
/rxjs/operators/package.json
resolve('./flat', '/rxjs/operators/index.js')
If this doesn't cross a package boundary so is only handled by this loader.
/rxjs/operators/package.json
resolve('./flatMap', '/rxjs/operators/index.js')
If this doesn't cross a package boundary so is only handled by this loader.
I'm still unclear on the question and example, but hopefully that gives a clear idea of the order of operations.
For point 2, that article is satire but true. I'm not sure I understand how this relates to supporting --loader
hooks to require though.
from modules.
@bmeck Agreed.
my threat model treats user loaders as trusted code.
from modules.
@ljharb within reason, ESM was not made with CJS interop as a first class citizen. I would prefer if something is only viable in CJS and not ESM it not be put as a priority.
from modules.
@giltayar APM = Application performance management, mostly hooking into node core to time stuff and catch errors
from modules.
There are also other common workflows like Mocking or testing that use similar workflows of replacing a module with a virtual/temporary one.
from modules.
It's absolutely essential to any adoption of ESM that things like proxyquire
and jest.mock
and others have a way to function - iow, there needs to be a way to "mock out" a default or named import.
Supporting this natively in node for ESM is a must; if we can also support it in a similar fashion for CJS, that's very valuable for a migration path - iow, if half my modules are CJS and the other half ESM, i'd want to mock out a foo
whether it's require
d or import
ed.
from modules.
@ljharb that seems fine, but various things like setters/getters on the bindings are not possible still even though CJS mocking commonly does that to properties off module.exports
. If we have specific cases where they don't work (like having getter/setter named exports) in both I'd prefer not to have the functionality exposed.
from modules.
@bmeck for my own benefit, this is similar to/drawing from npm's loaderhooks
spec?
from modules.
@zackschuster this existed prior to theirs so idk which should be seen as drawing from which.
from modules.
@bmeck ah, i didn't know that. it sounded similar to their proposal so i wanted to make sure ๐ thanks!
from modules.
@zackschuster theirs is basically an exact replica of Node's, there are some changes to discuss around problems with the hooks and APIs. In particular Node's intentionally does not require/allow runtime mutation of the loader semantics except by the loader instrumenting it themselves.
from modules.
@bmeck ah, hence the APM references. this is making more sense to me, thank you for taking the time to explain things.
from modules.
I have two use cases for loader hooks that covers CJS modules and would be nice to have for ESMs.
Per "Dynamic Bundling", I'd like to be able to write (parent, child) on every load to build a module dependency graph during testing which I later prune and use to generate bundle metadata.
In production, I use the derived metadata to whitelist loads and check resource integrity.
Right now, I'm hacking updateChildren(parent, children, scan)
in lib/module.js.
Both steps require some configuration, so it'd be nice to be able to provision a loader with flags or a pointer to a config file.
I'd also like to experiment with ways to allow restricting access to modules with APIs that are prone to misuse.
Sometimes we design very powerful APIs that require a certain level of expertise to use correctly,
and then wrap those in APIs meant for general use which make it harder to footgun one's self.
In other contexts, we've had success by adding static checks in linkers and build systems to enable a workflow where we carefully review and whitelist the wrapper modules to make sure they use the misuse-prone APIs correctly, then disallow direct access by other modules.
I'd like to do a POC in a node context.
I believe it would be sufficient to
- be notified on the end of a module load so we can lock down out-of-band loads of sensitive modules like entries in the module cache
- be able to veto before a loader introduces a module to a parent based on a predicate over (parent, child).
(1) might be sufficient because there are tricks we can do with proxy objects and stack introspection but each of those have their own problems.
from modules.
@mikesamuel can you clarify
be able to veto before a loader introduces a module to a parent based on a predicate over (parent, child).
Say I have a loader that wants to do this behavior. It is given ~ (request_string, parent_url)
and it returns (url)
. At what point are trying to stop propagation of the import
and produce an error?
Currently loaders do not automatically ever call the "parent" loader. So if this is the only loader, it could cause import to fail by:
throw
ing an Error manuallyreturn
ing an invalid URL that cannot be loaded
from modules.
If we put the loaders which are already somewhat isolated into a thread we could use a single async resolve for both module systems. The hooks would need to define what mechanism is being used to load things however so that it can switch on various differences like if the specifier is a URL component for import
or a path for require
.
from modules.
Say I have a loader that wants to do this behavior. It is given ~ (request_string, parent_url) and
it returns (url). At what point are trying to stop propagation of the import and produce an error?Currently loaders do not automatically ever call the "parent" loader. So if this is the only loader,
it could cause import to fail by:
I'm still trying to wrap my head around all the things that 6.1.2 lets you do.
How does the lack of implicit parent loader chaining affect things?
Is that a concern for vm
contexts?
Not chaining on a vetoed resolve seems like it would fail safe, but if an error from a loader
causes a fallback to a sibling loader then I'd prefer option (3) below.
- throwing an Error manually
- returning an invalid URL that cannot be loaded
I would be happy with either of those or even just
- Substitutes a powerless module, e.g. one that exports nothing or
a stub that is API compatible but which has no side effects.
I think any solution needs to prevent:
- Prevent module initialization code from running
- Log sufficient context to make it clear what module didn't load, where the resolution started, and that it was vetoed because of a whitelist constraint.
I'll defer to people who better understand the loading process on whether the runtime should cache the fact that a module load failed and which solution achieves the right balance.
Logs need not reflect multiple attempts to load the same module identifier.
from modules.
@mikesamuel I think the problem might be that you linked to a spec that is no longer being developed to land in browsers or node. Node hooks are minimal and documented at https://nodejs.org/dist/latest-v9.x/docs/api/esm.html
from modules.
@bmeck Sorry, I'm an idiot.
That node hook doc makes no mention of "parent loaders." Should I be looking at your "RFC: Per Package Loader Hooks #18233"?
Loaders therefore need to have a concept of a parent loader hooks to defer to, or to ignore.
from modules.
@mikesamuel you can look at that or nodejs/node#18914
from modules.
Thanks. I see the loop at https://github.com/bmeck/node/blob/6b94c9ccbfe70773001260ade15a60f81d583a17/lib/internal/process/modules.js#L56-L111
It looks like each userLoader gets to specify a factory that receives the previous links in the loader chain.
IIUC, earlier resolvers get to control what URL reaches later loaders, but the last loader gets to ultimately decide whether a load resolves to anything.
An error in a resolver would cause line 71 to resolve to a broken promise, so would propagate through the await in the next one.
So it looks like either of your options below would work.
- throwing an Error manually
- returning an invalid URL that cannot be loaded
An early user loader could defeat either by conspiring with a late user loader but my threat model treats user loaders as trusted code.
from modules.
@mikesamuel can you explain the early loader and late loader vector? Early loaders can already rewrite any loader being setup during bootstrapping.
from modules.
If an early loader can find a fixed point URL -- one that the middle loaders return unchanged -- then it can store the specifier in a side table it shares with the late loader and return the fixed point URL.
The middle loaders never see the specifier in the side table.
The late loader can then retrieve the value that the early loader squirreled away from the table.
There side table is just one kind of shared state that serves to make the communication independent of promise resolution order.
// conspiracy.js
exports const urlSideTable = new Map();
// early loader
import urlSideTable from './conspiracy.js'
let counter = 0
export async function resolve(specifier, parentModuleURL, defaultResolver) {
const substitute = `url-that-other-loaders-will-approve-${counter++}`
urlSideTable.set(substitute, specifier)
return {
url: new URL(substitute).href,
format: 'esm'
};
}
// late-loader.js
import urlSideTable from './conspiracy.js'
export async function resolve(specifier, parentModuleURL, defaultResolver) {
const original = urlSideTable.get(specifier)
return /* some computation over original */
}
from modules.
@mikesamuel I've thought about completely isolating their global space before, but don't think loaders can be seen as untrusted since you need to set them via CLI flags or environment variables on startup. My suggestion is that you can have your secure loader prevent all the following loaders from sharing the global space by putting them into a vm.Context
if you want.
from modules.
going to table this for now, we can come back later and reopen if desired.
from modules.
Related Issues (20)
- Mocking use cases HOT 13
- Enabling import meta resolve by default? HOT 16
- Relaxing the '#' restriction on imports HOT 8
- Node.js Modules Team Meeting 2020-09-09
- Node.js Modules Team Meeting 2020-09-23 HOT 1
- Document node: scheme HOT 4
- Clarification for bundlers re "import" / "require" matching HOT 7
- Node.js Modules Team Meeting 2020-10-07 HOT 3
- File URL normalization changes HOT 1
- s/master/main HOT 3
- Node.js Modules Team Meeting 2020-10-21 HOT 4
- Experimental Status HOT 18
- Node.js Modules Team Meeting 2020-11-04
- Archiving modules repo and stopping team meetings HOT 1
- Use existing context for SourceTextModule evaluation? HOT 2
- Node.js Modules Team Meeting 2020-11-18
- import local files fails while import() succeeds HOT 5
- Proposal: ESM should not export PromiseLike instance HOT 5
- `import.meta.url` is `undefined` inside module file. HOT 3
- Extensionless files should allow --input-type (or work) HOT 14
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. ๐๐๐
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google โค๏ธ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from modules.