GithubHelp home page GithubHelp logo

Comments (8)

alexeagle avatar alexeagle commented on June 10, 2024

Just to clarify, you have a tree

repo
 - our_app
 - vendor
   - node_modules
     - <stuff in here is binary distributions layed out following the npm conventions, same as npm install would do>

Or is your vendored directory actually sources that need to be built?

vastly reduce the work done

do you have evidence that a smaller node_modules/* input to the build or to individual actions makes it any faster? I could imagine that for remote actions the input size is a problem, but I haven't seen this problem locally

As a workaround, the rules here accept a node_modules attribute which you could assign to a label that is filegroup-like, which would let you reduce which inputs the nodejs_binary uses.

from rules_nodejs.

kamalmarhubi avatar kamalmarhubi commented on June 10, 2024

@alexeagle we actually have

repo
├── node_modules
│   └── <stuff in here is binary distributions layed out following the npm conventions, same as npm install would do>
└── ourstuff

where we basically commit results of npm install to our repo. (This makes the few native modules we use special snowflakes, but that's something I would hope to fix via bazel.)

The concern I originally had with the monolithic node_modules dependency is that each executable / script in our repo only depends on only a subset of the hundreds of modules in node_modules. By having a single node_modules target, any update to any module will require a rebuild / retest of the whole repo, instead of only those targets dependent on the changed or added module.

I don't have any measurements, as there are no build rules for coffeescript, which is what a substantial portion of the repo is written in. However your question made me take a closer look at our commit history. It looks like node_modules changes less often than I'd originally thought, so this is probably not a huge issue during development. From an artifact size perspective, however, if we package nodejs_binary rules for distribution they'll end up being hundreds of MB if they use even one module. There may still be some value in allowing dependencies on individual modules.

Curious to hear your thoughts!

from rules_nodejs.

alexeagle avatar alexeagle commented on June 10, 2024

for nodejs_binary deployments on the server we don't care that much, a few hundred MB is the standard size for Java programs in a large monorepo with a lot of code sharing and a deep tech stack.
For a presumed npm_bundle rule (#10), none of the node_modules would be included, only the package.json file.

Note that you can have multiple node_modules_xx targets, each a filegroup with a manually curated set of glob patterns, to satisfy your use case - but I think the manual curation won't be practical over time or at scale. Maybe that can be a workaround if we do discover some reason that the whole of node_modules can't be an input.

There's a scheme in https://github.com/dropbox/rules_node to generate the bazel workspace from the package.json, which you might want to check out.

from rules_nodejs.

pauldraper avatar pauldraper commented on June 10, 2024

For an idea of "how big are we talking", take https://angular.io/guide/quickstart . Following the instructions, I get a node_modules directory of 335 MB. Given that's just the recommended "getting started" Angular app, I think it's not unlikely for a combined node_modules/ for a monolithic repo to be better part of a gigabyte.

That causes several problems with rules_nodejs:

  1. Install times are lengthy if you use anything in node_modules.
  2. It disobeys the law of incrementality, which is that the time rebuild after a change is proportional to the size of the change (rather than the size of the code). (@kamalmarhubi #18 (comment))
  3. It requires large uploads for distributed execution. (@alexeagle #18 (comment))
  4. It doesn't facilitate well-structured code bases. For example, I cannot easily ask "What uses this dependency?"
  5. It results in large deployment targets. (@alexeagle
    #18 (comment)) On the one hand, this may not be an issue since "disk space is cheap". On the other hand, as servers become more numerous and more ephemeral, many containerized deployments fight to optimize out every MB possible. "A 100 MB here or there isn't important" is not as true as we might like it to be.

(1) can only be fixed by making Bazel-native installs, in the style of https://github.com/johnynek/bazel-deps (Maven) or https://github.com/dropbox/rules_node (npm).

(2), (3), (4), and (5) can be fixed in less drastic ways. npm/yarn can still resolve and install the dependencies, compile native bindings, etc. But Bazel, via the generated lock file, could understand how those installed dependencies are linked. Rules could depend on specific dependencies and rules_nodejs would understand which files those are. This is not a very complicated or error-prone algorithm, and it would help enormously with those issues. (Or you could opt-out of the fine grained dependency management, and basic workflow is still all the same.)

Thoughts @alexeagle ?

from rules_nodejs.

gregmagolan avatar gregmagolan commented on June 10, 2024

@kamalmarhubi @pauldraper Alex and I have a PR up for fine grained npm dependencies via yarn_install/npm_install based on ideas from https://github.com/pubref/rules_node.

This PR defined targets for each node module installed and will allow for the following:

        yarn_install(
          name = "npm",
          package_json = "//:package.json",
          yarn_lock = "//:yarn.lock",
        )

        nodejs_binary(
          name = "my_binary",
          ...
          data = [
              "@npm//:foo",
              "@npm//:bar",
              ...
          ],
        )

from rules_nodejs.

gregmagolan avatar gregmagolan commented on June 10, 2024

@pauldraper We've seen performance issues with large node_module filegroups as well bazelbuild/bazel#5153

from rules_nodejs.

pauldraper avatar pauldraper commented on June 10, 2024

@gregmagolan 👍 that's exactly what I was thinking.

Looks like proposal is at https://docs.google.com/document/d/1AfjHMLVyE_vYwlHSK7k7yW_IIGppSxsQtPm9PTr1xEo/preview

from rules_nodejs.

alexeagle avatar alexeagle commented on June 10, 2024

This is in now, note that a breaking change is still coming in #337

from rules_nodejs.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.