
Comments (3)

RubenKelevra commented on May 27, 2024

Thanks for splitting this off into a dedicated ticket. It makes more sense than discussing it in a closed ticket :)

From reading your docs and other issues, I understand that your preferred way to do this is to mount the cluster repo on pacman's cache dir (/var/cache/pacman/pkg), so that pacman fetches the package from IPFS during the step where it scans its cache dir to see whether the package is already present.

This is correct, but pacman needs a writable path for packages. I would just add a non-writable path in front of the usual cache dir, which pacman supports perfectly: it searches the cache directories one at a time until the package is either found or not.
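Concretely, the layered lookup is just multiple `CacheDir` lines in pacman.conf; a sketch, assuming the cluster repo is mounted read-only at a hypothetical mount point:

```ini
# /etc/pacman.conf -- sketch; /mnt/ipfs-pkg is a hypothetical mount point
[options]
# Searched first; pacman notices it isn't writable and never downloads here
CacheDir = /mnt/ipfs-pkg/
# Regular writable cache, used for downloads and as the second lookup
CacheDir = /var/cache/pacman/pkg/
```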

> It's possible to fall back to a traditional mirror if the IPFS mirror underperforms
>
> I want to be able to use IPFS to download packages, but not if it performs worse than regular mirrors. By setting traditional mirrors in my mirrorlist after my local IPFS gateway, pacman will switch to the old mirrors if the IPFS gateway times out.
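That fallback is just pacman's normal top-down server ordering in the mirrorlist; a sketch (the gateway port and the fallback mirror host are assumptions, the IPNS path follows the example later in this thread):

```ini
# /etc/pacman.d/mirrorlist -- pacman tries servers top-down
# Local IPFS gateway first (8080 is go-ipfs's default gateway port)
Server = http://127.0.0.1:8080/ipns/pkg.pacman.store/$repo/os/$arch
# Traditional mirror as fallback if the gateway is slow or times out
Server = https://mirror.example.org/archlinux/$repo/os/$arch
```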

True, IPFS might be slower than a traditional mirror. But that's somewhat unlikely, since you can download from all cluster members plus all clients that have already downloaded the package. So it's more likely that IPFS will fully utilize your connection, regardless of how fast it is. Additionally, transfer speeds should improve considerably, since IPFS landed major Bitswap improvements with the help of Netflix last week:
https://blog.ipfs.io/2020-02-14-improved-bitswap-for-container-distribution/

The import server was able to handle around 60 MByte/s via IPFS in my testing, so that should be enough to distribute new packages quickly within the cluster.

> Security-wise, the local IPFS node is contained as a low-privileged daemon; no IPFS operation requires root (e.g. for the FUSE mount)
>
> As IPFS is still an experimental technology, I will sleep better at night knowing high privileges don't come near my IPFS node.

Mounting IPFS doesn't require any superuser rights. IPFS won't run as a superuser, but under a system user account. That's how the cluster and the IPFS daemon on the import server are currently configured.
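As a sketch, running the daemon under a dedicated unprivileged account is a plain systemd unit (the unit name, paths, and the `ipfs` account are assumptions, not the cluster's actual configuration):

```ini
# /etc/systemd/system/ipfs.service -- hypothetical unit
[Unit]
Description=IPFS daemon
After=network.target

[Service]
# Unprivileged system account; no root involved
User=ipfs
Group=ipfs
Environment=IPFS_PATH=/var/lib/ipfs
ExecStart=/usr/bin/ipfs daemon
Restart=on-failure

[Install]
WantedBy=multi-user.target
```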

I fully understand that you'd have sleeping issues if IPFS ran as root.

> Opportunity to use a remote IPFS node.
>
> The way I plan to use the cluster repo is to replicate the cluster onto a server in my home network, then set it as a mirror for the couple of Arch machines I have in my LAN. This way, IPFS doesn't even need to be installed on my Arch machines, as I'll set the mirrorlist to: Server = http://localarchclusterpeer/ipns/pkg.pacman.store/$repo/os/$arch

You don't have to replicate the cluster into your home network; you just have to run an IPFS node to get this functionality. Running a cluster node will precache the whole mirror server on every little update (even staging and unstable stuff), which might increase your bandwidth use.

But, updates will be very fast. :)

The best solution IMHO is to run an IPFS client on every machine. They will automatically find each other via mDNS, and I plan to write a small script that automatically pins the updates each machine will need next. That way the download into the IPFS cache happens as soon as new updates are available, and you won't need to download any package that isn't installed somewhere in your network.
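A minimal sketch of what such a pin script could look like (this is not the author's actual script: the path layout follows the mirrorlist example in this thread, the hard-coded `core` repo and the file-name pattern are guesses, and `any`-architecture packages would need extra handling):

```shell
#!/bin/sh
# Hypothetical "pin what this machine needs" sketch.
IPNS_ROOT=/ipns/pkg.pacman.store

# Build the cluster path for one package file.
pkg_path() {
    # $1 = repo, $2 = package file name
    printf '%s/%s/os/x86_64/%s\n' "$IPNS_ROOT" "$1" "$2"
}

# Pin every natively installed package so the local node fetches and
# keeps it; 'ipfs pin add' accepts /ipns/ paths and is idempotent.
pin_installed() {
    pacman -Qn | while read -r name version; do
        ipfs pin add "$(pkg_path core "${name}-${version}-x86_64.pkg.tar.zst")" \
            || echo "not found: ${name}" >&2
    done
}

# Only run when both tools are actually available:
command -v pacman >/dev/null && command -v ipfs >/dev/null && pin_installed
```

Hooked into a timer (or a pacman hook on the LAN gateway), this keeps the IPFS cache warm before any machine asks pacman for the files.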

That said, I appreciate that you're running a cluster member :)

> Note that now the pacman cache actually acts the way it should: as a cache to prevent pacman from causing extra network traffic for packages that are already present.

IPFS will also cache the packages you have installed. There's no network activity involved to "download" the packages from your local IPFS cache, since they are already present.

If you're running out of disk space, IPFS will automatically clean up its cache to get some space back. So you can set the cache size to any value you like, and the garbage collection will keep it from growing beyond that.

So there's no 100 GB space requirement to get updates from IPFS, if you think that's the case.

I would recommend around 5-10 GB of cache if you install large packages over IPFS; otherwise, 2 GB should be sufficient.
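For reference, the cache cap and the automatic cleanup map onto go-ipfs's datastore settings and the daemon's garbage collector; a sketch (the 5 GB figure just mirrors the recommendation above):

```shell
# Cap the IPFS repo at ~5 GB; GC trims it back toward this limit
ipfs config Datastore.StorageMax 5GB
# How often the periodic garbage collection runs
ipfs config Datastore.GCPeriod 1h
# Automatic GC only happens when the daemon is started with it enabled
ipfs daemon --enable-gc
```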

> If you, the cluster master, don't serve an IPFS gateway, the load should be balanced well between us cluster peers.

That's a role I don't want to have in the first place. I'd like to move the cluster to trusted Arch users or Arch developers in the long term, to provide a second update channel for users next to the regular mirrors.

Having me as the importer for the cluster isn't an ideal situation. I want to keep the service working as well as I can, so maybe I'll add a second or third server in different locations. But in the long term, I'd like to remove the liability of being the one guy importing the Arch updates into IPFS, if you know what I mean. :)

from pacman.store.

RubenKelevra commented on May 27, 2024

@guysv are you fine with me going ahead, implementing my idea to the end, and pushing this to the backlog until I'm finished?

I don't think I have the time to implement multiple approaches at the same time.

We can discuss afterwards what needs improvement and where it makes sense to change things or add alternative concepts.


guysv commented on May 27, 2024

Yes. Let's give the mount approach a try :)

