Comments (18)

RubenKelevra commented on June 6, 2024

To give an impression of how often the cluster gets updated and how the updates are spaced, here's an example of what the cluster pinset looks like:

x86-64.archlinux.pkg.pacman.store@2021-01-22T02:27:51+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T05:42:31+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T07:15:24+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T07:50:30+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T08:18:23+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T08:46:05+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T08:58:01+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T09:01:40+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T09:20:09+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T09:22:16+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T10:38:40+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T10:41:29+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T10:43:59+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T10:49:25+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T11:06:22+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T11:08:10+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T11:10:33+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T11:13:21+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T11:15:12+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T11:23:56+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T11:25:55+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T11:44:53+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T11:46:51+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T11:49:42+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T12:32:01+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T13:32:18+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T13:35:00+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T13:58:06+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T14:51:42+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T14:59:54+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:02:33+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:04:12+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:08:18+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:19:07+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:22:25+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:24:31+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:29:32+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:32:47+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:35:10+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:37:58+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:42:30+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:45:50+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:48:37+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:53:18+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:56:04+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:58:12+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:01:09+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:03:55+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:07:47+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:11:29+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:14:56+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:17:01+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:19:02+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:22:28+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:24:24+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:28:45+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:31:57+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:35:47+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:38:52+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:42:27+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:45:31+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T17:16:02+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T17:24:32+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T17:44:28+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T17:49:39+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T17:52:11+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T17:54:18+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T17:57:06+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:00:51+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:04:27+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:08:10+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:11:41+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:15:08+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:20:39+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:22:39+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:27:04+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:30:05+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:33:46+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:36:04+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:38:41+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:40:35+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:42:55+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:44:42+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:46:43+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:49:00+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:51:59+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:54:01+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:57:07+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:01:55+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:04:50+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:07:45+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:10:01+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:13:45+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:16:24+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:19:40+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:23:36+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:27:29+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:30:28+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:33:01+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:35:35+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:39:17+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:41:19+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:45:04+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:49:09+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:53:04+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:57:19+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:00:15+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:02:56+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:05:18+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:07:44+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:10:47+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:14:21+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:22:57+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:27:32+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:29:36+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:31:50+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:34:21+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:36:19+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:38:36+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:41:16+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:43:22+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:45:28+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:47:55+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:51:32+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:53:32+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:55:35+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:57:45+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:59:40+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:01:42+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:06:14+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:09:58+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:13:57+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:16:24+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:18:35+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:20:30+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:23:27+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:28:08+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:30:26+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:34:07+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:37:58+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:41:58+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:44:51+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:47:22+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:50:16+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:54:16+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:57:15+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:59:21+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:01:41+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:05:48+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:07:58+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:13:02+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:15:49+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:19:01+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:21:39+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:25:13+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:27:23+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:30:08+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:33:14+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:37:41+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:40:33+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:43:39+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:46:51+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:50:39+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:53:07+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:56:32+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:00:18+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:03:57+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:07:41+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:10:46+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:13:47+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:16:09+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:19:14+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:21:42+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:24:35+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:27:42+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:30:11+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:33:15+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:35:37+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:40:16+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:42:49+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:46:04+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:49:21+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:52:00+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:58:07+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:00:42+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:03:31+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:06:48+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:09:05+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:12:50+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:15:58+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:20:24+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:23:48+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:29:42+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:32:16+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:37:57+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:40:31+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:43:30+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:47:49+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:50:06+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:53:58+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:57:20+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:00:21+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:05:37+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:10:09+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:12:47+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:16:34+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:21:29+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:25:35+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:30:17+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:33:49+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:36:51+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:39:53+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:43:14+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:46:45+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:49:21+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:51:29+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:54:47+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:57:06+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T02:09:07+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T02:11:29+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T02:16:57+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T02:19:11+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T02:22:50+00:00
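
The spacing in a list like the one above can be measured rather than eyeballed. The following is a minimal sketch, assuming each pin name has the form `<repo>@<ISO 8601 timestamp>` as shown; the function name `update_gaps_minutes` and the four-entry sample are only for illustration.

```python
from datetime import datetime
from statistics import median


def update_gaps_minutes(pin_names):
    """Return the gaps (in minutes) between consecutive pin timestamps."""
    # Each name looks like "<repo>@<timestamp>"; keep the part after "@".
    stamps = sorted(
        datetime.fromisoformat(name.split("@", 1)[1]) for name in pin_names
    )
    return [(b - a).total_seconds() / 60 for a, b in zip(stamps, stamps[1:])]


# First four entries of the pinset above.
sample = [
    "x86-64.archlinux.pkg.pacman.store@2021-01-22T02:27:51+00:00",
    "x86-64.archlinux.pkg.pacman.store@2021-01-22T05:42:31+00:00",
    "x86-64.archlinux.pkg.pacman.store@2021-01-22T07:15:24+00:00",
    "x86-64.archlinux.pkg.pacman.store@2021-01-22T07:50:30+00:00",
]
gaps = update_gaps_minutes(sample)
print(f"{len(gaps)} gaps, median {median(gaps):.1f} min, shortest {min(gaps):.1f} min")
```

Feeding the full pinset into the same function gives the distribution for the whole day.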

from pacman.store.

Luflosi commented on June 6, 2024

Wouldn't it be possible to optimise IPFS itself so that, for every node in the Merkle DAG, it caches whether or not all nodes below that node are available locally? I would imagine that this would make a recursive pin that differs only slightly from another recursive pin about as fast as a non-recursive pin. This could massively reduce the amount of IO required to recursively pin something if most of the data is already available locally.
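
To make the idea concrete, here is a rough sketch of such a cache in Python. It is not how IPFS works today; the DAG is modelled as a plain dict, and the names `subtree_complete`, `links`, `local_blocks` and `complete` are made up for this illustration. Only positive answers are cached, and a real implementation would have to drop them again whenever garbage collection removes a block.

```python
def subtree_complete(cid, links, local_blocks, complete, visited):
    """Return True if `cid` and everything below it is stored locally.

    links:        dict mapping a CID to the list of its child CIDs
    local_blocks: set of CIDs whose block is in the local store
    complete:     set of CIDs already known to have a fully local subtree
    visited:      list recording every node that had to be inspected
    """
    if cid in complete:
        return True            # cached answer: no traversal below this node
    visited.append(cid)        # stands in for reading the block from disk
    if cid not in local_blocks:
        return False
    if all(subtree_complete(child, links, local_blocks, complete, visited)
           for child in links.get(cid, ())):
        complete.add(cid)      # remember only positive answers
        return True
    return False


# Two snapshots that share the subtree "a"; only "b2" and "b3" are new.
links = {
    "root1": ["a", "b"], "a": ["a1", "a2"], "b": ["b1"],
    "root2": ["a", "b2"], "b2": ["b1", "b3"],
}
local_blocks = {"root1", "a", "a1", "a2", "b", "b1"}
complete = set()

first = []
subtree_complete("root1", links, local_blocks, complete, first)

local_blocks |= {"root2", "b2", "b3"}   # pretend the new blocks were fetched
second = []
subtree_complete("root2", links, local_blocks, complete, second)
print(len(first), "nodes inspected for the first pin,", len(second), "for the second")
```

The second check only touches the nodes that are actually new, because the shared subtree is answered from the cache.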

RubenKelevra commented on June 6, 2024

@FireMasterK wrote:

Not sure if this helps, but we can see if Arch has anything like this: https://wiki.ubuntu.com/Mirrors/PushMirroring

Well, no, technically Arch does not. But we use very frequent polling to achieve the same thing: it takes only single-digit minutes from a package maintainer pushing an update until it reaches the IPFS pacman mirror.

But the delay between updates which are available and having them added to the cluster isn't the issue here.

It's more of an internal cluster issue: each cluster member has to compare the changes against the old version, which results in very large amounts of IO from reading the old data from disk.

RubenKelevra commented on June 6, 2024

@Luflosi I've pushed the concurrency fix half an hour ago to ipfs. If you restart your follower you should see a drop in IO load.

Please report back. :)

FireMasterK commented on June 6, 2024

I feel we should explore pin-update too.

guysv commented on June 6, 2024

This doesn't solve the underlying issue, but what about updating the pins only every half a day or so? That, plus bumping on security updates, seems reasonable to me. I'm also pretty sure that's how often official mirrors update anyway.

RubenKelevra commented on June 6, 2024

@guysv there's no way to tell what a security update is and what isn't.

Apart from this, that is highly discouraged, as you can tell from the wiki page:

Check the status of the Arch mirrors by visiting the Mirror Status page. It is recommended to only use mirrors that are up to date, i.e. not out of sync.

I only run an update when there's an update available - so there's no unnecessary IO.

I think I found the reason why the IO load is so high: it runs up to 4 tasks concurrently right now.

Lowering that doesn't reduce the total IO usage, but it does reduce the load, since ipfs does less concurrently.

FireMasterK commented on June 6, 2024

Not sure if this helps, but we can see if Arch has anything like this: https://wiki.ubuntu.com/Mirrors/PushMirroring

RubenKelevra commented on June 6, 2024

@Luflosi well... yeah, there might be potential to optimize IPFS itself - which would be great.

But - there are already several ways to improve the situation which are implemented:

  • You can use BadgerDB as the block storage, which gives IPFS much more performance than FlatFS - but there's a catch: The cleanup currently doesn't work very well. See ipfs/go-ds-badger#54

  • The other option to increase speed of an update is to use GraphSync which should increase the speed of a delta transmission between two states. There are network security concerns - that's why I don't use it yet.

  • We could use the pin-update command rather than pin-add for the cluster. I'm not sure why I decided against it some months ago when I wrote version 2, but I think there was a limitation in ipfs-cluster v0.12 which had me use regular pin-add. Not sure how this works out IO wise, though.

teknomunk commented on June 6, 2024

Another option besides pinning the root directory and pinning each individual file recursively and the directories non-recursively is to add all the packages in an update to a directory that is not in the normal folder structure but contains the same files and add that to the cluster instead of the individual files. You end up with the same result as the individual file pins, but with fewer cluster entries in every instance except for single updated packages.

RubenKelevra commented on June 6, 2024

Another option besides pinning the root directory and pinning each individual file recursively and the directories non-recursively is to add all the packages in an update to a directory that is not in the normal folder structure but contains the same files and add that to the cluster instead of the individual files. You end up with the same result as the individual file pins, but with fewer cluster entries in every instance except for single updated packages.

That's not an option.

You'll end up with a lot of folders with unrelated updates which got pushed together. Say if package a to g got updated, I'll add a, b, c, d, e, f, g to a folder.

Now you got an update of d, f, x and y, so I would have to traverse all folders stored in the cluster to find the folder with d and f, delete those files from the folder, push the new version of the folder and create a new folder with d, f, x and y in it.

Over time you end up with a lot of folders with just a single package in them when they haven't got updated in a while.

Edit:

Just to give you an impression, that would be roughly 5259 folders for /community right now. For just ~8000 packages.

teknomunk commented on June 6, 2024

That is not really what I was wanting to describe. Let me try again.

The folder structure under /ipns/x86-64.archlinux.pkg.pacman.store/ is not changed at all from its current state. What I suggest is having a completely separate directory structure that contains the same package files with a different structure optimized for making the cluster members pin just the new packages without having to check all the other packages and directories in the repo.

As an example, consider that update with only the packages abiword and go-ipfs. You would create a directory like this:

/2021-01-22-001/
/2021-01-22-001/abiword-3.0.4-4-x86_64.pkg.tar.zst
/2021-01-22-001/go-ipfs-0.7.0-1-x86_64.pkg.tar.zst

in addition to updating /extra/ and /community/, then add the hash of the folder /2021-01-22-001/ to the cluster. This folder would exist only in the cluster, just for the purpose of having the cluster members pin those two new packages.

If you then got another set of package updates, you would create another folder for only those packages:

/2021-01-22-002/
/2021-01-22-002/dbus-broker-26-1-x86_64.pkg.tar.zst
/2021-01-22-002/fftw-3.3.9-1-x86_64.pkg.tar.zst
/2021-01-22-002/xorg-docs-1.7.1-3-any.pkg.tar.zst
/2021-01-22-002/yasm-1.3.0-4-x86_64.pkg.tar.zst

After all the packages in a directory have been removed from upstream, after some fixed expiration time, or on some other condition, the update directory (e.g. /2021-01-22-001/) is unpinned from the cluster.

Looking at rsync2ipfs-cluster/bin/rsync2cluster.sh, to implement this idea, I think you will only need to modify ipfs_mfs_add_file() to take a third parameter (the update folder path in MFS) along with adding the file's CID to the update folder, and add the update folder to the cluster pin set.
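
As a rough illustration of the bookkeeping this would need, here is a small Python sketch. It does not touch IPFS or the cluster at all and is not taken from rsync2cluster.sh; the function names `next_update_folder`, `plan_update` and `expired_folders` are hypothetical, and the `/YYYY-MM-DD-NNN/` naming simply follows the example above.

```python
from datetime import date


def next_update_folder(day, existing):
    """Return the next unused '/YYYY-MM-DD-NNN/' folder name for `day`."""
    prefix = f"/{day.isoformat()}-"
    used = [
        int(name[len(prefix):].strip("/"))
        for name in existing
        if name.startswith(prefix)
    ]
    return f"{prefix}{max(used, default=0) + 1:03d}/"


def plan_update(day, existing, packages):
    """Return the new update folder and the paths of the packages inside it."""
    folder = next_update_folder(day, existing)
    return folder, [folder + pkg for pkg in sorted(packages)]


def expired_folders(folders, upstream_packages):
    """Return update folders none of whose packages are still upstream.

    folders: dict mapping a folder name to the set of package files in it
    """
    return [name for name, pkgs in folders.items() if not (pkgs & upstream_packages)]


folder, paths = plan_update(
    date(2021, 1, 22),
    [],
    ["go-ipfs-0.7.0-1-x86_64.pkg.tar.zst", "abiword-3.0.4-4-x86_64.pkg.tar.zst"],
)
print(folder)
for path in paths:
    print(path)
```

Only the folder returned by `plan_update` would be added to the cluster pinset, and the folders returned by `expired_folders` would be unpinned.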

Luflosi commented on June 6, 2024

@RubenKelevra I tried using BadgerDB a couple weeks ago but I think that made performance worse, possibly because my ZFS recordsize is set to 128K, which might be too large for BadgerDB. But I couldn't find any information about the optimal block size for BadgerDB online. Setting the recordsize to a lower value also reduces the possible compression ratio. Setting it to 4k when ashift is 12 effectively disables compression. I was also concerned about fragmentation at such a low recordsize. I also didn't feel like experimenting since my repo has enough blocks that many operations such as the conversion take over a day to complete, so I just went back to an older snapshot with FlatFS.
I think you're also using ZFS. What recordsize are you using and is BadgerDB working well for you?
I found the Datastore.BloomFilterSize option, which sounds like it has the potential to speed up pinning operations but I couldn't find any documentation on what it actually does. Do you know?
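
The recordsize and ashift point can be checked with a bit of arithmetic. The sketch below uses a deliberately simplified model, assuming that every record is allocated in whole sectors of 2^ashift bytes and ignoring metadata, padding and any special handling of tiny blocks; `allocated_bytes` is a name made up for this example.

```python
import math


def allocated_bytes(compressed_size, ashift):
    """On-disk size of one record when space is handed out in 2**ashift sectors."""
    sector = 1 << ashift
    return max(1, math.ceil(compressed_size / sector)) * sector


# A 4K record that compresses to half its size still occupies one 4K sector.
print(allocated_bytes(2048, ashift=12))
# A 128K record that compresses to half its size really does save space.
print(allocated_bytes(64 * 1024, ashift=12))
```

Under this model a 4K record can never shrink below one sector, which is why compression stops paying off at that recordsize.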

RubenKelevra commented on June 6, 2024

@Luflosi wrote

I think you're also using ZFS. What recordsize are you using and is BadgerDB working well for you?

I'm currently using the following settings on the import server:

  • FlatFS
  • Blocksize 256K
  • Checksum: Edon-R
  • dedup: Edon-R+verify
  • compression: off
  • sync: disabled

BadgerDB isn't an option for me, since I need to hold the data both outside and inside of IPFS, which makes ZFS deduplication very worthwhile.

For BadgerDB I would recommend a blocksize of 8K, as for all databases, with compression and deduplication turned off.

I would also recommend turning off sync in ZFS. That might sound counter-intuitive, but ZFS already makes sure that overwrites and other atomic changes to the data stay atomic. That guarantee is the reason a database sends a sync, so ZFS can ignore the command and do its operations more efficiently.

I found the Datastore.BloomFilterSize option, which sounds like it has the potential to speed up pinning operations but I couldn't find any documentation on what it actually does. Do you know?

Well, it's an interesting feature, but you need to tune it depending on the amount of CIDs you store. It replaces a "let's check if we got that data already" with a "let's have a calculation which gives us with 99.xx% certainty the same result".

It's definitely faster, but rarely used. That's why I stay away from it in my daily IPFS usage.

I don't know what happens when we cannot deliver 0.xx percent of the blocks because their checksums were too similar, but I guess it wouldn't be pretty.
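
For reference, this is roughly what a Bloom filter does. The sketch below is a generic textbook version in Python, not the implementation go-ipfs uses, and the sizes are arbitrary. One property of the standard data structure is worth knowing: a "no" is always correct, while a "yes" can occasionally be wrong, so a "yes" normally has to be followed by a real lookup.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits, num_hashes):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with a counter.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "big") + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "probably present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


bloom = BloomFilter(size_bits=16384, num_hashes=4)
stored = [f"block-{i}".encode() for i in range(1000)]
absent = [f"other-{i}".encode() for i in range(1000)]
for key in stored:
    bloom.add(key)

false_negatives = sum(not bloom.might_contain(k) for k in stored)
false_positives = sum(bloom.might_contain(k) for k in absent)
print("false negatives:", false_negatives, "false positives:", false_positives)
```

The tuning mentioned above corresponds to `size_bits` and `num_hashes` here: the more keys are stored relative to the filter size, the more false positives show up.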

Moved to #47

RubenKelevra commented on June 6, 2024

Hey @teknomunk,

thanks ... I feel like this idea needs a dedicated ticket. Can you move your idea to a new ticket so we can discuss it there? This one here is more like a meta-ticket and it gets a bit too cluttered with our discussion here :)

Luflosi commented on June 6, 2024

If you restart your follower you should see a drop in IO load.

Please report back. :)

I restarted my follower a couple days ago and now it finally caught up (pinned everything) but I'm not sure if it's better or worse than before, since I changed quite a lot of things on my end. Since I last ran the follower, I switched Linux distros from Arch Linux to NixOS, put my root FS on the same ZFS pool as everything else, added a SATA SSD as an L2ARC and added the Bloom filter option in the IPFS config.
I think I'm currently limited by the 3 Gb/s SATA interface my SSD is connected to. When I next reboot my server, I'll attach it somewhere else that might be a 6 Gb/s interface.
I'm running a couple Arch Linux LXD containers, so the cluster still has relevance for me besides just donating bandwidth.
Maybe I'll try BadgerDB again at some point but with 4k recordsize and no compression.

RubenKelevra commented on June 6, 2024

@RubenKelevra wrote

@Luflosi well... yeah, there might be potential to optimize IPFS itself - which would be great.

But - there are already several ways to improve the situation which are implemented:

  • You can use BadgerDB as the block storage, which gives IPFS much more performance than FlatFS - but there's a catch: The cleanup currently doesn't work very well. See ipfs/go-ds-badger#54

This has been fixed with IPFS 0.8 and IPFS-Cluster 0.13.1.

  • The other option to increase speed of an update is to use GraphSync which should increase the speed of a delta transmission between two states. There are network security concerns - that's why I don't use it yet.

Still not fixed upstream.

  • We could use the pin-update command rather than pin-add for the cluster. I'm not sure why I decided against it some months ago when I wrote version 2, but I think there was a limitation in ipfs-cluster v0.12 which had me use regular pin-add. Not sure how this works out IO wise, though.

Today I started running the cluster with pin-update. Please report back how the IO is now.

RubenKelevra commented on June 6, 2024

Since no further feedback has come in, I guess this is resolved.
