
Comments (15)

Tojaj commented on May 12, 2024

Createrepo_c does a lot of things in threads (almost everything that could reasonably be parallelized is parallelized) [1] [2] [3]. RPM headers are read and dumped (XML chunks are generated) in threads, checksums are calculated in threads, repodata files are written and compressed simultaneously in threads (one thread per file), and deltarpms are also generated in threads.

I looked into this and switched the lzma (xz) encoder from lzma_easy_encoder to the threaded lzma_stream_encoder_mt [4]. Here are the results:
https://gist.github.com/Tojaj/cfa35863d481e5f3428d
The performance is worse. This was just a quick-and-dirty benchmark; I'll look at it in more detail soon (and try to tweak the configuration a bit). For now, as expected, it seems that adding more threads to already parallelized tasks doesn't bring any performance boost, only overhead that degrades performance.

[1] https://github.com/rpm-software-management/createrepo_c/blob/master/src/dumper_thread.c#L298
[2] https://github.com/rpm-software-management/createrepo_c/blob/master/src/threads.h
[3] https://github.com/rpm-software-management/createrepo_c/blob/master/src/deltarpms.c#L302
[4] http://git.tukaani.org/?p=xz.git;a=blob_plain;f=doc/examples/04_compress_easy_mt.c;hb=HEAD
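
To illustrate the "one thread per file" compression described above, here is a minimal sketch in Python. It is not createrepo_c's actual C code: the function names (`compress_xz`, `compress_all`) and the sample file names are assumptions made for illustration, and it uses only Python's standard `lzma` module, which compresses each stream in a single thread. It only shows the shape of the approach: when every output file already has its own worker, extra threads inside each encoder compete for the same cores.

```python
import lzma
from concurrent.futures import ThreadPoolExecutor
from typing import Dict


def compress_xz(data: bytes) -> bytes:
    """Compress one buffer with xz; each call runs single-threaded."""
    return lzma.compress(data, preset=6)


def compress_all(files: Dict[str, bytes]) -> Dict[str, bytes]:
    """Compress every file in its own worker thread (one thread per file)."""
    workers = max(1, len(files))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submit all jobs first so they run concurrently, then collect results.
        futures = {name: pool.submit(compress_xz, data) for name, data in files.items()}
        return {name + ".xz": future.result() for name, future in futures.items()}


if __name__ == "__main__":
    # Hypothetical stand-ins for repodata files.
    sample = {
        "primary.xml": b"<package/>\n" * 10000,
        "filelists.xml": b"<file/>\n" * 10000,
        "other.xml": b"<changelog/>\n" * 10000,
    }
    for name, blob in compress_all(sample).items():
        print(name, len(blob), "bytes")
```

With only a handful of repodata files, this layout keeps at most that many cores busy, which is consistent with a threaded encoder helping more on a machine with many idle cores.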

from createrepo_c.

pterjan commented on May 12, 2024

Thank you for the patch, I just tried it on my 4-core machine:

With this patch: real 1m22.385s
Without this patch: real 1m35.695s

So indeed the improvement is not great.

I then tried on the Mageia upload server (32 cores) and this is a different story:

With this patch: real 0m50.894s
Without this patch: real 1m45.635s


Tojaj commented on May 12, 2024

@pterjan Thanks for the details! I'm glad that it helped! I'll keep this as a compile-time option for now, but may make it available via a command-line parameter in the future (once threaded decompression is available for XZ too).

I'm going to leave this issue open until then.


Conan-Kudo commented on May 12, 2024

@Tojaj As far as I know, it's also available in xz. I thought it was made available at the same time?


Tojaj commented on May 12, 2024

@Conan-Kudo According to the author, the release notes were wrong and only compression is supported so far.


Tojaj commented on May 12, 2024

@Conan-Kudo found it: http://permalink.gmane.org/gmane.comp.compression.xz.devel/226


Conan-Kudo commented on May 12, 2024

@Tojaj ah, thanks for the reference.


dralley commented on May 12, 2024

Suggestion: switch to zlib-ng instead. The speedup is significant and probably comparable to or better than multi-threading the current zlib.

zlib-ng/zlib-ng#871

It is available in Fedora, though not EL.

Also, Xz is no longer really the best option, now that Zstd exists.


j-mracek commented on May 12, 2024

It is not a problem, but I do not see anyone using zlib-ng in Fedora. And if we are the only user in RHEL, it means additional work and responsibility for us.


j-mracek commented on May 12, 2024

Please also consider the required compatibility, that is, the need for backports to distributions with a longer lifecycle (let's say 10 years).
I also recommend testing the performance using real repository data. Sometimes the results can differ.
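
A minimal sketch of such a measurement, assuming Python and its standard `zlib` and `lzma` modules as stand-ins for the compressors discussed here (zlib-ng and zstd are not in the standard library, so they are not covered). The helper names `time_codec` and `benchmark` are hypothetical, and the input path is taken from the command line rather than assumed.

```python
import lzma
import sys
import time
import zlib
from typing import Callable, Dict, List


def time_codec(name: str, compress: Callable[[bytes], bytes], data: bytes) -> Dict[str, object]:
    """Run one compressor once and record wall-clock time and size ratio."""
    start = time.perf_counter()
    out = compress(data)
    elapsed = time.perf_counter() - start
    return {"codec": name, "seconds": elapsed, "ratio": len(out) / len(data)}


def benchmark(data: bytes) -> List[Dict[str, object]]:
    """Compare gzip-style (zlib) and xz (lzma) compression on the same input."""
    return [
        time_codec("zlib-6", lambda d: zlib.compress(d, 6), data),
        time_codec("xz-6", lambda d: lzma.compress(d, preset=6), data),
    ]


if __name__ == "__main__":
    # Pass the path to an uncompressed repodata file, e.g. a primary.xml.
    with open(sys.argv[1], "rb") as handle:
        payload = handle.read()
    for row in benchmark(payload):
        print("{codec}: {seconds:.3f}s, ratio {ratio:.3f}".format(**row))
```

A single run per codec is noisy; repeating the measurement and using metadata from a real repository gives more trustworthy numbers than synthetic input.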


dralley commented on May 12, 2024

I'm not actually suggesting it be done immediately. I'm just saying (on a 6-year-old RFE) that, now that we're in the future, this would be a better approach than the original recommendation.

I have actually tried it, not with createrepo_c but separately, with repodata specifically, and it was roughly twice as fast.

zlib-ng can be built with an API compatible with the original zlib, so it may be possible to propose it as a system-wide change in Fedora (eventually). Then all applications would benefit and no changes would be needed here.


Conan-Kudo commented on May 12, 2024

Why not go ahead and propose that for F38?


dralley commented on May 12, 2024

Would it come with the expectation that I would be responsible for seeing it through? Because I definitely don't have the C background necessary to do that for a core library.


Conan-Kudo commented on May 12, 2024

You're underselling yourself, but you should probably involve the maintainer of zlib and zlib-ng in the Change so they can help you.


j-mracek commented on May 12, 2024

Maybe this can be useful: here is a link to the Change that switched RPM compression: https://fedoraproject.org/wiki/Changes/Switch_RPMs_to_zstd_compression. I know it is a different thing, but maybe it can be helpful.

