Comments (13)

dralley commented on May 25, 2024

Can you give a general overview of how rpm-ostree is currently using the existing RPM metadata format, and how it is using its own special metadata (not specific to this particular change)? Is it equally dependent on RPM repo metadata, or is it independent, but you need a way to make it easier to produce from existing Fedora / RHEL / etc. repos without needing to coordinate standing up a set of separate services across all of those distros?

from createrepo_c.

cgwalters commented on May 25, 2024

Can you give a general overview of how rpm-ostree is currently using the existing RPM metadata format,

rpm-ostree uses libdnf, the same as dnf.

and how it is using its own special metadata (not specific to this particular change)?

There is no special metadata today. We're talking about adding some new metadata about the rate of change of RPMs, which would not be used by clients by default. It'd be used by build tooling.

Just for the record, I'm copy-pasting from the hackmd below:


proposal: package change metadata

We're working on bodhi-scraper, which is part of a larger effort in rpm-ostree to optimize the packing of images.

We require a metadata file (frequencyUpdateInfo.json) in the repodata of every (Fedora/RHEL/SCOS/Kinoite) repository. This file will contain the list of all updates to all of the packages of the specific release.

This list of updates is more comprehensive than the one present in updateinfo.xml.

We then combine the frequencyUpdateInfo.json files of all the current and pending releases and process them to create a file.
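
As a rough illustration of the combining step, here is a minimal Python sketch. The JSON layout used here (an `updates` list whose entries carry a `package` key) is purely an assumption for illustration; the proposal does not define a schema, and this is not bodhi-scraper's actual code.

```python
import json
from collections import Counter


def combine_update_counts(release_docs):
    """Count updates per package across several per-release documents.

    Each document is assumed (hypothetically) to look like
    {"updates": [{"package": "<name>"}, ...]}.
    """
    counts = Counter()
    for doc in release_docs:
        for update in doc.get("updates", []):
            counts[update["package"]] += 1
    return counts


def relative_frequency(counts):
    """Turn raw update counts into each package's share of all updates."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}


# Two made-up documents standing in for per-release frequencyUpdateInfo.json files.
release_a = json.loads(
    '{"updates": [{"package": "kernel"}, {"package": "kernel"}, {"package": "bash"}]}'
)
release_b = json.loads('{"updates": [{"package": "kernel"}, {"package": "glibc"}]}')

counts = combine_update_counts([release_a, release_b])
frequencies = relative_frequency(counts)
```

Packages with a high share of updates could then be treated differently from rarely updated ones when packing images; how that decision is made is outside the scope of this sketch.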

Since this file will be required for all rpm-ostree based Linux distributions, we wanted the architecture to integrate with createrepo_c to make the implementation more general.

Option 1: Inject this into primary.xml

Since this is a relatively small amount of additional metadata per package, we could add it to the primary package metadata. primary.xml is already enormous.

Option 2: Add a new updatemeta.json

We could introduce a new metadata file (JSON, since this is the 2020s) that contains this metadata instead.

Note that either option implies freezing (at least the first version of) the data shipped.

dralley commented on May 25, 2024

rpm-ostree uses libdnf, the same as dnf. There is no special metadata today.

I was expecting that ostree-specific metadata is involved somewhere in the chain, but if not, apologies.

We're talking about adding some new metadata about the rate of change of RPMs, which would not be used by clients by default. It'd be used by build tooling.

Not used by clients by "default", or not used at all?

cgwalters commented on May 25, 2024

I was expecting that ostree-specific metadata is involved somewhere in the chain, but if not, apologies.

rpm-ostree uses ostree by default; no rpm metadata at all is fetched. https://fedoraproject.org/wiki/Changes/OstreeNativeContainerStable is in the process of s/ostree/containers/.

Not used by clients by "default", or not used at all?

client = dnf here basically. rpm-md fetches are lazy (usually) - clients only fetch what they care about, except for mirroring. But we obviously intend to use this additional metadata for build tooling which generates container images (usually, server side).
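
To make the lazy-fetch point concrete, here is a small Python sketch of how a consumer can read repomd.xml and pick out only the metadata types it cares about. The `frequencyinfo` type name and the sample `href` values are hypothetical; only the general repomd.xml layout (`data` elements with a `type` attribute and a `location` child) reflects the existing format.

```python
import xml.etree.ElementTree as ET

# XML namespace used by repomd.xml.
REPO_NS = {"repo": "http://linux.duke.edu/metadata/repo"}


def select_metadata(repomd_xml, wanted_types):
    """Return {type: href} for only the metadata types the caller wants."""
    root = ET.fromstring(repomd_xml)
    selected = {}
    for data in root.findall("repo:data", REPO_NS):
        md_type = data.get("type")
        location = data.find("repo:location", REPO_NS)
        if md_type in wanted_types and location is not None:
            selected[md_type] = location.get("href")
    return selected


# Trimmed-down, made-up repomd.xml; "frequencyinfo" is a hypothetical type.
SAMPLE_REPOMD = """\
<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary"><location href="repodata/primary.xml.gz"/></data>
  <data type="filelists"><location href="repodata/filelists.xml.gz"/></data>
  <data type="updateinfo"><location href="repodata/updateinfo.xml.gz"/></data>
  <data type="frequencyinfo"><location href="repodata/frequencyinfo.json"/></data>
</repomd>
"""

# A client that only needs package data never touches the extra file.
minimal_client = select_metadata(SAMPLE_REPOMD, {"primary"})

# Build tooling opts in to the additional metadata explicitly.
build_tooling = select_metadata(SAMPLE_REPOMD, {"primary", "frequencyinfo"})
```

Under this assumption, an extra entry in repomd.xml costs ordinary clients nothing beyond a few more bytes in repomd.xml itself.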

dralley commented on May 25, 2024

rpm-ostree uses ostree by default; no rpm metadata at all is fetched.

Well, that is what I was asking :) Basically I'm just trying to figure out if it is orthogonal to the actual client concerns w/r/t RPM metadata (and you just want it to be present alongside the repo purely for helping out the build tooling) or if it's intertwined.

Because you can have the metadata at a specified place in the repo without necessarily having it be in repodata and registered in repomd.xml. Like kickstart trees are.

Also you may already know this but as RHEL doesn't use createrepo_c, they will be a "special snowflake" no matter what.

cgwalters commented on May 25, 2024

(and you just want it to be present alongside the repo purely for helping out the build tooling)

Basically this. It's data which is strongly associated with the set of RPMs, and having some sort of external data storage for it creates problems around "lifecycling" this data with the packages.

Because you can have the metadata at a specified place in the repo without necessarily having it be in repodata and registered in repomd.xml. Like kickstart trees are.

Hmm, true. That's definitely an option for PoC work here at least!

Also you may already know this but as RHEL doesn't use createrepo_c, they will be a "special snowflake" no matter what.

I didn't know that...exciting. What is it? Pulp?

cgwalters commented on May 25, 2024

@RishabhSaini I think basically what we can do for PoC work here is:

  1. Test out creating a copy of the fedora rpm-md repo or a subset even
  2. Inject the frequencyinfo.json file into that
  3. Teach rpm-ostree to try fetching it from the same location as the input repos
  4. Use it if it exists

Once that's done...I'm sure we could ask Fedora infra to try adding this data just manually...maybe have a process that pulls it from a git repo?

I think the part that requires the most code is step 3, but it shouldn't be too bad.
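
Steps 3 and 4 above could be sketched roughly as follows, in Python rather than in rpm-ostree's own code. The assumption that the file sits at `<baseurl>/repodata/frequencyinfo.json`, and the function name itself, are illustrative only; nothing here is an existing rpm-ostree or libdnf API.

```python
import json
import pathlib
import tempfile
import urllib.error
import urllib.request


def fetch_frequency_info(baseurl, filename="frequencyinfo.json"):
    """Try to fetch the optional frequency file next to the repo metadata.

    Returns the parsed JSON, or None if the file cannot be fetched
    (for example HTTP 404 or a missing local file). Malformed JSON is
    deliberately left to raise, since that indicates a broken repo.
    """
    url = baseurl.rstrip("/") + "/repodata/" + filename
    try:
        with urllib.request.urlopen(url, timeout=30) as response:
            return json.load(response)
    except urllib.error.URLError:
        # HTTPError is a subclass of URLError, so 404s land here too.
        return None


# Demo against a throwaway local repo layout using file:// URLs.
with tempfile.TemporaryDirectory() as tmp:
    repo = pathlib.Path(tmp)
    (repo / "repodata").mkdir()
    (repo / "repodata" / "frequencyinfo.json").write_text('{"kernel": 0.6}')

    found = fetch_frequency_info(repo.as_uri())
    missing = fetch_frequency_info((repo / "no-such-repo").as_uri())
```

Returning `None` for a missing file keeps the data optional: a repo without it behaves exactly as it does today.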

RishabhSaini commented on May 25, 2024

Test out creating a copy of the fedora rpm-md repo or a subset even

Does this mean creating a new RPM with a zero-sized payload whose only purpose is to carry the metadata (frequencyinfo.json) that rpm-ostree needs?
That RPM would then need to be published for rpm-ostree to consume.

cgwalters commented on May 25, 2024

rpm-md repositories are just regular files served by a (usually static) webserver. I wasn't thinking we'd make a new rpm, but literally just drop frequencyinfo.json alongside the other repodata files (e.g. the files in this path).

Maybe actually what would work best is to support finding the frequency information in a separate rpm-md repository too...then we could do e.g.:

$ mkdir testrepo
$ cd testrepo
$ echo '{ dummy frequency info }' > frequencyinfo.json
$ createrepo_c .

Then point rpm-ostree at it via a repo file like

[testrepo]
baseurl=file:///path/to/testrepo

etc.

RishabhSaini commented on May 25, 2024

Okay thanks for the help!

RishabhSaini commented on May 25, 2024

For easy reference, I will use the following names:
Name of the yum repo file: frequency.repo
Name of the rpm-md repo containing frequencyinfo.json: frequencyRepo

As outlined in fedora-infra/bodhi#5172, the repodata will still need to contain a list of updates that is more comprehensive than updateinfo.xml, called FrequencyUpdateInfoMetadata.json.

Then point rpm-ostree at it via a repo file like

To implement this, frequency.repo will need to be added to fcos-config so that the COSA build scripts can add it to /etc/yum.repos.d when creating a new release of FCOS, making it visible to rpm-ostree.

Will frequencyRepo be hosted somewhere (GitHub?) or just kept locally as a folder? How will updates to the repo work?
When bodhi-scraper is done generating an updated version of updateinfo.json, the file in frequencyRepo will need to be replaced, and then createrepo_c needs to be run to update repomd.xml and the checksums. How and where will this workflow be handled?

j-mracek commented on May 25, 2024

I am really sorry, but I am simply lost here. Let me summarize what I understand: rpm-ostree is trying to solve a problem, but the problem appears to be related to building images/containers for rpm-ostree. I don't know whether the new metadata will contain unique or additional information that is not already present in the RPMs or in the existing repository metadata.
I also don't know whether the proposal addresses a performance issue or something else.

If the new metadata is only used internally, I would not recommend including it in the repository metadata. We have experience with module.yaml, which contains information that is of no use to end users and is used only by infrastructure for internal purposes. I would therefore like to avoid this if possible.

There is also a problem with propagating a new type of metadata on the user side. Again, we learned this with modules: customers regenerate repositories in their own workflows, and the additional metadata is dropped in the process.

Please don't take my note as a negative reaction. I just want to say that I don't have enough information, and I would like to share our experience with a new type of metadata that is essential for the distribution.

cgwalters commented on May 25, 2024

Thanks for the reply! The comparison with modulemd is actually quite apt.

In the case of this metadata, unlike modulemd it's not essential. Like modulemd, it really helps if it goes where the rpms go by default.

In the end, I think this data is going to be very small; it's just a historical relative frequency of updates for each package. We discussed trying to insert it into primary.xml, but that's slightly messy because it's not actually something that comes from the RPM headers.
