
Comments (8)

dvzrv commented on June 15, 2024

There is an interesting point wrt. license compliance when the tarballs do not contain the licenses. We can discuss this in a separate issue/discussion.

We separately make source tarballs available for the upstreams licensed under terms that require it. All Arch Linux systems have the common license files installed which the binary package files require though :)

NB: I might move this issue to ScanCode toolkit or create a new one there as the fix will happen there

Thanks for getting back on this matter so quickly! 🥳

And related: if you want to actually detect the licenses in the code and match that to the metadata-level expression, you may want to look into ScanCode toolkit or ScanCode.io

Ah thanks, that's a great piece of info! I guess our tooling still lacks there. Personally I rely on SPDX-License-Identifiers in upstream code mostly, or other specific license information provided by upstreams.

from license-expression.

Foxboron commented on June 15, 2024

To add insult to injury, https://github.com/nix-community/acpi_call is also missing the GPL text, and ARCH is further incorrectly reporting a plain "GPL", which literally means "GPL-1.0-or-later", instead of the upstream "GPL-3.0-or-later"... so the work done on ARCH is a bit lossy wrt. upstream, and upstream is not even complying with its own license.

fwiw, it's still ongoing work to move to SPDX identifiers. And I assume a lot of issues like that will be fixed over the next year.


pombredanne commented on June 15, 2024

@dvzrv Thanks for using this small lib! This is awesome.

This leads me to the question: Is there a specific reason why LLGPL is treated as a plain license and not as a license exception?

This is an oversight, a data bug on our side, and easy to fix: add an is_exception: yes to https://github.com/nexB/scancode-toolkit/blob/develop/src/licensedcode/data/licenses/llgpl.LICENSE

Like in https://github.com/nexB/scancode-toolkit/blob/develop/src/licensedcode/data/licenses/classpath-exception-2.0.LICENSE#L8

https://gitlab.archlinux.org/archlinux/packaging/packages/licenses/-/blob/main/.SRCINFO?ref_type=heads#L7

I would advise prefixing this with arch-linux or something to the same effect, as in LicenseRef-archlinux-none ... even better would be to have a proper license.
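To illustrate the naming advice, here is a minimal standard-library sketch that checks whether a custom identifier has the SPDX `LicenseRef-` shape and carries a distro namespace. The helper name `is_namespaced_licenseref` and the default `archlinux` namespace are assumptions made up for this example; they are not part of license-expression, ScanCode, or any Arch tooling.

```python
import re

# SPDX user-defined identifiers have the form "LicenseRef-<idstring>", where
# idstring consists of letters, digits, "." and "-".
LICENSEREF = re.compile(r"LicenseRef-[A-Za-z0-9.\-]+")


def is_namespaced_licenseref(identifier: str, namespace: str = "archlinux") -> bool:
    """Return True if identifier is a LicenseRef prefixed with the namespace.

    Hypothetical helper, for illustration only.
    """
    if LICENSEREF.fullmatch(identifier) is None:
        return False
    return identifier.startswith(f"LicenseRef-{namespace}-")


print(is_namespaced_licenseref("LicenseRef-archlinux-none"))
print(is_namespaced_licenseref("LicenseRef-none"))
```

A check like this only validates the shape of the name; it says nothing about whether a license text exists for it.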

We also provide full lists of all known license and exception identifiers separately. The package allows us to centrally share common license files and not repackage them in every package.

There is an interesting point wrt. license compliance when the tarballs do not contain the licenses. We can discuss this in a separate issue/discussion.

NB: I might move this issue to ScanCode toolkit or create a new one there as the fix will happen there


pombredanne commented on June 15, 2024

And related: if you want to actually detect the licenses in the code and match that to the metadata-level expression, you may want to look into ScanCode toolkit or ScanCode.io


pombredanne commented on June 15, 2024

@dvzrv you wrote:

We separately make source tarballs available for the upstreams licensed under terms that require it. All Arch Linux systems have the common license files installed which the binary package files require though :)

FWIW, we actually wrote a parser for Arch's PKGBUILD shell scripts in https://github.com/nexB/scancode-toolkit/blob/f70bbb7d9d9bab40a9d504e664bc945b6a1630e8/src/packagedcode/bashlex.py#L43 and https://github.com/nexB/scancode-toolkit/blob/f70bbb7d9d9bab40a9d504e664bc945b6a1630e8/src/packagedcode/bashparse.py (the same format is also used by Alpine Linux APKBUILD files and Msys2/mingw). There is also another parser for PKGINFO/BUILDINFO (for msys2) that I need to validate for use with AUR: https://github.com/nexB/scancode-plugins/blob/4df0cf04e1b7b6774ba6e983c7c57002f19327c9/etc/scripts/msys2.py#L883

The shared package containing all the license texts satisfies the engineer in me who wants to avoid file duplication in downloads. The problem is that each and every binary download, such as https://archlinux.org/packages/extra/any/acpi_call-dkms/download/, is then also not GPL compliant and is missing the all-important GPL license text. :]

To add insult to injury, https://github.com/nix-community/acpi_call is also missing the GPL text, and ARCH is further incorrectly reporting a plain "GPL", which literally means "GPL-1.0-or-later", instead of the upstream "GPL-3.0-or-later"... so the work done on ARCH is a bit lossy wrt. upstream, and upstream is not even complying with its own license.
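The lossy "GPL" case can be made concrete with a small sketch. It assumes a lookup table of versionless legacy names and their literal SPDX reading; only the "GPL" entry comes from this thread, and both the table and the `check_declared` helper are hypothetical, not something Arch or ScanCode ships.

```python
from typing import Optional

# Literal SPDX reading of versionless legacy names. Only the "GPL" entry comes
# from this thread; the table itself is a hypothetical illustration.
LITERAL_READING = {"GPL": "GPL-1.0-or-later"}


def check_declared(declared: str, upstream: str) -> Optional[str]:
    """Return a warning if a versionless declared name loses upstream detail."""
    literal = LITERAL_READING.get(declared)
    if literal is None or literal == upstream:
        return None
    return f"{declared!r} reads as {literal!r}, but upstream declares {upstream!r}"


print(check_declared("GPL", "GPL-3.0-or-later"))
```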

Ah thanks, that's a great piece of info! I guess our tooling still lacks there. Personally I rely on SPDX-License-Identifiers in upstream code mostly, or other specific license information provided by upstreams.

That's not bad as a start, but for a comprehensive approach, a full scan with ScanCode will not hurt. I helped add proper SPDX license IDs to the kernel a few years ago, for instance, but the point is that, as much as I wish otherwise, not everyone uses them.


pombredanne commented on June 15, 2024

@Foxboron 👋 much honored! Tell me how we can help within our modest capabilities. I work a bit with Debian developers too and I want every package to have a clean and clear license expression.

Ideally, I would love to have the top-level PKGBUILD license expression being derived automatically from the actual code licenses notices (or overridden when things are missing/incorrect and eventually pushed upstream as fixes for future releases)

Is this something we could work on as a cross-distro collaboration, and maybe find some good souls to help with the effort? (It is a massive undertaking at distro scale ... and makes the work I did on the kernel look like small potatoes.)


Foxboron commented on June 15, 2024

Tell me how we can help within our modest capabilities. I work a bit with Debian developers too and I want every package to have a clean and clear license expression.

Thanks for the offer. Currently it is very much a manual process and people are updating license information as they go along with the help of the community.

I'm sure David would appreciate more eyes on the code he has already written :)

https://gitlab.archlinux.org/pacman/namcap/-/blob/master/Namcap/rules/licensepkg.py?ref_type=heads

Ideally, I would love to have the top-level PKGBUILD license expression being derived automatically from the actual code licenses notices (or overridden when things are missing/incorrect and eventually pushed upstream as fixes for future releases)

I suspect this could be part of our developer tooling, our package linter and/or a future CI/CD check. I don't think it could be part of the package manager itself. Arch is probably not going to be as detailed as distros like Debian currently are, which helps make this a bit simpler to implement.
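As a rough idea of what such a linter or CI check might do, the sketch below collects `SPDX-License-Identifier:` tags from source text and joins them into a candidate top-level expression. It uses only the standard library; the function names are made up for this example, and it is not how namcap or ScanCode work. Files without SPDX tags are simply invisible to it, which is where a full scan would come in.

```python
import re
from typing import Iterable, Set

# Matches the standard SPDX tag and captures the rest of the line.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+)")


def collect_spdx_tags(source_texts: Iterable[str]) -> Set[str]:
    """Collect the distinct expressions declared in SPDX tags (hypothetical helper)."""
    found = set()
    for text in source_texts:
        for match in SPDX_TAG.finditer(text):
            # Drop a trailing comment closer such as "*/" and any whitespace.
            found.add(match.group(1).strip().rstrip("*/").strip())
    return found


def derive_expression(tags: Set[str]) -> str:
    """Join the tags with AND, parenthesizing compound expressions."""
    parts = [f"({tag})" if " " in tag else tag for tag in sorted(tags)]
    return " AND ".join(parts)


sources = [
    "// SPDX-License-Identifier: GPL-3.0-or-later\nint main(void);",
    "/* SPDX-License-Identifier: MIT */\n",
]
print(derive_expression(collect_spdx_tags(sources)))
```

The derived string could then be compared against the license field declared in the PKGBUILD, with manual overrides where upstream information is missing or wrong.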

Is this something we could work on as a cross-distro collaboration, and maybe find some good souls to help with the effort? (It is a massive undertaking at distro scale ... and makes the work I did on the kernel look like small potatoes.)

I'm not sure. It's interesting and I can point at the places where such a thing could be implemented. But I'm not sure if anyone in Arch is up for the time investment.


pombredanne commented on June 15, 2024

@dvzrv your code rocks!

See also some pieces of code that may help:

Process a license field in a package manifest:

Enhanced expressions parsing:

