GithubHelp home page GithubHelp logo

pombredanne / creadur-tentacles Goto Github PK

View Code? Open in Web Editor NEW

This project forked from apache/creadur-tentacles

0.0 1.0 0.0 452 KB

Mirror of Apache Tentacles

License: Apache License 2.0

Java 98.80% CSS 1.20%

creadur-tentacles's Introduction

# Running

The tool will download all the archives from a staging repo, unpack
them and create a little report of what is there.

    java -ea -jar apache-tentacles-0.1-SNAPSHOT.jar https://repository.apache.org/content/repositories/orgapacheopenejb-090

Assertions must be enabled.

The tool is not specific to maven and will simply recursively walk
the provided URL and download all files matching the following
pattern:

    .*\.(jar|zip|war|ear|rar|tar.gz)

Tar.gz files are downloaded though there is currently no support for
unpacking them.

# Output

Once the tool has run, the following files directories will exist:

    repo/
    content/
    archives.html
    licenses.html
    notices.html
    style.css
    org.apache.openejb.openejb-core.3.0.4.openejb-core-3.0.4.jar.licenses.html
    org.apache.openejb.openejb-core.3.0.4.openejb-core-3.0.4.jar.notices.html
    org.apache.openejb.openejb-standalone.3.0.4.openejb-standalone-3.0.4.zip.licenses.html
    org.apache.openejb.openejb-standalone.3.0.4.openejb-standalone-3.0.4.zip.notices.html
    org.apache.openejb.openejb-tomcat-webapp.3.0.4.openejb-tomcat-webapp-3.0.4.war.licenses.html
    org.apache.openejb.openejb-tomcat-webapp.3.0.4.openejb-tomcat-webapp-3.0.4.war.notices.html
    ...

## repo

The repo directory will contain the full set of binaries, unmodified.
Theoretically, this tool could also download and check signatures
though it does not do that now.

## content

The content directory will contain the unpacked version of the
downloaded binaries

So this file for example:

    repo/foo.zip

Will be unpacked at the following location:

    content/foo.zip.contents/
    content/foo.zip.contents/LICENSE
    content/foo.zip.contents/NOTICE
    content/foo.zip.contents/README.txt
    content/foo.zip.contents/lib/bar.jar

Unpacking is recursive, so any binaries contained in foo.zip will
also be unpacked.

    content/foo.zip.contents/lib/bar.jar
    content/foo.zip.contents/lib/bar.jar.contents/
    content/foo.zip.contents/lib/bar.jar.contents/LICENSE
    content/foo.zip.contents/lib/bar.jar.contents/NOTICE
    content/foo.zip.contents/lib/bar.jar.contents/README.txt
    content/foo.zip.contents/lib/bar.jar.contents/org/
    content/foo.zip.contents/lib/bar.jar.contents/org/bar/
    content/foo.zip.contents/lib/bar.jar.contents/org/bar/Some.class

## Reports

The "main" report is currently called `archives.html` and will list
all of the top-level binaires, their LICENSE and NOTICE files and any
LICENSE and NOTICE files of any binaries they may contain.

Validation of the output at this point is all still manual.  One of
the first improvements would be to automatically flag any binaries
that:

  - contain no LICENSE and NOTICE files
  - contain more than one LICENSE or NOTICE file

In this report, each binary will have three links listed after its
name '(licenses, notices, contents)'

### foo.zip.licenses.html

This page will display the full text of the LICENSE files included in
the binary.  There will be two sections **Declared** and
**Undeclared**

The Declared section lists the single LICENSE file that was supplied
by the binary itself.  As the tool works recursively, it will also
collect any LICENSE file text from any binaries contained in the
foo.zip.  Well call these "sub" LICENSES for simplicity.

Some attempt is made to figure out if the text from sub LICENSE files
are contained in the declared LICENSE file.  If the sub license text
is contained in the declared LICENSE file it is not listed as
Undeclared.

The matching is not complete or perfect, but does help in more quickly
seeing where there might be a missing LICENSE text that should be
declared.

### foo.zip.notices.html

Functions identical to the previously described LICENSE page with
identical matching.

Note on the code, this all could probably be abstracted.  We probably
don't need separate License and Notice classes.

### foo.zip.contents

The unpacked contents of the foo.zip as described above.  Can be nice
to be able to browse around the zip and look for any jars that might
have LICENSE or NOTICE requirements but were overlooked.

#  Future work

Overall it would be great if this tool could perform some validation

Existence of LICENSE/NOTICE files:
  - flag binaries that contain no LICENSE or NOTICE files
  - flag binaries that contain too many LICENSE or NOTICE files

Contents of LICENSE/NOTICE files:
  - better matching of missing license/notice text
  - look false license/notice text, text that applied to "sub"
    binaries once included in a binary, but are no longer present

creadur-tentacles's People

Contributors

danielshahaf avatar dblevins avatar itstechupnorth avatar ottlinger avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.