chaoss / grimoirelab-graal

A Generic Repository AnALyzer

License: GNU General Public License v3.0

Python 99.44% Dockerfile 0.56%
agnostic analysis generic source-code

grimoirelab-graal's People

Contributors

evamillan, inishchith, jgbarah, jjmerchante, sanjana091001, sduenas, sunflowerpku, valeriocos, vchrombie, zhquan

grimoirelab-graal's Issues

[feature request] License analysis: Support repositories that have a single LICENSE file

Fortunately, legal departments nowadays are OK with having only a single top-level LICENSE file in source code repositories. We no longer need to state the license or copyright explicitly in every single file. Since this has become quite common practice, it would be nice for the license analysis to support it. In its current form, the analysis is of no use for such repositories.

Thank you for your consideration and for this great software!
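
One way the requested behaviour could look is sketched below: when no license is detected in a file, fall back to whatever was detected in a top-level license file. This is only an illustration; the file names in TOP_LEVEL_LICENSE_NAMES and both function names are assumptions and are not part of Graal.

```python
from pathlib import Path
from typing import List, Optional

# Assumed candidate names for a top-level license file (hypothetical list).
TOP_LEVEL_LICENSE_NAMES = ("LICENSE", "LICENSE.txt", "LICENSE.md", "COPYING")


def find_top_level_license(repo_root: str) -> Optional[Path]:
    """Return the first top-level license file found in repo_root, if any."""
    root = Path(repo_root)
    for name in TOP_LEVEL_LICENSE_NAMES:
        candidate = root / name
        if candidate.is_file():
            return candidate
    return None


def effective_licenses(per_file: List[str], repo_level: List[str]) -> List[str]:
    """Prefer licenses detected in the file itself; otherwise inherit
    the licenses detected in the top-level license file."""
    return per_file if per_file else repo_level
```

With this fallback, a file without any license header would report the repository-level license instead of an empty result.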

git command - fatal: destination path, repo already exists and is not an empty directory

When running the Docker 3p container, the error below appears from the second execution onward.

2020-05-23 20:59:34,981 - grimoire_elk.elk - ERROR - Error feeding raw from cocom (https://github.com/chaoss/grimoirelab-toolkit): git command - fatal: destination path '/home/grimoirelab/.graal/repositories/https://github.com/chaoss/grimoirelab-toolkit-git' already exists and is not an empty directory.
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/elk.py", line 228, in feed_backend
    ocean_backend.feed(**params)
  File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/raw/elastic.py", line 230, in feed
    self.feed_items(items)
  File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/raw/elastic.py", line 246, in feed_items
    for item in items:
  File "/usr/local/lib/python3.5/dist-packages/perceval/backend.py", line 215, in fetch
    for item in self.fetch_items(category, **kwargs):
  File "/usr/local/lib/python3.5/dist-packages/graal/graal.py", line 166, in fetch_items
    self.graalRepo = self.__create_graal_repository()
  File "/usr/local/lib/python3.5/dist-packages/graal/graal.py", line 251, in __create_graal_repository
    repo = GraalRepository.clone(self.uri, self.gitpath)
  File "/usr/local/lib/python3.5/dist-packages/perceval/backends/core/git.py", line 826, in clone
    cls._exec(cmd, env=env)
  File "/usr/local/lib/python3.5/dist-packages/perceval/backends/core/git.py", line 1331, in _exec
    raise RepositoryError(cause=cause)
perceval.errors.RepositoryError: git command - fatal: destination path '/home/grimoirelab/.graal/repositories/https://github.com/chaoss/grimoirelab-toolkit-git' already exists and is not an empty directory.
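
The error above comes from cloning into a directory that already has content. A minimal sketch of a pre-clone check is shown below; the function name clone_action and its three return values are hypothetical and only illustrate the decision, they are not how Graal or Perceval handle it.

```python
import os


def clone_action(dest: str) -> str:
    """Decide what to do with a clone destination.

    Returns "clone" if dest is missing or empty, "reuse" if it already
    looks like a git repository, and "conflict" otherwise.
    """
    if not os.path.exists(dest):
        return "clone"
    if not os.path.isdir(dest):
        return "conflict"
    if not os.listdir(dest):
        return "clone"
    # A regular clone has a .git entry; a bare/mirror clone has HEAD at top level.
    has_git_dir = os.path.exists(os.path.join(dest, ".git"))
    has_bare_head = os.path.isfile(os.path.join(dest, "HEAD"))
    if has_git_dir or has_bare_head:
        return "reuse"
    return "conflict"
```

A caller could then run the clone only for "clone", update the existing repository for "reuse", and report a clear error for "conflict".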

[graal] Checkout log an issue in case of large repositories

Issue scenario: I found this while working on the integration of Graal with ELK (inishchith/gsoc#4). Every commit checkout emits the following log line, so the log grows with the number of commits:

Git repository <repo-path> checked out!

This is undesirable for repositories with a large number of commits, because it floods the console log, e.g. [image attached below, from the first run]

Screenshot 2019-06-04 at 6 45 07 PM

@valeriocos @jgbarah I was wondering whether we could remove the logger call for the checkout (a small change, but it would lead to a cleaner log). I would like to hear your thoughts :)

Thanks!
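
As an alternative to removing the log call entirely, the message could be demoted to DEBUG so it stays available when needed. The sketch below uses only the standard logging module; the logger name and the checkout function are stand-ins, not Graal's actual code.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("graal_checkout_sketch")


def checkout(repo_path: str, commit_hash: str) -> None:
    """Stand-in for a checkout routine; only the logging call matters here."""
    # DEBUG instead of INFO: hidden at the usual INFO level,
    # still visible when the logger is set to DEBUG.
    logger.debug("Git repository %s checked out at %s", repo_path, commit_hash)
```

With the logger at INFO, a run over thousands of commits would no longer print one line per checkout.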

Evaluate the approach underlying Graal

Graal contains 6 backends that fetch different kinds of information from source code (complexity, licenses, dependencies, etc.). Each backend can fetch source code information using different tools. For instance, coqua relies on pylint and flake8 for code quality, while colic leverages nomos and scancode to get license insights. Every time a backend is enhanced to support a new tool, the corresponding code loses cohesion and the maintenance/evolution effort increases.

The goal of this ticket is to evaluate a different approach for Graal backends, where each backend relies on a single tool. In this scenario, there would be one backend each for flake8, pylint, nomos and scancode.
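
To make the "one backend per tool" idea concrete, here is a rough sketch of a registry keyed by tool name. Everything in it (ANALYZERS, register, analyze, and the placeholder analyzer functions) is hypothetical and does not reflect Graal's real class layout; a real backend would invoke the tool rather than return a stub.

```python
from typing import Callable, Dict

# Hypothetical registry: one entry per tool, instead of one backend
# that switches between several tools internally.
ANALYZERS: Dict[str, Callable[[str], dict]] = {}


def register(tool: str):
    """Decorator that registers an analyzer function under a tool name."""
    def decorator(func: Callable[[str], dict]) -> Callable[[str], dict]:
        ANALYZERS[tool] = func
        return func
    return decorator


@register("flake8")
def analyze_flake8(path: str) -> dict:
    # Placeholder result; a real backend would run flake8 here.
    return {"tool": "flake8", "path": path}


@register("pylint")
def analyze_pylint(path: str) -> dict:
    # Placeholder result; a real backend would run pylint here.
    return {"tool": "pylint", "path": path}


def analyze(tool: str, path: str) -> dict:
    """Dispatch to the single backend registered for the given tool."""
    try:
        analyzer = ANALYZERS[tool]
    except KeyError:
        raise ValueError(f"no backend registered for tool {tool!r}") from None
    return analyzer(path)
```

Adding support for a new tool then means adding one new function, without touching the existing ones.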

idea: Open Source Metric

At All Things Open, @dizquierdo and I met with Patrick and Nick from OSI. They want to have some publicity around an “Open Source Metric”. Yesterday during the CHAOSS Risk WG, we defined “Open Source Metric” to be a filter “is OSI approved” on top of License Coverage.

If OSI follows through to create some marketing around this “Open Source Metric” and they highlight the CHAOSS project — it would be great to also extend COCOLic to have this metric. Augur implemented the metric using the SPDX license list (json available) to check whether a license is OSI approved — maybe an approach we can do as well.

I imagine that we add a yes/no field "is-osi-approved-license" or something like that.

Is this a Graal issue or should it be in GrimoireELK?
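
A minimal sketch of the proposed check follows. It assumes the SPDX license list JSON exposes a "licenses" array whose entries carry "licenseId" and "isOsiApproved"; the inline sample is a tiny hand-written excerpt in that shape, and the function names are made up for illustration. Only the field name "is-osi-approved-license" comes from the proposal above.

```python
import json

# Hand-written sample in the assumed shape of the SPDX license list JSON.
SAMPLE_SPDX = json.loads("""
{"licenses": [
  {"licenseId": "MIT", "isOsiApproved": true},
  {"licenseId": "GPL-3.0-only", "isOsiApproved": true},
  {"licenseId": "CC-BY-4.0", "isOsiApproved": false}
]}
""")


def osi_approved_ids(spdx_data: dict) -> set:
    """Collect the identifiers of all licenses flagged as OSI approved."""
    return {
        lic["licenseId"]
        for lic in spdx_data.get("licenses", [])
        if lic.get("isOsiApproved")
    }


def add_osi_field(item: dict, approved: set) -> dict:
    """Return a copy of item with the yes/no field from the proposal added.

    Assumes item carries a single SPDX identifier under "license".
    """
    enriched = dict(item)
    enriched["is-osi-approved-license"] = item.get("license") in approved
    return enriched
```

Files with several licenses would need a policy (all approved, or any approved), which this sketch leaves open.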

[cocom] Impossible to checkout the worktree

This issue was reported while testing the raw (collection) phase of the ELK integration with Graal's CoCom backend.

  • Error Log
2019-08-13 08:19:57,700 Fetching commits: 'https://github.com/chaoss/grimoirelab-graal' git repository from 1970-01-01 00:00:00+00:00 to 2100-01-01 00:00:00+00:00; master branches
2019-08-13 08:19:58,862 Analysis failed at 2fb9a49363021922eb0fcc9874baabfc252a827c
2019-08-13 08:19:58,862 Error feeding ocean from cocom (https://github.com/chaoss/grimoirelab-graal): Impossible to checkout the worktree /tmp/worktrees/grimoirelab-graal-git at 2fb9a49363021922eb0fcc9874baabfc252a827c
Traceback (most recent call last):
  File "/home/slimbook/Escritorio/sources/graal/graal/graal.py", line 323, in checkout
    self._exec(cmd_checkout, cwd=self.worktreepath, env=self.gitenv)
  File "/home/slimbook/Escritorio/sources/perceval/perceval/backends/core/git.py", line 1331, in _exec
    raise RepositoryError(cause=cause)
perceval.errors.RepositoryError: git command - error: Your local changes to the following files would be overwritten by checkout:
	graal/backends/core/analyzers/cloc.py
	graal/backends/core/analyzers/lizard.py
	tests/test_cloc.py
Please commit your changes or stash them before you switch branches.
Aborting


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/slimbook/Escritorio/sources/ELK/grimoire_elk/elk.py", line 228, in feed_backend
    ocean_backend.feed(**params)
  File "/home/slimbook/Escritorio/sources/ELK/grimoire_elk/raw/elastic.py", line 228, in feed
    self.feed_items(items)
  File "/home/slimbook/Escritorio/sources/ELK/grimoire_elk/raw/elastic.py", line 244, in feed_items
    for item in items:
  File "/home/slimbook/Escritorio/sources/perceval/perceval/backend.py", line 161, in fetch
    for item in self.fetch_items(category, **kwargs):
  File "/home/slimbook/Escritorio/sources/graal/graal/graal.py", line 182, in fetch_items
    raise e
  File "/home/slimbook/Escritorio/sources/graal/graal/graal.py", line 174, in fetch_items
    self.graalRepo.checkout(commit['commit'])
  File "/home/slimbook/Escritorio/sources/graal/graal/graal.py", line 327, in checkout
    raise RepositoryError(cause=cause)
perceval.errors.RepositoryError: Impossible to checkout the worktree /tmp/worktrees/grimoirelab-graal-git at 2fb9a49363021922eb0fcc9874baabfc252a827c
2019-08-13 08:19:58,863 Done cocom 

/cc @valeriocos

[cocom] [colic] Problem with enrichment?

I am running five different projects through Grimoirelab now, including cocom and colic analysis. I'm using my self-built version of grimoirelab/full-3p v0.2.36.

Somehow only two of the projects show up on the dashboards. When investigating this, I found something strange in the enriched index. For example, I run this query (to filter out the bulk of the entries):

GET cocom_chaoss_enrich/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "match": {
            "origin": "https://github.com/philips-software/cogito"
          }
        }
      ]
    }
  }
}

This gives me hits that have "_id" and "file_path" from one project, and (in the same hit) a different "project" key and "origin". In other words, it seems information from multiple projects gets mistakenly combined into the index. Unless of course I'm misinterpreting this, which could very well be...

CoLic throwing path error

Hello, I am participating in the MSR Hackathon and have attempted to install Graal a few times to no avail. Most recently, I set up a clean Debian 11 VM to verify that it wasn't due to prior system configuration, incorrect permissions, etc. Alas, the same error is appearing:

Python 3.9.2 (default, Feb 28 2021, 17:03:44) 
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from graal.backends.core.colic import CoLic
>>> repo_uri = 'https://github.com/chaoss/grimoirelab-graal'
>>> repo_dir = 'grimoirelab-graal'
>>> cl = CoLic(uri=repo_uri, git_path=repo_dir)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.9/dist-packages/graal-0.2.9-py3.9.egg/graal/backends/core/colic.py", line 79, in __init__
    if not GraalRepository.exists(exec_path):
  File "/usr/local/lib/python3.9/dist-packages/graal-0.2.9-py3.9.egg/graal/graal.py", line 411, in exists
    return os.path.exists(dest)
  File "/usr/lib/python3.9/genericpath.py", line 19, in exists
    os.stat(path)
TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType
>>> 

I have tried with both a blank directory and having pre-created the grimoirelab-graal folder (not that this should be a blocker, but looking for anything that might be...)

Any pointers or guidance would be very much appreciated. Thanks!
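
The traceback shows os.path.exists receiving None for exec_path, which suggests the constructor was called without a path to the license tool executable. Below is a sketch of a guard that would turn this into a clearer error; validate_exec_path is a hypothetical helper, not part of Graal.

```python
import os
from typing import Optional


def validate_exec_path(exec_path: Optional[str]) -> str:
    """Fail early, with a readable message, when exec_path is unusable."""
    if exec_path is None:
        raise ValueError(
            "exec_path is required: pass the path to the license tool executable"
        )
    if not os.path.exists(exec_path):
        raise FileNotFoundError(f"executable not found at {exec_path!r}")
    return exec_path
```

If this reading of the traceback is right, passing an exec_path argument pointing at the installed tool when creating the CoLic object should avoid the TypeError.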

Graal doesn't work with latest version of Git (2.35.1)

When Perceval updates the repository to fetch the items, Git refuses to fetch into the currently checked out branch of any working tree. This change was made in the latest version of Git (2.35.1): git/git@8bc1f39

  File "/home/runner/work/grimoirelab-graal/grimoirelab-graal/src/perceval/perceval/backends/core/git.py", line 278, in __fetch_from_repo
    commits = self.__fetch_commits_from_repo(repo, from_date, to_date, branches, no_update)
  File "/home/runner/work/grimoirelab-graal/grimoirelab-graal/src/perceval/perceval/backends/core/git.py", line 308, in __fetch_commits_from_repo
    repo.update()
  File "/home/runner/work/grimoirelab-graal/grimoirelab-graal/src/perceval/perceval/backends/core/git.py", line 927, in update
    self._exec(cmd_update, cwd=self.dirpath, env=self.gitenv)
  File "/home/runner/work/grimoirelab-graal/grimoirelab-graal/src/perceval/perceval/backends/core/git.py", line 1349, in _exec
    raise RepositoryError(cause=cause)
perceval.errors.RepositoryError: git command - fatal: refusing to fetch into branch 'refs/heads/master' checked out at '/tmp/worktrees/graaltest'

It only fails when the worktree is created using the following command:

git worktree add /tmp/worktrees/graaltest master

But it doesn't fail when the worktree is created without providing a branch, because the branch name is by default the name of the worktree:

git worktree add /tmp/worktrees/graaltest
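
The two variants above can be expressed as a small command builder. This is only a sketch of the workaround described in this issue, with hypothetical function names; it is not the fix that was applied to Graal.

```python
import subprocess
from typing import List, Optional


def worktree_add_cmd(worktree_path: str, branch: Optional[str] = None) -> List[str]:
    """Build the git worktree command.

    Without a branch, git creates a new branch named after the last
    path component, so the branch being fetched is not checked out there.
    """
    cmd = ["git", "worktree", "add", worktree_path]
    if branch is not None:
        cmd.append(branch)
    return cmd


def create_worktree(repo_path: str, worktree_path: str) -> None:
    """Create the worktree without naming a branch (the non-failing variant)."""
    subprocess.run(worktree_add_cmd(worktree_path), cwd=repo_path, check=True)
```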

Graal backends structure improvement proposal

Hi @zhquan and @vchrombie!

As part of the GrimoireLab MSR hackathon, our team has been working on refactoring and maintenance contributions for the Graal project. Based on our experience using Graal and exploring the code, we identified some refactoring opportunities. We also looked at the existing issues in this repository and found Issue #89, which discusses the current backend structure/approach of Graal.

Yesterday we had a meeting with @valeriocos to discuss our proposal and got some good feedback. Valerio suggested that we contact you to discuss this further. We are very interested in getting your thoughts on this proposal and would like to get in touch.

We have worked out a POC for improving this structure (https://github.com/wmeijer221/grimoirelab-graal/tree/refactor). To give an impression, below are two figures that show the current and proposed structure.

Current structure

Screenshot from 2022-01-18 15-59-28

Proposed structure

Screenshot from 2022-01-18 15-59-38

[colic] KeyError on execution of ELK with ScanCode-CLI

  • ScanCode-CLI, when executed with ELK, showed the following error. It runs well with other repositories (e.g. Graal) when tested.

  • Error Log

Traceback (most recent call last):
  File "/Users/Nishchith/GitHub/CHAOSS/exec/scancode-toolkit/etc/scripts/scancli.py", line 73, in <module>
    for s in scan(args):
  File "/Users/Nishchith/GitHub/CHAOSS/exec/scancode-toolkit/etc/scripts/scancli.py", line 64, in scan
    results = channel.receive()
  File "/usr/local/lib/python3.6/site-packages/execnet/gateway_base.py", line 728, in receive
    raise self._getremoteerror() or EOFError()
execnet.gateway_base.RemoteError: Traceback (most recent call last):
  File "<string>", line 1063, in executetask
  File "<string>", line 1, in do_exec
  File "<remote exec>", line 53, in <module>
  File "<remote exec>", line 44, in run_scan
  File "/Users/Nishchith/GitHub/CHAOSS/exec/scancode-toolkit/src/scancode/cli.py", line 813, in run_scan
    raise ScancodeError(msg + '\n' + traceback.format_exc())
ScancodeError: ERROR: failed to collect codebase at: u'/tmp/worktrees/grimoirelab-elk-git/utils/perceval'
Traceback (most recent call last):
  File "/Users/Nishchith/GitHub/CHAOSS/exec/scancode-toolkit/src/scancode/cli.py", line 809, in run_scan
    max_in_memory=max_in_memory
  File "/Users/Nishchith/GitHub/CHAOSS/exec/scancode-toolkit/src/scancode/resource.py", line 281, in __init__
    self._populate()
  File "/Users/Nishchith/GitHub/CHAOSS/exec/scancode-toolkit/src/scancode/resource.py", line 440, in _populate
    parent = parent_by_loc.pop(top)
KeyError: u'/tmp/worktrees/grimoirelab-elk-git/utils/perceval/backends'


[2019-08-16 20:16:10,589] - Analysis failed at 41af2b5984ec2ce9b6781bfb1b3e133aedba19a6
Traceback (most recent call last):
  File "/Users/Nishchith/GitHub/CHAOSS/grimoirelab-graal/graal/backends/core/analyzers/scancode.py", line 90, in __analyze_scancode_cli
    msg = subprocess.check_output(cmd_scancli).decode("utf-8")
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['python3', '/Users/Nishchith/GitHub/CHAOSS/exec/scancode-toolkit/etc/scripts/scancli.py', '/tmp/worktrees/grimoirelab-elk-git/perceval/__init__.py', '/tmp/worktrees/grimoirelab-elk-git/perceval/backends/__init__.py', '/tmp/worktrees/grimoirelab-elk-git/perceval/backends/backend.py', '/tmp/worktrees/grimoirelab-elk-git/perceval/utils.py', '/tmp/worktrees/grimoirelab-elk-git/utils/bugzilla2el.py', '/tmp/worktrees/grimoirelab-elk-git/utils/perceval']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/Nishchith/GitHub/CHAOSS/grimoirelab-perceval/perceval/backend.py", line 472, in run
    for item in items:
  File "/Users/Nishchith/GitHub/CHAOSS/grimoirelab-perceval/perceval/backend.py", line 589, in fetch
    raise e
  File "/Users/Nishchith/GitHub/CHAOSS/grimoirelab-perceval/perceval/backend.py", line 583, in fetch
    for item in items:
  File "/Users/Nishchith/GitHub/CHAOSS/grimoirelab-perceval/perceval/backend.py", line 162, in fetch
    for item in self.fetch_items(category, **kwargs):
  File "/Users/Nishchith/GitHub/CHAOSS/grimoirelab-graal/graal/graal.py", line 182, in fetch_items
    raise e
  File "/Users/Nishchith/GitHub/CHAOSS/grimoirelab-graal/graal/graal.py", line 175, in fetch_items
    commit['analysis'] = self._analyze(commit)
  File "/Users/Nishchith/GitHub/CHAOSS/grimoirelab-graal/graal/backends/core/colic.py", line 170, in _analyze
    analysis = self.analyzer.analyze(local_paths)
  File "/Users/Nishchith/GitHub/CHAOSS/grimoirelab-graal/graal/backends/core/colic.py", line 221, in analyze
    analysis = self.analyzer.analyze(**kwargs)
  File "/Users/Nishchith/GitHub/CHAOSS/grimoirelab-graal/graal/backends/core/analyzers/scancode.py", line 127, in analyze
    result = self.__analyze_scancode_cli(kwargs['file_paths'])
  File "/Users/Nishchith/GitHub/CHAOSS/grimoirelab-graal/graal/backends/core/analyzers/scancode.py", line 93, in __analyze_scancode_cli
    e.output.decode("utf-8")))
graal.graal.GraalError: Scancode failed at ['/tmp/worktrees/grimoirelab-elk-git/perceval/__init__.py', '/tmp/worktrees/grimoirelab-elk-git/perceval/backends/__init__.py', '/tmp/worktrees/grimoirelab-elk-git/perceval/backends/backend.py', '/tmp/worktrees/grimoirelab-elk-git/perceval/utils.py', '/tmp/worktrees/grimoirelab-elk-git/utils/bugzilla2el.py', '/tmp/worktrees/grimoirelab-elk-git/utils/perceval'], [

[feature request] Analysing licenses of dependencies

TL/DR

I think it would be valuable for Graal to be able to analyse the (open source) software licenses of dependencies of my software.

Rationale

People and organisations generally want to adhere to the licenses under which software is made available to them. Nowadays this has become so difficult that tooling is needed to help. Take for example the Javascript/nodejs ecosystem. When you develop even a simple client-server application, you easily end up using hundreds and hundreds of open source packages, either directly or through transitive dependencies. It's virtually impossible to collect and verify all this manually for every release of every product.

Feature Request

I envision a component of Graal that can create a "bill of materials" of my software: a table containing all dependencies of my software, both direct and indirect. The table should have columns for:

  • Name of the dependency
  • Version of the dependency
  • Where the dependency can be found (could be a link to a package manager repository such as maven / npm / etc; or a link to the source code archive)
  • Software license(s) of the dependency; normalised somehow to easily search and filter

Furthermore, a view where the dependency tree is visualised seems useful, but that is probably more of a V2 feature / nice-to-have.
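
The table described above could be modelled roughly as follows. The BomEntry class, its field names and the filter helper are hypothetical and only mirror the four requested columns; the example values in the usage note are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BomEntry:
    """One row of the bill of materials (hypothetical structure)."""
    name: str
    version: str
    location: str  # package repository URL or link to a source archive
    # A package can carry several licenses, e.g. when it is dual licensed.
    licenses: List[str] = field(default_factory=list)


def filter_by_license(entries: List[BomEntry], license_id: str) -> List[BomEntry]:
    """Return the entries that list the given normalised license identifier."""
    return [entry for entry in entries if license_id in entry.licenses]
```

For example, filter_by_license(bom, "GPL-2.0-only") would list every dependency that offers that license, provided the identifiers have been normalised beforehand.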

Notes

  • Please note that a package can have multiple licenses, e.g. dual licensed BSD and GPLv2, meaning that you (as a user of the package) can choose which you want to apply (example)
  • A colleague mentioned that https://spdx.org has a good chance of becoming the standard.
  • Please note that this probably depends on #80 to be a useful feature.

Thanks for your consideration and this great software!

[discussion] Dependency Analyzer for Java Projects

JDeps is a class/package dependency analyzer for Java. It is present in Oracle Java/OpenJDK's bin folder. This is a relevant tutorial.

Here are a few issues I am facing while trying to integrate JDeps as an analyzer:

  • Given a JAR file, the output I am getting is extremely large. Here's an example
  • How would I use the git backend to get the relevant JAR files for analysis from github?

Sorry for any mistakes, I am not experienced with Java.

Require DCO sign-off for new commits

This issue is to activate probot/dco (or a similar bot) to check that all commits in this repository have a sign-off.

The CHAOSS Project Charter section 8.2.1 requires that all contributions are signed-off. The CHAOSS project has been piloting the use of DCO sign-offs. Once contributors know how to do it, sign-offs are easy to do with little overhead.

For users of the git command line interface, a sign-off is accomplished with the -s flag as part of the commit command: git commit -s -m 'This is a commit message'

For users of the GitHub interface, a sign-off is accomplished by writing Signed-off-by: Your Name <your.email@example.com> into the commit comment field. This can be automated by using a browser plugin like scottrigby/dco-gh-ui
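
For illustration, a sign-off line can be detected in a commit message with a simple pattern. This is a sketch of the idea only, not how the DCO bot works; the regular expression and the function name are assumptions.

```python
import re

# Assumed shape of the trailer: "Signed-off-by: Name <address>" on its own line.
SIGN_OFF_RE = re.compile(
    r"^Signed-off-by: .+ <[^<>\s]+@[^<>\s]+>$", re.MULTILINE
)


def has_sign_off(message: str) -> bool:
    """Return True if the commit message contains a sign-off line."""
    return SIGN_OFF_RE.search(message) is not None
```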

To-Do for repo maintainers: Please inform your contributors about DCO sign-offs and comment on this issue when you are ready for the DCO bot to be activated on this repository.

[scancode] Use scancode from python3 interface

Recently scancode has been ported to Python 3, thus the corresponding analyzer could be updated to call scancode using a python interface instead of a system call.

After talking with @pombredanne, there are some minor fixes to be done on scancode which should be included in the upcoming release.

[analyzer] Fix results for deleted files

A file deleted in a commit is handled incorrectly (REF), which leads to:

  • Duplicate files (when the file path is changed)
  • Loss of information related to a file

The issue affects both the CoCom and CoLic backends, since both perform their analysis at file level.
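
One possible shape of a fix is sketched below: skip deleted files and follow renames before running the file-level analysis. The commit layout is an assumption (a "files" list whose entries have "file", "action" and, for renames, "newfile"), and files_to_analyze is a hypothetical helper, not the fix that was actually proposed.

```python
def files_to_analyze(commit: dict) -> list:
    """Return the paths that still exist after the commit.

    Assumed layout: commit["files"] is a list of dicts with "file",
    "action" ("D" marks a deletion) and an optional "newfile" for renames.
    """
    paths = []
    for entry in commit.get("files", []):
        if entry.get("action") == "D":
            # Deleted in this commit: nothing left on disk to analyze.
            continue
        # For a rename, analyze the new path rather than the old one.
        paths.append(entry.get("newfile", entry["file"]))
    return paths
```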

This issue came up in the discussion of inishchith/gsoc#6.

@valeriocos I have worked on a fix for the CoCom backend. If you agree, I can open a PR so we can review the changes and then move on to resolving the same issue for CoLic.

Thanks :)

CI is broken because of old python dependency

The CI builds are failing:

Collecting bandit>=1.4.0 (from -r requirements.txt (line 6))
  Downloading https://files.pythonhosted.org/packages/1b/b7/be70ee3cc87607ffc474d95ca0ce4d06c2fcad8163cc6b47a99470f09826/bandit-1.7.2-py3-none-any.whl (113kB)
bandit requires Python '>=3.7' but the running Python is 3.6.15

Support for Python 3.6 officially ended on 23 Dec 2021. I think it makes sense to remove 3.6 and add other versions if needed.

issue in fetching data from graal command

Hi Team,

I installed Graal with pip in a virtualenv and tried to fetch data from a Python script, but I got an error.
So I tried to fetch data from the command line, and I am getting the following error.
Please suggest what I might be missing, or whether this is a bug.

(graal) freda@freda-HP-Laptop-15-bs0xx:~$ graal cocom https://github.com/nodejs/help --git-path /home/freda/nodejs
[2019-07-14 12:14:50,084] - Starting the quest for the Graal.
[2019-07-14 12:14:52,503] - Git worktree /tmp/worktrees/nodejs created!
[2019-07-14 12:14:52,503] - Fetching commits: 'https://github.com/nodejs/help' git repository from 1970-01-01 00:00:00+00:00 to 2100-01-01 00:00:00+00:00; all branches
[2019-07-14 12:14:53,236] - Git repository /home/freda/nodejs checked out!
[2019-07-14 12:14:53,240] - Analysis failed at 9161c1807154ca3dc19dd0ac15106a108cce8cbe
Traceback (most recent call last):
  File "/home/freda/venvs/graal/lib/python3.6/site-packages/perceval/backend.py", line 472, in run
    for item in items:
  File "/home/freda/venvs/graal/lib/python3.6/site-packages/perceval/backend.py", line 589, in fetch
    raise e
  File "/home/freda/venvs/graal/lib/python3.6/site-packages/perceval/backend.py", line 583, in fetch
    for item in items:
  File "/home/freda/venvs/graal/lib/python3.6/site-packages/perceval/backend.py", line 162, in fetch
    for item in self.fetch_items(category, **kwargs):
  File "/home/freda/venvs/graal/lib/python3.6/site-packages/graal/graal.py", line 183, in fetch_items
    raise e
  File "/home/freda/venvs/graal/lib/python3.6/site-packages/graal/graal.py", line 176, in fetch_items
    commit['analysis'] = self._analyze(commit)
  File "/home/freda/venvs/graal/lib/python3.6/site-packages/graal/backends/core/cocom.py", line 142, in _analyze
    file_info = self.file_analyzer.analyze(local_path)
  File "/home/freda/venvs/graal/lib/python3.6/site-packages/graal/backends/core/cocom.py", line 191, in analyze
    cloc_analysis = self.cloc.analyze(**kwargs)
  File "/home/freda/venvs/graal/lib/python3.6/site-packages/graal/backends/core/analyzers/cloc.py", line 119, in analyze
    message = subprocess.check_output(['cloc', file_path]).decode("utf-8")
  File "/usr/lib/python3.6/subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.6/subprocess.py", line 403, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.6/subprocess.py", line 709, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.6/subprocess.py", line 1344, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'cloc': 'cloc'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/freda/venvs/graal/bin/graal", line 129, in <module>
    main()
  File "/home/freda/venvs/graal/bin/graal", line 75, in main
    cmd.run()
  File "/home/freda/venvs/graal/lib/python3.6/site-packages/perceval/backend.py", line 480, in run
    raise RuntimeError(str(e))
RuntimeError: [Errno 2] No such file or directory: 'cloc': 'cloc'
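
The FileNotFoundError above means the cloc executable was not found on the PATH. A small pre-flight check with the standard library's shutil.which can report this before the analysis starts; missing_tools is a hypothetical helper, not part of Graal.

```python
import shutil
from typing import Iterable, List


def missing_tools(tools: Iterable[str]) -> List[str]:
    """Return the names of external tools that are not found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

For instance, missing_tools(["cloc"]) returning ["cloc"] would indicate that cloc still needs to be installed with the system package manager before running the CoCom backend.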

Travis CI fails due to error when creating nomossa executable

10.80s$ git clone https://github.com/fossology/fossology
before_script.5
0.00s$ cd fossology/src/nomos/agent/
before_script.6
2.80s$ sudo apt-get install libjson-c-dev
41.52s$ make -f Makefile.sa FO_LDFLAGS="-lglib-2.0 -lpq -lglib-2.0 -ljson-c -lpthread -lrt"
gcc -g -O2 -Wall -D_FILE_OFFSET_BITS=64   -c -o encode.o encode.c
gcc -g -O2 -Wall -D_FILE_OFFSET_BITS=64 -o encode encode.c
NOTE: GENSEARCHDATA takes 1-2 minutes to run
./GENSEARCHDATA
gcc -c nomos.c -DSTANDALONE -Wall -D_FILE_OFFSET_BITS=64 -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/json-c -DVERSION_S=\"3.8.0-19-g012cd4da2\" -DCOMMIT_HASH_S=\"`git show > /dev/null 2>&1 && git show | head -1 | awk '{print substr($2,1,6)}' || echo "unknown"`\"
nomos.c: In function ‘arsNomos’:
nomos.c:93:14: warning: implicit declaration of function ‘checkDuplicateReq’ [-Wimplicit-function-declaration]
     result = checkDuplicateReq(gl.pgConn, upload_pk, gl.agentPk);
              ^
nomos.c:93:12: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
     result = checkDuplicateReq(gl.pgConn, upload_pk, gl.agentPk);
            ^
nomos.c:107:14: warning: implicit declaration of function ‘getSelectedPFiles’ [-Wimplicit-function-declaration]
     result = getSelectedPFiles(gl.pgConn, upload_pk, gl.agentPk, ignoreFilesWit
              ^
nomos.c:107:12: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
     result = getSelectedPFiles(gl.pgConn, upload_pk, gl.agentPk, ignoreFilesWit
            ^
gcc -c standalone.c -DSTANDALONE -Wall -D_FILE_OFFSET_BITS=64 -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/json-c
gcc -c licenses.c -DSTANDALONE -Wall -D_FILE_OFFSET_BITS=64 -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/json-c
gcc -c list.c -DSTANDALONE -Wall -D_FILE_OFFSET_BITS=64 -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/json-c
gcc -c parse.c -DSTANDALONE -Wall -D_FILE_OFFSET_BITS=64 -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/json-c
gcc -c process.c -DSTANDALONE -Wall -D_FILE_OFFSET_BITS=64 -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/json-c
gcc -c nomos_regex.c -DSTANDALONE -Wall -D_FILE_OFFSET_BITS=64 -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/json-c
gcc -c util.c -DSTANDALONE -Wall -D_FILE_OFFSET_BITS=64 -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/json-c
gcc -c nomos_gap.c -DSTANDALONE -Wall -D_FILE_OFFSET_BITS=64 -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/json-c
gcc -c nomos_utils.c -DSTANDALONE -Wall -D_FILE_OFFSET_BITS=64 -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/json-c
gcc -c doctorBuffer_utils.c -DSTANDALONE -Wall -D_FILE_OFFSET_BITS=64 -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/json-c
gcc -c json_writer.c -DSTANDALONE -Wall -D_FILE_OFFSET_BITS=64 -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/json-c
./PRECHECK
*** START: 1306 license-text strings ***
... "source" in 116 strings
... "free" in 150 strings
... "under" in 248 strings
... "copyright" in 143 strings
... "grant" in 206 strings
... "software" in 255 strings
... "distribut" in 444 strings
... "licen" in 566 strings
... "[iu][nst]" in 1133 strings
*** DONE: 56 strings have no pre-check exclusions ***
./CHECKSTR
CHECKSTR: 1306 phrases, 449 contain wild-cards
gcc -c _precheck.c -DSTANDALONE -Wall -D_FILE_OFFSET_BITS=64 -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/json-c
gcc -c _autodata.c -DSTANDALONE -Wall -D_FILE_OFFSET_BITS=64 -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/json-c
gcc nomos.o standalone.o licenses.o list.o parse.o process.o nomos_regex.o util.o nomos_gap.o nomos_utils.o doctorBuffer_utils.o json_writer.o  _precheck.o _autodata.o -DSTANDALONE -Wall -D_FILE_OFFSET_BITS=64 -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/json-c -lglib-2.0 -lpq -lglib-2.0 -ljson-c -lpthread -lrt -o nomossa
nomos.o: In function `arsNomos':
nomos.c:(.text+0x10c): undefined reference to `checkDuplicateReq'
nomos.c:(.text+0x228): undefined reference to `getSelectedPFiles'
collect2: error: ld returned 1 exit status
Makefile.sa:31: recipe for target 'nomossa' failed
make: *** [nomossa] Error 1
The command "make -f Makefile.sa FO_LDFLAGS="-lglib-2.0 -lpq -lglib-2.0 -ljson-c -lpthread -lrt"" failed and exited with 2 during .

[analyzers] Evaluate the integration of Sloc Cloc and Code (scc)

Sloc Cloc and Code (scc) is a tool similar to cloc, sloccount and tokei. It counts lines of code, blank lines and comment lines for many programming languages. Its goal is to be the fastest code counter possible, while also performing COCOMO calculations like sloccount and estimating code complexity in a way similar to cyclomatic complexity calculators. (cc @jgbarah)
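
For context on the COCOMO figures mentioned above, here is a minimal sketch of the basic COCOMO "organic" estimate that sloccount reports. The function name is hypothetical, and whether scc uses exactly these coefficients by default is an assumption that should be checked against its documentation.

```python
def basic_cocomo_organic(sloc):
    """Estimate effort and schedule with basic COCOMO, organic mode.

    Returns (effort in person-months, schedule in months).
    """
    kloc = sloc / 1000.0
    # Basic COCOMO, organic mode: effort = 2.4 * KLOC^1.05
    effort = 2.4 * kloc ** 1.05
    # Schedule is derived from effort: 2.5 * effort^0.38
    schedule = 2.5 * effort ** 0.38
    return effort, schedule


if __name__ == "__main__":
    effort, schedule = basic_cocomo_organic(25000)
    print("effort (person-months): %.1f" % effort)
    print("schedule (months): %.1f" % schedule)
```

An analyzer built on scc would presumably read these numbers from the tool's own output rather than recompute them; the sketch only shows what the estimate represents.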

[colic] Slow execution of ScanCode-CLI

  • ScanCode-CLI is reported to be faster, but in the tests below it loses most of its advantage when the number of files is large.

    Repository                Commits  Files  Analyzer      Time (hh:mm:ss)
    chaoss/grimoirelab-graal  192      54     scancode_cli  00:46:31
    chaoss/grimoirelab-graal  192      54     scancode      02:14:11

  • Evaluation results after integration with ELK:

    Repository   Commits  Files  Analyzer      Time (hh:mm:ss)
    xiph/vorbis  1520     1425   scancode_cli  08:13:39
    xiph/vorbis  1520     1425   scancode      08:50:21

idea: Open Source Metric

At All Things Open, @dizquierdo and I met with Patrick and Nick from OSI. They want to have some publicity around an “Open Source Metric”. Yesterday during the CHAOSS Risk WG, we defined “Open Source Metric” to be a filter “is OSI approved” on top of License Coverage.

If OSI follows through to create some marketing around this “Open Source Metric” and they highlight the CHAOSS project — it would be great to also extend COCOLic to have this metric. Augur implemented the metric using the SPDX license list (json available) to check whether a license is OSI approved — maybe an approach we can do as well.

I imagine that we add a yes/no field "is-osi-approved-license" or something like that.

Is this a Graal issue or should it be in GrimoireELK?
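
A minimal sketch of the SPDX-based check described above, assuming the SPDX license list JSON layout (a top-level "licenses" array whose entries carry "licenseId" and "isOsiApproved"). The embedded sample is a trimmed stand-in for the real list, and the function names and the "is_osi_approved_license" field name are hypothetical.

```python
import json

# Trimmed stand-in for the SPDX license list JSON; a real run would load the full file.
SAMPLE_SPDX_LIST = json.loads("""
{
  "licenses": [
    {"licenseId": "GPL-3.0-or-later", "isOsiApproved": true},
    {"licenseId": "MIT", "isOsiApproved": true},
    {"licenseId": "CC-BY-4.0", "isOsiApproved": false}
  ]
}
""")


def osi_approved_ids(license_list):
    """Collect the SPDX identifiers flagged as OSI approved."""
    return {
        entry["licenseId"]
        for entry in license_list.get("licenses", [])
        if entry.get("isOsiApproved")
    }


def add_osi_flag(item, approved):
    """Return a copy of an enriched item with a yes/no OSI field added.

    `item` is assumed to hold a list of SPDX identifiers under "licenses".
    The flag is True only if at least one license is found and all are approved.
    """
    licenses = item.get("licenses", [])
    flagged = dict(item)
    flagged["is_osi_approved_license"] = bool(licenses) and all(
        lic in approved for lic in licenses
    )
    return flagged


if __name__ == "__main__":
    approved = osi_approved_ids(SAMPLE_SPDX_LIST)
    print(add_osi_flag({"file": "setup.py", "licenses": ["GPL-3.0-or-later"]}, approved))
```

Treating files with no detected license as "not approved" is a design choice made for the sketch; the metric definition may call for a separate "unknown" value.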

[cocom] Analysis fail : invalid literal for int() with base 10

I get this error when analysing some of our repositories.

Example traceback:

2019-08-17 08:43:51,370 Analysis failed at 5c7cf2756aed68171f7350bb8d80a0fed37162e9
2019-08-17 08:43:51,370 Error feeding ocean from cocom (https://github.com/kiwix/web): invalid literal for int() with base 10: 'File'
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/elk.py", line 228, in feed_backend
    ocean_backend.feed(**params)
  File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/raw/elastic.py", line 228, in feed
    self.feed_items(items)
  File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/raw/elastic.py", line 244, in feed_items
    for item in items:
  File "/usr/local/lib/python3.5/dist-packages/perceval/backend.py", line 161, in fetch
    for item in self.fetch_items(category, **kwargs):
  File "/usr/local/lib/python3.5/dist-packages/graal/graal.py", line 183, in fetch_items
    raise e
  File "/usr/local/lib/python3.5/dist-packages/graal/graal.py", line 176, in fetch_items
    commit['analysis'] = self._analyze(commit)
  File "/usr/local/lib/python3.5/dist-packages/graal/backends/core/cocom.py", line 189, in _analyze
    file_info = self.analyzer.analyze(local_path)
  File "/usr/local/lib/python3.5/dist-packages/graal/backends/core/cocom.py", line 242, in analyze
    cloc_analysis = self.cloc.analyze(**kwargs)
  File "/usr/local/lib/python3.5/dist-packages/graal/backends/core/analyzers/cloc.py", line 128, in analyze
    results = self.__analyze_file(message)
  File "/usr/local/lib/python3.5/dist-packages/graal/backends/core/analyzers/cloc.py", line 58, in __analyze_file
    blank_lines = int(info_file[2])
ValueError: invalid literal for int() with base 10: 'File'

Or:

2019-08-17 10:43:35,881 Analysis failed at c6ee59a8e1acf65b7931534c9613bf6c5c23909c
2019-08-17 10:43:35,881 Error feeding ocean from cocom (https://github.com/kiwix/kiwix-android): invalid literal for int() with base 10: 'Shell'
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/elk.py", line 228, in feed_backend
    ocean_backend.feed(**params)
  File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/raw/elastic.py", line 228, in feed
    self.feed_items(items)
  File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/raw/elastic.py", line 244, in feed_items
    for item in items:
  File "/usr/local/lib/python3.5/dist-packages/perceval/backend.py", line 161, in fetch
    for item in self.fetch_items(category, **kwargs):
  File "/usr/local/lib/python3.5/dist-packages/graal/graal.py", line 183, in fetch_items
    raise e
  File "/usr/local/lib/python3.5/dist-packages/graal/graal.py", line 176, in fetch_items
    commit['analysis'] = self._analyze(commit)
  File "/usr/local/lib/python3.5/dist-packages/graal/backends/core/cocom.py", line 189, in _analyze
    file_info = self.analyzer.analyze(local_path)
  File "/usr/local/lib/python3.5/dist-packages/graal/backends/core/cocom.py", line 242, in analyze
    cloc_analysis = self.cloc.analyze(**kwargs)
  File "/usr/local/lib/python3.5/dist-packages/graal/backends/core/analyzers/cloc.py", line 128, in analyze
    results = self.__analyze_file(message)
  File "/usr/local/lib/python3.5/dist-packages/graal/backends/core/analyzers/cloc.py", line 58, in __analyze_file
    blank_lines = int(info_file[2])
ValueError: invalid literal for int() with base 10: 'Shell'

Tested with the Graal integration: chaoss/grimoirelab-elk#672
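
One plausible reading of the two tracebacks (an assumption, not confirmed here) is that the cloc result row is split on whitespace and indexed from the left, so a language name of three or more words, such as "Bourne Again Shell", shifts the numeric columns. A minimal sketch of a parser that splits from the right instead; the function name and the five-column row layout (language, files, blank, comment, code) are assumptions about cloc's summary table, not Graal's actual code.

```python
def parse_cloc_row(line):
    """Parse one row of a cloc summary table.

    Returns a dict with the language name and counters, or None when the
    line is a header, a separator or otherwise not a data row.
    """
    # Split from the right so that spaces inside the language name are kept.
    parts = line.strip().rsplit(None, 4)
    if len(parts) != 5:
        return None
    language, files, blank, comment, code = parts
    if not all(field.isdigit() for field in (files, blank, comment, code)):
        return None
    return {
        "language": language,
        "files": int(files),
        "blanks": int(blank),
        "comments": int(comment),
        "loc": int(code),
    }


if __name__ == "__main__":
    print(parse_cloc_row("Bourne Again Shell      1     10      5     50"))
    print(parse_cloc_row("Language            files  blank  comment   code"))
```

Returning None for unrecognised lines lets the caller skip them rather than fail the whole commit analysis.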

[cocom] Request to Elasticsearch fails on grimoire_creation_date

I'm running the cocom analysis, using grimoirelab/full-3p that I built myself from master (0.2.36 / c612e657fdd99e00c9389d0021612784eb1ec122), on a project from GitHub. When I open the cocom_study_project_wise_evolution_ccn_functions view, I get an error:

Error: Request to Elasticsearch failed: {"error":{"root_cause":[{"type":"query_shard_exception","reason":"No mapping found for [grimoire_creation_date] in order to sort on","index_uuid":"0qVhhLiPSN2JTIAQ72whWA","index":"cocom_chaoss_enrich"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"cocom_chaoss_enrich","node":"0sQ_NbQIRI-O8pt5Om2VIA","reason":{"type":"query_shard_exception","reason":"No mapping found for [grimoire_creation_date] in order to sort on","index_uuid":"0qVhhLiPSN2JTIAQ72whWA","index":"cocom_chaoss_enrich"}}]},"status":400}
KbnError@http://localhost:5601/bundles/commons.bundle.js?v=16428:56:25656
RequestFailure@http://localhost:5601/bundles/commons.bundle.js?v=16428:56:26448
http://localhost:5601/bundles/kibana.bundle.js?v=16428:61:655962
http://localhost:5601/bundles/commons.bundle.js?v=16428:56:19136
map@[native code]
http://localhost:5601/bundles/commons.bundle.js?v=16428:56:18480
processQueue@http://localhost:5601/bundles/commons.bundle.js?v=16428:35:132458
http://localhost:5601/bundles/commons.bundle.js?v=16428:35:133361
$digest@http://localhost:5601/bundles/commons.bundle.js?v=16428:35:144241
$apply@http://localhost:5601/bundles/commons.bundle.js?v=16428:35:147025
done@http://localhost:5601/bundles/commons.bundle.js?v=16428:35:100032
completeRequest@http://localhost:5601/bundles/commons.bundle.js?v=16428:35:104705
http://localhost:5601/bundles/commons.bundle.js?v=16428:35:105450

This is my projects json:

{
  "Fresco": {
    "git": ["https://github.com/philips-software/fresco-logistic-regression-2"],
    "github": ["https://github.com/philips-software/fresco-logistic-regression-2"],
    "colic": ["https://github.com/philips-software/fresco-logistic-regression-2"],
    "cocom": ["https://github.com/philips-software/fresco-logistic-regression-2"]
  }
}

This is the relevant part of my config:

[cocom]
raw_index = cocom_chaoss
enriched_index = cocom_chaoss_enrich
category = code_complexity_lizard_file
studies = [enrich_cocom_analysis]
branches = master
git-path = /tmp/git-cocom
worktree-path = /tmp/cocom/

[enrich_cocom_analysis]
out_index = cocom_chaoss_study
interval_months = [1]

[colic]
raw_index = colic_chaoss
enriched_index = colic_chaoss_enrich
category = code_license_nomos
studies = [enrich_colic_analysis]
exec-path = /usr/share/fossology/nomos/agent/nomossa
branches = master
git-path = /tmp/git-colic
worktree-path = /tmp/colic

[enrich_colic_analysis]
out_index = colic_chaoss_study
interval_months = [6]

Is this a bug? Am I doing something wrong? Thanks for your assistance!!
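
The error message says that the enriched index has no mapping for grimoire_creation_date. One way to narrow this down is to fetch the index mapping and check whether the field is present. A minimal sketch operating on an already-fetched mapping response; the function name and the sample responses are hypothetical, and the two layouts handled are the typeless mapping of Elasticsearch 7 and the typed mapping of earlier versions.

```python
def field_is_mapped(mapping_response, index, field):
    """Tell whether `field` is a top-level property in the mapping of `index`."""
    mappings = mapping_response.get(index, {}).get("mappings", {})
    # Typeless layout: properties sit directly under "mappings".
    if "properties" in mappings:
        return field in mappings["properties"]
    # Typed layout: one extra level keyed by document type.
    return any(
        field in doc_type.get("properties", {})
        for doc_type in mappings.values()
        if isinstance(doc_type, dict)
    )


if __name__ == "__main__":
    # Hypothetical response for an index that was created but holds no enriched items.
    empty_index = {"cocom_chaoss_enrich": {"mappings": {}}}
    print(field_is_mapped(empty_index, "cocom_chaoss_enrich", "grimoire_creation_date"))
```

If the field is missing, the enriched index may simply be empty, which would point at the enrichment step rather than at the dashboard.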

Graal does not honor environment variable

I am using a different version of git than the one installed on the system by default. The newer version is installed in ~/bin, and ~/bin is placed first in PATH. The shell picks it up correctly:

$ which git
~/bin/git

But Graal does not:

$ PATH=~/bin:${PATH} graal -g cocom https://github.com/chaoss/grimoirelab-perceval --git-path /tmp/graal-cocom
[2021-12-02 11:58:14,207 - root - INFO] - Starting the quest for the Graal.
[2021-12-02 11:58:14,211 - perceval.backends.core.git - DEBUG] - Running command /usr/bin/git worktree add /tmp/worktrees/graal-cocom (cwd: /tmp/graal-cocom, env: {'LANG': 'C', 'PAGER': '', 'HTTP_PROXY': 'http://<proxy>:911/', 'HTTPS_PROXY': 'http://<proxy>:912/', 'NO_PROXY': '', 'HOME': '/home/nhasabni'})
[2021-12-02 11:58:14,224 - perceval.backend - ERROR] - Error!: git command - git: 'worktree' is not a git command. See 'git --help'.

[2021-12-02 11:58:14,224 - root - INFO] - Quest completed.

The worktree command is available in the git installed under ~/bin, but not in /usr/bin/git:

$ git worktree list
/tmp/worktrees/graal-cocom                          2e6a58b [graal-cocom] prunable
$ /usr/bin/git worktree list
git: 'worktree' is not a git command. See 'git --help'.
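
In the debug line above, the command is /usr/bin/git and the env passed to it contains no PATH entry. A minimal sketch of how a tool could be resolved from PATH and how PATH could be forwarded to the child process; the helper names are hypothetical and this is not Perceval's or Graal's actual code.

```python
import os
import shutil
import subprocess


def resolve_tool(name, search_path=None):
    """Return the first executable called `name` on the search path, or None."""
    if search_path is None:
        search_path = os.environ.get("PATH")
    return shutil.which(name, path=search_path)


def build_child_env(source=None):
    """Build a reduced environment for the child that still forwards PATH."""
    source = os.environ if source is None else source
    env = {"LANG": "C", "PAGER": ""}
    for key in ("PATH", "HOME", "HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY"):
        if key in source:
            env[key] = source[key]
    return env


if __name__ == "__main__":
    git = resolve_tool("git")
    if git is None:
        print("git not found on PATH")
    else:
        result = subprocess.run([git, "--version"], env=build_child_env(),
                                stdout=subprocess.PIPE, universal_newlines=True)
        print(git, result.stdout.strip())
```

With this approach, running with PATH=~/bin:${PATH} would select ~/bin/git, since the lookup follows the same order as the shell.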

[cocom] Redundant log on every file-open operation

  • A warning is logged very frequently during the execution of ELK and Graal via the new connector.
  • It is triggered by file-open operations in the lizard library.

(Not sure why it doesn't show up while executing the lizard-analyzer via the command line interface)

LOG

/usr/local/lib/python3.6/site-packages/lizard_ext/auto_open.py:26: DeprecationWarning: 'U' mode is deprecated
  return open(*args, **kwargs)

(ref. lizard_ext:line)

  • To reproduce this issue, execute the CoCom backend task via micro-mordred with the corresponding connector under ELK.

/cc @valeriocos
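
Until the cause is addressed upstream, one possible workaround is to filter this specific warning. A minimal sketch using the standard warnings module; the function name is hypothetical, and matching on the message text rather than on the lizard_ext module is a choice made to keep the example self-contained.

```python
import warnings


def silence_u_mode_warning():
    """Ignore only the "'U' mode is deprecated" DeprecationWarning."""
    warnings.filterwarnings(
        "ignore",
        message=r"'U' mode is deprecated",
        category=DeprecationWarning,
    )


if __name__ == "__main__":
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        silence_u_mode_warning()
        warnings.warn("'U' mode is deprecated", DeprecationWarning)
        warnings.warn("some other deprecation", DeprecationWarning)
    # Only the unrelated warning is left in the recorded list.
    print([str(w.message) for w in caught])
```

Other DeprecationWarnings remain visible, so the filter does not hide unrelated problems.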
