
biggroum's People

Contributors

bechang, effervescentfibration, ftc, lesleylai, smover, sriram0339

biggroum's Issues

Change TAFPI API - rename bugType to type.

From Tom: "I believe I fixed all the issues with the run/finalize commands. There is one client-facing change and that's the JSON field of bugType should be called type. From our standpoint that keeps the fields consistent from v1 tools and v3 tools."
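
A minimal sketch of the client-side change, assuming each reported item is a JSON object. Only the bugType and type field names come from this issue; the helper name rename_bug_type and any other fields are hypothetical.

```python
import json


def rename_bug_type(record: dict) -> dict:
    """Return a copy of `record` with the 'bugType' key renamed to 'type'."""
    renamed = dict(record)
    if "bugType" in renamed:
        renamed["type"] = renamed.pop("bugType")
    return renamed


# Hypothetical payload: only the bugType/type field names come from the issue.
old_record = {"bugType": "Anomaly", "message": "example message"}
print(json.dumps(rename_bug_type(old_record), sort_keys=True))
```

Records that already use type pass through unchanged, so the helper can be applied to output from both old and new tool versions.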

Import data into Solr

Import groum, cluster, and pattern documents into Solr.

  • Import groums
  • Import patterns
  • Import clusters

Musedev: add endpoint to the source code service

The source code service downloads a repository and then applies a patch.
When we integrate with musedev, we instead receive the source code files as input (i.e., we do not have to download the git repository again).

There are two main tasks:

  • Add an endpoint in the source code service that takes as input the "diffs" for a single source code file and the source code to patch. This activity is a refactoring of the existing service: all the functionality is already there.
  • Change the search service to call the new endpoint, passing the right files (this task will require some refactoring to pass the file content).

Implement lattice-based search

Implement the new lattice-based search for patterns.

The search will return a set of suggestions that a developer can use to correct or complete their code.
The search takes as input a groum and returns a list of patterns, each with its relationship to the input groum (e.g., isomorphic, a subgraph, or a supergraph) and its "distance" from the input groum (e.g., how many changes are needed to turn the groum into the pattern).

The tasks are:

  • on GraphIso - serialize the lattice
  • on GraphIso - serialize the frequent itemset index
  • on GraphIso - implement the new search algorithm
  • in the biggroum fixriso python scripts - adapt the script to the new interface
  • in the FixrGraphPatternSearch - change the search interface
  • in the fixr_groum_search_frontend - change the web interface to call the new service and display the new data
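
To make the shape of the search result concrete, here is a toy sketch. It is not the lattice algorithm and not the GraphIso interface: groums are reduced to sets of labelled edges, set comparison stands in for graph isomorphism, and the names Relation, SearchResult, compare, and search are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, FrozenSet, List, Tuple

Edge = Tuple[str, str]


class Relation(Enum):
    ISOMORPHIC = "isomorphic"
    SUBGRAPH = "subgraph"          # the input groum is contained in the pattern
    SUPERGRAPH = "supergraph"      # the input groum contains the pattern
    INCOMPARABLE = "incomparable"


@dataclass(frozen=True)
class SearchResult:
    pattern_id: str
    relation: Relation
    distance: int                  # number of edges to add or remove


def compare(groum: FrozenSet[Edge], pattern: FrozenSet[Edge]) -> Tuple[Relation, int]:
    """Classify the groum against one pattern (assumption: edge sets, not real graphs)."""
    distance = len(groum ^ pattern)  # size of the symmetric difference
    if groum == pattern:
        return Relation.ISOMORPHIC, distance
    if groum < pattern:
        return Relation.SUBGRAPH, distance
    if groum > pattern:
        return Relation.SUPERGRAPH, distance
    return Relation.INCOMPARABLE, distance


def search(groum: FrozenSet[Edge], patterns: Dict[str, FrozenSet[Edge]]) -> List[SearchResult]:
    """Return one result per pattern, closest patterns first."""
    results = [SearchResult(pid, *compare(groum, p)) for pid, p in patterns.items()]
    return sorted(results, key=lambda r: (r.distance, r.pattern_id))
```

Sorting by distance puts the patterns that require the fewest changes first, which is one plausible order for presenting suggestions to a developer.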

Send Deployment to Musedev

Tasks to complete before Monday meeting:

  • sergio: find container to run service and push to dockerhub
  • shawn: add sergio to dockerhub org
  • shawn: test that our script works in their docker container
  • have demo to walk through

Save error log in the TAFPI api implementation

The execution of the biggroum MuseDev script does not save the error output.

We can save the error log on the image to help us debug any issues.
We should also save the logging output (I don't remember if it is redirected to stderr by default now).
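
One way to keep the error output is sketched below, assuming the script is launched as a subprocess. The helper name run_and_log and the log location are hypothetical. If the script configures Python's logging with a default StreamHandler, that output goes to stderr and would be captured as well.

```python
import subprocess
from pathlib import Path


def run_and_log(cmd, log_path):
    """Run `cmd`, append everything it writes to stderr to `log_path`,
    and return the exit code."""
    log_path = Path(log_path)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    # Append mode keeps the logs of earlier runs on the image.
    with log_path.open("ab") as log:
        completed = subprocess.run(cmd, stderr=log, check=False)
    return completed.returncode
```

The command's stdout is left untouched, so whatever the caller reads from standard output is unaffected.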

Build the Demo

  • Script for the demo we are going to show DARPA
  • good set of test cases
  • what good results can we show?

Musedev: Implement the new Features in the Search Service

  • Create a new endpoint accepting a list of groums and source files.
  • Refactor implementation to be stateless with respect to previous runs.
  • Analysis on refactoring (note: don't remember the specifics of this, @smover could you look at the photo and update this point? it was item 5.2 on the whiteboard discussion 11/5/19)

Prepare tools-deployment for the demo/hackathon

  1. Implementation
  • Import patterns, clusters, groums in Solr
  • Implement search tool for patterns
  • Improve the graph extractor (APKs, ignore libraries, scale)
  • Improve extraction scripts (switch to python, ease the execution/debugging of the different steps)
  2. Deployment
  • Deploy search tool
  • Import graphs in Solr
  • Import clusters into Solr

Fix source code packaging in the musedev api

The muse api is supposed to create an archive of the source code corresponding to the graphs built from the class files.

Currently, the muse api fills the archive with the contents of a "source" folder that it expects to find where the graphs were extracted. This logic is wrong, since the graph extractor never creates such a folder.

The muse api must construct the archive of source code files to send to the search service differently.
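
A minimal sketch of one alternative, assuming the api can obtain an explicit list of source files that correspond to the extracted graphs. How that list is obtained, the tar.gz format, and the helper name build_source_archive are all assumptions, not the existing implementation.

```python
import tarfile
from pathlib import Path


def build_source_archive(source_root, source_files, archive_path):
    """Pack the listed files (paths relative to `source_root`) into a .tar.gz.

    Returns the relative paths that could not be found, so the caller can
    report them instead of silently sending an incomplete archive.
    """
    source_root = Path(source_root)
    missing = []
    with tarfile.open(str(archive_path), "w:gz") as archive:
        for rel in sorted(set(source_files)):
            full = source_root / rel
            if full.is_file():
                # arcname keeps the path inside the archive relative.
                archive.add(str(full), arcname=rel)
            else:
                missing.append(rel)
    return missing
```

Building the archive from an explicit file list removes the dependency on a "source" folder existing next to the graphs.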

run_mining.py fails to perform all mining steps on APKs

scripts/run_mining.py exits after extracting graphs but before performing clustering, pattern search, and HTML generation.

Assuming a test directory with the following structure:

myTest
└── fdroid
    ├── app1
    ├── app2
    └── app3

The configuration files can be generated like so:

python scripts/generate_mining_files.py -p <path to myTest>/fdroid -b <path to myTest> -o <output path>

To run the mining:

python scripts/run_mining.py -c <output path>/mining_configuration/config.yaml

Currently, run_mining.py completes the graph extraction but fails to compute the clusters. The full mining does work, but only after config.yaml is edited to disable extraction and run_mining.py is run again on the same directory.

API finalize should be robust to a build that generates multiple targets

Some builds generate multiple versions of an app, producing several copies of the same class.

e.g.

root@5db2c9353c41:/fpp# find ./ -name "MainActivity.class"
./analyzing-3354dff4a11388be/MapboxAndroidWearDemo/build/intermediates/javac/debug/compileDebugJavaWithJavac/classes/com/mapbox/mapboxandroiddemo/MainActivity.class
./analyzing-3354dff4a11388be/MapboxAndroidDemo/build/intermediates/javac/globalDebug/compileGlobalDebugJavaWithJavac/classes/com/mapbox/mapboxandroiddemo/MainActivity.class
./analyzing-3354dff4a11388be/MapboxAndroidDemo/build/intermediates/javac/chinaDebug/compileChinaDebugJavaWithJavac/classes/com/mapbox/mapboxandroiddemo/MainActivity.class

The finalize command currently searches for any class file:

https://github.com/cuplv/biggroum/blob/fix_docker/python/fixrgraph/extraction/extract_single.py#L31

This should be updated to intelligently choose one version of the app.

The current workaround is to build with a command for only one release:

e.g. for mapbox:

./gradlew compileGlobalDebugSources
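
A possible selection strategy is sketched below. It assumes class files live under paths shaped like <module>/build/intermediates/javac/<variant>/..., as in the find output above, and keeps the files of a single (module, variant) pair. The helper names and the tie-breaking policy (preferred variant first, otherwise the alphabetically smallest pair) are assumptions, not the behavior of extract_single.py.

```python
from collections import defaultdict
from pathlib import PurePosixPath

MARKER = ("build", "intermediates", "javac")


def split_variant(class_file):
    """Return (module path, variant name), or None if the layout differs."""
    parts = PurePosixPath(class_file).parts
    for i in range(len(parts) - len(MARKER)):
        if parts[i:i + len(MARKER)] == MARKER:
            return "/".join(parts[:i]), parts[i + len(MARKER)]
    return None


def choose_one_variant(class_files, preferred=None):
    """Keep only the class files that belong to one (module, variant) pair."""
    groups = defaultdict(list)
    for path in class_files:
        key = split_variant(path)
        if key is not None:
            groups[key].append(path)
    if not groups:
        return []
    keys = sorted(groups)
    if preferred is not None:
        # Fall back to all keys when no group matches the preferred variant.
        keys = [k for k in keys if k[1] == preferred] or keys
    return sorted(groups[keys[0]])
```

For the three MainActivity.class files listed above, calling choose_one_variant(paths, preferred="globalDebug") would keep only the globalDebug copy.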

Musedev: Manage Residue

The residue is the part of the graph extractor run that persists between interactions with the developer.

Sub tasks:

  • Decide Residue Format
  • Residue in "run" command
  • Residue in "finalize" command
  • Residue in "talk" command

Biggroumscript.sh breaks the tests in python/fixrgraph/musedev/test.

The changes to biggroumscript.sh break the tests in python/fixrgraph/musedev/test.

We were testing that the bash script called the python script correctly, but the bash script is now "path dependent" (e.g., biggroumsetup/biggroum and /root/biggroumsetup) and offers no way to skip the setup process. For example, we can no longer test the script on a Mac, because the paths are hardcoded and the update-alternatives command is always called.

The comment "Note: Environment variable to determine run was difficult. docker does not run .bashrc on shell start, sourcing .bashrc in script also failed" does not really explain why we switched from environment variables (which are quite easy to set when invoking the script, for local testing) to files; it just motivates a workaround.

Why does sourcing the .bashrc file not work? That would mean bash itself misbehaves, which should not be the case.

Here are my main complaints:

  • The script uses the absolute path /root, while it should use ${HOME} and save everything there (e.g., what happens if tomorrow the musedev image changes and the script runs as another user?).

  • Using an absolute path (e.g., /root/biggroumsetup) also makes local testing (outside the container) impossible. It would be fine to replace the absolute path with a path relative to the home directory (e.g., ${HOME}/biggroumsetup_completed).
    At this point you could simply save a file with the additional environment variables set during the setup.

  • Using the relative path biggroumsetup to invoke the python script is another obstacle to running the tests locally (e.g., on a machine where we already have a setup). The lines in the script are:

cd "$(dirname "${BASH_SOURCE[0]}")"
python biggroumsetup/biggroum/python/fixrgraph/musedev/biggroumscript.py "${dir}" "${commit}" "${cmd}" "${graph_extractor_path}" "${fixr_search_endpoint}" < /dev/stdin 1> /dev/stdout 2> /dev/stderr

First, cd "$(dirname "${BASH_SOURCE[0]}")" may not work on the musedev deployment unless the biggroumcheck.sh script is always in the home directory.

You have the same issues for the other environment variables you keep setting:

export GRAPH_EXTRACTOR_PATH="${HOME}/biggroumsetup/fixrgraphextractor_2.12-0.1.0-one-jar.jar" >>setup_log 2>&1 && \
export PYTHONPATH="${HOME}/biggroumsetup/biggroum/python:$PYTHONPATH"  >>setup_log 2>&1

They were inside the setup at first, under the assumption that the container was created fresh at every run.
I would just export those environment variables in the setup and also in the .bashrc (and source the .bashrc every time), so we do not lose them and we can test the script locally.

I would also not change directory before invoking biggroumscript.py, and I would use an environment variable that says where the biggroum repository is:

${BIGGROUMREPO}/python/fixrgraph/musedev/biggroumscript.py
  • You should move the update-alternatives command into the setup steps (the change is persistent, I think).
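
The path handling suggested above can be illustrated with a short sketch. It is written in Python rather than bash, purely for illustration: every location is derived from ${HOME} or from an environment variable, so a local test can override it. The function name resolve_paths and the returned dictionary keys are hypothetical; the variable and file names are the ones mentioned in this issue.

```python
import os
from pathlib import Path


def resolve_paths(env=None):
    """Derive all locations from the environment, with ${HOME}-relative defaults."""
    env = os.environ if env is None else env
    home = Path(env.get("HOME", str(Path.home())))
    setup_dir = home / "biggroumsetup"
    # BIGGROUMREPO overrides the default checkout location, e.g. for local tests.
    repo = Path(env.get("BIGGROUMREPO", str(setup_dir / "biggroum")))
    extractor_default = setup_dir / "fixrgraphextractor_2.12-0.1.0-one-jar.jar"
    return {
        "script": repo / "python" / "fixrgraph" / "musedev" / "biggroumscript.py",
        "graph_extractor": Path(env.get("GRAPH_EXTRACTOR_PATH", str(extractor_default))),
        "setup_marker": home / "biggroumsetup_completed",
    }
```

Passing an explicit env dictionary makes the resolution testable without touching the real environment.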

Originally posted by @smover in #60 (comment)
