goseind / reana-demo-atlas-recast

This project forked from reanahub/reana-demo-atlas-recast

REANA example - RECAST analysis based on ATLAS software stack

License: MIT License

C++ 60.43% Python 22.19% C 1.48% CMake 12.86% Dockerfile 3.04%

reana-demo-atlas-recast's Introduction

REANA example - ATLAS RECAST

About

This REANA reproducible analysis example demonstrates a RECAST analysis using the ATLAS Analysis Software Group stack.

Analysis structure

Making a research data analysis reproducible means providing "runnable recipes" that address (1) where the input data is, (2) what software was used to analyse the data, (3) which computing environments were used to run the software and (4) which computational workflow steps were taken to run the analysis. This makes it possible to instantiate the analysis on the computational cloud and run it to obtain (5) output results.

1. Input data

The analysis takes the following inputs:

  • dxaod: input ROOT file
  • did: dataset ID, e.g. 404958
  • xsec_in_pb: cross section in picobarn, e.g. 0.00122
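
As an illustration only, the three inputs could be collected and sanity-checked in Python before a run is launched. The class name RecastInputs and its validation rules are assumptions made for this sketch and are not part of the analysis code; the numeric values are the examples quoted in the list, and the file name is a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecastInputs:
    """Hypothetical container for the three analysis inputs."""

    dxaod: str         # path or URL of the input ROOT file
    did: int           # dataset ID
    xsec_in_pb: float  # cross section in picobarn

    def __post_init__(self):
        # Assumed sanity checks; the real workflow may accept other values.
        if not self.dxaod.endswith(".root"):
            raise ValueError("dxaod should point to a ROOT file")
        if self.did <= 0:
            raise ValueError("did should be a positive dataset ID")
        if self.xsec_in_pb <= 0:
            raise ValueError("xsec_in_pb should be a positive cross section")


# Placeholder file name; the numbers are the examples from the list.
inputs = RecastInputs(dxaod="signal.root", did=404958, xsec_in_pb=0.00122)
```

Rejecting obviously wrong values up front is cheaper than discovering them after a workflow has been submitted.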

2. Analysis code

The event selection code for this analysis example resides under the eventselection subdirectory. It uses the official analysis releases prepared by the ATLAS Analysis Software Group (ASG).

  • eventselection/CMakeLists.txt
  • eventselection/MyEventSelection/CMakeLists.txt
  • eventselection/MyEventSelection/MyEventSelection/MyEventSelectionAlg.h
  • eventselection/MyEventSelection/Root/LinkDef.h
  • eventselection/MyEventSelection/Root/MyEventSelectionAlg.cxx
  • eventselection/MyEventSelection/util/myEventSelection.cxx

The statistical analysis code for this analysis example resides in the statanalysis subdirectory. It implements limit setting for the outputs produced by the event selection package.

  • statanalysis/data/background.root
  • statanalysis/data/data.root
  • statanalysis/make_ws.py
  • statanalysis/plot.py
  • statanalysis/set_limit.py

Note that the make_ws.py script generates a HistFactory configuration from the signal, data and background ROOT files. It performs a simple HistFactory-based fit on a single channel consisting of two bins.
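
To give a feel for what a single-channel, two-bin counting fit computes, here is a minimal sketch in plain Python. It is not the code in make_ws.py and does not use HistFactory: it assumes a likelihood that is a product of two Poisson terms with a single signal-strength parameter µ and no systematic uncertainties, and all yields are hypothetical numbers chosen for illustration.

```python
import math


def expected(mu, signal, background):
    """Expected event count per bin for signal strength mu."""
    return [mu * s + b for s, b in zip(signal, background)]


def nll(mu, observed, signal, background):
    """Negative log-likelihood of a product of per-bin Poisson terms."""
    total = 0.0
    for n, lam in zip(observed, expected(mu, signal, background)):
        # -log Poisson(n | lam) = lam - n*log(lam) + log(n!)
        total += lam - n * math.log(lam) + math.lgamma(n + 1)
    return total


# Hypothetical two-bin yields, chosen only for illustration.
signal = [5.0, 10.0]
background = [50.0, 60.0]
observed = [53, 65]

# Coarse scan over mu in [0, 2] to locate the best-fit signal strength.
scan = [i / 100 for i in range(0, 201)]
best_mu = min(scan, key=lambda mu: nll(mu, observed, signal, background))
```

A real HistFactory model would add nuisance parameters for systematic uncertainties and use a proper minimiser instead of a grid scan.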

3. Compute environment

In order to be able to rerun the analysis even several years in the future, we need to "encapsulate the current compute environment", for example to freeze the ATLAS software version our analysis is using. We shall achieve this by preparing a Docker container image for our analysis steps.

The event selection stage uses the official ATLAS atlas/analysisbase container image, on top of which we add and build our custom code:

$ less eventselection/Dockerfile
FROM atlas/analysisbase:latest
ADD . /analysis/src
WORKDIR /analysis/build
RUN source ~/release_setup.sh &&  \
    sudo chown -R atlas /analysis && \
    cmake ../src && \
    make -j4

We can build our event selection analysis environment image and name it reanahub/reana-demo-atlas-recast-eventselection:

$ cd eventselection
$ docker build -t reanahub/reana-demo-atlas-recast-eventselection .

The statistical analysis stage also extends atlas/analysisbase with custom code:

$ less statanalysis/Dockerfile
FROM atlas/analysisbase
ADD . /code
RUN sudo sh -c "source /home/atlas/release_setup.sh && pip install hftools"

We can build our statistical analysis environment image and name it reanahub/reana-demo-atlas-recast-statanalysis:

$ cd ../statanalysis
$ docker build -t reanahub/reana-demo-atlas-recast-statanalysis .

We can upload both images to the DockerHub image registry:

$ docker push reanahub/reana-demo-atlas-recast-eventselection
$ docker push reanahub/reana-demo-atlas-recast-statanalysis

(Note that typically you would use your own username such as johndoe in place of reanahub.)

4. Analysis workflow

This analysis example consists of a simple workflow in which the event selection runs first and its output serves as input for the statistical analysis.

We shall use the Yadage workflow engine to express the computational steps in a declarative manner:

workflow.png

The full analysis pipeline is defined in workflow.yml and the individual steps are defined in steps.yml.
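
The dependency between the two stages can be pictured with a small Python sketch. This is not how Yadage is implemented, and the actual step definitions live in workflow.yml and steps.yml; the stage names below are labels chosen for this illustration, and the standard-library graphlib module (Python 3.9 or later) is used only to show that the execution order follows from the declared dependencies.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages whose output it consumes.
# The stage names are labels chosen for this sketch.
dependencies = {
    "eventselection": set(),
    "statanalysis": {"eventselection"},
}

# A topological sort yields an order in which every stage runs
# only after the stages it depends on have finished.
order = list(TopologicalSorter(dependencies).static_order())
```

Declaring only the dependencies and letting the engine derive the order is what "declarative" means in this context.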

5. Output results

The analysis produces several pre-fit and post-fit plots:

pre.png
post.png

The limit plot:

limit.png

The limit data is also stored in JSON format, both for an entire µ-scan and for µ=1.

Running the example on REANA cloud

There are two ways to execute this analysis example on REANA.

If you would like to simply launch this analysis example on the REANA instance at CERN and inspect its results using the web interface, please click on the following badge:

If you would like a step-by-step guide on how to use the REANA command-line client to launch this analysis example, please read on.

We start by creating a reana.yaml file describing the above analysis structure with its inputs, code, runtime environment, computational workflow steps and expected outputs:

version: 0.3.0
inputs:
  parameters:
    did: 404958
    xsec_in_pb: 0.00122
    dxaod_file: https://recastwww.web.cern.ch/recastwww/data/reana-recast-demo/mc15_13TeV.123456.cap_recast_demo_signal_one.root
workflow:
  type: yadage
  file: workflow/workflow.yml
outputs:
  files:
    - outputs/statanalysis/fitresults/pre.png
    - outputs/statanalysis/fitresults/post.png
    - outputs/statanalysis/fitresults/limit.png
    - outputs/statanalysis/fitresults/limit_data.json

We can now install the REANA command-line client, run the analysis and download the resulting plots:

$ # create new virtual environment
$ virtualenv ~/.virtualenvs/reana
$ source ~/.virtualenvs/reana/bin/activate
$ # install REANA client
$ pip install reana-client
$ # connect to some REANA cloud instance
$ export REANA_SERVER_URL=https://reana.cern.ch/
$ export REANA_ACCESS_TOKEN=XXXXXXX
$ # create new workflow
$ reana-client create -n myanalysis
$ export REANA_WORKON=myanalysis
$ # upload input code, data and workflow to the workspace
$ reana-client upload
$ # start computational workflow
$ reana-client start
$ # ... should be finished in about a minute
$ reana-client status
$ # list workspace files
$ reana-client ls
$ # download output results
$ reana-client download
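
After the download step, a short Python helper could confirm that every file listed under outputs in reana.yaml actually arrived. This is a sketch under assumptions: the function name missing_outputs is hypothetical, and the files are assumed to be downloaded under the same relative paths as declared in reana.yaml.

```python
from pathlib import Path

# Output paths copied from the `outputs.files` section of reana.yaml.
EXPECTED_OUTPUTS = [
    "outputs/statanalysis/fitresults/pre.png",
    "outputs/statanalysis/fitresults/post.png",
    "outputs/statanalysis/fitresults/limit.png",
    "outputs/statanalysis/fitresults/limit_data.json",
]


def missing_outputs(workdir, expected=EXPECTED_OUTPUTS):
    """Return the expected output files that are absent under workdir."""
    root = Path(workdir)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    # Check the current directory, assuming the download was run there.
    for rel in missing_outputs("."):
        print("missing:", rel)
```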

Please see the REANA-Client documentation for a more detailed explanation of typical reana-client usage scenarios.

Contributors

The list of contributors in alphabetical order:

tiborsimko, lukasheinrich, audrium, dprelipcean, okraskaj, mvidalgarcia, reana-team, ronalddobos
