Common Workflow Language reference implementation

Home Page: https://cwltool.readthedocs.io/

License: Apache License 2.0


cwltool: The reference implementation of the Common Workflow Language standards


This is the reference implementation of the Common Workflow Language open standards. It is intended to be feature complete and provide comprehensive validation of CWL files as well as provide other tools related to working with CWL.

cwltool is written and tested for Python 3.6, 3.8, 3.9, 3.10, and 3.11.

The reference implementation consists of two packages. The cwltool package is the primary Python module containing the reference implementation in the cwltool module and console executable by the same name.

The cwlref-runner package is optional and provides an additional entry point under the alias cwl-runner, which is the implementation-agnostic name for the default CWL interpreter installed on a host.

cwltool is provided by the CWL project, a member project of Software Freedom Conservancy and our many contributors.

Your operating system may offer cwltool directly. For Debian, Ubuntu, and similar Linux distributions, try

sudo apt-get install cwltool

If you encounter an error, first try to update package information by using

sudo apt-get update

If you are running macOS or another UNIX system and you want to use packages prepared by the conda-forge project, then please follow the conda-forge install instructions (if you haven't already) and then

conda install -c conda-forge cwltool

All of the above methods of installing cwltool use packages that might contain bugs already fixed in newer versions, or might be missing desired features. If the packaged version of cwltool available to you is too old, then we recommend installing with pip inside a venv:

python3 -m venv env      # Create a virtual environment named 'env' in the current directory
source env/bin/activate  # Activate environment before installing `cwltool`

Then install the latest cwlref-runner package from PyPI (which will install the latest cwltool package as well):

pip install cwlref-runner

If installing alongside another CWL implementation (like toil-cwl-runner or arvados-cwl-runner) then instead run

pip install cwltool

MS Windows users should:

  1. Install Windows Subsystem for Linux 2 and Docker Desktop.
  2. Install Debian from the Microsoft Store.
  3. Set Debian as your default WSL 2 distro: wsl --set-default debian.
  4. Return to Docker Desktop, choose Settings → Resources → WSL Integration, and under "Enable integration with additional distros" select "Debian".
  5. Reboot if you have not already.
  6. Launch Debian and follow the Linux instructions above (apt-get install cwltool or use the venv method).

Network problems from within WSL2? Try these instructions followed by wsl --shutdown.

Or you can skip the direct pip commands above and install the latest development version of cwltool:

git clone https://github.com/common-workflow-language/cwltool.git # clone (copy) the cwltool git repository
cd cwltool           # Change to source directory that git clone just downloaded
pip install .[deps]  # Installs ``cwltool`` from source
cwltool --version    # Check if the installation works correctly

Remember, if co-installing multiple CWL implementations, you will need to maintain which implementation cwl-runner points to via a symbolic link or another facility.

We strongly suggest having the following installed:

  • One of the software container engines discussed below (Docker, udocker, or Singularity)
  • node.js, for evaluating CWL expressions quickly (required for udocker users; optional but recommended for the other container engines).

Without these, some examples in the CWL tutorials at http://www.commonwl.org/user_guide/ may not work.

Simple command:

cwl-runner my_workflow.cwl my_inputs.yaml

Or if you have multiple CWL implementations installed and you want to override the default cwl-runner then use:

cwltool my_workflow.cwl my_inputs.yml

You can set cwltool options in the environment with CWLTOOL_OPTIONS; these will be inserted at the beginning of the command line:

export CWLTOOL_OPTIONS="--debug"

boot2docker runs Docker inside a virtual machine, and it only mounts /Users into it. The default behavior of CWL is to create temporary directories under e.g. /var, which is not accessible to Docker containers.

To run CWL successfully with boot2docker you need to set the --tmpdir-prefix and --tmp-outdir-prefix to somewhere under /Users:

$ cwl-runner --tmp-outdir-prefix=/Users/username/project --tmpdir-prefix=/Users/username/project wc-tool.cwl wc-job.json

Some shared computing environments don't support Docker software containers for technical or policy reasons. As a workaround, the CWL reference runner supports using the udocker program on Linux using --udocker.

udocker installation: https://indigo-dc.github.io/udocker/installation_manual.html

Run cwltool just as you usually would, but with --udocker prior to the workflow path:

cwltool --udocker https://github.com/common-workflow-language/common-workflow-language/raw/main/v1.0/v1.0/test-cwl-out2.cwl https://github.com/common-workflow-language/common-workflow-language/raw/main/v1.0/v1.0/empty.json

As was mentioned in the Recommended Software section, cwltool can also use Singularity version 2.6.1 or later as a Docker container runtime. cwltool with Singularity will run software containers specified in DockerRequirement and therefore works with Docker images only; native Singularity images are not supported. To use Singularity as the Docker container runtime, provide the --singularity command line option to cwltool. With Singularity, cwltool can pass all CWL v1.0 conformance tests, except those involving Docker container ENTRYPOINTs.

Example

cwltool --singularity https://github.com/common-workflow-language/common-workflow-language/raw/main/v1.0/v1.0/cat3-tool-mediumcut.cwl https://github.com/common-workflow-language/common-workflow-language/raw/main/v1.0/v1.0/cat-job.json

cwltool can run tool and workflow descriptions on both local and remote systems via its support for HTTP[S] URLs.

Input job files and Workflow steps (via the run directive) can reference CWL documents using absolute or relative local filesystem paths. If a relative path is referenced and that document isn't found in the current directory, then the following locations will be searched: http://www.commonwl.org/v1.0/CommandLineTool.html#Discovering_CWL_documents_on_a_local_filesystem

You can also use cwldep to manage dependencies on external tools and workflows.

Sometimes a workflow needs additional requirements to run in a particular environment or with a particular dataset. To avoid the need to modify the underlying workflow, cwltool supports requirement "overrides".

The format of the "overrides" object is a mapping of item identifier (workflow, workflow step, or command line tool) to the process requirements that should be applied.

cwltool:overrides:
  echo.cwl:
    requirements:
      EnvVarRequirement:
        envDef:
          MESSAGE: override_value
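For context, a minimal echo.cwl that the override above could target might look like the following. This is a hypothetical sketch: the baseCommand, the default envDef value, and the output layout are invented for illustration, not copied from cwltool's actual test file.

```cwl
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool
baseCommand: ["sh", "-c", "echo $MESSAGE"]
requirements:
  EnvVarRequirement:
    envDef:
      MESSAGE: default_value   # the override replaces this with override_value
inputs: []
outputs:
  out: stdout
```

Running this tool with the overrides file shown above would print override_value instead of default_value.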

Overrides can be specified either on the command line, or as part of the job input document. Workflow steps are identified using the name of the workflow file followed by the step name as a document fragment identifier "#id". Override identifiers are relative to the top-level workflow document.

cwltool --overrides overrides.yml my-tool.cwl my-job.yml

Or as part of the job input document (e.g. my-job-with-overrides.yml):

input_parameter1: value1
input_parameter2: value2
cwltool:overrides:
  workflow.cwl#step1:
    requirements:
      EnvVarRequirement:
        envDef:
          MESSAGE: override_value

cwltool my-tool.cwl my-job-with-overrides.yml

Use --pack to combine a workflow made up of multiple files into a single compound document. This operation takes all the CWL files referenced by a workflow and builds a new CWL document with all Process objects (CommandLineTool and Workflow) in a list in the $graph field. Cross references (such as run: and source: fields) are updated to internal references within the new packed document. The top-level workflow is named #main.

cwltool --pack my-wf.cwl > my-packed-wf.cwl

You can run a partial workflow with the --target (-t) option. This takes the name of an output parameter, workflow step, or input parameter in the top-level workflow. You may provide multiple targets.

cwltool --target step3 my-wf.cwl

If a target is an output parameter, cwltool will run only the steps that contribute to that output. If a target is a workflow step, it will run the workflow starting from that step. If a target is an input parameter, it will run only the steps connected to that input.

Use --print-targets to get a listing of the targets of a workflow. To see which steps will run, use --print-subgraph with --target to get a printout of the workflow subgraph for the selected targets.

cwltool --print-targets my-wf.cwl

cwltool --target step3 --print-subgraph my-wf.cwl > my-wf-starting-from-step3.cwl

The --print-dot option will print a file suitable for the Graphviz dot program. Here is a bash one-liner to generate a Scalable Vector Graphics (SVG) file:

cwltool --print-dot my-wf.cwl | dot -Tsvg > my-wf.svg

CWL documents can be expressed as RDF triple graphs.

cwltool --print-rdf --rdf-serializer=turtle mywf.cwl

This reference implementation supports several ways of setting environment variables for tools, in addition to the standard EnvVarRequirement. The sequence of steps applied to create the environment is:

  1. If the --preserve-entire-environment flag is present, then begin with the current environment, else begin with an empty environment.
  2. Add any variables specified by --preserve-environment option(s).
  3. Set TMPDIR and HOME per the CWL v1.0+ CommandLineTool specification.
  4. Apply any EnvVarRequirement from the CommandLineTool description.
  5. Apply any manipulations required by any cwltool:MPIRequirement extensions.
  6. Substitute any secrets required by Secrets extension.
  7. Modify the environment in response to SoftwareRequirement (see below).
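The ordering above can be sketched in Python. This is a simplified illustration of the precedence rules, not cwltool's actual implementation; steps 5–7 (MPI, secrets, SoftwareRequirement) are elided.

```python
def build_env(preserve_all, host_env, preserved_vars, tmpdir, homedir, env_var_req):
    """Illustrative sketch of the environment-construction order."""
    # Step 1: start from the full host environment, or an empty one
    env = dict(host_env) if preserve_all else {}
    # Step 2: copy over explicitly preserved variables
    for name in preserved_vars:
        if name in host_env:
            env[name] = host_env[name]
    # Step 3: TMPDIR and HOME per the CWL CommandLineTool specification
    env["TMPDIR"] = tmpdir
    env["HOME"] = homedir
    # Step 4: EnvVarRequirement entries override everything set so far
    env.update(env_var_req)
    # Steps 5-7 (MPI, secrets, SoftwareRequirement) would apply here
    return env
```

Note that a variable preserved from the host is still overwritten if an EnvVarRequirement sets the same name, since later steps take precedence.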

CWL tools may be decorated with SoftwareRequirement hints that cwltool may in turn use to resolve to packages in various package managers or dependency management systems such as Environment Modules.

Utilizing SoftwareRequirement hints with cwltool requires an optional dependency; for this reason, be sure to specify the deps extra when installing cwltool. For instance:

$ pip install 'cwltool[deps]'

Installing cwltool in this fashion enables several new command line options. The most general of these options is --beta-dependency-resolvers-configuration. This option allows one to specify a dependency resolver's configuration file. This file may be specified as either XML or YAML and very simply describes various plugins to enable to "resolve" SoftwareRequirement dependencies.

Using these hints will allow cwltool to modify the environment in which your tool runs, for example by loading one or more Environment Modules. The environment is constructed as above, then it may be modified by the selected tool resolver. This currently means that you cannot override any environment variables set by the selected tool resolver. Note that the environment given to the configured dependency resolver has the variable _CWLTOOL set to 1 to allow introspection.

To discuss some of these plugins and how to configure them, first consider the following hint definition for an example CWL tool.

SoftwareRequirement:
  packages:
  - package: seqtk
    version:
    - r93

Now imagine deploying cwltool on a cluster with Environment Modules installed, where a seqtk module is available at version r93. Cluster users likely won't have the seqtk binary on their PATH by default, but after loading this module with the command modulecmd sh load seqtk/r93, seqtk is available on the PATH. A simple dependency resolvers configuration file (called, for instance, dependency-resolvers-conf.yml) that would enable cwltool to source the correct module environment before executing the above tool would simply be:

- type: modules

The outer list indicates that one plugin is being enabled; the plugin's parameters are defined as a dictionary for this one list item. There is only one required parameter for the plugin above, type, which defines the plugin type. This parameter is required for all plugins. The available plugins and the parameters available for each are documented (incompletely) here. Unfortunately, this documentation is in the context of Galaxy tool requirements instead of CWL SoftwareRequirements, but the concepts map fairly directly.

cwltool is distributed with an example of such a seqtk tool and a corresponding sample job. It can be executed from the cwltool root, using a dependency resolvers configuration file such as the one above, with the command:

cwltool --beta-dependency-resolvers-configuration /path/to/dependency-resolvers-conf.yml \
    tests/seqtk_seq.cwl \
    tests/seqtk_seq_job.json

This example demonstrates both that cwltool can leverage existing software installations and that it can handle workflows with dependencies on different versions of the same software and libraries. However, the above example does require an existing module setup, so it is impossible to test it "out of the box" with cwltool. For a more isolated test that demonstrates all the same concepts, the resolver plugin type galaxy_packages can be used.

"Galaxy packages" are a lighter-weight alternative to Environment Modules: they are really just a way of laying out directories into packages and versions containing small scripts that are sourced to modify the environment. They have been used for years in the Galaxy community to adapt Galaxy tools to cluster environments, but require neither knowledge of Galaxy nor any special tools to set up. These should work just fine for CWL tools.

The cwltool source code repository's test directory contains a very simple directory tree that defines a set of "Galaxy packages" (really just one package, named random-lines). The directory layout is simply:

tests/test_deps_env/
  random-lines/
    1.0/
      env.sh

If the galaxy_packages plugin is enabled and pointed at the tests/test_deps_env directory in cwltool's root, and a SoftwareRequirement such as the following is encountered,

hints:
  SoftwareRequirement:
    packages:
    - package: 'random-lines'
      version:
      - '1.0'

Then cwltool will simply find that env.sh file and source it before executing the corresponding tool. That env.sh script is only responsible for modifying the job's PATH to add the required binaries.
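For illustration, such an env.sh could be as simple as the following. This is a hypothetical sketch: the /opt/packages path is invented, and the actual test file's contents may differ.

```shell
# Hypothetical env.sh for random-lines/1.0: prepend the package's
# bin directory to PATH so its binaries can be found by the tool
export PATH="/opt/packages/random-lines/1.0/bin:$PATH"
```

Because the script is sourced (not executed in a subshell), the modified PATH is visible to the tool invocation that follows.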

This is a full example that works since resolving "Galaxy packages" has no external requirements. Try it out by executing the following command from cwltool's root directory:

cwltool --beta-dependency-resolvers-configuration tests/test_deps_env_resolvers_conf.yml \
    tests/random_lines.cwl \
    tests/random_lines_job.json

The resolvers configuration file in the above example was simply:

- type: galaxy_packages
  base_path: ./tests/test_deps_env

It is possible that the SoftwareRequirements in a given CWL tool will not match the module names on a given cluster. Such requirements can be re-mapped to specific deployed packages or versions using another file, specified using the resolver plugin parameter mapping_files. We will demonstrate this using galaxy_packages, but the concepts apply equally well to Environment Modules or Conda packages (described below), for instance.

So consider the resolvers configuration file (tests/test_deps_env_resolvers_conf_rewrite.yml):

- type: galaxy_packages
  base_path: ./tests/test_deps_env
  mapping_files: ./tests/test_deps_mapping.yml

And the corresponding mapping configuration file (tests/test_deps_mapping.yml):

- from:
    name: randomLines
    version: 1.0.0-rc1
  to:
    name: random-lines
    version: '1.0'

This says that if cwltool encounters a requirement for randomLines at version 1.0.0-rc1 in a tool, it should rewrite it to our specific package, random-lines at version 1.0. cwltool ships a test tool, random_lines_mapping.cwl, that contains such a source SoftwareRequirement. To try out this example with mapping, execute the following command from the cwltool root directory:

cwltool --beta-dependency-resolvers-configuration tests/test_deps_env_resolvers_conf_rewrite.yml \
    tests/random_lines_mapping.cwl \
    tests/random_lines_job.json
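The rewriting behavior can be sketched in a few lines of Python. This illustrates the mapping semantics only; it is not cwltool's code, and the function name is invented.

```python
def apply_mapping(package, version, mappings):
    """Return the deployed (name, version) for a tool-declared (name, version),
    or the original pair unchanged if no mapping entry matches."""
    for m in mappings:
        src = m["from"]
        if src["name"] == package and str(src.get("version")) == str(version):
            dst = m["to"]
            return dst["name"], str(dst["version"])
    return package, str(version)

# The mapping file above, parsed from YAML into Python structures:
mappings = [{"from": {"name": "randomLines", "version": "1.0.0-rc1"},
             "to": {"name": "random-lines", "version": "1.0"}}]
```

A requirement that appears in no mapping entry passes through untouched, so a single mapping file can be shared across many tools.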

The previous examples demonstrated leveraging existing infrastructure to provide requirements for CWL tools. If instead a real package manager is used, cwltool has the opportunity to install requirements as needed. While initial support for Homebrew/Linuxbrew plugins is available, the most developed such plugin is for the Conda package manager. Conda has the nice properties of allowing multiple versions of a package to be installed simultaneously, not requiring elevated permissions to install Conda itself or packages using Conda, and being cross-platform. For these reasons, cwltool may run as a normal user, install its own Conda environment, and manage multiple versions of Conda packages on Linux and macOS.

The Conda plugin can be endlessly configured, but a sensible set of defaults that has proven a powerful stack for dependency management within the Galaxy tool development ecosystem can be enabled by simply passing cwltool the --beta-conda-dependencies flag.

With this, we can use the seqtk example above without Docker or any externally managed services - cwltool should install everything it needs and create an environment for the tool. Try it out with the following command:

cwltool --beta-conda-dependencies tests/seqtk_seq.cwl tests/seqtk_seq_job.json

The CWL specification allows URIs to be attached to SoftwareRequirements to disambiguate package names. If the mapping files described above allow deployers to adapt tools to their infrastructure, this mechanism allows tools to adapt their requirements to multiple package managers. To demonstrate this within the context of the seqtk example, we can simply break the package name we use and then specify a specific Conda package as follows:

hints:
  SoftwareRequirement:
    packages:
    - package: seqtk_seq
      version:
      - '1.2'
      specs:
      - https://anaconda.org/bioconda/seqtk
      - https://packages.debian.org/sid/seqtk

The example can be executed using the command:

cwltool --beta-conda-dependencies tests/seqtk_seq_wrong_name.cwl tests/seqtk_seq_job.json

The plugin framework for managing the resolution of these software requirements is maintained as part of galaxy-tool-util, a small, portable subset of the Galaxy project. More information on configuration and implementation can be found at the following links:

Cwltool can launch tools directly from GA4GH Tool Registry API endpoints.

By default, cwltool searches https://dockstore.org/ . Use --add-tool-registry to add other registries to the search path.

For example

cwltool quay.io/collaboratory/dockstore-tool-bamstats:develop test.json

and (defaults to latest when a version is not specified)

cwltool quay.io/collaboratory/dockstore-tool-bamstats test.json

For this example, grab the test.json (and input file) from https://github.com/CancerCollaboratory/dockstore-tool-bamstats

wget https://dockstore.org/api/api/ga4gh/v2/tools/quay.io%2Fbriandoconnor%2Fdockstore-tool-bamstats/versions/develop/PLAIN-CWL/descriptor/test.json
wget https://github.com/CancerCollaboratory/dockstore-tool-bamstats/raw/develop/rna.SRR948778.bam

Cwltool supports an extension to the CWL spec http://commonwl.org/cwltool#MPIRequirement. When the tool definition has this in its requirements/hints section, and cwltool has been run with --enable-ext, then the tool's command line will be extended with the commands needed to launch it with mpirun or similar. You can specify the number of processes to start as either a literal integer or an expression (that will result in an integer). For example:

#!/usr/bin/env cwl-runner
cwlVersion: v1.1
class: CommandLineTool
$namespaces:
  cwltool: "http://commonwl.org/cwltool#"
requirements:
  cwltool:MPIRequirement:
    processes: $(inputs.nproc)
inputs:
  nproc:
    type: int

Interaction with containers: the MPIRequirement currently prepends its commands to the front of the command line that is constructed. If you wish to run a containerized application in parallel, for simple use cases, this does work with Singularity, depending upon the platform setup. However, this combination should be considered "alpha" -- please do report any issues you have! This does not work with Docker at the moment. (More precisely, you get n copies of the same single process image run at the same time that cannot communicate with each other.)

The host-specific parameters are configured in a simple YAML file (specified with the --mpi-config-file flag). The allowed keys are given in the following table; all are optional.

Key Type Default Description
runner str "mpirun" The primary command to use.
nproc_flag str "-n" Flag to set number of processes to start.
default_nproc int 1 Default number of processes.
extra_flags List[str] [] A list of any other flags to be added to the runner's command line before the baseCommand.
env_pass List[str] [] A list of environment variables that should be passed from the host environment through to the tool (e.g., giving the node list as set by your scheduler).
env_pass_regex List[str] [] A list of python regular expressions that will be matched against the host's environment. Those that match will be passed through.
env_set Mapping[str,str] {} A dictionary of environment variable names mapped to the values to set them to.
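As an illustration, a hypothetical --mpi-config-file for an Open MPI installation on a Slurm cluster might look like the following. All values here are invented examples for this sketch, not defaults or recommendations.

```yaml
runner: mpiexec                  # use mpiexec instead of the default mpirun
nproc_flag: "-np"                # flag that sets the process count
default_nproc: 4
extra_flags: ["--oversubscribe"]
env_pass: ["SLURM_JOB_ID", "SLURM_NODELIST"]
env_pass_regex: ["^OMPI_.*"]
env_set:
  OMP_NUM_THREADS: "1"
```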

For very large workflows, cwltool can spend a lot of time in initialization before the first step runs. There is an experimental flag, --fast-parser, which can dramatically reduce the initialization overhead; however, as of this writing it has several limitations:

  • Error reporting in general is worse than with the standard parser, so you will want to use it with workflows that you know are already correct.
  • It does not check for dangling links (these become runtime errors instead of loading errors).
  • Several other cases fail, as documented in #1720.

Running basic tests (/tests):

To run the basic tests after installing cwltool execute the following:

pip install -r test-requirements.txt
pytest   ## N.B. This requires node.js or docker to be available

To run various tests in all supported Python environments, we use tox. To run the test suite in all supported Python environments first clone the complete code repository (see the git clone instructions above) and then run the following in the terminal: pip install "tox<4"; tox -p

A list of all environments can be seen using tox --listenvs. Run a specific test environment using tox -e <env name>, and additionally run a specific test using this format: tox -e py310-unit -- -v tests/test_examples.py::test_scandeps

  • Running the entire suite of CWL conformance tests:

The GitHub repository for the CWL specifications contains a script that tests a CWL implementation against a wide array of valid CWL files using the cwltest program.

Instructions for running these tests can be found in the Common Workflow Language Specification repository at https://github.com/common-workflow-language/common-workflow-language/blob/main/CONFORMANCE_TESTS.md .

Add

import cwltool

to your script.

The easiest way to use cwltool to run a tool or workflow from Python is to use a Factory

import cwltool.factory
fac = cwltool.factory.Factory()

echo = fac.make("echo.cwl")
result = echo(inp="foo")

# result["out"] == "foo"

Technical outline of how cwltool works internally, for maintainers.

  1. Use CWL load_tool() to load document.
    1. Fetches the document from file or URL
    2. Applies preprocessing (syntax/identifier expansion and normalization)
    3. Validates the document based on cwlVersion
    4. If necessary, updates the document to the latest spec
    5. Constructs a Process object using the make_tool() callback. This yields a CommandLineTool, Workflow, or ExpressionTool. For workflows, this recursively constructs each workflow step.
    6. To construct custom types for CommandLineTool, Workflow, or ExpressionTool, provide a custom make_tool()
  2. Iterate on the job() method of the Process object to get back runnable jobs.
    1. job() is a generator method (uses the Python iterator protocol)
    2. Each time the job() method is invoked in an iteration, it returns one of: a runnable item (an object with a run() method), None (indicating there is currently no work ready to run) or end of iteration (indicating the process is complete.)
    3. Invoke the runnable item by calling run(). This runs the tool and gets output.
    4. An output callback reports the output of a process.
    5. job() may be iterated over multiple times. It will yield all the work that is currently ready to run and then yield None.
  3. Workflow objects create a corresponding WorkflowJob and WorkflowJobStep objects to hold the workflow state for the duration of the job invocation.
    1. The WorkflowJob iterates over each WorkflowJobStep and determines whether the inputs of the step are ready.
    2. When a step is ready, it constructs an input object for that step and iterates on the job() method of the workflow job step.
    3. Each runnable item is yielded back up to top-level run loop
    4. When a step job completes and receives an output callback, the job outputs are assigned to the output of the workflow step.
    5. When all steps are complete, the intermediate files are moved to a final workflow output, intermediate directories are deleted, and the workflow's output callback is called.
  4. A CommandLineTool's job() method yields a single runnable object.
    1. The CommandLineTool job() method calls make_job_runner() to create a CommandLineJob object
    2. The job method configures the CommandLineJob object by setting public attributes
    3. The job method iterates over file and directories inputs to the CommandLineTool and creates a "path map".
    4. Files are mapped from their "resolved" location to a "target" path where they will appear at tool invocation (for example, a location inside a Docker container.) The target paths are used on the command line.
    5. Files are staged to target paths using either Docker volume binds (when using containers) or symlinks (if not). This staging step enables files to be logically rearranged or renamed independently of their source layout.
    6. The run() method of CommandLineJob executes the command line tool or Docker container, waits for it to complete, collects output, and makes the output callback.
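The iteration protocol from step 2 can be sketched with toy stand-ins. The classes below are illustrative only; they mimic the job() generator contract (runnable item, None, end of iteration) without using any cwltool code.

```python
def run_loop(process):
    """Drive a Process to completion: run ready items, skip None, stop at end."""
    outputs = []
    for item in process.job():
        if item is None:
            continue          # nothing ready yet; real code would wait for callbacks
        item.run()            # execute the runnable item
        outputs.append(item.output)
    return outputs

class ToyJob:                 # stand-in for a runnable item with a run() method
    def __init__(self, n):
        self.n, self.output = n, None
    def run(self):
        self.output = self.n * 2

class ToyProcess:             # stand-in whose job() is a generator
    def job(self):
        yield ToyJob(1)
        yield None            # simulates "no work currently ready"
        yield ToyJob(2)
```

In cwltool the None case matters because a workflow step may be blocked waiting for an upstream output callback; a real executor would wait rather than busy-loop.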

The following functions can be passed to main() to override or augment the listed behaviors.

executor
executor(tool, job_order_object, runtimeContext, logger)
  (Process, Dict[Text, Any], RuntimeContext) -> Tuple[Dict[Text, Any], Text]

An implementation of the top-level workflow execution loop should synchronously run a process object to completion and return the output object.

versionfunc
()
  () -> Text

Return version string.

logger_handler
logger_handler
  logging.Handler

Handler object for logging.

The following functions can be set in LoadingContext to override or augment the listed behaviors.

fetcher_constructor
fetcher_constructor(cache, session)
  (Dict[unicode, unicode], requests.sessions.Session) -> Fetcher

Construct a Fetcher object with the supplied cache and HTTP session.

resolver
resolver(document_loader, document)
  (Loader, Union[Text, dict[Text, Any]]) -> Text

Resolve a relative document identifier to an absolute one that can be fetched.

The following functions can be set in RuntimeContext to override or augment the listed behaviors.

construct_tool_object
construct_tool_object(toolpath_object, loadingContext)
  (MutableMapping[Text, Any], LoadingContext) -> Process

Hook to construct a Process object (e.g. CommandLineTool) from a document.

select_resources
selectResources(request)
  (Dict[str, int], RuntimeContext) -> Dict[Text, int]

Take a resource request and turn it into a concrete resource assignment.
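As an illustration, a custom select_resources hook might grant each resource at its requested minimum. This is a hypothetical sketch: the <key>Min/<key>Max request keys and the result keys are assumptions modeled on CWL's ResourceRequirement fields, not cwltool's exact interface.

```python
def select_resources(request, runtime_context=None):
    """Grant each resource at its requested minimum, capped by its maximum.

    `request` is assumed to carry e.g. coresMin/coresMax entries; the
    result maps each bare resource name to the granted amount.
    """
    result = {}
    for key in ("cores", "ram", "tmpdirSize", "outdirSize"):
        rmin = request.get(key + "Min", 1)
        rmax = request.get(key + "Max", rmin)
        result[key] = min(rmin, rmax)  # never grant more than the stated maximum
    return result
```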

make_fs_access
make_fs_access(basedir)
  (Text) -> StdFsAccess

Return a file system access object.

In addition, when providing custom subclasses of Process objects, you can override the following methods:

CommandLineTool.make_job_runner
make_job_runner(RuntimeContext)
  (RuntimeContext) -> Type[JobBase]

Create and return a job runner object (this implements concrete execution of a command line tool).

Workflow.make_workflow_step
make_workflow_step(toolpath_object, pos, loadingContext, parentworkflowProv)
  (Dict[Text, Any], int, LoadingContext, Optional[ProvenanceProfile]) -> WorkflowStep

Create and return a workflow step object.

cwltool's People

Contributors

boysha, chapmanb, denis-yuen, dependabot[bot], dleehr, farahzkhan, gijzelaerr, guillermo-carrasco, hmenager, jfennick, jmchilton, jmfernandez, jrandall, kannon92, kapilkd13, kinow, manu-chroma, mb1069, mccalluc, michael-kotliar, mr-c, portah, porterjamesj, psafont, rupertnash, sersorrel, stain, tetron, thomashickman, tom-tan


cwltool's Issues

TypeError: 'str' object does not support item assignment

Traceback (most recent call last):
  File "/Users/oren/anaconda/bin/cwltool", line 9, in <module>
    load_entry_point('cwltool==1.0.20151026181844', 'console_scripts', 'cwltool')()
  File "/Users/oren/anaconda/lib/python2.7/site-packages/cwltool/main.py", line 337, in main
    t = load_tool(args.workflow, args.update, args.strict, makeTool, args.debug, args.print_pre)
  File "/Users/oren/anaconda/lib/python2.7/site-packages/cwltool/main.py", line 266, in load_tool
    workflowobj = update.update(workflowobj, document_loader, fileuri)
  File "/Users/oren/anaconda/lib/python2.7/site-packages/cwltool/update.py", line 88, in update
    (doc, version) = nextupdate(doc, loader, baseuri)
  File "/Users/oren/anaconda/lib/python2.7/site-packages/cwltool/update.py", line 70, in draft2toDraft3
    return (_draft2toDraft3(doc, loader, baseuri), "https://w3id.org/cwl/cwl#draft-3.dev1")
  File "/Users/oren/anaconda/lib/python2.7/site-packages/cwltool/update.py", line 53, in _draft2toDraft3
    s["id"] = "step%i" % i
TypeError: 'str' object does not support item assignment

AttributeError: 'list' object has no attribute 'startswith' when giving a list to outputbinding

I am trying to use an input array in an outputBinding, but I get this error.
Example cwl:

inputs:
  - id: "#ids"
    type:
      type: array
      items: string
    default: null
    inputBinding:
      prefix: --ids
outputs:
  - id: "#files"
    type:
      type: array
      items: File
    outputBinding:
      glob:
        engine: node-engine.cwl
        script: |
          {
return inputs["ids"];
          }

Error:

Exception while running job
Traceback (most recent call last):
  File "/Users/yajing/Documents/.venv/lib/python2.7/site-packages/cwltool/job.py", line 179, in run
    outputs = self.collect_outputs(self.outdir)
  File "/Users/yajing/Documents/.venv/lib/python2.7/site-packages/cwltool/draft2tool.py", line 196, in collect_output_ports
    ret[fragment] = self.collect_output(port, builder, outdir)
  File "/Users/yajing/Documents/.venv/lib/python2.7/site-packages/cwltool/draft2tool.py", line 214, in collect_output
    r.extend([{"path": g, "class": "File"} for g in builder.fs_access.glob(os.path.join(outdir, gb))])
  File "/Users/yajing/Documents/.venv/lib/python2.7/posixpath.py", line 68, in join
    if b.startswith('/'):
AttributeError: 'list' object has no attribute 'startswith'
[job 4418143504] completed permanentFail
Final process status is permanentFail
Workflow error:
  Process status is ['permanentFail']

According to CWL draft-3, glob should be able to handle an array, right?
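For comparison, this is the draft-3 shape the issue expects to work, using draft-3 parameter-reference syntax instead of the draft-2 node-engine script. This is a sketch, not a confirmed workaround; whether cwltool actually handles an array-valued glob here is exactly what is in question:

```yaml
outputs:
  - id: "#files"
    type:
      type: array
      items: File
    outputBinding:
      # per draft-3, glob may evaluate to a string or an array of strings;
      # this expression returns the input array unchanged
      glob: $(inputs.ids)
```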

"No such file or directory" in very simple example

Hi everybody,

First of all, I would like to congratulate you all on the amazing work you
are doing with CWL. It really seems a great idea to standardize practices
that we currently reimplement almost continuously in our work (I am a
bioinformatician).

I am getting started with the tools, trying to replicate some of the simple
examples (Hello world!) and to find out whether I can do something similar
with a command-line R script, which is something I am currently very
interested in. I already have the following dummy R script:

#!/usr/bin/env Rscript

args = commandArgs(trailingOnly = TRUE)

result = sum(as.numeric(args))

cat(result, '\n')

And I have created the following CWL specification:

- id: "#sumcmd"
  class: CommandLineTool
  inputs:
    - id: "#a"
      type: int
      inputBinding: {}
    - id: "#b"
      type: int
      inputBinding: {}
    - id: "#output-file"
      type: string
  outputs:
    - id: "#filename"
      type: File
      outputBinding: 
        glob: $(inputs['output-file'])
  baseCommand: sum.r
  stdout: $(inputs['output-file'])

If I try to run the tool, it seems fine at first:

vagrant@debian-jessie:~$ cwl-runner sum.cwl#sumcmd -h
/usr/local/bin/cwl-runner 1.0.20160222205901
usage: sum.cwl#sumcmd [-h] -a A -b B --output-file OUTPUT_FILE [job_order]

positional arguments:
  job_order             Job input json file

optional arguments:
  -h, --help            show this help message and exit
  -a A
  -b B
  --output-file OUTPUT_FILE

But when I try to run it with actual arguments:

vagrant@debian-jessie:~$ cwl-runner sum.cwl#sumcmd -a 50 -b 49 --output-file foobar.txt
/usr/local/bin/cwl-runner 1.0.20160222205901
Got workflow error
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/cwltool/main.py", line 160, in single_job_executor
    for r in jobiter:
  File "/usr/local/lib/python2.7/dist-packages/cwltool/draft2tool.py", line 131, in job
    j.stdout = builder.do_eval(self.tool["stdout"])
  File "/usr/local/lib/python2.7/dist-packages/cwltool/builder.py", line 164, in do_eval
    context=context, pull_image=pull_image)
  File "/usr/local/lib/python2.7/dist-packages/cwltool/expression.py", line 135, in do_eval
    return sandboxjs.interpolate(ex, jshead(r.get("expressionLib", []), rootvars))
  File "/usr/local/lib/python2.7/dist-packages/cwltool/sandboxjs.py", line 131, in interpolate
    e = execjs(scan[w[0]+1:w[1]], jslib)
  File "/usr/local/lib/python2.7/dist-packages/cwltool/sandboxjs.py", line 20, in execjs
    stderr=subprocess.PIPE)
  File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1335, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
Workflow error:
  [Errno 2] No such file or directory

At first I thought this was on me, but then I cloned the example workflows
and tools repository, and tried to execute one of the included tools.

vagrant@debian-jessie:~$ cwl-runner workflows/tools/linux-sort.cwl --input kk.txt --output kk2.txt --key 1
/usr/local/bin/cwl-runner 1.0.20160222205901
Got workflow error
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/cwltool/main.py", line 160, in single_job_executor
    for r in jobiter:
  File "/usr/local/lib/python2.7/dist-packages/cwltool/draft2tool.py", line 131, in job
    j.stdout = builder.do_eval(self.tool["stdout"])
  File "/usr/local/lib/python2.7/dist-packages/cwltool/builder.py", line 164, in do_eval
    context=context, pull_image=pull_image)
  File "/usr/local/lib/python2.7/dist-packages/cwltool/expression.py", line 135, in do_eval
    return sandboxjs.interpolate(ex, jshead(r.get("expressionLib", []), rootvars))
  File "/usr/local/lib/python2.7/dist-packages/cwltool/sandboxjs.py", line 131, in interpolate
    e = execjs(scan[w[0]+1:w[1]], jslib)
  File "/usr/local/lib/python2.7/dist-packages/cwltool/sandboxjs.py", line 20, in execjs
    stderr=subprocess.PIPE)
  File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1335, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
Workflow error:
  [Errno 2] No such file or directory

The problem is, I have no idea what exactly is causing the error. I can see
that it is related to not finding a file or directory, but I am not sure
where to start looking, because I do not fully understand what the
cwl-runner tool is actually doing.

Any help or hint would be much appreciated.

Regards,
Gustavo.
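Both tracebacks above fail while spawning the JavaScript expression engine as a subprocess (sandboxjs.execjs), so a variant of the tool with no expressions sidesteps that dependency and may help isolate the problem. A minimal sketch, assuming the sum.r script above is on PATH; result.txt is a hypothetical fixed output name:

```yaml
- id: "#sumcmd-noexpr"
  class: CommandLineTool
  inputs:
    - id: "#a"
      type: int
      inputBinding: {}
    - id: "#b"
      type: int
      inputBinding: {}
  outputs:
    - id: "#filename"
      type: File
      outputBinding:
        # literal glob: no expression engine is invoked
        glob: result.txt
  baseCommand: sum.r
  # literal stdout name for the same reason
  stdout: result.txt
```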

Symlinking input to Docker on Mac creates the link locally, not in the host VM

I am trying to run cwltool on a Mac, invoking a Docker container. When I specify the input to cwltool, it tries to create a symlink to the input file inside the host VM so that the container can access it. But since I am running cwltool locally, it creates the link locally instead, which causes the container to fail because it cannot locate the input file inside the host.

Here is the command line I am using:

cwl-runner --debug samtools-workflow.cwl --input test/input/original.bam

and I end up with this symlink in my local directory:

indexed.bam -> test/input/original.bam

I end up with these lines in the debug output:

[job samindex] /var/folders/2l/0wpdpqws4jvg9lqjwrhvdcl8_c3ksp/T/tmpTvbsTy$ docker run -i --volume=/Users/spanglry/Code/pipelines-api-examples/samtools/test/input/original.bam:/var/lib/cwl/job956205074_input/original.bam:ro --volume=/var/folders/2l/0wpdpqws4jvg9lqjwrhvdcl8_c3ksp/T/tmpTvbsTy:/var/spool/cwl:rw --volume=/var/folders/2l/0wpdpqws4jvg9lqjwrhvdcl8_c3ksp/T/tmpYpwv0a:/tmp:rw --workdir=/var/spool/cwl --read-only=true --net=none --user=1000 --rm --env=TMPDIR=/tmp sha256:f8de369e9dddf875c5f53c5aada66596d12affccf8b96da15a00c48a1b3a4be9 samtools index indexed.bam
symlinking /var/folders/2l/0wpdpqws4jvg9lqjwrhvdcl8_c3ksp/T/tmpTvbsTy/indexed.bam to /var/lib/cwl/job956205074_input/original.bam
open: No such file or directory
[bam_index_build2] fail to open the BAM file.

Contents of samtools-workflow.cwl -----------------

#!/usr/bin/env cwl-runner

class: Workflow

inputs:
  - id: "#input"
    type: File
    description: "bam file"

outputs:
  - id: "#bam"
    type: File
    source: "#samindex.bam_with_bai"

hints:
  - class: DockerRequirement
    dockerLoad: gcr.io/level-elevator-714/samtools
    dockerImageId: sha256:f8de369e9dddf875c5f53c5aada66596d12affccf8b96da15a00c48a1b3a4be9

steps:
  - id: "#samindex"
    run: { import: samtools-index.cwl }
    inputs:
      - { id: "#samindex.input", source: "#input" }
    outputs:
      - { id: "#samindex.bam_with_bai" }

Contents of samtools-index.cwl -------------------

#!/usr/bin/env cwl-runner

class: CommandLineTool

description: "Invoke 'samtools index' to create a 'BAI' index (samtools 1.19)"

requirements:
  - class: CreateFileRequirement
    fileDef:
      - filename: indexed.bam
        fileContent:
          engine: "cwl:JsonPointer"
          script: "job/input"

inputs:
  - id: "#input"
    type: File
    description:
      Input bam file.

outputs:
  - id: "#bam_with_bai"
    type: File
    outputBinding:
      glob: "indexed.bam"
      secondaryFiles:
        - ".bai"

baseCommand: ["samtools", "index"]

arguments:
  - "indexed.bam"
  - "indexed.bam.bai"

Questions:

Is there some way to indicate that the symlink should be created inside the Docker host rather than by the process running the script?

Am I doing something else wrong here?

Any help would be greatly appreciated, thank you!

Custom nodejs engine?

Hello CWL folks!

Is there any reason for shipping a Node.js runtime/engine in the reference implementation?

https://hub.docker.com/r/commonworkflowlanguage/nodejs-engine/

cwltool romanvg$ grep -ri node-engine* *
cwltool/schemas/draft-2/draft-2/bwa-mem-tool.cwl:  - import: node-engine.cwl
cwltool/schemas/draft-2/draft-2/bwa-mem-tool.cwl:      engine: "node-engine.cwl"
cwltool/schemas/draft-2/draft-2/count-lines2-wf.cwl:  - import: node-engine.cwl
cwltool/schemas/draft-2/draft-2/count-lines2-wf.cwl:        engine: node-engine.cwl
cwltool/schemas/draft-2/draft-2/parseInt-tool.cwl:  - import: node-engine.cwl
cwltool/schemas/draft-2/draft-2/parseInt-tool.cwl:  engine: node-engine.cwl
cwltool/schemas/draft-2/draft-2/wc2-tool.cwl:  - import: node-engine.cwl
cwltool/schemas/draft-2/draft-2/wc2-tool.cwl:            engine: node-engine.cwl
cwltool/schemas/draft-2/draft-2/wc3-tool.cwl:  - import: node-engine.cwl
cwltool/schemas/draft-2/draft-2/wc3-tool.cwl:            engine: node-engine.cwl

Why not use the upstream one instead (possibly more optimized via Alpine Linux)?

@chapmanb @mr-c

inconsistent behavior: passing file paths via command line arguments vs input job documents

#!/usr/bin/env cwl-runner

cwlVersion: cwl:draft-3
class: CommandLineTool

baseCommand: wc

inputs:
  - id: inputfile
    type: File
    inputBinding:
      position: 1

outputs:
  - id: outputfile
    type: File
    outputBinding:
      glob: $((inputs.inputfile.path + '.wc').replace(runtime.outdir + '/', ''))

stdout: $(inputs.inputfile.path + '.wc')

requirements:
  - class: InlineJavascriptRequirement
(env) mcrusoe@mrcdev:~/t$ cwl-runner wc3.cwl --inputfile input.txt 
/home/mcrusoe/t/env/bin/cwl-runner 1.0.20160507101510
[job wc3.cwl] /home/mcrusoe/t$ wc /home/mcrusoe/t/input.txt > /home/mcrusoe/t/input.txt.wc
Final process status is success
{
    "outputfile": {
        "size": 32, 
        "path": "/home/mcrusoe/t/input.txt.wc", 
        "checksum": "sha1$843dfe5163bcb9cc33b20d0142a10db395a71ccd", 
        "class": "File"
    }
}

vs using an input document

inputfile:
  class: File
  path: input.txt
(env) mcrusoe@mrcdev:~/t$ rm input.txt.wc 
(env) mcrusoe@mrcdev:~/t$ cwl-runner wc3.cwl wc2-job1.yml 
/home/mcrusoe/t/env/bin/cwl-runner 1.0.20160507101510
[job wc3.cwl] /home/mcrusoe/t$ wc /home/mcrusoe/t/input.txt > /home/mcrusoe/t/file:///home/mcrusoe/t/input.txt.wc
Error while running job: Error collecting output for parameter 'outputfile': Did not find output file with glob pattern: '[u'input.txt.wc']'
[job wc3.cwl] completed permanentFail
Final process status is permanentFail
Workflow error, try again with --debug for more information:
  Process status is ['permanentFail']
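A sketch of a workaround that sidesteps the path-versus-URI difference by globbing on the basename only, using the same InlineJavascriptRequirement as the tool above. The expression is illustrative, not a confirmed fix:

```yaml
outputs:
  - id: outputfile
    type: File
    outputBinding:
      # use only the basename, so it does not matter whether
      # inputs.inputfile.path arrived as a plain path or a file:// URI
      glob: $(inputs.inputfile.path.split('/').pop() + '.wc')

stdout: $(inputs.inputfile.path.split('/').pop() + '.wc')
```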

conformance test gives "TypeError: __init__() got an unexpected keyword argument 'type'"

I tend to get these errors with cwltool --conformance-test quite a bit.

One example is for the samtools-view tool, but I suddenly got a similar error for my work on bwa-aln.cwl, after adding a trailing slash to a namespace (s: http://schema.org/ instead of s: http://schema.org).

Any hints?

[samuel tools]$ cwltool --conformance-test samtools-view.cwl 
/home/samuel/.pyenv/versions/2.7.11/bin/cwltool 1.0.20160427142240
Traceback (most recent call last):
  File "/home/samuel/.pyenv/versions/2.7.11/bin/cwltool", line 9, in <module>
    load_entry_point('cwltool==1.0.20160427142240', 'console_scripts', 'cwltool')()
  File "/home/samuel/.pyenv/versions/2.7.11/lib/python2.7/site-packages/cwltool/main.py", line 590, in main
    stdout=stdout)
  File "/home/samuel/.pyenv/versions/2.7.11/lib/python2.7/site-packages/cwltool/main.py", line 429, in load_job_order
    toolparser = generate_parser(argparse.ArgumentParser(prog=args.workflow), t, namemap)
  File "/home/samuel/.pyenv/versions/2.7.11/lib/python2.7/site-packages/cwltool/main.py", line 283, in generate_parser
    help=ahelp, action=action, type=atype, default=default)
  File "/home/samuel/.pyenv/versions/2.7.11/lib/python2.7/argparse.py", line 1294, in add_argument
    action = action_class(**kwargs)
TypeError: __init__() got an unexpected keyword argument 'type'

CWLtool issue with passing a list of scatter/gathered outputs into a program that accepts a list

This works:

#!/usr/bin/env cwl-runner
class: Workflow
requirements:
  - class: ScatterFeatureRequirement
inputs:
  - id: "#mutation"
    type: string
  - id: "#normalin"
    type: File
  - id: "#tumorin"
    type: File
outputs:
  - id: "#outfile"
    type: File
    source: "#wc.outfile"
steps:
  - id: "#greptumor"
    run: {import: grep.cwl.yaml}
    #scatter: "#grep.infile"
    inputs:
      - id: "#grep.pattern"
        source: "#mutation"
      - id: "#grep.infile"
        source: "#tumorin"
    outputs:
      - id: "#greptumor.outfile"
  - id: "#grepnormal"
    run: {import: grep.cwl.yaml}
    #scatter: "#grep.infile"
    inputs:
      - id: "#grep.pattern"
        source: "#mutation"
      - id: "#grep.infile"
        source: "#normalin"
    outputs:
      - id: "#grepnormal.outfile"
  - id: "#wc"
    run: {import: wc.cwl.yaml}
    inputs:
      - id: "#wc.infile"
        source: ["#grepnormal.outfile",  "#greptumor.outfile"]
    outputs:
      - id: "#wc.outfile"

This does not work, when I try to make it run grep scattered over multiple normal/tumor input files:

#!/usr/bin/env cwl-runner
class: Workflow
requirements:
  - class: ScatterFeatureRequirement
inputs:
  - id: "#mutation"
    type: string
  - id: "#normalin"
    type: {type: array, items: File}
  - id: "#tumorin"
    type: {type: array, items: File}
outputs:
  - id: "#outfile"
    type: File
    source: "#wc.outfile"
steps:
  - id: "#greptumor"
    run: {import: grep.cwl.yaml}
    scatter: "#grep.infile"
    inputs:
      - id: "#grep.pattern"
        source: "#mutation"
      - id: "#grep.infile"
        source: "#tumorin"
    outputs:
      - id: "#greptumor.outfile"
  - id: "#grepnormal"
    run: {import: grep.cwl.yaml}
    scatter: "#grep.infile"
    inputs:
      - id: "#grep.pattern"
        source: "#mutation"
      - id: "#grep.infile"
        source: "#normalin"
    outputs:
      - id: "#grepnormal.outfile"
  - id: "#wc"
    run: {import: wc.cwl.yaml}
    inputs:
      - id: "#wc.infile"
        source: ["#grepnormal.outfile",  "#greptumor.outfile"]
    outputs:
      - id: "#wc.outfile"

Interestingly, this small change does work, when you pass only one of the scatter/gathered inputs into the final wc.

#!/usr/bin/env cwl-runner
class: Workflow
requirements:
  - class: ScatterFeatureRequirement
inputs:
  - id: "#mutation"
    type: string
  - id: "#normalin"
    type: {type: array, items: File}
  - id: "#tumorin"
    type: {type: array, items: File}
outputs:
  - id: "#outfile"
    type: File
    source: "#wc.outfile"
steps:
  - id: "#greptumor"
    run: {import: grep.cwl.yaml}
    scatter: "#grep.infile"
    inputs:
      - id: "#grep.pattern"
        source: "#mutation"
      - id: "#grep.infile"
        source: "#tumorin"
    outputs:
      - id: "#greptumor.outfile"
  - id: "#grepnormal"
    run: {import: grep.cwl.yaml}
    scatter: "#grep.infile"
    inputs:
      - id: "#grep.pattern"
        source: "#mutation"
      - id: "#grep.infile"
        source: "#normalin"
    outputs:
      - id: "#grepnormal.outfile"
  - id: "#wc"
    run: {import: wc.cwl.yaml}
    inputs:
      - id: "#wc.infile"
        source: ["#grepnormal.outfile"]
    outputs:
      - id: "#wc.outfile"

So it seems this is an issue only when the program accepts an array of inputs and you want to pass two of the scatter/gathered outputs into a final step. Maybe the data structure produced by the scatter/gather is malformed when you put it into a list? The error is about a dict being expected:

> cwltool mutationfinder3.cwl.yaml --mutation GCATCCA --normalin normal.fastq --tumorin tumor.fastq 
/anaconda/bin/cwltool 1.0.20151026181844
[job 4386765520] /var/folders/8r/g00jq11j2yb586cz04tfpydc0000gn/T/tmpm8fJ5E$ grep GCATCCA /Users/john/workspaces/commonwl_examples/t790m_detector/normal.fastq > /var/folders/8r/g00jq11j2yb586cz04tfpydc0000gn/T/tmpm8fJ5E/out.txt
[job 4386765840] /var/folders/8r/g00jq11j2yb586cz04tfpydc0000gn/T/tmpUclOsb$ grep GCATCCA /Users/john/workspaces/commonwl_examples/t790m_detector/tumor.fastq > /var/folders/8r/g00jq11j2yb586cz04tfpydc0000gn/T/tmpUclOsb/out.txt
Workflow error:
  Error validating input record, could not validate field `infile` because
  At position 0
    `[{'checksum': 'sha1$e62341a47da9cb155ec0f531b525a97d2bd86455',
      'class': 'File',
      'path': '/var/folders/8r/g00jq11j2yb586cz04tfpydc0000gn/T/tmpm8fJ5E/out.txt'[...]`
     is not a dict
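For what it's worth, later drafts spell out how to combine two upstream outputs into one array-typed input: the step input lists multiple sources plus a linkMerge behavior, guarded by MultipleInputFeatureRequirement. A hedged sketch of that shape (field names per the draft-3/v1.0 specs; whether this applies to the draft in use here is an assumption):

```yaml
requirements:
  - class: ScatterFeatureRequirement
  - class: MultipleInputFeatureRequirement
steps:
  - id: "#wc"
    run: {import: wc.cwl.yaml}
    inputs:
      - id: "#wc.infile"
        source: ["#grepnormal.outfile", "#greptumor.outfile"]
        # flatten the two scattered File arrays into one array of Files
        linkMerge: merge_flattened
    outputs:
      - id: "#wc.outfile"
```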

Document how to run the cwltool tests

I tried this, based on the install from source instructions and typical Python setup.py usage:

$ git clone https://github.com/common-workflow-language/cwltool.git
...
$ cd cwltool
$ python2.7 setup.py build
...
$ python2.7 setup.py test
running test
running egg_info
writing requirements to cwltool.egg-info/requires.txt
writing cwltool.egg-info/PKG-INFO
writing top-level names to cwltool.egg-info/top_level.txt
writing dependency_links to cwltool.egg-info/dependency_links.txt
writing entry points to cwltool.egg-info/entry_points.txt
reading manifest file 'cwltool.egg-info/SOURCES.txt'
writing manifest file 'cwltool.egg-info/SOURCES.txt'
running build_ext
test_factory (tests.test_examples.TestFactory) ... [job 139890816825808] /tmp/tmpqoAycd$ echo foo > /tmp/tmpqoAycd/out.txt
'echo' not found
[job 139890816825808] completed permanentFail
Final process status is permanentFail
ERROR
test_params (tests.test_examples.TestParamMatching) ... ok

======================================================================
ERROR: test_factory (tests.test_examples.TestFactory)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/mnt/shared/users/xxx/repositories/cwltool/tests/test_examples.py", line 107, in test_factory
    self.assertEqual(echo(inp="foo"), {"out": "foo\n"})
  File "/mnt/shared/users/xxx/repositories/cwltool/cwltool/factory.py", line 11, in __call__
    return self.factory.executor(self.t, kwargs, os.getcwd(), None, **self.factory.execkwargs)
  File "/mnt/shared/users/xxx/repositories/cwltool/cwltool/main.py", line 171, in single_job_executor
    raise workflow.WorkflowException("Process status is %s" % (final_status))
WorkflowException: Process status is ['permanentFail']

----------------------------------------------------------------------
Ran 2 tests in 3.838s

FAILED (errors=1)

As an aside, the apparent failure to run the echo command is odd:

$ echo foo
foo
$ more /tmp/tmpQWsOo2/out.txt
foo

I am assuming this failure is down to the setup, rather than a bug in the test suite?

See also #37 for using the tests within TravisCI.

Validation error... could not validate field `secondaryFiles`

With

$ cwl-runner --version
/home/ubuntu/.virtualenvs/p2/bin/cwl-runner 1.0.20151124040259

using input:
https://github.com/jeremiahsavage/workflows/blob/587d3fba885f27289643280d11ad0e09ff2a1f32/tools/remove_qcfail.cwl.yaml
I get error below.

Version cwltool-1.0.20151110030107 does not give this error, while cwltool-1.0.20151121032923 does. I could bisect further on other versions if that would be useful, using https://pypi.python.org/simple/cwltool/

$ cwl-runner /mnt/SCRATCH/tools/workflows/tools/remove_qcfail.cwl.yaml --first_input_bam ~/SCRATCH/47b42e81-2500-4ebc-a0c2-acd3187cc2f0/C440.TCGA-IN-8462-01A-11D-2340-08.1.bam --second_input_bam ~/SCRATCH/47b42e81-2500-4ebc-a0c2-acd3187cc2f0/realn/md/C440.TCGA-IN-8462-01A-11D-2340-08.1.bam --uuid test
/home/ubuntu/.virtualenvs/p2/bin/cwl-runner 1.0.20151124040259
Tool definition failed validation:
Validation error in object file:///mnt/SCRATCH/tools/workflows/tools/remove_qcfail.cwl.yaml
  Could not validate as `CommandLineTool` because
    could not validate field `inputs` because
      At position 0
        could not validate field `inputBinding` because
          the value `{'prefix': '--first_bam',
           'secondaryFiles': [{'engine': 'node-engine.cwl',
                               'script': '{\n  return {"path": $job[\'first_input_bam\'].path.[...]`
           is not a valid type in the union, expected one of:
          - null, but
             the value `{'prefix': '--first_bam',
             'secondaryFiles': [{'engine': 'node-engine.cwl',
                                 'script': '{\n  return {"path": $job[\'first_input_bam\'].path.[...]` is not null
          - CommandLineBinding, but
             could not validate field `secondaryFiles` because
              the value `[{'engine': 'node-engine.cwl',
                'script': '{\n  return {"path": $job[\'first_input_bam\'].path.slice(0,-4)+".bai", "class": "File"};\n}\n'}]`
               is not a valid type in the union, expected one of:
              - null, but
                 the value `[{'engine': 'node-engine.cwl',
                  'script': '{\n  return {"path": $job[\'first_input_bam\'].path.slice(0,-4)+".bai", "class": "File"};\n}\n'}]` is not null
              - string, but
                 the value `[{'engine': 'node-engine.cwl',
                  'script': '{\n  return {"path": $job[\'first_input_bam\'].path.slice(0,-4)+".bai", "class": "File"};\n}\n'}]` is not string
              - Expression, but
                 the value `[{'engine': 'node-engine.cwl',
                  'script': '{\n  return {"path": $job[\'first_input_bam\'].path.slice(0,-4)+".bai", "class": "File"};\n}\n'}]`
                 is not a valid symbol in enum Expression, expected one of 'ExpressionPlaceholder'
              - array of <string or Expression>, but
                 At position 0
                  the value `{'engine': 'node-engine.cwl',
                   'script': '{\n  return {"path": $job[\'first_input_bam\'].path.slice(0,-4)+".bai", "class": "File"};\n}\n'}`
                   is not a valid type in the union, expected one of:
                  - string, but
                     the value `{'engine': 'node-engine.cwl',
                     'script': '{\n  return {"path": $job[\'first_input_bam\'].path.slice(0,-4)+".bai", "class": "File"};\n}\n'}` is not string
                  - Expression, but
                     the value `{'engine': 'node-engine.cwl',
                     'script': '{\n  return {"path": $job[\'first_input_bam\'].path.slice(0,-4)+".bai", "class": "File"};\n}\n'}`
                     is not a valid symbol in enum Expression, expected one of 'ExpressionPlaceholder'

  Could not validate as `Workflow` because
    could not validate field `inputs` because
      At position 0
        could not validate field `inputBinding` because
          the value `{'prefix': '--first_bam',
           'secondaryFiles': [{'engine': 'node-engine.cwl',
                               'script': '{\n  return {"path": $job[\'first_input_bam\'].path.[...]`
           is not a valid type in the union, expected one of:
          - null, but
             the value `{'prefix': '--first_bam',
             'secondaryFiles': [{'engine': 'node-engine.cwl',
                                 'script': '{\n  return {"path": $job[\'first_input_bam\'].path.[...]` is not null
          - Binding, but
             could not validate field `secondaryFiles` because
              the value `[{'engine': 'node-engine.cwl',
                'script': '{\n  return {"path": $job[\'first_input_bam\'].path.slice(0,-4)+".bai", "class": "File"};\n}\n'}]`
               is not a valid type in the union, expected one of:
              - null, but
                 the value `[{'engine': 'node-engine.cwl',
                  'script': '{\n  return {"path": $job[\'first_input_bam\'].path.slice(0,-4)+".bai", "class": "File"};\n}\n'}]` is not null
              - string, but
                 the value `[{'engine': 'node-engine.cwl',
                  'script': '{\n  return {"path": $job[\'first_input_bam\'].path.slice(0,-4)+".bai", "class": "File"};\n}\n'}]` is not string
              - Expression, but
                 the value `[{'engine': 'node-engine.cwl',
                  'script': '{\n  return {"path": $job[\'first_input_bam\'].path.slice(0,-4)+".bai", "class": "File"};\n}\n'}]`
                 is not a valid symbol in enum Expression, expected one of 'ExpressionPlaceholder'
              - array of <string or Expression>, but
                 At position 0
                  the value `{'engine': 'node-engine.cwl',
                   'script': '{\n  return {"path": $job[\'first_input_bam\'].path.slice(0,-4)+".bai", "class": "File"};\n}\n'}`
                   is not a valid type in the union, expected one of:
                  - string, but
                     the value `{'engine': 'node-engine.cwl',
                     'script': '{\n  return {"path": $job[\'first_input_bam\'].path.slice(0,-4)+".bai", "class": "File"};\n}\n'}` is not string
                  - Expression, but
                     the value `{'engine': 'node-engine.cwl',
                     'script': '{\n  return {"path": $job[\'first_input_bam\'].path.slice(0,-4)+".bai", "class": "File"};\n}\n'}`
                     is not a valid symbol in enum Expression, expected one of 'ExpressionPlaceholder'

            could not validate field `prefix` because it is not recognized and strict is True ([u'name', u'id'])

    could not validate field `outputs` because
      At position 0
        could not validate field `outputBinding` because
          the value `{'glob': '${\n  return "remove_qcfail/"+inputs[\'second_input_bam\'].path.split(\'/\').slice(-1)[0];\n}\n'}` is not a valid type in the union, expected one of:
          - null, but
             the value `{'glob': '${\n  return "remove_qcfail/"+inputs[\'second_input_bam\'].path.split(\'/\').slice(-1)[0];\n}\n'}` is not null
          - Binding, but
             could not validate field `glob` because it is not recognized and strict is True ([u'name', u'id'])

    missing required field `steps`
    could not validate field `baseCommand` because it is not recognized and strict is True ([u'name', u'id'])
  Could not validate as `ExpressionTool` because
    could not validate field `inputs` because
      At position 0
        could not validate field `inputBinding` because
          the value `{'prefix': '--first_bam',
           'secondaryFiles': [{'engine': 'node-engine.cwl',
                               'script': '{\n  return {"path": $job[\'first_input_bam\'].path.[...]`
           is not a valid type in the union, expected one of:
          - null, but
             the value `{'prefix': '--first_bam',
             'secondaryFiles': [{'engine': 'node-engine.cwl',
                                 'script': '{\n  return {"path": $job[\'first_input_bam\'].path.[...]` is not null
          - Binding, but
             could not validate field `secondaryFiles` because
              the value `[{'engine': 'node-engine.cwl',
                'script': '{\n  return {"path": $job[\'first_input_bam\'].path.slice(0,-4)+".bai", "class": "File"};\n}\n'}]`
               is not a valid type in the union, expected one of:
              - null, but
                 the value `[{'engine': 'node-engine.cwl',
                  'script': '{\n  return {"path": $job[\'first_input_bam\'].path.slice(0,-4)+".bai", "class": "File"};\n}\n'}]` is not null
              - string, but
                 the value `[{'engine': 'node-engine.cwl',
                  'script': '{\n  return {"path": $job[\'first_input_bam\'].path.slice(0,-4)+".bai", "class": "File"};\n}\n'}]` is not string
              - Expression, but
                 the value `[{'engine': 'node-engine.cwl',
                  'script': '{\n  return {"path": $job[\'first_input_bam\'].path.slice(0,-4)+".bai", "class": "File"};\n}\n'}]`
                 is not a valid symbol in enum Expression, expected one of 'ExpressionPlaceholder'
              - array of <string or Expression>, but
                 At position 0
                  the value `{'engine': 'node-engine.cwl',
                   'script': '{\n  return {"path": $job[\'first_input_bam\'].path.slice(0,-4)+".bai", "class": "File"};\n}\n'}`
                   is not a valid type in the union, expected one of:
                  - string, but
                     the value `{'engine': 'node-engine.cwl',
                     'script': '{\n  return {"path": $job[\'first_input_bam\'].path.slice(0,-4)+".bai", "class": "File"};\n}\n'}` is not string
                  - Expression, but
                     the value `{'engine': 'node-engine.cwl',
                     'script': '{\n  return {"path": $job[\'first_input_bam\'].path.slice(0,-4)+".bai", "class": "File"};\n}\n'}`
                     is not a valid symbol in enum Expression, expected one of 'ExpressionPlaceholder'

            could not validate field `prefix` because it is not recognized and strict is True ([u'name', u'id'])

    could not validate field `outputs` because
      At position 0
        could not validate field `outputBinding` because
          the value `{'glob': '${\n  return "remove_qcfail/"+inputs[\'second_input_bam\'].path.split(\'/\').slice(-1)[0];\n}\n'}` is not a valid type in the union, expected one of:
          - null, but
             the value `{'glob': '${\n  return "remove_qcfail/"+inputs[\'second_input_bam\'].path.split(\'/\').slice(-1)[0];\n}\n'}` is not null
          - Binding, but
             could not validate field `glob` because it is not recognized and strict is True ([u'name', u'id'])

    missing required field `expression`
    could not validate field `baseCommand` because it is not recognized and strict is True ([u'name', u'id'])
$ 
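The validator's complaint is that secondaryFiles is no longer accepted inside inputBinding. A sketch of the shape newer releases appear to expect, with secondaryFiles lifted to the input parameter itself; adapted from the snippet in the error above and not verified against this exact version:

```yaml
inputs:
  - id: "#first_input_bam"
    type: File
    # secondaryFiles now lives on the parameter, not inside inputBinding
    secondaryFiles:
      - engine: node-engine.cwl
        script: |
          {
          return {"path": $job['first_input_bam'].path.slice(0,-4)+".bai", "class": "File"};
          }
    inputBinding:
      prefix: --first_bam
```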

Float values not being recognized?

In my CWL I have:

  - id: evalue
    type: ["null", float]
    inputBinding:
      position: 4
      prefix: -evalue
      separate: true

And then in my test JSON file for it I've tried both:

"evalue": 2e-10,

and

"evalue": "2e-10",

But when I run I get:

/usr/local/bin/cwl-runner 1.0.20160331184641
Workflow error, try again with --debug for more information:
  Error validating input record, could not validate field `evalue` because
  the value `'2e-10'` is not a valid type in the union, expected one of:
  - null, but
     the value `'2e-10'` is not null
  - float, but
     the value `'2e-10'` is not float or double


[blast+-blastp.cwl.txt](https://github.com/common-workflow-language/cwltool/files/208737/blast.-blastp.cwl.txt)
[blast+-blastp.test.json.txt](https://github.com/common-workflow-language/cwltool/files/208736/blast.-blastp.test.json.txt)
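For reference, the type distinction the validator is drawing can be reproduced with Python's standard `json` module: the bare literal `2e-10` is a JSON number and loads as a float, while the quoted `"2e-10"` loads as a string, which matches neither branch of the `["null", float]` union.

```python
import json

# Bare 2e-10 is a JSON number and loads as a Python float; the quoted
# "2e-10" loads as a string, which is neither null nor float, matching
# the validation error above.
as_number = json.loads('{"evalue": 2e-10}')["evalue"]
as_string = json.loads('{"evalue": "2e-10"}')["evalue"]

print(type(as_number).__name__, type(as_string).__name__)  # float str
```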

Using --cachedir option leaves output of workflow in cache instead of the directory the workflow is run

The title basically sums it up. When running cwl-runner with --cachedir specified, the output that was previously placed in the directory where the workflow is run is instead left nested inside the cache.

Shouldn't the use of --cachedir be transparent to the function of the tool, leaving its "interface" the same? I.e., the output should go to the same place whether or not --cachedir is used.

inputs array of secondaryFiles not processed

I am attempting to associate a secondaryFile (BAI) with each File in an array (of BAMs).
I get the desired behavior with the Jan26 release [0], but the following Jan27 release [1] breaks our CWL.

This is the CWL snippet:

cwlVersion: "cwl:draft-2"
requirements:
  - import: node-engine.cwl
  - import: envvar-global.cwl
  - class: DockerRequirement
    dockerPull: quay.io/___
class: CommandLineTool
inputs:
  - id: "#input_bam_path"
    type:
      type: array
      items: File
      inputBinding:
        prefix: --bam_path
        secondaryFiles:
          - engine: node-engine.cwl
            script: |
              {
              return {"path": $self.path.slice(0,-4)+".bai", "class": "File"};
              }

Running with --debug gives the desired bindings with the Jan26 release [2], but not with the Jan27 release [3].

I ran a diff [4] of the two releases. It looks like secondaryFiles now live in the schema instead of the binding. Is the post-Jan27 behavior the intended one for our CWL?

[0]
https://pypi.python.org/packages/da/93/fd0885312894cda09ad4bcb04c7091ec7b6da15ab10e14f468cdc54caed5/cwltool-1.0.20160126211726.tar.gz

[1]
https://pypi.python.org/packages/8e/b3/c9326f44854d8ca71668070fb09746f4b0c1fe4f5d749de7db3d737eee88/cwltool-1.0.20160127144612.tar.gz

[2]

    {
        "secondaryFiles": [
            "${\nreturn {\"path\": self.path.slice(0,-4)+\".bai\", \"class\": \"File\"};\n}\n"
        ], 
        "prefix": "--bam_path", 
        "do_eval": {
            "path": "/tmp/job633706396_ubuntu/SCRATCH/47b42e81-2500-4ebc-a0c2-acd3187cc2f0_513/test/C440.TCGA-IN-8462-01A-11D-2340-08.1.bam", 
            "class": "File", 
            "secondaryFiles": [
                {
                    "path": "/tmp/job633706396_ubuntu/SCRATCH/47b42e81-2500-4ebc-a0c2-acd3187cc2f0_513/test/C440.TCGA-IN-8462-01A-11D-2340-08.1.bai", 
                    "class": "File"
                }
            ]
        }, 
        "valueFrom": {
            "path": "/tmp/job633706396_ubuntu/SCRATCH/47b42e81-2500-4ebc-a0c2-acd3187cc2f0_513/test/C440.TCGA-IN-8462-01A-11D-2340-08.1.bam", 
            "class": "File", 
            "secondaryFiles": [
                {
                    "path": "/tmp/job633706396_ubuntu/SCRATCH/47b42e81-2500-4ebc-a0c2-acd3187cc2f0_513/test/C440.TCGA-IN-8462-01A-11D-2340-08.1.bai", 
                    "class": "File"
                }
            ]
        }, 
        "position": [
            0, 
            0, 
            "input_bam_path", 
            "input_bam_path"
        ]
    }, 

[3]

    {
        "position": [
            0, 
            0, 
            "input_bam_path", 
            "input_bam_path"
        ], 
        "prefix": "--bam_path", 
        "do_eval": {
            "path": "/tmp/job557517512_test/C440.TCGA-IN-8462-01A-11D-2340-08.1.bam", 
            "class": "File"
        }, 
        "valueFrom": {
            "path": "/tmp/job557517512_test/C440.TCGA-IN-8462-01A-11D-2340-08.1.bam", 
            "class": "File"
        }
    }, 

[4]
https://gist.github.com/jeremiahsavage/c82b38027be30eccbf9b47361c8f7fbb

--outdir

I have not found a way to put the file into ./test-files without getting a cwltool error. I can either write it into that directory and get a workflow error, or have it in the current directory with no errors.

When I run cwltool --outdir ./test-files/ ./samtools-index.cwl ./jobs/samtools-index-job.json, the .bai file is created in ./test-files, but I get an error from cwltool:

Error while running job: Did not find output file with glob pattern: '[u'SRR1031972.Aligned.sortedByCoord.out.bam.bai']'
[job 4463785808] completed permanentFail
Final process status is permanentFail
Workflow error:
  Process status is ['permanentFail']

Or I can run cwltool ./samtools-index.cwl ./jobs/samtools-index-job.json, but then the file ends up in the current directory :(
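For context, here is a minimal sketch (purely illustrative, not cwltool's actual code) of how an output glob and --outdir are usually expected to interact: the glob is matched inside the job's working directory, and the matches are then relocated to the requested output directory.

```python
import glob
import os
import shutil
import tempfile

# Illustrative sketch: the tool writes into its working directory,
# the outputBinding glob is matched there, and matched files are
# then moved into the directory given by --outdir.
workdir = tempfile.mkdtemp()
outdir = tempfile.mkdtemp()
open(os.path.join(workdir, "SRR1031972.Aligned.sortedByCoord.out.bam.bai"), "w").close()

for match in glob.glob(os.path.join(workdir, "*.bai")):
    shutil.move(match, outdir)

print(os.listdir(outdir))
```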

`Error collecting output` for array of secondaryFiles

I'm attempting to use biobambam2 bamtofastq (wrapped in a Docker container) to split BAMs into an arbitrary number of paired-end FASTQ files (example pair: D1K4L.4_1.fq.gz & D1K4L.4_2.fq.gz). The CWL is at
https://github.com/jeremiahsavage/biobambam_tool/blob/a698b96ceee6e448ba9155e545a429a93f8eed55/biobambam2_bamtofastq.cwl.yaml

When run with a test BAM, I get the following error:

(p2_cwl) ubuntu@compute001:~/SCRATCH/47b42e81-2500-4ebc-a0c2-acd3187cc2f0_513$ cwltool --debug ~/cocleaning-cwl/tools/biobambam2_bamtofastq.cwl.yaml --input_bam C440.TCGA-IN-8462-01A-11D-2340-08.1.bam --uuid test
/home/ubuntu/.virtualenvs/p2_cwl/bin/cwltool 1.0.20160325210917
Parsed job order from command line: {
    "db_cred_s3url": null, 
    "uuid": "test", 
    "s3cfg_path": null, 
    "input_bam": {
        "path": "C440.TCGA-IN-8462-01A-11D-2340-08.1.bam", 
        "class": "File"
    }, 
    "id": "/home/ubuntu/cocleaning-cwl/tools/biobambam2_bamtofastq.cwl.yaml", 
    "job_order": null
}
[job 140175215645712] initializing from file:///home/ubuntu/cocleaning-cwl/tools/biobambam2_bamtofastq.cwl.yaml
[job 140175215645712] {
    "db_cred_s3url": null, 
    "uuid": "test", 
    "s3cfg_path": null, 
    "input_bam": {
        "path": "C440.TCGA-IN-8462-01A-11D-2340-08.1.bam", 
        "class": "File"
    }, 
    "job_order": null
}
[job 140175215645712] path mappings is {
    "C440.TCGA-IN-8462-01A-11D-2340-08.1.bam": [
        "/mnt/SCRATCH/47b42e81-2500-4ebc-a0c2-acd3187cc2f0_513/C440.TCGA-IN-8462-01A-11D-2340-08.1.bam", 
        "/var/lib/cwl/job203960872_47b42e81-2500-4ebc-a0c2-acd3187cc2f0_513/C440.TCGA-IN-8462-01A-11D-2340-08.1.bam"
    ]
}
[job 140175215645712] command line bindings is [
    {
        "position": [
            -1000000, 
            0
        ], 
        "valueFrom": "/home/ubuntu/.virtualenvs/p3/bin/python"
    }, 
    {
        "position": [
            -1000000, 
            1
        ], 
        "valueFrom": "/home/ubuntu/.virtualenvs/p3/lib/python3.4/site-packages/biobambam_tool/main.py"
    }, 
    {
        "position": [
            -1000000, 
            2
        ], 
        "valueFrom": "--tool_name"
    }, 
    {
        "position": [
            -1000000, 
            3
        ], 
        "valueFrom": "bamtofastq"
    }, 
    {
        "position": [
            0, 
            "db_cred_s3url"
        ], 
        "prefix": "--db_cred_s3url", 
        "valueFrom": null
    }, 
    {
        "position": [
            0, 
            "input_bam"
        ], 
        "prefix": "--bam_path", 
        "valueFrom": {
            "path": "/var/lib/cwl/job203960872_47b42e81-2500-4ebc-a0c2-acd3187cc2f0_513/C440.TCGA-IN-8462-01A-11D-2340-08.1.bam", 
            "containerfs": true, 
            "class": "File"
        }
    }, 
    {
        "position": [
            0, 
            "s3cfg_path"
        ], 
        "prefix": "--s3cfg_path", 
        "valueFrom": null
    }, 
    {
        "position": [
            0, 
            "uuid"
        ], 
        "prefix": "--uuid", 
        "valueFrom": "test"
    }
]
[job 140175215645712] /mnt/SCRATCH/47b42e81-2500-4ebc-a0c2-acd3187cc2f0_513$ docker run -i --volume=/mnt/SCRATCH/47b42e81-2500-4ebc-a0c2-acd3187cc2f0_513/C440.TCGA-IN-8462-01A-11D-2340-08.1.bam:/var/lib/cwl/job203960872_47b42e81-2500-4ebc-a0c2-acd3187cc2f0_513/C440.TCGA-IN-8462-01A-11D-2340-08.1.bam:ro --volume=/mnt/SCRATCH/47b42e81-2500-4ebc-a0c2-acd3187cc2f0_513:/var/spool/cwl:rw --volume=/tmp/tmp15xeVh:/tmp:rw --workdir=/var/spool/cwl --read-only=true --net=none --user=1000 --rm --env=TMPDIR=/tmp --env=PATH=/usr/local/bin/:/usr/bin:/bin quay.io/jeremiahsavage/biobambam_tool /home/ubuntu/.virtualenvs/p3/bin/python /home/ubuntu/.virtualenvs/p3/lib/python3.4/site-packages/biobambam_tool/main.py --tool_name bamtofastq --bam_path /var/lib/cwl/job203960872_47b42e81-2500-4ebc-a0c2-acd3187cc2f0_513/C440.TCGA-IN-8462-01A-11D-2340-08.1.bam --uuid test
Error while running job: Error collecting output for parameter 'output_fastq': Output file path /mnt/SCRATCH/47b42e81-2500-4ebc-a0c2-acd3187cc2f0_513/fastq/D1K4L.4_2.fq.gz must be within designated output directory (/var/spool/cwl) or an input file pass through.
[job 140175215645712] completed permanentFail
[job 140175215645712] {}
Final process status is permanentFail
[job 140175215645712] Removing temporary directory /tmp/tmp15xeVh
Workflow error, try again with --debug for more information:
  Process status is ['permanentFail']
Traceback (most recent call last):
  File "/home/ubuntu/.virtualenvs/p2_cwl/local/lib/python2.7/site-packages/cwltool/main.py", line 581, in main
    eval_timeout=args.eval_timeout
  File "/home/ubuntu/.virtualenvs/p2_cwl/local/lib/python2.7/site-packages/cwltool/main.py", line 183, in single_job_executor
    raise workflow.WorkflowException("Process status is %s" % (final_status))
WorkflowException: Process status is ['permanentFail']
(p2_cwl) ubuntu@compute001:~/SCRATCH/47b42e81-2500-4ebc-a0c2-acd3187cc2f0_513$ 

Validation error when importing docker definition file as hint

How to reproduce:

  1. Create this tool definition file, echo.cwl:

    #!/usr/bin/env cwl-runner
    
    cwlVersion: "cwl:draft-3"
    
    class: CommandLineTool
    
    hints:
      - $import: echo-docker.yml
    
    inputs:
      - id: "text"
        type: [string]
        description: |
          the text to echo
        inputBinding:
          position: 1
    
    outputs: []
    
    baseCommand:
      - echo
  2. Create this docker definition file, echo-docker.yml:

    class: DockerRequirement
    dockerPull: ubuntu:16.04
    dockerFile: |
      ### Base Image
      FROM ubuntu:16.04
  3. Create this job input file:

    text: Hello, world!
  4. Execute:

    cwl-runner --no-container echo.cwl echo_testjob.yml 

Output received:

```bash
/home/samuel/.pyenv/versions/2.7.11/bin/cwl-runner 1.0.20160427142240
Tool definition failed validation:
Validating hint `DockerRequirement`: could not validate field `id` because it is not recognized and strict is True, valid fields are: class, dockerPull, dockerLoad, dockerFile, dockerImport, dockerImageId, dockerOutputDirectory
could not validate field `name` because it is not recognized and strict is True, valid fields are: class, dockerPull, dockerLoad, dockerFile, dockerImport, dockerImageId, dockerOutputDirectory
```

Expected output:

Something else.

OS: Xubuntu 16.04 64bit
CWL Tool: 1.0.20160427142240
Python: 2.7.11
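A guess at what is happening, based on the "valid fields" list in the error: the $import mechanism appears to annotate the imported fragment with `id`/`name` metadata, which the strict DockerRequirement schema then rejects. A hypothetical Python sketch of one workaround direction (stripping loader metadata before validation; names are illustrative, not cwltool's API):

```python
# Hypothetical sketch: drop loader-injected metadata ("id", "name")
# before validating an imported hint against the strict field list
# reported in the error message above.
ALLOWED_FIELDS = {"class", "dockerPull", "dockerLoad", "dockerFile",
                  "dockerImport", "dockerImageId", "dockerOutputDirectory"}

imported_hint = {
    "class": "DockerRequirement",
    "dockerPull": "ubuntu:16.04",
    "id": "echo-docker.yml",   # assumed to be injected by the $import loader
}

clean_hint = {k: v for k, v in imported_hint.items() if k in ALLOWED_FIELDS}
print(sorted(clean_hint))  # ['class', 'dockerPull']
```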

Exact definition of --tmp-outdir-prefix and --tmpdir-prefix is ambiguous

I've hit what I think is an inconsistency in the expectation of these two arguments. When the arguments for the cwl-runner are first parsed, there are a couple of checks to make sure that the paths provided in these two arguments refer to actual directories:

cwltool/main.py:350

    if args.tmp_outdir_prefix != 'tmp':
        # Use user defined temp directory (if it exists)
        args.tmp_outdir_prefix = os.path.abspath(args.tmp_outdir_prefix)
        if not os.path.exists(args.tmp_outdir_prefix):
            _logger.error("Intermediate output directory prefix doesn't exist, reverting to default")
            return 1

    if args.tmpdir_prefix != 'tmp':
        # Use user defined prefix (if the folder exists)
        args.tmpdir_prefix = os.path.abspath(args.tmpdir_prefix)
        if not os.path.exists(args.tmpdir_prefix):
            _logger.error("Temporary directory prefix doesn't exist.")
            return 1

However, when these arguments are actually used, they aren't required to be, and in fact, will be used like the prefixes to directories that are created during the CWL run:

draft2tool.py:137

        if dockerReq and kwargs.get("use_container"):
            out_prefix = kwargs.get("tmp_outdir_prefix")
            j.outdir = kwargs.get("outdir") or tempfile.mkdtemp(prefix=out_prefix)
            tmpdir_prefix = kwargs.get('tmpdir_prefix')
            j.tmpdir = kwargs.get("tmpdir") or tempfile.mkdtemp(prefix=tmpdir_prefix)
        else:
            j.outdir = builder.outdir
            j.tmpdir = builder.tmpdir

Is it possible to either clarify in the documentation exactly how these arguments are expected to be used, or reconcile the usage in the code? My two cents: make the first code block check that the basedir of each argument exists, and then use the arguments as prefixes to which random strings will be appended.
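The prefix semantics that the second code block relies on can be demonstrated with `tempfile` directly: `mkdtemp` treats its argument as a name prefix, so only the directory component of the prefix needs to exist beforehand.

```python
import os
import tempfile

# mkdtemp appends a random suffix to the given prefix, so only the
# dirname component of the prefix must already exist, not the full
# prefix path itself.
prefix = os.path.join(tempfile.gettempdir(), "cwl-out-")
created = tempfile.mkdtemp(prefix=prefix)
basename = os.path.basename(created)
os.rmdir(created)  # clean up the empty directory

print(basename.startswith("cwl-out-"))  # True
```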

Upgrading to latest version of cwl-runner breaks command line inputs

With the previous version of cwl-runner I could do this:

cwl-runner cat.cwl --input README.md 

using the following workflow (performs a simple cat):

#!/usr/bin/env cwl-runner
cwlVersion: "cwl:draft-3"
class: CommandLineTool
description: "This tool is developed for SMC-RNA Challenge for detecting gene fusions (STAR fusion)"
inputs:
  #Give it a list of input files
  - id: input
    type: File
    inputBinding:
      position: 0 
outputs:
  - id: output
    type: File
    outputBinding:
      glob: test.txt
stdout: test.txt
baseCommand: [cat]

and I would get a result:

Final process status is success
{
    "output": {
        "size": 68, 
        "path": "/Users/spanglry/Code/importers/test.txt", 
        "checksum": "sha1$3f9848f75373e501d9832e9d47899b6ee3cfa780", 
        "class": "File"
    }
}

And the contents of README.md would be present in the output file test.txt.

Today I upgraded to the latest cwl-runner and the exact same process now fails!

/usr/local/bin/cwl-runner 1.0.20160425140546
Traceback (most recent call last):
  File "/usr/local/bin/cwl-runner", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/site-packages/cwltool/main.py", line 590, in main
    stdout=stdout)
  File "/usr/local/lib/python2.7/site-packages/cwltool/main.py", line 434, in load_job_order
    cmd_line = vars(toolparser.parse_args(args.job_order))
  File "/usr/local/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/argparse.py", line 1701, in parse_args
    args, argv = self.parse_known_args(args, namespace)
  File "/usr/local/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/argparse.py", line 1733, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "/usr/local/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/argparse.py", line 1939, in _parse_known_args
    start_index = consume_optional(start_index)
  File "/usr/local/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/argparse.py", line 1879, in consume_optional
    take_action(action, args, option_string)
  File "/usr/local/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/argparse.py", line 1807, in take_action
    action(self, namespace, argument_values, option_string)
  File "/usr/local/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/argparse.py", line 802, in __call__
    raise NotImplementedError(_('.__call__() not defined'))
NotImplementedError: .__call__() not defined

Any idea what happened?

Enable reverse lookup of files when returning JSON

This summarizes a local discussion with @tetron.

Currently when running a CWL workflow inside a Docker container, there is no reverse mapping of outputs back to the original file locations.

When you are re-shaping input this results in not being able to find returned output. For example, we have code that re-groups BAM files into batches for simultaneous variant calling but returning regrouped JSON with a path to the original file (inside the Docker container) results in errors downstream since the file is not remapped externally.

The implementation fix is to add reverse lookups to collect_output_ports. This requires adding a reversemap to DockerPathMapper and using it to remap with adjustFiles.
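The proposed fix can be sketched in a few lines (illustrative names only, not the actual DockerPathMapper API): invert the host-to-container mapping and apply it to paths coming back out of the container.

```python
# Illustrative sketch of the proposed reverse lookup: invert the
# forward host -> container path mapping and use it to rewrite
# output paths reported from inside the container.
forward_map = {
    "/data/batch1/sample.bam": "/var/lib/cwl/job1/sample.bam",
}
reverse_map = {container: host for host, container in forward_map.items()}

def remap_output(path):
    # Fall back to the original path when it was never mapped.
    return reverse_map.get(path, path)

print(remap_output("/var/lib/cwl/job1/sample.bam"))
```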

The path is not passed correctly from one step to the next in a workflow.

Hi,
I have been facing this problem while executing a tool specification I wrote for GATK tools, after adding a Docker requirement to it. Before adding the Docker requirement the workflow worked fine, but after adding it I ran into the issues below.
Following are the links to the test data used (assuming hg19.fa and the supplementary reference files are available; otherwise I will share those too):

Test data:
https://www.dropbox.com/sh/01vvou1p6i4w2ai/AADmVyVMSq0XQ2FrNfyo2MOfa?dl=0
Workflow and tool specification files: https://www.dropbox.com/sh/ukvz4vh0zg57ubw/AADDdP46SHkyDOCUgwxp_lz7a?dl=0

All the test files are included in the test_data directory along with the variant files. The last step (realignTargetCreator) fails to execute and generates the following error:

INFO  02:49:08,676 HelpFormatter - -------------------------------------------------------------------------------- 
INFO  02:49:08,678 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.4-0-gf196186, Compiled 2015/09/27 12:19:32 
INFO  02:49:08,679 HelpFormatter - Copyright (c) 2010 The Broad Institute 
INFO  02:49:08,679 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk 
INFO  02:49:08,683 HelpFormatter - Program Args: -T RealignerTargetCreator -R /tmp/job925415211_tools/outputFiles/hg19.fa -I /tmp/job925415211_tmpOjuEb6/markDups.bam --known /tmp/job925415211_tools/dbsnp_138.vcf --known /tmp/job925415211_Ref_datasets/Mills_and_1000G_gold_st.vcf --known /tmp/job925415211_Ref_datasets/1000G_phase1.indels.hg19.vcf -o realignTargetCreator_output.intervals 
INFO  02:49:08,698 HelpFormatter - Executing as biodocker@578307327c0a on Linux 4.0.9-boot2docker amd64; OpenJDK 64-Bit Server VM 1.7.0_79-b14. 
INFO  02:49:08,699 HelpFormatter - Date/Time: 2015/12/09 02:49:08 
INFO  02:49:08,699 HelpFormatter - -------------------------------------------------------------------------------- 
INFO  02:49:08,700 HelpFormatter - -------------------------------------------------------------------------------- 
INFO  02:49:08,798 GenomeAnalysisEngine - Strictness is SILENT 
INFO  02:49:08,932 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000 
INFO  02:49:08,941 SAMDataSource$SAMReaders - Initializing SAMRecords in serial 
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A USER ERROR has occurred (version 3.4-0-gf196186): 
##### ERROR
##### ERROR This means that one or more arguments or inputs in your command are incorrect.
##### ERROR The error message below tells you what is the problem.
##### ERROR
##### ERROR If the problem is an invalid argument, please check the online documentation guide
##### ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
##### ERROR
##### ERROR Visit our website and forum for extensive documentation and answers to 
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
##### ERROR
##### ERROR MESSAGE: Couldn't read file /tmp/job925415211_tmpOjuEb6/markDups.bam because java.io.FileNotFoundException: /tmp/job925415211_tmpOjuEb6/markDups.bam (Is a directory)

When the workflow is executed in --debug mode, I observed that all the other tools use the original tmp dir location of the file during input binding when it is used as a source for the next step, whereas realignTargetCreator behaves differently.

I am not sure where the root of the error might be; after discussing it in the CWL Gitter chat room I am opening this issue. As far as I understand, the mapping of the path is not correct, but I might be wrong. realignTargetCreator uses a GATK Docker image that might not be detecting the file's presence.

To add further, the tool works perfectly fine when run alone.
Any help will be much appreciated,
Thank you.

CreateFileRequirement: input_dir and work_dir are the same

Gitter log:

samtools-faidx.cwl

@portah 12:26
But this solution will only work if out_dir is different from input_dir.
So as part of a workflow it is going to work, but run standalone in the current dir it will generate an error about the file already existing.
Is there a way to rename a file after execution?

@portah 12:37

[job 4415558160] /Users/porter/Work/scidap/workflows/tools$ docker run -i --volume=/Users/porter/Work/scidap/workflows/tools/./test-files/mm10.fa:/tmp/job877968402_test-files/mm10.fa:ro --volume=/Users/porter/Work/scidap/workflows/tools:/tmp/job_output:rw --volume=/var/folders/hx/3qsmpl9s50zdmn49jb03l_tw0000gn/T/tmpF_Wh4u:/tmp/job_tmp:rw --workdir=/tmp/job_output --user=1000 --rm --env=TMPDIR=/tmp/job_tmp --env=PATH=/usr/local/bin/:/usr/bin:/bin scidap/samtools:v1.2-216-gdffc67f samtools faidx mm10.fa
Final process status is success
{
    "index_result": {
        "path": "/Users/porter/Work/scidap/workflows/tools/mm10.fa.fai", 
        "size": 689, 
        "class": "File", 
        "checksum": "sha1$e7fef44bbe095319668b0c5a903d5b8ffa793080"
    }
}%

But to run it standalone, the original input file has to be in a separate directory,
so I've run it as ./samtools-faidx.cwl --input=./test-files/mm10.fa

@tetron 13:34
@portah well, it could probably be smarter and not generate an error in that case
@portah when input_dir and work_dir are the same

@portah 13:41
Making pull request for samtools-faidx.cwl

parsing error for a simple workflow

Hello, I've tried a simple workflow but I got an error message and couldn't figure out what I did wrong. Could anyone help me with it? I'd appreciate it.
By the way, I checked that the indentation is okay and I didn't use any tabs instead of spaces.
Could it be a bug in the parser? I hope it's something easy I can fix. Thanks in advance.

-bash-4.1$ cat hello.workflow.cwl 
cwlVersion: cwl:draft-3
class: Workflow
inputs:
  - id: input_message
    type: string

outputs:
  - id: finaloutput
    type: File
    source: "#wc/outputfile"

steps:
  - id: hello2
    run: hello2.cwl
    inputs:
      - id: message
        source: "#input_message"
    outputs:
      - id: output

   - id: wc
     run: wc.cwl
     inputs:
       - id: inputfile
         source: "#hello2/output"
     outputs:
       - id: outputfile

-bash-4.1$ cat hello2.cwl 
cwlVersion: cwl:draft-3
class: CommandLineTool
baseCommand: echo
stdout: output.txt
inputs:
  - id: message
    type: string
    inputBinding:
      position: 1
outputs:
  - id: output
    type: File
    outputBinding:
      glob: output.txt

-bash-4.1$ cat wc.cwl 
cwlVersion: cwl:draft-3
class: CommandLineTool
baseCommand: wc
stdout: wc.output.txt
inputs:
  - id: inputfile
    type: File
    inputBinding:
      position: 1
outputs:
  - id: outputfile
    type: File
    outputBinding:
      glob: wc.output.txt

-bash-4.1$ cat hello.workflow-job1.yml 
input_message: Yay! Hello world!


-bash-4.1$ cwl-runner hello.workflow.cwl hello.workflow-job1.yml
/home/duplexa/venv/cwl/bin/cwl-runner 1.0.20160504183010
I'm sorry, I couldn't load this CWL file, try again with --debug for more information.
Syntax error while parsing a block collection
  in "file:///home/duplexa/emsar.workflow/hello.workflow.cwl", line 13, column 3
expected <block end>, but found '<block sequence start>'
  in "file:///home/duplexa/emsar.workflow/hello.workflow.cwl", line 21, column 4

bug of dry-run

Hi,
I used cwl-runner to run the echo example you show with --dry-run, and cwl-runner threw: `if final_status[0] != "success": IndexError: list index out of range`
It works normally without --dry-run.
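The traceback fragment suggests the collected status list stays empty when --dry-run skips all jobs, so indexing `final_status[0]` raises. A defensive sketch (illustrative, not cwltool's actual code):

```python
# With --dry-run no jobs execute, so the collected status list is
# empty and final_status[0] raises IndexError. Checking emptiness
# first avoids the crash.
final_status = []  # what a dry run would produce (assumed)

succeeded = bool(final_status) and final_status[0] == "success"
print(succeeded)  # False, with no IndexError
```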

dockerFile

How do I use dockerFile?
As I understand it, it works together with dockerImageId; both of them have to be specified!

It would be nice to try them in some order:

  • dockerImageId
  • dockerPull
  • dockerFile
requirements:
  - "@import": envvar-global.cwl
  - class: InlineJavascriptRequirement
  - class: DockerRequirement
    dockerImageId: scidap/star:v2.5.0a
    dockerPull: scidap/star:v2.5.0a
    dockerFile: |
      #################################################################
      # Dockerfile
      #
      # Software:         STAR
      # Software Version: 2.5.0a
      # Description:      STAR image for SciDAP
      # Website:          https://github.com/alexdobin/STAR, http://scidap.com/
      # Provides:         STAR
      # Base Image:       scidap/scidap:v0.0.1
      # Build Cmd:        docker build --rm -t scidap/star:v2.5.0a .
      # Pull Cmd:         docker pull scidap/star:v2.5.0a
      # Run Cmd:          docker run --rm scidap/star:v2.5.0a STAR
      #################################################################

      ### Base Image
      FROM scidap/scidap:v0.0.1
      MAINTAINER Andrey V Kartashov "[email protected]"
      ENV DEBIAN_FRONTEND noninteractive

      ################## BEGIN INSTALLATION ######################

      WORKDIR /tmp

      ### Install STAR

      ENV VERSION 2.5.0a
      ENV NAME STAR
      ENV URL "https://github.com/alexdobin/STAR/archive/${NAME}_${VERSION}.tar.gz"

      RUN wget -q -O - $URL | tar -zxv && \
      cd ${NAME}-${NAME}_${VERSION}/source && \
      make -j 4 && \
      cd .. && \
      cp ./bin/Linux_x86_64_static/STAR /usr/local/bin/ && \
      cd .. && \
      strip /usr/local/bin/${NAME}; true && \
      rm -rf ./${NAME}-${NAME}_${VERSION}/
/usr/local/bin/cwltool 1.0.20151126171959
Got workflow error
Traceback (most recent call last):
  File "build/bdist.macosx-10.11-intel/egg/cwltool/main.py", line 159, in single_job_executor
    r.run(**kwargs)
  File "build/bdist.macosx-10.11-intel/egg/cwltool/job.py", line 56, in run
    img_id = docker.get_from_requirements(docker_req, docker_is_req, pull_image)
  File "build/bdist.macosx-10.11-intel/egg/cwltool/docker.py", line 95, in get_from_requirements
    if get_image(r, pull_image, dry_run):
  File "build/bdist.macosx-10.11-intel/egg/cwltool/docker.py", line 21, in get_image
    sp = dockerRequirement["dockerImageId"].split(":")
KeyError: 'dockerImageId'
Workflow error:
  'dockerImageId'
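The fallback order suggested above could look roughly like this (hypothetical sketch; not the actual cwltool/docker.py code):

```python
# Hypothetical fallback: try the DockerRequirement fields in the
# order suggested above instead of assuming dockerImageId is always
# present, which is what triggers the KeyError in the traceback.
def pick_image_source(docker_req):
    for field in ("dockerImageId", "dockerPull", "dockerFile"):
        if field in docker_req:
            return field, docker_req[field]
    raise ValueError("DockerRequirement specifies no image source")

field, value = pick_image_source({"dockerPull": "scidap/star:v2.5.0a"})
print(field, value)  # dockerPull scidap/star:v2.5.0a
```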

Allowing directories to be assigned as a File type

Hello,

It appears that cwltool no longer allows you to assign a path as a File, which makes it difficult to mount a directory into the docker run. This is especially problematic for things like variant annotation software, which often have a particular directory structure containing hundreds of annotation files.

It seems like the commit present here is potentially the culprit. This functionality worked with cwltool version 1.0.20160129183049, but it definitely doesn't work in the current one. How can we handle this issue?

cwltool uses both default and job_order paths

Long story short:

It appears as though cwltool validates that both the default and job_order paths exist.
Ideally, it would validate only the job_order paths when provided, and the default paths otherwise.

Long story:

A toy command and file tree to demonstrate this follows
Required files: https://gist.github.com/denis-yuen/a1d36345ecc4bdc1580a

.
├── 8e888694-9c56-4529-a750-d6bfbd4a74e7.txt -> inputs/0bf6ab9f-83a8-4b72-ad67-460eb696bd64/ref_file_2
├── b4cdad91-676a-446c-a635-57453f17617a.txt -> inputs/d2816fb8-7773-4b2a-bd31-ee4a7efa8e04/ref_file_1
├── configs
├── foo.json
├── image-descriptor.cwl
├── inputs
│   ├── 0bf6ab9f-83a8-4b72-ad67-460eb696bd64
│   │   └── ref_file_2
│   ├── 0e129840-fa77-4cec-a885-19b1e538507a
│   │   └── hello_input
│   └── d2816fb8-7773-4b2a-bd31-ee4a7efa8e04
│       └── ref_file_1
├── logs
├── node-engine.cwl
├── outputs
│   └── hello-output.txt
└── working

When I run the following command, it executes successfully. ( cwltool --outdir /home/dyuen/consonance/consonance-arch/./datastore/launcher-03ad4441-c074-4893-b5bd-6d14aabb4995/outputs/ image-descriptor.cwl foo.json )

However, note the symbolic links in the root. These correspond to the default paths in image-descriptor.cwl which should be overridden by the entries in foo.json.

But when I delete those two links, cwltool dies with the following validation error.

/usr/local/bin/cwltool 1.0.20151013173827
Tool definition failed validation:
While checking field `inputs`
  While checking object `file:///home/dyuen/consonance/consonance-arch/datastore/launcher-03ad4441-c074-4893-b5bd-6d14aabb4995/image-descriptor.cwl#ref_file_1`
    While checking field `default`
      Field `path` contains undefined reference to `file:///home/dyuen/consonance/consonance-arch/datastore/launcher-03ad4441-c074-4893-b5bd-6d14aabb4995/8e888694-9c56-4529-a750-d6bfbd4a74e7.txt`
  While checking object `file:///home/dyuen/consonance/consonance-arch/datastore/launcher-03ad4441-c074-4893-b5bd-6d14aabb4995/image-descriptor.cwl#ref_file_2`
    While checking field `default`
      Field `path` contains undefined reference to `file:///home/dyuen/consonance/consonance-arch/datastore/launcher-03ad4441-c074-4893-b5bd-6d14aabb4995/b4cdad91-676a-446c-a635-57453f17617a.txt`

Note that it is validating the default paths, not the paths provided in foo.json. However, if I move the files referenced in foo.json to match the defaults:

.
├── 8e888694-9c56-4529-a750-d6bfbd4a74e7.txt
├── b4cdad91-676a-446c-a635-57453f17617a.txt
├── configs
├── foo.json
├── hello-output.txt
├── image-descriptor.cwl
├── inputs
│   └── 0e129840-fa77-4cec-a885-19b1e538507a
│       └── hello_input
├── logs
├── node-engine.cwl
├── outputs
└── working

Then it dies saying it can't find the paths in foo.json:

dyuen@odl-dyuen:~/consonance/consonance-arch/datastore/launcher-03ad4441-c074-4893-b5bd-6d14aabb4995$ cwltool --outdir /home/dyuen/consonance/consonance-arch/./datastore/launcher-03ad4441-c074-4893-b5bd-6d14aabb4995/outputs/ image-descriptor.cwl foo.json
/usr/local/bin/cwltool 1.0.20151013173827
Unknown hint file:///home/dyuen/consonance/consonance-arch/datastore/launcher-03ad4441-c074-4893-b5bd-6d14aabb4995/ResourceRequirement
Got workflow error
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/cwltool/main.py", line 153, in single_job_executor
    for r in jobiter:
  File "/usr/local/lib/python2.7/dist-packages/cwltool/draft2tool.py", line 127, in job
    builder.pathmapper = self.makePathMapper(reffiles, input_basedir, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/cwltool/draft2tool.py", line 62, in makePathMapper
    return DockerPathMapper(reffiles, input_basedir)
  File "/usr/local/lib/python2.7/dist-packages/cwltool/pathmapper.py", line 73, in __init__
    st = os.lstat(deref)
OSError: [Errno 2] No such file or directory: '/home/dyuen/consonance/consonance-arch/./datastore/launcher-03ad4441-c074-4893-b5bd-6d14aabb4995/inputs/0bf6ab9f-83a8-4b72-ad67-460eb696bd64/ref_file_2'
Workflow error:
  [Errno 2] No such file or directory: '/home/dyuen/consonance/consonance-arch/./datastore/launcher-03ad4441-c074-4893-b5bd-6d14aabb4995/inputs/0bf6ab9f-83a8-4b72-ad67-460eb696bd64/ref_file_2'
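The behavior the reporter expects can be sketched as a simple precedence rule (illustrative only, not cwltool's actual resolution code): a value supplied in the job order shadows the tool default, and only the chosen value's path should then be validated.

```python
# Illustrative precedence: the job order wins when it supplies a
# value; the default is used (and would be validated) only when the
# job order does not mention the parameter.
def resolve_input(param_id, default, job_order):
    if param_id in job_order:
        return job_order[param_id]
    return default

chosen = resolve_input(
    "ref_file_1",
    default="b4cdad91-676a-446c-a635-57453f17617a.txt",
    job_order={"ref_file_1": "inputs/d2816fb8-7773-4b2a-bd31-ee4a7efa8e04/ref_file_1"},
)
print(chosen)
```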

Dockerized tools fail if they don't respect TMPDIR

As of e645ade, cwltool sets --read-only=true when running docker. This makes the container's file system read-only, excepting only volumes attached as rw at runtime. I think this is a great idea, but have run into issues with /tmp.

Since /tmp is not a volume, it is read-only inside the container. I do see that the TMPDIR variable is set to a rw volume, but many tools don't respect TMPDIR, still try to write to /tmp, and now fail.

I can probably work around this in a few cases, but I figured it was worth discussing. Is there a reason the job's temp directory is mounted at /tmp/job_tmp instead of just /tmp?

Add TravisCI or other continuous integration?

The Python dependencies are clear, and with #5 it should be easy to install them automatically.

There seem to be tests (although I'm not 100% sure that "python setup.py test" is how you are expected to run them; the README does not say).
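A minimal Travis configuration for this could look something like the following — a sketch only, assuming "python setup.py test" really is the intended entry point:

```yaml
# .travis.yml (hypothetical minimal setup)
language: python
python:
  - "2.7"
install:
  - pip install .
script:
  - python setup.py test
```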

cwltool fails first iteration when it needs to pull commonworkflowlanguage/nodejs-engine

Not sure if this is expected behaviour.

When cwltool needs to pull commonworkflowlanguage/nodejs-engine, it will fail running the tool. A second attempt succeeds.

ubuntu@cwl-sort-test:~/CancerCollaboratory/dockstore-tool-linux-sort$ cwltool --non-strict Dockstore.cwl test.json
/usr/local/bin/cwltool 1.0.20160108200940
Got workflow error
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/cwltool/main.py", line 158, in single_job_executor
    for r in jobiter:
  File "/usr/local/lib/python2.7/dist-packages/cwltool/draft2tool.py", line 130, in job
    j.stdout = builder.do_eval(self.tool["stdout"])
  File "/usr/local/lib/python2.7/dist-packages/cwltool/builder.py", line 165, in do_eval
    context=context, pull_image=pull_image)
  File "/usr/local/lib/python2.7/dist-packages/cwltool/expression.py", line 135, in do_eval
    return sandboxjs.interpolate(ex, jshead(r.get("expressionLib", []), rootvars))
  File "/usr/local/lib/python2.7/dist-packages/cwltool/sandboxjs.py", line 128, in interpolate
    e = execjs(scan[w[0]+1:w[1]], jslib)
  File "/usr/local/lib/python2.7/dist-packages/cwltool/sandboxjs.py", line 41, in execjs
    raise JavascriptException("Returncode was: %s\nscript was: %s\nstdout was: '%s'\nstderr was: '%s'\n" % (nodejs.returncode, script, stdoutdata, stderrdata))
JavascriptException: Returncode was: 2
script was: console.log(JSON.stringify(require("vm").runInNewContext("\"use strict\";var inputs = {\"allocatedResources\": {\"mem\": 1000, \"cpu\": 1}, \"input\": [{\"path\": \"file:///home/ubuntu/CancerCollaborat
ory/dockstore-tool-linux-sort/example.bedGraph\", \"class\": \"File\"}], \"key\": [\"1,1\", \"2,2n\"], \"output\": \"example.bedGraph.sorted\"};\nvar self = null;\nvar runtime = {\"outdirSize\": 1024, \"ram\": 102
4, \"tmpdirSize\": 1024, \"cores\": 1, \"tmpdir\": \"/tmp/job_tmp\", \"outdir\": \"/tmp/job_output\"};\n(function(){return ((inputs.output));})()", {})));

stdout was: ''
stderr was: 'Unable to find image 'commonworkflowlanguage/nodejs-engine:latest' locally
latest: Pulling from commonworkflowlanguage/nodejs-engine
f9443a9b4216: Pulling fs layer
1ddcf54c0400: Pulling fs layer
e091dad1ae93: Pulling fs layer
41f5a9faf342: Pulling fs layer
41f5a9faf342: Verifying Checksum
41f5a9faf342: Download complete
1ddcf54c0400: Verifying Checksum
1ddcf54c0400: Download complete
e091dad1ae93: Verifying Checksum
e091dad1ae93: Download complete
'

Workflow error:
  Returncode was: 2
script was: console.log(JSON.stringify(require("vm").runInNewContext("\"use strict\";var inputs = {\"allocatedResources\": {\"mem\": 1000, \"cpu\": 1}, \"input\": [{\"path\": \"file:///home/ubuntu/CancerCollaborat
ory/dockstore-tool-linux-sort/example.bedGraph\", \"class\": \"File\"}], \"key\": [\"1,1\", \"2,2n\"], \"output\": \"example.bedGraph.sorted\"};\nvar self = null;\nvar runtime = {\"outdirSize\": 1024, \"ram\": 102
4, \"tmpdirSize\": 1024, \"cores\": 1, \"tmpdir\": \"/tmp/job_tmp\", \"outdir\": \"/tmp/job_output\"};\n(function(){return ((inputs.output));})()", {})));

stdout was: ''
stderr was: 'Unable to find image 'commonworkflowlanguage/nodejs-engine:latest' locally
latest: Pulling from commonworkflowlanguage/nodejs-engine
f9443a9b4216: Pulling fs layer
1ddcf54c0400: Pulling fs layer
e091dad1ae93: Pulling fs layer
41f5a9faf342: Pulling fs layer
41f5a9faf342: Verifying Checksum
41f5a9faf342: Download complete
1ddcf54c0400: Verifying Checksum
1ddcf54c0400: Download complete
e091dad1ae93: Verifying Checksum
e091dad1ae93: Download complete
'

ubuntu@cwl-sort-test:~/CancerCollaboratory/dockstore-tool-linux-sort$ cwltool --non-strict Dockstore.cwl test.json
/usr/local/bin/cwltool 1.0.20160108200940
[job 139960086713808] /home/ubuntu/CancerCollaboratory/dockstore-tool-linux-sort$ docker run -i --volume=/home/ubuntu/CancerCollaboratory/dockstore-tool-linux-sort/example.bedGraph:/tmp/job299693805_dockstore-tool
-linux-sort/example.bedGraph:ro --volume=/home/ubuntu/CancerCollaboratory/dockstore-tool-linux-sort:/tmp/job_output:rw --volume=/tmp/tmpMKbQ3D:/tmp/job_tmp:rw --workdir=/tmp/job_output --read-only=true --user=1000
 --rm --env=TMPDIR=/tmp/job_tmp quay.io/collaboratory/dockstore-tool-linux-sort sort -k 1,1 -k 2,2n /tmp/job299693805_dockstore-tool-linux-sort/example.bedGraph > /home/ubuntu/CancerCollaboratory/dockstore-tool-li
nux-sort/example.bedGraph.sorted
Final process status is success
{
    "sorted": {
        "path": "/home/ubuntu/CancerCollaboratory/dockstore-tool-linux-sort/example.bedGraph.sorted", 
        "size": 796, 
        "class": "File", 
        "checksum": "sha1$36b0a5a18a584d095597912b702251f90db60a64"
    }
}

Workaround: pre-load the engine via docker pull commonworkflowlanguage/nodejs-engine

Dockstore.cwl.txt
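The workaround could be automated with a small pre-pull helper. This is a hypothetical sketch (ensure_image is not part of cwltool); the run parameter exists only so the demo below can stub out docker:

```python
import subprocess

def ensure_image(image, run=subprocess.run):
    # Pull the image ahead of time if it is not cached locally, so the
    # pull progress never mixes into the tool's captured stderr.
    found = run(["docker", "images", "-q", image],
                capture_output=True, text=True)
    if not (found.stdout or "").strip():
        run(["docker", "pull", image], check=True)

# Demo with a stub runner (no docker needed): the image is reported
# missing, so a pull is issued.
calls = []

def fake_run(cmd, **kwargs):
    calls.append(cmd[:2])
    return subprocess.CompletedProcess(cmd, 0, stdout="", stderr="")

ensure_image("commonworkflowlanguage/nodejs-engine", run=fake_run)
print(calls)  # [['docker', 'images'], ['docker', 'pull']]
```

Calling something like this before the first cwltool invocation would avoid the first-run failure entirely.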

cwltool --update behaves unexpectedly

cwltool --update does not seem to accept draft-3 as a valid version (and it only updates as far as dev4), which is odd because the draft-3 schema suggests it should: https://github.com/common-workflow-language/cwltool/blob/master/cwltool/schemas/draft-3/Process.yml#L31

Example:

dyuen@orz:~/dockstore/dockstore-launcher$ cat test.cwl 
#!/usr/bin/env cwl-runner

class: CommandLineTool
description: "Markdown description text here"
id: "HelloWorld"
label: "HelloWorld Tool"

cwlVersion: "cwl:draft-3.dev1"
dyuen@orz:~/dockstore/dockstore-launcher$ cwltool --non-strict --update test.cwl
/usr/local/bin/cwltool 1.0.20160108200940
{
    "cwlVersion": "https://w3id.org/cwl/cwl#draft-3.dev4", 
    "requirements": [
        {
            "class": "InlineJavascriptRequirement"
        }
    ], 
    "description": "Markdown description text here", 
    "id": "HelloWorld", 
    "label": "HelloWorld Tool", 
    "class": "CommandLineTool", 
    "name": "file:///home/dyuen/dockstore/dockstore-launcher/test.cwl"
}
dyuen@orz:~/dockstore/dockstore-launcher$ vim test.cwl 
dyuen@orz:~/dockstore/dockstore-launcher$ cat test.cwl 
#!/usr/bin/env cwl-runner

class: CommandLineTool
description: "Markdown description text here"
id: "HelloWorld"
label: "HelloWorld Tool"

cwlVersion: "cwl:draft-3"
dyuen@orz:~/dockstore/dockstore-launcher$ cwltool --non-strict --update test.cwl
/usr/local/bin/cwltool 1.0.20160108200940
I'm sorry, I couldn't load this CWL file.
Unrecognized version https://w3id.org/cwl/cwl#draft-3
dyuen@orz:~/dockstore/dockstore-launcher$ 

Versions of stuff:

dyuen@orz:~/dockstore/dockstore-launcher$ pip list | grep cwl
cwl-runner (1.0)
cwltool (1.0.20160108200940)
dyuen@orz:~/dockstore/dockstore-launcher$ sudo pip install cwltool --upgrade
...
Requirement already up-to-date: cwltool in /usr/local/lib/python2.7/dist-packages

This is related to common-workflow-language/schema_salad#7 (sorry)
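The failure mode reads like a missing entry in the updater's version table. As a hypothetical sketch (the names below are illustrative, not cwltool's actual internals), an updater whose table only lists the dev snapshots rejects the released draft-3 identifier in exactly this way:

```python
# Hypothetical update table: only dev snapshots are listed, so the
# released "draft-3" identifier falls through to the error path.
UPDATES = {
    "https://w3id.org/cwl/cwl#draft-3.dev1": "https://w3id.org/cwl/cwl#draft-3.dev4",
    "https://w3id.org/cwl/cwl#draft-3.dev4": "https://w3id.org/cwl/cwl#draft-3.dev4",
}

def update_version(version):
    if version not in UPDATES:
        raise ValueError("Unrecognized version %s" % version)
    return UPDATES[version]

print(update_version("https://w3id.org/cwl/cwl#draft-3.dev1"))
# https://w3id.org/cwl/cwl#draft-3.dev4
```

Adding a "draft-3" entry (mapping it to itself) would make the second invocation above succeed instead of erroring out.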
