common-workflow-language / cwljava

Java SDK for the Common Workflow Language standards

Java 81.71% HTML 0.02% Common Workflow Language 15.61% Python 0.03% JavaScript 2.63%
commonwl cwl java

cwljava's Introduction

Common Workflow Language

Main website: https://www.commonwl.org

GitHub repository for www.commonwl.org: https://www.github.com/common-workflow-language/cwl-website

CWL v1.0.x: https://github.com/common-workflow-language/common-workflow-language (this repository)

CWL v1.1.x: https://github.com/common-workflow-language/cwl-v1.1/

CWL v1.2.x: https://github.com/common-workflow-language/cwl-v1.2/

[Video] Common Workflow Language explained in 64 seconds

The Common Workflow Language (CWL) is a specification for describing analysis workflows and tools in a way that makes them portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high performance computing (HPC) environments. CWL is designed to meet the needs of data-intensive science, such as Bioinformatics, Medical Imaging, Astronomy, Physics, and Chemistry.

CWL is developed by a multi-vendor working group consisting of organizations and individuals aiming to enable scientists to share data analysis workflows. The CWL project is maintained on GitHub and we follow the Open-Stand.org principles for collaborative open standards development. Legally, CWL is a member project of Software Freedom Conservancy and is formally managed by the elected CWL leadership team; however, everyday project decisions are made by the CWL community, which is open for participation by anyone.

CWL builds on technologies such as JSON-LD for data modeling and Docker for portable runtime environments.

User Guide

The CWL user guide provides a gentle introduction to learning how to write CWL command line tool and workflow descriptions.

CWLの日本語での解説ドキュメント is a 15 minute introduction to the CWL project in Japanese.

CWL Recommended Practices

A series of video lessons about CWL is available in Russian as part of the Управление вычислениями (Computation Management) free online course.

Citation

To reference the CWL project in a scholarly work, please use the following citation:

Michael R. Crusoe, Sanne Abeln, Alexandru Iosup, Peter Amstutz, John Chilton, Nebojša Tijanić, Hervé Ménager, Stian Soiland-Reyes, Bogdan Gavrilović, Carole Goble, and The CWL Community. (2022): Methods Included: Standardizing Computational Reuse and Portability with the Common Workflow Language. Commun. ACM 65, 6 (June 2022), 54–63. https://doi.org/10.1145/3486897

To cite version 1.0 of the CWL standards specifically, please use the following citation inclusive of the DOI.

Peter Amstutz, Michael R. Crusoe, Nebojša Tijanić (editors), Brad Chapman, John Chilton, Michael Heuer, Andrey Kartashov, Dan Leehr, Hervé Ménager, Maya Nedeljkovich, Matt Scales, Stian Soiland-Reyes, Luka Stojanovic (2016): Common Workflow Language, v1.0. Specification, Common Workflow Language working group. https://w3id.org/cwl/v1.0/ doi:10.6084/m9.figshare.3115156.v2

A collection of existing references to CWL can be found at https://zotero.org/groups/cwl

Code of Conduct

The CWL Project is dedicated to providing a harassment-free experience for everyone, regardless of gender, gender identity and expression, sexual orientation, disability, physical appearance, body size, age, race, or religion. We do not tolerate harassment of participants in any form. This code of conduct applies to all CWL Project spaces, including the Google Group, the Gitter chat room, the Google Hangouts chats, both online and off. Anyone who violates this code of conduct may be sanctioned or expelled from these spaces at the discretion of the leadership team.

For more details, see our Code of Conduct.

For the following content:

  • Support, Community and Contributing
  • CWL Implementations
  • Repositories of CWL Tools and Workflows
  • Software for working with CWL
    • Editors and viewers
    • Utilities
    • Converters and code generators
    • Code libraries
  • Projects the CWL community is participating in
  • Participating Organizations
  • Individual Contributors
  • CWL Advisors
  • CWL Leadership team

Please see https://www.commonwl.org

cwljava's People

Contributors

dependabot[bot], jdidion, kinow, mr-c, snyk-bot, stain

cwljava's Issues

Workflow with imported schema is not parsable when packed

When packed, the conformance test https://github.com/common-workflow-language/cwl-v1.2/blob/main/tests/schemadef-wf.cwl cannot be parsed:

java.lang.ClassCastException: java.lang.String cannot be cast to java.util.List
        at org.w3id.cwl.cwl1_2.utils.YamlUtils.listFromString(YamlUtils.java:172)
        at org.w3id.cwl.cwl1_2.utils.Loader.documentLoadByUrl(Loader.java:69)
        at org.w3id.cwl.cwl1_2.utils.Loader.loadField(Loader.java:86)
        at org.w3id.cwl.cwl1_2.utils.ArrayLoader.load(ArrayLoader.java:27)
        at org.w3id.cwl.cwl1_2.utils.ArrayLoader.load(ArrayLoader.java:6)
        at org.w3id.cwl.cwl1_2.utils.OptionalLoader.load(OptionalLoader.java:21)
        at org.w3id.cwl.cwl1_2.utils.OptionalLoader.load(OptionalLoader.java:6)
        at org.w3id.cwl.cwl1_2.utils.Loader.load(Loader.java:16)
        at org.w3id.cwl.cwl1_2.utils.IdMapLoader.load(IdMapLoader.java:51)
        at org.w3id.cwl.cwl1_2.utils.Loader.load(Loader.java:16)
        at org.w3id.cwl.cwl1_2.utils.Loader.loadField(Loader.java:99)
        at org.w3id.cwl.cwl1_2.CommandLineToolImpl.<init>(CommandLineToolImpl.java:458)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.w3id.cwl.cwl1_2.utils.RecordLoader.load(RecordLoader.java:23)
        at org.w3id.cwl.cwl1_2.utils.RecordLoader.load(RecordLoader.java:6)
        at org.w3id.cwl.cwl1_2.utils.UnionLoader.load(UnionLoader.java:26)
        at org.w3id.cwl.cwl1_2.utils.UnionLoader.load(UnionLoader.java:26)
        at org.w3id.cwl.cwl1_2.utils.Loader.load(Loader.java:16)
        at org.w3id.cwl.cwl1_2.utils.Loader.loadField(Loader.java:99)
        at org.w3id.cwl.cwl1_2.utils.ArrayLoader.load(ArrayLoader.java:27)
        at org.w3id.cwl.cwl1_2.utils.ArrayLoader.load(ArrayLoader.java:6)
        at org.w3id.cwl.cwl1_2.utils.UnionLoader.load(UnionLoader.java:26)
        at org.w3id.cwl.cwl1_2.utils.Loader.load(Loader.java:16)
        at org.w3id.cwl.cwl1_2.utils.Loader.documentLoad(Loader.java:39)
        at org.w3id.cwl.cwl1_2.utils.RootLoader.loadDocument(RootLoader.java:18)
        at org.w3id.cwl.cwl1_2.utils.RootLoader.loadDocument(RootLoader.java:86)
        at org.w3id.cwl.cwl1_2.utils.RootLoader.loadDocument(RootLoader.java:45)

mvn javadoc:javadoc fails

[ERROR] Failed to execute goal on project tools: Could not resolve dependencies for project io.cwl:tools:jar:1.0-SNAPSHOT: Could not find artifact io.cwl:core:jar:1.0-SNAPSHOT -> [Help 1]

Incomplete $import support

A first guess is that this has something to do with $import, since that's the only commonality I see between these tests.

[info]   java.lang.ClassCastException: java.lang.String cannot be cast to java.util.Map
[info]   at org.w3id.cwl.cwl1_2.utils.YamlUtils.mapFromString(YamlUtils.java:10)
[info]   at org.w3id.cwl.cwl1_2.utils.Loader.documentLoadByUrl(Loader.java:56)
[info]   at org.w3id.cwl.cwl1_2.utils.Loader.loadField(Loader.java:72)
[info]   at org.w3id.cwl.cwl1_2.CommandLineToolImpl.<init>(CommandLineToolImpl.java:444)
[info]   at sun.reflect.GeneratedConstructorAccessor17.newInstance(Unknown Source)
[info]   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
[info]   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
[info]   at org.w3id.cwl.cwl1_2.utils.RecordLoader.load(RecordLoader.java:23)
[info]   at org.w3id.cwl.cwl1_2.utils.RecordLoader.load(RecordLoader.java:6)
[info]   at org.w3id.cwl.cwl1_2.utils.UnionLoader.load(UnionLoader.java:26)
[info]   at org.w3id.cwl.cwl1_2.utils.Loader.documentLoad(Loader.java:41)
[info]   at org.w3id.cwl.cwl1_2.utils.RootLoader.loadDocument(RootLoader.java:20)
[info]   at org.w3id.cwl.cwl1_2.utils.RootLoader.loadDocument(RootLoader.java:86)
[info]   at org.w3id.cwl.cwl1_2.utils.RootLoader.loadDocument(RootLoader.java:45)
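
As a rough illustration of what resolving $import involves (this is not cwljava's implementation), the sketch below walks a document that has already been parsed into Maps and Lists and replaces every {"$import": uri} node with whatever a caller-supplied loader returns. The class name ImportResolver and the loader function are hypothetical; resolving relative URIs against a base and detecting import cycles are deliberately left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of cwljava.
final class ImportResolver {
    private ImportResolver() {}

    /**
     * Returns a copy of `node` in which every map of the form {"$import": "<uri>"}
     * has been replaced by the (recursively resolved) result of loader.apply(uri).
     */
    static Object resolve(Object node, Function<String, Object> loader) {
        if (node instanceof Map) {
            Map<?, ?> map = (Map<?, ?>) node;
            Object target = map.get("$import");
            if (map.size() == 1 && target instanceof String) {
                // The imported content may itself contain $import nodes.
                return resolve(loader.apply((String) target), loader);
            }
            Map<String, Object> copy = new LinkedHashMap<>();
            for (Map.Entry<?, ?> e : map.entrySet()) {
                copy.put(String.valueOf(e.getKey()), resolve(e.getValue(), loader));
            }
            return copy;
        }
        if (node instanceof List) {
            List<Object> copy = new ArrayList<>();
            for (Object item : (List<?>) node) {
                copy.add(resolve(item, loader));
            }
            return copy;
        }
        // Scalars (strings, numbers, booleans, null) are returned unchanged.
        return node;
    }
}
```

Running such a pass before handing the document to a typed loader would mean the loader never sees a bare string where it expects a Map or List, which is the kind of mismatch the ClassCastException above reports.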

Avoid cwltool dependency

I tried building with mvn clean install and I got:

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running CWLClientTest
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
org.apache.commons.exec.ExecuteException: Execution failed (Exit value: -559038737. Caused by java.io.IOException: Cannot run program "cwltool" (in directory "."): error=2, No such file or directory)
    at org.apache.commons.exec.DefaultExecutor$1.run(DefaultExecutor.java:205)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Cannot run program "cwltool" (in directory "."): error=2, No such file or directory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
    at java.lang.Runtime.exec(Runtime.java:620)
    at org.apache.commons.exec.launcher.Java13CommandLauncher.exec(Java13CommandLauncher.java:61)
    at org.apache.commons.exec.DefaultExecutor.launch(DefaultExecutor.java:279)
    at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:336)
    at org.apache.commons.exec.DefaultExecutor.access$200(DefaultExecutor.java:48)
    at org.apache.commons.exec.DefaultExecutor$1.run(DefaultExecutor.java:200)
    ... 1 more
Caused by: java.io.IOException: error=2, No such file or directory
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:248)
    at java.lang.ProcessImpl.start(ProcessImpl.java:134)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
    ... 7 more
org.apache.commons.exec.ExecuteException: Execution failed (Exit value: -559038737. Caused by java.io.IOException: Cannot run program "cwltool" (in directory "."): error=2, No such file or directory)
    at org.apache.commons.exec.DefaultExecutor$1.run(DefaultExecutor.java:205)
    at java.lang.Thread.run(Thread.java:745)

Why would I need to have cwltool installed (the Python implementation) if I am building the Java implementation? Surely the two should be independent..?
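
One way to keep the build from failing when cwltool is absent would be to check for the executable first and skip the dependent tests. The sketch below is a minimal, standard-library-only check; the class name ExecutableLookup is hypothetical, and it does not handle Windows-style extensions such as .exe.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Pattern;

// Hypothetical helper, not part of cwljava.
final class ExecutableLookup {
    private ExecutableLookup() {}

    /** Returns true if an executable regular file called `name` exists in any directory listed in `pathValue`. */
    static boolean isOnPath(String name, String pathValue) {
        if (pathValue == null || pathValue.isEmpty()) {
            return false;
        }
        for (String dir : pathValue.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue;
            }
            Path candidate = Paths.get(dir, name);
            if (Files.isRegularFile(candidate) && Files.isExecutable(candidate)) {
                return true;
            }
        }
        return false;
    }

    /** Convenience overload that reads the PATH environment variable. */
    static boolean isOnPath(String name) {
        return isOnPath(name, System.getenv("PATH"));
    }
}
```

If the tests use JUnit 4 (an assumption), a test could start with Assume.assumeTrue(ExecutableLookup.isOnPath("cwltool")) so that it is reported as skipped rather than failed.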

Enum types are not parsed correctly

After packing https://github.com/common-workflow-language/cwl-v1.2/blob/1.2.1_proposed/tests/anon_enum_inside_array_inside_schemadef.cwl using cwlpack and then parsing with cwljava, the enum has symbols like "file:/Users/jdidion/projects/cwlScala/target/test-classes/CommandLineTools/conformance/#anon_enum_inside_array_inside_schemadef.cwl/first/user_type_2/species/homo_sapiens" rather than just "homo_sapiens". Packed workflow below.

{
    "cwlVersion": "v1.2",
    "class": "CommandLineTool",
    "requirements": [
        {
            "class": "InlineJavascriptRequirement"
        }
    ],
    "inputs": [
        {
            "type": {
                "name": "user_type_2",
                "type": "record",
                "fields": [
                    {
                        "type": [
                            "null",
                            {
                                "type": "enum",
                                "symbols": [
                                    "homo_sapiens",
                                    "mus_musculus"
                                ]
                            }
                        ],
                        "name": "species"
                    },
                    {
                        "type": [
                            "null",
                            {
                                "type": "enum",
                                "symbols": [
                                    "GRCh37",
                                    "GRCh38",
                                    "GRCm38"
                                ]
                            }
                        ],
                        "name": "ncbi_build"
                    }
                ]
            },
            "id": "first"
        }
    ],
    "baseCommand": "echo",
    "arguments": [
        {
            "prefix": "species",
            "valueFrom": "$(inputs.first.species)"
        },
        {
            "prefix": "ncbi_build",
            "valueFrom": "$(inputs.first.ncbi_build)"
        }
    ],
    "outputs": [
        {
            "type": "stdout",
            "id": "result"
        }
    ],
    "id": "anon_enum_inside_array_inside_schemadef.cwl"
}
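
Until the parser returns short symbol names, a caller could recover them from the expanded identifiers. The sketch below is a workaround under the assumption that the short name is always the last path segment of the URI fragment, as in the example above; the class name EnumSymbols is hypothetical.

```java
// Hypothetical helper, not part of cwljava.
final class EnumSymbols {
    private EnumSymbols() {}

    /**
     * Strips everything up to the last '/' of the fragment, so that
     * ".../#tool.cwl/first/user_type_2/species/homo_sapiens" becomes "homo_sapiens".
     * A symbol that is already short is returned unchanged.
     */
    static String shortName(String symbol) {
        int hash = symbol.indexOf('#');
        String fragment = hash >= 0 ? symbol.substring(hash + 1) : symbol;
        int slash = fragment.lastIndexOf('/');
        return slash >= 0 ? fragment.substring(slash + 1) : fragment;
    }
}
```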

No way to access custom metadata

CWL provides the ability to use custom metadata via $namespaces. However, the objects generated by the Java parser do not contain the metadata, or at least there's no method by which to access it.
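
As a possible stopgap, the metadata can be read from the raw document before it is passed to the parser. The sketch below assumes the document has already been loaded into a Map (for example by a YAML library) and collects the top-level fields whose prefix is declared under $namespaces. The class name NamespacedMetadata is hypothetical, and nested metadata is not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of cwljava.
final class NamespacedMetadata {
    private NamespacedMetadata() {}

    /** Returns the top-level entries of `doc` whose key is "prefix:name" with `prefix` declared in $namespaces. */
    static Map<String, Object> extract(Map<String, Object> doc) {
        Map<String, Object> result = new LinkedHashMap<>();
        Object ns = doc.get("$namespaces");
        if (!(ns instanceof Map)) {
            return result; // no namespaces declared, so nothing to collect
        }
        Map<?, ?> namespaces = (Map<?, ?>) ns;
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            String key = e.getKey();
            int colon = key.indexOf(':');
            if (colon > 0 && namespaces.containsKey(key.substring(0, colon))) {
                result.put(key, e.getValue());
            }
        }
        return result;
    }
}
```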

Fails to parse conformance tests (ValidationException: Failed to match union type)

org.w3id.cwl.cwl1_2.utils.ValidationException: Failed to match union type
[info]   Trying 'RecordField'
[info]     the `inputs` field is not valid because:
[info]       Failed to match union type
[info]         Expected object with Java type of java.util.List but got java.util.LinkedHashMap
[info]         Trying 'RecordField'
[info]           the `secondaryFiles` field is not valid because:
[info]             Failed to match union type
[info]               Expected null
[info]               Expected object with Java type of java.util.Map but got java.util.ArrayList
[info]               Failed to match union type
[info]                 Expected object with Java type of java.util.List but got java.lang.String
[info]                 Expected object with Java type of java.util.Map but got java.lang.String
[info]   Trying 'RecordField'
[info]     the `inputs` field is not valid because:
[info]       Failed to match union type
[info]         Expected object with Java type of java.util.List but got java.util.LinkedHashMap
[info]         Trying 'RecordField'
[info]           the `secondaryFiles` field is not valid because:
[info]             Failed to match union type
[info]               Expected null
[info]               Expected object with Java type of java.util.Map but got java.util.ArrayList
[info]               Failed to match union type
[info]                 Expected object with Java type of java.util.List but got java.lang.String
[info]                 Expected object with Java type of java.util.Map but got java.lang.String
[info]     the `expression` field is not valid because:
[info]       Expected object with Java type of java.lang.String but got null
[info]   Trying 'RecordField'
[info]     the `inputs` field is not valid because:
[info]       Failed to match union type
[info]         Expected object with Java type of java.util.List but got java.util.LinkedHashMap
[info]         Trying 'RecordField'
[info]           the `secondaryFiles` field is not valid because:
[info]             Failed to match union type
[info]               Expected null
[info]               Expected object with Java type of java.util.Map but got java.util.ArrayList
[info]               Failed to match union type
[info]                 Expected object with Java type of java.util.List but got java.lang.String
[info]                 Expected object with Java type of java.util.Map but got java.lang.String
[info]     the `steps` field is not valid because:
[info]       Expected object with Java type of java.util.List but got null
[info]   Trying 'RecordField'
[info]     the `inputs` field is not valid because:
[info]       Failed to match union type
[info]         Expected object with Java type of java.util.List but got java.util.LinkedHashMap
[info]         Trying 'RecordField'
[info]           the `secondaryFiles` field is not valid because:
[info]             Failed to match union type
[info]               Expected null
[info]               Expected object with Java type of java.util.Map but got java.util.ArrayList
[info]               Failed to match union type
[info]                 Expected object with Java type of java.util.List but got java.lang.String
[info]                 Expected object with Java type of java.util.Map but got java.lang.String
[info]   Expected object with Java type of java.util.List but got java.util.LinkedHashMap
[info]   at org.w3id.cwl.cwl1_2.utils.UnionLoader.load(UnionLoader.java:31)
[info]   at org.w3id.cwl.cwl1_2.utils.Loader.documentLoad(Loader.java:41)
[info]   at org.w3id.cwl.cwl1_2.utils.RootLoader.loadDocument(RootLoader.java:20)
[info]   at org.w3id.cwl.cwl1_2.utils.RootLoader.loadDocument(RootLoader.java:86)
[info]   at org.w3id.cwl.cwl1_2.utils.RootLoader.loadDocument(RootLoader.java:45)

Document preprocessing and resolution

Hi Peter (@tetron),

Based on our discussion on Friday, Workflows would need this more than CommandLineTools. May I assume that the client/runner that reads the CWL will most likely perform this? If you could point me to some CWL files that utilize this, that would be great. The more files, the better, since the closest I could find were the following two inter-dependent ones:

Initially I would like to keep this functionality as part of a client/runner - also to be built in Java - and initially separate from the base SDK. Eventually they will be a directory and this SDK will be the base/core directory - though with time they will all be together :)

Thanks,
Paul

Code generation from avro

Hi,

@pgrosu
@tetron

I'm going to sketch out how code generation from the CWL avro schema worked for me and then link to the code where I actually do it (and test it). Feel free to borrow/copy liberally. Manually creating the Java classes could not have been easy. Hopefully this should save some time for future me/you when migrating to draft4, etc.

Generating classes

The procedure for generating classes is here with the important bits being:

  1. Get schema salad from the common-workflow-language organization and run python -mschema_salad --print-avro ~/common-workflow-language/draft-3/cwl-avro.yml
    This converts from yml to avro avsc.
  2. Get the avro tools jar and CWL avsc and call java -jar avro-tools-1.7.7.jar compile schema cwl.avsc cwl
    This generates Java classes.
  3. Copy them to the appropriate directory in your code (you will need to insert package names)

Using the generated classes

Note that the avro schema (avsc) as generated from yml lacks namespaces, so the generated Java classes lack package names.
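
The package-name step could be automated along the following lines. This is only a sketch: the class name PackageInserter is hypothetical, it works on source text held in a String, and its check for an existing declaration is a simple line-based regular expression that could be fooled by a matching line inside a comment.

```java
import java.util.regex.Pattern;

// Hypothetical helper for post-processing generated sources.
final class PackageInserter {
    private PackageInserter() {}

    // Matches a line of the form "package a.b.c;" anywhere in the file.
    private static final Pattern PACKAGE_DECL =
            Pattern.compile("(?m)^\\s*package\\s+[\\w.]+\\s*;");

    /** Prepends "package <packageName>;" unless the source already declares a package. */
    static String withPackage(String source, String packageName) {
        if (PACKAGE_DECL.matcher(source).find()) {
            return source;
        }
        return "package " + packageName + ";\n\n" + source;
    }
}
```

For example, withPackage(text, "org.commonwl.lang") would place each generated class in the org.commonwl.lang package; the files would still need to be moved into the matching directory.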

Also, you won't be able to use the avro tool classes to convert between CWL and your generated Java classes directly due to:

  1. The avro tools normally understand only JSON and their own binary format. Solution: use cwltool to convert from CWL to JSON.
  2. There are some quirks in the CWL files, even in JSON form, that seem to confuse the avro tool. I don't remember the details, but as a result I needed to use gson with some customizations instead of avro out of the box.

Examples

Example of code that you can use-reuse

Converting from a json-representation of CWL to java objects

Directory of auto-generated classes

How to configure gson to do the conversion

Hope that helps!

Supporting recursive expansion of `typeDSL`

Official domain name question

Hi @mr-c / @tetron

I am putting the new version into packages, and wanted to confirm that the official domain name for CWL is the following:

commonwl.org

If it is, then all packages will be under org.commonwl.

Thanks,
Paul

Embedded process ID is ignored

I am trying to parse the following workflow, which was generated by cwlpack --json --add-ids. The parser ignores the ID of the wc3-tool process and instead substitutes the ID #count-lines19-wf.cwl/step1/null.

{
    "class": "Workflow",
    "cwlVersion": "v1.2",
    "inputs": [
        {
            "type": "File",
            "id": "file1"
        }
    ],
    "outputs": [
        {
            "type": "int",
            "outputSource": "step1/output",
            "id": "count_output"
        }
    ],
    "steps": [
        {
            "run": {
                "class": "CommandLineTool",
                "requirements": [
                    {
                        "class": "InlineJavascriptRequirement"
                    }
                ],
                "inputs": [
                    {
                        "type": {
                            "type": "array",
                            "items": "File"
                        },
                        "inputBinding": {},
                        "id": "file1"
                    }
                ],
                "outputs": [
                    {
                        "type": "int",
                        "outputBinding": {
                            "glob": "output.txt",
                            "loadContents": true,
                            "outputEval": "${\n  var s = self[0].contents.split(/\\r?\\n/);\n  return parseInt(s[s.length-2]);\n}\n"
                        },
                        "id": "output"
                    }
                ],
                "stdout": "output.txt",
                "baseCommand": "wc",
                "id": "count-lines19-wf.cwl:step_step1:wc3-tool.cwl"
            },
            "in": [
                {
                    "source": [
                        "file1"
                    ],
                    "linkMerge": "merge_nested",
                    "id": "file1"
                }
            ],
            "out": [
                "output"
            ],
            "id": "step1"
        }
    ],
    "requirements": [
        {
            "class": "SubworkflowFeatureRequirement"
        },
        {
            "class": "InlineJavascriptRequirement"
        }
    ],
    "id": "count-lines19-wf.cwl"
}

Fails to parse requirements correctly

When I parse this workflow, the resulting CommandLineToolImpl contains two copies of the InlineJavascriptRequirement.

cwlVersion: v1.2

class: CommandLineTool

requirements:
  - class: InlineJavascriptRequirement
  - class: InitialWorkDirRequirement
    listing:
      - entryname: emptyWritableDir
        entry: "$({class: 'Directory', listing: []})"
        writable: true

hints:
  - class: DockerRequirement
    dockerPull: alpine

inputs: []
outputs:
  out:
    type: Directory
    outputBinding:
      glob: emptyWritableDir
arguments: [touch, emptyWritableDir/blurg]

parsing CWL v1.0 & v1.1

Recently, we integrated cwljava v1.0 into Dockstore, and during testing, we found some workflows that cause the parser to throw. In our webservice, we use our own preprocessor to combine the various component CWL files into one big CWL represented by Maps/Lists, and parse it with cwljava here:
https://github.com/dockstore/dockstore/blob/67f4547e771864cafacdc1c92fa7bd47261e32da/dockstore-webservice/src/main/java/io/dockstore/webservice/languages/CWLHandler.java#L364

The following workflows cause loadDocument to throw a ValidationException:

https://github.com/ICGC-TCGA-PanCancer/OxoG-Dockstore-Tools/tree/master
primary descriptor: /oxog_varbam_annotate_wf.cwl

https://github.com/h3abionet/h3agatk/tree/1.0.1
primary descriptor: /workflows/GATK/GATK-complete-WES-Workflow-h3abionet.cwl

The first workflow contains a SchemaDefRequirement and the parser appears to have trouble parsing the type references (TumourType.yaml#TumourType etc). When I change the type references to int, the workflow successfully parses.

Judging from the exception message, the second workflow seems to be failing for a different reason, but I haven't pinpointed what, exactly. It is possible that it's not valid, but a cursory inspection didn't turn up any problems.

The exception messages are pretty big, so I put them and some stack trace info in the comments.

Please let me know if you need any more info. Thanks!

Selecting an implementation

I've refactored both WIP branches into more organized projects: #4 for the branch by @denis-yuen and #5 for the branch by @pgrosu.
#4 compiles, but it looks like #5 has issues with method overriding (I had a similar issue with the Scala attempt).

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.3:compile (default-compile) on project core: Compilation failure
[ERROR] /Users/ellrott/workspaces/cwljava/core/src/main/java/io/cwl/schema/OutputRecordSchema.java:[51,30] getfields() in io.cwl.schema.OutputRecordSchema cannot override getfields() in io.cwl.schema.RecordSchema
[ERROR] return type io.cwl.schema.OutputRecordField[] is not compatible with io.cwl.schema.RecordField[]
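
For context, Java does allow an overriding method to narrow an array return type, provided the element type is a subtype of the original element type. One plausible reading of the error is therefore that OutputRecordField does not extend RecordField in that branch, although this has not been verified against the code. The classes below are simplified stand-ins that reuse the names from the error message and compile as written.

```java
// Simplified stand-ins for the classes named in the compiler error.
class RecordField {}

// Because this extends RecordField, OutputRecordField[] is a subtype of RecordField[].
class OutputRecordField extends RecordField {}

class RecordSchema {
    RecordField[] getFields() {
        return new RecordField[0];
    }
}

class OutputRecordSchema extends RecordSchema {
    // Covariant return type: legal because OutputRecordField[] is assignable to RecordField[].
    @Override
    OutputRecordField[] getFields() {
        return new OutputRecordField[] { new OutputRecordField() };
    }
}
```

If the two field classes are unrelated, the alternatives are to introduce the inheritance relationship, to have both methods return a common supertype, or to give the narrower accessor a different name.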

Yaml output of the instantiated CommandLineTool Object

@mr-c I just did the following (though I had to use snakeyaml-1.19-android.jar rather than the normal jar); Program_Context is an instance of CommandLineTool:

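      import org.yaml.snakeyaml.DumperOptions;
      import org.yaml.snakeyaml.Yaml;

      // Flow style with pretty-printing gives the brace-delimited layout shown below.
      DumperOptions Yaml_Formatter = new DumperOptions();
      Yaml_Formatter.setDefaultFlowStyle(DumperOptions.FlowStyle.FLOW);
      Yaml_Formatter.setPrettyFlow(true);

      // Serialize the CommandLineTool instance to a YAML string and print it.
      Yaml yaml = new Yaml( Yaml_Formatter );
      String YAML_Program_Context = yaml.dump( Program_Context );
      System.out.println( YAML_Program_Context );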

Below is the output:

!!org.commonwl.lang.CommandLineTool {
  arguments: null,
  baseCommand: wc,
  class_value: null,
  cwlVersion: null,
  doc: null,
  hints: null,
  id: null,
  inputs: [
    {
      default_value: null,
      doc: null,
      format: null,
      id: example_string,
      inputBinding: {
        itemSeparator: null,
        loadContents: null,
        position: 0,
        prefix: null,
        separate: null,
        shellQuote: null,
        valueFrom: null
      },
      label: null,
      secondaryFiles: null,
      streamable: null,
      type: string
    },
    {
      default_value: null,
      doc: null,
      format: null,
      id: file2,
      inputBinding: {
        itemSeparator: null,
        loadContents: null,
        position: 1,
        prefix: null,
        separate: null,
        shellQuote: null,
        valueFrom: null
      },
      label: null,
      secondaryFiles: null,
      streamable: null,
      type: File
    },
    {
      default_value: null,
      doc: null,
      format: null,
      id: file1,
      inputBinding: {
        itemSeparator: null,
        loadContents: null,
        position: 2,
        prefix: null,
        separate: null,
        shellQuote: null,
        valueFrom: null
      },
      label: null,
      secondaryFiles: null,
      streamable: null,
      type: File
    }]
  ,
  label: null,
  outputs: [
    {
      doc: null,
      format: null,
      id: output,
      label: null,
      outputBinding: {
        glob: output,
        loadContents: null,
        outputEval: null
      },
      secondaryFiles: null,
      streamable: null,
      type: File
    }]
  ,
  permanentFailCodes: null,
  requirements: null,
  stderr: null,
  stdin: whale.txt,
  stdout: output,
  successCodes: null,
  temporaryFailCodes: null
}

CWL tool discovery

How can a CWL tool be discovered in a Java workflow engine such as Apache Taverna? Is a YAML parser used to provide the CWL file-reading capability? If so, can I use it to get the CWL tool configuration data (metadata)?

secondaryFiles with pattern cannot be parsed when given in workflow inputs

Parsing a packed CWL document that has secondaryFiles given in the workflow inputs raises the error below:

org.w3id.cwl.cwl1_2.utils.ValidationException: Failed to match union type
  Trying 'RecordField'
    the `class` field is not valid because:
      Expected one of [Ljava.lang.String;@7c28c1
  Trying 'RecordField'
    the `class` field is not valid because:
      Expected one of [Ljava.lang.String;@75b3673
    the `inputs` field is not valid because:
      Failed to match union type
        Expected object with Java type of java.util.List but got java.util.LinkedHashMap
        Trying 'RecordField'
          the `secondaryFiles` field is not valid because:
            Missing 'pattern' in secondaryFiles specification entry.
    the `expression` field is not valid because:
      Expected a string.

where the secondaryFiles is given in the standard schema:

                        "secondaryFiles": [
                            {
                                "pattern": "^.bai",
                                "required": true
                            }
                        ], 

However, reformatting it as a list of strings would work:

                         "secondaryFiles": [ "^.bai" ], 

The issue might result from https://github.com/common-workflow-lab/cwljava/blob/fd2d2bb0652e9d57ecee8c380157cc0f3186d636/src/main/java/org/w3id/cwl/cwl1_2/utils/SecondaryFilesDslLoader.java#L38-L44

where the pattern and required entries are removed and the source doc is modified, so when iterating over candidate loaders in UnionLoader.java the later ones cannot find those entries.
For example, when testing I noticed that the CommandInputParameter loader processed the inputs first and removed the secondaryFiles pattern and required entries, but later, when the workflow input parameter loader tried to parse the doc, it could only find an empty map.

The secondaryFiles as tool inputs seem to be working. So do the ones in the short format at both the workflow and tool level.
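
The failure mode described above, and one possible way around it, can be sketched without any cwljava code. In the example below, consumePattern is a hypothetical loader that removes keys as it reads them, and firstMatch is a hypothetical union step that hands every candidate its own copy of the input so that a failed attempt cannot affect the next one. A shallow copy is enough for this flat example; a nested document would need a deep copy.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration, not cwljava's UnionLoader.
final class UnionSketch {
    private UnionSketch() {}

    /** A loader that mutates its input: it removes "pattern" and "required" while reading them. */
    static String consumePattern(Map<String, Object> entry) {
        Object pattern = entry.remove("pattern");
        entry.remove("required");
        if (pattern == null) {
            throw new IllegalArgumentException(
                    "Missing 'pattern' in secondaryFiles specification entry.");
        }
        return pattern.toString();
    }

    /** Tries each candidate in order, giving each a fresh copy of `doc`; returns the first success. */
    static <T> T firstMatch(Map<String, Object> doc,
                            List<Function<Map<String, Object>, T>> candidates) {
        for (Function<Map<String, Object>, T> candidate : candidates) {
            try {
                return candidate.apply(new LinkedHashMap<>(doc));
            } catch (RuntimeException e) {
                // This candidate did not match; fall through to the next one.
            }
        }
        throw new IllegalArgumentException("Failed to match union type");
    }
}
```

An alternative with the same effect would be for the loader itself to read the entries with get instead of remove, leaving the source document untouched.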

NullPointerException

There are 6 workflow tests that fail with the same exception. Looks to have something to do with parsing the outputSource that refers to a step output.

java.lang.NullPointerException:
[info]   at org.w3id.cwl.cwl1_2.utils.LoadingOptions.expandUrl(LoadingOptions.java:94)
[info]   at org.w3id.cwl.cwl1_2.utils.UriLoader.expandUrl(UriLoader.java:26)
[info]   at org.w3id.cwl.cwl1_2.utils.UriLoader.load(UriLoader.java:48)
[info]   at org.w3id.cwl.cwl1_2.utils.Loader.load(Loader.java:16)
[info]   at org.w3id.cwl.cwl1_2.utils.Loader.loadField(Loader.java:99)
[info]   at org.w3id.cwl.cwl1_2.WorkflowOutputParameterImpl.<init>(WorkflowOutputParameterImpl.java:350)
[info]   at sun.reflect.GeneratedConstructorAccessor54.newInstance(Unknown Source)
[info]   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
[info]   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
[info]   at org.w3id.cwl.cwl1_2.utils.RecordLoader.load(RecordLoader.java:23)
[info]   at org.w3id.cwl.cwl1_2.utils.RecordLoader.load(RecordLoader.java:6)
[info]   at org.w3id.cwl.cwl1_2.utils.UnionLoader.load(UnionLoader.java:26)
[info]   at org.w3id.cwl.cwl1_2.utils.Loader.load(Loader.java:16)
[info]   at org.w3id.cwl.cwl1_2.utils.Loader.loadField(Loader.java:99)
[info]   at org.w3id.cwl.cwl1_2.utils.ArrayLoader.load(ArrayLoader.java:27)
[info]   at org.w3id.cwl.cwl1_2.utils.ArrayLoader.load(ArrayLoader.java:6)
[info]   at org.w3id.cwl.cwl1_2.utils.Loader.load(Loader.java:16)
[info]   at org.w3id.cwl.cwl1_2.utils.IdMapLoader.load(IdMapLoader.java:51)
[info]   at org.w3id.cwl.cwl1_2.utils.Loader.load(Loader.java:16)
[info]   at org.w3id.cwl.cwl1_2.utils.Loader.loadField(Loader.java:99)
[info]   at org.w3id.cwl.cwl1_2.WorkflowImpl.<init>(WorkflowImpl.java:348)
[info]   at sun.reflect.GeneratedConstructorAccessor51.newInstance(Unknown Source)
[info]   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
[info]   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
[info]   at org.w3id.cwl.cwl1_2.utils.RecordLoader.load(RecordLoader.java:23)
[info]   at org.w3id.cwl.cwl1_2.utils.RecordLoader.load(RecordLoader.java:6)
[info]   at org.w3id.cwl.cwl1_2.utils.UnionLoader.load(UnionLoader.java:26)
[info]   at org.w3id.cwl.cwl1_2.utils.Loader.load(Loader.java:16)
[info]   at org.w3id.cwl.cwl1_2.utils.Loader.loadField(Loader.java:99)
[info]   at org.w3id.cwl.cwl1_2.WorkflowStepImpl.<init>(WorkflowStepImpl.java:401)
[info]   at sun.reflect.GeneratedConstructorAccessor50.newInstance(Unknown Source)
[info]   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
[info]   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
[info]   at org.w3id.cwl.cwl1_2.utils.RecordLoader.load(RecordLoader.java:23)
[info]   at org.w3id.cwl.cwl1_2.utils.RecordLoader.load(RecordLoader.java:6)
[info]   at org.w3id.cwl.cwl1_2.utils.UnionLoader.load(UnionLoader.java:26)
[info]   at org.w3id.cwl.cwl1_2.utils.Loader.load(Loader.java:16)
[info]   at org.w3id.cwl.cwl1_2.utils.Loader.loadField(Loader.java:99)
[info]   at org.w3id.cwl.cwl1_2.utils.ArrayLoader.load(ArrayLoader.java:27)
[info]   at org.w3id.cwl.cwl1_2.utils.ArrayLoader.load(ArrayLoader.java:6)
[info]   at org.w3id.cwl.cwl1_2.utils.Loader.load(Loader.java:16)
[info]   at org.w3id.cwl.cwl1_2.utils.IdMapLoader.load(IdMapLoader.java:51)
[info]   at org.w3id.cwl.cwl1_2.utils.Loader.load(Loader.java:16)
[info]   at org.w3id.cwl.cwl1_2.utils.Loader.loadField(Loader.java:99)
[info]   at org.w3id.cwl.cwl1_2.WorkflowImpl.<init>(WorkflowImpl.java:438)
[info]   at sun.reflect.GeneratedConstructorAccessor51.newInstance(Unknown Source)
[info]   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
[info]   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
[info]   at org.w3id.cwl.cwl1_2.utils.RecordLoader.load(RecordLoader.java:23)
[info]   at org.w3id.cwl.cwl1_2.utils.RecordLoader.load(RecordLoader.java:6)
[info]   at org.w3id.cwl.cwl1_2.utils.UnionLoader.load(UnionLoader.java:26)
[info]   at org.w3id.cwl.cwl1_2.utils.Loader.documentLoad(Loader.java:41)
[info]   at org.w3id.cwl.cwl1_2.utils.RootLoader.loadDocument(RootLoader.java:18)
[info]   at org.w3id.cwl.cwl1_2.utils.RootLoader.loadDocument(RootLoader.java:86)
[info]   at org.w3id.cwl.cwl1_2.utils.RootLoader.loadDocument(RootLoader.java:45)

WorkflowStepInput source IDs incorrect when parsing workflow packed with cwlpack

I used cwlpack to pack the conformance test scatter-wf4. The resulting packed workflow (shown below) validates with cwltool --validate. However, when I parse the packed workflow using cwljava, the WorkflowStepInput sources have the form "echo_in1/inp1" when they should be either "main/inp1" or just "inp1".

{
    "cwlVersion": "v1.2",
    "$graph": [
        {
            "id": "echo",
            "class": "CommandLineTool",
            "inputs": {
                "echo_in1": {
                    "type": "string",
                    "inputBinding": {}
                },
                "echo_in2": {
                    "type": "string",
                    "inputBinding": {}
                }
            },
            "outputs": {
                "echo_out": {
                    "type": "string",
                    "outputBinding": {
                        "glob": "step1_out",
                        "loadContents": true,
                        "outputEval": "$(self[0].contents)"
                    }
                }
            },
            "baseCommand": "echo",
            "arguments": [
                "-n",
                "foo"
            ],
            "stdout": "step1_out"
        },
        {
            "id": "main",
            "class": "Workflow",
            "inputs": {
                "inp1": "string[]",
                "inp2": "string[]"
            },
            "requirements": [
                {
                    "class": "ScatterFeatureRequirement"
                }
            ],
            "steps": {
                "step1": {
                    "scatter": [
                        "echo_in1",
                        "echo_in2"
                    ],
                    "scatterMethod": "dotproduct",
                    "in": {
                        "echo_in1": "inp1",
                        "echo_in2": "inp2"
                    },
                    "out": [
                        "echo_out"
                    ],
                    "run": "#echo"
                }
            },
            "outputs": [
                {
                    "id": "out",
                    "outputSource": "step1/echo_out",
                    "type": {
                        "type": "array",
                        "items": "string"
                    }
                }
            ]
        }
    ],
    "inputs": [],
    "outputs": [],
    "requirements": [
        {
            "class": "InlineJavascriptRequirement"
        }
    ]
}
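
A minimal sketch of the resolution expected here, not cwljava's actual code. ScopedRef and resolve are hypothetical names. It assumes that the source field of a step input is declared in the CWL schema with a refScope of 2, so that two trailing segments of the enclosing ID are dropped before the reference is appended. The full Schema Salad lookup also searches further parent scopes for a matching ID, which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of cwljava: resolves a relative reference
// against the ID of the field that contains it.
final class ScopedRef {
    /**
     * @param baseId   ID of the enclosing object, e.g. "main/step1/echo_in1"
     * @param ref      relative reference, e.g. "inp1"
     * @param refScope number of trailing path segments to drop from baseId
     *                 (assumed to be 2 for WorkflowStepInput.source)
     */
    static String resolve(String baseId, String ref, int refScope) {
        List<String> segments = new ArrayList<>(Arrays.asList(baseId.split("/")));
        // Walk up refScope levels from the enclosing object.
        for (int i = 0; i < refScope && !segments.isEmpty(); i++) {
            segments.remove(segments.size() - 1);
        }
        // Append the reference to whatever scope is left.
        segments.add(ref);
        return String.join("/", segments);
    }
}
```

Under that assumption, ScopedRef.resolve("main/step1/echo_in1", "inp1", 2) gives "main/inp1", which is the form the report expects. Dropping only the step segment or nothing at all would leave the step input name in the path.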

A small issue about abstract and method definition

@tetron I have a small dilemma. InputParameter implements Parameter, which is an interface because it is abstract. If Parameter is an interface, then InputParameter has to define the setter for type, called settype, with an argument of type RecordSchema. But InputParameter specializes RecordSchema to InputRecordSchema. The same applies to EnumSchema, which is specialized to InputEnumSchema, and ArraySchema, which is specialized to InputArraySchema.

Exactly the same thing happens for OutputParameter.

A similar thing happens for Workflow, where OutputParameter is specialized to WorkflowOutputParameter. The field outputs comes from Process, where it is an array of OutputParameter.

A similar thing happens for CommandLineTool, where InputParameter is specialized to CommandInputParameter. The field inputs comes from Process, where it is an array of InputParameter. OutputParameter is likewise specialized to CommandOutputParameter, so it also would not match the outputs field as defined in Process, which is an array of OutputParameter.

So we have two options:

  • Ignore the specialization restriction and also define the methods settype( RecordSchema ... ), settype( EnumSchema ... ), settype( ArraySchema ... ) etc., since Java will not compile a class that leaves methods of an implemented interface undefined,
  • Or update Draft-3?
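
For illustration only, a third possibility in plain Java is to make the schema type a type parameter of the interface, so that each implementation narrows it without extra overloads. This is a sketch under assumptions, not how the generated code works: the types below are simplified stand-ins, only the RecordSchema case is shown, and the real type field is a union that this does not model.

```java
// Hypothetical, simplified stand-ins for the generated schema types.
interface RecordSchema {
    String name();
}

interface InputRecordSchema extends RecordSchema {
}

// The schema type is a type parameter, so an implementation can narrow it.
interface Parameter<R extends RecordSchema> {
    void setType(R schema);

    R getType();
}

// InputParameter accepts only the specialized InputRecordSchema and still
// satisfies the Parameter interface, with no setType(RecordSchema) overload.
final class InputParameter implements Parameter<InputRecordSchema> {
    private InputRecordSchema type;

    @Override
    public void setType(InputRecordSchema schema) {
        this.type = schema;
    }

    @Override
    public InputRecordSchema getType() {
        return type;
    }
}
```

The trade-off is that code handling a general Parameter has to carry the type parameter or use a wildcard such as Parameter<? extends RecordSchema>, through which the setter can no longer be called safely.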

Thanks,
`p

Partial/lazy parse?

Hello! We're integrating cwljava into Dockstore, and occasionally we need to pluck a few (or fewer) values out of a CWL document: for example, the id and doc fields (if they exist) from the root workflow. In these situations, we don't care about the contents of the rest of the CWL and want to be tolerant of errors: the CWL as a whole might not be 100% well-formed, so we don't want to completely parse it and trigger a ValidationException. We want to parse just enough of the CWL that we can extract the desired information.

Is there a way to use cwljava to perform a partial parse? In other words, to somehow prevent the irrelevant portions of the CWL from being parsed?

If that's possible, great. If not, it would be a handy feature that we would definitely use.
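
To make the request concrete, here is a minimal sketch of the tolerant lookup described above. It does not use cwljava at all: CwlPeek and its methods are hypothetical, and it assumes the document has already been turned into plain maps and lists by a generic YAML or JSON parser. It takes the first object whose class is Workflow as the root workflow, which is a simplification, and it only handles a doc field that is a single string.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of cwljava. Works on the untyped maps and
// lists produced by a generic YAML/JSON parser and never validates them.
final class CwlPeek {
    /** Finds the first object with class "Workflow": the root itself or an entry of "$graph". */
    static Optional<Map<?, ?>> findWorkflow(Object root) {
        if (!(root instanceof Map)) {
            return Optional.empty();
        }
        Map<?, ?> doc = (Map<?, ?>) root;
        if ("Workflow".equals(doc.get("class"))) {
            return Optional.of(doc);
        }
        Object graph = doc.get("$graph");
        if (graph instanceof List) {
            for (Object entry : (List<?>) graph) {
                if (entry instanceof Map && "Workflow".equals(((Map<?, ?>) entry).get("class"))) {
                    return Optional.of((Map<?, ?>) entry);
                }
            }
        }
        return Optional.empty();
    }

    /** Reads a string-valued field; an absent or non-string value gives an empty result. */
    static Optional<String> stringField(Map<?, ?> node, String key) {
        Object value = node.get(key);
        return value instanceof String ? Optional.of((String) value) : Optional.empty();
    }
}
```

Malformed sections elsewhere in the document are never visited, so they cannot raise an error, but nothing here checks that the values found are valid CWL either.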

Using this SDK to generate CWL Documents

Can this SDK be used to generate CWL documents?
We are trying to add a CWL exporter to our project (we have our workflows stored in a custom format), and since we have a Java backend, I was hoping that we could use the classes generated in this SDK to create the documents and then export them in YAML. However, I don't see any setters or constructors in the generated classes that I could use to export our definitions into CWL CommandLineTool, Workflow, etc. Am I missing something?
I also tried to use schema-salad to re-generate the Java classes but couldn't find the option to generate the classes differently.
