Comments (12)

jeffbaumes commented on July 18, 2024

In the task spec docs, there is a WORKFLOW_TASK spec for JSON, which defines multiple tasks and the connections between tasks. Is that close to what you are after?

from girder_worker.

rbaral commented on July 18, 2024

The link https://girder-worker.readthedocs.io/en/latest/api-docs.html#the-task-specification always takes me to the page which says:

Service Unavailable

The service is temporarily unavailable. Please try again later.

Basically, the examples we have seen in the girder worker docs have everything stated in the Python code itself. Instead, I was looking for a way to specify the workflow details in XML, JSON, or any other well-formatted text, so that generic Python code (something like a parser for the workflow description) can load or update the workflow configuration from the XML/JSON file.

To give a simple example, most logging tools have an .xml or .properties file that holds the logging configuration. The same configuration can be hard-coded in the Logger class itself, which makes it tightly coupled. The girderworker examples follow the latter convention (the workflow details are tightly coupled with the Python code). What I am looking for is the former convention (separating the workflow configuration from the workflow runner).
Am I clear?

Thanks.

jeffbaumes commented on July 18, 2024

Not sure why the RTD link is not working for you. The docs I mentioned are also hosted on GitHub here.

There is a pure-JSON no-Python way of constructing workflow specs. The facebook example (here on RTD or here on GitHub) shows a raw workflow object being constructed without using the pythonic Workflow helper class. Here is the spec fully put together as pure JSON:

{
  "mode": "workflow",
  "inputs": [
    {
      "type": "graph",
      "name": "G",
      "format": "adjacencylist"
    }
  ],
  "outputs": [
    {
      "type": "graph",
      "name": "result_graph",
      "format": "networkx"
    }
  ],
  "connections": [
    {
      "input": "G",
      "input_step": "most_popular",
      "name": "G"
    },
    {
      "output": "G",
      "input_step": "find_neighborhood",
      "input": "G",
      "output_step": "most_popular"
    },
    {
      "output": "most_popular_person",
      "input_step": "find_neighborhood",
      "input": "most_popular_person",
      "output_step": "most_popular"
    },
    {
      "output": "subgraph",
      "name": "result_graph",
      "output_step": "find_neighborhood"
    }
  ],
  "steps": [
    {
      "name": "most_popular",
      "task": {
        "inputs": [
          {
            "type": "graph",
            "name": "G",
            "format": "networkx"
          }
        ],
        "script": "\nfrom networkx import degree\n\ndegrees = degree(G)\nmost_popular_person = max(degrees, key=degrees.get)\n",
        "outputs": [
          {
            "type": "string",
            "name": "most_popular_person",
            "format": "text"
          },
          {
            "type": "graph",
            "name": "G",
            "format": "networkx"
          }
        ]
      }
    },
    {
      "name": "find_neighborhood",
      "task": {
        "inputs": [
          {
            "type": "graph",
            "name": "G",
            "format": "networkx"
          },
          {
            "type": "string",
            "name": "most_popular_person",
            "format": "text"
          }
        ],
        "script": "\nfrom networkx import ego_graph\n\nsubgraph = ego_graph(G, most_popular_person)\n",
        "outputs": [
          {
            "type": "graph",
            "name": "subgraph",
            "format": "networkx"
          }
        ]
      }
    }
  ]
}
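
Because the spec is plain JSON, it can live in its own file and be read back with the standard json module. Here is a minimal sketch, assuming the spec is saved under the hypothetical name workflow_spec.json. check_connections is an illustrative helper, not part of girder_worker, and a trimmed stand-in for the spec above is used to keep the example short:

```python
import json
import os
import tempfile

# A trimmed stand-in for the spec above (scripts shortened for brevity).
SPEC_TEXT = """
{
  "mode": "workflow",
  "inputs": [{"type": "graph", "name": "G", "format": "adjacencylist"}],
  "outputs": [{"type": "graph", "name": "result_graph", "format": "networkx"}],
  "connections": [
    {"name": "G", "input": "G", "input_step": "most_popular"},
    {"output": "G", "output_step": "most_popular",
     "input": "G", "input_step": "find_neighborhood"},
    {"name": "result_graph", "output": "subgraph",
     "output_step": "find_neighborhood"}
  ],
  "steps": [
    {"name": "most_popular", "task": {"script": "pass"}},
    {"name": "find_neighborhood", "task": {"script": "pass"}}
  ]
}
"""


def load_spec(path):
    """Read a workflow spec from a JSON file into a plain dict."""
    with open(path) as f:
        return json.load(f)


def check_connections(spec):
    """Return step names that connections mention but "steps" does not define."""
    defined = set(step["name"] for step in spec["steps"])
    used = set()
    for conn in spec["connections"]:
        for key in ("input_step", "output_step"):
            if key in conn:
                used.add(conn[key])
    return sorted(used - defined)


# Write the spec to a temporary file, then load it back as a runner would.
path = os.path.join(tempfile.mkdtemp(), "workflow_spec.json")
with open(path, "w") as f:
    f.write(SPEC_TEXT)

spec = load_spec(path)
missing = check_connections(spec)
step_names = [step["name"] for step in spec["steps"]]
```

The resulting dict is what would be handed to girder_worker.run together with the inputs and outputs bindings.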

rbaral commented on July 18, 2024

Thanks, that should be helpful. I will try to load and run this JSON using the girderworker.run method.
I was wondering whether the script needs to be inlined here too. Is there any way for the script field to point to a separate file or method call instead of an inline script?

Thanks.

jeffbaumes commented on July 18, 2024

For single-step tasks, you can use the script_uri field instead of script to point to a file:

"script_uri": "file:///path/to/my/script.py"

Using workflow = girder_worker.load(json_filename) will resolve the script_uri and place its contents into the script field, ready to be sent to the run function.

I notice that this will not currently work for ingesting multiple scripts in workflow specifications. I will create an issue to resolve this.
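
In the meantime, one possible workaround is to inline the scripts yourself before calling run. The sketch below is not girder_worker's implementation: inline_script and inline_scripts are hypothetical helpers, and they assume only file:// URIs, which they resolve by stripping the prefix:

```python
import os
import tempfile

FILE_PREFIX = "file://"


def inline_script(task):
    """Replace a task's "script_uri" with a "script" field holding the file text."""
    uri = task.get("script_uri")
    if uri is None or "script" in task:
        return task
    if not uri.startswith(FILE_PREFIX):
        raise ValueError("only file:// URIs are handled in this sketch: " + uri)
    with open(uri[len(FILE_PREFIX):]) as f:
        task["script"] = f.read()
    del task["script_uri"]
    return task


def inline_scripts(spec):
    """Inline scripts for a single task spec or for every step of a workflow."""
    if spec.get("mode") == "workflow":
        for step in spec.get("steps", []):
            inline_script(step["task"])
    else:
        inline_script(spec)
    return spec


# Demonstration with a throwaway script file.
script_path = os.path.join(tempfile.mkdtemp(), "script.py")
with open(script_path, "w") as f:
    f.write('result = "abcd.txt"\n')

workflow = {
    "mode": "workflow",
    "steps": [{"name": "only_step",
               "task": {"script_uri": FILE_PREFIX + script_path}}],
}
inline_scripts(workflow)
```

After this call, each step's task carries a script field, which is the form the Python executor reads.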

rbaral commented on July 18, 2024

I get the key not found error when I use script_uri. The same works when I use script. The error is:

Traceback (most recent call last):
  File "austinworkflow.py", line 37, in <module>
    outputs={'sliceroutputfile': {'format': 'text'}})
  File "/Users/baral/girder_env/lib/python2.7/site-packages/girder_worker/utils.py", line 291, in wrapped
    return fn(*args, **kwargs)
  File "/Users/baral/girder_env/lib/python2.7/site-packages/girder_worker/__init__.py", line 292, in run
    auto_convert=auto_convert, validate=validate, **kwargs)
  File "/Users/baral/girder_env/lib/python2.7/site-packages/girder_worker/executors/workflow.py", line 46, in run
    out = girder_worker.run(steps[step]["task"], bindings[step])
  File "/Users/baral/girder_env/lib/python2.7/site-packages/girder_worker/utils.py", line 291, in wrapped
    return fn(*args, **kwargs)
  File "/Users/baral/girder_env/lib/python2.7/site-packages/girder_worker/__init__.py", line 292, in run
    auto_convert=auto_convert, validate=validate, **kwargs)
  File "/Users/baral/girder_env/lib/python2.7/site-packages/girder_worker/executors/python.py", line 29, in run
    lines = task["script"].split("\n")
KeyError: 'script'

My JSON file (austinspec.json):

{
  "mode": "workflow",
  "inputs": [
    {
      "type": "string",
      "name": "slicerinputfile",
      "format": "text"
    },
    {
      "type": "number",
      "name": "lat",
      "format": "number"
    },
    {
      "type": "number",
      "name": "lon",
      "format": "number"
    }
  ],
  "outputs": [
    {
      "type": "string",
      "name": "sliceroutputfile",
      "format": "text"
    }
  ],
  "connections": [
    {
      "input": "slicerinputfile",
      "input_step": "slicer",
      "name": "slicerinputfile"
    },
    {
      "input": "lat",
      "input_step": "slicer",
      "name": "lat"
    },
    {
      "input": "lon",
      "input_step": "slicer",
      "name": "lon"
    },
    {
      "output": "sliceroutputfile",
      "output_step": "slicer",
      "name": "sliceroutputfile"
    }
  ],
  "steps": [
    {
      "name": "slicer",
      "task": {
        "inputs": [
          {
            "type": "string",
            "name": "slicerinputfile",
            "format": "text"
          },
          {
            "type": "number",
            "name": "lat",
            "format": "number"
          },
          {
            "type": "number",
            "name": "lon",
            "format": "number"
          }
        ],
        "script_uri": "file:///slicerscript.py",
        "outputs": [
          {
            "type": "string",
            "name": "sliceroutputfile",
            "format": "text"
          }
        ]
      }
    }
  ]
}

My workflow execution file (austinworkflow.py):

import json
import girder_worker
import yaml


fileNameParam = "tasmax.CRCM.ccsm-current.dayavg.common.nc"
lat = 30.25
lon = -97.75


slicerInputFileName = {
    'type': 'string',
    'format': 'text',
    'data': fileNameParam
    }

latVal = {
    'type': 'number',
    'format': 'number',
    'data': lat
    }

lonVal = {
    'type': 'number',
    'format': 'number',
    'data': lon
    }

with open('austinspec.json') as spec:
    workflow = yaml.safe_load(spec)  # to avoid the unicode prefix u' in the keys

for item in workflow.keys():
    print "keys:", str(item)

output = girder_worker.run(workflow,
                           inputs={'slicerinputfile': slicerInputFileName, 'lat': latVal, 'lon': lonVal},
                           outputs={'sliceroutputfile': {'format': 'text'}})

print "output is:", str(output['sliceroutputfile'])

My script file (slicerscript.py) contains just the single line:

sliceroutputfile = "abcd.txt"

Also, why does the girderworker parser have problems even with blank spaces? It seems to need very strict JSON, with no extra spaces allowed.

NOTE: All the files are in the same folder.

Thanks.

jeffbaumes commented on July 18, 2024

> I get the key not found error when I use script_uri. The same works when I use script.

Yes, this is because loading script URIs does not yet work for workflow steps (see #52).

> Also, why does the parser of the girderworker gives problem with even the blank spaces. It needs the JSON very strict, no extra spaces allowed.

I'm not sure what exactly you mean, but yes it must be valid JSON. If you give an example of what doesn't work I can better see what the issue is.

rbaral commented on July 18, 2024

From this page http://girder-worker.readthedocs.io/en/latest/api-docs.html, the method

girder_worker.load(task_file)[source]
Load a task JSON into memory, resolving any "script_uri" fields by replacing it with a "script" field containing the contents pointed to by "script_uri" (see girder_worker.uri for URI formats). A script_fetch_mode field may also be set

Parameters: analysis_file – The path to the JSON file to load.
Returns: The analysis as a dictionary.

looks like the thing that I am trying.

From your response above, it should work for a single step, but it didn't work in my case even though I had a single step. Having inline code is really painful if the code has many lines.

To clarify the second part of my previous question, it gave me a strange error, "\t found in column...", when there was some whitespace in the JSON file.
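
One possible explanation, which is an assumption and not confirmed anywhere in this thread: austinworkflow.py reads the spec with yaml.safe_load, and YAML does not allow tab characters for indentation, whereas JSON treats tabs as ordinary whitespace. A small check with the standard json module shows that tab-indented and loosely spaced JSON both parse fine:

```python
import json

# The same object written with tabs, and with extra spaces and blank lines.
tabbed = '{\n\t"mode": "workflow",\n\t"steps": [\n\t\t{"name": "slicer"}\n\t]\n}'
spaced = '{   "mode" :  "workflow" ,\n\n  "steps" : [ {"name" : "slicer"} ]   }'

# json.loads ignores insignificant whitespace (space, tab, newline, carriage return).
from_tabs = json.loads(tabbed)
from_spaces = json.loads(spaced)
```

If tabs are indeed the culprit, loading the file with json.load instead of yaml.safe_load should sidestep the error.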

jeffbaumes commented on July 18, 2024

> From your response above, it should work for single step, but it didn't work for my case even though I had single step.

Ah, I had meant it does not work for workflow specs at all (no matter how many steps) but only for task specs. For example, workflow = girder_worker.load('austinspec.json') should work if the JSON simply contained:

      {
        "inputs": [
          {
            "type": "string",
            "name": "slicerinputfile",
            "format": "text"
          },
          {
            "type": "number",
            "name": "lat",
            "format": "number"
          },
          {
            "type": "number",
            "name": "lon",
            "format": "number"
          }
        ],
        "script_uri": "file://slicerscript.py",
        "outputs": [
          {
            "type": "string",
            "name": "sliceroutputfile",
            "format": "text"
          }
        ]
      }

Note also the two slashes instead of three in the script URI. Girder worker essentially removes the file:// prefix to find the path, so three slashes make it look for the file at / instead of in the current directory.
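
The difference is easy to see by stripping the prefix by hand. This only mimics the behavior described above; it is not girder_worker's own code, and path_from_uri is a made-up name:

```python
def path_from_uri(uri, prefix="file://"):
    """Drop the file:// prefix, leaving whatever follows as the path."""
    return uri[len(prefix):] if uri.startswith(prefix) else uri


relative = path_from_uri("file://slicerscript.py")    # two slashes
absolute = path_from_uri("file:///slicerscript.py")   # three slashes

print(repr(relative))  # a path relative to the current directory
print(repr(absolute))  # a path anchored at the filesystem root
```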

jeffbaumes commented on July 18, 2024

See #53. Workflows should support script_uri on master now.

rbaral commented on July 18, 2024

Does the girderworker engine support reading a JSON string rather than a JSON file? I tried it, but it gives me
AttributeError: 'str' object has no attribute 'get'

I took the contents of a valid JSON file as a string and fed it to the workflow engine as a Python string variable. I mean a single string, not the decomposed maps as in the fbworkflow example.

jeffbaumes commented on July 18, 2024

> Does the girderworker engine support the reading of JSON string rather than JSON file?

If you are referring to the run method, it cannot accept a JSON string or a Python file object. It requires the JSON to be loaded as a Python dict (using girder_worker.load, json.load, or json.loads) before being passed as an argument to run.
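
For example, a JSON string can be turned into the dict that run expects with json.loads. This is a minimal sketch: the spec string is a shortened, illustrative task, and the commented-out run call is shown only for orientation:

```python
import json

# An illustrative single-task spec held in a Python string.
spec_string = """
{
  "inputs": [],
  "outputs": [{"type": "string", "name": "sliceroutputfile", "format": "text"}],
  "script": "sliceroutputfile = 'abcd.txt'"
}
"""

# Parse the string into a plain dict before handing it to the engine.
task = json.loads(spec_string)

# A str has no .get method, which is consistent with the AttributeError
# reported above; the parsed dict does.
string_has_get = hasattr(spec_string, "get")
dict_has_get = hasattr(task, "get")

# The dict, not the string, is what would be passed on:
# output = girder_worker.run(task, inputs={...}, outputs={...})
```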
