Comments (12)
In the task spec docs, there is a WORKFLOW_TASK
spec for JSON, which defines multiple tasks and the connections between tasks. Is that close to what you are after?
from girder_worker.
The link https://girder-worker.readthedocs.io/en/latest/api-docs.html#the-task-specification always takes me to a page that says:
Service Unavailable
The service is temporarily unavailable. Please try again later.
Basically, the examples we have seen in the girder_worker docs have everything stated in the Python code itself. Instead, I was looking for a way to specify the workflow details in XML, JSON, or any well-formatted text, so that it can be loaded (or updated) by generic Python code (essentially a parser of the workflow description) that reads the workflow configuration from the XML/JSON file.
To give a simple example, most logging tools accept an .xml or .properties file with the logging configuration. The same configuration could be hard-coded in the Logger class itself (which makes it highly coupled). The girder_worker examples follow the latter convention (the workflow details are tightly coupled with the Python code itself). What I was looking for is the former (separating the workflow configuration from the workflow runner).
Am I clear?
Thanks.
Not sure why the RTD link is not working for you. The docs I mentioned are also hosted on GitHub here.
There is a pure-JSON, no-Python way of constructing workflow specs. The facebook example (here on RTD or here on GitHub) shows a raw workflow object being constructed without using the pythonic Workflow helper class. Here is the spec fully put together as pure JSON:
{
"mode": "workflow",
"inputs": [
{
"type": "graph",
"name": "G",
"format": "adjacencylist"
}
],
"outputs": [
{
"type": "graph",
"name": "result_graph",
"format": "networkx"
}
],
"connections": [
{
"input": "G",
"input_step": "most_popular",
"name": "G"
},
{
"output": "G",
"input_step": "find_neighborhood",
"input": "G",
"output_step": "most_popular"
},
{
"output": "most_popular_person",
"input_step": "find_neighborhood",
"input": "most_popular_person",
"output_step": "most_popular"
},
{
"output": "subgraph",
"name": "result_graph",
"output_step": "find_neighborhood"
}
],
"steps": [
{
"name": "most_popular",
"task": {
"inputs": [
{
"type": "graph",
"name": "G",
"format": "networkx"
}
],
"script": "\nfrom networkx import degree\n\ndegrees = degree(G)\nmost_popular_person = max(degrees, key=degrees.get)\n",
"outputs": [
{
"type": "string",
"name": "most_popular_person",
"format": "text"
},
{
"type": "graph",
"name": "G",
"format": "networkx"
}
]
}
},
{
"name": "find_neighborhood",
"task": {
"inputs": [
{
"type": "graph",
"name": "G",
"format": "networkx"
},
{
"type": "string",
"name": "most_popular_person",
"format": "text"
}
],
"script": "\nfrom networkx import ego_graph\n\nsubgraph = ego_graph(G, most_popular_person)\n",
"outputs": [
{
"type": "graph",
"name": "subgraph",
"format": "networkx"
}
]
}
}
]
}
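For reference, such a spec can be sanity-checked before handing it to girder_worker by parsing it with the standard json module and verifying that every step a connection refers to is actually defined. The checker below is not part of girder_worker, just a sketch, run over a trimmed-down version of the spec above:

```python
import json

# Trimmed-down version of the facebook workflow spec above, with only the
# fields this check needs ("steps" and "connections").
spec_json = """
{
  "mode": "workflow",
  "steps": [
    {"name": "most_popular"},
    {"name": "find_neighborhood"}
  ],
  "connections": [
    {"name": "G", "input": "G", "input_step": "most_popular"},
    {"output": "G", "output_step": "most_popular",
     "input": "G", "input_step": "find_neighborhood"}
  ]
}
"""

def referenced_steps(spec):
    """Return the step names referenced by connections, failing on unknown names."""
    defined = {step["name"] for step in spec["steps"]}
    used = set()
    for conn in spec["connections"]:
        for key in ("input_step", "output_step"):
            if key in conn:
                if conn[key] not in defined:
                    raise ValueError("unknown step: " + conn[key])
                used.add(conn[key])
    return used

spec = json.loads(spec_json)
print(sorted(referenced_steps(spec)))  # ['find_neighborhood', 'most_popular']
```

A check like this catches typos in step names before the worker reports a more opaque error at run time.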
Thanks. That should be helpful. I will try to load and run this JSON using the girder_worker.run method.
I was wondering if the script needs to be inlined here too. Is there any way the script can point to a separate file/method call instead of being an inline script?
Thanks.
For single-step tasks, you can use the script_uri field instead of script to point to a file:
"script_uri": "file:///path/to/my/script.py"
Using workflow = girder_worker.load(json_filename) will resolve the script_uri and place its contents into the script field, ready to be sent to the run function.
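In rough terms, the resolution behaves like the sketch below (this is a simplified illustration, not girder_worker's actual implementation, and it only handles the file:// scheme):

```python
import json
import os
import tempfile

def load_task(path):
    """Load a task JSON, replacing a file:// "script_uri" with a "script" field."""
    with open(path) as f:
        task = json.load(f)
    uri = task.pop("script_uri", None)
    if uri is not None and uri.startswith("file://"):
        # Strip the scheme prefix; what remains is the path to the script.
        script_path = uri[len("file://"):]
        with open(script_path) as script:
            task["script"] = script.read()
    return task

# Usage with a temporary script and spec file.
tmp = tempfile.mkdtemp()
script_path = os.path.join(tmp, "slicerscript.py")
with open(script_path, "w") as f:
    f.write('sliceroutputfile = "abcd.txt"\n')

spec_path = os.path.join(tmp, "task.json")
with open(spec_path, "w") as f:
    json.dump({"inputs": [], "outputs": [],
               "script_uri": "file://" + script_path}, f)

task = load_task(spec_path)
print("script" in task)  # True
```

After loading, the task dict carries the script body inline, which is exactly the shape the run function expects.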
I notice that this will not currently work for ingesting multiple scripts in workflow specifications. I will create an issue to resolve this.
I get a KeyError when I use script_uri. The same spec works when I use script. The error is:
Traceback (most recent call last):
File "austinworkflow.py", line 37, in <module>
outputs={'sliceroutputfile': {'format': 'text'}})
File "/Users/baral/girder_env/lib/python2.7/site-packages/girder_worker/utils.py", line 291, in wrapped
return fn(*args, **kwargs)
File "/Users/baral/girder_env/lib/python2.7/site-packages/girder_worker/__init__.py", line 292, in run
auto_convert=auto_convert, validate=validate, **kwargs)
File "/Users/baral/girder_env/lib/python2.7/site-packages/girder_worker/executors/workflow.py", line 46, in run
out = girder_worker.run(steps[step]["task"], bindings[step])
File "/Users/baral/girder_env/lib/python2.7/site-packages/girder_worker/utils.py", line 291, in wrapped
return fn(*args, **kwargs)
File "/Users/baral/girder_env/lib/python2.7/site-packages/girder_worker/__init__.py", line 292, in run
auto_convert=auto_convert, validate=validate, **kwargs)
File "/Users/baral/girder_env/lib/python2.7/site-packages/girder_worker/executors/python.py", line 29, in run
lines = task["script"].split("\n")
KeyError: 'script'
My JSON file (austinspec.json):
{
"mode": "workflow",
"inputs": [
{
"type": "string",
"name": "slicerinputfile",
"format": "text"
},
{
"type": "number",
"name": "lat",
"format": "number"
},
{
"type": "number",
"name": "lon",
"format": "number"
}
],
"outputs": [
{
"type": "string",
"name": "sliceroutputfile",
"format": "text"
}
],
"connections": [
{
"input": "slicerinputfile",
"input_step": "slicer",
"name": "slicerinputfile"
},
{
"input": "lat",
"input_step": "slicer",
"name": "lat"
},
{
"input": "lon",
"input_step": "slicer",
"name": "lon"
},
{
"output": "sliceroutputfile",
"output_step": "slicer",
"name": "sliceroutputfile"
}
],
"steps": [
{
"name": "slicer",
"task": {
"inputs": [
{
"type": "string",
"name": "slicerinputfile",
"format": "text"
},
{
"type": "number",
"name": "lat",
"format": "number"
},
{
"type": "number",
"name": "lon",
"format": "number"
}
],
"script_uri": "file:///slicerscript.py",
"outputs": [
{
"type": "string",
"name": "sliceroutputfile",
"format": "text"
}
]
}
}
]
}
My workflow execution file (austinworkflow.py):
import json
import girder_worker
import yaml
fileNameParam = "tasmax.CRCM.ccsm-current.dayavg.common.nc"
lat = 30.25
lon = -97.75
slicerInputFileName = {
'type': 'string',
'format': 'text',
'data': fileNameParam
}
latVal = {
'type': 'number',
'format': 'number',
'data': lat
}
lonVal = {
'type': 'number',
'format': 'number',
'data': lon
}
with open('austinspec.json') as spec:
    workflow = yaml.safe_load(spec)  # to avoid the unicode u'' prefix in the keys
for item in workflow.keys():
    print "keys:", str(item)
output = girder_worker.run(
    workflow,
    inputs={'slicerinputfile': slicerInputFileName, 'lat': latVal, 'lon': lonVal},
    outputs={'sliceroutputfile': {'format': 'text'}})
print "output is:", str(output['sliceroutputfile'])
My script file (slicerscript.py) contains just the single line:
sliceroutputfile = "abcd.txt"
Also, why does the girder_worker parser have problems with even blank spaces? It needs the JSON to be very strict, with no extra whitespace allowed.
NOTE: All the files are in the same folder.
Thanks.
I get a KeyError when I use script_uri. The same spec works when I use script.
Yes, this is because loading script URIs does not yet work for workflow steps (see #52).
Also, why does the girder_worker parser have problems with even blank spaces? It needs the JSON to be very strict, with no extra whitespace allowed.
I'm not sure what exactly you mean, but yes it must be valid JSON. If you give an example of what doesn't work I can better see what the issue is.
From this page http://girder-worker.readthedocs.io/en/latest/api-docs.html, the method

girder_worker.load(task_file)
    Load a task JSON into memory, resolving any "script_uri" fields by replacing it with a "script" field containing the contents pointed to by "script_uri" (see girder_worker.uri for URI formats). A script_fetch_mode field may also be set.
    Parameters: analysis_file – The path to the JSON file to load.
    Returns: The analysis as a dictionary.

looks like the thing that I am trying.
From your response above, it should work for a single step, but it didn't work in my case even though I had a single step. Having inline code is really painful when the script has many lines.
To clarify the second part of my previous question: it gave me a strange error ("\t found in column ...") when there was some whitespace in the JSON file.
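As an aside, a "\t found in column ..." message is characteristic of YAML parsers (such as the yaml.safe_load call in austinworkflow.py), which forbid tabs as indentation; JSON itself treats tabs between tokens as ordinary whitespace. A quick stdlib-only check of that assumption:

```python
import json

# A JSON document indented with tab characters: json.loads accepts it,
# because JSON permits spaces, tabs, and newlines between tokens.
doc = '{\n\t"mode": "workflow",\n\t"steps": []\n}'
print(json.loads(doc))  # {'mode': 'workflow', 'steps': []}
```

So if tabs in the spec file trigger errors, switching from yaml.safe_load to json.load may avoid them (in Python 2, json.load returns unicode keys with the u'' prefix, but that is cosmetic and does not affect lookups).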
From your response above, it should work for a single step, but it didn't work in my case even though I had a single step.
Ah, I meant that it does not work for workflow specs at all (no matter how many steps), only for task specs. For example, workflow = girder_worker.load('austinspec.json') should work if the JSON simply contained:
{
"inputs": [
{
"type": "string",
"name": "slicerinputfile",
"format": "text"
},
{
"type": "number",
"name": "lat",
"format": "number"
},
{
"type": "number",
"name": "lon",
"format": "number"
}
],
"script_uri": "file://slicerscript.py",
"outputs": [
{
"type": "string",
"name": "sliceroutputfile",
"format": "text"
}
]
}
Note also the two slashes instead of three in the script URI. Girder worker essentially removes the file:// prefix to find the path, so using three slashes will assume the file is at / instead of the current path.
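The slash counting can be seen with plain string slicing (a sketch of the behaviour described above, not girder_worker's actual code):

```python
import os

# Stripping the "file://" prefix leaves either a relative or an absolute path,
# depending on whether a third slash follows the scheme.
for uri in ("file://slicerscript.py", "file:///slicerscript.py"):
    path = uri[len("file://"):]
    print(path, "absolute" if os.path.isabs(path) else "relative")
# slicerscript.py relative
# /slicerscript.py absolute
```

So file://slicerscript.py resolves relative to the current directory, while file:///slicerscript.py looks for the script at the filesystem root.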
See #53. Workflows should support script_uri on master now.
Does the girder_worker engine support reading a JSON string rather than a JSON file? I tried it, but it gives me
AttributeError: 'str' object has no attribute 'get'
I fed the string form of the JSON content from the valid JSON file to the workflow engine as a Python string variable. I mean a single string, not the decomposed maps as in the fbworkflow example.
Does the girder_worker engine support reading a JSON string rather than a JSON file?
If you are referring to the run method, it cannot accept a JSON string or a Python file object. It requires the JSON to be loaded as a Python dict (using girder_worker.load, json.load, or json.loads) before being passed as an argument to run.
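A minimal sketch of the required conversion, using only the stdlib (the girder_worker.run call is left as a comment since it needs a running worker environment):

```python
import json

# run() takes a dict, not a raw JSON string, so decode the string first.
spec_string = ('{"mode": "workflow", "inputs": [], "outputs": [],'
               ' "steps": [], "connections": []}')
workflow = json.loads(spec_string)
print(type(workflow).__name__)  # dict
# girder_worker.run(workflow, inputs=..., outputs=...) would then accept it.
```

Passing the undecoded string is what produces the AttributeError above: run tries to call .get on what it assumes is a dict.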