Comments (23)
Aligned on that, @kushsharma @ravisuhag. We will move everything under dependencies.
Thank you for the feedback. Somehow in our discussions we were under the impression that going with that option would break compatibility.
from optimus.
On the HTTP sensor, for now we can just support 200+ as sensor success. Later we can consider giving users more flexibility if needed.
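A minimal sketch of what a single sensor check could look like under that rule. It assumes "200+" means any 2xx status, which is one reading of the comment, and the names `isSensorSuccess` and `poke` are hypothetical, not part of Optimus.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// isSensorSuccess reports whether a status code counts as sensor success.
// Assumption: "200+" is interpreted here as the 2xx range.
func isSensorSuccess(statusCode int) bool {
	return statusCode >= 200 && statusCode < 300
}

// poke performs one GET against the URL and reports whether the sensor
// condition is met. A real sensor would call this repeatedly until it
// succeeds or times out.
func poke(client *http.Client, url string) (bool, error) {
	resp, err := client.Get(url)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	return isSensorSuccess(resp.StatusCode), nil
}

func main() {
	// Local test server standing in for the upstream endpoint.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	ok, err := poke(srv.Client(), srv.URL)
	fmt.Println(ok, err)
}
```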
from optimus.
Just wanted to mention that we designed this with generic user-defined sensors in mind as well, such as an HTTP sensor waiting for a 200.
from optimus.
yeah makes sense
from optimus.
http sensor config proposal at job.yaml:
dependencies:
  - sensor:
      type: http
      param:
        method: POST
        url: https://optimus-host:80/serve/
        headers:
          - Content-type: application/json
          - Authentication: Token
        body: '{example: {"param-name-1": "value-1", "param-name-2": "value-2"}}'
from optimus.
Config looks good. Do we have a clear segregation between what is a pre-hook and what is a sensor?
from optimus.
I want to simplify the sensor config, as HTTP sensors are normally just HTTP GET requests that wait for a success response.
Even Airflow supports only the GET method for its HTTP sensor. Please check the source code here
proposal - 1 :
dependencies:
  - http:
      name: sensorName
      request-params:
        key-1: value-1
        key-2: value-2
      url: https://optimus-host:80/serve/
      headers:
        Content-type: application/json
        Authentication: Token
The corresponding Go struct will be:
type JobDependency struct {
	JobName    string `yaml:"job"`
	Type       string `yaml:"type,omitempty"`
	HttpSensor Http   `yaml:"http,omitempty"`
}

type Http struct {
	Name          string            `yaml:"name"`
	RequestParams map[string]string `yaml:"request-params"`
	URL           string            `yaml:"url"`
	Headers       map[string]string `yaml:"headers"`
}
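As a rough illustration of how such a config could be turned into a GET request, here is a sketch using only the standard library. The helper `buildRequest` is hypothetical and the yaml tags are left out for brevity; it simply appends the request params as a query string and copies the headers.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// Http mirrors the struct proposed above (yaml tags omitted for brevity).
type Http struct {
	Name          string
	RequestParams map[string]string
	URL           string
	Headers       map[string]string
}

// buildRequest is a hypothetical helper that converts a sensor config
// into a GET request: params become the query string, headers are copied.
func buildRequest(s Http) (*http.Request, error) {
	u, err := url.Parse(s.URL)
	if err != nil {
		return nil, fmt.Errorf("sensor %q: invalid url: %w", s.Name, err)
	}
	q := u.Query()
	for k, v := range s.RequestParams {
		q.Set(k, v)
	}
	u.RawQuery = q.Encode()

	req, err := http.NewRequest(http.MethodGet, u.String(), nil)
	if err != nil {
		return nil, fmt.Errorf("sensor %q: %w", s.Name, err)
	}
	for k, v := range s.Headers {
		req.Header.Set(k, v)
	}
	return req, nil
}

func main() {
	req, err := buildRequest(Http{
		Name:          "sensorName",
		RequestParams: map[string]string{"key-1": "value-1", "key-2": "value-2"},
		URL:           "https://optimus-host:80/serve/",
		Headers:       map[string]string{"Content-type": "application/json"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```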
We are going to configure sensors under the dependencies section. I suppose pre-hook dependencies are defined at the plugin level under pluginInfo.
Edit: added name under http type to identify each sensor uniquely and to differentiate from other hooks.
from optimus.
proposal-2 :
Put external-dependencies at root:
external-dependencies:
  http:
    - Name: http-sensor-1
      request-params:
        key-1: value-1
        key-2: value-2
      url: https://optimus-host:80/serve/1/
      headers:
        Content-type: application/json
        Authentication: Token-1
    - Name: http-sensor-2
      request-params:
        key-1: value-3
        key-2: value-4
      url: https://optimus-host:80/serve/2/
      headers:
        Content-type: application/json
        Authentication: Token-2
corresponding proton changes:
corresponding optimus changes:
Model Changes:
Yaml:
Domain:
Internal(models.JobSpec) :
Proto Message:
from optimus.
Sample JSON structure that will be stored in the DB:
{
  "HTTPDependencies": [
    {
      "Name": "http_sample_dependency_1",
      "RequestParams": {
        "request-param-1": "value-1"
      },
      "URL": "http://sample/dependency/1",
      "Headers": {
        "Content-Type": "application/json"
      }
    },
    {
      "Name": "http_sample_dependency_2",
      "RequestParams": {
        "request-param-2": "value-2"
      },
      "URL": "http://sample/dependency/2",
      "Headers": {
        "Content-Type": "application/json"
      }
    }
  ]
}
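For reference, a structure like this can be read back with `encoding/json`. The sketch below is only an illustration: the type names `ExternalDependency` and `HTTPDependency` and the `decode` helper are assumptions chosen to match the JSON keys, not the actual Optimus models.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// HTTPDependency matches one entry of the "HTTPDependencies" array.
// The type name is hypothetical; field names follow the JSON keys.
type HTTPDependency struct {
	Name          string
	RequestParams map[string]string
	URL           string
	Headers       map[string]string
}

// ExternalDependency is a hypothetical wrapper for the stored document.
type ExternalDependency struct {
	HTTPDependencies []HTTPDependency
}

// decode parses the stored JSON into the structs above.
func decode(raw []byte) (ExternalDependency, error) {
	var d ExternalDependency
	if err := json.Unmarshal(raw, &d); err != nil {
		return ExternalDependency{}, err
	}
	return d, nil
}

func main() {
	raw := []byte(`{
	  "HTTPDependencies": [
	    {
	      "Name": "http_sample_dependency_1",
	      "RequestParams": {"request-param-1": "value-1"},
	      "URL": "http://sample/dependency/1",
	      "Headers": {"Content-Type": "application/json"}
	    }
	  ]
	}`)
	d, err := decode(raw)
	if err != nil {
		panic(err)
	}
	for _, h := range d.HTTPDependencies {
		fmt.Println(h.Name, h.URL)
	}
}
```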
from optimus.
The new HTTP schema looks good. Any reason to extract external dependencies to root?
Also, should we care about the response code? What if a user wants to treat a 200 or 201 response code, or even a 404, as a success condition for the sensor?
from optimus.
- Any reason to extract external dependencies to root?
  - To decouple the new external dependencies from the existing job dependency section, since dependency resolution is not needed for sensors.
- Should we care about the response code? What if a user wants to treat a 200 or 201 response code, or even a 404, as a success condition for the sensor?
  - No, we shouldn't.
from optimus.
A few more reasons why we pulled external dependencies out:
- We want to allow users to configure only one dependency per array element. If we keep supporting http and other dependencies in the same place, users can configure all or several dependencies in a single element. To avoid that, we would need to rely on the type param, which is duplication for the user, and the logic behind the scenes would stay the same. Given these tradeoffs and the one that @siddhanta-rath mentioned, I believe it is better to pull out external dependencies.
from optimus.
We just want users to allow only one dependency to be configured in an array element, if we keep supporting http & other dependencies in the same place, users can configure all or few dependencies
A job can have an external dependency and an internal dependency, right? Why can't we configure both? Also, I think the type param can be removed and we can easily figure out what to do with the dependency based on what the user has provided. Correct me if I am wrong.
dependency resolution is not needed for sensors
This is an internal technical thing that we can sort out without affecting users? They don't have to worry about one more root config?
from optimus.
How can we prevent users from configuring multiple dependencies of various types in a single array element without a type field? @kushsharma
Ideally I would expect a single root element for dependencies, with a structure where the array of job dependencies sits parallel to the arrays of http and GCS dependencies. It is just that the change isn't backward compatible.
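One possible way to enforce a single dependency per array element without a type field is to validate each element at parse time. The sketch below is only an illustration under assumptions: the `Dependency` and `HTTPSensor` types and the `validate` helper are hypothetical, and a pointer is used so that an absent http block can be told apart from an empty one.

```go
package main

import (
	"errors"
	"fmt"
)

// HTTPSensor is a hypothetical, trimmed-down http dependency.
type HTTPSensor struct {
	Name string
	URL  string
}

// Dependency is a hypothetical array element that may hold either a job
// name or an http sensor, with no explicit type field.
type Dependency struct {
	Job  string
	HTTP *HTTPSensor
}

var errExactlyOne = errors.New("exactly one of job or http must be set")

// validate rejects elements that set no dependency or more than one.
func validate(d Dependency) error {
	set := 0
	if d.Job != "" {
		set++
	}
	if d.HTTP != nil {
		set++
	}
	if set != 1 {
		return errExactlyOne
	}
	return nil
}

func main() {
	deps := []Dependency{
		{Job: "sample_internal_job"},
		{HTTP: &HTTPSensor{Name: "sensorName", URL: "https://optimus-host:80/serve/"}},
		{Job: "job_name_1", HTTP: &HTTPSensor{Name: "sensorName_1"}}, // invalid: two kinds
	}
	for i, d := range deps {
		fmt.Println(i, validate(d))
	}
}
```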
from optimus.
@sravankorumilli I might be confused about what exactly you are trying to say, so feel free to correct me. What I was thinking is that this is what dependencies look like currently:
dependencies:
  - job: sample_internal_job
    type: intra # not needed
  - job: cross_project/sample_external_job
    type: inter # not needed
  - http:
      name: sensorName
      url: https://optimus-host:80/serve/
First, we can get rid of type and simply figure out from the job name whether it is in the same project or across projects. Second, as suggested above, if we use an HTTP dependency it is pretty clear that it is an external dependency because it has the http field, so where is the confusion?
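A sketch of how the job name alone could distinguish the two cases. It assumes, based only on the example above, that a cross-project reference is written as project/job and that a bare name refers to the current project. The function `resolveJobRef` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveJobRef splits a dependency reference into project and job.
// Assumption: "project/job" refers to a named project, while a bare
// name refers to the current project.
func resolveJobRef(currentProject, ref string) (project, job string, crossProject bool) {
	parts := strings.SplitN(ref, "/", 2)
	if len(parts) == 2 {
		return parts[0], parts[1], parts[0] != currentProject
	}
	return currentProject, ref, false
}

func main() {
	for _, ref := range []string{"sample_internal_job", "cross_project/sample_external_job"} {
		project, job, cross := resolveJobRef("my_project", ref)
		fmt.Println(project, job, cross)
	}
}
```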
from optimus.
@kushsharma I think the need here is across two Optimus deployments, not across two projects within the same Optimus deployment. But I think we can achieve that too without a root-level element, by specifying extra params/configs for the cross-deployment case.
from optimus.
@ravisuhag Yes, I just gave a generic example. In the optimus-to-optimus case, off the top of my head, this could be:
dependencies:
  - job: sample_internal_job
  - job: cross_project/sample_external_job
  - http:
      name: sensorName
      query:
        project: p1
        job: j1
      url: https://optimus-host:80/path-for-status-check
from optimus.
We wanted to group dependencies of a specific type together in the YAML, so that it is intuitive for the user where to put what.
The YAML file will also look much cleaner.
Problem with the older approach:
The user can give input like this:
dependencies:
  - job: job_name_1
  - job: job_name_2
  - http:
      name: sensorName_1
      request-params:
        key-1: value-1
        key-2: value-2
      url: https://optimus-host:80/serve/1
      headers:
        Content-type: application/json
        Authentication: Token
  - job: job_name_3
  - job: job_name_4
  - http:
      name: sensorName_2
      request-params:
        key-1: value-1
        key-2: value-2
      url: https://optimus-host:80/serve/2
      headers:
        Content-type: application/json
        Authentication: Token
  - job: job_name_5
  - job: job_name_6
  - gcs:
      path: gcs://path
      service-account: account_details
  - http:
      name: sensorName_3
      request-params:
        key-1: value-1
        key-2: value-2
      url: https://optimus-host:80/serve/3
      headers:
        Content-type: application/json
        Authentication: Token
With the new approach, we group http dependencies under the http tag and gcs dependencies under the gcs tag, like below:
external-dependencies:
  http:
    - Name: http_sensor_1
      request-params:
        key-1: value_1
        key-2: value_2
      url: https://httpbin.org/get
      headers:
        Content-type: application/json
        Authentication: Token_1
    - Name: http_sensor_2
      request-params:
        key-1: value_3
        key-2: value_4
      url: https://httpbin.org/get
      headers:
        Content-type: application/json
        Authentication: Token_2
  gcs:
    - path: gcs://path_1
      service-account: account_details_1
    - path: gcs://path
      service-account: account_details_2
@sravankorumilli correct me if I am wrong.
from optimus.
@siddhanta-rath I think this will increase the learning curve, since it creates multiple dependency nodes and two places where we define them.
I think grouping all dependencies together will be better. We can think about the best way to structure them.
from optimus.
Okay, let's put all the dependencies under the single root dependencies element and move back to the earlier proposal-1, as mentioned in the previous comment:
#138 (comment)
from optimus.
We are not getting rid of type at the moment. We will keep it for now and remove it as part of a separate refactoring exercise, though we don't need it any longer.
from optimus.
Nice. Can we also rename request-params to just query, queries, or params? Request is kind of implicit here.
from optimus.
Sure. Let's change it to params.
from optimus.