
epam / cloud-pipeline


Cloud-agnostic genomics analysis, scientific computation and storage platform

Home Page: https://cloud-pipeline.com

License: Apache License 2.0

Shell 3.82% Java 38.80% PLpgSQL 0.06% JavaScript 37.50% CSS 5.02% HTML 0.26% Yacc 0.01% Dockerfile 0.48% Python 13.24% Lua 0.16% Groovy 0.01% Nextflow 0.01% Less 0.36% Common Workflow Language 0.01% PowerShell 0.18% Batchfile 0.01% SCSS 0.01% WDL 0.09% Roff 0.01%
cloud aws azure ngs modelling biology chemistry google-cloud

cloud-pipeline's Introduction

Cloud Pipeline


Cloud Pipeline solution wraps AWS, GCP and Azure compute and storage resources into a single service, providing an easy and scalable approach to accomplish a wide range of scientific tasks.

  • Data processing: create data processing pipelines and run them in the Cloud in an automated way. Each pipeline represents a workflow script with versioned source code, documentation, and configuration. You can create such scripts in the Cloud Pipeline environment or upload them from a local machine.
  • Data storage management: create your own data storage, download or upload data, or edit files right in the Cloud Pipeline user interface. File version control is supported.
  • Tools management: create and deploy your own calculation environment using Docker's container concept. Almost every pipeline requires a specific package of software to run, which is defined in a Docker image. So when you start a pipeline, Cloud Pipeline starts a new cloud instance (node) and runs the Docker image on it.
  • Scientific computing GUI applications: launch and run GUI-based applications using a self-service Web interface. It is possible to choose the cloud instance configuration, or even use a cluster. Applications are launched as Docker containers exposing Web endpoints or a remote desktop connection (noVNC, NoMachine).

Cloud Pipeline provides a Web-based GUI and also supports a CLI, which exposes most of the GUI features.


Cloud Pipeline supports the Amazon Web Services, Google Cloud Platform and Microsoft Azure Cloud providers for computing and data storage.

Documentation

Detailed documentation on the Cloud Pipeline platform is available via:

Prebuilt binaries

Cloud Pipeline prebuilt binaries are available from the GitHub Releases page.

cloud-pipeline's People

Contributors

aleksandrgorodetskii, alfiyarf, codacy-badger, cryteq, ekazachkova, kamyshova, madmongoose, maryvictol, mzueva, newolya, nshaforostov, okolesn, oleksii-mart, rodichenko, sergeev-ilya, sidoruka, silinpavel, tcibinan, timspb89, vollol, wedds, youkofan, zmaratovna


cloud-pipeline's Issues

Adapt CP CLI integration tests to support Google Cloud Storages

Google Cloud Storage support for pipe-cli was introduced in the related issue #11.

Currently, pipe-cli integration tests can be launched for the AWS and Azure cloud storage providers. The Google storage provider is not supported yet. Therefore, the integration tests should be adapted for use with the Google Cloud provider (see the sketch below).
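
A minimal sketch of one way to do this, assuming pytest-style integration tests and a hypothetical provider-specific client factory (none of the names below come from the actual pipe-cli test suite):

```python
import os
import pytest

SUPPORTED_PROVIDERS = ["S3", "AZ", "GS"]  # AWS, Azure, Google Cloud


def create_storage_client(provider):
    """Hypothetical factory returning a provider-specific storage client."""
    raise NotImplementedError(f"wire up the real {provider} client here")


@pytest.fixture(params=SUPPORTED_PROVIDERS)
def storage_client(request):
    provider = request.param
    enabled = os.getenv("CP_PROVIDER")  # assumed environment switch
    if enabled and enabled != provider:
        pytest.skip(f"{provider} is not enabled for this run")
    return create_storage_client(provider)


def test_upload_and_list(storage_client, tmp_path):
    # The same scenario runs for S3, Azure BLOB and Google Cloud Storage.
    local_file = tmp_path / "sample.txt"
    local_file.write_text("payload")
    storage_client.upload(str(local_file), "it-folder/sample.txt")
    assert "it-folder/sample.txt" in storage_client.list("it-folder/")
```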

Azure VM images build script produces unusable VMs

Currently, build_infra.sh produces Azure VM images that cannot be used to start new VM instances, i.e. such VMs seem not to be fully cleaned/generalized.

The issue is related to the way waagent deprovisioning is invoked: we call it via the Azure API, not over an SSH session, and use just a hardcoded timeout to wait for deprovisioning. Maybe we capture the VM image too early. A sketch of an SSH-based approach with polling follows below.
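
A rough sketch of the SSH-based approach with polling instead of a hardcoded timeout (the host, user and key values are placeholders; this is not the current build_infra.sh logic):

```python
import subprocess
import time


def ssh(host, command, user="pipeline", key="~/.ssh/id_rsa"):
    return subprocess.run(
        ["ssh", "-i", key, "-o", "StrictHostKeyChecking=no", f"{user}@{host}", command],
        capture_output=True, text=True, timeout=300,
    )


def deprovision_and_wait(host, poll_seconds=15, max_wait=600):
    # Trigger deprovisioning explicitly over an SSH session.
    ssh(host, "sudo waagent -deprovision+user -force")
    deadline = time.time() + max_wait
    while time.time() < deadline:
        # Once deprovisioning wipes the user and credentials, SSH logins start
        # failing, which serves as a best-effort signal that the VM can be captured.
        if ssh(host, "true").returncode != 0:
            return True
        time.sleep(poll_seconds)
    raise TimeoutError(f"{host} did not finish deprovisioning in {max_wait}s")
```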

All the installation assets shall be updated with cloud-specific parameters

The following items are registered in Cloud Pipeline during a fresh installation:

  • Docker images (deploy/docker/cp-tools)
  • Folder template (deploy/docker/cp-api-srv/folder-templates)
  • Pipeline templates (workflows/pipe-templates)
  • Demo pipelines (workflows/pipe-demo)

All of them have an option to specify the default compute/storage environment in terms of the VM size/instance type and the object storage type.

The current installation routines use the values from the spec.json files as is. Those files contain a mixture of AWS/Azure options.

This shall be fixed to use the current cloud provider's default VM size/instance type, where the current provider is the one whose region is set as the default during installation. A sketch follows below.
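
A minimal sketch of such a fix, assuming each asset ships a spec.json with an instance-size-like field (the field name and default instance types below are illustrative only):

```python
import json
from pathlib import Path

DEFAULT_INSTANCE_TYPES = {
    "AWS": "m5.large",
    "AZURE": "Standard_D2s_v3",
    "GCP": "n1-standard-2",
}


def patch_specs(assets_root, provider):
    """Rewrite hardcoded instance types with the current provider's default."""
    default_type = DEFAULT_INSTANCE_TYPES[provider]
    for spec_path in Path(assets_root).rglob("spec.json"):
        spec = json.loads(spec_path.read_text())
        if "instance_size" in spec:  # replace whatever AWS/Azure value was templated
            spec["instance_size"] = default_type
        spec_path.write_text(json.dumps(spec, indent=2))


# Example: patch_specs("deploy/docker/cp-tools", provider="GCP")
```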

NPE while terminating a stale cluster node

It is hard to provide an exact reproduction scenario.

The following exception is logged while trying to delete a node without a run_id label from a cluster (the region/az labels are in place):

[ERROR] 2019-04-04 20:03:32.812 [https-jsse-nio-8080-exec-3] ExceptionHandlerAdvice - This operation has been aborted: uri=/pipeline/restapi/cluster/node/ip-172-31-45-169.eu-central-1.compute.internal;client=10.244.0.1;session=C8F55698FC89BBBBD8A404B9C7E65C78;user=PIPE_
ADMIN.
java.lang.NullPointerException: null
        at com.epam.pipeline.dao.region.CloudRegionDao.loadByProviderAndRegionCode(CloudRegionDao.java:148) ~[classes!/:0.15.0.203.a3b1e241dfec92409dcf16be9d8481e2b669da1c]
        at com.epam.pipeline.manager.region.CloudRegionManager.load(CloudRegionManager.java:278) ~[classes!/:0.15.0.203.a3b1e241dfec92409dcf16be9d8481e2b669da1c]
        at com.epam.pipeline.manager.region.CloudRegionManager$$FastClassBySpringCGLIB$$1d1915bb.invoke(<generated>) ~[classes!/:0.15.0.203.a3b1e241dfec92409dcf16be9d8481e2b669da1c]
        at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204) ~[spring-core-4.3.7.RELEASE.jar!/:4.3.7.RELEASE]
        at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:721) ~[spring-aop-4.3.7.RELEASE.jar!/:4.3.7.RELEASE]
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157) ~[spring-aop-4.3.7.RELEASE.jar!/:4.3.7.RELEASE]
        at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92) ~[spring-aop-4.3.7.RELEASE.jar!/:4.3.7.RELEASE]
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) ~[spring-aop-4.3.7.RELEASE.jar!/:4.3.7.RELEASE]
        at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:656) ~[spring-aop-4.3.7.RELEASE.jar!/:4.3.7.RELEASE]
        at com.epam.pipeline.manager.region.CloudRegionManager$$EnhancerBySpringCGLIB$$f441825a.load(<generated>) ~[classes!/:0.15.0.203.a3b1e241dfec92409dcf16be9d8481e2b669da1c]
        at com.epam.pipeline.manager.cluster.NodesManager.lambda$terminateNode$5(NodesManager.java:188) ~[classes!/:0.15.0.203.a3b1e241dfec92409dcf16be9d8481e2b669da1c]
        at java.util.Optional.orElseGet(Optional.java:267) ~[?:1.8.0_201]
        at com.epam.pipeline.manager.cluster.NodesManager.terminateNode(NodesManager.java:188) ~[classes!/:0.15.0.203.a3b1e241dfec92409dcf16be9d8481e2b669da1c]
        at com.epam.pipeline.manager.cluster.NodesManager$$FastClassBySpringCGLIB$$c296eb3.invoke(<generated>) ~[classes!/:0.15.0.203.a3b1e241dfec92409dcf16be9d8481e2b669da1c]

ElasticsearchAgentService: Could not resolve type id 'AZ'

It seems that the indexing daemon is not fully ported to Azure BLOB storages:

[WARN ] 2019-03-27 18:59:45.047 [pool-3-thread-1] ElasticsearchAgentService - Exception while trying to send data to Elasticsearch service
java.util.concurrent.CompletionException: com.epam.pipeline.exception.PipelineResponseException: com.fasterxml.jackson.databind.exc.InvalidTypeIdException: Could not resolve type id 'AZ' as a subtype of [simple type, class com.epam.pipeline.entity.datastorage.AbstractDa
taStorage]: known type ids = [NFS, S3] (for POJO property 'payload')
 at [Source: (okhttp3.ResponseBody$BomAwareReader); line: 1, column: 167] (through reference chain: com.epam.pipeline.rest.Result["payload"]->java.util.ArrayList[0])
        at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273) ~[?:1.8.0_201]
        at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280) [?:1.8.0_201]
        at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1629) [?:1.8.0_201]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
Caused by: com.epam.pipeline.exception.PipelineResponseException: com.fasterxml.jackson.databind.exc.InvalidTypeIdException: Could not resolve type id 'AZ' as a subtype of [simple type, class com.epam.pipeline.entity.datastorage.AbstractDataStorage]: known type ids = [N
FS, S3] (for POJO property 'payload')
 at [Source: (okhttp3.ResponseBody$BomAwareReader); line: 1, column: 167] (through reference chain: com.epam.pipeline.rest.Result["payload"]->java.util.ArrayList[0])
        at com.epam.pipeline.utils.QueryUtils.execute(QueryUtils.java:60) ~[model-0.15.0.246.jar!/:?]
        at com.epam.pipeline.elasticsearchagent.service.impl.CloudPipelineAPIClient.loadAllDataStorages(CloudPipelineAPIClient.java:64) ~[classes!/:0.15.0.246]
        at com.epam.pipeline.elasticsearchagent.service.impl.S3Synchronizer.synchronize(S3Synchronizer.java:76) ~[classes!/:0.15.0.246]
        at com.epam.pipeline.elasticsearchagent.service.ElasticsearchAgentService.lambda$null$0(ElasticsearchAgentService.java:85) ~[classes!/:0.15.0.246]
        at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626) ~[?:1.8.0_201]
        ... 3 more
Caused by: com.fasterxml.jackson.databind.exc.InvalidTypeIdException: Could not resolve type id 'AZ' as a subtype of [simple type, class com.epam.pipeline.entity.datastorage.AbstractDataStorage]: known type ids = [NFS, S3] (for POJO property 'payload')
 at [Source: (okhttp3.ResponseBody$BomAwareReader); line: 1, column: 167] (through reference chain: com.epam.pipeline.rest.Result["payload"]->java.util.ArrayList[0])
        at com.fasterxml.jackson.databind.exc.InvalidTypeIdException.from(InvalidTypeIdException.java:43) ~[jackson-databind-2.9.6.jar!/:2.9.6]
        at com.fasterxml.jackson.databind.DeserializationContext.invalidTypeIdException(DeserializationContext.java:1628) ~[jackson-databind-2.9.6.jar!/:2.9.6]
        at com.fasterxml.jackson.databind.DeserializationContext.handleUnknownTypeId(DeserializationContext.java:1186) ~[jackson-databind-2.9.6.jar!/:2.9.6]
        at com.fasterxml.jackson.databind.jsontype.impl.TypeDeserializerBase._handleUnknownTypeId(TypeDeserializerBase.java:291) ~[jackson-databind-2.9.6.jar!/:2.9.6]
        at com.fasterxml.jackson.databind.jsontype.impl.TypeDeserializerBase._findDeserializer(TypeDeserializerBase.java:162) ~[jackson-databind-2.9.6.jar!/:2.9.6]
        at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:113) ~[jackson-databind-2.9.6.jar!/:2.9.6]
        at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:97) ~[jackson-databind-2.9.6.jar!/:2.9.6]
        at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:254) ~[jackson-databind-2.9.6.jar!/:2.9.6]
        at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:288) ~[jackson-databind-2.9.6.jar!/:2.9.6]
        at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:245) ~[jackson-databind-2.9.6.jar!/:2.9.6]
        at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:27) ~[jackson-databind-2.9.6.jar!/:2.9.6]
        at com.fasterxml.jackson.databind.deser.impl.FieldProperty.deserializeAndSet(FieldProperty.java:136) ~[jackson-databind-2.9.6.jar!/:2.9.6]
        at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:288) ~[jackson-databind-2.9.6.jar!/:2.9.6]
        at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:151) ~[jackson-databind-2.9.6.jar!/:2.9.6]
        at com.fasterxml.jackson.databind.ObjectReader._bindAndClose(ObjectReader.java:1611) ~[jackson-databind-2.9.6.jar!/:2.9.6]
        at com.fasterxml.jackson.databind.ObjectReader.readValue(ObjectReader.java:1203) ~[jackson-databind-2.9.6.jar!/:2.9.6]
        at retrofit2.converter.jackson.JacksonResponseBodyConverter.convert(JacksonResponseBodyConverter.java:32) ~[converter-jackson-2.4.0.jar!/:?]
        at retrofit2.converter.jackson.JacksonResponseBodyConverter.convert(JacksonResponseBodyConverter.java:23) ~[converter-jackson-2.4.0.jar!/:?]
        at retrofit2.ServiceMethod.toResponse(ServiceMethod.java:122) ~[retrofit-2.4.0.jar!/:?]
        at retrofit2.OkHttpCall.parseResponse(OkHttpCall.java:217) ~[retrofit-2.4.0.jar!/:?]
        at retrofit2.OkHttpCall.execute(OkHttpCall.java:180) ~[retrofit-2.4.0.jar!/:?]
        at com.epam.pipeline.utils.QueryUtils.execute(QueryUtils.java:46) ~[model-0.15.0.246.jar!/:?]
        at com.epam.pipeline.elasticsearchagent.service.impl.CloudPipelineAPIClient.loadAllDataStorages(CloudPipelineAPIClient.java:64) ~[classes!/:0.15.0.246]
        at com.epam.pipeline.elasticsearchagent.service.impl.S3Synchronizer.synchronize(S3Synchronizer.java:76) ~[classes!/:0.15.0.246]
        at com.epam.pipeline.elasticsearchagent.service.ElasticsearchAgentService.lambda$null$0(ElasticsearchAgentService.java:85) ~[classes!/:0.15.0.246]
        at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626) ~[?:1.8.0_201]
        ... 3 more

Implement E2E GUI tests for the not yet covered CP feature: [Sharing running instances]

Sharing running instances allows users to run an application (endpoint tools) and share it with other users/groups.

Check whether all details of the feature above have corresponding test cases, write the missing ones, and implement the tests.
For the implementation status, see the table below:

Feature | Test cases availability | E2E test implementation | Test result
Sharing running instances | | |
Configuring of "Friendly URL" | EPMCMBIBPC-2674 | + | PASSED
| EPMCMBIBPC-2677 | + | PASSED
Share tool run with user | EPMCMBIBPC-2678 | + | PASSED
Share tool run with group | EPMCMBIBPC-2679 | + | PASSED
Displaying of "Sharing tool" at "Services" widget | EPMCMBIBPC-2680 | + | PASSED
Share the SSH session | EPMCMBIBPC-3179 | + | PASSED

DTS integration for PATH parameter types

Currently, DTS integration handles only the INPUT and OUTPUT parameter types.

If the API determines that an INPUT parameter corresponds to a DTS location, the dataset is transferred to the object storage container and then copied to the compute node for processing.

For some operations the latter step is not needed: only the data upload from the DTS to the object storage container is required.

I propose implementing this using PATH parameters, which are currently not handled by the DTS/API at all (a sketch follows the list below):

  1. A DTS-related path is specified in the PATH parameter.
  2. The job initialization routine performs the data transfer from the DTS-managed location to the transfer object storage.
  3. The PATH parameter value (container environment) is updated to the new object storage location (e.g. /projects/FASTQ/sample_1.fq -> s3://transfer-bucket/FASTQ/sample_1.fq).
  4. The data copy from the object storage to the compute node FS does not happen in this scenario.
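
A sketch of the proposed PATH handling (the helper names, the transfer bucket and the DTS prefix are illustrative, not the real DTS/API code):

```python
TRANSFER_BUCKET = "s3://transfer-bucket"  # assumed transfer object storage
DTS_PREFIXES = ("/projects/",)            # assumed DTS-managed roots


def is_dts_path(value):
    return value.startswith(DTS_PREFIXES)


def resolve_path_parameter(value, dts_client):
    if not is_dts_path(value):
        return value
    # Steps 1-2: transfer the data from the DTS location to the object storage.
    remote_value = value.replace("/projects", TRANSFER_BUCKET, 1)
    dts_client.upload(source=value, destination=remote_value)
    # Step 3: the container sees the object storage location instead of the DTS path.
    # Step 4: no localization to the compute node FS is performed for PATH params.
    return remote_value


# e.g. /projects/FASTQ/sample_1.fq -> s3://transfer-bucket/FASTQ/sample_1.fq
```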

Implement E2E GUI tests for the not yet covered CP feature: [Download data from external http/ftp resources into CP]

Users can provide CSV/TSV files with lists of external links (to datasets on HTTP/FTP resources) and submit a data transfer job, so that the files are moved to cloud storage in the background.

Check whether all details of the feature above have corresponding test cases, write the missing ones, and implement the tests.
For the implementation status, see the table below:

Feature | Test cases availability | E2E test implementation | Test result
Data downloading from external resources | | |
Upload resources list from the file | EPMCMBIBPC-2722 | - | ?
Download content in different regimens | EPMCMBIBPC-2723 | - | ?
| EPMCMBIBPC-2724 | - | ?
| EPMCMBIBPC-2725 | - | ?
| EPMCMBIBPC-2726 | - | ?

MSGEN pipeline hangs on workflow failure

Currently, the MSGEN pipeline hangs on unexpected terminal statuses of a Microsoft Genomics workflow.

Since the msgen status output contains the workflow message rather than its status, calls to that operation should be replaced with calls to msgen list (see the sketch below).
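
A hedged sketch of polling msgen list for a terminal state instead of parsing the msgen status message (the -f config flag and the output parsing are assumptions and would need to be matched to the real msgen CLI output):

```python
import subprocess
import time

TERMINAL_STATES = {"Completed", "Failed", "Cancelled"}


def wait_for_workflow(workflow_id, config_path, poll_seconds=60, max_wait=24 * 3600):
    deadline = time.time() + max_wait
    while time.time() < deadline:
        listing = subprocess.run(
            ["msgen", "list", "-f", config_path],
            capture_output=True, text=True, check=True,
        ).stdout
        for line in listing.splitlines():
            if workflow_id in line:
                # Stop on any terminal state instead of hanging on an unexpected message.
                state = next((s for s in TERMINAL_STATES if s in line), None)
                if state:
                    return state
        time.sleep(poll_seconds)
    raise TimeoutError(f"workflow {workflow_id} did not reach a terminal state")
```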

GUI update for GCP Integration

Changes on the server side regarding GCP integration (an example region payload follows the list below):

  1. A new CloudProvider - GCP - is added.
  2. A GCPRegion class with the following fields was added:
  • "id"
  • "name"
  • "default"
  • "regionId": "europe-west3-c"
  • "provider": "GCP"
  • "authFile" - string, optional
  • "sshPublicKeyPath" - string, required
  • "project" - string, required
  • "applicationName" - string, required
  • "tempCredentialsRole" - string, required
  3. A new storage type GS is added with the gs:// scheme (OBJECT_STORAGE for GCPRegions).
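
An illustrative example of a GCPRegion payload as the GUI might receive it from the API, assembled from the fields listed above (all values are made up):

```python
gcp_region_example = {
    "id": 3,
    "name": "gcp-frankfurt",
    "default": False,
    "regionId": "europe-west3-c",
    "provider": "GCP",
    "authFile": "/opt/gcp/service-account.json",    # optional
    "sshPublicKeyPath": "/opt/gcp/ssh/id_rsa.pub",  # required
    "project": "my-gcp-project",                    # required
    "applicationName": "cloud-pipeline",            # required
    "tempCredentialsRole": "roles/storage.admin",   # required
}
```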

Pipeline run reports "Docker pull" status, while node is not yet initialized

Scenario

  1. A new run was started using the Azure westeurope region
  2. It was not able to start up the node and timed out
  3. InitializeNode correctly reports the error and is restarted

Expected
The GUI displays the Starting indicator for the run until InitializeNode has finished successfully.

Observed
The GUI reports the Docker pull indicator (the issue is observed in the Azure environment; it needs to be retested with AWS as well).


Setup build artifacts publishing using Travis and S3

  1. Prepare a distribution S3 bucket (public read-only)
  2. Publish the following artifacts, produced by #5 and #6:
  • Publish the distribution tar and a pipectl installer into the S3 bucket
  • Publish dockers (distribution and tools) to DockerHub (lifescience repository)

pipectl shall allow optional deployment of demo pipelines and data storages

At the moment, a fresh Cloud Pipeline deployment contains a number of docker images which can be run on their own, including NGS/MD/Generic images.

But to make it easier for end users to start working, it would be nice to bootstrap a number of "ready to use" pipelines as well. These shall cover different pipelining techniques: plain shell scripting, WDL, etc.

This shall be accomplished by specifying an optional parameter in the pipectl installation command.
Some pipelines can be used only with a specific Cloud Provider; this shall be configurable per pipeline (see the sketch after the lists below).

Existing pipelines (not deployed automatically):

  • NGS Demultiplex (using bcl2fastq) - custom scripts
  • NGS Whole Exome - custom scripts
  • NGS Mutect - WDL
  • NGS MSGEN - Microsoft Genomics integration - Only Azure

TODOs (addressed in #40):

  • NGS/SingleCell CellRanger
  • MD - NAMD
  • MD - Gromacs
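
A minimal sketch (not actual pipectl code) of gating demo pipeline deployment behind an optional flag and a per-pipeline provider restriction; the "providers" field and the selection logic are illustrative:

```python
DEMO_PIPELINES = [
    {"name": "NGS Demultiplex (bcl2fastq)", "providers": None},      # any provider
    {"name": "NGS Whole Exome",             "providers": None},
    {"name": "NGS Mutect (WDL)",            "providers": None},
    {"name": "NGS MSGEN",                   "providers": {"AZURE"}}, # Azure only
]


def select_demo_pipelines(deploy_demo, current_provider):
    """Return the demo pipelines to register, honoring provider restrictions."""
    if not deploy_demo:
        return []
    return [
        p for p in DEMO_PIPELINES
        if p["providers"] is None or current_provider in p["providers"]
    ]


# e.g. select_demo_pipelines(deploy_demo=True, current_provider="AWS")
# skips the Azure-only MSGEN pipeline.
```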

Implement STS credentials for GCP

CP shall support issuing STS credentials for GCP in order to assume Cloud Storage permissions for pipe storage commands.
Provide an implementation of the TemporaryCredentialsGenerator interface for GCP.
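
The server-side change targets the Java TemporaryCredentialsGenerator interface; the following is only a Python illustration of the underlying GCP mechanism - issuing short-lived credentials by impersonating a service account (the service account, scope and lifetime are placeholders):

```python
from google.auth import default, impersonated_credentials
from google.auth.transport.requests import Request


def issue_temporary_gcs_credentials(target_service_account, lifetime_seconds=3600):
    source_credentials, _ = default()
    short_lived = impersonated_credentials.Credentials(
        source_credentials=source_credentials,
        target_principal=target_service_account,
        target_scopes=["https://www.googleapis.com/auth/devstorage.read_write"],
        lifetime=lifetime_seconds,
    )
    short_lived.refresh(Request())  # obtain the temporary access token
    return short_lived.token, short_lived.expiry
```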

Implement E2E GUI tests for the not yet covered CP feature: [Issues]

Issues is a tool to share results with other users or to get feedback. It allows keeping the discussion in one place - traceable and linked to specific data.

Check whether all details of the feature above have corresponding test cases, write the missing ones, and implement the tests.
For the implementation status, see the table below:

Feature | Test cases availability | E2E test implementation | Test result
Issues | | |
Issue lifecycle | EPMCMBIBPC-2691 | - | ?
| EPMCMBIBPC-2692 | - | ?
| EPMCMBIBPC-2693 | - | ?
| EPMCMBIBPC-2694 | - | ?
| EPMCMBIBPC-2695 | - | ?
| EPMCMBIBPC-2696 | - | ?
| EPMCMBIBPC-2700 | - | ?
Issue's comment lifecycle | EPMCMBIBPC-2697 | - | ?
| EPMCMBIBPC-2698 | - | ?
| EPMCMBIBPC-2699 | - | ?
Open issue for different objects (Folder, Pipeline, Project, Tool) | EPMCMBIBPC-2701 | - | ?
| EPMCMBIBPC-2702 | - | ?
| EPMCMBIBPC-2703 | - | ?
| EPMCMBIBPC-2704 | - | ?

Implement E2E GUI tests for the not yet covered CP feature: [Folder cloning & locking]

Folder cloning allows a user to clone any folder to a specific destination: to the user's personal folder or the user's project. It helps to create a new project much faster by copying metadata, configurations and storages from another project.

Folder locking allows a user to protect a folder and its children from any changes by other users.

Check whether all of the features above have corresponding test cases, write the missing ones, and implement the tests.
For the implementation status, see the table below:

Feature | Test cases availability | E2E test implementation | Test result
Clone | | |
Clone Project folder | EPMCMBIBPC-1938 | - | ?
| EPMCMBIBPC-1984 | - | ?
Clone folder with objects | EPMCMBIBPC-1986 | - | ?
| EPMCMBIBPC-1987 | - | ?
| EPMCMBIBPC-1988 | - | ?
| EPMCMBIBPC-1989 | - | ?
Lock | | |
Lock Project folder | EPMCMBIBPC-2686 | - | ?
| EPMCMBIBPC-2688 | - | ?
| EPMCMBIBPC-2689 | - | ?
Lock folder with objects | EPMCMBIBPC-2684 | - | ?
| EPMCMBIBPC-2685 | - | ?
| EPMCMBIBPC-2687 | - | ?
| EPMCMBIBPC-2690 | - | ?

Node initialisation hangs if Azure VM ID is used in the networks config

If a cluster.networks.config preference contains the full ID of the image, e.g. /subscriptions/11111111-2222-2222-3333-4444444/resourceGroups/res-grp/providers/Microsoft.Compute/images/CloudPipeline-Image-Common (which is generated by the build_infra.sh script), the InitializeNode task hangs (almost) forever. In this case it took 20+ minutes; the last log entry is shown in the screenshot attached to the issue.

This leads me to the idea that nodeup for Azure does not contain enough error handling.
So we shall (a sketch of the wait-loop pattern follows the list below):

  1. Add more logging - if something may fail, catch it and report it
  2. If something fails - abort the nodeup script execution
  3. If something may hang - introduce wait loops
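
A sketch of the defensive wait-loop pattern described above for the Azure nodeup script (the names and the example check are illustrative):

```python
import time


class NodeupError(RuntimeError):
    pass


def wait_for(description, check, poll_seconds=10, max_wait=600):
    """Poll check() until it returns a truthy value or the timeout expires."""
    deadline = time.time() + max_wait
    while time.time() < deadline:
        try:
            result = check()
        except Exception as e:  # 1. log and report instead of swallowing the failure
            raise NodeupError(f"{description} failed: {e}") from e
        if result:
            return result
        time.sleep(poll_seconds)
    # 2-3. a hanging step becomes an explicit nodeup failure instead of an endless wait
    raise NodeupError(f"{description} did not complete within {max_wait}s")


# Example usage inside nodeup:
#   vm = wait_for("Azure VM provisioning",
#                 lambda: compute_client.virtual_machines.get(group, vm_name))
```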

Setup Cloud Pipeline deployment process to the dev environment

Once the #7 artifacts are published to the corresponding repositories, everything shall be deployed to the AWS and Azure environments:

  1. Per-commit deployment: redeploy only the Cloud Pipeline binaries (e.g. API/Notifier/SearchAgent/DockerScanner)
  2. Nightly deployment: clean up the existing installation and redeploy everything from scratch

Unified error message for storage tags operations

Currently, different cloud providers have different error messages for the datastorage/{id}/tags API methods. This is a problem because some CP CLI integration tests require the error messages to be unified.

For example, when updating tags of a non-existing storage item, a different message is returned by each Cloud Provider:

  • AWS - Path 'non_existing_path' doesn't exist.
  • AZURE - Azure item 'non_existing_path' does not exist.
  • GCP - Google Storage path 'non_existing_path' for bucket 'bucket_name' not found.

The problem probably exists for other datastorage endpoints as well, not only the tag-related ones (a test-side workaround sketch follows below).
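
An interim, test-side sketch (a hypothetical helper, not the server-side fix): until the API returns a unified message, the CP CLI integration tests could accept any of the provider-specific variants for a missing storage item:

```python
import re

NON_EXISTING_ITEM_PATTERNS = [
    r"Path '.+' doesn't exist\.",                           # AWS
    r"Azure item '.+' does not exist\.",                    # AZURE
    r"Google Storage path '.+' for bucket '.+' not found",  # GCP
]


def is_non_existing_item_error(message):
    return any(re.search(p, message) for p in NON_EXISTING_ITEM_PATTERNS)


assert is_non_existing_item_error("Path 'non_existing_path' doesn't exist.")
```

The proper fix is still to unify the messages on the API side, so that the tests can assert a single expected string.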

Setup E2E tests for the dev environment

Set up the existing E2E Selenium tests to run once the #8 deployment routines pass.
Details on the exact approach shall be added here once #8 is done (e.g. where will the tests themselves be executed?).

MSGEN integration - consider concurrent execution limits

MSGEN limits the number of concurrent executions to 20 (default value).
We shall consider this when scheduling multi-sample runs from the Cloud Pipeline.

We shall verify MSGEN behavior when oversubscription occurs. Can we just try to submit everything and handle the error messages to keep the sample(s) in the queue, or is more elaborate logic required? A scheduling sketch follows below.
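
A sketch of respecting the limit on the Cloud Pipeline side - keep at most 20 workflows in flight and re-queue samples whose submission is rejected (the submit/count_running callables are placeholders):

```python
from collections import deque

MSGEN_MAX_CONCURRENT = 20  # default MSGEN limit


def schedule_samples(samples, submit, count_running):
    """submit(sample) -> bool (False if rejected); count_running() -> int."""
    queue = deque(samples)
    while queue:
        if count_running() >= MSGEN_MAX_CONCURRENT:
            break  # wait for the next scheduling cycle
        sample = queue.popleft()
        if not submit(sample):
            # Oversubscription (or a transient error): keep the sample queued.
            queue.appendleft(sample)
            break
    return list(queue)  # samples still waiting to be scheduled
```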

Extend the set of "out-of-the-box" docker images and pipelines with MD and MS use cases

Molecular Dynamics tool group:

  • NAMD
  • Gromacs

Modelling and Simulation tool group:

  • RStudio with corresponding packages
  • Jupyter notebook with corresponding packages
  • Possibly some commercial tools that are free to download (the license shall be provided by the end user): Monolix, Matlab, ...

Machine Learning tool group (CUDA-supported):

  • RStudio with corresponding packages
  • Jupyter notebook with corresponding packages

Implement E2E GUI tests for the not yet covered CP feature: [Docker image version settings]

Execution environment settings and parameters can be assigned directly to a specific docker version. These settings are defined in the "SETTINGS" tab of the tool's version menu. If these (version-level) settings are specified, they are applied to each run of that docker image version.

Check whether all details of the feature above have corresponding test cases, write the missing ones, and implement the tests.
For the implementation status, see the table below:

Feature | Test cases availability | E2E test implementation | Test result
Docker image version settings | | |
Applying of version settings | EPMCMBIBPC-2719 | - | ?
Hierarchy of the execution environment settings | EPMCMBIBPC-2720 | - | ?

Demo pipelines lack READMEs

The following demo NGS pipelines lack documentation. Please add some information to their README.md files:

  1. batch
  2. demultiplex
  3. wes-analysis

Implement E2E GUI tests for the not yet covered CP feature: [Global Search]

CP implements a capability to search over all existing object types across the whole Platform.

Check whether all details of the feature above have corresponding test cases, write the missing ones, and implement the tests.
For the implementation status, see the table below:

Feature | Test cases availability | E2E test implementation | Test result
Global Search | | |
General tests | EPMCMBIBPC-2653 | + | Passed
| EPMCMBIBPC-2657 | + | Passed
| EPMCMBIBPC-2671 | + | Passed
| EPMCMBIBPC-2676 | + | Passed
Folders search | EPMCMBIBPC-2654 | + | Passed
| EPMCMBIBPC-2655 | + | Passed
| EPMCMBIBPC-2656 | + | Passed
Pipelines search | EPMCMBIBPC-2658 | + | Passed
| EPMCMBIBPC-2672 | + | Passed
| EPMCMBIBPC-2673 | + | Passed
Runs search | EPMCMBIBPC-2662 | + | Passed
| EPMCMBIBPC-2663 | + | Passed
| EPMCMBIBPC-2664 | + | Passed
| EPMCMBIBPC-2665 | + | Passed
| EPMCMBIBPC-2668 | + | Passed
| EPMCMBIBPC-2669 | + | Passed
Tools search | EPMCMBIBPC-2667 | + | Passed
Storages search | EPMCMBIBPC-2660 | + | Passed
| EPMCMBIBPC-2661 | + | Passed
| EPMCMBIBPC-2666 | + | Passed
Issues search | EPMCMBIBPC-2670 | + | Passed
Special expressions in query string | EPMCMBIBPC-2675 | + | Passed

Implement E2E GUI tests for the not yet covered CP feature: [Restrictions on instance type/price type]

An admin can restrict the selection of instance types and price types for specific users/groups or tools to minimize the number of runs with invalid configurations.
Restrictions can be set in different ways on several forms:

  • within "User management" for specific users/groups
  • within the "Instance management" panel for a specific tool
  • globally for the Platform, within the System-level Settings

Check whether all of the features above have corresponding test cases, write the missing ones, and implement the tests.
For the implementation status, see the table below:

Feature | Test cases availability | E2E test implementation | Test result
Restrictions on instance type | | |
Via "User management" tab for user | EPMCMBIBPC-2637 | + | PASSED
| EPMCMBIBPC-2638 | + | PASSED
| EPMCMBIBPC-2639 | + | PASSED
| EPMCMBIBPC-2641 | + | PASSED
Via "User management" tab for group | EPMCMBIBPC-2640 | + | PASSED
| EPMCMBIBPC-2643 | + | PASSED
Via "Instance management" for tool | EPMCMBIBPC-2642 | + | PASSED
Via system-level settings | EPMCMBIBPC-2644 | + | PASSED
| EPMCMBIBPC-2645 | + | PASSED
Hierarchy of different levels | EPMCMBIBPC-2646 | + | PASSED
| EPMCMBIBPC-2647 | + | PASSED
Restrictions on price type | | |
Via "User management" tab for user | EPMCMBIBPC-2648 | + | PASSED
Via "User management" tab for group | EPMCMBIBPC-2649 | + | PASSED
Via "Instance management" for tool | EPMCMBIBPC-2650 | + | PASSED
Via system-level settings | EPMCMBIBPC-2651 | + | PASSED
Hierarchy of different levels | EPMCMBIBPC-2652 | + | PASSED

Implement E2E GUI tests for the not yet covered CP feature: [Dashboard]

The CP home page is a dashboard with configurable, editable widgets that provide quick access to the main Platform functionality. Each user can adjust the view of that page to their own needs.

Check whether all details of the feature above have corresponding test cases, write the missing ones, and implement the tests.
For the implementation status, see the table below:

Feature | Test cases availability | E2E test implementation | Test result
Dashboard | | |
Configure view | EPMCMBIBPC-2705 | - | ?
| EPMCMBIBPC-2710 | - | ?
ACTIVITIES widget | EPMCMBIBPC-2713 | - | ?
DATA widget | EPMCMBIBPC-2709 | - | ?
| EPMCMBIBPC-2711 | - | ?
NOTIFICATIONS widget | EPMCMBIBPC-2712 | - | ?
TOOLS widget | EPMCMBIBPC-2716 | - | ?
| EPMCMBIBPC-2717 | - | ?
PIPELINES widget | EPMCMBIBPC-2714 | - | ?
PROJECTS widget | EPMCMBIBPC-2715 | - | ?
RECENTLY COMPLETED RUNS widget | EPMCMBIBPC-2708 | - | ?
ACTIVE RUNS widget | EPMCMBIBPC-2706 | - | ?
| EPMCMBIBPC-2707 | - | ?
SERVICES widget | EPMCMBIBPC-2718 | - | ?

Fresh deployment throws "duplicate key value violates unique constraint tool_image_unique"

While Cloud Pipeline is being deployed from scratch and dockers are being pushed to the registry (with API authentication enabled), a PSQL exception is logged to pipeline.log for each new docker version:

duplicate key value violates unique constraint "tool_image_unique"
  Detail: Key (image, tool_group_id)=(library/ubuntu-nomachine, 1) already exists.; nested exception is org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "tool_image_unique"
  Detail: Key (image, tool_group_id)=(library/ubuntu-nomachine, 1) already exists.

Full log: tool_image_unique.log

But in the end, all the tools are available in the registry and are listed in the GUI.
If this is just a logging issue (e.g. we try to create a new tool/group for each tool version), please fix it and close.

Otherwise, please provide a description of this behavior.

'NoneType' has no len() for empty networks list

If the cluster.networks.config preference contains an empty list of networks, a "'NoneType' has no len()" error is thrown during the node initialization step.

For AWS this is handled automatically and a VM is created in the default VPC/subnet. For Azure we shall handle this manually: if the networks list is empty, try to determine a default vnet/subnet in nodeup (see the sketch below).
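
A sketch of the proposed fallback (the Azure network client calls mirror the azure-mgmt-network SDK, but the selection logic is illustrative):

```python
def resolve_network(networks_config, network_client, resource_group):
    networks = (networks_config or {}).get("networks") or []
    if len(networks) > 0:
        return networks[0]
    # Empty list: fall back to the first vnet/subnet found in the resource group
    # instead of failing with "'NoneType' has no len()".
    for vnet in network_client.virtual_networks.list(resource_group):
        subnets = list(vnet.subnets or [])
        if subnets:
            return {"vnet": vnet.name, "subnet": subnets[0].name}
    raise RuntimeError("No networks configured and no default vnet/subnet found")
```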

Azure VMs are created with OS disk of type Standard HDD

Currently, Azure VM instances are always created with an OS disk of type Standard HDD and a data disk of type Premium SSD.
The HDD type of the OS disk may affect node initialization performance.
It would be nice to set the OS disk to Premium SSD, the same as the data disk.
E.g. with azure-cli this can be done via --storage-sku Premium_LRS, which sets Premium SSD for all disks.

Slow container initialisation for Azure VMs

The following behavior is observed for Azure:

  1. Average InitializeNode + Docker pull tasks duration is ~3min
  2. Average launch.sh task duration is ~7min (BTW, the MountDataStorages task takes 1-2min for only 3 BLOB storages)

While the expected duration of (2) is ~3min.

Maybe this is related to #29; this needs to be retested once that is implemented.

Implement E2E GUI tests for the not yet covered CP feature: [Limit mounts]

A user can specify the storages that should be mounted (limit storage mounts) during a pipeline run.

Check whether all details of the feature above have corresponding test cases, write the missing ones, and implement the tests.
For the implementation status, see the table below:

Feature | Test cases availability | E2E test implementation | Test result
Limit mounts | | |
Configure of storages mount limits | EPMCMBIBPC-2681 | + | PASSED
Run pipelines with storages mount limits | EPMCMBIBPC-2682 | + | PASSED
| EPMCMBIBPC-2683 | + | PASSED
Configure of sensitive storages mount limits | EPMCMBIBPC-3177 | + | PASSED
Run pipelines with sensitive storages mount limits | EPMCMBIBPC-3178 | + | PASSED

Implement E2E GUI tests for the not yet covered CP feature: [Node jobs & monitor]

The list of jobs currently being processed on a node can be found on one of the tabs of the detailed node information page.

Another tab of the detailed node information page contains a node monitor with 4 resource usage diagrams.

Check whether all of the features above have corresponding test cases, write the missing ones, and implement the tests.
For the implementation status, see the table below:

Feature | Test cases availability | E2E test implementation | Test result
Node jobs | | |
Jobs displaying | EPMCMBIBPC-2721 | - | ?
Node monitor | | |
Node monitor displaying | EPMCMBIBPC-1431 | - | ?
