googlecloudplatform / compute-image-tools Goto Github PK

View Code? Open in Web Editor NEW

194.0 40.0 147.0 17.91 MB

Tools and scripts for Google Compute Engine images.

Home Page: https://cloud.google.com/compute/docs/images

License: Apache License 2.0

Go 64.71% Shell 8.95% Python 12.72% PowerShell 12.31% Batchfile 1.00% Dockerfile 0.32%

compute-image-tools's Introduction

Compute Image Tools

Tools for building, testing, releasing, and upgrading Google Compute Engine images.

GCE Export

Streams an attached Google Compute Engine disk to an image file in a Google Cloud Storage bucket.

Docker

Latest: gcr.io/compute-image-tools/gce_export:latest
Release: gcr.io/compute-image-tools/gce_export:release

Linux x64

Windows x64

Windows Upgrade

Performs in-place OS upgrades. The tool can be invoked with gcloud beta compute os-config os-upgrade.

Docker

Latest: gcr.io/compute-image-tools/gce_windows_upgrade:latest
Release: gcr.io/compute-image-tools/gce_windows_upgrade:release

Linux x64

Image Publish

Creates Google Compute Engine images from raw disk files.

Docker

Latest: gcr.io/compute-image-tools/gce_image_publish:latest
Release: gcr.io/compute-image-tools/gce_image_publish:release

Linux x64

Windows x64

OSX x64

Contributing

Have a patch that will benefit this project? Awesome! Follow these steps to have it accepted.

Please sign our Contributor License Agreement.
Fork this Git repository and make your changes.
Create a Pull Request.
Incorporate review feedback to your changes.
Accepted!

License

All files in this repository are under the Apache License, Version 2.0 unless noted otherwise.

compute-image-tools's People

Contributors

Stargazers

Watchers

Forkers

adjackura kleopatra999 zmarano alaingef bradleyfalzon helen-fornazier tomlanyon etsangsplk jeffmendoza ericdand harry2040 arunoo jpnickolas illfelder jsoref mjjohnson1 ryanhaining ryanwe functicons anandadalton pjh huamichaelchen tomoe collabora-gce dennis-harvey-ck vapeurs ed00m 4701410 b0noi gaolemeng clrprod zoran15 jeffli678 hopkiw rofuentes pankajjeet001 davecheez anonshellsllc krzyzacy bhanditz dalavancloud dntczdx maxkarami kanglicheng fr34k8 liorkomanski wj-chen shuang-wang cristka ericedens yanglu1031 bbbbb66699 humbleshuttler justineboudeville rkolchmeyer dowgird palaviv varunraj05 jasonwbarnett yfuruyama yankee-skittlez huangw5 str4t3gy chinon99 jpassing lukebayler fiona-liu claytonolleyitn gaohannk hegao12 kurhula 5l1v3r1 bkatyl supertatop ivan164 eissi muskanmahajan37 oleksiy-turchanikov txjx45 same-id gibbleyg quintonamore camedk admariner isabella232 mickeystone global-localhost global19 global19-atlassian-net a-zakem noknok79 mbernhardt6 liryna rueian kevin-ron andrewlukoshko alexandra2698 peterschen sol-won-1100 redborian

compute-image-tools's Issues

Common go presubmits should run against any go code, not just daisy

Right now go presubmits in compute-image-tools only run against go code under daisy/, we use this repo for more tools than just daisy and these common go presubmits (go test, go ver, gofmt), should run against any *.go code. I took a brief look at it and I think separating the daisy tests from the go tests (even if just in name) would make sense.

Fail to set ssh key in metadata

When executing metadata-ssh test, sometimes it fails to change the metadata values in the instance. The test thinks it added an ssh key, then tries to ssh to the instance but it fails because in fact no ssh key was added in the metadata.

How to reproduce:

Fail:

Execute:

daisy -variables source_image=projects/debian-cloud/global/images/family/debian-9 metadata-ssh/metadata-ssh.wf.json

The tester should have set and ssh key in the instance level of the testee here: https://github.com/GoogleCloudPlatform/compute-image-tools/blob/master/daisy_workflows/image_test/metadata-ssh/metadata-ssh-tester.py#L34

But tester fails to ssh to the testee here: https://github.com/GoogleCloudPlatform/compute-image-tools/blob/master/daisy_workflows/image_test/metadata-ssh/metadata-ssh-tester.py#L35

If you look at testee instance metadata, there is no ssh key value there.

Succeed:

But the test succeeds if you duplicate this line https://github.com/GoogleCloudPlatform/compute-image-tools/blob/master/daisy_workflows/image_test/linux_common/utils.py#L319

[instance creation] Support for create VMs in a shared VPC

Hello,

I was wondering whether daisy can support the possibility to create an instance inside of a shared VPC (https://cloud.google.com/vpc/docs/shared-vpc).

The code here:
https://github.com/GoogleCloudPlatform/compute-image-tools/blob/master/daisy/instance.go
always requires to provide as network interface both a network and a subnetwork.

However, with a shared VPC the network does not belong to the project where the instance is created and thus daisy breaks (or I was not able to make it work anyhow).

I checked the REST APIs for creating an instance in a shared VPC and only the subnetwork argument is provided so this looks like being the right approach to adapt here. For example:

  "networkInterfaces": [
    {
      "kind": "compute#networkInterface",
      "subnetwork": "projects/my-other-project-id/regions/europe-west1/subnetworks/my-shared-vpc-name"
    }

Thoughts here?

[question/image_tests]: method to validate that gcloud/gsutil are up to date

As requested on the configuration tests, there should be a "Test for gcloud/gsutil (some distros won’t have this) and validate that versions are up to date." and I thought about 2 different approaches:

A specific version for gcloud and gsutil are set on metadata variables and checked on this test. It fails if the specified version is older than the current versions found on the image.
The test runs an equivalent apt-get update && apt-get install --assume-no gcloud and fails if it returns non-zero (so there is a newer gcloud in the repository).

Which one would you suggest? The former is more robust (cross checks) and the latter more flexible (as it only fails if that distribution repository packed a new version and didn't update the image).

I'm also open to different approaches, if you come up with one.

Files translate_centos.wf.json and translate_rhel.wf.json doesn't exist

Files translate_centos.wf.json and translate_rhel.wf.json are used, but they don't exist.

These files are included by:
/daisy/enterprise_linux/translate_centos_7.wf.json
/daisy/enterprise_linux/translate_rhel_7_byol.wf.json
/daisy/enterprise_linux/translate_rhel_7_licensed.wf.json

It generates error messages like the following:

koike@daisy-control:~$ daisy -variables source_image=projects/,project./global/images/vmi-centos64-bit /daisy/enterprise_linux/translate_centos_7.wf.json
2017/08/22 11:14:35 error parsing workflow "/daisy/enterprise_linux/translate_centos_7.wf.json": open /daisy/enterprise_linux/translate_centos.wf.json: no such file or directory

rename package_library dir to 'packages', etc

By convention, the directory name matches the package name. Otherwise it's very confusing in the code that imports the package - it's not immediately clear which import corresponds to the identifier.

GuestOsFeatures documentation has `[]string]`

compute-image-tools/docs/daisy-workflow-config-spec.md

Line 233 in de06766

 | GuestOsFeatures | []string] | *Optional.* Along with the GCE JSON API's more complex object structure, Daisy allows the use of a simple list. | 

[Daisy] [Question] [RFQ] Create a public file in a bucket

I know that daisy can upload files to buckets, e.g, the "Sources" Workflow field. However I can't configure that file permissions, and I would want to configure it as public for my workload.

Do you have any suggestions for this? I consider implementing a different structure for the "Sources" field which the file parameters would be available to configure. I'd also care to not break compatibility with current implementation, so I hope both of them can be accepted. It'd be something like:

"Sources": {
 "public_file": {"Name": "./local_file", "PredefinedAcl": "publicRead"},
 "private_file": "./secret_file"
}

I'd try to stick with the Storage's APIs paramenter as defined on https://cloud.google.com/storage/docs/json_api/v1/objects/insert

Unknown workflow var "imported_disk" for Windows Server 2016

When attempting to import my Windows Server 2016 disk into Google Cloud, I encounter the following error:

[Daisy] Running workflow "import-and-translate" (id=tglms)

[Daisy] Errors in one or more workflows:
  import-and-translate: error populating workflow: error populating step "translate": unknown workflow Var "imported_disk" passed to IncludeWorkflow "translate-image"
ERROR
ERROR: build step 0 "gcr.io/compute-image-tools/daisy:release" failed: exit status 1

I did some digging through the workflow configuration files and found that translate_windows_2016.wf.json has an imported_disk var that is passed to translate_windows_wf.json in the translate-image step.

"Steps": {
    "translate-image": {
      "IncludeWorkflow": {
        "Path": "./translate_windows_wf.json",
        "Vars": {
          "source_disk": "${source_disk}",
          "install_gce_packages": "${install_gce_packages}",
          "sysprep": "${sysprep}",
          "imported_disk": "${disk_name}",
          "drivers": "gs://gce-windows-drivers-public/release/win6.3/",
          "version": "10.0",
          "task_reg": "./task_reg_2016",
          "task_xml": "./task_xml"
        }
},

However, translate_windows_wf.json does not handle this var at all. The translate_windows_2012_r2.wf.json doesn't have this var either and I would assume it works fine.

I believe the solution to this would be to remove the var in question (imported_disk) from the workflow config for Windows Server 2016.

[question/daisy] Why no region parameter?

I'm implementing ForwardingRule step on daisy as requested on #437 (comment) but I see that daisy is not ready for using region as a parameter. In order to implement it I thought about:

Implicitly define it from zone when needed (basically region = zone[:-2])
Add a parameter -region to be passed by command line for supplying that information, otherwise a valid error like no region provided in step or workflow might occur.

What's your opinion on this?

OsLogin API to verify authorized users

Hi,

The integration tests were failing for some images when trying to login in the machine after enabling OsLogin in the project level. I increased the number of tries and it works.

@illfelder you mentioned that there is an API that I can poll for the authorized users in a specific machine. I found this one in the docs: https://cloud.google.com/compute/docs/oslogin/rest/v1/users/getLoginProfile
But it means that I need to execute this call in the instance. Is there a way to verify the authorized users for a given instance from another instance ? I couldn't find the this API in the docs.

Thanks

Cannot download startup script in CentOS/RHEL 7

As you can see in testgrid we have some random failures in CentOS/RHEL 7startup script tests. After checking the logs we noticed that the cause of these failures is an error during the download of the startup script (trying to download from Google Cloud Storage or even Github), cURL raises an error, exits and returns code 7. Since this can cause some issues to GCE users @zmarano said that he will check this behavior internally.

Document what roles a service account created for daisy should have

I'm trying to create a service account to use with daisy, but it's not clear what roles said service account should have.

Could best practices for this be documented>

Daisy: confusing error when Project field misspelled

When the "Project" field in a workflow names a project that doesn't exist, the following error is displayed:

[Daisy] Errors in one or more workflows:
  <workflow-name>: error populating workflow: googleapi: Error 400: Unknown project id: 0, invalid

It seems clear that the Google API is looking up the project, finding that it doesn't exist, returning "0" as the project ID, and then Daisy is trying to use that ID, rather than notice it's zero.

Daisy should instead detect that the project ID is 0, and give an error message like "project could not be found."

Documentation TODOs

CreateDisks/CreateImages/CreateInstances omit ExactName in the resource descriptions.
CreateImages' description of GuestOsFeatures should be put in the "modifications" table instead of "added fields" table since it is modifying an existing compute.Image field.
CreateImages' description does not include Overwrite.

Python coding style

Hi,

Is there a document describing the coding style that should be used? The link pointed by CONTRIBUTING.md doesn't describe the style :(

Required variables are not validated

Meaning, they are not actually required and you can run workflows without specifying them.

Create new step types for networking

We need step types for networking tasks. Ideas so far:

A CreateNetworks step for adding networks
A step for managing firewall rules
A step for creating subnetworks

This issue could perhaps be split into multiple issues as work progresses.

rhel-7-byol - Error running workflow: step "translate-disk"

I have RHEL 7 image created in Oracle VirtualBox that I am trying to import into GCP. I am following the user guide. Step 3 succeeds, but Step 4 is failing:

[translate-rhel-7-byol.translate-disk]: 2017/09/21 16:14:47 WaitForInstancesSignal: watching serial port 1, SuccessMatch: "TranslateSuccess:", FailureMatch: "TranslateFailed:".
[translate-rhel-7-byol]: 2017/09/21 16:15:27 Error running workflow: step "translate-disk" run error: step "wait-for-translator" run error: WaitForInstancesSignal: FailureMatch found for instance "inst-translator-translate-rhel-7-byol-translate-disk-7g237"

[Daisy] missing serial output on WaitForInstanceSignal

I noticed that some workflows were not reading the serial properly (it missed some strings), specially when it was happening right next to a shutdown (e.g: read the shutdown-scripts output).

Below a simplified workflow that can reproduce this bug (one instance always fails to match the BOOTED string)

{
  "Name": "quick-test",
  "Steps": {
    "create0": { "CreateInstances": [ { "Name": "inst-quick-test0", "Disks": [{"InitializeParams": {"SourceImage": "projects/debian-cloud/global/images/family/debian-9"}}], "Metadata": {"startup-script": "logger -p daemon.info BOOTED; echo o >/proc/sysrq-trigger"} } ] },
    "wait0": { "Timeout": "15s", "WaitForInstancesSignal": [ { "Name": "inst-quick-test0", "interval": "0.1s", "SerialOutput": { "Port": 1, "SuccessMatch": "BOOTED" } } ] },

    "create1": { "CreateInstances": [ { "Name": "inst-quick-test1", "Disks": [{"InitializeParams": {"SourceImage": "projects/debian-cloud/global/images/family/debian-9"}}], "Metadata": {"startup-script": "logger -p daemon.info BOOTED; echo o >/proc/sysrq-trigger"} } ] },
    "wait1": { "Timeout": "15s", "WaitForInstancesSignal": [ { "Name": "inst-quick-test1", "interval": "0.1s", "SerialOutput": { "Port": 1, "SuccessMatch": "BOOTED" } } ] },

    "create2": { "CreateInstances": [ { "Name": "inst-quick-test2", "Disks": [{"InitializeParams": {"SourceImage": "projects/debian-cloud/global/images/family/debian-9"}}], "Metadata": {"startup-script": "logger -p daemon.info BOOTED; echo o >/proc/sysrq-trigger"} } ] },
    "wait2": { "Timeout": "15s", "WaitForInstancesSignal": [ { "Name": "inst-quick-test2", "interval": "0.1s", "SerialOutput": { "Port": 1, "SuccessMatch": "BOOTED" } } ] },

    "create3": { "CreateInstances": [ { "Name": "inst-quick-test3", "Disks": [{"InitializeParams": {"SourceImage": "projects/debian-cloud/global/images/family/debian-9"}}], "Metadata": {"startup-script": "logger -p daemon.info BOOTED; echo o >/proc/sysrq-trigger"} } ] },
    "wait3": { "Timeout": "15s", "WaitForInstancesSignal": [ { "Name": "inst-quick-test3", "interval": "0.1s", "SerialOutput": { "Port": 1, "SuccessMatch": "BOOTED" } } ] },

    "create4": { "CreateInstances": [ { "Name": "inst-quick-test4", "Disks": [{"InitializeParams": {"SourceImage": "projects/debian-cloud/global/images/family/debian-9"}}], "Metadata": {"startup-script": "logger -p daemon.info BOOTED; echo o >/proc/sysrq-trigger"} } ] },
    "wait4": { "Timeout": "15s", "WaitForInstancesSignal": [ { "Name": "inst-quick-test4", "interval": "0.1s", "SerialOutput": { "Port": 1, "SuccessMatch": "BOOTED" } } ] },

    "create5": { "CreateInstances": [ { "Name": "inst-quick-test5", "Disks": [{"InitializeParams": {"SourceImage": "projects/debian-cloud/global/images/family/debian-9"}}], "Metadata": {"startup-script": "logger -p daemon.info BOOTED; echo o >/proc/sysrq-trigger"} } ] },
    "wait5": { "Timeout": "15s", "WaitForInstancesSignal": [ { "Name": "inst-quick-test5", "interval": "0.1s", "SerialOutput": { "Port": 1, "SuccessMatch": "BOOTED" } } ] },

    "create6": { "CreateInstances": [ { "Name": "inst-quick-test6", "Disks": [{"InitializeParams": {"SourceImage": "projects/debian-cloud/global/images/family/debian-9"}}], "Metadata": {"startup-script": "logger -p daemon.info BOOTED; echo o >/proc/sysrq-trigger"} } ] },
    "wait6": { "Timeout": "15s", "WaitForInstancesSignal": [ { "Name": "inst-quick-test6", "interval": "0.1s", "SerialOutput": { "Port": 1, "SuccessMatch": "BOOTED" } } ] },

    "create7": { "CreateInstances": [ { "Name": "inst-quick-test7", "Disks": [{"InitializeParams": {"SourceImage": "projects/debian-cloud/global/images/family/debian-9"}}], "Metadata": {"startup-script": "logger -p daemon.info BOOTED; echo o >/proc/sysrq-trigger"} } ] },
    "wait7": { "Timeout": "15s", "WaitForInstancesSignal": [ { "Name": "inst-quick-test7", "interval": "0.1s", "SerialOutput": { "Port": 1, "SuccessMatch": "BOOTED" } } ] },

    "create8": { "CreateInstances": [ { "Name": "inst-quick-test8", "Disks": [{"InitializeParams": {"SourceImage": "projects/debian-cloud/global/images/family/debian-9"}}], "Metadata": {"startup-script": "logger -p daemon.info BOOTED; echo o >/proc/sysrq-trigger"} } ] },
    "wait8": { "Timeout": "15s", "WaitForInstancesSignal": [ { "Name": "inst-quick-test8", "interval": "0.1s", "SerialOutput": { "Port": 1, "SuccessMatch": "BOOTED" } } ] },

    "create9": { "CreateInstances": [ { "Name": "inst-quick-test9", "Disks": [{"InitializeParams": {"SourceImage": "projects/debian-cloud/global/images/family/debian-9"}}], "Metadata": {"startup-script": "logger -p daemon.info BOOTED; echo o >/proc/sysrq-trigger"} } ] },
    "wait9": { "Timeout": "15s", "WaitForInstancesSignal": [ { "Name": "inst-quick-test9", "interval": "0.1s", "SerialOutput": { "Port": 1, "SuccessMatch": "BOOTED" } } ] }
  }, "Dependencies": {
    "wait0": ["create0"],
    "wait1": ["create1"],
    "wait2": ["create2"],
    "wait3": ["create3"],
    "wait4": ["create4"],
    "wait5": ["create5"],
    "wait6": ["create6"],
    "wait7": ["create7"],
    "wait8": ["create8"],
    "wait9": ["create9"]
  }
}

Note1: The subsecond interval really improved but I didn't find a perfect solution.

Note2: none of these the serial port streaming had actually grabbed the whole log. On the other hand, running gcloud compute connect-to-serial-port <INSTANCE_NAME> give consistent results. (this seems to be a completely uncorrelated issue actually).

sles11: syslog is not redirected to serial port

For image_tests, to wait for some image to boot, I add a startup script to print the message "BOOTED" and I use Daisy WaitForInstancesSignal, but sles11 doesn't seem to send syslog outputs to the serial port and the test fails.

I tested on the latest sles11:
sles-11-sp4-v20180104 suse-cloud sles-11

[Daisy] subworkflows don't execute until the end

I've noticed, for this test workflow that when one SubWorkflow ends, it forces ending all of the others subworkflows, ignoring their tests:

[test-subworkflow.quick]: 2018-07-02T21:01:46-03:00 Running step "wait-forever" (WaitForInstancesSignal)       
[test-subworkflow.quick.wait-forever]: 2018-07-02T21:01:46-03:00 WaitForInstancesSignal: Instance "inst-quick-test-test-subworkflow-quick-14v47": watching serial port 1, SuccessMatch: "BOOTED".
[test-subworkflow.loop]: 2018-07-02T21:01:48-03:00 Running step "wait-forever" (WaitForInstancesSignal)  
[test-subworkflow.loop.wait-forever]: 2018-07-02T21:01:48-03:00 WaitForInstancesSignal: Instance "inst-loop-test-test-subworkflow-loop-8jrgh": watching serial port 1, SuccessMatch: "b32531e7ca631a108a8d924262584a5e", StatusMatch: "Printing".                                                                          
[test-subworkflow.loop.wait-forever]: 2018-07-02T21:01:52-03:00 WaitForInstancesSignal: Instance "inst-loop-test-test-subworkflow-loop-8jrgh": StatusMatch found: "Printing each 5 seconds"
[test-subworkflow.quick.wait-forever]: 2018-07-02T21:01:57-03:00 WaitForInstancesSignal: Instance "inst-quick-test-test-subworkflow-quick-14v47": SuccessMatch found "BOOTED"
[test-subworkflow.quick]: 2018-07-02T21:01:57-03:00 Step "wait-forever" (WaitForInstancesSignal) successfully finished. 
[test-subworkflow.quick]: 2018-07-02T21:01:57-03:00 Workflow "quick" cleaning up (this may take up to 2 minutes).                                                                                           
[test-subworkflow.loop]: 2018-07-02T21:01:57-03:00 Workflow "loop" cleaning up (this may take up to 2 minutes).                                                                             
[test-subworkflow.loop.wait-forever]: 2018-07-02T21:01:57-03:00 WaitForInstancesSignal: Instance "inst-loop-test-test-subworkflow-loop-8jrgh": StatusMatch found: "Printing each 5 seconds"
[test-subworkflow]: 2018-07-02T21:04:11-03:00 Workflow "test-subworkflow" cleaning up (this may take up to 2 minutes).                                                                           
[test-subworkflow.quick]: 2018-07-02T21:04:12-03:00 Workflow "quick" cleaning up (this may take up to 2 minutes).      
[test-subworkflow.loop]: 2018-07-02T21:04:12-03:00 Workflow "loop" cleaning up (this may take up to 2 minutes).                                                                                             
[Daisy] Workflow "test-subworkflow" finished                                                                                                                                          
[Daisy] All workflows completed successfully.

That doesn't happen with if I use IncludeWorkFlow instead: (I understand that IncludeWorkflow and SubWorkflow are not always interchangeable, but for the sake of the end results of their target (sub)workflows, I guess they should be):

[test-includeworkflow.quick]: 2018-07-02T21:01:47-03:00 Running step "wait-forever" (WaitForInstancesSignal)
[test-includeworkflow.quick.wait-forever]: 2018-07-02T21:01:47-03:00 WaitForInstancesSignal: Instance "inst-quick-test-test-includeworkflow-quick-w121h": watching serial port 1, SuccessMatch: "BOOTED".
[test-includeworkflow.quick.wait-forever]: 2018-07-02T21:01:58-03:00 WaitForInstancesSignal: Instance "inst-quick-test-test-includeworkflow-quick-w121h": SuccessMatch found "BOOTED"
[test-includeworkflow.quick]: 2018-07-02T21:01:58-03:00 Step "wait-forever" (WaitForInstancesSignal) successfully finished.
[test-includeworkflow]: 2018-07-02T21:01:58-03:00 Step "quick" (IncludeWorkflow) successfully finished.
[test-includeworkflow.loop]: 2018-07-02T21:01:59-03:00 Running step "wait-forever" (WaitForInstancesSignal)
[test-includeworkflow.loop.wait-forever]: 2018-07-02T21:01:59-03:00 WaitForInstancesSignal: Instance "inst-loop-test-test-includeworkflow-loop-w121h": watching serial port 1, SuccessMatch: "b32531e7ca631a108a8d924262584a5e", StatusMatch: "Printing".
[test-includeworkflow.loop.wait-forever]: 2018-07-02T21:02:05-03:00 WaitForInstancesSignal: Instance "inst-loop-test-test-includeworkflow-loop-w121h": StatusMatch found: "Printing each 5 seconds"
[test-includeworkflow.loop.wait-forever]: 2018-07-02T21:02:10-03:00 WaitForInstancesSignal: Instance "inst-loop-test-test-includeworkflow-loop-w121h": StatusMatch found: "Printing each 5 seconds"
[test-includeworkflow.loop.wait-forever]: 2018-07-02T21:02:15-03:00 WaitForInstancesSignal: Instance "inst-loop-test-test-includeworkflow-loop-w121h": StatusMatch found: "Printing each 5 seconds"
[test-includeworkflow.loop.wait-forever]: 2018-07-02T21:02:20-03:00 WaitForInstancesSignal: Instance "inst-loop-test-test-includeworkflow-loop-w121h": StatusMatch found: "Printing each 5 seconds"
[test-includeworkflow.loop.wait-forever]: 2018-07-02T21:02:25-03:00 WaitForInstancesSignal: Instance "inst-loop-test-test-includeworkflow-loop-w121h": StatusMatch found: "Printing each 5 seconds"
[test-includeworkflow]: 2018-07-02T21:02:29-03:00 Error running workflow: step "loop" run error: step "wait-forever" did not complete within the specified timeout of 30s
[test-includeworkflow]: 2018-07-02T21:02:29-03:00 Workflow "test-includeworkflow" cleaning up (this may take up to 2 minutes).
[test-includeworkflow.loop.wait-forever]: 2018-07-02T21:02:30-03:00 WaitForInstancesSignal: Instance "inst-loop-test-test-includeworkflow-loop-w121h": StatusMatch found: "Printing each 5 seconds"

[Daisy] Errors in one or more workflows:
  test-includeworkflow: step "loop" run error: step "wait-forever" did not complete within the specified timeout of 30s

[instance creation] Support for create VMs with internal IP only

Current code here (https://github.com/GoogleCloudPlatform/compute-image-tools/blob/master/daisy/instance.go#L203) always enforces an external IP to be assigned to a VM at creation time.

It would be very useful if that would not be always enforced and instead we could also create instances with internal IPs only.

Cleanup of instances needs to happen first

Instances need to be deleted before deleting disks. Also, a DeleteResources step to try to delete the instance at the end of the workflow is not working to delete the instance before the rest of the cleanup happens.

https://github.com/GoogleCloudPlatform/compute-image-tools/blob/master/daisy/workflow.go#L648

[daisy] Cleanup doesn't work?

I'm importing images using steps from this manual https://googlecloudplatform.github.io/compute-image-tools/image-import.html
Images are created as expected but cleanup seems not working or at least not everything is cleaned.
In particular data in bucket ${project}-daisy-bkt automatically created by daisy is not cleaned and the following folders are still present even after import operation successfully finished.

${project}-daisy-bkt/daisy-import-image-20181019-08:41:53-tyr8l/sources
${project}-daisy-bkt/daisy-import-image-20181019-08:41:53-tyr8l/logs

"NoCleanup" is set to 'false' in both workflows. Is it indented behavior and cleanup of those folders should be done manually?

import_image.wf.json:
...
"create-image": {
"CreateImages": [
{
"Name": "${image_name}",
"SourceDisk": "${import_disk_name}",
"Family": "${family}",
"Description": "${description}",
"ExactName": true,
"NoCleanup": false
}
]
},

import_disk.wf.json:
...
"setup-disks": {
"CreateDisks": [
{
"Name": "disk-importer",
"SourceImage": "${import_instance_disk_image}",
"SizeGb": "${importer_instance_disk_size}",
"Type": "pd-ssd"
},
{
"Name": "${disk_name}",
"SizeGb": "10",
"Type": "pd-ssd",
"ExactName": true,
"NoCleanup": false
}
]
},

validate doesn't flag files that don't exist

{
...
  "Sources": {
    "foo": "local/path/to/file1",
...
  },
...
}

$ daisy  -validate no-foo.wf.json
[Daisy] Validating workflow "package-no-foo"
[package-no-foo]: 2018/03/20 18:19:23 Validating workflow
[package-no-foo]: 2018/03/20 18:19:23 Validating step "setup-disk"
[package-no-foo]: 2018/03/20 18:19:23 Validating step "create-disk-from-image"
[package-no-foo]: 2018/03/20 18:19:23 Validating step "translate-disk-inst"
[package-no-foo]: 2018/03/20 18:19:24 Validation Complete

$ daisy  no-foo.wf.json
[Daisy] Running workflow "package-no-foo"
[package-no-foo]: 2018/03/20 18:19:58 Validating workflow
[package-no-foo]: 2018/03/20 18:19:58 Validating step "create-disk-from-image"
[package-no-foo]: 2018/03/20 18:19:58 Validating step "setup-disk"
[package-no-foo]: 2018/03/20 18:19:59 Validating step "translate-disk-inst"
[package-no-foo]: 2018/03/20 18:19:59 Validation Complete
[package-no-foo]: 2018/03/20 18:19:59 Using the GCS path gs://...-daisy-bkt/daisy-package-no-foo-20180320-22:19:58-47ss8
[package-no-foo]: 2018/03/20 18:19:59 Uploading sources
[package-no-foo]: 2018/03/20 18:19:59 Error uploading sources: FileIOError: stat .../local/path/to/file1: no such file or directory
[package-no-foo]: 2018/03/20 18:19:59 Workflow "package-no-foo" cleaning up (this may take up to 2 minutes).

[Daisy] Errors in one or more workflows:
  package-no-foo: FileIOError: stat .../local/path/to/file1: no such file or directory

Daisy: confusing error when step type misspelled

When the step type of a step is not valid, Daisy reports that no step type is defined:

[Daisy] Errors in one or more workflows:
  <workflow-name>: error populating workflow: no step type defined

It should instead print out the step type that couldn't be identified (if present) and give an error message like "unrecognized step type: ".

Import problems with Gcloud and Daisy

Hi,

I'm suffering the following error when importing the disk. I do not know if the timeout can be resolved but with the command gcloud it seems impossible modify timeout of Daisy. I suppose it will be defined in the file of workflows:

# gcloud compute images import xxxx --source-file gs://xxxx.vmdk --os windows-2008r2 --zone=europe-west1-b --timeout=22h; default="22h" --async --timeout=79200
....
[import-and-translate.import]: 2018-10-23T08:59:21Z Step "setup-disks" (CreateDisks) successfully finished.
[import-and-translate.import]: 2018-10-23T08:59:21Z Running step "import-virtual-disk" (CreateInstances)
[import-and-translate.import.import-virtual-disk]: 2018-10-23T08:59:21Z CreateInstances: Creating instance "inst-importer-import-and-translate-import-xxxxx".
[import-and-translate.import]: 2018-10-23T08:59:55Z Step "import-virtual-disk" (CreateInstances) successfully finished.
[import-and-translate.import]: 2018-10-23T08:59:55Z Running step "wait-for-signal" (WaitForInstancesSignal)
[import-and-translate.import.import-virtual-disk]: 2018-10-23T08:59:55Z CreateInstances: Streaming instance "inst-importer-import-and-translate-import-xxxxx" serial port 1 output to https://storage.cloud.google.com/xxxx-xxxx-daisy-bkt/daisy-import-and-translate-xxxxx-08:59:09-xxxxx/logs/inst-importer-import-and-translate-import-xxxxx-serial-port1.log
[import-and-translate.import.wait-for-signal]: 2018-10-23T08:59:55Z WaitForInstancesSignal: Instance "inst-importer-import-and-translate-import-xxxxx": watching serial port 1, SuccessMatch: "ImportSuccess:", FailureMatch: "ImportFailed:", StatusMatch: "Import:".
[import-and-translate.import.wait-for-signal]: 2018-10-23T09:00:06Z WaitForInstancesSignal: Instance "inst-importer-import-and-translate-import-xxxxx": StatusMatch found: "Import: Importing xxxx-xxxx-daisy-bkt/daisy-import-and-translate-xxxxx-08:59:09-xxxxx/sources/source_disk_file of size 129GB to temp-translation-disk-xxxxx in projects/751850240769/zones/europe-west1-b.'"
[import-and-translate.import.wait-for-signal]: 2018-10-23T09:00:06Z WaitForInstancesSignal: Instance "inst-importer-import-and-translate-import-xxxxx": StatusMatch found: "Import: Importing xxxx-xxxx-daisy-bkt/daisy-import-and-translate-xxxxx-08:59:09-xxxxx/sources/source_disk_file of size 129GB to temp-translation-disk-xxxxx in projects/751850240769/zones/europe-west1-b."

[import-and-translate]: 2018-10-23T10:29:18Z Error running workflow: step "import" did not complete within the specified timeout of 1h30m0s
[import-and-translate]: 2018-10-23T10:29:18Z Workflow "import-and-translate" cleaning up (this may take up to 2 minutes).

[Daisy] Errors in one or more workflows:
  import-and-translate: step "import" did not complete within the specified timeout of 1h30m0s

Also I need exec gcloud compute images import without copy disk in every attempt, using temp volume.
I currently have problems with Daisy and I need to avoid copying the OS disk each time I'm going to perform a test with import script.

Regards!

[question] Proper way of using local-ssd on daisy

I tried to set a "CreateDisks" step with local-ssd and I had no success on doing so. Then I tried to add it directly on "Disks" inside of a "CreateInstances" but I noticed that daisy is not prepared for validating a local-ssd disk type, right?

Are these changes scheduled for being implemented? Or can I start to do this? Is there any architectural suggestion that you'd like to give? For the moment I'm enabling validation of a nameless disk (as the API requests, or else I have a Error 400 "No source can be specified for local SSD.") and clear the URL attributes of the disk. That seems a little bit hacky so I don't know if that's the best way of doing that.

Create new step type(s) to attach and detach disks from instances

We need a new step type with a name like AttachDisks to attach disks, and another, call it DetachDisks to detach disks. These should attach/detach disks from running instances.

Source files are not uploaded to the bucket in double level of SubWorkflows

Workflow A: defines a list of Sources
Workflow B: includes Workflow A
Workflow C: includes Workflow B

Source files are not uploaded to the daisy bucket when executing Workflow C, generating the error

startup-script: WARNING Could not download gs://...

Because the file is not there.

It can be reproduced by executing these workflows: collabora-gce@1b620ad#diff-27b2c61f02c7fb81f893a50c780dca5a

[question/image_tests/configuration]: what is expected to be verified on kernel command line args

Hi once again,

As requested on the configuration tests, there should be a "Ensure boot loader kernel command line args (per distro)." but I need more information. Which arguments do you have in mind?

[question/image_tests]: Create networks through daisy

As requested on the Multi NIC tests. I would like to know if there is a plan to support network/subnets creation on daisy to implement this test: "Ensure two VM’s with multiple interfaces sharing one VPC network can talk to each other over the VPC." as I guess it would be better to create a specific network for testing them.

EL6 translation tests fail because SDK is not installed

The logic to install the SDK on EL6 is part of the image builds. The same logic will work for translate.

cannot download binaries

Links return the error "Anonymous caller does not have storage.objects.get access to compute-image-tools/release/windows/import_precheck_release.exe."

Need more documentation on "Scopes" when creating instances

I recently needed to give a new instance read-write access to a certain GCS bucket. Our current docs show the "Scopes" field as a thing, but don't really explain how to use it. At bare minimum, we should provide a link here.

prow/presubmit-unittests reporting SUCCESS when tests failed

e.g. https://k8s-gubernator.appspot.com/build/gce-daisy-test/pr-logs/pull/GoogleCloudPlatform_compute-image-tools/295/presubmit-daisy-unit-tests/956428357128425475/

Error validating workflow "workflow": error populating workflow: APIError: dialing: missing 'type' field in credentials

I'm not sure what dialing is...

I used https://console.cloud.google.com/apis/credentials to create an OAuth 2.0 client IDs and then downloaded the resulting JSON file.

[question/image_tests]: how an instance add metadata-ssh in another instance

Hello,

To implement the metadata ssh linux integration tests I was creating two machines ("tester" and "testee"), the tester machine would generate the ssh keys and update the metadata of the testee machine and verify login.

My current problem is when I run add-metadata from one machine to the other I get an "Insufficient Permission" error. So I suppose I can't alter the ssh-keys without running gcloud auth login, is that correct? Is there any other way around?

+ gcloud compute instances add-metadata inst-metadata-ssh-testee-5yg4l --metadata-from-file ssh-keys=tester-ssh-key.pub --zone projects/447259094475/zones/us-central1-b
ERROR: (gcloud.compute.instances.add-metadata) Could not fetch resource:
 - Insufficient Permission

Here is my current draft implementation: https://github.com/helen-fornazier/compute-image-tools/blob/metadata-ssh/daisy_workflows/image_test/metadata-ssh.wf.json

translate_el.sh: fail to chroot

For both Centos 7 and RHEL 7 I get the error below, I'll dig more to find a solution.
The exist point is when it tries to execute the following and fails:
chroot ${MNT} yum -y install google-compute-engine google-compute-engine-init google-config
Other errors are also present

Debian GNU/Linux 9 inst-translator-translate-centos-7-translate-disk-stvzt ttyS0

inst-translator-translate-centos-7-translate-disk-stvzt login: Aug 22 20:52:46 localhost google_metadata_script_runner[705]: Copying gs://main-nucleus-128012-daisy-bkt/daisy-translate-centos-7-20170822-20:51:58-stvzt/sources/translate_el.sh...
Aug 22 20:52:46 localhost google_metadata_script_runner[705]: / [0 files][    0.0 B/  4.0 KiB]                                                #015/ [1 files][  4.0 KiB/  4.0 KiB]
Aug 22 20:52:46 localhost google_metadata_script_runner[705]: Operation completed over 1 objects/4.0 KiB.
Aug 22 20:52:46 localhost startup-script: INFO startup-script-url: + URL=http://metadata/computeMetadata/v1/instance/attributes
Aug 22 20:52:46 localhost startup-script: INFO startup-script-url: ++ curl -f -H Metadata-Flavor:Google http://metadata/computeMetadata/v1/instance/attributes/el_release
Aug 22 20:52:46 localhost startup-script: INFO startup-script-url:   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
Aug 22 20:52:46 localhost startup-script: INFO startup-script-url:                                  Dload  Upload   Total   Spent    Left  Speed
Aug 22 20:52:46 localhost startup-script: INFO startup-script-url: #015  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0#015100     1  100     1    0     0     66      0 --:--:-- --:--:-- --:--:--    71
Aug 22 20:52:46 localhost startup-script: INFO startup-script-url: + EL_RELEASE=7
Aug 22 20:52:46 localhost startup-script: INFO startup-script-url: ++ curl -f -H Metadata-Flavor:Google http://metadata/computeMetadata/v1/instance/attributes/install_gce_packages
Aug 22 20:52:46 localhost startup-script: INFO startup-script-url:   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
Aug 22 20:52:46 localhost startup-script: INFO startup-script-url:                                  Dload  Upload   Total   Spent    Left  Speed
Aug 22 20:52:46 localhost startup-script: INFO startup-script-url: #015  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0#015100     4  100     4    0 [   37.194660] SGI XFS with ACLs, security attributes, realtime, no debug enabled
    0    671      0 --:--:-- --:[   37.205338] XFS (sdb1): Mounting V4 Filesystem
--:-- --:--:--   800
Aug 22 20:52:46 localhost startup-script: INFO startup-script-url: + INSTALL_GCE=true
Aug 22 20:52:46 localhost startup-script: INFO startup-script-url: ++ curl -f -H Metadata-Flavor:Google http://metadata/computeMetadata/v1/instance/attributes/use_rhel_gce_license
Aug 22 20:52:46 localhost startup-script: INFO startup-script-url:   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
Aug 22 20:52:46 localhost startup-script: INFO startup-script-url:                                  Dload  Upload   Total   Spent    Left  Speed
Aug 22 20:52:46 localhost startup-script: INFO startup-script-url: #015  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0#015100     5  100     5    0     0    835      0 --:--:-- --:--:-- --:--:--  1000
Aug 22 20:52:46 localhost startup-script: INFO startup-script-url: + RHEL_LICENSE=false
Aug 22 20:52:46 localhost startup-script: INFO startup-script-url: + MNT=/mnt/imported_disk
Aug 22 20:52:46 localhost startup-script: INFO startup-script-url: + mkdir /mnt/imported_disk
Aug 22 20:52:46 localhost startup-script: INFO startup-script-url: + '[' -b /dev/sdb1 ']'
Aug 22 20:52:46 localhost startup-script: INFO startup-script-url: + echo 'Trying to mount /dev/sdb1'
Aug 22 20:52:46 localhost startup-script: INFO startup-script-url: Trying to mount /dev/sdb1
Aug 22 20:52:46 localhost startup-script: INFO startup-script-url: + mount /dev/sdb1 /mnt/imported_disk
Aug 22 20:52:47 localhost kernel: [   37.194660] SGI XFS with ACLs, security attributes, realtime, no debug enabled
Aug 22 20:52:47 localhost kernel: [   37.205338] XFS (sdb1): Mounting V4 Filesystem
[   38.631959] random: crng init done
Aug 22 20:52:48 localhost kernel: [   38.631959] random: crng init done
[   40.274745] XFS (sdb1): Ending clean mount
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: + '[' 0 -ne 0 ']'
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: + for f in proc sys dev run
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: + mount -o bind /proc /mnt/imported_disk/proc
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: mount: mount point /mnt/imported_disk/proc does not exist
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: + for f in proc sys dev run
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: + mount -o bind /sys /mnt/imported_disk/sys
Aug 22 20:52:50 localhost kernel: [   40.274745] XFS (sdb1): Ending clean mount
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: mount: mount point /mnt/imported_disk/sys does not exist
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: + for f in proc sys dev run
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: + mount -o bind /dev /mnt/imported_disk/dev
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: mount: mount point /mnt/imported_disk/dev does not exist
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: + for f in proc sys dev run
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: + mount -o bind /run /mnt/imported_disk/run
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: mount: mount point /mnt/imported_disk/run does not exist
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: + cp /etc/resolv.conf /mnt/imported_disk/etc/resolv.conf
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: cp: cannot create regular file '/mnt/imported_disk/etc/resolv.conf': No such file or directory
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: + oot /mnt/imported_disk restorecon /etc/resolv.conf
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: /startup-N0dtKF/tmpnDsYM_: line 47: oot: command not found
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: + [[ '' == \t\r\u\e ]]
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: + [[ true == \t\r\u\e ]]
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: + echo 'Installing GCE packages.'
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: Installing GCE packages.
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: + cat
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: /startup-N0dtKF/tmpnDsYM_: line 64: /etc/yum.repos.d/google-cloud.repo: No such file or directory
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: + [[ 7 == \7 ]]
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: + cat
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: /startup-N0dtKF/tmpnDsYM_: line 76: /etc/yum.repos.d/google-cloud.repo: No such file or directory
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: + chroot /mnt/imported_disk yum -y install google-cloud-sdk
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: chroot: failed to run command â€˜yumâ€™: No such file or directory
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: + chroot /mnt/imported_disk yum -y install google-compute-engine google-compute-engine-init google-config
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: chroot: failed to run command â€˜yumâ€™: No such file or directory
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: + '[' 127 -ne 0 ']'
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: + echo 'TranslateFailed: GCE package install failed.'
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: TranslateFailed: GCE package install failed.
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: + exit 1
Aug 22 20:52:50 localhost startup-script: INFO startup-script-url: Return code 1.
Aug 22 20:52:50 localhost startup-script: INFO Finished running startup scripts.
Aug 22 20:52:50 localhost systemd[1]: Started Google Compute Engine Startup Scripts.
Aug 22 20:52:50 localhost systemd[1]: Reached target Multi-User System.
Aug 22 20:52:50 localhost systemd[1]: Reached target Graphical Interface.
Aug 22 20:52:50 localhost systemd[1]: Starting Update UTMP about System Runlevel Changes...
Aug 22 20:52:50 localhost systemd[1]: Started Update UTMP about System Runlevel Changes.
Aug 22 20:52:50 localhost systemd[1]: Startup finished in 1.170s (kernel) + 38.823s (userspace) = 39.993s.

Markdown tables aren't rendering in https://googlecloudplatform.github.io/compute-image-tools/image-import.html#compatibility-precheck-tool

Prow: pool of projects - design discussion

We have some integration tests for images ready and I would like to move forward to integrate those tests in the testing infrastructure.

Several integration tests modifies metadata in the project level, those tests conflicts with each other thus can't be ran in parallel unless we have a way to tell Prow to execute those conflicting tests in different GCE projects.
This prohibits us to have a test per image. The current way to add the integration tests in Prow is to have a single workflow to serialize all the image tests and this is not ideal.

The idea is to have a configuration file where we can define which are the conflicting tests and another to inform Prow a list of GCE projects for it to use. Prow would launch the conflicting tests in parallel, each one in a different project, or serialize the execution if no project is available (if all of them are busy).

I would like to have your feedback on the matter, if this should be in the Prow level or in the workflow description, if you already have any ideas in mind of if I can go on and work on a proposal for the implementation.

Image status should be taken into account when validating image resources

Images can have multiple states including OBSOLETE- which means that the image resource exists but you can't actually use it to create disks or instances. Validation should take the state into account.

SSD Space Limits

I've tried to run the image export but have run into a wall. I'm trying to export my 100GB regular disk image but keep getting a message that I've exceeded my SSD quota even though none of my projects use SSDs.

When the workflow runs and gets to the create disk step for the temporary disk, it is trying to create a 200GB SSD which is twice my quota allotment. It appears the workflow is hard-coded to use only SSD ( I assume for performance)

  "Steps": {
    "setup-disks": {
      "CreateDisks": [
        {
          "Name": "disk-${NAME}",
          "SourceImage": "${export_instance_disk_image}",
          "SizeGb": "${export_instance_disk_size}",
          "Type": "pd-ssd"
        }
      ]
    },

It is possible to add a config to the commandline to the workflow to override the default hard-drive type? I pasted the output below from when I try to run the command below.

C:\Google\CloudSDK>gcloud compute images export --destination-uri gs://rositaimages/rostia5.vmdk --image rosita-5-base --export-format vmdk
Created [https://cloudbuild.googleapis.com/v1/projects/saftinettest/builds/9a9c8104-b5b8-4be1-b7af-21dbc5727253].
Logs are available at [https://console.cloud.google.com/gcr/builds/9a9c8104-b5b8-4be1-b7af-21dbc5727253?project=415975723976].
------------------------------------------------- REMOTE BUILD OUTPUT --------------------------------------------------
starting build "9a9c8104-b5b8-4be1-b7af-21dbc5727253"

FETCHSOURCE
BUILD
Pulling image: gcr.io/compute-image-tools/daisy:release
release: Pulling from compute-image-tools/daisy
ce230f45c7a3: Pulling fs layer
f0d1b015b384: Pulling fs layer
0777565376da: Pulling fs layer
ce230f45c7a3: Verifying Checksum
ce230f45c7a3: Download complete
0777565376da: Verifying Checksum
0777565376da: Download complete
ce230f45c7a3: Pull complete
f0d1b015b384: Verifying Checksum
f0d1b015b384: Download complete
f0d1b015b384: Pull complete
0777565376da: Pull complete
Digest: sha256:400a0b76663fa9d1ee9090f748ec1a4c441d989bad640b39161bbd82e880acd2
Status: Downloaded newer image for gcr.io/compute-image-tools/daisy:release
[Daisy] Running workflow "image-export-ext"
[image-export-ext]: 2018/04/11 15:56:51 Validating workflow
[image-export-ext]: 2018/04/11 15:56:51 Validating step "setup-disks"
[image-export-ext]: 2018/04/11 15:56:51 Validating step "export-disk"
[image-export-ext.export-disk]: 2018/04/11 15:56:51 Validating step "setup-disks"
[image-export-ext.export-disk]: 2018/04/11 15:56:52 Validating step "run-export-disk"
[image-export-ext.export-disk]: 2018/04/11 15:56:52 Validating step "wait-for-inst-export-disk"
[image-export-ext.export-disk]: 2018/04/11 15:56:52 Validating step "copy-image-object"
[image-export-ext.export-disk]: 2018/04/11 15:56:52 Validating step "delete-inst"
[image-export-ext]: 2018/04/11 15:56:52 Validation Complete
[image-export-ext]: 2018/04/11 15:56:52 Using the GCS path gs://saftinettest-daisy-bkt/daisy-image-export-ext-20180411-15:56:51-8n99b
[image-export-ext]: 2018/04/11 15:56:52 Uploading sources
[image-export-ext]: 2018/04/11 15:56:53 Running workflow
[image-export-ext]: 2018/04/11 15:56:53 Running step "setup-disks" (CreateDisks)
[image-export-ext]: 2018/04/11 15:56:53 CreateDisks: creating disk "disk-image-export-ext-image-export-ext-8n99b".
[image-export-ext]: 2018/04/11 15:57:06 Step "setup-disks" (CreateDisks) successfully finished.
[image-export-ext]: 2018/04/11 15:57:06 Running step "export-disk" (IncludeWorkflow)
[image-export-ext.export-disk]: 2018/04/11 15:57:06 Running step "setup-disks" (CreateDisks)
[image-export-ext.export-disk]: 2018/04/11 15:57:06 CreateDisks: creating disk "disk-export-disk-image-export-ext-export-disk-8n99b".
[image-export-ext]: 2018/04/11 15:57:06 Error running workflow: step "export-disk" run error: step "setup-disks" run error: googleapi: Error 403: Quota 'SSD_TOTAL_GB' exceeded. Limit: 100.0 in region us-central1., quotaExceeded
[image-export-ext]: 2018/04/11 15:57:06 Workflow "image-export-ext" cleaning up (this may take up to 2 minutes).

[Daisy] Errors in one or more workflows:
  image-export-ext: step "export-disk" run error: step "setup-disks" run error: googleapi: Error 403: Quota 'SSD_TOTAL_GB' exceeded. Limit: 100.0 in region us-central1., quotaExceeded
ERROR
ERROR: build step 0 "gcr.io/compute-image-tools/daisy:release" failed: exit status 1
------------------------------------------------------------------------------------------------------------------------

ERROR: (gcloud.compute.images.export) build 9a9c8104-b5b8-4be1-b7af-21dbc5727253 completed with status "FAILURE"

[Daisy] Soft dependency in the start order among the steps

The way the dependencies work in Daisy is: if step B depends on step A, B will only execute after A completes successfully.
But for some cases we want A and B to start in parallel, and this is not granted.
For instance, in the shutdown image tests, we had an issue where the step (A) to shut down the instance and the step (B) to wait for the logs from A should start in parallel. But sometimes daisy starts A before B, then we end up missing some logs and B never finishes.

The expected scenario is to have both steps to start in parallel:

|--------- shutdown ------------ ... ---|
|--------- wait logs ---------|

But we have this scenario from time to time:

|----- shutdown -------|
|--------- wait logs ---------|

which can cause issues if the log we were waiting for had already been printed.

We would like to grant that the wait step begins before the shutdown

|--------- shutdown ---------|
|----- wait logs -------|

Our workaround, for now, adds an extra step that shut down step depends on it to start (but this is not ideal):

|- dummy -||--------- shutdown ---------|
|----------- wait logs -----------|

If we have a step dependency that can start a step after another step has started but before it is finished would help in this case. Maybe specify how much time we want to wait after the step starts or maybe we could have some modifiers in the dependency list, what do you think?

Workflow validation doesn't catch syntax error in 'Name' field in SubWorkflow

If a workflow has an invalid name, e.g. img-test_independent, executing this workflow fails as expected:

[Daisy] Running workflow "img-test_independent" (id=zy2s1)

[Daisy] Errors in one or more workflows:
  img-test_independent: error validating workflow: workflow field 'Name' must start with a letter and only contain letters, numbers, and hyphens

But if we use this bad Workflow in a IncludeWorflow or SubWorkflow, validation doesn't fail.

Edit: I think I got it wrong. The GCE means to use it differently: https://cloud.google.com/vpc/docs/alias-ip and https://cloud.google.com/vpc/docs/using-routes . So it can be used as a machine accepting different IPs for different services (even with a single network card). Routing has to be done correctly or else these packages will never reach the correct machine.

Do you have more ideas regarding the "this may expand out"?

Ubuntu can't download public gcs file when created in daisy

Hello,

I get the following error when executing this simple workflow: https://paste.ee/p/HFEoD

startup-script: INFO Found startup-script-url in metadata.
startup-script: INFO Downloading url from gs://main-nucleus-128012-daisy-bkt/daisy-startup-script-linux-20181212-19:58:21-n5hrn/sources/startup_file_public.ps1 to /startup-ddyqfbac/tmpcg39bi_g using gsutil.
chronyd[1620]: Selected source 169.254.169.254
google_metadata_script_runner[1680]: Failure: Could not reach metadata service: Not Found.
startup-script: WARNING Could not download gs://main-nucleus-128012-daisy-bkt/daisy-startup-script-linux-20181212-19:58:21-n5hrn/sources/startup_file_public.ps1 using gsutil. Command '['gsutil', 'cp', 'gs://main-nucleus-128012-daisy-bkt/daisy-startup-script-linux-20181212-19:58:21-n5hrn/sources/startup_file_public.ps1', '/startup-ddyqfbac/tmpcg39bi_g']' returned non-zero exit status 1..

But I don't have this issue when I use gcloud with --no-scopes and --no-service-account:

gcloud compute instances create koike-u18-2 \
 --metadata=startup-script-url=gs://main-nucleus-128012-daisy-bkt/daisy-startup-script-linux-20181212-19:58:21-n5hrn/sources/startup_file_public.ps1 \
--image=projects/ubuntu-os-cloud/global/images/family/ubuntu-1804-lts \
--no-service-account --no-scopes

where the file startup_file_public.ps1 is public.

I am not entirely sure what is the difference between the instances created by daisy and gcloud, I am still investigating, please let me know if you have already seen this before.

Thanks

googlecloudplatform / compute-image-tools Goto Github PK

compute-image-tools's Introduction

Compute Image Tools

Contributing

License

compute-image-tools's People

Contributors

Stargazers

Watchers

Forkers

compute-image-tools's Issues

How to reproduce:

Fail:

Succeed:

Recommend Projects

Recommend Topics

Recommend Org

Jobs