kubernetes / cloud-provider-vsphere

Kubernetes Cloud Provider for vSphere https://cloud-provider-vsphere.sigs.k8s.io

License: Apache License 2.0

Makefile 2.74% Go 91.82% Shell 4.04% Dockerfile 1.23% Mustache 0.17%
kubernetes vsphere cloud-providers vmware


cloud-provider-vsphere's Issues

csi: add plugin scaffold

Add a scaffold for the CSI plugin that gives us something to start building and using for CI. This will be a no-op plugin at first, with each method simply returning "not implemented", which allows other efforts to begin in parallel.
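A minimal sketch of one scaffolded method, assuming the CSI Go bindings and gRPC status codes (the controller type is illustrative):

package service

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

type controller struct{}

// CreateVolume is a placeholder so the scaffold compiles and can be wired
// into CI; it simply reports that the method is not implemented yet.
func (c *controller) CreateVolume(
	ctx context.Context,
	req *csi.CreateVolumeRequest) (*csi.CreateVolumeResponse, error) {

	return nil, status.Error(codes.Unimplemented, "not implemented")
}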

csi: design plugin to support multiple APIs

When writing the CSI plugin, we need to make sure we can support a choice of backend APIs. The API can be chosen via a config flag/var. We will start by implementing FCD, but there may be future APIs, and it would be nice to only have one official VMware plugin. Or we may need to add support for something older, like the virtual disk manager. Good to keep our options open.
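A sketch of one way to keep those options open: a small interface plus a factory keyed off the config flag/var (all names here are illustrative):

package service

import "fmt"

// Backend abstracts whichever volume API the plugin drives.
type Backend interface {
	CreateVolume(name string, sizeBytes int64) (id string, err error)
	DeleteVolume(id string) error
}

// fcdBackend would wrap the First Class Disk API; other implementations
// (e.g. virtual disk manager) could be added behind the same interface.
type fcdBackend struct{}

func (f *fcdBackend) CreateVolume(name string, sizeBytes int64) (string, error) {
	return "", fmt.Errorf("not implemented")
}

func (f *fcdBackend) DeleteVolume(id string) error {
	return fmt.Errorf("not implemented")
}

// newBackend selects the implementation from a config flag/var, e.g. --api=fcd.
func newBackend(api string) (Backend, error) {
	switch api {
	case "fcd":
		return &fcdBackend{}, nil
	default:
		return nil, fmt.Errorf("unknown backend API: %q", api)
	}
}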

Standardize and enforce a Go version through the repo

Create a file with the version of Go to be used when building artifacts from this repo.

The file should be used as the source of truth in the Makefile and across all scripts; they should enforce the version contained in the file.
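One way the scripts could enforce it is a small Go check invoked from the Makefile. A sketch, assuming a .go-version file at the repo root (the file name is illustrative, not an existing convention):

package main

import (
	"fmt"
	"io/ioutil"
	"os"
	"runtime"
	"strings"
)

// Exits non-zero when the toolchain running the check does not match the
// pinned version, e.g. a .go-version file containing "1.11.4".
func main() {
	b, err := ioutil.ReadFile(".go-version")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	want := strings.TrimSpace(string(b))
	got := strings.TrimPrefix(runtime.Version(), "go") // "go1.11.4" -> "1.11.4"
	if got != want {
		fmt.Fprintf(os.Stderr, "Go %s required, found %s\n", want, got)
		os.Exit(1)
	}
}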

/assign @akutz

Allow Pushing Images with Latest Tag

Is this a BUG REPORT or FEATURE REQUEST?:
/kind feature

What happened:
When running make upload-images, Docker images are pushed and tagged based on the build hash.

What you expected to happen:
Having a tag based on the build hash is fine, but I would also like to push with the tag latest. By default, it would be nice to push both the build tag and the latest tag. Not ideal but acceptable would be optionally pushing a latest tag.

This is needed because not pushing a latest tag means that for each build I need to update my YAML to append the build tag (e.g. 12a86808) to the image name (i.e. dvonthenen/vsphere-cloud-controller-manager:12a86808).

How to reproduce it (as minimally and precisely as possible):
Run make upload-images

Anything else we need to know?:
Nope

Environment:

Code cleanup

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

What happened:

The packages under pkg/ have some dead code in them and are generally in bad shape.

What you expected to happen:

A cleanup to be performed: removal of dead code, code optimization, and removal of redundant code paths.

Anything else we need to know?:

/cc @akutz

Setup a CI job to push latest images to dockerhub

The Makefile has a make images target that builds a CCM container based on photon:2.0.

It would be great to get the ball rolling with CI by creating a job that runs a build and pushes the resulting image to Docker Hub.

/kind feature

Node resolution fails when other VMs have same VM-tools-reported guest host FQDNs

Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug

What happened:
The CCM fails during node lookup when a VM is cloned, even with a new UUID, or when VMs have the same names. I’ve seen this occur with templates and OVFs. Most recently, the CCM failed outright when I exported a VM as an OVF, imported that OVF as a new VM, and then stood up K8s on the new VM. Until I removed the original VM from the vSphere inventory, the CCM indicated that it could not resolve the node.

I have to think something is going on internal to vSphere itself that is causing issues when one or more of the following is true:

  • Multiple VMs have the same names…
  • Multiple VMs have the same host name as reported by the VM tools…
  • VMs are deployed from the same template/OVF with new hardware, but have the same files, such as /etc/machine-id, that are read by VM tools and get added to the identity of the VM…
  • Or…gremlins?

So my hypothesis seems to be close to something. Look at the UUIDs of the VMs:

[130]akutz@akutz-a01:Downloads$ govc vm.info -vm.ipath /SDDC-Datacenter/vm/Workloads/yakity-centos
Name:           yakity-centos
  Path:         /SDDC-Datacenter/vm/Workloads/yakity-centos
  UUID:         42301ab6-f495-1845-25ff-5190f281430a
  Guest name:   CentOS 7 (64-bit)
  Memory:       2048MB
  CPU:          1 vCPU(s)
  Power state:  poweredOn
  Boot time:    2018-10-17 21:40:26.36326 +0000 UTC
  IP address:   192.168.3.87
  Host:         10.2.32.4
[0]akutz@akutz-a01:Downloads$ govc vm.info -vm.ipath /SDDC-Datacenter/vm/Workloads/centos-ovf
Name:           centos-ovf
  Path:         /SDDC-Datacenter/vm/Workloads/centos-ovf
  UUID:         42301050-0acb-4e89-31d9-d10386eb4b6e
  Guest name:   CentOS 7 (64-bit)
  Memory:       2048MB
  CPU:          1 vCPU(s)
  Power state:  poweredOn
  Boot time:    2018-10-17 21:38:20.299679 +0000 UTC
  IP address:   192.168.3.52
  Host:         10.2.32.8

Now look at the errors from the CCM:

2018-10-17T16:43:55.562713931-05:00 stderr F E1017 21:43:55.562586       1 node_controller.go:417] failed to find kubelet node IP from cloud provider
2018-10-17T16:44:05.571354049-05:00 stderr F I1017 21:44:05.571209       1 instances.go:107] instances.InstanceID() called with yakity.localdomain
2018-10-17T16:44:05.571374006-05:00 stderr F I1017 21:44:05.571230       1 instances.go:111] instances.InstanceID() CACHED with yakity.localdomain
2018-10-17T16:44:05.571379606-05:00 stderr F I1017 21:44:05.571236       1 instances.go:76] instances.NodeAddressesByProviderID() called with vsphere://42301050-0acb-4e89-31d9-d10386eb4b6e
2018-10-17T16:44:05.571384433-05:00 stderr F I1017 21:44:05.571241       1 instances.go:81] instances.NodeAddressesByProviderID() CACHED with 42301050-0acb-4e89-31d9-d10386eb4b6e
2018-10-17T16:44:05.571388477-05:00 stderr F E1017 21:44:05.571250       1 node_controller.go:417] failed to find kubelet node IP from cloud provider

The VM in question is yakity-centos. The CCM on yakity-centos worked when the following were true:

  1. The VM centos-ovf was removed from the inventory
  2. The VM centos-ovf was added back to the inventory with no information yet reported from VM tools since the VM had not yet been powered on

The CCM failed on yakity-centos when the following was true:

  1. The VM centos-ovf is in the inventory and its host name has been reported by VM tools to be the same as the host name reported by yakity-centos

The centos-ovf VM doesn’t even have to be powered on to cause yakity-centos to fail. All that is required is for centos-ovf to have been powered on at least once so that VM tools reports centos-ovf’s host name as the same as yakity-centos’s.

When that’s the case, the CCM fails.

This makes complete sense when viewed in the context of the problem we saw a month ago with VM/host names.

What you expected to happen:
For the CCM to resolve the node as usual.

How to reproduce it (as minimally and precisely as possible):
I cannot be sure, but it seems related to VMs that report the same guest host names via VM tools. On our VMC environment I can reproduce it with the VMs centos-ovf and yakity-centos. Please do not touch these VMs without looping me in, however, as I'm using them at the moment.

Anything else we need to know?:
My mama says I've the prettiest brown eyes...

I think I found an issue, maybe? In instances.go, the NodeAddresses function uses FindVMByName, whereas the NodeAddressesByProviderID function uses FindVMByUUID. Why would FindVMByName ever be used?

I think this is the culprit:

// ExternalID returns the cloud provider ID of the instance identified by
// nodeName. If the instance does not exist or is no longer running, the
// returned error will be cloudprovider.InstanceNotFound.
//
// When nodeName identifies more than one instance, only the first will be
// considered.
func (i *instances) ExternalID(ctx context.Context, nodeName types.NodeName) (string, error) {
	glog.V(4).Info("instances.ExternalID() called with ", nodeName)
	return i.InstanceID(ctx, nodeName)
}

// InstanceID returns the cloud provider ID of the instance identified by nodeName.
func (i *instances) InstanceID(ctx context.Context, nodeName types.NodeName) (string, error) {
	glog.V(4).Info("instances.InstanceID() called with ", nodeName)

	// Check if node has been discovered already
	if node, ok := i.nodeManager.nodeNameMap[string(nodeName)]; ok {
		glog.V(2).Info("instances.InstanceID() CACHED with ", string(nodeName))
		return node.UUID, nil
	}

	if err := i.nodeManager.DiscoverNode(string(nodeName), FindVMByName); err == nil {
		glog.V(2).Info("instances.InstanceID() FOUND with ", string(nodeName))
		return i.nodeManager.nodeNameMap[string(nodeName)].UUID, nil
	}

	glog.V(4).Info("instances.InstanceID() NOT FOUND with ", string(nodeName))
	return "", ErrNodeNotFound
}

ExternalID calls InstanceID, which always uses FindVMByName.
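If the name-based lookup is indeed the problem, one possible direction (a sketch only; nodeUUIDMap is a hypothetical field, not something in the code above) would be to prefer UUID-based discovery whenever a UUID is already known and fall back to the ambiguous name search last:

func (i *instances) discoverNode(nodeName string) error {
	// Hypothetical: if a UUID was already learned for this node, use the
	// unambiguous UUID lookup instead of the name-based search.
	if uuid, ok := i.nodeManager.nodeUUIDMap[nodeName]; ok {
		return i.nodeManager.DiscoverNode(uuid, FindVMByUUID)
	}
	return i.nodeManager.DiscoverNode(nodeName, FindVMByName)
}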

Maybe. I don’t know. I’m sure this all works 99.9999% of the time and there’s just some weird edge case I’m exacerbating. That and my lack of knowledge of the workflow probably means I’m making mountains out of molehills and seeing bugs where there aren’t any.

Environment:

  • vsphere-cloud-controller-manager version: latest
  • OS (e.g. from /etc/os-release):
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
  • Kernel (e.g. uname -a):
Linux yakity.localdomain 3.10.0-862.14.4.el7.x86_64 #1 SMP Wed Sep 26 15:12:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools: yakity
  • Others:

Bug: gopkg.toml should be updated with dependencies

BUG REPORT:
Some commits imported packages that were not added to Gopkg.toml. This causes compile errors in a local dev environment. So, it would be better to update Gopkg.toml along with each PR commit.

/kind bug

What happened:
For example:
Gopkg.toml only contains minimal dependencies, like k8s.io/api, k8s.io/apimachinery, k8s.io/client-go, k8s.io/kubernetes, etc.

Newly added dependencies are missing, like k8s.io/sample-controller, etc.

What you expected to happen:
Update the toml file by adding each new dependency.

Anything else we need to know?:

Environment:

  • vsphere-cloud-controller-manager version:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

csi: implement Controller publish/unpublish

As part of implementing the CSI spec, add capability for Controller{Publish,Unpublish}. This functionality reaches out to vCenter and attaches/detaches the volume to/from the requested node.
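A sketch of the ControllerPublishVolume shape from the CSI spec; attachDisk is a hypothetical helper standing in for the vCenter attach call:

package service

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

type controller struct{}

// attachDisk stands in for the vCenter API call (e.g. via govmomi).
func (c *controller) attachDisk(ctx context.Context, volumeID, nodeID string) error {
	return nil // the real implementation would reach out to vCenter here
}

func (c *controller) ControllerPublishVolume(
	ctx context.Context,
	req *csi.ControllerPublishVolumeRequest) (*csi.ControllerPublishVolumeResponse, error) {

	// req.VolumeId identifies the disk; req.NodeId the VM to attach it to.
	if err := c.attachDisk(ctx, req.VolumeId, req.NodeId); err != nil {
		return nil, status.Error(codes.Internal, err.Error())
	}
	return &csi.ControllerPublishVolumeResponse{}, nil
}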

Create a SECURITY_CONTACTS file.

As per the email sent to kubernetes-dev[1], please create a SECURITY_CONTACTS
file.

The template for the file can be found in the kubernetes-template repository[2].
A description for the file is in the steering-committee docs[3], you might need
to search that page for "Security Contacts".

Please feel free to ping me on the PR when you make it, otherwise I will see when
you close this issue. :)

Thanks so much, let me know if you have any questions.

(This issue was generated from a tool, apologies for any weirdness.)

[1] https://groups.google.com/forum/#!topic/kubernetes-dev/codeiIoQ6QE
[2] https://github.com/kubernetes/kubernetes-template-project/blob/master/SECURITY_CONTACTS
[3] https://github.com/kubernetes/community/blob/master/committee-steering/governance/sig-governance-template-short.md

Implement unit tests for pkg/cloudprovider/vsphere

We need unit tests for the cloud provider code to ensure we do not introduce breaking changes.

Currently pkg/cloudprovider/vsphere has no unit tests; we should implement an initial set of unit tests for the package.

These tests should be part of the unit Makefile target that is automatically run as part of the test target, which in turn is run by our CI job.
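A sketch of the kind of test the package could start with, assuming the provider type is VSphere and ProviderName() returns the registered name (both assumptions about the current code):

package vsphere

import "testing"

func TestProviderName(t *testing.T) {
	vs := &VSphere{}
	if got, want := vs.ProviderName(), "vsphere"; got != want {
		t.Errorf("ProviderName() = %q, want %q", got, want)
	}
}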

/kind feature

Set up unit testing for CCM code

This would ideally run make test on the Tide testing infrastructure. This issue is for the creation of the hook; make test can be a no-op for now.

/kind feature
/help

Move from intree to out of tree

Hello

I am trying to move vSphere from in-tree to out-of-tree, but when checking the docs I see that we can install deploying_cloud_provider_vsphere_with_rbac.md and deploying_csi_vsphere_with_rbac.md. What is the difference between them, and should I install both?

Thanks

csi: should /etc/cloud/vsphere.conf be required?

While testing CSI, I came across this and wanted to capture it before I forgot.

When starting the CSI plugin on a new node, I get an error:

$ ./vsphere-csi
FATA[0000] Failed to open /etc/cloud/vsphere.conf . Err: open /etc/cloud/vsphere.conf: no such file or directory

I was expecting to be able to set all the needed parameters via env vars for my simple case, but if the referenced config file is not present, the plugin does not start up. Furthermore, when looking at the code here, it looks like the config file takes precedence over env vars. I'm used to the other way around: using env vars to override something that may be in a config file.

I wanted to use this issue to clarify that behavior and see if we could agree on what is desired. And then document it.
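For illustration, the precedence I expected looks like this (lookup is a hypothetical helper, not the plugin's actual code):

package main

import "os"

// lookup prefers the environment variable and only falls back to the
// value read from the config file when the env var is unset.
func lookup(fileValue, envKey string) string {
	if v := os.Getenv(envKey); v != "" {
		return v // env vars override the config file
	}
	return fileValue
}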

dep: Removing non-Go files from dependencies

Is this a BUG REPORT or FEATURE REQUEST?:
/kind feature

FWIW, I added the following to the Gopkg.toml file and did a clean dep ensure:

[prune]
  non-go = true
  go-tests = true
  unused-packages = true

I wanted to see how much disk space was saved, if any, by removing non-Go files from the project's dependencies. The answer? Only about 2MB:

Keep non-Go files

$ du -m -d 0 .vendor
50	.vendor

Remove non-Go files

$ du -m -d 0 vendor
48	vendor

So for now, at least, it does not make sense to remove non-Go files from dependencies, as the cost savings are not worth the risk of losing content that may be necessary for testing.

What happened: NA

What you expected to happen: NA

How to reproduce it (as minimally and precisely as possible): NA

Anything else we need to know?: NA

Environment: NA

csi: Implement volume create/delete

As part of implementing CSI spec functionality, the first step is adding volume create and delete.

Right now this is simplistic, as it just goes to the identified datastore, rather than a datastore cluster.

Committing vendor

/kind feature

Have the project maintainers considered committing vendor? It tends to make project management much easier.

Failed to run CSI with incompatible version, csi-controller gone into unrecoverable state

BUG REPORT

/kind bug

What happened:
Installed k8s v1.13.2.
cloud-provider=external.
vsphere.conf with secrets.

Installed cloud-controller-manager according to the docs.
Works OK.

Installed CSI according to the docs.
Started.

So far so good, added storage class 'vsphere-gold' and created pvc.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-vsphere-csi-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: vsphere-gold

I got an error:
kubectl --kubeconfig admin.conf -n kube-system logs -f vsphere-csi-controller-0 vsphere-csi-controller

time="2019-01-21T15:19:45Z" level=debug msg="enabled context injector"
time="2019-01-21T15:19:45Z" level=debug msg="init req & rep validation" withSpec=false
time="2019-01-21T15:19:45Z" level=debug msg="init implicit rep validation" withSpecRep=false
time="2019-01-21T15:19:45Z" level=debug msg="init req validation" withSpecReq=false
time="2019-01-21T15:19:45Z" level=debug msg="enabled request ID injector"
time="2019-01-21T15:19:45Z" level=debug msg="enabled request logging"
time="2019-01-21T15:19:45Z" level=debug msg="enabled response logging"
time="2019-01-21T15:19:45Z" level=debug msg="enabled serial volume access"
time="2019-01-21T15:19:45Z" level=info msg="configured: io.k8s.cloud-provider-vsphere.vsphere" api=FCD mode=controller
time="2019-01-21T15:19:45Z" level=info msg="identity service registered"
time="2019-01-21T15:19:45Z" level=info msg="controller service registered"
time="2019-01-21T15:19:45Z" level=info msg=serving endpoint="unix:///var/lib/csi/sockets/pluginproxy/csi.sock"
time="2019-01-21T15:19:46Z" level=debug msg="/csi.v1.Identity/Probe: REQ 0001: XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:19:46Z" level=debug msg="/csi.v1.Identity/Probe: REP 0001: XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:19:46Z" level=debug msg="/csi.v1.Identity/GetPluginInfo: REQ 0002: XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:19:46Z" level=debug msg="/csi.v1.Identity/GetPluginInfo: REP 0002: Name=io.k8s.cloud-provider-vsphere.vsphere, VendorVersion=v0.1.1, XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:19:46Z" level=debug msg="/csi.v1.Identity/GetPluginCapabilities: REQ 0003: XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:19:46Z" level=debug msg="/csi.v1.Identity/GetPluginCapabilities: REP 0003: Capabilities=[service:<type:CONTROLLER_SERVICE > ], XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:19:46Z" level=debug msg="/csi.v1.Controller/ControllerGetCapabilities: REQ 0004: XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:19:46Z" level=debug msg="/csi.v1.Controller/ControllerGetCapabilities: REP 0004: Capabilities=[rpc:<type:LIST_VOLUMES >  rpc:<type:CREATE_DELETE_VOLUME >  rpc:<type:PUBLISH_UNPUBLISH_VOLUME > ], XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:28:35Z" level=debug msg="/csi.v1.Identity/GetPluginCapabilities: REQ 0005: XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:28:35Z" level=debug msg="/csi.v1.Identity/GetPluginCapabilities: REP 0005: Capabilities=[service:<type:CONTROLLER_SERVICE > ], XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:28:35Z" level=debug msg="/csi.v1.Controller/ControllerGetCapabilities: REQ 0006: XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:28:35Z" level=debug msg="/csi.v1.Controller/ControllerGetCapabilities: REP 0006: Capabilities=[rpc:<type:LIST_VOLUMES >  rpc:<type:CREATE_DELETE_VOLUME >  rpc:<type:PUBLISH_UNPUBLISH_VOLUME > ], XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:28:35Z" level=debug msg="/csi.v1.Identity/GetPluginInfo: REQ 0007: XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:28:35Z" level=debug msg="/csi.v1.Identity/GetPluginInfo: REP 0007: Name=io.k8s.cloud-provider-vsphere.vsphere, VendorVersion=v0.1.1, XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:28:35Z" level=debug msg="/csi.v1.Controller/CreateVolume: REQ 0008: Name=pvc-34370d70-1d91-11e9-8953-00505697cc54, CapacityRange=required_bytes:1073741824 , VolumeCapabilities=[mount:<fs_type:\"ext4\" > access_mode:<mode:SINGLE_NODE_WRITER > ], Parameters=map[parent_type:Datastore parent_name:ESXi-CL1-VV1 (RAID 50)], XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
ERROR: logging before flag.Parse: E0121 15:28:40.988002       1 connection.go:63] Failed to create govmomi client. err: ServerFaultCode: Cannot complete login due to an incorrect user name or password.
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x15a6d36]

Based on 'Cannot complete login due to an incorrect user name or password', I decided there was an authentication issue.
Recreated the cluster.
Added server, username, and password to vsphere.conf.
Installed CCM and CSI.
Created the PVC and got:

kubectl --kubeconfig admin.conf -n kube-system logs -f vsphere-csi-controller-0 vsphere-csi-controller
time="2019-01-21T15:54:15Z" level=debug msg="enabled context injector"
time="2019-01-21T15:54:15Z" level=debug msg="init req & rep validation" withSpec=false
time="2019-01-21T15:54:15Z" level=debug msg="init implicit rep validation" withSpecRep=false
time="2019-01-21T15:54:15Z" level=debug msg="init req validation" withSpecReq=false
time="2019-01-21T15:54:15Z" level=debug msg="enabled request ID injector"
time="2019-01-21T15:54:15Z" level=debug msg="enabled request logging"
time="2019-01-21T15:54:15Z" level=debug msg="enabled response logging"
time="2019-01-21T15:54:15Z" level=debug msg="enabled serial volume access"
time="2019-01-21T15:54:15Z" level=info msg="configured: io.k8s.cloud-provider-vsphere.vsphere" api=FCD mode=controller
time="2019-01-21T15:54:15Z" level=info msg="identity service registered"
time="2019-01-21T15:54:15Z" level=info msg="controller service registered"
time="2019-01-21T15:54:15Z" level=info msg=serving endpoint="unix:///var/lib/csi/sockets/pluginproxy/csi.sock"
time="2019-01-21T15:54:16Z" level=debug msg="/csi.v1.Identity/Probe: REQ 0001: XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:54:16Z" level=debug msg="/csi.v1.Identity/Probe: REP 0001: XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:54:16Z" level=debug msg="/csi.v1.Identity/GetPluginInfo: REQ 0002: XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:54:16Z" level=debug msg="/csi.v1.Identity/GetPluginInfo: REP 0002: Name=io.k8s.cloud-provider-vsphere.vsphere, VendorVersion=v0.1.1, XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:54:16Z" level=debug msg="/csi.v1.Identity/GetPluginCapabilities: REQ 0003: XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:54:16Z" level=debug msg="/csi.v1.Identity/GetPluginCapabilities: REP 0003: Capabilities=[service:<type:CONTROLLER_SERVICE > ], XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:54:16Z" level=debug msg="/csi.v1.Controller/ControllerGetCapabilities: REQ 0004: XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:54:16Z" level=debug msg="/csi.v1.Controller/ControllerGetCapabilities: REP 0004: Capabilities=[rpc:<type:LIST_VOLUMES >  rpc:<type:CREATE_DELETE_VOLUME >  rpc:<type:PUBLISH_UNPUBLISH_VOLUME > ], XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:58:19Z" level=debug msg="/csi.v1.Identity/GetPluginCapabilities: REQ 0005: XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:58:19Z" level=debug msg="/csi.v1.Identity/GetPluginCapabilities: REP 0005: Capabilities=[service:<type:CONTROLLER_SERVICE > ], XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:58:19Z" level=debug msg="/csi.v1.Controller/ControllerGetCapabilities: REQ 0006: XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:58:19Z" level=debug msg="/csi.v1.Controller/ControllerGetCapabilities: REP 0006: Capabilities=[rpc:<type:LIST_VOLUMES >  rpc:<type:CREATE_DELETE_VOLUME >  rpc:<type:PUBLISH_UNPUBLISH_VOLUME > ], XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:58:19Z" level=debug msg="/csi.v1.Identity/GetPluginInfo: REQ 0007: XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:58:19Z" level=debug msg="/csi.v1.Identity/GetPluginInfo: REP 0007: Name=io.k8s.cloud-provider-vsphere.vsphere, VendorVersion=v0.1.1, XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
time="2019-01-21T15:58:19Z" level=debug msg="/csi.v1.Controller/CreateVolume: REQ 0008: Name=pvc-5ee40328-1d95-11e9-8858-0050569742c5, CapacityRange=required_bytes:1073741824 , VolumeCapabilities=[mount:<fs_type:\"ext4\" > access_mode:<mode:SINGLE_NODE_WRITER > ], Parameters=map[parent_name:ESXi-CL1-VV1 (RAID 50) parent_type:Datastore], XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x15a6d36]

goroutine 76 [running]:
k8s.io/cloud-provider-vsphere/vendor/github.com/vmware/govmomi/vslm.NewObjectManager(...)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/github.com/vmware/govmomi/vslm/object_manager.go:41
k8s.io/cloud-provider-vsphere/pkg/common/vclib.(*DatastoreInfo).GetFirstClassDiskInfo(0xc00072e270, 0x1e3c700, 0xc0006071d0, 0xc0006453b0, 0x28, 0x1, 0x0, 0x0, 0x41199c)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/pkg/common/vclib/datastore.go:197 +0x86
k8s.io/cloud-provider-vsphere/pkg/common/vclib.(*Datacenter).GetFirstClassDisk(0xc0000a6228, 0x1e3c700, 0xc0006071d0, 0xc0000a5280, 0x16, 0x1bd6372, 0x9, 0xc0006453b0, 0x28, 0x1, ...)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/pkg/common/vclib/datacenter.go:518 +0x1ab
k8s.io/cloud-provider-vsphere/pkg/csi/service/fcd.(*controller).CreateVolume(0xc00000ab80, 0x1e3c700, 0xc0006071d0, 0xc0005b83f0, 0xc00000ab80, 0xc000620e01, 0xc000185390)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/pkg/csi/service/fcd/controller.go:132 +0x4a0
k8s.io/cloud-provider-vsphere/vendor/github.com/container-storage-interface/spec/lib/go/csi._Controller_CreateVolume_Handler.func1(0x1e3c700, 0xc0006071d0, 0x1b46320, 0xc0005b83f0, 0x0, 0x1e2a300, 0xc000620e40, 0x0)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/github.com/container-storage-interface/spec/lib/go/csi/csi.pb.go:4579 +0x86
k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/middleware/serialvolume.(*interceptor).createVolume(0xc0000b0d80, 0x1e3c700, 0xc0006071d0, 0xc0005b83f0, 0xc000620c60, 0xc000620c80, 0x0, 0x0, 0x0, 0x0)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/middleware/serialvolume/serial_volume_locker.go:162 +0x161
k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/middleware/serialvolume.(*interceptor).handle(0xc0000b0d80, 0x1e3c700, 0xc0006071d0, 0x1b46320, 0xc0005b83f0, 0xc000620c60, 0xc000620c80, 0x180, 0x14d, 0x0, ...)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/middleware/serialvolume/serial_volume_locker.go:90 +0x37d
k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/middleware/serialvolume.(*interceptor).handle-fm(0x1e3c700, 0xc0006071d0, 0x1b46320, 0xc0005b83f0, 0xc000620c60, 0xc000620c80, 0xc00036a0c0, 0xc000185558, 0x4c7ab7, 0xc00036a0c0)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/middleware/serialvolume/serial_volume_locker.go:71 +0x73
k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1(0x1e3c700, 0xc0006071d0, 0x1b46320, 0xc0005b83f0, 0x14d, 0x0, 0xc00036a0c0, 0x0)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/utils/utils_middleware.go:99 +0x63
k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/middleware/logging.(*interceptor).handleServer.func1(0x1e18160, 0xc0000b0c80, 0xc000185670, 0x1)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/middleware/logging/logging_interceptor.go:84 +0x49
k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/middleware/logging.(*interceptor).handle(0xc0000b0ce0, 0x1e3c700, 0xc0006071d0, 0x1c0b16f, 0x1f, 0x1b46320, 0xc0005b83f0, 0xc000185720, 0xc000000064, 0x412da5, ...)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/middleware/logging/logging_interceptor.go:130 +0xc6
k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/middleware/logging.(*interceptor).handleServer(0xc0000b0ce0, 0x1e3c700, 0xc0006071d0, 0x1b46320, 0xc0005b83f0, 0xc000620c60, 0xc000620ca0, 0xc0003f57c0, 0x0, 0xc00023a000, ...)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/middleware/logging/logging_interceptor.go:83 +0xe0
k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/middleware/logging.(*interceptor).handleServer-fm(0x1e3c700, 0xc0006071d0, 0x1b46320, 0xc0005b83f0, 0xc000620c60, 0xc000620ca0, 0x10, 0x17ddfe0, 0x316da01, 0xffffffffffffffff)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/middleware/logging/logging_interceptor.go:58 +0x73
k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1(0x1e3c700, 0xc0006071d0, 0x1b46320, 0xc0005b83f0, 0x8, 0x0, 0x0, 0x30)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/utils/utils_middleware.go:99 +0x63
k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/middleware/requestid.(*interceptor).handleServer(0xc000045860, 0x1e3c700, 0xc0006071d0, 0x1b46320, 0xc0005b83f0, 0xc000620c60, 0xc000620cc0, 0xc000333918, 0x4d7b4b, 0x1ab6f40, ...)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/middleware/requestid/request_id_injector.go:83 +0x28e
k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/middleware/requestid.(*interceptor).handleServer-fm(0x1e3c700, 0xc0006071d0, 0x1b46320, 0xc0005b83f0, 0xc000620c60, 0xc000620cc0, 0x1880ba0, 0xc0003f57b0, 0x1e3c700, 0xc0006071d0)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/middleware/requestid/request_id_injector.go:24 +0x73
k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1(0x1e3c700, 0xc0006071d0, 0x1b46320, 0xc0005b83f0, 0xc0006071d0, 0x20, 0x20, 0x1a6cc00)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/utils/utils_middleware.go:99 +0x63
k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi.(*StoragePlugin).injectContext(0xc000416000, 0x1e3c700, 0xc000606ea0, 0x1b46320, 0xc0005b83f0, 0xc000620c60, 0xc000620ce0, 0x0, 0x0, 0xc00023a000, ...)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/middleware.go:226 +0xa7
k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi.(*StoragePlugin).injectContext-fm(0x1e3c700, 0xc000606ea0, 0x1b46320, 0xc0005b83f0, 0xc000620c60, 0xc000620ce0, 0x854efa, 0x1a6cc00, 0xc000620d00, 0xc000620c60)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/middleware.go:22 +0x73
k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1(0x1e3c700, 0xc000606ea0, 0x1b46320, 0xc0005b83f0, 0x31ed738, 0xc000333ae8, 0x40c1d8, 0x20)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/utils/utils_middleware.go:99 +0x63
k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/utils.ChainUnaryServer.func2(0x1e3c700, 0xc000606ea0, 0x1b46320, 0xc0005b83f0, 0xc000620c60, 0xc000620c80, 0x1882e80, 0x31ed738, 0x1b7d5e0, 0xc00002c900)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/github.com/rexray/gocsi/utils/utils_middleware.go:106 +0xf9
k8s.io/cloud-provider-vsphere/vendor/github.com/container-storage-interface/spec/lib/go/csi._Controller_CreateVolume_Handler(0x1b0d460, 0xc00000ab80, 0x1e3c700, 0xc000606ea0, 0xc0005b8380, 0xc00000b340, 0x0, 0x0, 0x30, 0xc0003a6660)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/github.com/container-storage-interface/spec/lib/go/csi/csi.pb.go:4581 +0x158
k8s.io/cloud-provider-vsphere/vendor/google.golang.org/grpc.(*Server).processUnaryRPC(0xc000163180, 0x1e492e0, 0xc00066d000, 0xc00002c900, 0xc0003e90e0, 0x31bf9a0, 0x0, 0x0, 0x0)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/google.golang.org/grpc/server.go:1026 +0x4cd
k8s.io/cloud-provider-vsphere/vendor/google.golang.org/grpc.(*Server).handleStream(0xc000163180, 0x1e492e0, 0xc00066d000, 0xc00002c900, 0x0)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/google.golang.org/grpc/server.go:1252 +0x1308
k8s.io/cloud-provider-vsphere/vendor/google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc00060c4f0, 0xc000163180, 0x1e492e0, 0xc00066d000, 0xc00002c900)
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/google.golang.org/grpc/server.go:699 +0x9f
created by k8s.io/cloud-provider-vsphere/vendor/google.golang.org/grpc.(*Server).serveStreams.func1
        /Users/trhoden/go/src/k8s.io/cloud-provider-vsphere/vendor/google.golang.org/grpc/server.go:697 +0xa1

Afterwards, the container failed permanently, with this log from vsphere-csi-controller:
time="2019-01-22T09:00:40Z" level=fatal msg="failed to listen" error="listen unix /var/lib/csi/sockets/pluginproxy/csi.sock: bind: address already in use"

What you expected to happen:
Understandable error messages from CSI.
Recovery after failure.
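A minimal sketch of the kind of guard that would turn the panic above into a recoverable error (connectVCenter and the surrounding wiring are assumptions, not the plugin's actual helpers):

package service

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"github.com/vmware/govmomi"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// connectVCenter is a hypothetical helper that logs in and returns a
// ready-to-use client, or an error when login fails.
func connectVCenter(ctx context.Context) (*govmomi.Client, error) {
	return nil, status.Error(codes.Unavailable, "not wired up in this sketch")
}

// createVolumeChecked verifies the vCenter connection before any vslm
// call, so a bad username/password surfaces as a gRPC error instead of a
// nil-pointer panic, and the container can retry.
func createVolumeChecked(
	ctx context.Context,
	req *csi.CreateVolumeRequest) (*csi.CreateVolumeResponse, error) {

	if _, err := connectVCenter(ctx); err != nil {
		return nil, status.Errorf(codes.Unavailable, "vCenter login failed: %v", err)
	}
	// ...proceed with vslm.NewObjectManager and FCD creation here...
	return &csi.CreateVolumeResponse{}, nil
}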

How to reproduce it (as minimally and precisely as possible):
Deploy k8s v1.13.1
Follow the current instructions from the docs.

Anything else we need to know?:
It runs on vCenter 5.5.
Logs from the CCM:
I0122 09:12:06.986602 1 event.go:221] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"my-vsphere-csi-pvc", UID:"d711fa42-1e23-11e9-8a47-0050569729eb", APIVersion:"v1", ResourceVersion:"164789", FieldPath:""}): type: 'Normal' reason: 'ExternalProvisioning' waiting for a volume to be created, either by external provisioner "io.k8s.cloud-provider-vsphere.vsphere" or manually created by system administrator
Environment:

  • vsphere-cloud-controller-manager version:
    gcr.io/cloud-provider-vsphere/vsphere-cloud-controller-manager:latest
    latest from google repository
  • OS (e.g. from /etc/os-release):
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
  • Kernel (e.g. uname -a):
    Linux k8s-master-0 3.10.0-957.1.3.el7.x86_64 #1 SMP Thu Nov 29 14:49:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools:
yum info open-vm-tools
Installed Packages
Name        : open-vm-tools
Arch        : x86_64
Version     : 10.2.5
Release     : 3.el7
NAME                                     READY     STATUS             RESTARTS   AGE       IP             NODE
coredns-6fd7dbf94c-4brvg                 1/1       Running            0          18h       10.244.5.2     k8s-worker-2
coredns-6fd7dbf94c-blt57                 1/1       Running            0          18h       10.244.2.3     k8s-master-1
dns-autoscaler-5b4847c446-wqfs6          1/1       Running            0          18h       10.244.3.2     k8s-worker-1
kube-apiserver-k8s-master-0              1/1       Running            0          18h       10.100.0.110   k8s-master-0
kube-apiserver-k8s-master-1              1/1       Running            0          18h       10.100.0.111   k8s-master-1
kube-apiserver-k8s-master-2              1/1       Running            0          18h       10.100.0.112   k8s-master-2
kube-controller-manager-k8s-master-0     1/1       Running            0          18h       10.100.0.110   k8s-master-0
kube-controller-manager-k8s-master-1     1/1       Running            0          18h       10.100.0.111   k8s-master-1
kube-controller-manager-k8s-master-2     1/1       Running            0          18h       10.100.0.112   k8s-master-2
kube-flannel-6w66f                       2/2       Running            0          18h       10.100.0.115   k8s-worker-2
kube-flannel-fk85t                       2/2       Running            0          18h       10.100.0.110   k8s-master-0
kube-flannel-m7mhv                       2/2       Running            0          18h       10.100.0.112   k8s-master-2
kube-flannel-wzfx4                       2/2       Running            0          18h       10.100.0.111   k8s-master-1
kube-flannel-z2nb6                       2/2       Running            0          18h       10.100.0.114   k8s-worker-1
kube-flannel-zhxtq                       2/2       Running            0          18h       10.100.0.113   k8s-worker-0
kube-proxy-8ldbh                         1/1       Running            0          18h       10.100.0.110   k8s-master-0
kube-proxy-9sdts                         1/1       Running            0          18h       10.100.0.112   k8s-master-2
kube-proxy-cqq72                         1/1       Running            0          18h       10.100.0.113   k8s-worker-0
kube-proxy-rj8jt                         1/1       Running            0          18h       10.100.0.115   k8s-worker-2
kube-proxy-v2x8p                         1/1       Running            0          18h       10.100.0.114   k8s-worker-1
kube-proxy-wtx5z                         1/1       Running            0          18h       10.100.0.111   k8s-master-1
kube-scheduler-k8s-master-0              1/1       Running            0          18h       10.100.0.110   k8s-master-0
kube-scheduler-k8s-master-1              1/1       Running            0          18h       10.100.0.111   k8s-master-1
kube-scheduler-k8s-master-2              1/1       Running            0          18h       10.100.0.112   k8s-master-2
kubernetes-dashboard-8457c55f89-ghb8m    1/1       Running            0          18h       10.244.2.2     k8s-master-1
tiller-deploy-56794866d9-c7g5k           1/1       Running            0          18h       10.244.3.4     k8s-worker-1
vsphere-cloud-controller-manager-88s4g   1/1       Running            0          18h       10.100.0.111   k8s-master-1
vsphere-cloud-controller-manager-k9drk   1/1       Running            0          18h       10.100.0.112   k8s-master-2
vsphere-cloud-controller-manager-xjwwr   1/1       Running            0          18h       10.100.0.110   k8s-master-0
vsphere-csi-controller-0                 2/3       CrashLoopBackOff   6          16h       10.244.0.7     k8s-master-0
vsphere-csi-node-l2c5h                   2/2       Running            0          18h       10.100.0.115   k8s-worker-2
vsphere-csi-node-q4xw5                   2/2       Running            0          18h       10.100.0.114   k8s-worker-1
vsphere-csi-node-v8vsj                   2/2       Running            0          18h       10.100.0.113   k8s-worker-0
  • Others:

Merge and improve the documentation of the cloud provider

Is this a BUG REPORT or FEATURE REQUEST?:

/kind improvement

What happened:
Currently, the ownership and status of the in-tree cloud provider documentation are unclear. Also, it ties the cloud provider to the storage-related project "Hatchway" (https://vmware.github.io/vsphere-storage-for-kubernetes/documentation/), where a separation of concerns is perhaps better.

Going forward with both the in-tree and out-of-tree cloud providers, we should consolidate, re-host, and clean up the documentation (supported versions, etc.). Related tools and scripts, e.g. the VCP UX deployment script, should also be validated against the latest versions or marked as outdated, as the current version (https://github.com/vmware/kubernetes/tree/enable-vcp-uxi) seems to be broken with K8s 1.11.

There are several open issues regarding the documentation, e.g. vmware-archive/kubernetes-archived#479 and vmware-archive/kubernetes-archived#491. Also, the current documentation seems to be incorrect with regard to the vsphere.conf on the workers: vmware-archive/kubernetes-archived#501

`kubectl get pv` is not listing dynamically created PVs by vSphere CSI Driver

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:
A dynamically provisioned persistent volume created by the vSphere CSI driver does not get listed by the kubectl get pv command.

kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.2", GitCommit:"cff46ab41ff0bb44d8584413b598ad8360ec1def", GitTreeState:"clean", BuildDate:"2019-01-10T23:35:51Z", GoVersion:"go1.11.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.2", GitCommit:"cff46ab41ff0bb44d8584413b598ad8360ec1def", GitTreeState:"clean", BuildDate:"2019-01-10T23:28:14Z", GoVersion:"go1.11.4", Compiler:"gc", Platform:"linux/amd64"}
# kubectl get pv
No resources found.

The strange thing is that kubectl describe pv does list the PV.

# kubectl describe pv
Name:            pvc-b4d9f41b-2441-11e9-9c0e-005056a43de5
Labels:          <none>
Annotations:     pv.kubernetes.io/provisioned-by: io.k8s.cloud-provider-vsphere.vsphere
Finalizers:      []
StorageClass:    my-vsphere-fcd-class
Status:          Pending
Claim:           default/my-vsphere-csi-pvc
Reclaim Policy:  Delete
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        5Gi
Node Affinity:   <none>
Message:         
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            io.k8s.cloud-provider-vsphere.vsphere
    VolumeHandle:      167ba236-784f-4237-b17e-37dc519bf8b0
    ReadOnly:          false
    VolumeAttributes:      datacenter=vcqaDC
                           name=pvc-b4d9f41b-2441-11e9-9c0e-005056a43de5
                           parent_name=sharedVmfs-0
                           parent_type=Datastore
                           storage.kubernetes.io/csiProvisionerIdentity=1548817403656-8081-io.k8s.cloud-provider-vsphere.vsphere
                           type=First Class Disk
                           vcenter=10.161.155.44
Events:                <none>

If I specify the PV name with kubectl get pv pvc-b4d9f41b-2441-11e9-9c0e-005056a43de5, the PV is listed.

# kubectl get pv pvc-b4d9f41b-2441-11e9-9c0e-005056a43de5
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                        STORAGECLASS           REASON   AGE
pvc-b4d9f41b-2441-11e9-9c0e-005056a43de5   5Gi        RWO            Delete           Pending   default/my-vsphere-csi-pvc   my-vsphere-fcd-class            13h

What you expected to happen:
kubectl get pv should list PVs created by io.k8s.cloud-provider-vsphere.vsphere

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • vsphere-cloud-controller-manager version:
    gcr.io/cloud-provider-vsphere/vsphere-csi:v0.1.1
    gcr.io/cloud-provider-vsphere/vsphere-cloud-controller-manager:latest

  • OS (e.g. from /etc/os-release):

  • Kernel (e.g. uname -a):

  • Install tools:

  • Others:

Add tests to vcpctl

Tests are currently missing from vcpctl. They need to cover the functionality in the pkg/cli package.

We also need to investigate how to implement E2E tests for vcpctl.

csi: Add e2e tests for plugin that use VMC

The goal here is to have a set of tests that can run on VMC infrastructure and exercise the E2E functionality of the CSI plugin.

To this end, the following tasks need to be completed:

  • Add support to sk8 for deploying the CSI plugin
  • Determine how to build an e2e.test binary from tests housed in this repo that use the existing K8s testing framework
  • Determine which E2E storage tests from the K8s repo can be run on VMC infrastructure. This will be a subset of what is there now, as most "disruptive" tests (e.g. those that reboot vCenter infrastructure components) cannot be run on VMC.
  • Determine what/where/how to run "destructive" tests, or tests that require additional infrastructure that is hard to achieve on VMC (e.g. multi-vcenter testing, zones, etc.).
  • Once prow jobs can launch tests onto VMC, run the E2E tests there and report status back to testgrid

These tests would be the "real" tests that run on VMC against an actual vCenter instance.

Refactor NodeManager to use PropertyCollector

Currently, state changes within vCenter are discovered by manually fetching the MoRef and then polling the object for changes. We should refactor this code to use the PropertyCollector service offered in vCenter >= 4.1; a sketch follows.
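A minimal sketch of the PropertyCollector approach using govmomi (the watched properties are illustrative):

package main

import (
	"context"

	"github.com/vmware/govmomi/property"
	"github.com/vmware/govmomi/vim25"
	"github.com/vmware/govmomi/vim25/types"
)

// watchVM blocks until vCenter reports changes to the listed properties,
// instead of polling the object on a timer.
func watchVM(ctx context.Context, c *vim25.Client, vm types.ManagedObjectReference) error {
	pc := property.DefaultCollector(c)
	return property.Wait(ctx, pc, vm, []string{"guest.net", "runtime.powerState"},
		func(changes []types.PropertyChange) bool {
			for _, ch := range changes {
				_ = ch // update the NodeManager's cache here
			}
			return false // keep waiting for further updates
		})
}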

/kind feature

Use a better property for reconciliation between nodes and VMs

The current in-tree provider code assumes that the CCM code is running on every node of the cluster and uses the VM UUID (fetched locally from /sys/class/dmi/id/product_serial) to perform reconciliation between the kubernetes node and its underlying VM.

With the out-of-tree provider, that assumption no longer holds, and the current code uses DNS names to perform reconciliation between node and VM; while this works in practice, it is suboptimal.

We should investigate a better way to implement this reconciliation.
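For context, a sketch (with simplified parsing) of what the in-tree approach relies on; reading the serial from inside the guest is exactly what an off-node provider cannot do:

package main

import (
	"io/ioutil"
	"strings"
)

// localVMUUID reads the SMBIOS serial, which looks like
// "VMware-42 30 1a b6 f4 95 18 45-25 ff 51 90 f2 81 43 0a",
// and strips it down to the raw hex of the BIOS UUID.
func localVMUUID() (string, error) {
	b, err := ioutil.ReadFile("/sys/class/dmi/id/product_serial")
	if err != nil {
		return "", err
	}
	s := strings.TrimPrefix(strings.TrimSpace(string(b)), "VMware-")
	s = strings.Replace(s, " ", "", -1)
	s = strings.Replace(s, "-", "", -1)
	return s, nil
}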

/kind feature

Explore load balancer support on VMC

vSphere does not come with a load balancer facility out of the box, but when running in VMC (a.k.a. VMware Cloud on AWS) we have access to the ELB service provided by AWS.

This issue should track an exploratory effort to scope the feasibility and the work needed to bring support for ELB in cloud-provider-vsphere.

/kind feature
/assign @akutz

Create vcpctl tool to facilitate CCM provisioning

FEATURE REQUEST:

/kind feature

Deploying a cloud provider on vSphere is a task that has many prerequisites, from creating a user with correct roles on vCenter, to creating a correct configuration for the service. When migrating users from the in-tree version there's also a need to convert configuration files and to make sure sensitive credentials are stored safely.

The tool should fulfill these needs (a sketch of the health check follows the list):

  • Perform vSphere configuration health check:

    • disk.uuid set on esx
  • Create vSphere role with minimal set of permissions

  • Create vSphere solution user (generate keypair), to be used with CCM
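A sketch of the health check, assuming govmomi (the helper name is illustrative): read the VM's extraConfig and verify disk.EnableUUID is "TRUE".

package main

import (
	"context"
	"fmt"

	"github.com/vmware/govmomi/object"
	"github.com/vmware/govmomi/vim25/mo"
)

// hasDiskUUID reports whether disk.EnableUUID is enabled for the VM,
// the prerequisite referred to in the list above.
func hasDiskUUID(ctx context.Context, vm *object.VirtualMachine) (bool, error) {
	var props mo.VirtualMachine
	if err := vm.Properties(ctx, vm.Reference(), []string{"config.extraConfig"}, &props); err != nil {
		return false, err
	}
	if props.Config == nil {
		return false, nil
	}
	for _, opt := range props.Config.ExtraConfig {
		if ov := opt.GetOptionValue(); ov != nil && ov.Key == "disk.EnableUUID" {
			return fmt.Sprintf("%v", ov.Value) == "TRUE", nil
		}
	}
	return false, nil
}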

csi: Implement Node stage/unstage

Implement Node{Stage,Unstage}.

This operation mounts and unmounts the attached volume at a plugin-specific mountpoint, but does not yet expose the volume to a workload (pod, container).

Add documentation for vcpctl

Write user-facing documentation for vcpctl. At a minimum:

  • Describe its functions
  • Walk the user through a new configuration from start to finish

/kind documentation
/assign @fanzhangio

Investigate using Bazel as build/test tool

Bazel is used by Kubernetes as a build tool; it would be nice to understand the pros/cons of adopting it vs. using make.

The outcome of this issue should be a document or a comment on this issue detailing the pros/cons and potentially reaching a consensus on the path forward.

/kind feature

csi: read same config from file as is used with cloud provider

When the config map is mounted to the container running the CSI plugin, the plugin should be capable of reading this file. This will allow the same config map to be used for the cloud provider and for CSI.

Since in the end this is just a flat k/v file, reading it does not make the plugin K8s-specific.

As much as we can, we want to be able to configure the plugin with env vars, but more advanced configs involving multiple vCenters, for example, would be complicated through that method.
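A sketch of reading such a file, assuming the gcfg/INI format used by the in-tree provider (the struct fields and keys shown are illustrative):

package vsphere

import "gopkg.in/gcfg.v1"

// Config mirrors the flat k/v sections of vsphere.conf.
type Config struct {
	Global struct {
		Server   string `gcfg:"server"`
		User     string `gcfg:"user"`
		Password string `gcfg:"password"`
	}
}

// readConfig parses the shared file; nothing here is K8s-specific.
func readConfig(path string) (*Config, error) {
	var cfg Config
	if err := gcfg.ReadFileInto(&cfg, path); err != nil {
		return nil, err
	}
	return &cfg, nil
}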

Return descriptive vm size when calling InstanceType()

Currently, when calling InstanceType(), we simply return vsphere-vm. This does not help the user identify the type of VM that the k8s node is running on.

As vSphere doesn't have t-shirt-size instances, we should come up with a meaningful expression of the VM size to be reported as a string.

Some ideas (assuming a VM with 4 vCPU and 8GB of RAM; a sketch of the last option follows the list):

  • vsphere.4c8192m
  • vm-4cpu8192mem
  • vm.4c8m
  • vm.4c8gb
  • vm.4cpu-8gb
  • vsphere-vm.4cpu-8gb
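A sketch of deriving the last format above from the VM's vCPU count and memory in MB:

package main

import "fmt"

// instanceType renders, e.g., instanceType(4, 8192) == "vsphere-vm.4cpu-8gb".
func instanceType(numCPU int32, memoryMB int64) string {
	return fmt.Sprintf("vsphere-vm.%dcpu-%dgb", numCPU, memoryMB/1024)
}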

/kind feature
/cc @dvonthenen @dougm @akutz

Implement E2E tests using vcsim

We need end-to-end tests for the cloud provider code to provide functional testing against a real vSphere API endpoint.

Currently the CCM has no E2E tests. We should implement an initial set of E2E tests for it; these tests should be part of a new Makefile target that will automatically run as part of a new CI job. A sketch of driving vcsim in-process follows the task list.

  • Implement E2E tests using vcsim
  • Create Makefile target for E2E tests
  • Create a new job in kubernetes/test-infra for E2E tests and make it run at every PR
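A sketch of booting vcsim in-process from a Go test, using govmomi's simulator package (the assertions a real E2E test would make are elided):

package vsphere

import (
	"context"
	"testing"

	"github.com/vmware/govmomi"
	"github.com/vmware/govmomi/simulator"
)

func TestAgainstVCSim(t *testing.T) {
	model := simulator.VPX() // a vCenter-like inventory
	defer model.Remove()
	if err := model.Create(); err != nil {
		t.Fatal(err)
	}
	s := model.Service.NewServer()
	defer s.Close()

	ctx := context.Background()
	c, err := govmomi.NewClient(ctx, s.URL, true)
	if err != nil {
		t.Fatal(err)
	}
	defer c.Logout(ctx)
	// ...exercise the CCM code against c here...
}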

/kind feature

csi: add CI for plugin using vcsim

Want to be able to run tests on the CSI plugin on a local machine using only vcsim. This requires updates to vcsim as well, to support FCD.
