
Comments (3)

maelk commented on June 28, 2024

Did you see this behavior with CentOS 7?
Can you please add the YAML output of one of the machines? There should be more info than just "Failed".

from metal3-dev-env.

maelk commented on June 28, 2024

This is independent of the OS. Here are the machines from a failed CI run:

apiVersion: v1
items:
- apiVersion: cluster.x-k8s.io/v1alpha3
  kind: Machine
  metadata:
    creationTimestamp: "2020-03-22T08:26:40Z"
    finalizers:
    - machine.cluster.x-k8s.io
    generation: 3
    labels:
      cluster.x-k8s.io/cluster-name: test1
      cluster.x-k8s.io/control-plane: ""
      kubeadm.controlplane.cluster.x-k8s.io/hash: "3288331389"
    name: test1-controlplane-lpnxq
    namespace: metal3
    ownerReferences:
    - apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
      blockOwnerDeletion: true
      controller: true
      kind: KubeadmControlPlane
      name: test1-controlplane
      uid: 834ec5bd-3cbf-4b87-9ccc-f8b5b07140d4
    resourceVersion: "6836"
    selfLink: /apis/cluster.x-k8s.io/v1alpha3/namespaces/metal3/machines/test1-controlplane-lpnxq
    uid: 6be1adb7-c185-4779-be4c-d04b0e396f2e
  spec:
    bootstrap:
      configRef:
        apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
        kind: KubeadmConfig
        name: test1-controlplane-bzpmb
        namespace: metal3
        uid: 5c3bbc07-b8f3-4ccb-98f5-be299edf226e
      dataSecretName: test1-controlplane-bzpmb
    clusterName: test1
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
      kind: Metal3Machine
      name: test1-controlplane-jfnpf
      namespace: metal3
      uid: 1cc69570-8c3d-4fbd-a8d6-c7d25118cb31
    providerID: metal3://d99b9c2b-5aee-45e6-9249-e7b42f7ccc76
    version: v1.17.0
  status:
    addresses:
    - address: 192.168.111.21
      type: InternalIP
    - address: 172.22.0.91
      type: InternalIP
    - address: node-1
      type: Hostname
    - address: node-1
      type: InternalDNS
    bootstrapReady: true
    failureMessage: 'Failure detected from referenced resource infrastructure.cluster.x-k8s.io/v1alpha3,
      Kind=Metal3Machine with name "test1-controlplane-jfnpf": Failed to associate
      the BaremetalHost to the Metal3Machine'
    failureReason: CreateError
    infrastructureReady: true
    lastUpdated: "2020-03-22T08:34:41Z"
    nodeRef:
      name: node-1
      uid: 49aa2fb7-b645-4da3-9600-64e31074bf81
    phase: Failed
- apiVersion: cluster.x-k8s.io/v1alpha3
  kind: Machine
  metadata:
    creationTimestamp: "2020-03-22T08:26:21Z"
    finalizers:
    - machine.cluster.x-k8s.io
    generateName: test1-md-0-78d55dc456-
    generation: 3
    labels:
      cluster.x-k8s.io/cluster-name: test1
      machine-template-hash: "3481187012"
      nodepool: nodepool-0
    name: test1-md-0-78d55dc456-5b86v
    namespace: metal3
    ownerReferences:
    - apiVersion: cluster.x-k8s.io/v1alpha3
      blockOwnerDeletion: true
      controller: true
      kind: MachineSet
      name: test1-md-0-78d55dc456
      uid: 7b2e3dbc-8af2-4f43-8945-6968bc12bd76
    resourceVersion: "9982"
    selfLink: /apis/cluster.x-k8s.io/v1alpha3/namespaces/metal3/machines/test1-md-0-78d55dc456-5b86v
    uid: 3826ad1f-b1ed-4504-b265-2dde934055e9
  spec:
    bootstrap:
      configRef:
        apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
        kind: KubeadmConfig
        name: test1-md-0-5hhsj
        namespace: metal3
        uid: 2d396de3-c3f4-45cc-baff-d915ebcfb20d
      dataSecretName: test1-md-0-5hhsj
    clusterName: test1
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
      kind: Metal3Machine
      name: test1-md-0-hr5jw
      namespace: metal3
      uid: 3477a996-d531-4901-a20e-bd9c93b1a8e6
    providerID: metal3://6fc41e7b-cc2b-4a06-a9d4-883c563d0c47
    version: v1.17.0
  status:
    addresses:
    - address: 172.22.0.87
      type: InternalIP
    - address: 192.168.111.20
      type: InternalIP
    - address: node-0
      type: Hostname
    - address: node-0
      type: InternalDNS
    bootstrapReady: true
    failureMessage: 'Failure detected from referenced resource infrastructure.cluster.x-k8s.io/v1alpha3,
      Kind=Metal3Machine with name "test1-md-0-hr5jw": Failed to associate the BaremetalHost
      to the Metal3Machine'
    failureReason: CreateError
    infrastructureReady: true
    lastUpdated: "2020-03-22T08:45:24Z"
    nodeRef:
      name: node-0
      uid: 8d146172-c593-455e-bf3c-c5c1dcb1bb73
    phase: Failed
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
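
For anyone triaging a similar run, the failure fields in the listing above can be pulled out with a short script. This is only a sketch: it assumes the list was exported as JSON rather than YAML (for example with `kubectl get machines -n metal3 -o json`), and the helper name `failed_machines` and the second sample entry are made up for illustration.

```python
def failed_machines(machine_list):
    """Return (name, failureReason, failureMessage) for each Machine in phase Failed.

    `machine_list` is the parsed "kind: List" object, i.e. a dict with an
    "items" key, as in the output shown above.
    """
    failed = []
    for item in machine_list.get("items", []):
        status = item.get("status", {})
        if status.get("phase") == "Failed":
            failed.append((
                item["metadata"]["name"],
                status.get("failureReason"),
                status.get("failureMessage"),
            ))
    return failed


# Trimmed-down sample. The first entry copies fields from the output above;
# the second entry is hypothetical and only shows that healthy machines are skipped.
SAMPLE = {
    "kind": "List",
    "items": [
        {
            "metadata": {"name": "test1-controlplane-lpnxq"},
            "status": {
                "phase": "Failed",
                "failureReason": "CreateError",
                "failureMessage": (
                    "Failure detected from referenced resource "
                    "infrastructure.cluster.x-k8s.io/v1alpha3, "
                    'Kind=Metal3Machine with name "test1-controlplane-jfnpf": '
                    "Failed to associate the BaremetalHost to the Metal3Machine"
                ),
            },
        },
        {
            "metadata": {"name": "example-running-machine"},  # hypothetical
            "status": {"phase": "Running"},
        },
    ],
}

FAILED = failed_machines(SAMPLE)
```

On real data the dict would come from `json.load(sys.stdin)` with the `kubectl` output piped in; both machines in the failed run above would be reported with `CreateError`.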

The problem is that CAPM3 sets an error on the Metal3Machine whenever it fails to associate the host, even though it only requeues and retries. CAPI picks up that error and marks the Machine as Failed, but it does not pick up the later change back to a normal state. CAPM3 needs to be modified so that it does not set the Metal3Machine to error when the failure is transient.
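
To make the intended behavior concrete, here is a rough sketch of the idea, not the actual CAPM3 code (the real controllers are written in Go). All names here (`TransientError`, `Metal3MachineStatus`, `Result`, `reconcile`) are hypothetical: a transient failure only asks for a requeue and leaves the failure fields empty, while any other failure is recorded on the status.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class TransientError(Exception):
    """Hypothetical marker for a failure that is expected to clear on retry."""


@dataclass
class Metal3MachineStatus:
    # Simplified stand-in for the status fields that CAPI reads.
    ready: bool = False
    failure_reason: Optional[str] = None
    failure_message: Optional[str] = None


@dataclass
class Result:
    requeue: bool = False


def reconcile(status: Metal3MachineStatus, associate: Callable[[], None]) -> Result:
    """Try to associate a host; record a failure only for non-transient errors."""
    try:
        associate()
    except TransientError:
        # Retry later and leave the failure fields untouched, so the owning
        # Machine is never pushed into the Failed phase for this.
        return Result(requeue=True)
    except Exception as err:
        # Anything else is treated as terminal in this sketch.
        status.failure_reason = "CreateError"
        status.failure_message = str(err)
        return Result(requeue=False)
    status.ready = True
    return Result(requeue=False)


def host_not_available_yet() -> None:
    # Hypothetical association attempt that fails transiently.
    raise TransientError("no BareMetalHost available yet")


def association_succeeds() -> None:
    # Hypothetical association attempt that works.
    return None


# First pass fails transiently, second pass succeeds on the same status object.
status = Metal3MachineStatus()
first = reconcile(status, host_not_available_yet)
second = reconcile(status, association_succeeds)
```

After the first pass the status carries no failure reason, so nothing is left behind for CAPI to latch onto once the second pass succeeds.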

This will be tackled in metal3-io/cluster-api-provider-metal3#30
/close


metal3-io-bot commented on June 28, 2024

@maelk: Closing this issue.

