Distinguish between a failed machine creation and a failed node creation #1064

@Kostov6

How to categorize this issue?

/area ops-productivity
/kind bug

What happened:

❯ k -n ... get machine <machine> -o yaml | yq .status
currentStatus:
  lastUpdateTime: "2026-01-16T10:04:42Z"
  phase: Pending
  timeoutActive: true
lastOperation:
  description: Creating machine on cloud provider
  lastUpdateTime: "2026-01-16T10:04:42Z"
  state: Processing
  type: Create

❯ aws ec2 describe-instances ... --query 'Reservations[].Instances[].[Tags[?Key==`Name`] | [0].Value, State.Name]'
[
    [
        "<machine>",
        "running"
    ]
]

❯ k get nodes
(empty)

What you expected to happen:

The example above shows a machine that was created successfully on the cloud provider (the EC2 instance is running), but the kubelet failed to register the node. In this case, a more accurate description would be "Waiting for kubelet to create node object".

Because it reflects the actual state of the machine, this message would make failing shoot clusters easier to debug.
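
The distinction requested here could be sketched roughly as follows. This is only an illustration of the decision, not the machine-controller-manager's actual code: `pendingDescription`, its two boolean inputs, and the constant names are hypothetical, and the second description string is the wording proposed in this issue.

```go
package main

// Description strings. The first is the one shown in the status above; the
// second is the wording proposed in this issue (not an existing message).
const (
	descCreatingMachine = "Creating machine on cloud provider"
	descWaitingForNode  = "Waiting for kubelet to create node object"
)

// pendingDescription is a hypothetical helper, not an existing
// machine-controller-manager function. It picks the lastOperation description
// for a machine in the Pending phase from two observations:
// instanceRunning (the cloud provider reports the instance as running) and
// nodeRegistered (a Node object for the machine exists in the cluster).
// The second return value is false when the machine is no longer waiting on
// either step.
func pendingDescription(instanceRunning, nodeRegistered bool) (string, bool) {
	switch {
	case !instanceRunning:
		// The instance is not up yet, so the existing message is accurate.
		return descCreatingMachine, true
	case !nodeRegistered:
		// The instance is running but the kubelet has not created the Node.
		return descWaitingForNode, true
	default:
		// Both steps are done; nothing is pending.
		return "", false
	}
}
```

For the situation reported above (instance running, no node), `pendingDescription(true, false)` would select the proposed message instead of the current one.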

Environment:

  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • Others:

Labels: area/ops-productivity, kind/bug, lifecycle/stale
