BeforeSuite {Kubernetes e2e suite}

Mar 14, 2026
Confidence Score: 55%

Problem

https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/kubernetes-e2e-gce-scalability/10992/

Failed: BeforeSuite {Kubernetes e2e suite}

[code block]

Previous issues for this test: #26135 #26236 #27920 #28492 #29970 #30075 #32980 #33313

Error Output

Error waiting for all pods to be running and ready: 1 / 418 pods in namespace "kube-system" are NOT in the desired state in 10m0s

Increase Pod Readiness Timeout for Kubernetes E2E Tests

Medium Risk

The error shows that 1 of the 418 pods in the 'kube-system' namespace did not reach the running and ready state within the 10-minute limit (10m0s) that BeforeSuite enforced. Possible causes include resource constraints, slow pod initialization, or problems with the underlying infrastructure. At this scale (418 pods) the default timeout may simply be too short. Because only a single pod missed the deadline, though, it is worth identifying that pod and ruling out a pod-specific failure before raising the timeout.
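
As a rough triage aid, the failure line can be parsed to see how many pods missed the deadline. The sketch below is not part of the e2e framework: the helper names 'parse_not_ready' and 'classify_failure' and the 5% threshold are hypothetical, and the pattern is written only against the message quoted under Error Output.

```shell
# Extract "<not_ready> <total> <timeout>" from a BeforeSuite failure line on stdin.
# The pattern matches only the message format quoted in this report.
parse_not_ready() {
  sed -n 's|.*: \([0-9][0-9]*\) / \([0-9][0-9]*\) pods in namespace "[^"]*" are NOT in the desired state in \([0-9a-z]*\).*|\1 \2 \3|p'
}

# Hypothetical heuristic: below 5% unready, suspect individual pods rather
# than a cluster-wide slowdown. The threshold is an arbitrary assumption.
classify_failure() {
  read -r bad total _timeout || { echo "unrecognized message"; return 1; }
  if [ $((bad * 100)) -lt $((total * 5)) ]; then
    echo "isolated: inspect the $bad unready pod(s) first"
  else
    echo "widespread: check node capacity and the timeout"
  fi
}

msg='Error waiting for all pods to be running and ready: 1 / 418 pods in namespace "kube-system" are NOT in the desired state in 10m0s'
echo "$msg" | parse_not_ready | classify_failure
```

For the message in this report the helper reports an isolated failure, which suggests looking at the one unready pod before changing any timeout.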

  1. Identify Resource Constraints

    Check CPU and memory usage on the cluster nodes to confirm they can carry the 418 system pods. The command below needs a metrics source running in the cluster (metrics-server, or Heapster on older clusters).

    bash
    kubectl top nodes
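  2. Increase Pod Readiness Timeout

    Raise the time BeforeSuite waits for system pods. In the upstream e2e framework this wait is controlled by the '--system-pods-startup-timeout' flag (default 10 minutes), which is passed to the suite through the test arguments. How the arguments are passed depends on how your job launches the suite, so treat the command below as one example that sets 20 minutes.

    bash
    go run hack/e2e.go -- -v --test --test_args="--system-pods-startup-timeout=20m"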
  3. Scale Up Cluster Resources

    If step 1 shows resource constraints, scale up the cluster by adding nodes or using larger machine types. The command below resizes a node pool on a GKE cluster. A cluster brought up directly on GCE VMs, as in the failing gce-scalability job, is resized through its managed instance group instead.

    bash
    gcloud container clusters resize [CLUSTER_NAME] --node-pool [NODE_POOL_NAME] --num-nodes [NEW_NODE_COUNT]
  4. Monitor Pod Status

    After making the changes, monitor the status of the pods in the 'kube-system' namespace to ensure they transition to the 'Running' and 'Ready' state. Use the following command to check pod status.

    bash
    kubectl get pods -n kube-system
  5. Re-run E2E Tests

    Once the pods are confirmed to be running and ready, re-run the Kubernetes e2e tests to verify that the issue has been resolved.

    bash
    make test-e2e
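
The wait that BeforeSuite performs can be approximated outside the suite with a small polling loop, which is handy for checking whether a 20-minute budget would have been enough. This is a minimal sketch: 'wait_until' is a hypothetical helper, and the check command you pass to it is your own.

```shell
# Poll a check command until it succeeds or the time budget is used up.
# Usage: wait_until <timeout_seconds> <interval_seconds> <command> [args...]
# Returns 0 as soon as the command succeeds, 1 once the budget is exhausted.
wait_until() {
  timeout_s=$1
  interval_s=$2
  shift 2
  elapsed=0
  while ! "$@"; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      return 1  # budget exhausted, check still failing
    fi
    sleep "$interval_s"
    elapsed=$((elapsed + interval_s))
  done
  return 0
}

# Example (assumes a check function you define yourself):
#   wait_until 1200 15 my_system_pods_check && echo "ready within 20 minutes"
```

With a reasonably recent kubectl, 'kubectl wait --for=condition=Ready pods --all -n kube-system --timeout=1200s' performs a similar wait in one call. Pods that have already run to completion never report Ready, so they need to be excluded or cleaned up first.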

Validation

Confirm that all pods in the 'kube-system' namespace are in the 'Running' and 'Ready' state before re-running the e2e tests. Check the test results to ensure that the 'BeforeSuite' step completes successfully without errors.
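
The readiness check can be scripted roughly as follows. This sketch assumes the default 'kubectl get pods' column layout (NAME, READY, STATUS, ...). 'not_ready_pods' is a hypothetical helper name, and pods that finished normally (status Completed) are also listed, which may or may not matter for your run.

```shell
# Filter `kubectl get pods --no-headers` output (read from stdin) down to
# pods that are not Running or that have containers which are not ready.
not_ready_pods() {
  awk '{
    split($2, r, "/")                     # READY column, e.g. "1/2"
    if ($3 != "Running" || r[1] != r[2])  # wrong status or unready containers
      print $1, $2, $3
  }'
}

# Usage against a live cluster (assumes kubectl is configured for it):
#   kubectl get pods -n kube-system --no-headers | not_ready_pods
```

An empty result means every listed pod is Running with all of its containers ready.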

Submitted by Alex Chen

Tags

kubernetes, k8s, containers, priority/critical-urgent, kind/flake