ci-kubernetes-e2e-gce-examples: broken test run

Mar 14, 2026
Confidence Score: 55%

Problem

https://storage.googleapis.com/k8s-gubernator/triage/index.html#451144a9be5d4451ad3c
https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-examples/4198/

Multiple broken tests:

Failed: [k8s.io] [Feature:Example] [k8s.io] Hazelcast should create and scale hazelcast {Kubernetes e2e suite}
[code block]
Issues about this test specifically: #27850 #30672 #33271

Failed: [k8s.io] [Feature:Example] [k8s.io] CassandraStatefulSet should create statefulset {Kubernetes e2e suite}
[code block]
Issues about this test specifically: #36323 #36469 #38222

Failed: [k8s.io] [Feature:Example] [k8s.io] Cassandra should create and scale cassandra {Kubernetes e2e suite}
[code block]
Issues about this test specifically: #27978 #28817 #39574

Failed: Test {e2e.go}
[code block]
Issues about this test specifically: #33361 #38663 #39788 #39877 #40371 #40469 #40478 #40483 #40668 #41048 #43025

Previous issues for this suite: #36930 #37437 #38393 #38578 #39379 #40341 #43015

Error Output

error:
    <*errors.errorString | 0xc4211e6310>: {

1 Fix

Fix E2E Test Failures in Kubernetes Example Suite

Medium Risk

The Hazelcast and Cassandra failures are likely caused by environmental inconsistencies and resource shortages in the GCE (Google Compute Engine) test environment. These tests depend on specific configurations and resource limits. When those are not met, pods can be slow to start or scale, which leads to timeouts and unexpected errors during execution.

  1. Increase Resource Limits for Test Pods

    Update the resource requests and limits for the Hazelcast and Cassandra test pods to ensure they have sufficient CPU and memory during the test execution. This can help prevent timeouts and resource starvation.

    yaml
    # Illustrative Pod spec: the name and the request/limit values are examples only.
    apiVersion: v1
    kind: Pod
    metadata:
      name: hazelcast-test
    spec:
      containers:
      - name: hazelcast
        image: hazelcast/hazelcast
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
          limits:
            memory: "1Gi"
            cpu: "1"
    
  2. Review and Update Test Configuration

    Check the configuration files for the Hazelcast and Cassandra tests to ensure they are using the latest settings and parameters. This includes verifying the image versions and any environment variables that may affect test execution.

    bash
    kubectl get configmap hazelcast-config -o yaml
  3. Run Tests with Increased Timeouts

    Modify the test suite to increase the timeout values for the tests that are failing. This can help accommodate any delays in resource provisioning or scaling operations that may occur in the GCE environment.

    bash
    go test -timeout 30m ./...  # Increase timeout to 30 minutes
  4. Check for Known Issues and Patches

    Review the linked issue numbers for known bugs or patches that may address the failures. Apply any relevant patches or updates to the test suite or Kubernetes environment.

    bash
    git cherry-pick <commit-hash>  # Apply specific patch if available
  5. Re-run the E2E Test Suite

    After applying the fixes, re-run the Kubernetes E2E test suite to verify that the changes have resolved the issues. Monitor the logs for any remaining errors or failures.

    bash
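    # Wait for the pod to be Ready first: `kubectl logs -f` fails while the container is still being created.
    kubectl apply -f test-suite.yaml && kubectl wait --for=condition=Ready pod/test-suite-pod --timeout=120s && kubectl logs -f test-suite-pod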
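
Steps 3 and 5 both depend on waiting long enough for slow provisioning or scaling. The following is a minimal sketch, not part of kubectl or the e2e framework, of a polling helper that retries a check command until it succeeds or a deadline passes. The name `retry_until`, its arguments, and the StatefulSet name in the usage comment are hypothetical and chosen only for illustration.

```bash
# retry_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Runs COMMAND repeatedly until it exits 0 or TIMEOUT_SECONDS have elapsed.
# Returns 0 on success and 1 on timeout. (Hypothetical helper for illustration.)
retry_until() {
  timeout_s="$1"
  interval_s="$2"
  shift 2
  start=$(date +%s)
  while :; do
    # Success: the check command exited 0.
    if "$@"; then
      return 0
    fi
    # Give up once the deadline has passed.
    now=$(date +%s)
    if [ $(( now - start )) -ge "$timeout_s" ]; then
      return 1
    fi
    sleep "$interval_s"
  done
}

# Example usage (assumes a StatefulSet named "cassandra" exists in the cluster):
#   retry_until 600 10 kubectl rollout status statefulset/cassandra --timeout=5s
```

Polling with an explicit deadline keeps a slow cluster from failing the check immediately, while still bounding how long a genuinely broken run can hang.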

Validation

Confirm that all tests in the Kubernetes E2E suite pass without errors. Review the logs for any remaining issues and ensure that the resource allocation changes have taken effect by checking the pod status and resource usage.
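
One way to make this check mechanical is to scan a saved copy of the build log for lines that start with `Failed:`, the prefix the failing tests carry in the Problem section. This is a minimal sketch, assuming the log has already been downloaded to a local file. The function name `list_failed_tests` and the file name in the usage comment are hypothetical.

```bash
# list_failed_tests LOG_FILE
# Prints every line of LOG_FILE that starts with "Failed:".
# Returns 1 if any such line exists, 0 if the log is clean,
# and 2 if the file cannot be read. (Hypothetical helper for illustration.)
list_failed_tests() {
  if [ ! -r "$1" ]; then
    echo "cannot read log file: $1" >&2
    return 2
  fi
  # grep exits 0 when at least one line matches, so a match means a failure.
  if grep '^Failed:' "$1"; then
    return 1
  fi
  return 0
}

# Example usage (assumes the build log was saved as build-log.txt):
#   list_failed_tests build-log.txt || echo "suite still has failing tests"
```

A nonzero exit status makes the helper usable as a gate in a script, so a rerun counts as verified only when no `Failed:` lines remain.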

Submitted by Alex Chen (2450 rep)

Tags

kubernetes, k8s, containers, area/example, kind/flake, area/example/cassandra