Troubleshooting

This section will help you troubleshoot common issues that can arise during the installation of Pangea's Private Cloud services.

Pre-Installation Checklist

Before starting the installation, ensure your environment is properly configured. Here's what to check and why:

1. Required Tools

  • AWS CLI: Required for ECR access and image pulls.
  • Helm: Needed for package management and deployment.
  • On macOS with Homebrew, install whichever tool is missing:
brew list awscli >/dev/null 2>&1 || brew install awscli
brew list helm >/dev/null 2>&1 || brew install helm

2. AWS Configuration

  • Verify that AWS credentials are set up:
aws configure list

3. Kubernetes Access

  • Check that the cluster is responsive:
kubectl cluster-info

4. pangeacluster.yml Check

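  • Ensure pangeacluster.yml is not in your current directory, because it can interfere with the installation.
  • Suggested action: remove only that file (a wildcard such as *.yml would also delete unrelated YAML files):
rm pangeacluster.yml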
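
The four checks above can be combined into a single pre-flight function. This is a minimal sketch, not part of the Pangea installer: the name preflight is hypothetical, kubectl is added to the tool list because the later steps depend on it, and aws sts get-caller-identity is swapped in as a stricter credentials check than aws configure list (it requires network access to AWS).

```shell
# Hypothetical helper: prints one line per failed check and
# returns non-zero if any check failed.
preflight() (
  failures=0
  fail() { echo "FAIL: $1"; failures=$((failures + 1)); }

  # 1. Required tools (kubectl is an addition; later steps rely on it).
  for tool in aws helm kubectl; do
    command -v "$tool" >/dev/null 2>&1 || fail "$tool is not installed"
  done

  # 2. AWS credentials (assumption: a live identity call is acceptable here).
  aws sts get-caller-identity >/dev/null 2>&1 || fail "AWS credentials are not usable"

  # 3. Kubernetes access.
  kubectl cluster-info >/dev/null 2>&1 || fail "cluster is not reachable"

  # 4. Stray pangeacluster.yml in the current directory.
  [ ! -e pangeacluster.yml ] || fail "pangeacluster.yml is present in $(pwd)"

  [ "$failures" -eq 0 ]
)
```

Run preflight from the directory where you plan to start the installer, and proceed only if it prints nothing and returns zero.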

Common Installation Issues

Namespace Management

Problem

  • Installation fails due to namespace conflicts.

Symptoms

  1. Error message about namespace already existing:
    ERROR: INSTALLATION FAILED: Unable to continue with install: ClusterRole "manager-role" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-namespace" must equal "pangea-private-beta1": current value is "pangea-private-beta1"
  2. Namespace mismatch errors:
    ERROR: the namespace from the provided object "pangea-private-beta2" does not match the namespace "pangea-private-beta1". You must pass "--namespace=pangea-private-beta1" to perform this operation.

Resolution Steps

  1. Check existing namespaces:
    kubectl get namespaces
  2. Clean up existing installation:
    ./pangea-private-cloud.sh -n <namespace> -u
    Warning: Use the uninstall command cautiously, as it will remove the entire namespace.
  3. Verify namespace cleanup:
    kubectl get all -n <namespace>
  4. Start fresh installation:
    ./pangea-private-cloud.sh -n <namespace>
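
Before uninstalling, it can help to confirm which namespace actually owns the conflicting resource. Helm records the owning release's namespace in the meta.helm.sh/release-namespace annotation named in the first error above. The sketch below reads that annotation for the manager-role ClusterRole; the function names are hypothetical, and it assumes kubectl is pointed at the affected cluster.

```shell
# Print the namespace of the Helm release that owns a resource.
# Usage: owning_namespace <kind> <name>
owning_namespace() {
  kubectl get "$1" "$2" \
    -o jsonpath='{.metadata.annotations.meta\.helm\.sh/release-namespace}'
}

# Compare the owner of ClusterRole "manager-role" with the namespace
# you intend to install into.
# Usage: check_manager_role <target-namespace>
check_manager_role() (
  wanted="$1"
  actual="$(owning_namespace clusterrole manager-role 2>/dev/null)"
  if [ -z "$actual" ]; then
    echo "manager-role does not exist or is not managed by Helm"
  elif [ "$actual" = "$wanted" ]; then
    echo "manager-role is already owned by a release in $wanted"
  else
    echo "manager-role is owned by a release in $actual, not $wanted"
  fi
)
```

If the owner differs from your target namespace, that other namespace is the one to pass to the uninstall command in step 2.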

Best Practices

  • Use one deployment, in its own namespace, per team.
  • Use descriptive names (e.g., pangea-team-dev).
  • Document namespace assignments.

Service Fails to Install

Problem

  • Pangea cluster pods (authn/embargo/gateway) are not in a running state.

Symptoms

  • No "pangea-cluster" pods running:

    kubectl get pods -n <namespace>
  • Expected output: [screenshot: expected output, postgres]

Resolution Steps

  1. Monitor pod status:
    kubectl get pods -n <namespace>
  2. Identify failing jobs:
    kubectl get job -n <namespace>
  3. Delete the failed job:
    kubectl delete job pangea-cluster-authn-<hex string>-<hex string> -n <namespace>
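
Step 2 can be narrowed to only the jobs that have recorded failures, which avoids reading the full job list by eye. This is a sketch under the assumption that a failed job is one whose .status.failed counter is greater than zero; failed_jobs is a hypothetical helper name.

```shell
# List the names of jobs in a namespace with at least one failed pod.
# Usage: failed_jobs <namespace>
failed_jobs() {
  # custom-columns prints "<none>" when .status.failed is unset,
  # so only rows with a positive integer in column 2 are kept.
  kubectl get jobs -n "$1" --no-headers \
    -o custom-columns=NAME:.metadata.name,FAILED:.status.failed |
    awk '$2 ~ /^[0-9]+$/ && $2 > 0 { print $1 }'
}
```

Each name it prints can then be passed to the kubectl delete job command from step 3.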

File Intel Initial Database Sync

Problem

  • File-intel service takes significant time to sync (10-20 hours).

Resolution Steps

  1. Check maintenance jobs:
    kubectl get cronjobs -n <namespace>
    Expected output: [screenshot: expected output, file-intel]
  2. Verify job status:
    kubectl get pods -n <namespace>
    Expected output: [screenshot: expected output, file-intel pods]
  3. If jobs are missing, uninstall and then reinstall:
    ./pangea-private-cloud.sh -n <namespace> -u
    ./pangea-private-cloud.sh -n <namespace>

Database Initialization Delays

Problem

  • Postgres operator initialization takes up to 10 minutes.

Symptoms

  • Script appears stuck at "Waiting for pod postgres-0 to run..."
  • Database connection failures.
  • Services failing to start.

Resolution Steps

  1. Monitor Postgres pod status:
    kubectl get pods -n <namespace> | grep postgres
  2. Check pod events for issues:
    kubectl describe pod postgres-0 -n <namespace>
  3. View Postgres logs:
    kubectl logs postgres-0 -n <namespace>

Note: Postgres initialization may require patience. Avoid interrupting the script during this phase.
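
To watch the database come up from a second terminal without touching the install script, the checks above can be wrapped in a blocking wait. This is a minimal sketch: wait_for_postgres is a hypothetical name, the 600-second default mirrors the 10-minute figure above, and the 10-second poll interval is an assumption.

```shell
# Block until pod postgres-0 is Ready.
# Usage: wait_for_postgres <namespace> [timeout_seconds] [poll_seconds]
wait_for_postgres() (
  ns="$1"; timeout="${2:-600}"; poll="${3:-10}"
  waited=0

  # "kubectl wait" fails immediately if the pod does not exist yet,
  # so first poll until the operator has created it.
  until kubectl get pod postgres-0 -n "$ns" >/dev/null 2>&1; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "postgres-0 was not created within ${timeout}s"
      exit 1
    fi
    sleep "$poll"
    waited=$((waited + poll))
  done

  # Then wait for readiness. The timeout applies again here, so the
  # total wait can approach twice the timeout value.
  kubectl wait --for=condition=Ready pod/postgres-0 -n "$ns" --timeout="${timeout}s"
)
```

A non-zero return is the cue to move on to the describe and logs commands in steps 2 and 3.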

Resource Investigation

When services aren't working as expected, investigate their state and logs.

Pod Details Investigation

  • Reveals resource constraints, configuration problems, and node assignment issues.
kubectl describe pod <pod-name> -n <namespace>

Service Log Analysis

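  • Shows application-level errors, connection issues, and initialization problems. The --previous flag returns logs from the last terminated container, which is what you need after a crash or restart; omit it to read the logs of the currently running container.
kubectl logs <pod-name> -n <namespace> --previous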

Event Timeline Review

  • Shows chronological order of issues and reveals cascading failures.
kubectl get events -n <namespace> --sort-by='.metadata.creationTimestamp'
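
When several pods are misbehaving, or when you need to share findings with support, it can be convenient to capture all three views to files in one pass. The following is a sketch only: collect_diagnostics and the output file names are hypothetical, and the default directory name diagnostics is an assumption.

```shell
# Save describe output, current and previous logs, and the namespace
# event timeline for one pod.
# Usage: collect_diagnostics <namespace> <pod-name> [output-dir]
collect_diagnostics() (
  ns="$1"; pod="$2"; out="${3:-diagnostics}"
  mkdir -p "$out" || exit 1

  kubectl describe pod "$pod" -n "$ns" > "$out/$pod.describe.txt" 2>&1
  kubectl logs "$pod" -n "$ns" > "$out/$pod.log" 2>&1
  # --previous fails when no container has restarted; keep going.
  kubectl logs "$pod" -n "$ns" --previous > "$out/$pod.previous.log" 2>&1 || true
  kubectl get events -n "$ns" --sort-by='.metadata.creationTimestamp' \
    > "$out/events.txt" 2>&1

  # Print the directory so callers can locate the files.
  echo "$out"
)
```

Error text from any failing command is written into the corresponding file, so an empty or short file is itself a hint about what went wrong.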

Configuration Verification

List Secrets

kubectl get secrets -n <namespace>

List Custom Resource Definitions

  • CRDs are cluster-scoped, so no namespace flag is needed:
kubectl get crd | grep pangea

Clean Uninstallation Process

1. Uninstall Services

  • Triggers graceful shutdown of services:
    ./pangea-private-cloud.sh -n <namespace> -u

2. Verify Resource Cleanup

  • Check that all resources and secrets are removed:
    kubectl get all -n <namespace>
    kubectl get secrets -n <namespace>

3. Force Cleanup (if needed)

  • Forcefully remove namespace:
    kubectl delete namespace <namespace> --force
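
Namespace deletion is asynchronous, so it is worth giving the normal uninstall time to finish before reaching for the force option. The sketch below polls until the namespace no longer exists; namespace_gone is a hypothetical name, and the 120-second timeout and 5-second poll interval are assumptions rather than documented values.

```shell
# Wait for a namespace to disappear.
# Usage: namespace_gone <namespace> [timeout_seconds] [poll_seconds]
namespace_gone() (
  ns="$1"; timeout="${2:-120}"; poll="${3:-5}"
  waited=0

  # "kubectl get namespace" exits non-zero once the namespace is gone.
  # It also fails if the cluster is unreachable, so confirm
  # connectivity with "kubectl cluster-info" first.
  while kubectl get namespace "$ns" >/dev/null 2>&1; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "namespace $ns still exists after ${waited}s"
      exit 1
    fi
    sleep "$poll"
    waited=$((waited + poll))
  done
  echo "namespace $ns is gone"
)
```

Only if this times out does the forced removal above become the appropriate next step.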
