Preventing pod preemption in Kubernetes environments for Dagster jobs

Last updated: October 10, 2025

When running Dagster jobs in Kubernetes environments such as GKE Autopilot, pod preemption can cause job failures. You can avoid this by configuring tolerations and a nodeSelector in your Helm chart so that run pods are scheduled onto nodes that are not subject to preemption.

Configuration placement

Place these settings under podSpecConfig (not jobSpecConfig) in your Helm chart values: tolerations and nodeSelector are fields of the pod spec (spec.template.spec), not of the job spec (job.spec).
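For contrast, fields that live on job.spec itself (for example, backoffLimit or ttlSecondsAfterFinished) would go under jobSpecConfig. A sketch, assuming the chart exposes jobSpecConfig alongside podSpecConfig as described above; the values shown are illustrative:

workspace:
  runK8sConfig:
    jobSpecConfig:
      # job.spec fields: retry and cleanup behavior, not scheduling
      backoffLimit: 0
      ttlSecondsAfterFinished: 3600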

Example configuration

Add the following to your Helm chart values:

workspace:
  runK8sConfig:
    podSpecConfig:
      tolerations:
        # Must match a taint applied to the target nodes
        - key: "example-key"
          operator: "Equal"
          value: "example-value"
          effect: "NoSchedule"
      nodeSelector:
        # Must match a label present on the target nodes
        workload-type: "batch"
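The toleration and nodeSelector above only take effect if the target nodes carry a matching taint and label. For illustration, the node side looks like the following (the node name is a placeholder; in practice you would set the taint and label with kubectl taint / kubectl label or through your node pool configuration rather than applying a Node manifest directly):

apiVersion: v1
kind: Node
metadata:
  name: batch-node             # placeholder
  labels:
    workload-type: "batch"     # matched by the nodeSelector above
spec:
  taints:
    - key: "example-key"       # matched by the toleration above
      value: "example-value"
      effect: "NoSchedule"

With the taint in place, only pods that tolerate example-key can schedule onto these nodes, and the nodeSelector ensures Dagster run pods land there rather than on preemptible capacity.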

Benefits

This approach provides several advantages:

  • Cross-platform compatibility: Works across different cluster autoscalers and cloud providers, not just specific solutions like Karpenter

  • Granular control: Lets you tune pod scheduling and eviction behavior precisely

  • GKE Autopilot support: Particularly effective in GKE Autopilot environments where pod preemption is common

  • Standard Kubernetes: Uses standard Kubernetes tolerations and nodeSelector configurations

Use cases

This configuration is especially useful when:

  • Running long-running Dagster jobs that shouldn't be interrupted

  • Working in environments with aggressive cluster autoscaling

  • Needing to ensure job completion without preemption-related failures

  • Managing workloads in shared Kubernetes clusters