Preventing pod preemption in Kubernetes environments for Dagster jobs
Last updated: October 10, 2025
When running Dagster jobs in Kubernetes environments such as GKE Autopilot, pod preemption can cause job failures. You can reduce the risk of preemption by configuring tolerations and a nodeSelector in your Helm chart, steering run pods onto nodes that are not subject to eviction.
Configuration placement
These settings belong under podSpecConfig (not jobSpecConfig) in your Helm chart, because tolerations and nodeSelector are fields of the pod template spec (spec.template.spec in the rendered Job manifest) rather than of the Job's own spec.
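To see why, it helps to look at where these fields live in a rendered Job manifest. The sketch below uses field names from the core Kubernetes Job API with illustrative values (the job name, image, and backoffLimit are assumptions, not Dagster defaults); tolerations and nodeSelector sit under spec.template.spec, the pod template, while job-level fields such as backoffLimit sit directly on the Job's spec:

apiVersion: batch/v1
kind: Job
metadata:
  name: dagster-run-example     # hypothetical run name
spec:
  backoffLimit: 0               # job-level field: jobSpecConfig territory
  template:
    spec:                       # pod-level fields: podSpecConfig territory
      tolerations:
        - key: "example-key"
          operator: "Equal"
          value: "example-value"
          effect: "NoSchedule"
      nodeSelector:
        workload-type: "batch"
      containers:
        - name: dagster
          image: example/dagster-job:latest   # placeholder image

Anything placed under podSpecConfig is merged into the spec.template.spec section above, which is why scheduling-related fields must go there.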
Example configuration
Add the following to your Helm chart values:
workspace:
  runK8sConfig:
    podSpecConfig:
      tolerations:
        - key: "example-key"
          operator: "Equal"
          value: "example-value"
          effect: "NoSchedule"
      nodeSelector:
        workload-type: "batch"
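For the toleration and nodeSelector above to take effect, a node must carry the matching taint and label. As an illustration only (the node name is hypothetical, and on GKE Autopilot you do not manage nodes directly; Autopilot provisions matching tainted nodes for you via workload separation), the corresponding node configuration would look like:

apiVersion: v1
kind: Node
metadata:
  name: batch-node-1             # hypothetical node name
  labels:
    workload-type: "batch"       # matched by the nodeSelector above
spec:
  taints:
    - key: "example-key"         # matched by the toleration above
      value: "example-value"
      effect: "NoSchedule"

The NoSchedule taint keeps other workloads off these nodes, while the toleration lets Dagster run pods land on them.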
Benefits
This approach provides several advantages:
Cross-platform compatibility: Works across different cluster autoscalers and cloud providers, not just specific solutions like Karpenter
Granular control: Offers fine-grained control over pod scheduling and eviction behavior
GKE Autopilot support: Particularly effective in GKE Autopilot environments where pod preemption is common
Standard Kubernetes: Uses standard Kubernetes tolerations and nodeSelector configurations
Use cases
This configuration is especially useful when:
Running long-running Dagster jobs that shouldn't be interrupted
Working in environments with aggressive cluster autoscaling
Needing to ensure job completion without preemption-related failures
Managing workloads in shared Kubernetes clusters
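For long-running jobs in aggressively autoscaled clusters, a complementary option is the standard cluster-autoscaler annotation that marks a pod as not safe to evict during scale-down. This sketch assumes the chart's runK8sConfig also accepts a podTemplateSpecMetadata block for pod annotations (verify against your Dagster chart version), and whether the annotation is honored depends on your autoscaler and platform:

workspace:
  runK8sConfig:
    podTemplateSpecMetadata:
      annotations:
        # Ask the cluster autoscaler not to evict this pod when scaling down
        cluster-autoscaler.kubernetes.io/safe-to-evict: "false"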