Add annotation to skip rescheduling simulation for pods that don't need to survive node deletion

/area cluster-autoscaler

When scaling down nodes with unmanaged pods (no Deployment, ReplicaSet, or other controller), the autoscaler still runs a rescheduling simulation to find destination nodes for these pods. This causes several problems:

  1. Unmanaged pods annotated with safe-to-evict: true are evictable, but the simulation still runs for them.
  2. The simulation has a cascade effect: pods virtually scheduled onto the remaining nodes take up room there, so later candidate nodes fail simulation.
  3. Scale-down becomes slow or blocked, even though these pods don't need to be rescheduled; they can simply be terminated.
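
  To illustrate the cascade effect in point 2, here is a minimal, self-contained sketch. It is not the autoscaler's actual simulation: the node type, the single CPU dimension, and the first-fit placement are all simplifying assumptions made for this example.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a scale-down candidate.
// Only the CPU requests (in millicores) of its pods are modelled.
type node struct {
	name   string
	podCPU []int
}

// removableNodes walks the candidates in order. Unless skipReschedule is set,
// every pod of a candidate must fit (first fit) on the remaining nodes, and
// placements made for earlier candidates keep occupying that capacity.
func removableNodes(candidates []node, remainingFree []int, skipReschedule bool) []string {
	free := append([]int(nil), remainingFree...) // copy; leave the caller's slice untouched
	var removable []string
	for _, c := range candidates {
		if skipReschedule {
			// Pods are simply terminated, so no destination is needed.
			removable = append(removable, c.name)
			continue
		}
		trial := append([]int(nil), free...)
		ok := true
		for _, req := range c.podCPU {
			placed := false
			for i := range trial {
				if trial[i] >= req {
					trial[i] -= req
					placed = true
					break
				}
			}
			if !placed {
				ok = false
				break
			}
		}
		if ok {
			free = trial // virtual placements persist for later candidates
			removable = append(removable, c.name)
		}
	}
	return removable
}

func main() {
	candidates := []node{
		{name: "node-a", podCPU: []int{500}},
		{name: "node-b", podCPU: []int{500}},
		{name: "node-c", podCPU: []int{500}},
	}
	remaining := []int{1000} // one remaining node with 1000m free

	fmt.Println(removableNodes(candidates, remaining, false)) // [node-a node-b]
	fmt.Println(removableNodes(candidates, remaining, true))  // [node-a node-b node-c]
}
```

  In this toy model node-c is blocked only because the pods from node-a and node-b were virtually placed first, even though none of those pods would actually be rescheduled.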

  Current behavior:
  - safe-to-evict: true → Pod can be evicted, but still needs a destination in the simulation
  - DaemonSet pods → Excluded from PodsToReschedule, no simulation needed
  - Mirror pods → SkipDrain status, no simulation needed
  
  Proposed solution:
  Add a new annotation:
  cluster-autoscaler.kubernetes.io/skip-reschedule: "true"

  Pods with this annotation would:
  - Be evictable (same as safe-to-evict: true)
  - Receive SkipDrain status (like mirror pods)
  - Not be added to PodsToReschedule
  - Not require destination simulation

  Use case:
  Unmanaged pods that are intentionally ephemeral and don't need to survive node deletion. Such a pod is simply terminated when the node is removed, and that's acceptable. There's no controller to recreate it anyway.
                                                                                                                                        
  Implementation:                                                                                                                       
                                                                                                                                        
  In simulator/drainability/rules/safetoevict/rule.go, add a check:
  // Pods that opt out of rescheduling are skipped during drain, like mirror pods.
  if pod.Annotations["cluster-autoscaler.kubernetes.io/skip-reschedule"] == "true" {
      return drainability.NewSkipStatus()
  }
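
  For anyone who wants to try the logic outside the autoscaler tree, here is a standalone sketch of the same check. The pod and outcome types are hypothetical stand-ins for this example, not the real apiv1.Pod or drainability.Status types, and the annotation key is the one proposed above (it does not exist yet).

```go
package main

import "fmt"

// skipRescheduleKey is the annotation proposed in this issue; it is not
// recognised by any released cluster-autoscaler version.
const skipRescheduleKey = "cluster-autoscaler.kubernetes.io/skip-reschedule"

// outcome is a hypothetical, reduced stand-in for a drainability status.
type outcome int

const (
	undefined outcome = iota // this rule has no opinion; later rules decide
	skipDrain                // pod is ignored during drain, no destination needed
)

// pod is a hypothetical stand-in carrying only what the check needs.
type pod struct {
	Name        string
	Annotations map[string]string
}

// checkSkipReschedule returns skipDrain only when the annotation is exactly "true".
// Reading from a nil Annotations map is safe in Go and yields "".
func checkSkipReschedule(p pod) outcome {
	if p.Annotations[skipRescheduleKey] == "true" {
		return skipDrain
	}
	return undefined
}

func main() {
	ephemeral := pod{
		Name:        "scratch-job",
		Annotations: map[string]string{skipRescheduleKey: "true"},
	}
	plain := pod{Name: "no-annotations"}

	fmt.Println(checkSkipReschedule(ephemeral) == skipDrain) // true
	fmt.Println(checkSkipReschedule(plain) == undefined)     // true
}
```

  The sketch treats any value other than the exact string "true" as unset; whether the real rule should also accept other truthy spellings is an open question.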

Add annotation to skip rescheduling simulation for pods that don't need to survive node deletion #9110

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Add annotation to skip rescheduling simulation for pods that don't need to survive node deletion #9110

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions