Question
Answer and Explanation
To stop an Argo workflow when an error occurs in a step, you can leverage Argo's built-in features for error handling and workflow control. Here's how you can achieve this:
1. Using `failFast` in a DAG Template:
- `failFast` is not a workflow-level (`spec`) field. It is set on a DAG template (`dag.failFast`) and defaults to `true`: as soon as one task fails, Argo stops scheduling new tasks, waits for tasks that are already running to finish, and then marks the DAG (and the workflow) as failed. A `steps` template behaves similarly without any extra setting: when a step fails, later step groups are not run unless the failed step sets `continueOn`.
- Example in YAML:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: error-handling-
spec:
  entrypoint: main
  templates:
    - name: main
      dag:
        # true is the default; shown explicitly for clarity
        failFast: true
        tasks:
          - name: step1
            template: task1
          - name: step2
            template: task2
            dependencies: [step1]
    - name: task1
      container:
        image: alpine:latest
        command: ["sh", "-c"]
        args: ["exit 1"]
    - name: task2
      container:
        image: alpine:latest
        command: ["sh", "-c"]
        args: ["echo 'This step will not run'"]
```
- In this example, `task1` exits with a non-zero code, so `step1` fails, `step2` is never scheduled, and the workflow ends in the `Failed` state.
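The effect of `failFast` is easiest to see in a DAG template with two independent branches. The sketch below is illustrative only: the task and template names (`fail-now`, `slow-branch`, `after-slow`, `fail`, `sleep-then-echo`) are made up for this example. With `dag.failFast` left at its default of `true`, Argo should stop scheduling `after-slow` once `fail-now` fails. With `failFast: false`, as shown, the other branch is expected to run to completion before the workflow is marked as failed.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: failfast-demo-
spec:
  entrypoint: main
  templates:
    - name: main
      dag:
        # false: let every branch finish even after a task has failed.
        # Remove this line (or set it to true) to stop scheduling
        # new tasks as soon as the first task fails.
        failFast: false
        tasks:
          - name: fail-now
            template: fail
          - name: slow-branch
            template: sleep-then-echo
          - name: after-slow
            template: sleep-then-echo
            dependencies: [slow-branch]
    - name: fail
      container:
        image: alpine:latest
        command: ["sh", "-c"]
        args: ["exit 1"]
    - name: sleep-then-echo
      container:
        image: alpine:latest
        command: ["sh", "-c"]
        args: ["sleep 10 && echo 'branch still running'"]
```

Submitting the workflow once with each setting and comparing which nodes were created is a quick way to confirm the behavior on your Argo version.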
2. Using `onExit` Handlers:
- You can define an `onExit` template that will be executed when a workflow or a step completes, regardless of its success or failure. This can be used to perform cleanup or logging, but it doesn't directly stop the workflow on error.
- Example in YAML:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: error-handling-
spec:
  entrypoint: main
  onExit: cleanup
  templates:
    - name: main
      steps:
        - - name: step1
            template: task1
        - - name: step2
            template: task2
    - name: task1
      container:
        image: alpine:latest
        command: ["sh", "-c"]
        args: ["exit 1"]
    - name: task2
      container:
        image: alpine:latest
        command: ["sh", "-c"]
        args: ["echo 'This step will not run'"]
    - name: cleanup
      container:
        image: alpine:latest
        command: ["sh", "-c"]
        args: ["echo 'Cleanup task executed'"]
```
- In this case, `cleanup` will run after the workflow completes, regardless of whether `task1` or `task2` failed. However, it doesn't stop the workflow mid-execution.
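An exit handler can also react to the outcome of the workflow instead of running the same action every time. The following is a minimal sketch, not a definitive setup: the template names `exit-handler` and `report` are hypothetical, and the `echo` stands in for whatever notification you actually use. It relies on the `{{workflow.status}}` global variable, which Argo makes available to exit handlers.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: exit-handler-
spec:
  entrypoint: main
  onExit: exit-handler
  templates:
    - name: main
      container:
        image: alpine:latest
        command: ["sh", "-c"]
        args: ["exit 1"]
    - name: exit-handler
      steps:
        - - name: report-failure
            template: report
            # Run the report only when the workflow did not succeed
            # (status is Failed or Error).
            when: "{{workflow.status}} != Succeeded"
    - name: report
      container:
        image: alpine:latest
        command: ["sh", "-c"]
        args: ["echo 'Workflow {{workflow.name}} ended with status {{workflow.status}}'"]
```

Because the handler runs only after the workflow has finished, this pattern is for reporting and cleanup, not for halting steps that are still running.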
3. Using `when` Conditions:
- You can use `when` conditions to conditionally execute steps based on the status of previous steps. However, this doesn't stop the workflow; it just skips steps based on conditions.
- Example in YAML:
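```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: error-handling-
spec:
  entrypoint: main
  templates:
    - name: main
      steps:
        - - name: step1
            template: task1
            # Let the workflow continue past a failed step1
            # so that the `when` condition below is evaluated.
            continueOn:
              failed: true
        - - name: step2
            template: task2
            when: "{{steps.step1.status}} == Succeeded"
    - name: task1
      container:
        image: alpine:latest
        command: ["sh", "-c"]
        args: ["exit 1"]
    - name: task2
      container:
        image: alpine:latest
        command: ["sh", "-c"]
        args: ["echo 'This step will not run'"]
```
- In this example, `step2` runs only if `step1` succeeds. Because `step1` sets `continueOn.failed`, a failure in `step1` does not end the workflow: `step2` is skipped and the workflow still runs to completion. Without `continueOn`, the workflow would fail right after `step1` and the `when` condition would never be evaluated.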
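In a DAG template, a similar effect can be sketched with the `depends` field, which lets a task wait for a specific result of an earlier task. Treat this as a starting point under stated assumptions: the task and template names `on-success`, `on-failure`, `say-ok`, and `say-failed` are hypothetical, and `depends` needs an Argo Workflows version that supports enhanced dependency logic. How the final workflow status is reported when a failure is handled this way is worth verifying on your own cluster.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: depends-demo-
spec:
  entrypoint: main
  templates:
    - name: main
      dag:
        tasks:
          - name: step1
            template: task1
          - name: on-success
            template: say-ok
            # Scheduled only if step1 succeeded
            depends: "step1.Succeeded"
          - name: on-failure
            template: say-failed
            # Scheduled only if step1 failed
            depends: "step1.Failed"
    - name: task1
      container:
        image: alpine:latest
        command: ["sh", "-c"]
        args: ["exit 1"]
    - name: say-ok
      container:
        image: alpine:latest
        command: ["sh", "-c"]
        args: ["echo 'step1 succeeded'"]
    - name: say-failed
      container:
        image: alpine:latest
        command: ["sh", "-c"]
        args: ["echo 'step1 failed, handling the error'"]
```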
Recommendation:
- In most cases you do not need extra configuration to stop on the first error: a `steps` template stops at the first failed step, and a DAG template stops scheduling new tasks after a failure because `dag.failFast` defaults to `true`. Avoid `continueOn` and `failFast: false` on steps that later steps depend on, so that resources are not wasted after a failure.
- If you need more complex error handling, use `onExit` for cleanup and logging, keeping in mind that it does not stop the workflow mid-execution. `when` conditions are useful for conditional execution but not for immediate termination on error.