🔎 Monitors

Monitors run custom validations against releases and can roll a release back when its validation fails.

Monitors flow

flowchart LR
    helmwave_up[helmwave up]
    exit0[exit 0]
    exit1[exit 1]

    helmwave_up --> release1[upgrade release 1]
    helmwave_up --> release2[upgrade release 2]
    helmwave_up --> release3[upgrade release 3]

    release1 -- succeeded --> monitor1_start
    release2 -- succeeded --> monitor1_start
    release2 -- succeeded --> monitor2_start
    release3 -- succeeded --> monitor2_start

    monitor1_failed -.rollback release.->release_rollback1[rollback release 1]
    monitor1_failed -.rollback release.->release_rollback2[rollback release 2]
    monitor2_failed -.rollback release.->release_rollback2[rollback release 2]
    monitor2_failed -.rollback release.->release_rollback3[rollback release 3]

    release_rollback1 -.-> exit1
    release_rollback2 -.-> exit1
    release_rollback3 -.-> exit1

    monitor1_succeeded -.-> exit0
    monitor2_succeeded -.-> exit0

    subgraph monitor1[Monitor 1]
        monitor1_start[Monitor start]
        monitor1_iteration[Monitor iteration]
        monitor1_failed[Monitor failed]
        monitor1_succeeded[Monitor succeeded]

        monitor1_start --> monitor1_iteration
        monitor1_iteration --next iteration--> monitor1_iteration
        monitor1_iteration --failure threshold or total timeout-->monitor1_failed
        monitor1_iteration --success threshold-->monitor1_succeeded
    end

    subgraph monitor2[Monitor 2]
        monitor2_start[Monitor start]
        monitor2_iteration[Monitor iteration]
        monitor2_failed[Monitor failed]
        monitor2_succeeded[Monitor succeeded]

        monitor2_start --> monitor2_iteration
        monitor2_iteration --next iteration--> monitor2_iteration
        monitor2_iteration --failure threshold or total timeout-->monitor2_failed
        monitor2_iteration --success threshold-->monitor2_succeeded
    end

  • Each monitor starts once all of its dependent releases have succeeded.
  • Each monitor runs an iteration every interval, and each iteration is limited by iteration_timeout.
  • Consecutive successful iterations count towards success_threshold.
  • Consecutive failed iterations count towards failure_threshold.
  • After all monitors have exited, each dependent release performs the action chosen for its failed monitors.
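
The iteration rules above can be sketched as a small simulation. This is not helmwave's implementation: the function name `run_monitor`, its return values, and the simulated clock are hypothetical, and resetting a streak when the opposite outcome occurs is an assumption inferred from the word "consecutive". `iteration_timeout` is left out because each simulated iteration is just a precomputed boolean.

```python
from itertools import cycle
from typing import Iterable, Tuple


def run_monitor(
    outcomes: Iterable[bool],
    success_threshold: int,
    failure_threshold: int,
    interval: float,
    total_timeout: float,
) -> Tuple[bool, str]:
    """Simulate one monitor; each item in `outcomes` is one iteration's result."""
    successes = failures = 0
    elapsed = 0.0  # simulated seconds since the monitor started
    for ok in outcomes:
        if elapsed >= total_timeout:
            break
        if ok:
            # Assumption: a success resets the failure streak.
            successes, failures = successes + 1, 0
        else:
            # Assumption: a failure resets the success streak.
            successes, failures = 0, failures + 1
        if successes >= success_threshold:
            return True, "success threshold reached"
        if failures >= failure_threshold:
            return False, "failure threshold reached"
        elapsed += interval  # wait for the next iteration
    if elapsed >= total_timeout:
        return False, "total timeout exceeded"
    raise ValueError("ran out of simulated outcomes before any exit condition")


# Hypothetical inputs: interval 10s, total_timeout 60s, both thresholds 5.
steady_failure = run_monitor([False] * 5, 5, 5, interval=10, total_timeout=60)
flapping = run_monitor(cycle([True, False]), 5, 5, interval=10, total_timeout=60)
```

Under these assumptions, `steady_failure` ends with the failure threshold, while `flapping` never builds a streak of 5 in either direction and ends with the total timeout. Both outcomes lead to "Monitor failed" in the flow diagram.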

Demo

helmwave.yml
registries:
  - host: registry-1.docker.io

monitors:
  - name: nats-up-metric
    type: prometheus
    total_timeout: 1m # fail if the monitor keeps flapping between success and failure for this long
    iteration_timeout: 1s
    interval: 2s
    success_threshold: 5
    failure_threshold: 5
    prometheus:
      url: http://localhost:9090
      expr: |
        up == 1
  - name: nats-delivered-metric
    type: prometheus
    total_timeout: 1m # fail if the monitor keeps flapping between success and failure for this long
    iteration_timeout: 5s
    interval: 10s
    success_threshold: 5
    failure_threshold: 5
    prometheus:
      url: http://localhost:9090
      expr: |
        sum(rate(nats_consumer_delivered_consumer_seq[15s])) > 0

.options: &options
  namespace: nats
  create_namespace: true
  wait: true
  timeout: 1m
  max_history: 3 # best practice
  chart:
    # For example, we will use bitnami/nats chart, because it's small and fast
    name: oci://registry-1.docker.io/bitnamicharts/nats
    version: 7.8.3 # best practice

releases:
  - name: nats
    <<: *options
    monitors:
      - name: nats-up-metric
      - name: nats-delivered-metric
$ helmwave build --diff-mode none
[INFO]: 🔨 Building releases...
[INFO]: 🔨 Building values...
[INFO]: 🔨 no values provided
    release: nats@nats
[INFO]: 🔨 Building repositories...
[INFO]: 🔨 Building registries...
[INFO]: 🗄 registry has been added to the plan
    registry: registry-1.docker.io
[INFO]: 🔨 Building charts...
[INFO]: Pulled: registry-1.docker.io/bitnamicharts/nats:7.8.3
[INFO]: Digest: sha256:5f80350b8a85177e4a9c7ed968f77c47bedcc461418172fb66594bc61fa1ffac
[INFO]: 🔨 Building manifests...
[INFO]: ❎ skipping updating dependencies for remote chart
    release: nats@nats
[INFO]: Pulled: registry-1.docker.io/bitnamicharts/nats:7.8.3
[INFO]: Digest: sha256:5f80350b8a85177e4a9c7ed968f77c47bedcc461418172fb66594bc61fa1ffac
[INFO]: ✅  manifest done
    release: nats@nats
[INFO]: 🔨 Building graphs...
[INFO]: show graph:
┌───────────┐
│ nats@nats │
└───────────┘

[INFO]: 🏗 Plan
    registries: 
      - registry-1.docker.io
    releases: 
      - nats@nats
    repositories: 
      - 
[INFO]: 🆚 Skip diffing
[INFO]: 🏗 Planfile is ready!
[INFO]: 🏗 Plan
    releases: 
      - nats@nats
    repositories: 
      - 
    registries: 
      - registry-1.docker.io
[INFO]: 🗄 sync repositories...
[INFO]: 🗄 sync registries...
[INFO]: 🛥 sync releases...
[INFO]: 🛥 deploying...
    release: nats@nats
[INFO]: ✅
    release: nats@nats
[INFO]: monitor succeeded
    monitor: nats-up-metric
    streak: 1/5
[INFO]: monitor succeeded
    monitor: nats-up-metric
    streak: 2/5
[INFO]: monitor succeeded
    streak: 3/5
    monitor: nats-up-metric
[INFO]: monitor succeeded
    streak: 4/5
    monitor: nats-up-metric
[INFO]: monitor did not succeed
    monitor: nats-delivered-metric
    streak: 1/5
    error: result is empty
[INFO]: monitor succeeded
    monitor: nats-up-metric
    streak: 5/5
[INFO]: ✅
    monitor: nats-up-metric
[INFO]: monitor did not succeed
    monitor: nats-delivered-metric
    streak: 2/5
    error: result is empty
[INFO]: monitor did not succeed
    error: result is empty
    monitor: nats-delivered-metric
    streak: 3/5
[INFO]: monitor did not succeed
    error: result is empty
    streak: 4/5
    monitor: nats-delivered-metric
[INFO]: monitor did not succeed
    monitor: nats-delivered-metric
    streak: 5/5
    error: result is empty
[ERROR]: ❌ monitor failed
    monitor: nats-delivered-metric
    error: monitor triggered failure threshold
[ERROR]: monitors failed, need to take actions
    error: one of goroutines in waitgroup sent error: 1 error occurred:
    * monitor triggered failure threshold


[INFO]: chose action to perform for failed monitors
    action: rollback
    release: nats@nats
[INFO]: Releases Success 1 / 1
[INFO]: Monitors Success 1 / 2
          NAME          |             ERROR               
------------------------+---------------------------------
  nats-delivered-metric | monitor triggered failure       
                        | threshold                       
[FATAL]: deploy failed
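
The "result is empty" error in the log is consistent with how PromQL comparisons work: without the `bool` modifier, an expression such as `up == 1` filters series, so the query returns an empty result when no series satisfies the condition. A minimal sketch of such a check against a Prometheus instant-query response body follows. The function name `iteration_succeeded` is hypothetical, and treating any non-empty result as success is an assumption based on the log above, not a description of helmwave's code.

```python
import json


def iteration_succeeded(response_body: str) -> bool:
    """Return True when a Prometheus instant-query response holds at least one sample."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        return False  # the query itself failed
    # Assumption: an empty result means the monitored condition did not hold.
    result = payload.get("data", {}).get("result", [])
    return len(result) > 0


# Hand-written response bodies in the shape returned by /api/v1/query.
matching = '{"status":"success","data":{"resultType":"vector","result":[{"metric":{"job":"nats"},"value":[1700000000,"1"]}]}}'
empty = '{"status":"success","data":{"resultType":"vector","result":[]}}'

up_ok = iteration_succeeded(matching)
delivered_ok = iteration_succeeded(empty)
```

Here `up_ok` is `True` and `delivered_ok` is `False`, which mirrors the demo: `nats-up-metric` builds a success streak while `nats-delivered-metric` keeps failing with an empty result.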