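This is @komi_edtr_1230 from the data team.

At hey, the data team runs a workflow engine on Kubernetes and executes batch jobs every day to prepare the data we use in our routine analysis work. That workflow engine is Digdag.

Until now our data pipelines had been relatively stable and we were able to keep the data healthy. But as hey's services kept growing, data volume increased and business requirements became more complex, and we ran into more and more cases where Digdag felt like a stretch.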

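Specifically, we were running into problems such as:

When a very heavy job brought a batch down, the logs were hard to read and debugging was painful
We weren't making the most of our compute resources, so we couldn't fully benefit from parallel processing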

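Digdag is a workflow engine designed with on-prem environments in mind, so running it alongside Kubernetes requires some slightly tricky workarounds (Digdag has its own workspace, which makes file management a bit unusual). On top of that, Digdag is implemented in Java, so there are places where the Docker and JVM memory settings conflict, which can make performance tuning a hassle as well.

Until now we had kept Digdag on life support by fiddling with its settings and splitting up workflow files to keep things debuggable. But all we really expect from a workflow engine is that it kicks off jobs and collects their logs, nothing more. So we concluded that a workflow engine with a lower operational cost would be the best choice, and started looking into migrating from Digdag to a new one.

In the end, we decided to go with Argo Workflows.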

Why Argo Workflows

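Argo Workflows is a cloud-native workflow engine developed under the same project as Argo CD, which is well known for GitOps.

Argo Workflows is designed to run on Kubernetes. What makes it great is that, when it kicks off jobs, it carves each workflow out into its own separate Pod.

In other words, if you run 10 workflows, each one runs as a separate Pod, so they are isolated from one another, and because each one starts as its own runtime, scaling is easy.

Digdag has a similar mechanism, but in Digdag the jobs are executed inside the existing Digdag Pod, so there is no guarantee that compute resources are used effectively. (Digdag is reportedly working on a KubernetesCommandExecutor, but development appears to have been stalled for nearly a year.)

Argo, by contrast, makes full use of compute resources, which is exactly what makes it cloud native.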

Setup

So we'd like to start using this lovely Argo Workflows, but because it isn't widely adopted yet, there still aren't many articles to refer to, and I sank deep into the swamp while setting it up. (That is what motivated me to write this post.)

[Figure: Argo Workflows in action]

So, for anyone who wants to set up Argo Workflows in a similar way, I'll walk through the pitfalls one by one.

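Prerequisites

The environment for this setup is as follows:

GCP
GKE Autopilot for Kubernetes
GKE managed certificates for SSL
An Ingress in front
Identity-Aware Proxy (IAP)
Workload Identity for account authentication
Kustomize for manifest management
Argo installed with namespace-install

The data team also keeps separate development, staging, and production environments (just imagining what would happen if a data pipeline broke production is scary), and we use Kustomize with the following directory layout for them.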

build
├── base
│   ├── argo-server/
│   ├── crds/
│   └── workflow-controller/
└── overlays
    ├── experimental/
    ├── production/
    └── staging/

Getting the manifests

Copy the manifests from the upstream repository on GitHub.

The difference between namespace-install and the regular install is whether the --namespaced option is passed to the argo server command and the Workflow Controller command. Since we're doing a namespace-install here, add --namespaced to args yourself.
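
As a rough sketch of where that flag goes, the two Deployments might look something like the following. The labels and image references here are placeholders I'm assuming for illustration, and most fields are omitted; take the real values from the upstream manifests you copied.

```yaml
# base/argo-server/argo-server-deployment.yaml (abbreviated sketch)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: argo-server
spec:
  selector:
    matchLabels:
      app: argo-server
  template:
    metadata:
      labels:
        app: argo-server
    spec:
      serviceAccountName: argo-server
      containers:
        - name: argo-server
          image: quay.io/argoproj/argocli:latest   # placeholder; pin to your Argo version
          args:
            - server
            - --namespaced   # added by hand for namespace-install
---
# base/workflow-controller/workflow-controller-deployment.yaml (abbreviated sketch)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: workflow-controller
spec:
  selector:
    matchLabels:
      app: workflow-controller
  template:
    metadata:
      labels:
        app: workflow-controller
    spec:
      serviceAccountName: argo
      containers:
        - name: workflow-controller
          image: quay.io/argoproj/workflow-controller:latest   # placeholder
          command:
            - workflow-controller
          args:
            - --configmap
            - workflow-controller-configmap
            - --executor-image
            - quay.io/argoproj/argoexec:latest   # placeholder
            - --namespaced   # added by hand for namespace-install
```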

We'll also be setting up an Ingress, using a managed certificate, and using IAP (Identity-Aware Proxy), so prepare ingress.yaml, certificate.yaml, and backend-config.yaml as well.
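
For reference, a minimal sketch of what certificate.yaml and backend-config.yaml could contain is shown below. The BackendConfig name, the OAuth Secret name, and the example domain are assumptions made for illustration, not the values from our actual repository. The last document shows how a Service is usually pointed at a BackendConfig on GKE.

```yaml
# certificate.yaml: GKE managed certificate (domain is an example)
apiVersion: networking.gke.io/v1
kind: ManagedCertificate
metadata:
  name: argo-certificate
  namespace: argo
spec:
  domains:
    - argo.staging.example.com
---
# backend-config.yaml: turns on IAP for the backend service
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: argo-backend-config        # hypothetical name
  namespace: argo
spec:
  iap:
    enabled: true
    oauthclientCredentials:
      secretName: argo-iap-oauth   # hypothetical Secret holding client_id / client_secret
---
# Annotation on the argo-server Service that attaches the BackendConfig
apiVersion: v1
kind: Service
metadata:
  name: argo-server
  namespace: argo
  annotations:
    cloud.google.com/backend-config: '{"default": "argo-backend-config"}'
```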

In the end, the manifests were laid out across the directories like this.

.
├── README.md
└── build
    ├── base
    │   ├── argo-server
    │   │   ├── argo-server-deployment.yaml
    │   │   ├── argo-server-role.yaml
    │   │   ├── argo-server-rolebinding.yaml
    │   │   ├── argo-server-sa.yaml
    │   │   ├── argo-server-service.yaml
    │   │   └── kustomization.yaml
    │   ├── backend-config.yaml
    │   ├── certificate.yaml
    │   ├── crds
    │   │   ├── README.md
    │   │   ├── argoproj.io_clusterworkflowtemplates.yaml
    │   │   ├── argoproj.io_cronworkflows.yaml
    │   │   ├── argoproj.io_workfloweventbindings.yaml
    │   │   ├── argoproj.io_workflows.yaml
    │   │   ├── argoproj.io_workflowtasksets.yaml
    │   │   ├── argoproj.io_workflowtemplates.yaml
    │   │   └── kustomization.yaml
    │   ├── ingress.yaml
    │   ├── kustomization.yaml
    │   ├── namespace.yaml
    │   └── workflow-controller
    │       ├── kustomization.yaml
    │       ├── workflow-controller-configmap.yaml
    │       ├── workflow-controller-deployment.yaml
    │       ├── workflow-controller-metrics-service.yaml
    │       ├── workflow-controller-role.yaml
    │       ├── workflow-controller-rolebinding.yaml
    │       └── workflow-controller-sa.yaml
    └── overlays
        ├── experimental
        │   ├── argo-server-sa.yaml
        │   ├── certificate.yaml
        │   ├── ingress.yaml
        │   ├── kustomization.yaml
        │   └── workflow-controller-sa.yaml
        ├── production
        └── staging

For the CRDs (Custom Resource Definitions), we used the minimal variant from the upstream repository.

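The files under overlays/ were prepared for the following purposes.

yaml	Purpose
`argo-server-sa.yaml` and `workflow-controller-sa.yaml`	Link the Kubernetes service accounts to the GCP service account via Workload Identity. Needed per environment because the GCP service account is named `argo-sa@hoge-project-[ENVIRONMENT].iam.gserviceaccount.com`
`certificate.yaml`	The domain differs per environment, e.g. `argo.staging.example.com` or `argo.example.com`, so this maps each domain to its certificate manifest
`ingress.yaml`	Maps the domain and static IP for each environment
`kustomization.yaml`	Applies labels such as `env:staging`, among other things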
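
To make this a little more concrete, here is a sketch of what the staging overlay could look like. The `iam.gke.io/gcp-service-account` annotation is the standard way to bind a Kubernetes service account to a GCP one under Workload Identity. The file list and the use of `patchesStrategicMerge` and `commonLabels` are my assumptions about one reasonable layout, not a copy of our actual files.

```yaml
# overlays/staging/argo-server-sa.yaml
# Patches the base ServiceAccount with the Workload Identity annotation.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: argo-server
  annotations:
    iam.gke.io/gcp-service-account: argo-sa@hoge-project-staging.iam.gserviceaccount.com
---
# overlays/staging/kustomization.yaml (a separate file; shown here for brevity)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: argo
resources:
  - ../../base
commonLabels:
  env: staging            # note: commonLabels is also added to selectors
patchesStrategicMerge:    # assumed patch style; newer Kustomize prefers `patches`
  - argo-server-sa.yaml
  - workflow-controller-sa.yaml
  - certificate.yaml
  - ingress.yaml
```

Keep in mind that the annotation alone is not enough: the GCP service account also needs an IAM binding that lets the Kubernetes service account impersonate it.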

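Ingress health checks fail when using IAP

With the preparation supposedly done, I deployed everything and... the Deployment and Service came up fine, but the health check from the Ingress would not pass.

With IAP, a request does not reach the server unless a token is embedded in the request headers.

I'm not sure whether this is by design, but IAP itself apparently also terminates TLS at the point where it injects this header, so traffic behind IAP seems to be plain HTTP rather than HTTPS.

Argo, on the other hand, communicates over HTTP/2 by default, and HTTP/2 effectively requires TLS 1.2 or later (the HTTP/2 spec does technically allow plain HTTP as well), so IAP and Argo don't mesh well.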

The fix was to explicitly pass the --secure=false option to argo server so that it serves plain HTTP, and to make the readinessProbe health check use HTTP as well.
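
A sketch of that change, written as a Kustomize-style patch against the argo-server Deployment, might look like this. The port number 2746 and the probe timings are taken from what I believe the upstream manifest uses, so treat them as assumptions and check them against your copy.

```yaml
# Patch for the argo-server Deployment (sketch)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: argo-server
spec:
  template:
    spec:
      containers:
        - name: argo-server
          args:
            - server
            - --namespaced
            - --secure=false     # serve plain HTTP behind IAP
          ports:
            - name: web
              containerPort: 2746
          readinessProbe:
            httpGet:
              path: /
              port: 2746
              scheme: HTTP       # was HTTPS; must match --secure=false
            initialDelaySeconds: 10
            periodSeconds: 20
```

The probe matters here because, as far as I understand, GKE Ingress derives its load balancer health check from the serving Pod's readiness probe, so a probe that still expects HTTPS keeps the backend marked unhealthy.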

With that, the Ingress finally works and argo.[ENVIRONMENT].example.com becomes reachable.

The final Ingress manifest looks like this.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    ingress.kubernetes.io/rewrite-target: /$2
    kubernetes.io/ingress.global-static-ip-name: argo-${ENVIRONMENT}   # static IP used for the Argo domain
    networking.gke.io/managed-certificates: argo-certificate   # certificate for Argo
  labels:
    env: ${ENVIRONMENT}
  name: argo-ingress
  namespace: argo
spec:
  rules:
  - host: argo.${ENVIRONMENT}.example.com
    http:
      paths:
      - backend:
          service:
            name: argo-server
            port:
              name: web
        path: /*
        pathType: ImplementationSpecific

Replace ${ENVIRONMENT} as appropriate.

Making sense of the Workflow Controller ConfigMap

Now the Argo Workflows Web UI can be opened in a browser, but the real appeal of Argo is how well it uses compute resources.

In this setup, we want to configure how that resource usage works, and also have the logs and status of completed workflows saved to external resources (GCS, Cloud SQL, etc.).

To do that, a number of settings need to go into base/workflow-controller/workflow-controller-configmap.yaml.

The resulting ConfigMap is shown below.

apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-controller-configmap
data:
  # instanceID is a label selector to limit the controller's watch to a specific instance. It
  # contains an arbitrary value that is carried forward into its pod labels, under the key
  # workflows.argoproj.io/controller-instanceid, for the purposes of workflow segregation. This
  # enables a controller to only receive workflow and pod events that it is interested about,
  # in order to support multiple controllers in a single cluster, and ultimately allows the
  # controller itself to be bundled as part of a higher level application. If omitted, the
  # controller watches workflows and pods that *are not* labeled with an instance id.
  # instanceID: my-ci-controller

  # Parallelism limits the max total parallel workflows that can execute at the same time
  # (available since Argo v2.3). Controller must be restarted to take effect.
  parallelism: "10"

  # Limit the maximum number of incomplete workflows in a namespace.
  # Intended for cluster installs that are multi-tenancy environments, to prevent too many workflows in one
  # namespace impacting others.
  # >= v3.2
  namespaceParallelism: "10"

  # Globally limits the rate at which pods are created.
  # This is intended to mitigate flooding of the Kubernetes API server by workflows with a large amount of
  # parallel nodes.
  resourceRateLimit: |
    limit: 10
    burst: 1

  # Whether or not to emit events on node completion. These can take a up a lot of space in
  # k8s (typically etcd) resulting in errors when trying to create new events:
  # "Unable to create audit event: etcdserver: mvcc: database space exceeded"
  # This config item allows you to disable this.
  # (since v2.9)
  nodeEvents: |
    enabled: true

  # uncomment flowing lines if workflow controller runs in a different k8s cluster with the
  # workflow workloads, or needs to communicate with the k8s apiserver using an out-of-cluster
  # kubeconfig secret
  # kubeConfig:
  #   # name of the kubeconfig secret, may not be empty when kubeConfig specified
  #   secretName: kubeconfig-secret
  #   # key of the kubeconfig secret, may not be empty when kubeConfig specified
  #   secretKey: kubeconfig
  #   # mounting path of the kubeconfig secret, default to /kube/config
  #   mountPath: /kubeconfig/mount/path
  #   # volume name when mounting the secret, default to kubeconfig
  #   volumeName: kube-config-volume

  # links: |
  #   # Adds a button to the workflow page. E.g. linking to you logging facility.
  #   - name: Example Workflow Link
  #     scope: workflow
  #     url: http://logging-facility?namespace=${metadata.namespace}&workflowName=${metadata.name}&startedAt=${status.startedAt}&finishedAt=${status.finishedAt}
  #   # Adds a button next to the pod.  E.g. linking to you logging facility but for the pod only.
  #   - name: Example Pod Link
  #     scope: pod
  #     url: http://logging-facility?namespace=${metadata.namespace}&podName=${metadata.name}&startedAt=${status.startedAt}&finishedAt=${status.finishedAt}
  #   - name: Pod Logs
  #     scope: pod-logs
  #     url: http://logging-facility?namespace=${metadata.namespace}&podName=${metadata.name}&startedAt=${status.startedAt}&finishedAt=${status.finishedAt}
  #   - name: Event Source Logs
  #     scope: event-source-logs
  #     url: http://logging-facility?namespace=${metadata.namespace}&podName=${metadata.name}&startedAt=${status.startedAt}&finishedAt=${status.finishedAt}
  #   - name: Sensor Logs
  #     scope: sensor-logs
  #     url: http://logging-facility?namespace=${metadata.namespace}&podName=${metadata.name}&startedAt=${status.startedAt}&finishedAt=${status.finishedAt}
  #   # Adds a button to the bottom right of every page to link to your organisation help or chat.
  #   - name: Get help
  #     scope: chat
  #     url: http://my-chat

  # artifactRepository defines the default location to be used as the artifact repository for
  # container artifacts.
  artifactRepository: |
    # archiveLogs will archive the main container logs as an artifact
    archiveLogs: true
    gcs:
      bucket: argo-log-${ENVIRONMENT}
      # keyFormat is a format pattern to define how artifacts will be organized in a bucket.
      # It can reference workflow metadata variables such as workflow.namespace, workflow.name,
      # pod.name. Can also use strftime formating of workflow.creationTimestamp so that workflow
      # artifacts can be organized by date. If omitted, will use `{{workflow.name}}/{{pod.name}}`,
      # which has potential for have collisions.
      # The following example pattern organizes workflow artifacts under a "my-artifacts" sub dir,
      # then sub dirs for year, month, date and finally workflow name and pod.
      # e.g.: my-artifacts/2018/08/23/my-workflow-abc123/my-workflow-abc123-1234567890
      keyFormat: "argo\
        /{{workflow.creationTimestamp.Y}}\
        /{{workflow.creationTimestamp.m}}\
        /{{workflow.creationTimestamp.d}}\
        /{{workflow.name}}\
        /{{pod.name}}"

  # Specifies the container runtime interface to use (default: emissary)
  # must be one of: docker, kubelet, k8sapi, pns, emissary
  # It has lower precedence than either `--container-runtime-executor` and `containerRuntimeExecutors`.
  containerRuntimeExecutor: emissary

  # Specifies the executor to use.
  #
  # You can use this to:
  # * Tailor your executor based on your preference for security or performance.
  # * Test out an executor without committing yourself to use it for every workflow.
  #
  # To find out which executor was actually use, see the `wait` container logs.
  #
  # The list is in order of precedence; the first matching executor is used.
  # This has precedence over `containerRuntimeExecutor`.
  containerRuntimeExecutors: |
    - name: emissary
      selector:
        matchLabels:
          workflows.argoproj.io/container-runtime-executor: emissary
    - name: pns
      selector:
        matchLabels:
          workflows.argoproj.io/container-runtime-executor: pns

  # Specifies the location of docker.sock on the host for docker executor (default: /var/run/docker.sock)
  # (available since Argo v2.4)
  dockerSockPath: /var/run/docker.sock

  # kubelet port when using kubelet executor (default: 10250)
  kubeletPort: "10250"

  # disable the TLS verification of the kubelet executor (default: false)
  kubeletInsecure: "false"

  # The command/args for each image, needed when the command is not specified and the emissary executor is used.
  # https://argoproj.github.io/argo-workflows/workflow-executors/#emissary-emissary
  images: |
    argoproj/argosay:v1:
      command: [cowsay]
    argoproj/argosay:v2:
      command: [/argosay]
    docker/whalesay:latest:
      command: [cowsay]
    python:alpine3.6:
      command: [python3]

  # executor controls how the init and wait container should be customized
  # (available since Argo v2.3)
  # executor: |
  #   imagePullPolicy: IfNotPresent
  #   resources:
  #     requests:
  #       cpu: 2
  #       memory: 2048Mi
  #   # args & env allows command line arguments and environment variables to be appended to the
  #   # executor container and is mainly used for development/debugging purposes.
  #   args:
  #   - --loglevel
  #   - debug
  #   - --gloglevel
  #   - "6"
  #   env:
  #   # ARGO_TRACE enables some tracing information for debugging purposes. Currently it enables
  #   # logging of S3 request/response payloads (including auth headers)
  #   - name: ARGO_TRACE
  #     value: "1"

  # metricsConfig controls the path and port for prometheus metrics. Metrics are enabled and emitted on localhost:9090/metrics
  # by default.
  metricsConfig: |
    # Enabled controls metric emission. Default is true, set "enabled: false" to turn off
    enabled: true
    # Path is the path where metrics are emitted. Must start with a "/". Default is "/metrics"
    path: /metrics
    # Port is the port where metrics are emitted. Default is "9090"
    port: 8080
    # MetricsTTL sets how often custom metrics are cleared from memory. Default is "0", metrics are never cleared
    metricsTTL: "10m"
    # IgnoreErrors is a flag that instructs prometheus to ignore metric emission errors. Default is "false"
    ignoreErrors: false
    # Use a self-signed cert for TLS, default false
    secure: false
    # DEPRECATED: Legacy metrics are now removed, this field is ignored
    disableLegacy: false

  # telemetryConfig controls the path and port for prometheus telemetry. Telemetry is enabled and emitted in the same endpoint
  # as metrics by default, but can be overridden using this config.
  telemetryConfig: |
    enabled: true
    path: /telemetry
    port: 8080
    secure: true  # Use a self-signed cert for TLS, default false

  # enable persistence using postgres
  persistence: |
    connectionPool:
      maxIdleConns: 100
      maxOpenConns: 0
      connMaxLifetime: 0s # 0 means connections don't have a max lifetime
    #  if true node status is only saved to the persistence DB to avoid the 1MB limit in etcd
    nodeStatusOffLoad: false
    # save completed workloads to the workflow archive
    archive: true
    # the number of days to keep archived workflows (the default is forever)
    # archiveTTL: 180d
    # skip database migration if needed.
    # skipMigration: true
    # LabelSelector determines the workflow that matches with the matchlabels or matchrequirements, will be archived.
    # https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
    # archiveLabelSelector:
    #   matchLabels:
    #     workflows.argoproj.io/archive-strategy: "always"
    # Optional name of the cluster I'm running in. This must be unique for your cluster.
    clusterName: data-platform-${ENVIRONMENT}
    postgresql:
      host: 127.0.0.1
      port: 5432
      database: postgres
      tableName: argo_workflows
      # the database secrets must be in the same namespace of the controller
      userNameSecret:
        name: argo-postgres-config
        key: username
      passwordSecret:
        name: argo-postgres-config
        key: password
      ssl: false
      # sslMode must be one of: disable, require, verify-ca, verify-full
      # you can find more information about those ssl options here: https://godoc.org/github.com/lib/pq
      sslMode: disable
    # Optional config for mysql:
    # mysql:
    #   host: localhost
    #   port: 3306
    #   database: argo
    #   tableName: argo_workflows
    #   userNameSecret:
    #     name: argo-mysql-config
    #     key: username
    #   passwordSecret:
    #     name: argo-mysql-config
    #     key: password

  # Default values that will apply to all Workflows from this controller, unless overridden on the Workflow-level
  # See more: docs/default-workflow-specs.md
  # workflowDefaults: |
  #   metadata:
  #     annotations:
  #       argo: workflows
  #     labels:
  #       foo: bar
  #   spec:
  #     ttlStrategy:
  #       secondsAfterSuccess: 5
  #     parallelism: 3

  # SSO Configuration for the Argo server.
  # You must also start argo server with `--auth-mode sso`.
  # https://argoproj.github.io/argo-workflows/argo-server-auth-mode/
  # sso: |
  #   # This is the root URL of the OIDC provider (required).
  #   issuer: https://issuer.root.url/
  #   # Some OIDC providers have alternate root URLs that can be included. These should be reviewed carefully. (optional)
  #   issuerAlias: https://altissuer.root.url
  #   # This defines how long your login is valid for (in hours). (optional)
  #   # If omitted, defaults to 10h. Example below is 10 days.
  #   sessionExpiry: 240h
  #   # This is name of the secret and the key in it that contain OIDC client
  #   # ID issued to the application by the provider (required).
  #   clientId:
  #     name: client-id-secret
  #     key: client-id-key
  #   # This is name of the secret and the key in it that contain OIDC client
  #   # secret issued to the application by the provider (required).
  #   clientSecret:
  #     name: client-secret-secret
  #     key: client-secret-key
  #   # This is the redirect URL supplied to the provider (optional). It must
  #   # be in the form <argo-server-root-url>/oauth2/callback. It must be
  #   # browser-accessible. If omitted, will be automatically generated.
  #   redirectUrl: https://argo-server/oauth2/callback
  #   # Additional scopes to request. Typically needed for SSO RBAC. >= v2.12
  #   scopes:
  #    - groups
  #    - email
  #   # RBAC Config. >= v2.12
  #   rbac:
  #     enabled: false

  # workflowRestrictions restricts the Workflows that the controller will process.
  # Current options:
  #   Strict: Only Workflows using "workflowTemplateRef" will be processed. This allows the administrator of the controller
  #     to set a "library" of templates that may be run by its operator, limiting arbitrary Workflow execution.
  #   Secure: Only Workflows using "workflowTemplateRef" will be processed and the controller will enforce
  #     that the WorkflowTemplate that is referenced hasn't changed between operations. If you want to make sure the operator of the
  #     Workflow cannot run an arbitrary Workflow, use this option.
  # workflowRestrictions: |
  #   templateReferencing: Strict

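Yep, that's a lot.

For the most part, the comments tell you what each setting does and the parameters are easy to tweak, so I'll skip most of it and pick out just a few key points.

The most important one is containerRuntimeExecutor: to use Argo Workflows on GKE Autopilot, containerRuntimeExecutor must be set to emissary. Without it, workflows won't be carved out into Pods. (Apparently emissary is now the default?)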

Next, persistence. This is needed when you want to keep workflow status somewhere external (by default, the results disappear when the Pod is discarded), and it requires Cloud SQL or some other external PostgreSQL or MySQL. We set up Cloud SQL and run Cloud SQL Proxy as a sidecar in the Deployment so that PostgreSQL is reachable at 127.0.0.1:5432.
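
A minimal sketch of the sidecar and the Secret that the persistence section refers to could look like the following. The proxy image tag, the instance connection name, and the credentials are all hypothetical placeholders; this assumes the v1 Cloud SQL Proxy and its `-instances` flag.

```yaml
# Patch adding a Cloud SQL Proxy sidecar to the workflow-controller Deployment (sketch)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: workflow-controller
spec:
  template:
    spec:
      containers:
        - name: cloud-sql-proxy
          image: gcr.io/cloudsql-docker/gce-proxy:1.27.0   # assumed v1 proxy image; pin your own version
          command:
            - /cloud_sql_proxy
            # PROJECT:REGION:INSTANCE below is a hypothetical connection name
            - -instances=hoge-project-staging:asia-northeast1:argo-db=tcp:5432
          securityContext:
            runAsNonRoot: true
---
# Secret referenced by userNameSecret / passwordSecret in the ConfigMap
apiVersion: v1
kind: Secret
metadata:
  name: argo-postgres-config
  namespace: argo
type: Opaque
stringData:
  username: argo          # placeholder
  password: change-me     # placeholder; do not commit real credentials
```

Since the proxy authenticates through Workload Identity, the GCP service account bound to the Pod needs permission to connect to Cloud SQL. If the Argo server also reads the workflow archive, it would need the same sidecar.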

Finally, artifactRepository is the setting for storing workflow logs in GCS/S3. Note that the two are different: persistence stores only workflow results, whereas artifactRepository collects workflow logs.

Argo can also be configured with SSO, but we consider IAP and our other settings sufficient for security, so the Argo server runs with --auth-mode=server. That is why the ConfigMap contains no SSO configuration at all.

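Pods can't be updated because of missing RBAC permissions

After all that, everything came up properly, I clicked around the UI, ran a job in Argo... and it didn't run. How many traps is that now?

Jobs kept failing, so I looked into it. It turned out that permissions on Pods were insufficient: the RBAC settings were missing update/patch on the Pod resource.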

- apiGroups:
    - ""
  resources:
    - pods
    - pods/exec
    - pods/log
  verbs:
    - get
    - list
    - watch
    - delete
    - patch   # Add
    - update  # Add

I sent a PR upstream for this.

Also, the workflow manifest needs spec.serviceAccountName set to the name of your own Kubernetes service account (otherwise it falls back to the default service account, which doesn't have enough permissions).
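
As an illustration, a minimal workflow with the service account set might look like this. The service account name `argo-workflow` is hypothetical, so substitute whichever Kubernetes service account you granted the Pod permissions to. The image and command match the argosay entry in the `images` section of the ConfigMap.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-
  namespace: argo
spec:
  serviceAccountName: argo-workflow   # hypothetical; omit it and the default service account is used
  entrypoint: main
  templates:
    - name: main
      container:
        image: argoproj/argosay:v2
        command: [/argosay]
        args: [echo, "hello from Argo"]
```

Because the manifest uses generateName, submit it with `kubectl create -f` (or `argo submit`) rather than `kubectl apply`.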

With that, Argo Workflows is finally up and running.