Deploy Target HCVs

About

The Deploy Target section defines where you’re deploying Pachyderm; this is typically located at the top of your values.yaml file.

Values

The following are commonly used configurations for this section of your values.yaml Helm chart.

Options:
# Deploy Target configures the storage backend and cloud provider settings (storage classes, etc). 
# options:  GOOGLE, AMAZON, MINIO, MICROSOFT, CUSTOM or LOCAL.
deployTarget: "LOCAL"
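
For example, a deployment to AWS pairs deployTarget with the matching storage backend under pachd.storage (a minimal sketch; the bucket name and region below are placeholders):

deployTarget: "AMAZON"

pachd:
  storage:
    backend: "AMAZON"
    amazon:
      bucket: "my-pachyderm-bucket" # placeholder bucket name
      region: "us-east-1"
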
Global HCVs

About

The Global section configures the connection to the PostgreSQL database. By default, it uses the included Postgres service.

Values

The following are commonly used configurations for this section of your values.yaml Helm chart.

Options:
global:
  postgresql:
    postgresqlAuthType: "md5" # sets the auth type used with postgres & pgbouncer; options include "md5" and "scram-sha-256"
    postgresqlUsername: "pachyderm" # defines the username to access the pachyderm and dex databases
    postgresqlExistingSecretName: "" # leave blank if using password
    postgresqlExistingSecretKey: "" # leave blank if using password
    postgresqlDatabase: "pachyderm" # defines the database name where pachyderm data will be stored
    postgresqlHost: "postgres" # defines the postgresql database host to connect to
    postgresqlPort: "5432"  # defines the postgresql database port to connect to
    postgresqlSSL: "disable" # defines the SSL mode used to connect pg-bouncer to postgres
    postgresqlSSLCACert: "" # defines the CA Certificate required to connect to Postgres
    postgresqlSSLSecret: "" # defines the TLS Secret with cert/key to connect to Postgres
    identityDatabaseFullNameOverride: "" # defines the DB name that dex connects to; defaults to "Dex"
  imagePullSecrets: [] # allows you to pull images from private repositories; also added to pipeline workers

  # Example:
  # imagePullSecrets:
  #   - regcred

  customCaCerts: false # loads the cert file in pachd-tls-cert as the root cert for pachd, console, and enterprise-server 
  proxy: "" # sets server address for outbound cluster traffic
  noProxy: "" # if proxy is set, allows a comma-separated list of destinations that bypass the proxy
  securityContexts: # sets security context runAs users; if running on OpenShift, set enabled to false, as OpenShift creates its own security contexts.
    enabled: true
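
For example, a sketch of connecting pg-bouncer to an external PostgreSQL instance over TLS with SCRAM authentication, pulling the password from a pre-created Kubernetes secret (the hostname and secret names are placeholders). When using an external database, also disable the bundled subchart as shown in the PostgreSQL Subchart section:

global:
  postgresql:
    postgresqlAuthType: "scram-sha-256"
    postgresqlHost: "my-postgres.example.com" # placeholder host
    postgresqlPort: "5432"
    postgresqlSSL: "require"
    postgresqlExistingSecretName: "postgres-creds" # placeholder secret name
    postgresqlExistingSecretKey: "password"
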
Console HCVs

About

Console is the Graphical User Interface (GUI) for Pachyderm. Users who prefer to navigate and manage their project resources visually can connect to Console by authenticating against your configured OIDC provider. For personal-machine installations of Pachyderm, a user may access Console via localhost without authentication.

Values

The following are commonly used configurations for this section of your values.yaml Helm chart.

Options:
console:
  enabled: true # deploys Console UI
  annotations: {}
  image: # defines which image to use for the console; replicates the --console-image & --registry arguments to pachctl
    repository: "pachyderm/haberdashery" # defines image repo location
    pullPolicy: "IfNotPresent"
    tag: "2.3.3-1" # defines the image repo to pull from
  priorityClassName: ""
  nodeSelector: {}
  tolerations: []
  podLabels: {} # specifies labels to add to the console pod.
  resources: # specifies the resource request and limits; unset by default.
    {}

    #limits:
    #  cpu: "1"
    #  memory: "2G"
    #requests:
    #  cpu: "1"
    #  memory: "2G"
    
  config: # defines primary configuration settings, including authentication.
    reactAppRuntimeIssuerURI: ""  # defines the pachd oauth address accessible to outside clients.
    oauthRedirectURI: "" #  defines the oauth callback address within console that the pachd oauth service would redirect to.
    oauthClientID: "console" # defines the client identifier for the Console with pachd
    oauthClientSecret: "" # defines the secret configured for the client with pachd; if blank, autogenerated.
    oauthClientSecretSecretName: "" # uses the value of an existing k8s secret by pulling from the `OAUTH_CLIENT_SECRET` key.
    graphqlPort: 4000 # defines the http port that the console service will be accessible on.
    pachdAddress: "pachd-peer:30653"
    disableTelemetry: false # disables analytics and error data collection

  service:
    annotations: {}
    labels: {} # specifies labels to add to the console service.
    type: ClusterIP # specifies the Kubernetes type of the console service; default is `ClusterIP`.
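
For example, a sketch of a Console served at a public hostname behind the proxy. The hostname is a placeholder, and the OAuth callback path shown is an assumption to verify against your Pachyderm version:

console:
  enabled: true
  config:
    reactAppRuntimeIssuerURI: "https://pachyderm.example.com" # placeholder; pachd OAuth address reachable by clients
    oauthRedirectURI: "https://pachyderm.example.com/oauth/callback/?inline=true" # assumed callback path; verify for your version
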
Enterprise Server HCVs

About

Enterprise Server is a production management layer that centralizes license registration for multiple Pachyderm clusters and the setup of user authentication/authorization via OIDC.

Values

The following are commonly used configurations for this section of your values.yaml Helm chart.

Options:

enterpriseServer:
  enabled: true
  affinity: {}
  annotations: {}
  tolerations: []
  priorityClassName: ""
  nodeSelector: {}
  service:
    type: ClusterIP
    apiGRPCPort: 31650
    prometheusPort: 31656
    oidcPort: 31657
    identityPort: 31658
    s3GatewayPort: 31600
  tls:
    enabled: false
  resources:
    {}
    
    #limits:
    #  cpu: "1"
    #  memory: "2G"
    #requests:
    #  cpu: "1"
    #  memory: "2G"

  podLabels: {} # specifies labels to add to the enterprise-server pod.
  clusterDeploymentID: ""
  image:
    repository: "pachyderm/pachd"
    pullPolicy: "IfNotPresent"
    tag: "" #  defaults to the chart’s specified appVersion.
ETCD HCVs

About

The ETCD section configures the ETCD cluster in the deployment.

Values

The following are commonly used configurations for this section of your values.yaml Helm chart.

etcd:
  affinity: {}
  annotations: {}
  dynamicNodes: 1 # sets the number of nodes in the etcd StatefulSet;  analogous to the --dynamic-etcd-nodes argument to pachctl
  image:
    repository: "pachyderm/etcd"
    tag: "v3.5.1"
    pullPolicy: "IfNotPresent"
  maxTxnOps: 10000 # sets the --max-txn-ops in the container args
  priorityClassName: ""
  nodeSelector: {}
  podLabels: {} # specifies labels to add to the etcd pod.
  
  resources: # specifies the resource request and limits
    {}

    #limits:
    #  cpu: "1"
    #  memory: "2G"
    #requests:
    #  cpu: "1"
    #  memory: "2G"

  storageClass: "" #  defines what existing storage class to use; analogous to --etcd-storage-class argument to pachctl 
  storageSize: 10Gi # specifies the size of the volume to use for etcd.
  service:
    annotations: {} # specifies annotations to add to the etcd service.
    labels: {} # specifies labels to add to the etcd service.
    type: ClusterIP # specifies the Kubernetes type of the etcd service.
  tolerations: []
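
For example, a sketch that gives etcd a larger volume on a faster storage class (the class name is a placeholder specific to your cluster and provider):

etcd:
  storageClass: "premium-rwo" # placeholder; an SSD-backed class from your provider
  storageSize: 50Gi
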
Ingress HCVs

About

⚠️

Ingress will be removed from the Helm chart once deploying Pachyderm with a proxy becomes mandatory.

Values

The following are commonly used configurations for this section of your values.yaml Helm chart.

Options:
ingress:
  enabled: true
  annotations: {}
  host: ""
  uriHttpsProtoOverride: false # if true, adds the https protocol to the ingress URI routes without configuring certs
  tls:
    enabled: true
    secretName: ""
Loki HCVs

About

Loki Stack contains values that are passed to the loki-stack subchart. For more details on each bundled service (Loki, Promtail, Grafana), see its official documentation.

Values

The following are commonly used configurations for this section of your values.yaml Helm chart.

loki-stack:
  loki:
    serviceAccount:
      automountServiceAccountToken: false
    persistence:
      enabled: true
      accessModes:
        - ReadWriteOnce
      size: 10Gi
      # More info for setting up storage classes on various cloud providers:
      # AWS: https://docs.aws.amazon.com/eks/latest/userguide/storage-classes.html
      # GCP: https://cloud.google.com/compute/docs/disks/performance#disk_types
      # Azure: https://docs.microsoft.com/en-us/azure/aks/concepts-storage#storage-classes
      storageClassName: ""
      annotations: {}
      priorityClassName: ""
      nodeSelector: {}
      tolerations: []
    config:
      limits_config:
        retention_period: 24h
        retention_stream:
          - selector: '{suite="pachyderm"}'
            priority: 1
            period: 168h # = 1 week
  grafana:
    enabled: false
  promtail:
    config:
      clients:
        - url: "http://{{ .Release.Name }}-loki:3100/loki/api/v1/push"
      snippets:
        # The scrapeConfigs section is copied from loki-stack-2.6.4
        # The pipeline_stages.match stanza has been added to prevent multiple lokis in a cluster from mixing their logs.
        scrapeConfigs: |
          - job_name: kubernetes-pods
            pipeline_stages:
              {{- toYaml .Values.config.snippets.pipelineStages | nindent 4 }}
              - match:
                  selector: '{namespace!="{{ .Release.Namespace }}"}'
                  action: drop
            kubernetes_sd_configs:
              - role: pod
            relabel_configs:
              - source_labels:
                  - __meta_kubernetes_pod_controller_name
                regex: ([0-9a-z-.]+?)(-[0-9a-f]{8,10})?
                action: replace
                target_label: __tmp_controller_name
              - source_labels:
                  - __meta_kubernetes_pod_label_app_kubernetes_io_name
                  - __meta_kubernetes_pod_label_app
                  - __tmp_controller_name
                  - __meta_kubernetes_pod_name
                regex: ^;*([^;]+)(;.*)?$
                action: replace
                target_label: app
              - source_labels:
                  - __meta_kubernetes_pod_label_app_kubernetes_io_instance
                  - __meta_kubernetes_pod_label_release
                regex: ^;*([^;]+)(;.*)?$
                action: replace
                target_label: instance
              - source_labels:
                  - __meta_kubernetes_pod_label_app_kubernetes_io_component
                  - __meta_kubernetes_pod_label_component
                regex: ^;*([^;]+)(;.*)?$
                action: replace
                target_label: component
              {{- if .Values.config.snippets.addScrapeJobLabel }}
              - replacement: kubernetes-pods
                target_label: scrape_job
              {{- end }}
              {{- toYaml .Values.config.snippets.common | nindent 4 }}
              {{- with .Values.config.snippets.extraRelabelConfigs }}
              {{- toYaml . | nindent 4 }}
              {{- end }}
        pipelineStages:
          - cri: {}
        common:
          # This is copy and paste of existing actions, so we don't lose them.
          # Cf. https://github.com/grafana/loki/issues/3519#issuecomment-1125998705
          - action: replace
            source_labels:
              - __meta_kubernetes_pod_node_name
            target_label: node_name
          - action: replace
            source_labels:
              - __meta_kubernetes_namespace
            target_label: namespace
          - action: replace
            replacement: $1
            separator: /
            source_labels:
              - namespace
              - app
            target_label: job
          - action: replace
            source_labels:
              - __meta_kubernetes_pod_name
            target_label: pod
          - action: replace
            source_labels:
              - __meta_kubernetes_pod_container_name
            target_label: container
          - action: replace
            replacement: /var/log/pods/*$1/*.log
            separator: /
            source_labels:
              - __meta_kubernetes_pod_uid
              - __meta_kubernetes_pod_container_name
            target_label: __path__
          - action: replace
            regex: true/(.*)
            replacement: /var/log/pods/*$1/*.log
            separator: /
            source_labels:
              - __meta_kubernetes_pod_annotationpresent_kubernetes_io_config_hash
              - __meta_kubernetes_pod_annotation_kubernetes_io_config_hash
              - __meta_kubernetes_pod_container_name
            target_label: __path__
          - action: keep
            regex: pachyderm
            source_labels:
              - __meta_kubernetes_pod_label_suite
          # this gets all kubernetes labels as well
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
    livenessProbe:
      failureThreshold: 5
      tcpSocket:
        port: http-metrics
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
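
For example, a sketch that extends retention of Pachyderm's log stream to two weeks while leaving the general retention period at its default:

loki-stack:
  loki:
    config:
      limits_config:
        retention_period: 24h
        retention_stream:
          - selector: '{suite="pachyderm"}'
            priority: 1
            period: 336h # = 2 weeks
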
PachD HCVs

Values

The following are commonly used configurations for this section of your values.yaml Helm chart.

Options:
pachd:
  enabled: true
  preflightChecks:
    enabled: true # runs kube validation preflight checks.
  affinity: {}
  annotations: {}
  clusterDeploymentID: "" # sets Pachyderm cluster ID.
  configJob:
    annotations: {}
  goMaxProcs: 0 # passed as GOMAXPROCS to the pachd container.
  image:
    repository: "pachyderm/pachd"
    pullPolicy: "IfNotPresent"
    tag: "" # sets worker image tag; defaults to appVersion.
  logFormat: "json"
  logLevel: "info"
  lokiDeploy: true # deploys the loki-stack subchart alongside pachd
  lokiLogging: true # enables shipping pachd and worker logs to Loki
  metrics:
    enabled: true
    endpoint: "" # provide the URL of the metrics endpoint.
  priorityClassName: ""
  nodeSelector: {}
  podLabels: {} # adds labels to the pachd pod.
  replicas: 1 # sets the number of running pachd pods
  resources: #  specifies the resource requests & limits
    {}
    #limits:
    #  cpu: "1"
    #  memory: "2G"
    #requests:
    #  cpu: "1"
    #  memory: "2G"

  requireCriticalServersOnly: false

  externalService:
    enabled: false # Creates a service that's safe to expose.
    loadBalancerIP: ""
    apiGRPCPort: 30650
    s3GatewayPort: 30600
    annotations: {}

  service:
    labels: {} # adds labels to the pachd service.
    type: "ClusterIP" # specifies pachd service's Kubernetes type
    annotations: {}
    apiGRPCPort: 30650
    prometheusPort: 30656
    oidcPort: 30657
    identityPort: 30658
    s3GatewayPort: 30600

    #apiGrpcPort:
    #  expose: true
    #  port: 30650

  activateEnterpriseMember: false # connects to an existing enterprise server.
  activateAuth: true # bootstraps auth via the config job.
  enterpriseLicenseKey: "" # activates enterprise if provided. 
  enterpriseLicenseKeySecretName: "" # pulls value from k8s secret key "enterprise-license-key"
  rootToken: "" # autogenerated if not provided; stored in k8s secret "pachyderm-bootstrap-config.rootToken"
  rootTokenSecretName: "" # passes the rootToken value from k8s secret key "root-token"
  enterpriseSecret: "" # autogenerated if not provided; stored in k8s secret "pachyderm-bootstrap-config.enterpriseSecret"
  enterpriseSecretSecretName: "" # passes value from k8s secret key "enterprise-secret"
  oauthClientID: pachd
  oauthClientSecret: "" # autogenerated if not provided; stored in k8s secret "pachyderm-bootstrap-config.authConfig.clientSecret"
  oauthClientSecretSecretName: ""  # passes value from k8s secret key "pachd-oauth-client-secret"
  oauthRedirectURI: ""
  enterpriseServerToken: "" # authenticates to an enterprise server & registers this cluster as a member if activateEnterpriseMember is true.
  enterpriseServerTokenSecretName: "" # passes value from k8s secret key "enterprise-server-token" if activateEnterpriseMember is true. 
  enterpriseServerAddress: ""
  enterpriseCallbackAddress: ""
  localhostIssuer: "" # Indicates to pachd whether dex is embedded in its process; "true", "false", or ""
  pachAuthClusterRoleBindings: {} # map initial users to their list of roles.
  
  #   robot:wallie:
  #   - repoReader
  #   robot:eve:
  #   - repoWriter
 
  additionalTrustedPeers: [] # configures identity service to recognize trusted peers.

  #   - example-app

  serviceAccount:
    create: true
    additionalAnnotations: {}
    name: "pachyderm" 

  storage:
    backend: "" # options: GOOGLE, AMAZON, MINIO, MICROSOFT or LOCAL
    amazon:
      bucket: "" # sets the S3 bucket to use.
      cloudFrontDistribution: "" # sets the CloudFront distribution in the storage secrets. 
      customEndpoint: ""
      disableSSL: false
      id: "" #  sets the Amazon access key ID
      logOptions: "" # case-sensitive comma-separated list: 'Debug', 'Signing', 'HTTPBody', 'RequestRetries', 'EventStreamBody', or 'all'
      maxUploadParts: 10000
      verifySSL: true
      partSize: "5242880" # sets part size for object storage uploads; must be a string.
      region: "" # sets AWS region
      retries: 10
      reverse: true
      secret: ""  # sets the Amazon secret access key to use.
      timeout: "5m" #  sets the timeout for object storage requests.
      token: "" # sets the Amazon token to use.
      uploadACL: "bucket-owner-full-control" 
    google:
      bucket: ""
      cred: ""  # sets GCP service account private key as string. 

      # cred: |
      #  {
      #    "type": "service_account",
      #    "project_id": "…",
      #    "private_key_id": "…",
      #    "private_key": "-----BEGIN PRIVATE KEY-----\n…\n-----END PRIVATE KEY-----\n",
      #    "client_email": "…@….iam.gserviceaccount.com",
      #    "client_id": "…",
      #    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
      #    "token_uri": "https://oauth2.googleapis.com/token",
      #    "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
      #    "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/…%40….iam.gserviceaccount.com"
      #  }

    local:
      hostPath: "" # path where PFS metadata is stored; must end with "/".
      requireRoot: true # root required for hostpath, but we run rootless in CI
    microsoft:
      container: ""
      id: ""
      secret: ""
    minio:
      bucket: "" # sets bucket name. 
      endpoint: "" # format: hostname:port
      id: "" # username/id with readwrite access to the bucket.
      secret: "" # the secret/password of the user with readwrite access to the bucket.
      secure: "false" # enables https for minio if "true"
      signature: "" # set to "1" to enable S3v2 signatures; S3v2 support is being deprecated.
    putFileConcurrencyLimit: 100 # sets the maximum number of files to upload or fetch from remote sources; analogous to the --put-file-concurrency-limit argument to pachctl
    uploadConcurrencyLimit: 100 # sets the maximum number of concurrent object storage uploads per pachd instance; analogous to the --upload-concurrency-limit argument to pachctl
    compactionShardSizeThreshold: 0 # the total size of the files in a shard.
    compactionShardCountThreshold: 0 # the total number of files in a shard.
    memoryThreshold: 0
    levelFactor: 0
    maxFanIn: 10
    maxOpenFileSets: 50
    # diskCacheSize and memoryCacheSize are defined in units of 8 MB chunks. The default is 100 chunks, which is 800 MB.
    diskCacheSize: 100
    memoryCacheSize: 100

  ppsWorkerGRPCPort: 1080
  storageGCPeriod: 0 # the number of seconds between PFS's garbage collection cycles; <0 disables garbage collection; 0 defaults to pachyderm's internal config.
  storageChunkGCPeriod: 0 # the number of seconds between chunk garbage collection cycles; <0 disables chunk garbage collection; 0 defaults to pachyderm's internal config.
  # There are three options for TLS:
  # 1. Disabled
  # 2. Enabled, existingSecret, specify secret name
  # 3. Enabled, newSecret, must specify cert, key and name
  tls:
    enabled: false
    secretName: ""
    newSecret:
      create: false
      crt: ""
      key: ""
  tolerations: []
  worker:
    image:
      repository: "pachyderm/worker"
      pullPolicy: "IfNotPresent"
      # Worker tag is set under pachd.image.tag (they should be kept in lock step)
    serviceAccount:
      create: true
      additionalAnnotations: {}
      name: "pachyderm-worker"  # sets the name of the worker service account; analogous to --worker-service-account argument to pachctl.
  rbac:
    create: true # indicates whether RBAC resources should be created; analogous to --no-rbac to pachctl
  # Set up default resources for pipelines that don't include any requests or limits.  The values
  # are k8s resource quantities, so "1Gi", "2", etc.  Set to "0" to disable setting any defaults.
  defaultPipelineCPURequest: ""
  defaultPipelineMemoryRequest: ""
  defaultPipelineStorageRequest: ""
  defaultSidecarCPURequest: ""
  defaultSidecarMemoryRequest: ""
  defaultSidecarStorageRequest: ""
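
Tying several of the values above together, a sketch of registering this cluster as a member of an existing enterprise server. The address and secret name are placeholders; per the comments above, the token is read from the "enterprise-server-token" key of the named secret:

pachd:
  activateAuth: true
  activateEnterpriseMember: true
  enterpriseServerAddress: "grpc://enterprise-server.example.com:31650" # placeholder address
  enterpriseServerTokenSecretName: "enterprise-server-token-secret" # placeholder secret; token under key "enterprise-server-token"
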
PachW HCVs

About

PachW enables fine-grained control of where compaction and object-storage interaction occur by running storage tasks in a dedicated Kubernetes deployment. Users can configure PachW’s min and max replicas as well as define nodeSelectors, tolerations, and resource requests. Using PachW allows power users to save on costs by claiming fewer resources and running storage tasks on less expensive nodes.

⚠️

If you are upgrading to 2.5.0+ for the first time and you wish to use PachW, you must calculate how many maxReplicas you need. By default, PachW is set to maxReplicas: 1; however, that is not sufficient for production runs.

maxReplicas

You should set the maxReplicas value to at least match the number of pipeline replicas that you have. For high performance, we suggest taking the following approach:

number of pipelines * highest parallelism spec * 1.5 = maxReplicas

Let’s say you have 6 pipelines. One of these pipelines has a parallelism spec value of 6, and the rest are 5 or fewer.

6 * 6 * 1.5 = 54, so you would set maxReplicas to 54.

minReplicas

Workloads that constantly process storage and compaction tasks because they are committing rapidly may want to increase minReplicas to have instances on standby.

nodeSelectors

Workloads that utilize GPUs and other expensive resources may want to add a node selector to scope PachW instances to less expensive nodes.

Values

Options:
pachw:
  inheritFromPachd: true # defaults configuration options below, such as 'resources' and 'tolerations', to the values set for pachd
  maxReplicas: 1
  minReplicas: 0
  inSidecars: false
  #tolerations: []
  #affinity: {}
  #nodeSelector: {}
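
Applying the sizing guidance above to the six-pipeline example, a sketch might keep a couple of instances on standby and pin PachW to cheaper nodes (the node label is a placeholder):

pachw:
  inheritFromPachd: true
  minReplicas: 2
  maxReplicas: 54 # 6 pipelines * highest parallelism of 6 * 1.5
  nodeSelector:
    node-pool: "cpu-standard" # placeholder label for an inexpensive node pool
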
Kube Event Tail HCVs

About

Kube Event Tail deploys a lightweight app that watches Kubernetes events and echoes them into logs.

Values

The following are commonly used configurations for this section of your values.yaml Helm chart.

Options:
kubeEventTail:
  enabled: true
  clusterScope: false # if true, watches events for the entire cluster; if false, watches only events in its own namespace
  image:
    repository: pachyderm/kube-event-tail
    pullPolicy: "IfNotPresent"
    tag: "v0.0.6"
  resources:
    limits:
      cpu: "1"
      memory: 100Mi
    requests:
      cpu: 100m
      memory: 45Mi 
PGBouncer HCVs

About

The PGBouncer section configures a PGBouncer Postgres connection pooler.

Values

The following are commonly used configurations for this section of your values.yaml Helm chart.


pgbouncer:
  service:
    type: ClusterIP # defines the Kubernetes service type.
  annotations: {}
  priorityClassName: ""
  nodeSelector: {}
  tolerations: []
  image:
    repository: pachyderm/pgbouncer
    tag: 1.16.1-debian-10-r82
  resources: # defines resources in standard kubernetes format; unset by default.
    {}

    #limits:
    #  cpu: "1"
    #  memory: "2G"
    #requests:
    #  cpu: "1"
    #  memory: "2G"

  maxConnections: 10000 # defines the maximum number of concurrent connections into pgbouncer.
  defaultPoolSize: 80 # specifies the maximum number of concurrent connections from pgbouncer to the postgresql database.
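
For example, a sketch that raises the pool size for a busier cluster. Since defaultPoolSize governs pgbouncer-to-PostgreSQL connections, keep it below your database server's max_connections:

pgbouncer:
  maxConnections: 20000 # client connections into pgbouncer
  defaultPoolSize: 100  # concurrent connections from pgbouncer to postgresql
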
PostgreSQL Subchart HCVs

About

The PostgreSQL section controls the Bitnami PostgreSQL subchart. Pachyderm runs on Kubernetes, is backed by an object store of your choice, and comes with a bundled version of PostgreSQL (for metadata storage) by default.

We recommend disabling this bundled PostgreSQL and using a managed database instance (such as RDS, CloudSQL, or PostgreSQL Server) for production environments.

See storage class details for your provider:

  • AWS | Min: 500Gi (GP2) / 1,500 IOPS
  • GCP | Min: 50Gi / 1,500 IOPS
  • Azure | Min: 256Gi / 1,100 IOPS

Values

The following are commonly used configurations for this section of your values.yaml Helm chart.

Options:
postgresql:
  enabled: false # if false, you must specify a PostgreSQL database server connection under global.postgresql
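
For example, a sketch of disabling the bundled subchart in favor of a managed instance. The host is a placeholder (e.g., an RDS or CloudSQL endpoint), and credentials are supplied under global.postgresql as in the Global HCVs section:

postgresql:
  enabled: false

global:
  postgresql:
    postgresqlHost: "pachyderm-db.example.com" # placeholder managed database endpoint
    postgresqlUsername: "pachyderm"
    postgresqlExistingSecretName: "postgres-creds" # placeholder secret name
    postgresqlExistingSecretKey: "password"
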
CloudSQL Auth Proxy HCVs

About

The CloudSQL Auth Proxy section configures the CloudSQL Auth Proxy for deploying Pachyderm on GCP with CloudSQL.

Values

The following are commonly used configurations for this section of your values.yaml Helm chart.

Options:

cloudsqlAuthProxy:
  connectionName: "" # may be found by running `gcloud sql instances describe INSTANCE_NAME --project PROJECT_ID`
  serviceAccount: ""   #  defines the account used to connect to the cloudSql instance
  iamLogin: false
  port: 5432   # the cloudsql database port to expose; the default is `5432`
  enabled: true # controls whether to deploy the cloudsqlAuthProxy; the default is false.
  image:
    repository: "gcr.io/cloudsql-docker/gce-proxy" # the image repo to pull from; replicates --registry to pachctl
    pullPolicy: "IfNotPresent"
    tag: "1.23.0" # the image repo to pull from; replicates the --dash-image argument to pachctl deploy.
  priorityClassName: ""
  nodeSelector: {}
  tolerations: []
  podLabels: {}  # specifies labels to add to the cloudsql auth proxy pod.
  resources: {} # specifies the resource request and limits.

  #  requests:
  #    # The proxy's memory use scales linearly with the number of active
  #    # connections. Fewer open connections will use less memory. Adjust
  #    # this value based on your application's requirements.
  #    memory: ""
  #    # The proxy's CPU use scales linearly with the amount of IO between
  #    # the database and the application. Adjust this value based on your
  #    # application's requirements.
  #    cpu: ""

  service:
    labels: {} #  specifies labels to add to the cloudsql auth proxy service.
    type: ClusterIP # specifies the Kubernetes type of the cloudsql auth proxy service. The default is `ClusterIP`.
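
For example, a sketch for a GCP deployment. The connection name follows the project:region:instance format returned by `gcloud sql instances describe`, and the service account is a placeholder; pointing global.postgresql at the proxy's in-cluster service name is an assumption to verify against your deployment:

cloudsqlAuthProxy:
  enabled: true
  connectionName: "my-project:us-central1:pachyderm-db" # placeholder; format project:region:instance
  serviceAccount: "pachyderm@my-project.iam.gserviceaccount.com" # placeholder service account
  port: 5432

global:
  postgresql:
    postgresqlHost: "cloudsql-auth-proxy" # assumed in-cluster service name; verify for your chart version
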
OpenID Connect HCVs

About

The OIDC section of the helm chart enables you to set up authentication through upstream IDPs. To use authentication, you must have an Enterprise license.

We recommend setting up this section alongside the Enterprise Server section of your Helm chart so that you can easily scale multiple clusters using the same authentication configurations.

Values

The following are commonly used configurations for this section of your values.yaml Helm chart.

Options:
oidc:
  issuerURI: "" # inferred if running locally or using proxy
  requireVerifiedEmail: false # if true, email verification is required to authenticate
  IDTokenExpiry: 24h # if set, specifies the duration for which OIDC ID tokens are valid; parsed into golang's time.Duration: https://pkg.go.dev/time#example-ParseDuration
  RotationTokenExpiry: 48h # if set, enables OIDC rotation tokens and specifies the duration for which they are valid.
  userAccessibleOauthIssuerHost: "" # (optional) only set in cases where the issuerURI is not user-accessible (i.e., a localhost install)
  mockIDP: true # if true, ignores upstreamIDPs in favor of a placeholder IDP with the username/password of "admin"/"password"
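
For example, a sketch wiring in an upstream OIDC IDP. The upstreamIDPs entries follow the Dex connector format used by Pachyderm's identity service; the field names and callback path shown are assumptions to verify against your chart version, and all hostnames and credentials are placeholders:

oidc:
  mockIDP: false
  upstreamIDPs:
    - id: "auth0" # placeholder connector id
      name: "auth0"
      type: "oidc"
      jsonConfig: '{"issuer":"https://my-tenant.auth0.com/","clientID":"<client-id>","clientSecret":"<client-secret>","redirectURI":"https://pachyderm.example.com/dex/callback"}'
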
Test Connection HCVs

About

The Test Connection section defines the image Pachyderm uses to test the connection during installation. This config is for organizations that do not have permission to pull Docker images directly from the Internet and instead need to mirror them locally.

Values

The following are commonly used configurations for this section of your values.yaml Helm chart.

testConnection:
  image:
    repository: alpine
    tag: latest
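
For example, a sketch pointing the test image at an internal mirror (the registry host is a placeholder):

testConnection:
  image:
    repository: "registry.example.com/mirror/alpine" # placeholder internal mirror
    tag: latest
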
Proxy HCVs

About

Proxy is a service that handles all Pachyderm traffic (S3, Console, OIDC, Dex, GRPC) on a single port; it's great for exposing your cluster directly to the Internet.

Values


# The proxy is a service to handle all Pachyderm traffic (S3, Console, OIDC, Dex, GRPC) on a single
# port; good for exposing directly to the Internet.
proxy:
  # If enabled, create a proxy deployment (based on the Envoy proxy) and a service to expose it.  If
  # ingress is also enabled, any Ingress traffic will be routed through the proxy before being sent
  # to pachd or Console.
  enabled: true
  # The external hostname (including port if nonstandard) that the proxy will be reachable at.
  # If you have ingress enabled and an ingress hostname defined, the proxy will use that.
  # Ingress will be deprecated in the future so configuring the proxy host instead is recommended.
  host: ""
  # The number of proxy replicas to run.  1 should be fine, but if you want more for higher
  # availability, that's perfectly reasonable.  Each replica can handle 50,000 concurrent
  # connections.  There is an affinity rule to prefer scheduling the proxy pods on the same node as
  # pachd, so a number here that matches the number of pachd replicas is a fine configuration.
  # (Note that we don't guarantee to keep the proxy<->pachd traffic on-node or even in-region.)
  replicas: 1
  # The envoy image to pull.
  image:
    repository: "envoyproxy/envoy-distroless"
    tag: "v1.27.1"
    pullPolicy: "IfNotPresent"
  # Set up resources.  The proxy is configured to shed traffic before using 500MB of RAM, so that's
  # a reasonable memory limit.  It doesn't need much CPU.
  resources:
    requests:
      cpu: 100m
      memory: 512Mi
    limits:
      memory: 512Mi
  # Any additional labels to add to the pods.  These are also added to the deployment and service
  # selectors.
  labels: {}
  # Any additional annotations to add to the pods.
  annotations: {}
  # A nodeSelector statement for each pod in the proxy Deployment, if desired.
  nodeSelector: {}
  # A tolerations statement for each pod in the proxy Deployment, if desired.
  tolerations: []
  # A priority class name for each pod in the proxy Deployment, if desired.
  priorityClassName: ""
  # Configure the service that routes traffic to the proxy.
  service:
    # The type of service can be ClusterIP, NodePort, or LoadBalancer.
    type: ClusterIP
    # If the service is a LoadBalancer, you can specify the IP address to use.
    loadBalancerIP: ""
    # The port to serve plain HTTP traffic on.
    httpPort: 80
    # The port to serve HTTPS traffic on, if enabled below.
    httpsPort: 443
    # If the service is a NodePort, you can specify the port to receive HTTP traffic on.
    httpNodePort: 30080
    httpsNodePort: 30443
    # Any additional annotations to add.
    annotations: {}
    # Any additional labels to add to the service itself (not the selector!).
    labels: {}
    # The proxy can also serve each backend service on a numbered port, and will do so for any port
    # not numbered 0 here.  If this service is of type NodePort, the port numbers here will be used
    # for the node port, and will need to be in the node port range.
    legacyPorts:
      console: 0 # legacy 30080, conflicts with default httpNodePort
      grpc: 0 # legacy 30650
      s3Gateway: 0 # legacy 30600
      oidc: 0 # legacy 30657
      identity: 0 # legacy 30658
      metrics: 0 # legacy 30656
    # externalTrafficPolicy determines cluster-wide routing policy; see "kubectl explain
    # service.spec.externalTrafficPolicy".
    externalTrafficPolicy: ""
  # Configuration for TLS (SSL, HTTPS).
  tls:
    # If true, enable TLS serving.  Enabling TLS is incompatible with support for legacy ports (you
    # can't get a generally-trusted certificate for port numbers), and disables support for
    # cleartext communication (cleartext requests will redirect to the secure server, and HSTS
    # headers are set to prevent downgrade attacks).
    #
    # Note that if you are planning on putting the proxy behind an ingress controller, you probably
    # want to configure TLS for the ingress controller, not the proxy.  This is intended for the
    # case where the proxy is exposed directly to the Internet.  (It is possible to have your
    # ingress controller talk to the proxy over TLS, in which case, it's fine to enable TLS here in
    # addition to in the ingress section above.)
    enabled: false
    # The secret containing "tls.key" and "tls.crt" keys that contain PEM-encoded private key and
    # certificate material.  Generate one with "kubectl create secret tls <name> --key=tls.key
    # --cert=tls.cert".  This format is compatible with the secrets produced by cert-manager, and
    # the proxy will pick up new data when cert-manager rotates the certificate.
    secretName: ""
    # If set, generate the secret from values here.  This is intended only for unit tests.
    secret: {}
preflightCheckJob:
  # If true, install a Kubernetes job that runs preflight checks from the configured Pachyderm
  # release.
  enabled: false

  # The version to preflight.  It is totally fine if this is newer than the currently-running pachd
  # version.
  image:
    repository: "pachyderm/pachd"
    pullPolicy: "IfNotPresent"
    tag: ""

  # misc k8s settings
  affinity: {}
  annotations: {}
  resources:
    {}
    #limits:
    #  cpu: "1"
    #  memory: "2G"
    #requests:
    #  cpu: "1"
    #  memory: "2G"
  priorityClassName: ""
  podLabels: {}
  nodeSelector: {}
  tolerations: []

  # logging settings
  sqlQueryLogs: false
  disableLogSampling: false
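
Putting these pieces together, a sketch of a proxy exposed directly to the Internet behind a LoadBalancer with TLS enabled. The hostname and secret name are placeholders, and the secret uses the tls.key/tls.crt format described above:

proxy:
  enabled: true
  host: "pachyderm.example.com" # placeholder external hostname
  service:
    type: LoadBalancer
  tls:
    enabled: true
    secretName: "pachyderm-proxy-tls" # placeholder; cert-manager-compatible TLS secret
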
