GKE on Google Cloud Platform Deployment

OpenMetadata can be installed and run on Google Kubernetes Engine (GKE) through Helm charts. A few additional configurations must be completed first as prerequisites.
GKE Autopilot mode is not compatible with one of OpenMetadata's dependencies, ElasticSearch. ElasticSearch pods require elevated permissions to run initContainers that change configurations, which the GKE Autopilot PodSecurityPolicy does not allow.
All code snippets in this section assume the default Kubernetes namespace.

Prerequisites

Cloud Database with CloudSQL and ElasticCloud for GCP as Search Engine

For production, we recommend GCP Cloud SQL for the database and Elastic Cloud on GCP for the search engine. We support:
  • Cloud SQL (MySQL) engine version 8 or higher
  • Cloud SQL (PostgreSQL) engine version 12 or higher
  • ElasticSearch engine version 8.X (up to 8.10.X)
We recommend:
  • Cloud SQL configured for multi-zone availability
  • An Elastic Cloud environment spanning multiple zones with a minimum of 2 nodes
Make sure to increase sort_buffer_size (for MySQL) or work_mem (for PostgreSQL) to the recommended value of 20MB or more using flags. This is especially important when running migrations to prevent Out of Sort Memory Error. You can revert the setting once the migrations are complete.
Once the database and search engine are configured and available, update the Helm values below so that the OpenMetadata Kubernetes deployment can connect to the database and ElasticSearch.
# openmetadata-values.prod.yaml
...
openmetadata:
  config:
    elasticsearch:
      host: <ELASTIC_CLOUD_SERVICE_ENDPOINT_WITHOUT_HTTPS>
      searchType: elasticsearch
      port: 443
      scheme: https
      connectionTimeoutSecs: 5
      socketTimeoutSecs: 60
      keepAliveTimeoutSecs: 600
      batchSize: 10
      auth:
        enabled: true
        username: <ELASTIC_CLOUD_USERNAME>
        password:
          secretRef: elasticsearch-secrets
          secretKey: openmetadata-elasticsearch-password
    database:
      host: <GCP_CLOUD_SQL_ENDPOINT_IP>
      port: 3306
      driverClass: com.mysql.cj.jdbc.Driver
      dbScheme: mysql
      dbUseSSL: true
      databaseName: <GCP_CLOUD_SQL_DATABASE_NAME>
      auth:
        username: <GCP_CLOUD_SQL_DATABASE_USERNAME>
        password:
          secretRef: mysql-secrets
          secretKey: openmetadata-mysql-password
  ...
For a PostgreSQL database, use the following database values instead:
# openmetadata-values.prod.yaml
...
openmetadata:
  config:
    ...
    database:
      host: <GCP_CLOUD_SQL_ENDPOINT_IP>
      port: 5432
      driverClass: org.postgresql.Driver
      dbScheme: postgresql
      dbUseSSL: true
      databaseName: <GCP_CLOUD_SQL_DATABASE_NAME>
      auth:
        username: <GCP_CLOUD_SQL_DATABASE_USERNAME>
        password:
          secretRef: sql-secrets
          secretKey: openmetadata-sql-password
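The secretRef and secretKey entries in the values above point to Kubernetes Secrets that must already exist in the cluster. As a minimal sketch, assuming the default namespace and the MySQL configuration, the Secrets could be declared as below. The Secret names and keys are taken from the values above; the password placeholders are hypothetical and are not part of the original guide.

```yaml
# Sketch only: names and keys mirror the secretRef/secretKey values above.
# The placeholder passwords are hypothetical and must be replaced.
apiVersion: v1
kind: Secret
metadata:
  name: elasticsearch-secrets
  namespace: default
type: Opaque
stringData:
  # stringData accepts plain text; Kubernetes stores it base64-encoded
  openmetadata-elasticsearch-password: "<ELASTIC_CLOUD_PASSWORD>"
---
apiVersion: v1
kind: Secret
metadata:
  name: mysql-secrets
  namespace: default
type: Opaque
stringData:
  openmetadata-mysql-password: "<GCP_CLOUD_SQL_DATABASE_PASSWORD>"
```

A manifest like this can be applied with kubectl apply -f. For the PostgreSQL values shown above, the second Secret would instead be named sql-secrets with the key openmetadata-sql-password.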
Make sure to create the Cloud SQL and ElasticSearch credentials as Kubernetes Secrets, as mentioned here. Also, disable MySQL and ElasticSearch in the OpenMetadata Dependencies Helm chart, as mentioned in the FAQs here.
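Disabling the bundled MySQL and ElasticSearch typically comes down to a small values override for the dependencies chart. The sketch below is an assumption rather than the chart's confirmed schema: the file name is hypothetical, and the mysql.enabled and elasticsearch.enabled keys may differ between chart versions (some versions bundle OpenSearch under an opensearch key instead), so check the values.yaml of the chart version you deploy.

```yaml
# openmetadata-dependencies-values.prod.yaml (hypothetical file name)
# Assumed keys: verify them against the values.yaml of your chart version.
mysql:
  enabled: false        # Cloud SQL replaces the bundled MySQL
elasticsearch:
  enabled: false        # Elastic Cloud replaces the bundled search engine
```

The file would then be passed with the --values flag when installing or upgrading the dependencies chart.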
For the Airflow orchestrator NFS setup, persistent volume configuration, permissions, and troubleshooting, see the GKE Airflow Orchestrator guide.