This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Configuration

Learn how to configure the extensions for Apache Iceberg.

    Here, you will find the configuration properties for the Apache Iceberg extension.

    Configuration

    The Apache Iceberg extension connects to an Iceberg catalog. You configure it through the Jikkou client configuration property jikkou.provider.iceberg.

    Example (JDBC catalog with PostgreSQL):

    jikkou {
      provider.iceberg {
        enabled = true
        type = io.streamthoughts.jikkou.iceberg.IcebergExtensionProvider
        config = {
          # Required — type of Iceberg catalog.
          # Accepted values: rest, hive, jdbc, glue, nessie, hadoop
          catalogType = "jdbc"
    
          # The catalog name used to identify this catalog instance.
          catalogName = "default"
    
          # The URI of the catalog endpoint (REST URL, Hive Metastore URI, JDBC URL, etc.)
          catalogUri = "jdbc:postgresql://localhost:5432/iceberg"
    
          # The warehouse root location (e.g., local path, S3 bucket path, HDFS path)
          warehouse = "/tmp/iceberg-warehouse"
    
          # Extra catalog-specific properties passed directly to CatalogUtil.buildIcebergCatalog()
          catalogProperties {
            jdbc.user     = "iceberg"
            jdbc.password = "iceberg"
          }
    
          # Enable verbose debug logging for catalog operations (default: false)
          debugLoggingEnabled = false
        }
      }
    }
    

    Configuration Properties

    PropertyTypeRequiredDefaultDescription
    catalogTypeStringyesIceberg catalog type: rest, hive, jdbc, glue, nessie, hadoop
    catalogNameStringnodefaultThe catalog instance name
    catalogUriStringnoCatalog endpoint URI (REST API URL, Hive Metastore thrift URI, JDBC URL, Nessie server URL)
    warehouseStringnoWarehouse root location (e.g., s3://bucket/warehouse, /tmp/iceberg)
    catalogPropertiesMapnoAdditional catalog properties forwarded verbatim to the Iceberg CatalogUtil
    debugLoggingEnabledBooleannofalseEnable debug-level logging for catalog operations

    Catalog Types

    JDBC Catalog (PostgreSQL)

    Stores catalog metadata (namespaces, table specs) in a relational database. The PostgreSQL JDBC driver is bundled in the Jikkou CLI distribution.

    config = {
      catalogType = "jdbc"
      catalogUri  = "jdbc:postgresql://localhost:5432/iceberg"
      warehouse   = "/tmp/iceberg-warehouse"
      catalogProperties {
        jdbc.user     = "iceberg"
        jdbc.password = "iceberg"
      }
    }
    

    REST Catalog

    Connects to any Iceberg REST Catalog API (e.g., Polaris, Gravitino, Unity Catalog):

    config = {
      catalogType = "rest"
      catalogUri  = "https://polaris.example.com/api/catalog"
      warehouse   = "s3://my-bucket/warehouse"
      catalogProperties {
        rest.signing-name   = "execute-api"
        rest.signing-region = "us-east-1"
      }
    }
    

    Hive Metastore

    Connects to an Apache Hive Metastore (requires iceberg-hive-metastore on the classpath):

    config = {
      catalogType = "hive"
      catalogUri  = "thrift://hive-metastore:9083"
      warehouse   = "hdfs://namenode:8020/user/hive/warehouse"
    }
    

    AWS Glue

    Connects to AWS Glue Data Catalog (requires iceberg-aws on the classpath):

    config = {
      catalogType = "glue"
      warehouse   = "s3://my-bucket/warehouse"
      catalogProperties {
        glue.region = "us-east-1"
      }
    }
    

    Nessie

    Nessie exposes a standard Iceberg REST catalog endpoint at /iceberg. Using catalogType = "rest" is recommended because it relies only on iceberg-core (always bundled in the Jikkou CLI). The catalogType = "nessie" variant requires the optional iceberg-nessie JAR on the classpath.

    # Recommended: use Nessie's built-in Iceberg REST endpoint
    config = {
      catalogType = "rest"
      catalogUri  = "http://nessie:19120/iceberg"
      warehouse   = "s3://my-bucket/warehouse"
      catalogProperties {
        prefix = "main"   # Nessie branch
      }
    }
    

    Controller Settings

    The table and view controllers expose additional options to control reconciliation behaviour. These are set inside the provider config block.

    Table Controller

    PropertyTypeDefaultDescription
    delete-orphansBooleanfalseDrop tables that exist in the catalog but are not defined in any resource
    delete-orphan-columnsBooleanfalseDrop columns present in the live table but absent from the spec
    delete-purgeBooleanfalsePurge underlying data files when dropping a table (irreversible)
    tables.deletion.excludeList<Pattern>[]Regex patterns — matching table names are never deleted

    View Controller

    PropertyTypeDefaultDescription
    delete-orphansBooleanfalseDrop views that exist in the catalog but are not defined in any resource
    views.deletion.excludeList<Pattern>[]Regex patterns — matching view names are never deleted

    Example:

    jikkou {
      provider.iceberg {
        enabled = true
        type = io.streamthoughts.jikkou.iceberg.IcebergExtensionProvider
        config = {
          catalogType = "rest"
          catalogUri  = "http://localhost:8181"
          warehouse   = "s3://my-bucket/warehouse"
    
          # Table reconciliation safety settings
          delete-orphans        = false
          delete-orphan-columns = false
          delete-purge          = false
    
          # Never delete tables whose name starts with "audit_"
          tables.deletion.exclude = ["^audit_.*"]
    
          # Never delete views whose name starts with "v_core_"
          views.deletion.exclude = ["^v_core_.*"]
        }
      }
    }