Configuring Standalone NucliaDB

Every NucliaDB setting can be configured with either an environment variable or a CLI argument.
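The naming pattern used throughout this page (for example FILE_BACKEND and --file-backend) can be sketched in Python. This is an illustration only: the helper names env_to_cli_flag and to_cli_args are hypothetical, and the sketch assumes each flag takes its value as a separate argument, which this page does not state.

```python
import os


def env_to_cli_flag(name: str) -> str:
    """Convert a setting name such as FILE_BACKEND to its CLI form, --file-backend."""
    return "--" + name.lower().replace("_", "-")


def to_cli_args(settings: dict[str, str]) -> list[str]:
    """Flatten a settings mapping into a flat list of CLI arguments.

    Assumption: every flag is followed by its value as a separate token.
    """
    args: list[str] = []
    for name, value in settings.items():
        args += [env_to_cli_flag(name), str(value)]
    return args


settings = {
    "DRIVER": "local",
    "DRIVER_LOCAL_URL": "/nucliadb/data/main",
    "HTTP_PORT": "8080",
}

# Option 1: hand the settings to a child process through its environment.
child_env = {**os.environ, **settings}

# Option 2: express the same settings as CLI arguments.
cli_args = to_cli_args(settings)
```

Either child_env (as the env argument of subprocess.run) or cli_args (appended to the command) could then be used when launching the server.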

FILE_BACKEND

Type: Enum: (gcs, s3, local)

CLI Arg: --file-backend

Default: None

Description: File backend storage type

GCS_BASE64_CREDS

Type: str

CLI Arg: --gcs-base64-creds

Default: None

Description: GCS JSON credentials of a service account encoded in Base64: https://cloud.google.com/iam/docs/service-account-overview
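A minimal sketch of producing a value for GCS_BASE64_CREDS, assuming standard (not URL-safe) Base64 is expected. The credentials below are a shortened placeholder, not a real service account key, and encode_credentials is a hypothetical helper name.

```python
import base64
import json


def encode_credentials(creds_json: str) -> str:
    """Return the Base64 encoding of a service account JSON document."""
    json.loads(creds_json)  # fail early if the input is not valid JSON
    return base64.b64encode(creds_json.encode("utf-8")).decode("ascii")


# Placeholder content; a real key file has many more fields.
sample = json.dumps({"type": "service_account", "project_id": "example-project"})
encoded = encode_credentials(sample)
```

In practice the JSON string would be read from the key file downloaded from Google Cloud rather than built inline.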

GCS_BUCKET

Type: str

CLI Arg: --gcs-bucket

Default: None

Description: GCS Bucket name where files are stored: https://cloud.google.com/storage/docs/buckets

GCS_LOCATION

Type: str

CLI Arg: --gcs-location

Default: None

Description: GCS Bucket location: https://cloud.google.com/storage/docs/locations

GCS_PROJECT

Type: str

CLI Arg: --gcs-project

Default: None

Description: Google Cloud Project ID: https://cloud.google.com/resource-manager/docs/creating-managing-projects

GCS_BUCKET_LABELS

Type: str

CLI Arg: --gcs-bucket-labels

Default: {}

Description: Map of labels to apply to GCS buckets: https://cloud.google.com/storage/docs/tags-and-labels

GCS_ENDPOINT_URL

Type: str

CLI Arg: --gcs-endpoint-url

Default: https://www.googleapis.com

S3_CLIENT_ID

Type: str

CLI Arg: --s3-client-id

Default: None

S3_CLIENT_SECRET

Type: str

CLI Arg: --s3-client-secret

Default: None

S3_SSL

Type: bool

CLI Arg: --s3-ssl

Default: True

S3_VERIFY_SSL

Type: bool

CLI Arg: --s3-verify-ssl

Default: True

S3_MAX_POOL_CONNECTIONS

Type: int

CLI Arg: --s3-max-pool-connections

Default: 30

S3_ENDPOINT

Type: str

CLI Arg: --s3-endpoint

Default: None

S3_REGION_NAME

Type: str

CLI Arg: --s3-region-name

Default: None

S3_BUCKET

Type: str

CLI Arg: --s3-bucket

Default: None

LOCAL_FILES

Type: str

CLI Arg: --local-files

Default: None

Description: Directory where files are stored when FILE_BACKEND is set to local

UPLOAD_TOKEN_EXPIRATION

Type: int

CLI Arg: --upload-token-expiration

Default: 3

Description: Number of days that uploaded files are kept in Nuclia's processing engine

DRIVER_PG_URL

Type: str

CLI Arg: --driver-pg-url

Default: None

Description: PostgreSQL DSN. The connection string to the PG server. Example: postgres://nucliadb:nucliadb@postgres:5432/nucliadb. See the complete PostgreSQL documentation: https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING
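To check a DSN before starting the server, its parts can be inspected with Python's standard library. This sketch parses the example DSN above; describe_dsn is a hypothetical helper, and the password is deliberately left out of the result.

```python
from urllib.parse import urlsplit


def describe_dsn(dsn: str) -> dict:
    """Split a PostgreSQL URL-style DSN into its main components."""
    parts = urlsplit(dsn)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


info = describe_dsn("postgres://nucliadb:nucliadb@postgres:5432/nucliadb")
```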

DRIVER

Type: Enum: (redis, tikv, pg, local)

CLI Arg: --driver

Default: None

Description: K/V storage driver

DRIVER_REDIS_URL

Type: str

CLI Arg: --driver-redis-url

Default: None

Description: Redis URL. Example: redis://localhost:6379

DRIVER_TIKV_URL

Type: str

CLI Arg: --driver-tikv-url

Default: None

Description: TiKV PD (Placement Driver) URL. The URL to the cluster manager of TiKV. Example: tikv-pd.svc:2379

DRIVER_LOCAL_URL

Type: str

CLI Arg: --driver-local-url

Default: None

Description: Local path to store data on file system. Example: /nucliadb/data/main
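The pairing between DRIVER and the DRIVER_*_URL settings can be checked before startup. The sketch below assumes, based only on the setting names, that each driver needs its matching URL setting; missing_driver_settings is a hypothetical helper, not part of NucliaDB.

```python
# Assumed pairing, inferred from the setting names on this page.
DRIVER_URL_SETTING = {
    "redis": "DRIVER_REDIS_URL",
    "tikv": "DRIVER_TIKV_URL",
    "pg": "DRIVER_PG_URL",
    "local": "DRIVER_LOCAL_URL",
}


def missing_driver_settings(env: dict[str, str]) -> list[str]:
    """Return the names of driver settings that still need a value."""
    driver = env.get("DRIVER")
    if driver is None:
        return ["DRIVER"]
    if driver not in DRIVER_URL_SETTING:
        raise ValueError(f"unknown driver: {driver!r}")
    url_setting = DRIVER_URL_SETTING[driver]
    return [] if env.get(url_setting) else [url_setting]


missing = missing_driver_settings({"DRIVER": "pg"})
```

Passing os.environ instead of a literal dict would run the same check against the current process environment.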

DATA_PATH

Type: str

CLI Arg: --data-path

Default: ./data/node

Description: Path to node index files

NUA_API_KEY

Type: str

CLI Arg: --nua-api-key

Default: None

Description: Nuclia Understanding API Key. Read how to generate a NUA Key here: https://docs.nuclia.dev/docs/guides/using/understanding/intro#get-a-nua-key

ZONE

Type: str

CLI Arg: --zone

Default: None

Description: Nuclia Understanding API Zone ID

HTTP_PORT

Type: int

CLI Arg: --http-port

Default: 8080

Description: HTTP Port

INGEST_GRPC_PORT

Type: int

CLI Arg: --ingest-grpc-port

Default: 8030

Description: Ingest GRPC Port

TRAIN_GRPC_PORT

Type: int

CLI Arg: --train-grpc-port

Default: 8031

Description: Train GRPC Port

STANDALONE_NODE_PORT

Type: int

CLI Arg: --standalone-node-port

Default: 10009

Description: Node GRPC Port
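When overriding any of the four ports, it can help to confirm that no two of them collide. A minimal sketch using the defaults listed above; resolve_ports and find_port_conflicts are hypothetical helpers.

```python
# Defaults as documented on this page.
DEFAULT_PORTS = {
    "HTTP_PORT": 8080,
    "INGEST_GRPC_PORT": 8030,
    "TRAIN_GRPC_PORT": 8031,
    "STANDALONE_NODE_PORT": 10009,
}


def resolve_ports(env: dict[str, str]) -> dict[str, int]:
    """Apply overrides from env on top of the documented defaults."""
    return {name: int(env.get(name, default)) for name, default in DEFAULT_PORTS.items()}


def find_port_conflicts(ports: dict[str, int]) -> list[tuple[str, str, int]]:
    """Return (first_setting, second_setting, port) for every reused port."""
    seen: dict[int, str] = {}
    conflicts = []
    for name, port in ports.items():
        if port in seen:
            conflicts.append((seen[port], name, port))
        else:
            seen[port] = name
    return conflicts


conflicts = find_port_conflicts(resolve_ports({"HTTP_PORT": "8030"}))
```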

AUTH_POLICY

Type: Enum: (upstream_naive, upstream_auth_header, upstream_oauth2, upstream_basicauth)

CLI Arg: --auth-policy

Default: upstream_naive

Description: Auth policy to use for http requests.

  • upstream_naive assumes that the X-NUCLIADB-ROLES and X-NUCLIADB-USER HTTP headers are set by a trusted upstream proxy. It can also be used for local testing without an auth proxy by supplying the headers manually.
  • upstream_auth_header assumes that the request is validated upstream and that the upstream passes the header defined in the auth_policy_user_header setting.
  • upstream_oauth2 assumes that the Bearer token is validated upstream and passed down in the Authorization header.
  • upstream_basicauth assumes that Basic Auth is validated upstream and passed down in the Authorization header.
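For local testing with upstream_naive and no auth proxy, the two headers can be supplied by the client. The sketch below only builds a request object and does not send it. The URL is a placeholder on the default HTTP port, the user id is made up, and naive_auth_headers is a hypothetical helper; a single role is used because the format for passing several roles is not described here.

```python
import urllib.request

ROLES = {"READER", "WRITER", "MANAGER"}


def naive_auth_headers(user: str, role: str) -> dict[str, str]:
    """Build the headers that upstream_naive expects from a trusted proxy."""
    if role not in ROLES:
        raise ValueError(f"unknown role: {role!r}")
    return {"X-NUCLIADB-USER": user, "X-NUCLIADB-ROLES": role}


headers = naive_auth_headers("local-tester", "READER")

# Placeholder URL: replace the path with the endpoint being tested.
request = urllib.request.Request("http://localhost:8080/", headers=headers)
```

If AUTH_POLICY_USER_HEADER or AUTH_POLICY_ROLES_HEADER is changed from its default, the header names in the helper would need to change accordingly.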

AUTH_POLICY_USER_HEADER

Type: str

CLI Arg: --auth-policy-user-header

Default: X-NUCLIADB-USER

Description: Header to read user id from. Only used for upstream_naive and upstream_auth_header auth policy.

AUTH_POLICY_ROLES_HEADER

Type: str

CLI Arg: --auth-policy-roles-header

Default: X-NUCLIADB-ROLES

Description: Header to read user roles from. Only used for upstream_naive auth policy.

AUTH_POLICY_USER_DEFAULT_ROLES

Type: Enum: (MANAGER, READER, WRITER)

CLI Arg: --auth-policy-user-default-roles

Default: [READER, WRITER, MANAGER]

Description: Default roles to assign to a user that is authenticated upstream. Not used with upstream_naive auth policy.

AUTH_POLICY_ROLE_MAPPING

Type: json

CLI Arg: --auth-policy-role-mapping

Default: None

Description: Role mapping for upstream_auth_header, upstream_oauth2 and upstream_basicauth auth policies. Allows mapping different properties from the auth request to a role. Available roles are: READER, WRITER, MANAGER. Examples:

  • {"user": {"john@doe.com": ["READER", "WRITER"]}} will map the user john@doe.com to the roles READER and WRITER on upstream_auth_header policies.
  • {"group": {"managers": "MANAGER"}} will map users that have a group claim of managers in the JWT provided by upstream to the role MANAGER on upstream_oauth2 policies.
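Because the setting takes JSON, it can be convenient to build the mapping as a Python dict, check the role names, and serialize it. This is a sketch under the assumption that the JSON shapes shown in the examples above (a role string or a list of role strings per value) are the accepted ones; encode_role_mapping is a hypothetical helper.

```python
import json

VALID_ROLES = {"READER", "WRITER", "MANAGER"}


def encode_role_mapping(mapping: dict) -> str:
    """Validate role names and return the mapping as a JSON string."""
    for prop, values in mapping.items():
        for value, roles in values.items():
            # A value may map to one role or to a list of roles.
            role_list = [roles] if isinstance(roles, str) else list(roles)
            unknown = set(role_list) - VALID_ROLES
            if unknown:
                raise ValueError(f"unknown roles for {prop}={value}: {sorted(unknown)}")
    return json.dumps(mapping)


mapping = {"user": {"john@doe.com": ["READER", "WRITER"]}}
encoded_mapping = encode_role_mapping(mapping)
```

The resulting string is what would be assigned to AUTH_POLICY_ROLE_MAPPING or passed to --auth-policy-role-mapping.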

JWK_KEY

Type: str

CLI Arg: --jwk-key

Default: None

Description: JWK key used for temporary token generation and validation.

CLUSTER_DISCOVERY_MODE

Type: Enum: (default, manual, kubernetes, single_node)

CLI Arg: --cluster-discovery-mode

Default: default