Installing NucliaDB

NucliaDB is currently distributed in two forms: a Python package and a Docker container.

You can install NucliaDB in a few different ways:

  • Docker: The easiest way to get up and running.
  • Python PIP: Best for when you are working in Jupyter notebooks or with ad-hoc environments.
  • Kubernetes: Deploy NucliaDB in Kubernetes with our Helm chart.

Make sure to configure your NUA_API_KEY environment variable with your provisioned NUA key.

Read the Configuration docs for a complete set of environment variables available to configure NucliaDB.
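
As a minimal sketch of the step above, a launch script could refuse to continue when the key is missing. The function name require_nua_key is hypothetical and not part of NucliaDB; only the NUA_API_KEY variable name comes from this page.

```shell
# Hypothetical helper (not part of NucliaDB):
# fail early when NUA_API_KEY is unset or empty.
require_nua_key() {
  if [ -z "${NUA_API_KEY:-}" ]; then
    echo "NUA_API_KEY is not set; export it before starting NucliaDB" >&2
    return 1
  fi
}
```

One possible use is to export NUA_API_KEY in your shell profile and call require_nua_key at the top of whatever script starts NucliaDB.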

PostgreSQL + S3

For simple on-premise NucliaDB installations, we recommend using PostgreSQL as the key-value driver and any S3-compatible service (e.g., Minio) for BLOB file storage.

This doc does not describe how to install PostgreSQL or Minio; it assumes both are already installed and reachable, so that you can reference them in the configuration.

Install with Python + PIP

Python version: 3.11

Check out PyPI for the list of available nucliadb versions: https://pypi.org/project/nucliadb/

pip install --upgrade pip wheel
pip install nucliadb

Then, to run NucliaDB (assuming PostgreSQL as the key-value backend and Minio as the file backend):

nucliadb --driver PG --file-backend S3 \
--driver-pg-url=postgresql://postgres:password@HOSTNAME:5432/postgres \
--s3-endpoint=http://<minio_server_ip>:9000 \
--s3-client-id=admin \
--s3-client-secret=12345678 \
--s3-bucket=nucliadb-{kbid} \
--nua-api-key="MY_KEY"
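
The connection string passed to --driver-pg-url follows the usual postgresql://user:password@host:port/database shape. The sketch below assembles it from separate values so that credentials can live in variables rather than in the command itself. The helper name pg_url is hypothetical, and the sketch does no percent-encoding, so it assumes the password contains no characters that are special in a URL.

```shell
# Hypothetical helper: assemble a PostgreSQL URL in the form used by --driver-pg-url.
# Arguments: user password host port database
pg_url() {
  printf 'postgresql://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Example values copied from the command above; replace them with your own.
DRIVER_PG_URL="$(pg_url postgres password HOSTNAME 5432 postgres)"
echo "$DRIVER_PG_URL"
```

The resulting value can then be passed as --driver-pg-url="$DRIVER_PG_URL".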

Install with docker

Check out DockerHub for the list of available image tags: https://hub.docker.com/r/nuclia/nucliadb

To pull the latest docker image:

docker pull nuclia/nucliadb:latest

To run the container (with PostgreSQL and S3 configured):

docker run -it \
-p 8080:8080 \
-v nucliadb-standalone:/data \
-e NUA_API_KEY=MY_KEY \
-e DRIVER=PG \
-e DRIVER_PG_URL=postgresql://postgres:password@HOSTNAME:5432/postgres \
-e FILE_BACKEND=S3 \
-e S3_ENDPOINT=http://<minio_server_ip>:9000 \
-e S3_CLIENT_ID=admin \
-e S3_CLIENT_SECRET=12345678 \
-e S3_BUCKET=nucliadb-{kbid} \
nuclia/nucliadb:latest
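
As an alternative sketch, the same settings can be kept in a file and handed to Docker with its --env-file option, which keeps secrets out of the shell history. The file name nucliadb.env and the wrapper run_nucliadb are hypothetical. The endpoint variable is assumed to be spelled S3_ENDPOINT, matching the --s3-endpoint flag of the nucliadb command.

```shell
# Write the settings from the command above to a file (hypothetical name: nucliadb.env).
# The quoted 'EOF' keeps the shell from expanding anything inside the here-document.
cat > nucliadb.env <<'EOF'
NUA_API_KEY=MY_KEY
DRIVER=PG
DRIVER_PG_URL=postgresql://postgres:password@HOSTNAME:5432/postgres
FILE_BACKEND=S3
S3_ENDPOINT=http://<minio_server_ip>:9000
S3_CLIENT_ID=admin
S3_CLIENT_SECRET=12345678
S3_BUCKET=nucliadb-{kbid}
EOF

# Hypothetical wrapper: start the container with the settings read from the file.
run_nucliadb() {
  docker run -it \
    -p 8080:8080 \
    -v nucliadb-standalone:/data \
    --env-file nucliadb.env \
    nuclia/nucliadb:latest
}
```

Docker reads each line of the file as a literal VAR=value pair, so the placeholders above must be replaced with real values before calling run_nucliadb.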

Accessing the UI

Once you have NucliaDB running, you can access the Admin UI by going to http://[host-ip-or-address]:8080/admin.

For example, if you are running locally with docker, you can access the Admin UI by going to http://localhost:8080/admin.
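
A quick way to confirm that the Admin UI is up is to poll that URL until it answers. The sketch below assumes curl is installed; the helper names admin_url and wait_for_admin are hypothetical, and the only NucliaDB-specific detail used is the /admin path on port 8080 described above.

```shell
# Hypothetical helpers (not part of NucliaDB).
# Build the Admin UI URL for a host; the port defaults to 8080 as published above.
admin_url() {
  printf 'http://%s:%s/admin\n' "$1" "${2:-8080}"
}

# Poll the Admin UI until it responds or the attempts run out.
# Arguments: host [port] [attempts]
wait_for_admin() {
  url="$(admin_url "$1" "${2:-8080}")"
  attempts="${3:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f: treat HTTP errors as failure, -s: quiet, -L: follow redirects
    if curl -fsL -o /dev/null "$url"; then
      echo "Admin UI is reachable at $url"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "Admin UI did not respond at $url" >&2
  return 1
}
```

For a local Docker run, calling wait_for_admin localhost after starting the container would be one way to use it.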

Deploying on-premise

The recommended way to use NucliaDB is with our cloud offering. It is the most scalable, feature-rich and simple way to work with Nuclia. However, if your use case requires running the database storage layer yourself, NucliaDB supports a standalone installation.

The open source NucliaDB project recommends an on-premise installation using a PostgreSQL database as a persistent store for resource metadata and S3/GCS compatible layers for BLOB file data.

Once you have PostgreSQL and Minio running, start NucliaDB with Docker:

docker run -it \
-p 8080:8080 \
-v nucliadb-standalone:/data \
-e NUA_API_KEY=MY_KEY \
-e DRIVER=PG \
-e DRIVER_PG_URL=postgresql://postgres:password@HOSTNAME:5432/postgres \
-e FILE_BACKEND=S3 \
-e S3_ENDPOINT=http://<minio_server_ip>:9000 \
-e S3_CLIENT_ID=admin \
-e S3_CLIENT_SECRET=12345678 \
-e S3_BUCKET=nucliadb-{kbid} \
nuclia/nucliadb:latest

Read our docs on integrating with different cloud environments or installing with Helm.

Architecture

How the NucliaDB standalone on-premise installation is designed.

Figure: NucliaDB Architecture

nucliadb command args

NucliaDB command line arguments:

  • --driver-pg-url: Database connection string for Postgres
  • --file-backend (LOCAL, GCS, S3, AZURE): Blob data storage type. Defaults to LOCAL. Check out the file storage documentation page for more details on the supported storage types.
  • --data-path: Path where index data is stored. Defaults to ./data/node.
  • --nua-api-key: Nuclia Understanding API Key.
  • --zone: Nuclia Understanding API Zone ID.

Use nucliadb --help or read the Configuration docs for a complete set of options.

Do NOT use the default LOCAL file backend in production.
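
One way to enforce this warning is a pre-flight check in the script that starts NucliaDB. The sketch below is an assumption about how such a guard could look, not something NucliaDB ships: the function name check_file_backend is hypothetical, and it reads the FILE_BACKEND environment variable used in the Docker examples on this page.

```shell
# Hypothetical pre-flight check (not part of NucliaDB).
# Returns non-zero when the file backend is LOCAL or unset (the documented default is LOCAL).
check_file_backend() {
  case "${FILE_BACKEND:-LOCAL}" in
    GCS|S3|AZURE)
      return 0 ;;
    LOCAL)
      echo "FILE_BACKEND is LOCAL; choose GCS, S3 or AZURE for production" >&2
      return 1 ;;
    *)
      echo "Unknown FILE_BACKEND: ${FILE_BACKEND}" >&2
      return 1 ;;
  esac
}
```

Calling check_file_backend before the start command in a production deployment script stops the launch when the backend was left at its default.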

Hardware requirements depend heavily on the workload you intend to run on NucliaDB.

Start with:

  • CPU: 4
  • RAM: 16 GB
  • Disk: 50 GB
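
The starting point above can be turned into a simple comparison when sizing a host. This is only a sketch: the helper name meets_baseline is hypothetical, and the numbers passed in the example are made up for illustration.

```shell
# Hypothetical helper: compare a host's resources with the starting point above.
# Arguments: cpu_count ram_gb disk_gb (whole numbers)
meets_baseline() {
  [ "$1" -ge 4 ] && [ "$2" -ge 16 ] && [ "$3" -ge 50 ]
}

# Example with made-up values for a host with 8 CPUs, 32 GB RAM and 100 GB disk.
if meets_baseline 8 32 100; then
  echo "host meets the suggested starting specs"
fi
```

Treat a passing check as a floor rather than a guarantee, since the real requirements depend on the workload.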