Installing NucliaDB

NucliaDB is currently distributed in two primary forms: a Python package and a Docker container.

You can install NucliaDB in a few different ways:

  • Docker: The easiest way to get up and running.
  • Python PIP: Best for when you are working in Jupyter notebooks or with ad-hoc environments.
  • Virtual Machine Installation Script: If you are running NucliaDB in a provisioned VM, use this script to get everything running quickly.
  • Kubernetes: Deploy NucliaDB in Kubernetes with our Helm chart.

Make sure to configure your NUA_API_KEY environment variable with your provisioned NUA key.
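
As a minimal sketch of the step above, the key can be exported in the shell session that will start NucliaDB and checked before launch. MY_KEY is a placeholder, and require_nua_key is a hypothetical helper written for this example, not part of NucliaDB.

```shell
# Export the key for the current shell session; MY_KEY is a placeholder.
export NUA_API_KEY="MY_KEY"

# Hypothetical helper (not part of NucliaDB): fail early when the key is missing.
require_nua_key() {
  if [ -z "${NUA_API_KEY:-}" ]; then
    echo "NUA_API_KEY is not set" >&2
    return 1
  fi
  echo "NUA_API_KEY is set"
}

require_nua_key
```

Running the check before starting the server surfaces a missing key immediately instead of at the first request that needs it.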

Read the Configuration docs for a complete set of environment variables available to configure NucliaDB.

PostgreSQL

PostgreSQL is the recommended key-value and blob storage driver for simple on-premise NucliaDB installations.

This doc does not describe how to install PostgreSQL; it assumes you already have a server running somewhere that you can reference in the configuration.

Install with Python + PIP

Python version: 3.11

Check out PyPI for the list of available nucliadb versions: https://pypi.org/project/nucliadb/

pip install --upgrade pip wheel
pip install nucliadb

Then, to run NucliaDB (assuming you are using PostgreSQL as the backend):

nucliadb --driver PG --file-backend PG \
--driver-pg-url=postgresql://postgres:password@HOSTNAME:5432/postgres \
--nua-api-key="MY_KEY"
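
The connection string in the command above combines the user, password, host, port, and database name. The following sketch assembles it from separate values so each part can be changed independently. The build_pg_url helper and the PG_URL variable are assumptions made for illustration, not NucliaDB settings.

```shell
# Hypothetical helper: build a PostgreSQL URL in the form used above.
build_pg_url() {
  # Arguments: user, password, host, port, database
  printf 'postgresql://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Illustrative values; replace them with your own server details.
PG_URL="$(build_pg_url postgres password HOSTNAME 5432 postgres)"
echo "$PG_URL"

# The URL can then be passed to the command shown above:
# nucliadb --driver PG --file-backend PG --driver-pg-url="$PG_URL" --nua-api-key="MY_KEY"
```

A password containing characters such as @, : or / must be percent-encoded before it is placed in the URL; this helper does not do that.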

To run locally with local file storage (NOT RECOMMENDED FOR PRODUCTION):

nucliadb

Install with Docker

Check out DockerHub for the list of available image tags: https://hub.docker.com/r/nuclia/nucliadb

To pull the latest Docker image:

docker pull nuclia/nucliadb:latest

To run the container with the standard docker command (with PostgreSQL configured):

docker run -it \
-p 8080:8080 \
-v nucliadb-standalone:/data \
-e NUA_API_KEY=MY_KEY \
-e DRIVER=PG \
-e FILE_BACKEND=PG \
-e DRIVER_PG_URL=postgresql://postgres:password@HOSTNAME:5432/postgres \
nuclia/nucliadb:latest
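
As an alternative sketch, the same environment variables can be kept in a file and passed with Docker's --env-file option, which keeps the API key and database password out of the shell history. The file name nucliadb.env is an assumption chosen for this example, and the values are the placeholders used above.

```shell
# Write the settings to an env file; all values are placeholders from the example above.
printf '%s\n' \
  'NUA_API_KEY=MY_KEY' \
  'DRIVER=PG' \
  'FILE_BACKEND=PG' \
  'DRIVER_PG_URL=postgresql://postgres:password@HOSTNAME:5432/postgres' \
  > nucliadb.env

# Restrict access, since the file holds the API key and database password.
chmod 600 nucliadb.env

# Then start the container with the file instead of individual -e flags:
# docker run -it -p 8080:8080 -v nucliadb-standalone:/data \
#   --env-file nucliadb.env nuclia/nucliadb:latest
```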

To run locally with local file storage (NOT RECOMMENDED FOR PRODUCTION):

docker run -it \
-p 8080:8080 \
-v nucliadb-standalone:/data \
-e NUA_API_KEY=MY_KEY \
nuclia/nucliadb:latest

Accessing the UI

Once you have NucliaDB running, you can access the Admin UI by going to http://[host-ip-or-address]:8080/admin.

For example, if you are running locally with Docker, you can access the Admin UI by going to http://localhost:8080/admin.
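
NucliaDB may take a moment to start, so a script that opens the Admin UI can poll the URL first. The sketch below assumes curl is installed and that the Admin UI answers with a non-error HTTP status once it is ready. wait_for_admin_ui is a hypothetical helper, not something shipped with NucliaDB.

```shell
# Hypothetical helper: poll the Admin UI until it answers or the attempts run out.
# Arguments: URL (default http://localhost:8080/admin), attempts (default 30).
wait_for_admin_ui() {
  url="${1:-http://localhost:8080/admin}"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f: treat HTTP errors as failure, -s: silent, -o /dev/null: discard the body
    if curl -fs -o /dev/null "$url"; then
      echo "Admin UI is reachable at $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "Admin UI did not respond at $url" >&2
  return 1
}
```

For example, wait_for_admin_ui http://localhost:8080/admin 60 waits up to roughly one minute, with one second between attempts.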

Deploying on-premise

The recommended way to use NucliaDB is with our cloud offering. It is the most scalable, feature-rich and simple way to work with Nuclia. However, if you have use cases where you want to run the database storage layer yourself, NucliaDB supports a standalone installation.

The open source NucliaDB project recommends an on-premise installation using a PostgreSQL database as a persistent store for resource metadata and S3/GCS compatible layers for BLOB file data. However, for simplicity, the following example also uses PostgreSQL as the BLOB file backend.

Once you have a PostgreSQL server running, start the NucliaDB container with:

docker run -it \
-p 8080:8080 \
-v nucliadb-standalone:/data \
-e NUA_API_KEY=MY_KEY \
-e DRIVER=PG \
-e FILE_BACKEND=PG \
-e DRIVER_PG_URL=postgresql://postgres:password@HOSTNAME:5432/postgres \
nuclia/nucliadb:latest

Read our docs on integrating with different cloud environments or installing with Helm.

Architecture

How the standalone on-premise NucliaDB installation is designed.

NucliaDB Architecture

nucliadb command args

NucliaDB command line arguments:

  • --driver (LOCAL|PG|REDIS): Storage driver. Defaults to LOCAL.
  • --driver-pg-url: Database connection string for PostgreSQL, if used.
  • --driver-redis-url: Database connection string for Redis, if used.
  • --file-backend (LOCAL|PG|GCS|S3): Blob data storage type. Defaults to LOCAL.
  • --data-path: Path where index data is stored. Defaults to ./data/node.
  • --nua-api-key: Nuclia Understanding API Key.
  • --zone: Nuclia Understanding API Zone ID.

Use nucliadb --help or read the Configuration docs for a complete set of options.

Do NOT use default LOCAL storage and file backend drivers for production.
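
A deployment script can enforce this rule before it starts NucliaDB. The sketch below accepts only the driver and file backend values listed above and rejects LOCAL for either one. check_production_config and its exit codes are assumptions for illustration; this is not a feature of the nucliadb CLI.

```shell
# Hypothetical pre-flight check, not part of the nucliadb CLI.
# Arguments: driver, file backend.
# Returns 0 when both are valid and neither is LOCAL, 1 for LOCAL, 2 for unknown values.
check_production_config() {
  driver="$1"
  file_backend="$2"
  case "$driver" in
    LOCAL|PG|REDIS) ;;
    *) echo "unknown driver: $driver" >&2; return 2 ;;
  esac
  case "$file_backend" in
    LOCAL|PG|GCS|S3) ;;
    *) echo "unknown file backend: $file_backend" >&2; return 2 ;;
  esac
  if [ "$driver" = "LOCAL" ] || [ "$file_backend" = "LOCAL" ]; then
    echo "LOCAL storage is not suitable for production" >&2
    return 1
  fi
  echo "ok: driver=$driver file_backend=$file_backend"
}

check_production_config PG PG
```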

Hardware requirements depend heavily on the workload you intend to run on NucliaDB.

Start with:

  • CPU: 4
  • RAM: 16 GB
  • Disk: 50 GB
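
When sizing a VM, the starting values above can be encoded as a simple check. This is a minimal sketch: meets_starting_specs is a hypothetical helper, and the numbers passed to it are entered by hand rather than measured.

```shell
# Hypothetical helper (not part of NucliaDB): succeed only when the given
# CPU count, RAM in GB, and disk in GB meet the starting specs above.
meets_starting_specs() {
  [ "$1" -ge 4 ] && [ "$2" -ge 16 ] && [ "$3" -ge 50 ]
}

# Example with hand-entered numbers for a candidate VM: 8 CPUs, 32 GB RAM, 100 GB disk.
if meets_starting_specs 8 32 100; then
  echo "meets the starting specs"
else
  echo "below the starting specs"
fi
```

On Linux, the CPU count could be taken from nproc instead of being typed in. Treat a passing check as a starting point only, since the actual requirements depend on the workload.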