Index a batch of text resources

This guide will help you index a batch of text resources with their associated metadata, listed in a CSV file.

Prerequisites

API Key: Obtain a contributor or writer API key as detailed here.
Python 3: Ensure Python 3 is installed on your system:
```
python3 --version
```
Nuclia SDK: Install the Nuclia package in your environment:
```
pip install nuclia
```

Run the script

The script below assumes that each row of your CSV file provides an id and a text column, plus the following metadata:

a country
a URL (this is just an example; you can adapt it to your own data)
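
For illustration, a file in the expected shape could be produced as follows. This is only a minimal sketch: the column names (id, text, country, url) and the semicolon delimiter are taken from the upload script in this guide, while the sample rows and the write_sample helper are hypothetical.

```python
import csv
import io

# Column names match the keys read by the upload script in this guide.
FIELDNAMES = ["id", "text", "country", "url"]

# Hypothetical sample rows, for illustration only.
SAMPLE_ROWS = [
    {
        "id": "doc-1",
        "text": "First text; with a semicolon.",
        "country": "France",
        "url": "https://example.com/doc-1",
    },
    {
        "id": "doc-2",
        "text": "Second text.",
        "country": "Spain",
        "url": "https://example.com/doc-2",
    },
]


def write_sample(stream):
    """Write the sample rows as semicolon-delimited CSV to a text stream."""
    writer = csv.DictWriter(
        stream,
        fieldnames=FIELDNAMES,
        delimiter=";",
        quotechar='"',
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(SAMPLE_ROWS)


# Write to an in-memory buffer and show the result.
buffer = io.StringIO()
write_sample(buffer)
print(buffer.getvalue())
```

Fields that contain the delimiter are quoted automatically by the csv module, so text with semicolons stays in one column. To write a real file instead of a buffer, pass a file opened with newline="" to write_sample.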

Use the following Python script to upload your text contents to your Nuclia Knowledge Box:

```
import csv
import sys

from nuclia import sdk

KNOWLEDGE_BOX = "https://<zone>.nuclia.cloud/api/v1/kb/<your-knowledge-box-id>"
API_KEY = "<your-api-key-with-contributor-access>"

# Authenticate against the Knowledge Box once, before any upload.
sdk.NucliaAuth().kb(url=KNOWLEDGE_BOX, token=API_KEY)


def upload(row):
    """Create one resource from a CSV row (columns: id, text, country, url)."""
    sdk.NucliaResource().create(
        slug=row["id"],
        title=row["id"],
        texts={"text": {"format": "PLAIN", "body": row["text"]}},
        usermetadata={
            "classifications": [{"labelset": "country", "label": row["country"]}]
        },
        origin={
            "url": row["url"],
        },
    )


def read_file(path):
    # newline="" is what the csv module recommends when opening CSV files.
    with open(path, newline="") as csvfile:
        reader = csv.DictReader(csvfile, delimiter=";", quotechar='"')
        for row in reader:
            upload(row)


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("Usage: python3 script.py /path/to/csv")
    read_file(sys.argv[1])
```

To execute the script and start uploading your text resources, run the following command:

```
python3 script.py /path/to/csv
```
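
Before a large upload, it may be worth checking the CSV without contacting Nuclia at all. The following is a minimal sketch, not part of the Nuclia SDK: check_rows and REQUIRED_COLUMNS are hypothetical names, and the checks assume the column layout read by the upload script (id, text, country, url, semicolon-delimited). Duplicate ids are reported because the script uses the id as the resource slug, so repeated values would likely collide.

```python
import csv
import io

# Columns the upload script reads from each row.
REQUIRED_COLUMNS = ("id", "text", "country", "url")


def check_rows(stream):
    """Return a list of problems found in a semicolon-delimited CSV stream."""
    reader = csv.DictReader(stream, delimiter=";", quotechar='"')
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        # Without the expected header there is no point checking rows.
        return [f"missing columns: {', '.join(missing)}"]

    problems = []
    seen_ids = set()
    # Data rows are numbered from 1 (the header is not counted).
    for row_no, row in enumerate(reader, start=1):
        for column in ("id", "text"):
            # Short rows yield None for absent fields, hence the `or ""`.
            if not (row[column] or "").strip():
                problems.append(f"row {row_no}: empty {column}")
        row_id = row["id"]
        if row_id:
            if row_id in seen_ids:
                problems.append(f"row {row_no}: duplicate id {row_id!r}")
            seen_ids.add(row_id)
    return problems


# Example with an in-memory CSV; the second row repeats an id and has no text.
sample = "id;text;country;url\na;hello;France;u1\na;;Spain;u2\n"
for problem in check_rows(io.StringIO(sample)):
    print(problem)
```

To check a real file, open it with newline="" and pass the file object to check_rows; an empty list means none of these particular problems were found.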

For more information on the Nuclia Python SDK, see the Nuclia Python SDK documentation.
