Index Google Drive files in Nuclia with Google Colab

Google Colab is a free cloud service that allows you to run Python code in a Jupyter notebook environment. It is particularly useful for machine learning and data analysis tasks.

One of the advantages of using Google Colab is that it integrates seamlessly with Google Drive, allowing you to access and import files directly from your Google Drive account.

In this tutorial, we will show you how to import files from Google Drive into Nuclia using Google Colab.

Step 1: Open Google Colab

First, open Google Colab by going to https://colab.research.google.com/.

Step 2: Create a new notebook

Click on the "New notebook" button to create a new notebook.

Step 3: Mount Google Drive

Run the following code in a code cell to mount your Google Drive:

from google.colab import drive
drive.mount('/content/drive')

This will prompt you to authorize Google Colab to access your Google Drive. Follow the instructions to complete the authorization process.
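Before moving on, it can help to confirm that the mount succeeded. The following is a minimal sketch, assuming the default mount point used above; drive_is_mounted is a hypothetical helper name, not part of Colab or Nuclia.

```python
import os


def drive_is_mounted(mount_point="/content/drive"):
    """Return True if a MyDrive folder is visible under the mount point."""
    return os.path.isdir(os.path.join(mount_point, "MyDrive"))


# Prints False when run outside Colab or before drive.mount() has completed.
print("Drive mounted:", drive_is_mounted())
```

If this prints False in Colab, re-run the mount cell and complete the authorization prompt.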

Step 4: Install the Nuclia SDK

Run the following code in a code cell to install the Nuclia Python SDK:

!pip install nuclia

Step 5: Authenticate with Nuclia

Run the following code in a code cell to authenticate with Nuclia:

from nuclia import sdk
KB_URL="https://<zone>.nuclia.cloud/api/v1/kb/<YOUR_KB_ID>"
API_KEY="<YOUR_API_KEY>"
sdk.NucliaAuth().kb(url=KB_URL, token=API_KEY)

Note: See how to obtain an API key.
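To avoid typos in the URL and to keep the API key out of the notebook itself, the values can be assembled separately. This is only a sketch: build_kb_url is a hypothetical helper, NUCLIA_API_KEY is an environment variable name chosen for this example, and the zone and KB ID shown are example values.

```python
import os


def build_kb_url(zone, kb_id):
    """Fill the zone and KB ID into the URL pattern used in this step."""
    for name, value in (("zone", zone), ("kb_id", kb_id)):
        # Reject empty values and placeholders that were left unedited.
        if not value or "<" in value or ">" in value:
            raise ValueError(f"{name} looks unset: {value!r}")
    return f"https://{zone}.nuclia.cloud/api/v1/kb/{kb_id}"


# NUCLIA_API_KEY is a hypothetical variable name chosen for this example.
API_KEY = os.environ.get("NUCLIA_API_KEY", "")
KB_URL = build_kb_url("europe-1", "my-kb-id")  # example values, replace with your own

# In the notebook, these would then be passed to the call from this step:
# sdk.NucliaAuth().kb(url=KB_URL, token=API_KEY)
```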

Step 6: Import files from Google Drive

To import a single file from Google Drive into Nuclia, run the following code in a code cell:

sdk.NucliaUpload().file(path="/content/drive/MyDrive/<FILE_PATH>")

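To import every file in a directory from Google Drive into Nuclia, run the following code in a code cell:

import glob
import os

upload = sdk.NucliaUpload()
# "**" combined with recursive=True also matches files in subfolders
for filepath in glob.glob('/content/drive/MyDrive/<FOLDER PATH>/**/*', recursive=True):
    if os.path.isfile(filepath):  # skip directories, which glob also returns
        upload.file(path=filepath)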
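For larger folders, it may be useful to filter by file type and to keep going when one upload fails. The sketch below is one possible approach, not the SDK's own batch mechanism: list_files and upload_all are hypothetical helpers, and the upload function is passed in as a parameter so the logic can be tried without a Nuclia connection.

```python
import os


def list_files(folder, extensions=None):
    """Walk folder recursively and return sorted file paths.

    extensions is an optional set such as {".pdf"}; matching ignores case.
    """
    matches = []
    for root, _dirs, files in os.walk(folder):
        for name in files:
            suffix = os.path.splitext(name)[1].lower()
            if extensions is None or suffix in extensions:
                matches.append(os.path.join(root, name))
    return sorted(matches)


def upload_all(paths, upload_one):
    """Call upload_one(path) for each path and collect failures."""
    failed = []
    for path in paths:
        try:
            upload_one(path)
        except Exception as exc:  # record the error and continue with the next file
            failed.append((path, str(exc)))
    return failed


# In the notebook, after authenticating, this could be used as:
# upload = sdk.NucliaUpload()
# paths = list_files("/content/drive/MyDrive/<FOLDER PATH>", {".pdf"})
# failed = upload_all(paths, lambda p: upload.file(path=p))
```

The returned list holds (path, error message) pairs, so failed files can be retried or inspected afterwards.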

More options

Set metadata

You can set various metadata for your Nuclia resources. For example, a title, a summary, or labels:

sdk.NucliaUpload().file(
    path="/content/drive/MyDrive/<FILE_PATH>",
    title="My File",
    summary="This is a file",
    usermetadata={
        "classifications": [{"labelset": "priority", "label": "high"}]
    },
)

Check the Nuclia API documentation for more information on metadata.
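When several labels are involved, writing the nested structure by hand becomes error-prone. The following sketch builds the same usermetadata shape as in the example above from simple pairs; build_usermetadata is a hypothetical helper, and the "team" labelset is an invented example.

```python
def build_usermetadata(labels):
    """Turn (labelset, label) pairs into the usermetadata structure shown above."""
    return {
        "classifications": [
            {"labelset": labelset, "label": label} for labelset, label in labels
        ]
    }


# "team" / "research" is an invented example; "priority" / "high" comes from the text.
usermetadata = build_usermetadata([("priority", "high"), ("team", "research")])

# In the notebook, this could be passed along with the file:
# sdk.NucliaUpload().file(path="/content/drive/MyDrive/<FILE_PATH>", usermetadata=usermetadata)
```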

Set a slug

A slug is a user-defined unique identifier for your Nuclia resource. It is used to reference the resource in the Nuclia API. You can set a slug for the imported file with the slug parameter:

sdk.NucliaUpload().file(
    path="/content/drive/MyDrive/<FILE_PATH>",
    slug="my-file",
)
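When importing many files, a slug can be derived from each file name instead of being typed by hand. This is a minimal sketch: slug_from_path is a hypothetical helper, and restricting slugs to lowercase letters, digits, and hyphens is an assumption made here, not a documented Nuclia rule.

```python
import os
import re


def slug_from_path(path):
    """Derive a lowercase, hyphen-separated slug from the file name without its extension."""
    stem = os.path.splitext(os.path.basename(path))[0]
    # Replace every run of characters other than a-z and 0-9 with a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", stem.lower()).strip("-")


# In the notebook, this could be combined with the upload call:
# path = "/content/drive/MyDrive/<FILE_PATH>"
# sdk.NucliaUpload().file(path=path, slug=slug_from_path(path))
```

Because only the file name is used, two files with the same name in different folders would produce the same slug, so this approach fits best when file names are unique.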

For more information on the Nuclia Python SDK, see the Nuclia Python SDK documentation.