Index Google Drive files in Nuclia with Google Colab
Google Colab is a free cloud service that allows you to run Python code in a Jupyter notebook environment. It is particularly useful for machine learning and data analysis tasks.
One of the advantages of using Google Colab is that it integrates seamlessly with Google Drive, allowing you to access and import files directly from your Google Drive account.
In this tutorial, we will show you how to import files from Google Drive into Nuclia using Google Colab.
Step 1: Open Google Colab
First, open Google Colab by going to https://colab.research.google.com/.
Step 2: Create a new notebook
Click on the "New notebook" button to create a new notebook.
Step 3: Mount Google Drive
Run the following code in a code cell to mount your Google Drive:
from google.colab import drive
drive.mount('/content/drive')
This will prompt you to authorize Google Colab to access your Google Drive. Follow the instructions to complete the authorization process.
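Before continuing, it can help to confirm that the mount succeeded. The following is a minimal sketch, assuming the mount point used above and that your files live under MyDrive. `list_drive_entries` is a hypothetical helper name, not part of Google Colab.

```python
from pathlib import Path


def list_drive_entries(root="/content/drive/MyDrive", limit=10):
    """Return up to `limit` entry names found directly under `root`."""
    folder = Path(root)
    if not folder.is_dir():
        # The folder is missing when Drive has not been mounted yet
        raise FileNotFoundError(f"{root} is not available; was Drive mounted?")
    # Sort the names so the output is stable between runs
    return sorted(entry.name for entry in folder.iterdir())[:limit]
```

In a Colab cell, calling `print(list_drive_entries())` should show a few of your Drive files and folders if the authorization completed.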
Step 4: Install the Nuclia SDK
Run the following code in a code cell to install the Nuclia Python SDK:
!pip install nuclia
Step 5: Authenticate with Nuclia
Run the following code in a code cell to authenticate with Nuclia:
from nuclia import sdk
KB_URL="https://<zone>.nuclia.cloud/api/v1/kb/<YOUR_KB_ID>"
API_KEY="<YOUR_API_KEY>"
sdk.NucliaAuth().kb(url=KB_URL, token=API_KEY)
Note: See how to obtain an API key.
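An API key typed directly into a notebook is easy to leak when the notebook is shared. As a hedged sketch, the key could instead be read from an environment variable or prompted for, and the URL assembled from its parts. The helper names and the `NUCLIA_API_KEY` variable name are assumptions made for illustration; only the URL pattern comes from the step above.

```python
import os
from getpass import getpass


def build_kb_url(zone, kb_id):
    """Assemble the Knowledge Box URL in the pattern shown above."""
    return f"https://{zone}.nuclia.cloud/api/v1/kb/{kb_id}"


def read_api_key(env_var="NUCLIA_API_KEY"):
    """Read the key from the environment, or prompt without echoing it."""
    key = os.environ.get(env_var)
    if not key:
        # getpass hides the typed value, so it is not stored in the cell output
        key = getpass("Nuclia API key: ")
    return key
```

The two return values can then be passed as `url` and `token` to `sdk.NucliaAuth().kb(...)` exactly as in the authentication step.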
Step 6: Import files from Google Drive
To import a single file, run the following code in a code cell:
sdk.NucliaUpload().file(path="/content/drive/MyDrive/<FILE_PATH>")
To import a directory, including its subfolders, run the following code in a code cell:
import glob
import os

upload = sdk.NucliaUpload()
# "**" combined with recursive=True also matches files in subfolders
for filepath in glob.glob('/content/drive/MyDrive/<FOLDER PATH>/**', recursive=True):
    if os.path.isfile(filepath):  # skip folders, upload files only
        upload.file(path=filepath)
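For larger folders, you may want to upload only certain file types and keep going when one file fails. The sketch below is one possible way to do that. `collect_files` and `upload_all` are hypothetical helpers, and the upload itself is passed in as a callable so that nothing here assumes more about the Nuclia SDK than the `upload.file(path=...)` call shown above.

```python
from pathlib import Path


def collect_files(folder, extensions=None):
    """Return all regular files under `folder`, including subfolders."""
    wanted = {ext.lower() for ext in extensions} if extensions else None
    files = []
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file():
            continue  # skip directories
        if wanted is not None and path.suffix.lower() not in wanted:
            continue  # skip file types that were not requested
        files.append(str(path))
    return files


def upload_all(filepaths, upload_one):
    """Call `upload_one(path)` for each path and return the failures."""
    failures = []
    for filepath in filepaths:
        try:
            upload_one(filepath)
        except Exception as error:
            # Record the problem and continue with the remaining files
            failures.append((filepath, error))
    return failures
```

In the notebook, this could be used as `failures = upload_all(collect_files("/content/drive/MyDrive/<FOLDER PATH>", [".pdf"]), lambda p: upload.file(path=p))`, after which `failures` lists any files that need a retry.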
More options
Set metadata
You can set various metadata for your Nuclia resources. For example, a title, a summary, or labels:
sdk.NucliaUpload().file(
    path="/content/drive/MyDrive/<FILE_PATH>",
    title="My File",
    summary="This is a file",
    usermetadata={
        "classifications": [{"labelset": "priority", "label": "high"}]
    },
)
Check the Nuclia API documentation for more information on metadata.
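When several labels are applied to many files, writing the nested structure by hand becomes error-prone. As a small sketch, a helper can build it from plain labelset/label pairs. `build_usermetadata` is a hypothetical name, and the output simply mirrors the `usermetadata` shape used in the example above.

```python
def build_usermetadata(labels):
    """Turn {"labelset": "label"} pairs into the usermetadata structure."""
    return {
        "classifications": [
            {"labelset": labelset, "label": label}
            for labelset, label in labels.items()
        ]
    }
```

The result can be passed as the `usermetadata` argument, for example `usermetadata=build_usermetadata({"priority": "high"})`.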
Set a slug
A slug is a user-defined unique identifier for your Nuclia resource. It is used to reference the resource in the Nuclia API. You can set a slug for the imported file with the slug parameter:
sdk.NucliaUpload().file(
    path="/content/drive/MyDrive/<FILE_PATH>",
    slug="my-file",
)
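When uploading many files, a slug can be derived from each file name rather than typed by hand. The sketch below is one way to do this. `slug_from_path` is a hypothetical helper, and restricting slugs to lowercase letters, digits, and hyphens is a conservative assumption rather than a documented Nuclia rule.

```python
import re
from pathlib import Path


def slug_from_path(filepath):
    """Derive a lowercase, hyphen-separated slug from the file name."""
    name = Path(filepath).name.lower()
    # Collapse every run of other characters into a single hyphen
    slug = re.sub(r"[^a-z0-9]+", "-", name).strip("-")
    if not slug:
        raise ValueError(f"cannot derive a slug from {filepath!r}")
    return slug
```

Because a slug must be unique, files that share a name in different folders would collide with this approach; in that case, part of the folder path could be included in the slug as well.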
For more information on the Nuclia Python SDK, see the Nuclia Python SDK documentation.