Add Metadata
Adding metadata to your resources is crucial for enhancing search accuracy, filtering, and overall data management within Nuclia. Metadata provides additional context and attributes to your data, making it more searchable and organized.
What is Metadata?
Metadata is information that describes other data. In the context of Nuclia, metadata can include tags, labels, classifications, access control information, and other attributes that help categorize and manage your resources.
Types of Metadata
- System Metadata: Automatically extracted by Nuclia, such as file type, language, and creation date.
- User-Defined Metadata: Custom metadata that you can define and attach to your data, such as tags, labels, and classifications.
How to Add Metadata
You can add metadata through any of the following interfaces:
- Dashboard
- API
- CLI
- Python SDK
In the Nuclia Dashboard, you can manually add or edit metadata for your data entries:
- Upload or Select Data: Upload new data or select existing data in your Knowledge Box.
- Add Metadata: Navigate to the metadata section and add or edit fields such as extra metadata or access groups.
- Save Changes: Ensure you save your changes to update the metadata.
Using the API, send a PATCH request to the resource endpoint:
curl --location --request PATCH 'https://<zone>.nuclia.cloud/api/v1/kb/<your-knowledge-box-id>/resource/<your_resource_id>' \
  --header 'X-NUCLIA-SERVICEACCOUNT: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --data '{
    "extra": {
      "metadata": {
        "example-key": "example-value"
      }
    },
    "security": {
      "access_groups": [
        "group1",
        "group2"
      ]
    },
    "origin": {
      "path": "/a/b"
    }
  }'
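For scripts that cannot depend on the SDK, the same PATCH request can be assembled with Python's standard library. This is a minimal sketch: the helper name `build_metadata_patch` and the placeholder zone, Knowledge Box ID, and resource ID are illustrative assumptions, while the URL shape, header, and payload mirror the curl command above. The request is only built here, not sent.

```python
import json
import urllib.request


def build_metadata_patch(zone, kb_id, resource_id, api_key, payload):
    """Build (but do not send) a PATCH request that updates resource metadata.

    Hypothetical helper: it mirrors the curl command and nothing more.
    """
    url = f"https://{zone}.nuclia.cloud/api/v1/kb/{kb_id}/resource/{resource_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-NUCLIA-SERVICEACCOUNT": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )


# Same payload as the curl example.
payload = {
    "extra": {"metadata": {"example-key": "example-value"}},
    "security": {"access_groups": ["group1", "group2"]},
    "origin": {"path": "/a/b"},
}

# Placeholder identifiers; replace them with your own values.
request = build_metadata_patch(
    "my-zone", "my-kb-id", "my-resource-id", "YOUR_API_KEY", payload
)

# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the URL and body before any change reaches your Knowledge Box.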
Using the CLI:
nuclia knowledgebox resource update --rid <your-resource-id> \
  --extra '{"metadata": {"example-key": "example-value"}}' \
  --security '{"access_groups": ["group1", "group2"]}' \
  --origin '{"path": "/a/b"}'
Using the Python SDK:
from nuclia import sdk

# Nuclia Knowledge Box URL and API key
KNOWLEDGE_BOX_URL = "https://<zone>.nuclia.cloud/api/v1/kb/<your-knowledge-box-id>"
API_KEY = "<your-api-key>"

# Authenticate with the Nuclia SDK
sdk.NucliaAuth().kb(url=KNOWLEDGE_BOX_URL, token=API_KEY)

# Update the resource with new metadata.
# Custom key/value metadata goes under "extra", matching the API and CLI examples.
sdk.NucliaResource().update(
    rid="<your_resource_id>",
    extra={"metadata": {"example-key": "example-value"}},
    security={"access_groups": ["group1", "group2"]},
    origin={"path": "/a/b"},
)
Key Metadata Fields
- origin.path: Stores the original file path for hierarchical resources. Useful for partial match filtering.
- usermetadata.classifications: Allows setting custom classifications or labels for resources.
- security.access_groups: Specifies the user groups that can access the resource. Used for filtering based on group access. Note: This is for filtering purposes and does not enforce security. Filtering works on group intersection (e.g., if a resource has security.access_groups set to ["group1", "group2"], it will be returned if you filter on ["group1"], ["group2", "group3"], ["group1", "group2", "group3"], etc.).
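The group intersection rule for security.access_groups can be illustrated locally. The sketch below is not Nuclia's implementation; the function name `matches_access_groups` is hypothetical, and it only models the rule as described above: a resource matches when it shares at least one group with the filter.

```python
def matches_access_groups(resource_groups, filter_groups):
    """Return True when the resource shares at least one group with the filter."""
    return bool(set(resource_groups) & set(filter_groups))


resource_groups = ["group1", "group2"]

# The three filters from the example above all overlap with the resource's groups.
for filter_groups in (["group1"], ["group2", "group3"], ["group1", "group2", "group3"]):
    print(filter_groups, matches_access_groups(resource_groups, filter_groups))

# A filter with no group in common does not match.
print(["group3"], matches_access_groups(resource_groups, ["group3"]))
```

Because a single shared group is enough to match, a filter that lists many groups is broader, not stricter.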
Efficient Filtering
Filtering is based on metadata attached to your data (see Filtering). It helps you:
- Narrow down search results.
- Control access to data.
- Manage the data lifecycle (e.g., archiving unused data).
Best Practices for Adding Metadata
- Be Consistent: Use consistent naming conventions and formats for your metadata to ensure uniformity.
- Use Descriptive Labels: Choose clear and descriptive labels for your metadata to make it easier to understand and search.
- Leverage System Metadata: Utilize the automatically extracted metadata for basic categorization and enhance it with custom metadata as needed.
- Review and Update Regularly: Periodically review and update metadata to ensure it remains accurate and relevant.
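One way to apply the "Be Consistent" practice is to normalize metadata keys before sending them. The sketch below assumes a lowercase, hyphen-separated convention (chosen only because it matches the `example-key` style used earlier); the helper names `normalize_key` and `normalize_metadata` are hypothetical and not part of any Nuclia library.

```python
import re


def normalize_key(key):
    """Lowercase a key and join its alphanumeric words with hyphens."""
    words = re.findall(r"[A-Za-z0-9]+", key)
    return "-".join(word.lower() for word in words)


def normalize_metadata(metadata):
    """Return a copy of the metadata dict with every key normalized."""
    return {normalize_key(key): value for key, value in metadata.items()}


raw = {"Doc Type": "report", "owner_team": "legal"}
clean = normalize_metadata(raw)
print(clean)
```

The result can then be passed as the value of `extra.metadata` in any of the update methods shown earlier. Note that two differently written keys can normalize to the same name, in which case the later one overwrites the earlier one.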
By effectively adding and managing metadata, you can significantly improve the searchability, organization, and usability of your data within Nuclia.