Custom knowledge graph

Building a custom knowledge graph can improve the performance of your search engine by providing a more structured and relevant representation of your data. This can be particularly useful for complex queries or when dealing with large datasets.

There are several ways to build a custom knowledge graph with Nuclia:

  • Graph extraction agents
  • Manual graph creation through the API

The first option is detailed in the Data Augmentation Agents section. The second option is detailed below.

Manual graph creation through the API

If you already have structured data describing the relationships between business-oriented entities, such as an ontology, you can push this data to Nuclia by creating a resource that will act as a container for your custom graph.

A custom graph is a set of relations. It can be stored in the usermetadata.relations attribute of one or several resources.

A graph involving entities will have the following format:

[
  {
    "from": {
      "value": "Alice",
      "type": "entity",
      "group": "PERSON"
    },
    "to": {
      "value": "Italian",
      "type": "entity",
      "group": "LANGUAGE"
    },
    "label": "speaks",
    "relation": "ENTITY"
  },
  {
    "from": {
      "value": "Bob",
      "type": "entity",
      "group": "PERSON"
    },
    "to": {
      "value": "Kiswahili",
      "type": "entity",
      "group": "LANGUAGE"
    },
    "label": "speaks",
    "relation": "ENTITY"
  },
  {
    "from": {
      "value": "Alice",
      "type": "entity",
      "group": "PERSON"
    },
    "to": {
      "value": "Bob",
      "type": "entity",
      "group": "PERSON"
    },
    "label": "is friend of",
    "relation": "ENTITY"
  }
]
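
Writing these objects by hand gets repetitive once the graph grows. The following is a minimal sketch of how the same three relations could be generated in Python. The helpers `entity` and `entity_relation` are hypothetical names introduced for illustration and are not part of the Nuclia SDK.

```python
import json


def entity(value, group):
    """Build an entity node in the format shown above."""
    return {"value": value, "type": "entity", "group": group}


def entity_relation(source, target, label):
    """Build an entity-to-entity relation (hypothetical helper, not an SDK call)."""
    return {"from": source, "to": target, "label": label, "relation": "ENTITY"}


alice = entity("Alice", "PERSON")
bob = entity("Bob", "PERSON")

relations = [
    entity_relation(alice, entity("Italian", "LANGUAGE"), "speaks"),
    entity_relation(bob, entity("Kiswahili", "LANGUAGE"), "speaks"),
    entity_relation(alice, bob, "is friend of"),
]

# Serialize to the JSON structure expected in usermetadata.relations
print(json.dumps(relations, indent=2))
```

Reusing the `alice` and `bob` nodes keeps the value and group of each entity identical across relations.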

A relation can also be created between a resource and an entity:

[
  {
    "from": {
      "value": "bc218b49700b4a5c9d5ea8a7cfcc8b6f",
      "type": "resource"
    },
    "to": {
      "value": "Kiswahili",
      "type": "entity",
      "group": "LANGUAGE"
    },
    "relation": "ABOUT"
  }
]

And relations can be used to declare synonyms:

[
  {
    "from": {
      "value": "United Kingdom",
      "type": "entity",
      "group": "COUNTRY"
    },
    "to": {
      "value": "UK",
      "type": "entity",
      "group": "COUNTRY"
    },
    "relation": "SYNONYM"
  }
]
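
When one entity has several alternative names, the synonym relations can be generated from a list. This is a minimal sketch, assuming the canonical name sits on the `from` side and each alias on the `to` side; `synonym_relations` is a hypothetical helper, not an SDK function, and the alias values are illustrative.

```python
def synonym_relations(canonical, aliases, group):
    """Return one SYNONYM relation per alias (assumed direction: canonical -> alias)."""
    return [
        {
            "from": {"value": canonical, "type": "entity", "group": group},
            "to": {"value": alias, "type": "entity", "group": group},
            "relation": "SYNONYM",
        }
        for alias in aliases
    ]


relations = synonym_relations("United Kingdom", ["UK", "U.K."], "COUNTRY")
for relation in relations:
    print(relation["from"]["value"], "=", relation["to"]["value"])
```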

Storing the graph can be done directly through the API. Example:

POST {kb-path}/resources
{
  "slug": "my-custom-graph",
  "usermetadata": {
    "relations": [
      {
        "from": {
          "value": "United Kingdom",
          "type": "entity",
          "group": "COUNTRY"
        },
        "to": {
          "value": "UK",
          "type": "entity",
          "group": "COUNTRY"
        },
        "relation": "SYNONYM"
      }
    ]
  }
}
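
The same request can be prepared from Python with the standard library only. This is a sketch under assumptions: the `X-NUCLIA-SERVICEACCOUNT` header and the example URL are not given in this section and should be checked against the API reference, and `build_create_resource_request` is a hypothetical helper. The request is built but not sent.

```python
import json
import urllib.request


def build_create_resource_request(kb_path, api_key, slug, relations):
    """Prepare (without sending) a POST {kb-path}/resources request."""
    payload = {"slug": slug, "usermetadata": {"relations": relations}}
    return urllib.request.Request(
        url=f"{kb_path}/resources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: API keys are passed in this header; verify in the API reference
            "X-NUCLIA-SERVICEACCOUNT": f"Bearer {api_key}",
        },
        method="POST",
    )


relations = [
    {
        "from": {"value": "United Kingdom", "type": "entity", "group": "COUNTRY"},
        "to": {"value": "UK", "type": "entity", "group": "COUNTRY"},
        "relation": "SYNONYM",
    }
]

# Placeholder Knowledge Box path and key, for illustration only
request = build_create_resource_request(
    "https://example.invalid/kb/YOUR-KB-ID", "YOUR-API-KEY", "my-custom-graph", relations
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the prepared object to `urllib.request.urlopen(request)` would send it. Keeping construction separate from sending makes the payload easy to inspect first.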

As the data format is a bit verbose, when creating entity-to-entity relations only, you can use the add_graph method of the Nuclia CLI/SDK, which uses a simpler format:

from nuclia import sdk

kb = sdk.NucliaKB()
kb.add_graph(slug="my-custom-graph", graph=[
    {
        "source": {"group": "People", "value": "Alice"},
        "destination": {"group": "People", "value": "Bob"},
        "label": "is friend of",
    }
])
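
To relate the two formats, the sketch below expands a simple edge into the verbose relation structure. It assumes a one-to-one mapping (source to `from`, destination to `to`, relation fixed to `ENTITY`), which the text suggests but does not spell out. `to_relation` is a hypothetical helper, not something the SDK exposes.

```python
def to_relation(edge):
    """Expand one simple edge into the verbose relation format (assumed mapping)."""

    def node(simple_node):
        # The simple format only carries entities, so the type is always "entity"
        return {
            "value": simple_node["value"],
            "type": "entity",
            "group": simple_node["group"],
        }

    return {
        "from": node(edge["source"]),
        "to": node(edge["destination"]),
        "label": edge["label"],
        "relation": "ENTITY",
    }


edge = {
    "source": {"group": "People", "value": "Alice"},
    "destination": {"group": "People", "value": "Bob"},
    "label": "is friend of",
}
relation = to_relation(edge)
print(relation)
```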

Injecting entities produced by a generator agent

As mentioned in Data Augmentation Agents, you can use a generator agent to produce entities that are not directly linked to a given word or group of words. To inject these entities into a custom graph, you need to read the generated JSON text fields and store the corresponding relations in a resource.

Here is an implementation example. In a customer support context, a Knowledge Box contains user feedback, and we wish to build a Knowledge Graph showing the relations between products, features, issues and user sentiment. A generator agent has been used to extract this information into a JSON text field (prefixed as info).

import json
from nuclia import sdk

KNOWLEDGE_BOX = "YOUR-KB-URL"
API_KEY = "YOUR-API-KEY"

sdk.NucliaAuth().kb(url=KNOWLEDGE_BOX, token=API_KEY)

def upload(text):
    sdk.NucliaResource().create(
        texts={"text": {"format": "PLAIN", "body": text}},
    )

all_resources = sdk.NucliaKB().list()
for res in all_resources.resources:
    resource = sdk.NucliaResource().get(rid=res.id, show=['values'])
    # read the JSON data produced by the generator agent in each resource
    data_str = resource.data.texts['da-info-t-text'].value.body
    data = json.loads(data_str)
    graph = []
    product = data.get('product', None)
    feature = data.get('feature', None)
    issue = data.get('issue', None)
    sentiment = data.get('sentiment', None)
    # declare the kind of each extracted value
    if product:
        graph.append({'source': {"value": product, "group": "name"}, 'destination': {"value": "product", "group": "kind"}, 'label': 'is'})
    if feature:
        graph.append({'source': {"value": feature, "group": "name"}, 'destination': {"value": "feature", "group": "kind"}, 'label': 'is'})
    if issue:
        graph.append({'source': {"value": issue, "group": "name"}, 'destination': {"value": "issue", "group": "kind"}, 'label': 'is'})
    if sentiment:
        graph.append({'source': {"value": sentiment, "group": "name"}, 'destination': {"value": "sentiment", "group": "kind"}, 'label': 'is'})
    # build all the needed relations between the extracted values
    if product and feature:
        graph.append({'source': {"value": product, "group": "product"}, 'destination': {"value": feature, "group": "feature"}, 'label': 'has feature'})
    if product and issue:
        graph.append({'source': {"value": product, "group": "product"}, 'destination': {"value": issue, "group": "issue"}, 'label': 'has issue'})
    if product and sentiment:
        graph.append({'source': {"value": product, "group": "product"}, 'destination': {"value": sentiment, "group": "sentiment"}, 'label': 'has perceived sentiment'})
    if feature and issue:
        graph.append({'source': {"value": feature, "group": "feature"}, 'destination': {"value": issue, "group": "issue"}, 'label': 'has issue'})
    if feature and sentiment:
        graph.append({'source': {"value": feature, "group": "feature"}, 'destination': {"value": sentiment, "group": "sentiment"}, 'label': 'has perceived sentiment'})
    if issue and sentiment:
        graph.append({'source': {"value": issue, "group": "issue"}, 'destination': {"value": sentiment, "group": "sentiment"}, 'label': 'has perceived sentiment'})

    # store the graph built for this resource
    sdk.NucliaKB().update_graph(uid=res.id, graph=graph, override=True)
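
The relation-building logic in the script above can be separated from the SDK calls so that it can be checked without a Knowledge Box. The following is a minimal sketch of that idea, not the SDK's own approach: `build_relations`, `KINDS` and `LABELS` are hypothetical names, and the sample values are invented for illustration. It produces the same relations, in the same order, as the chain of `if` statements.

```python
from itertools import combinations

# Order matters: an earlier kind is always the source of a relation to a later kind
KINDS = ["product", "feature", "issue", "sentiment"]

# In the script above, the label depends only on the destination kind
LABELS = {
    "feature": "has feature",
    "issue": "has issue",
    "sentiment": "has perceived sentiment",
}


def build_relations(data):
    """Build the simple-format graph for one generator-agent JSON payload."""
    present = [kind for kind in KINDS if data.get(kind)]
    graph = []
    # Declare the kind of each extracted value
    for kind in present:
        graph.append({
            "source": {"value": data[kind], "group": "name"},
            "destination": {"value": kind, "group": "kind"},
            "label": "is",
        })
    # Link every pair of extracted values
    for src, dst in combinations(present, 2):
        graph.append({
            "source": {"value": data[src], "group": src},
            "destination": {"value": data[dst], "group": dst},
            "label": LABELS[dst],
        })
    return graph


# Invented sample payload, for illustration only
sample = {"product": "Widget", "issue": "crashes on startup"}
for relation in build_relations(sample):
    print(relation["source"]["value"], relation["label"], relation["destination"]["value"])
```

The list returned by `build_relations(data)` has the same shape as the `graph` variable in the script, so it could be passed to the same SDK call.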