Knowledge Graph & Reasoning

Introduction to Knowledge Graphs
RDF and Knowledge Graph Components
Loading RDF Data
Querying Knowledge Graphs
The Reasoner
Defining Rules
Forward Chaining
Backward Chaining
Integrity Constraints
N3 Logic Rules
ML Integration in Rules
Benchmarks

Introduction to Knowledge Graphs

A knowledge graph is a structured representation of interconnected data. It captures entities, relationships between entities, and attributes in a meaningful network, making it easier to explore and derive insights from complex datasets.

RDF and Knowledge Graph Components

In RDF (Resource Description Framework), a knowledge graph consists of:

Entities: Represented by subjects and objects (e.g., people, organizations).
Relationships: Represented by predicates connecting entities.
Attributes: Properties providing additional details about entities.
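
As a rough illustration of the three components, the sketch below models triples as plain Python tuples and separates attributes (literal objects) from relationships (IRI objects). The `is_iri` check and the short term names are simplifying assumptions for this example, not part of Kolibrie.

```python
# Each RDF statement is a (subject, predicate, object) triple.
triples = [
    ("ex:alice", "foaf:name", '"Alice Smith"'),   # attribute: object is a literal
    ("ex:alice", "ex:worksAt", "ex:company1"),    # relationship: object is an entity
    ("ex:alice", "foaf:knows", "ex:bob"),         # relationship: object is an entity
]

def is_iri(term):
    # Simplification for this sketch: literals are written with surrounding quotes.
    return not term.startswith('"')

entities = {s for s, _, _ in triples} | {o for _, _, o in triples if is_iri(o)}
relationships = [t for t in triples if is_iri(t[2])]
attributes = [t for t in triples if not is_iri(t[2])]

print(sorted(entities))
print(len(relationships), "relationships,", len(attributes), "attribute")
```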

Loading RDF Data

Create a SparqlDatabase and load your RDF data:

use kolibrie::SparqlDatabase;
use kolibrie::execute_query::execute_query;

fn main() {
    let rdf_data = r#"
    <?xml version="1.0" encoding="UTF-8"?>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:foaf="http://xmlns.com/foaf/0.1/"
             xmlns:ex="http://example.org/">
      <rdf:Description rdf:about="http://example.org/alice">
        <foaf:name>Alice Smith</foaf:name>
        <ex:worksAt rdf:resource="http://example.org/company1"/>
        <foaf:knows rdf:resource="http://example.org/bob"/>
      </rdf:Description>
      <rdf:Description rdf:about="http://example.org/bob">
        <foaf:name>Bob Johnson</foaf:name>
        <ex:worksAt rdf:resource="http://example.org/company2"/>
      </rdf:Description>
    </rdf:RDF>
    "#;

    let mut db = SparqlDatabase::new();
    db.parse_rdf(rdf_data);
}
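
To see which triples the RDF/XML sample encodes, the following sketch extracts them with Python's standard `xml.etree.ElementTree`. It is independent of Kolibrie's `parse_rdf` and handles only the `rdf:Description`, `rdf:about`, and `rdf:resource` shapes used in the sample; `extract_triples` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

rdf_data = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:ex="http://example.org/">
  <rdf:Description rdf:about="http://example.org/alice">
    <foaf:name>Alice Smith</foaf:name>
    <ex:worksAt rdf:resource="http://example.org/company1"/>
    <foaf:knows rdf:resource="http://example.org/bob"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/bob">
    <foaf:name>Bob Johnson</foaf:name>
    <ex:worksAt rdf:resource="http://example.org/company2"/>
  </rdf:Description>
</rdf:RDF>"""

def extract_triples(xml_text):
    """Return (subject, predicate, object) tuples from simple RDF/XML."""
    triples = []
    root = ET.fromstring(xml_text)
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree writes tags as "{namespace}local"; join them into one IRI.
            predicate = prop.tag[1:].replace("}", "", 1)
            # An rdf:resource attribute means an entity; otherwise use the text literal.
            obj = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples

triples = extract_triples(rdf_data)
for triple in triples:
    print(triple)
```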

Querying Knowledge Graphs

SPARQL queries retrieve entities and their relationships.

Basic Query

Retrieve people and their workplaces:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex: <http://example.org/>

SELECT ?person ?name ?company
WHERE {
    ?person foaf:name ?name .
    ?person ex:worksAt ?company
}

Advanced Query

Find people who know each other but work at different companies:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex: <http://example.org/>

SELECT ?person1 ?person2
WHERE {
    ?person1 foaf:knows ?person2 .
    ?person1 ex:worksAt ?company1 .
    ?person2 ex:worksAt ?company2 .
    FILTER(?company1 != ?company2)
}
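
The advanced query is a three-way join followed by a filter. The sketch below evaluates the same logic by hand over a small in-memory triple list, which may help when reasoning about why a row does or does not appear. The `carol` rows are hypothetical additions that are not in the sample data, and the nested loops are only an illustration, not how Kolibrie plans queries.

```python
triples = [
    ("alice", "knows", "bob"),
    ("alice", "worksAt", "company1"),
    ("bob", "worksAt", "company2"),
    ("carol", "knows", "alice"),       # hypothetical extra rows
    ("carol", "worksAt", "company1"),
]

def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the data."""
    return [o for s, p, o in triples if s == subject and p == predicate]

results = []
for person1, predicate, person2 in triples:
    if predicate != "knows":               # ?person1 foaf:knows ?person2
        continue
    for company1 in objects(person1, "worksAt"):
        for company2 in objects(person2, "worksAt"):
            if company1 != company2:       # FILTER(?company1 != ?company2)
                results.append((person1, person2))

print(results)
# prints [('alice', 'bob')]
```

Carol knows Alice, but both work at company1, so the filter removes that pair.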

The Reasoner

Beyond SPARQL queries, Kolibrie includes a Reasoner that lets you define logic rules and automatically derive new facts from existing data. The Reasoner operates independently of the query engine. Use it when you need rule-based inference that produces persistent new triples, not just query-time results.

Creating a Reasoner and Adding Facts

Rust:

use datalog::reasoning::Reasoner;

let mut kg = Reasoner::new();
kg.add_abox_triple("Alice", "hasParent", "Bob");
kg.add_abox_triple("Bob",   "hasParent", "Charlie");

Python:

import py_kolibrie

graph = py_kolibrie.PyKnowledgeGraph()
graph.add_abox_triple("Alice", "hasParent", "Bob")
graph.add_abox_triple("Bob",   "hasParent", "Charlie")

Defining Rules

A rule consists of one or more premise triple patterns and one or more conclusion triple patterns. When all premises match, the conclusion triples are asserted.

Example rule: “If X has a parent Y and Y has a parent Z, then X has a grandparent Z.”

Rust:

use shared::terms::Term;
use shared::rule::Rule;

// Encode the predicate strings into the dictionary
let mut dict = kg.dictionary.write().unwrap();
let parent_id      = dict.encode("hasParent");
let grandparent_id = dict.encode("hasGrandparent");
drop(dict);

let rule = Rule {
    premise: vec![
        (Term::Variable("X".into()), Term::Constant(parent_id), Term::Variable("Y".into())),
        (Term::Variable("Y".into()), Term::Constant(parent_id), Term::Variable("Z".into())),
    ],
    negative_premise: vec![],
    conclusion: vec![(
        Term::Variable("X".into()),
        Term::Constant(grandparent_id),
        Term::Variable("Z".into()),
    )],
    filters: vec![],
};

kg.add_rule(rule);

Python:

has_parent      = graph.encode_term("hasParent")
has_grandparent = graph.encode_term("hasGrandparent")

rule = py_kolibrie.PyRule(
    premise=[
        py_kolibrie.PyTriplePattern(
            py_kolibrie.PyTerm.Variable("X"),
            py_kolibrie.PyTerm.Constant(has_parent),
            py_kolibrie.PyTerm.Variable("Y")),
        py_kolibrie.PyTriplePattern(
            py_kolibrie.PyTerm.Variable("Y"),
            py_kolibrie.PyTerm.Constant(has_parent),
            py_kolibrie.PyTerm.Variable("Z")),
    ],
    filters=[],
    conclusion=[py_kolibrie.PyTriplePattern(
        py_kolibrie.PyTerm.Variable("X"),
        py_kolibrie.PyTerm.Constant(has_grandparent),
        py_kolibrie.PyTerm.Variable("Z"),
    )],
)
graph.add_rule(rule)
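
To make the premise and conclusion mechanics concrete, here is a minimal, library-free sketch of one rule application: each premise pattern is matched against the facts, variable bindings are carried from one premise to the next, and the conclusion is instantiated for every complete binding. Writing variables as `?`-prefixed strings and the helpers `match` and `apply_rule` are assumptions of this sketch, not Kolibrie's internals.

```python
def match(pattern, fact, binding):
    """Extend `binding` so that `pattern` equals `fact`, or return None."""
    binding = dict(binding)
    for term, value in zip(pattern, fact):
        if term.startswith("?"):                      # variable
            if binding.setdefault(term, value) != value:
                return None                           # conflicts with earlier binding
        elif term != value:                           # constant mismatch
            return None
    return binding

def apply_rule(premises, conclusions, facts):
    """Return the triples produced by one pass of the rule over `facts`."""
    bindings = [{}]
    for pattern in premises:
        bindings = [
            extended
            for binding in bindings
            for fact in facts
            if (extended := match(pattern, fact, binding)) is not None
        ]
    return {
        tuple(binding.get(term, term) for term in conclusion)
        for binding in bindings
        for conclusion in conclusions
    }

facts = {("Alice", "hasParent", "Bob"), ("Bob", "hasParent", "Charlie")}
premises = [("?X", "hasParent", "?Y"), ("?Y", "hasParent", "?Z")]
conclusions = [("?X", "hasGrandparent", "?Z")]

print(apply_rule(premises, conclusions, facts))
# prints {('Alice', 'hasGrandparent', 'Charlie')}
```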

Forward Chaining

Forward chaining starts from the known facts and applies rules repeatedly until no new facts can be derived.

Inference Methods

Method	Use Case
`infer_new_facts()`	Small datasets; basic forward chaining
`infer_new_facts_semi_naive()`	Larger datasets; efficient incremental reasoning
`infer_new_facts_semi_naive_parallel()`	Large-scale; multi-threaded inference
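
The difference between basic and semi-naive forward chaining can be sketched with the textbook versions of both algorithms. The example uses a hypothetical transitive `hasAncestor` relation stored as `(x, y)` pairs, and counts how many fact pairs each strategy examines. This illustrates the general technique only; it makes no claim about how Kolibrie implements these methods.

```python
def step(left, right):
    """Join (x, y) from `left` with (y, z) from `right` to conclude (x, z)."""
    return {(x, z) for (x, y1) in left for (y2, z) in right if y1 == y2}

def naive(base):
    """Re-run the rule over all facts each round until nothing new appears."""
    facts, pairs_examined = set(base), 0
    while True:
        pairs_examined += len(facts) * len(facts)
        new = step(facts, facts) - facts
        if not new:
            return facts, pairs_examined
        facts |= new

def semi_naive(base):
    """Only join combinations that involve a fact derived in the last round."""
    facts, delta, pairs_examined = set(base), set(base), 0
    while delta:
        pairs_examined += 2 * len(delta) * len(facts)
        derived = step(delta, facts) | step(facts, delta)
        delta = derived - facts
        facts |= delta
    return facts, pairs_examined

# A chain n0 -> n1 -> ... -> n6 of direct ancestor links.
chain = {(f"n{i}", f"n{i + 1}") for i in range(6)}

naive_facts, naive_cost = naive(chain)
semi_facts, semi_cost = semi_naive(chain)
print(naive_facts == semi_facts, naive_cost, semi_cost)
```

Both strategies reach the same fixpoint; the semi-naive variant examines fewer pairs because it never re-joins two facts that were both already known in an earlier round.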

Rust — run inference and print derived facts:

let inferred = kg.infer_new_facts_semi_naive();

let dict = kg.dictionary.read().unwrap();
for triple in &inferred {
    let s = dict.decode(triple.subject).unwrap_or("?");
    let p = dict.decode(triple.predicate).unwrap_or("?");
    let o = dict.decode(triple.object).unwrap_or("?");
    println!("{s} {p} {o}");
}

Output:

Alice hasGrandparent Charlie

Python:

inferred = graph.infer_new_facts()
for subject, predicate, obj in inferred:
    print(f"{subject} {predicate} {obj}")
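
The Rust examples encode strings into integer IDs before building rules and decode them again when printing. A dictionary of that kind can be sketched as a two-way mapping. The class below is illustrative only; in particular, starting IDs at 1 so that 0 remains unused is an assumption of this sketch, not a documented property of Kolibrie's dictionary.

```python
class Dictionary:
    """Bidirectional string <-> integer ID mapping (illustrative sketch)."""

    def __init__(self):
        self._ids = {}
        self._terms = [None]          # index 0 intentionally unused (assumption)

    def encode(self, term):
        """Return the ID for `term`, assigning a fresh one on first use."""
        if term not in self._ids:
            self._ids[term] = len(self._terms)
            self._terms.append(term)
        return self._ids[term]

    def decode(self, term_id):
        """Return the string for `term_id`, or None if it is unknown."""
        if 0 < term_id < len(self._terms):
            return self._terms[term_id]
        return None

d = Dictionary()
triple = tuple(d.encode(t) for t in ("Alice", "hasGrandparent", "Charlie"))
print(triple)
print(" ".join(d.decode(i) or "?" for i in triple))
```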

Querying After Inference

Query the ABox (instance-level facts) after inference has run:

Rust:

let results = kg.query_abox(Some("Alice"), Some("hasGrandparent"), None);

Python:

# Returns all (subject, predicate, object) tuples in the ABox
all_facts = graph.query_abox()
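
The Rust call fixes the subject and predicate and leaves the object open. That kind of wildcard lookup can be sketched in plain Python as follows; the `query` helper and its `None`-means-anything convention are assumptions chosen to mirror the Rust call, not the `py_kolibrie` signature.

```python
abox = [
    ("Alice", "hasParent", "Bob"),
    ("Bob", "hasParent", "Charlie"),
    ("Alice", "hasGrandparent", "Charlie"),   # derived by the grandparent rule
]

def query(facts, subject=None, predicate=None, obj=None):
    """Return triples matching the given positions; None matches anything."""
    return [
        (s, p, o)
        for s, p, o in facts
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]

print(query(abox, "Alice", "hasGrandparent"))
# prints [('Alice', 'hasGrandparent', 'Charlie')]
```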

Backward Chaining

Backward chaining works in reverse — given a goal pattern, Kolibrie proves whether it holds by working backwards through the rules. This is useful for answering specific queries rather than materializing all possible derivations.

Rust:

use shared::terms::Term;

let grandparent_id = kg.dictionary.write().unwrap().encode("hasGrandparent");

let query_pattern = (
    Term::Variable("X".into()),
    Term::Constant(grandparent_id),
    Term::Variable("Z".into()),
);

let results = kg.backward_chaining(&query_pattern);
// results: Vec<HashMap<String, Term>>
for binding in &results {
    println!("{:?}", binding);
}
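
The goal-directed search can be illustrated with a small textbook backward chainer: a goal is proved either by unifying it with a stored fact or by unifying it with a rule head and then proving that rule's body. Everything here (`walk`, `unify`, `rename`, `prove`, the `?` variable convention, and the depth limit) is a hypothetical sketch of the general technique, not Kolibrie's `backward_chaining` implementation.

```python
import itertools

def walk(term, env):
    """Follow variable bindings until reaching a constant or unbound variable."""
    while term.startswith("?") and term in env:
        term = env[term]
    return term

def unify(a, b, env):
    """Return an extended environment that makes triples a and b equal, or None."""
    env = dict(env)
    for x, y in zip(a, b):
        x, y = walk(x, env), walk(y, env)
        if x == y:
            continue
        if x.startswith("?"):
            env[x] = y
        elif y.startswith("?"):
            env[y] = x
        else:
            return None
    return env

_fresh = itertools.count()

def rename(rule):
    """Give the rule's variables fresh names so separate uses do not clash."""
    n = next(_fresh)
    fix = lambda t: tuple(f"{x}_{n}" if x.startswith("?") else x for x in t)
    head, body = rule
    return fix(head), [fix(p) for p in body]

def prove(goals, env, facts, rules, depth=8):
    """Yield every environment under which all goals hold."""
    if not goals:
        yield env
        return
    if depth == 0:                       # guard against unbounded recursion
        return
    goal, rest = goals[0], goals[1:]
    for fact in facts:                   # case 1: the goal is a stored fact
        e = unify(goal, fact, env)
        if e is not None:
            yield from prove(rest, e, facts, rules, depth)
    for rule in rules:                   # case 2: the goal follows from a rule
        head, body = rename(rule)
        e = unify(goal, head, env)
        if e is not None:
            yield from prove(body + rest, e, facts, rules, depth - 1)

facts = [("Alice", "hasParent", "Bob"), ("Bob", "hasParent", "Charlie")]
rules = [(("?X", "hasGrandparent", "?Z"),
          [("?X", "hasParent", "?Y"), ("?Y", "hasParent", "?Z")])]

query = ("?A", "hasGrandparent", "?B")
answers = [(walk("?A", e), walk("?B", e)) for e in prove([query], {}, facts, rules)]
print(answers)
# prints [('Alice', 'Charlie')]
```

Only the facts needed for this one goal are touched; no `hasGrandparent` triple is ever stored.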

Integrity Constraints

Integrity constraints are rules whose conclusion signals an inconsistency. When a constraint fires, Kolibrie can automatically repair the knowledge graph by removing one of the conflicting triples.

Example: “No entity can be both a Professor and a Student.”

Rust:

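// Re-acquire the dictionary write lock to encode the constraint's terms
let mut dict = kg.dictionary.write().unwrap();
let isa_id       = dict.encode("isA");
let professor_id = dict.encode("Professor");
let student_id   = dict.encode("Student");
drop(dict);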

// A constraint conclusion uses sentinel (0, 0, 0) to signal violation
let constraint = Rule {
    premise: vec![
        (Term::Variable("X".into()), Term::Constant(isa_id), Term::Constant(professor_id)),
        (Term::Variable("X".into()), Term::Constant(isa_id), Term::Constant(student_id)),
    ],
    negative_premise: vec![],
    conclusion: vec![(Term::Constant(0), Term::Constant(0), Term::Constant(0))],
    filters: vec![],
};
kg.add_constraint(constraint);

// Inference with automatic repair of violations
let inferred = kg.infer_new_facts_semi_naive_with_repairs();

Python:

isa_id       = graph.encode_term("isA")
professor_id = graph.encode_term("Professor")
student_id   = graph.encode_term("Student")

constraint = py_kolibrie.PyRule(
    premise=[
        py_kolibrie.PyTriplePattern(
            py_kolibrie.PyTerm.Variable("X"),
            py_kolibrie.PyTerm.Constant(isa_id),
            py_kolibrie.PyTerm.Constant(professor_id)),
        py_kolibrie.PyTriplePattern(
            py_kolibrie.PyTerm.Variable("X"),
            py_kolibrie.PyTerm.Constant(isa_id),
            py_kolibrie.PyTerm.Constant(student_id)),
    ],
    filters=[],
    conclusion=[py_kolibrie.PyTriplePattern(
        py_kolibrie.PyTerm.Constant(0),
        py_kolibrie.PyTerm.Constant(0),
        py_kolibrie.PyTerm.Constant(0),
    )],
)
graph.add_constraint(constraint)
inferred = graph.infer_new_facts_semi_naive_with_repairs()
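
Conceptually, a constraint check finds the sets of triples that jointly violate it, and a repair removes one triple from each set. The sketch below shows that idea for the Professor/Student constraint. The individuals `Dana` and `Eli`, the helper names, and the policy of dropping the Student triple are all assumptions of this example; the text above does not specify which triple Kolibrie removes.

```python
facts = {
    ("Dana", "isA", "Professor"),
    ("Dana", "isA", "Student"),
    ("Eli", "isA", "Student"),
}

def violations(facts):
    """Pairs of triples that make one entity both a Professor and a Student."""
    return [
        ((x, "isA", "Professor"), (x, "isA", "Student"))
        for (x, p, o) in sorted(facts)
        if p == "isA" and o == "Professor" and (x, "isA", "Student") in facts
    ]

def repair(facts):
    """Remove one triple from each violating pair."""
    repaired = set(facts)
    for professor_fact, student_fact in violations(facts):
        # Which triple to drop is a policy choice; this sketch drops the Student one.
        repaired.discard(student_fact)
    return repaired

print(violations(facts))
print(sorted(repair(facts)))
```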

Inconsistency-Tolerant Querying

Query for answers that follow from the facts common to every possible repair (IAR, intersection of ABox repairs, semantics):

Rust:

let results = kg.query_with_repairs(&query_pattern);

Python:

results = graph.query_with_repairs(query_pattern)
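
Under IAR semantics, a fact is trusted only if it survives in every repair. A minimal sketch of that idea, assuming the conflicts are already known and do not overlap: enumerate the repairs obtained by dropping one triple per conflict, intersect them, and query the result. The data and helper names are hypothetical, and this is not how `query_with_repairs` is necessarily implemented.

```python
from itertools import product

facts = {
    ("Dana", "isA", "Professor"),
    ("Dana", "isA", "Student"),
    ("Eli", "isA", "Student"),
}
# Each conflict is a set of triples that cannot all hold together.
conflicts = [{("Dana", "isA", "Professor"), ("Dana", "isA", "Student")}]

def repairs(facts, conflicts):
    """Yield every fact set obtained by dropping one triple per conflict.

    With non-overlapping conflicts, as here, these are exactly the repairs.
    """
    for removed in product(*conflicts):
        yield facts - set(removed)

def iar_facts(facts, conflicts):
    """Facts present in every repair."""
    return set.intersection(*repairs(facts, conflicts))

safe = iar_facts(facts, conflicts)
students = sorted(s for s, p, o in safe if p == "isA" and o == "Student")
print(students)
# prints ['Eli']
```

Dana is a Student in one repair and a Professor in the other, so neither fact about Dana is returned.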

N3 Logic Rules

Kolibrie parses N3 notation directly. N3 rules use the => arrow and can be sent to the HTTP server via the n3logic field:

@prefix ex: <http://example.org/> .

{ ?X ex:hasParent ?Y .
  ?Y ex:hasParent ?Z . }
=> { ?X ex:hasGrandparent ?Z . } .

Send via the HTTP server:

curl -X POST http://localhost:8080/query \
  -H "Content-Type: application/json" \
  -d '{
    "rdf": "<your RDF/XML data>",
    "n3logic": "@prefix ex: <http://example.org/> . { ?X ex:hasParent ?Y . ?Y ex:hasParent ?Z . } => { ?X ex:hasGrandparent ?Z . } .",
    "sparql": "PREFIX ex: <http://example.org/> SELECT ?x ?z WHERE { ?x ex:hasGrandparent ?z }"
  }'
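
The same request can be assembled from Python with only the standard library. This sketch builds the JSON body with the `rdf`, `n3logic`, and `sparql` fields shown in the curl command and prepares a POST to the same URL. The actual network call is left commented out because it needs a running server, and the response format is not described here, so the sketch would only print it raw.

```python
import json
import urllib.request

payload = {
    "rdf": "<your RDF/XML data>",
    "n3logic": (
        "@prefix ex: <http://example.org/> . "
        "{ ?X ex:hasParent ?Y . ?Y ex:hasParent ?Z . } "
        "=> { ?X ex:hasGrandparent ?Z . } ."
    ),
    "sparql": (
        "PREFIX ex: <http://example.org/> "
        "SELECT ?x ?z WHERE { ?x ex:hasGrandparent ?z }"
    ),
}

request = urllib.request.Request(
    "http://localhost:8080/query",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Uncomment once a Kolibrie HTTP server is listening on localhost:8080.
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Building the body with `json.dumps` avoids the manual quote escaping that an inline shell string requires.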

ML Integration in Rules

Rules can invoke ML models inside reasoning using the ML.PREDICT() syntax. Predictions from Python ML frameworks become first-class facts in the knowledge graph:

RULE :TemperatureForecast :-
CONSTRUCT { ?room ex:predictedTemp ?predicted_temp . }
WHERE {
  ?room sensor:temperature ?temp ;
        sensor:humidity    ?humidity ;
        sensor:occupancy   ?occupancy .
}
ML.PREDICT(MODEL "temperature_predictor",
    INPUT {
        SELECT ?room ?temp ?humidity ?occupancy
        WHERE {
            ?room sensor:temperature ?temp ;
                  sensor:humidity    ?humidity ;
                  sensor:occupancy   ?occupancy .
        }
    },
    OUTPUT ?predicted_temp
)

The model name maps to a registered Python callable. Kolibrie calls the model with the bound variables from the INPUT subquery and asserts the OUTPUT value into the CONSTRUCT pattern.
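
The flow described above (look up a callable by model name, call it with each row of INPUT bindings, and emit a triple carrying the OUTPUT value) can be sketched as follows. The `MODELS` registry, the `register` decorator, and the stand-in arithmetic are hypothetical; Kolibrie's actual registration mechanism is not shown in this section.

```python
MODELS = {}

def register(name):
    """Decorator that stores a callable under a model name (hypothetical registry)."""
    def wrap(fn):
        MODELS[name] = fn
        return fn
    return wrap

@register("temperature_predictor")
def predict_temperature(temp, humidity, occupancy):
    # Stand-in arithmetic for illustration, not a trained model.
    return round(temp + 0.5 * occupancy - 0.01 * humidity, 2)

def ml_predict(model_name, rows):
    """Call the model on each binding row and yield the constructed triples."""
    model = MODELS[model_name]
    for row in rows:
        predicted = model(row["temp"], row["humidity"], row["occupancy"])
        yield (row["room"], "ex:predictedTemp", predicted)

# One row of bindings, as the INPUT subquery might produce.
rows = [{"room": "ex:room1", "temp": 21.0, "humidity": 40.0, "occupancy": 2}]
triples = list(ml_predict("temperature_predictor", rows))
print(triples)
```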

Benchmarks

Kolibrie’s reasoning and query performance has been evaluated against established systems.

WatDiv 10M triple dataset (20 runs per query pattern):

Sub-millisecond to low-millisecond query times across all WatDiv patterns (L, S, F, C types)
Outperforms Blazegraph, QLever, and Oxigraph (RocksDB) consistently across query categories

Deep Taxonomy Reasoning (hierarchy depths from 10 to 10,000 levels):

Logarithmic scaling with hierarchy depth
Sub-second response times at 10,000 levels
Faster than Apache Jena and the EYE reasoner at all tested depths
