user@kolibrie:~/docs$ cat core-concepts.md

Core Concepts

This page introduces the foundational concepts behind Kolibrie: the RDF data model, the SPARQL query language, and the features Kolibrie brings on top of them.


RDF: The Data Model

RDF (Resource Description Framework) represents information as triples: a subject, a predicate, and an object. Every fact in your dataset is a triple.
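
The triple model needs no special machinery to picture. A minimal sketch in plain Python (tuples of IRIs, independent of Kolibrie or any RDF library):

```python
# A library-free illustration of the RDF data model:
# every fact is a (subject, predicate, object) tuple of IRIs.

EX = "http://example.org/"

# "Alice knows Bob" as a single triple
triple = (EX + "Alice", EX + "knows", EX + "Bob")

# A graph is simply a set of such triples
graph = {
    (EX + "Alice", EX + "knows", EX + "Bob"),
    (EX + "Bob", EX + "knows", EX + "Carol"),
}

subject, predicate, obj = triple
print(subject)  # http://example.org/Alice
```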

The same fact — “Alice knows Bob” — expressed in three common RDF serializations:

RDF/XML:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/">
  <rdf:Description rdf:about="http://example.org/Alice">
    <ex:knows rdf:resource="http://example.org/Bob"/>
  </rdf:Description>
</rdf:RDF>

Turtle:

@prefix ex: <http://example.org/> .
ex:Alice ex:knows ex:Bob .

N-Triples:

<http://example.org/Alice> <http://example.org/knows> <http://example.org/Bob> .

Kolibrie accepts all three formats (plus N3), so you can load data in whichever serialization you already have.


SPARQL: The Query Language

SPARQL (SPARQL Protocol and RDF Query Language) matches patterns against your RDF triples to retrieve or modify data. A complete annotated example:

PREFIX ex: <http://example.org/>

SELECT ?person ?salary
WHERE {
    ?person ex:hasOccupation "Engineer" .
    ?person ex:salary ?salary .
    FILTER (?salary > 60000)
}
ORDER BY DESC(?salary)
LIMIT 10
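
To make the clauses concrete, here is how such a query evaluates conceptually: match the triple patterns, join them on the shared variable, apply the filter, then sort and limit. A plain-Python sketch over a small made-up dataset (this illustrates the semantics, not Kolibrie's implementation):

```python
# Conceptual evaluation of the query above over an in-memory triple set.
EX = "http://example.org/"

triples = {
    (EX + "Alice", EX + "hasOccupation", "Engineer"),
    (EX + "Alice", EX + "salary", 85000),
    (EX + "Bob",   EX + "hasOccupation", "Engineer"),
    (EX + "Bob",   EX + "salary", 55000),
    (EX + "Carol", EX + "hasOccupation", "Engineer"),
    (EX + "Carol", EX + "salary", 92000),
}

# WHERE: match both patterns, joined on ?person
engineers = {s for (s, p, o) in triples
             if p == EX + "hasOccupation" and o == "Engineer"}
solutions = [(s, o) for (s, p, o) in triples
             if p == EX + "salary" and s in engineers]

# FILTER (?salary > 60000), then ORDER BY DESC(?salary), then LIMIT 10
results = sorted((r for r in solutions if r[1] > 60000),
                 key=lambda r: r[1], reverse=True)[:10]

for person, salary in results:
    print(person, salary)
```

Bob is dropped by the filter; Carol sorts above Alice by salary.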

Key clauses:

Clause            Purpose
PREFIX            Define namespace shortcuts
SELECT            Choose which variables to return
WHERE             Specify triple patterns to match
FILTER            Apply conditions to restrict results
BIND              Compute and assign new variables
GROUP BY          Aggregate results by a variable
INSERT / DELETE   Add or remove triples
VALUES            Provide inline data bindings
ORDER BY          Sort results
LIMIT / OFFSET    Page through results

What Kolibrie Adds

Kolibrie is a complete semantic data platform built around SPARQL. Here is what it supports beyond a basic query engine:

Multi-Format RDF Parsing

Load data from RDF/XML, Turtle, N3, N-Triples, or their RDF-star variants. File-based loading is automatically parallelized.

Full SPARQL 1.1

SELECT, INSERT, DELETE, FILTER, BIND, GROUP BY, VALUES, ORDER BY, LIMIT, OFFSET, CONCAT, nested queries, and user-defined functions (UDFs) — the full language, not a subset.

Stream Processing (RSP-QL)

Write continuous queries over timestamped RDF streams using sliding windows. Kolibrie evaluates RSTREAM, ISTREAM, and DSTREAM operators as new events arrive.
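
The sliding-window idea at the heart of RSP-QL can be sketched in a few lines. Plain Python with an assumed event shape of `(timestamp, triple)` — this is a conceptual model, not Kolibrie's actual streaming API:

```python
from collections import deque

# Conceptual sliding window over timestamped triples.
# The window retains only events from the last `width` time units.
class SlidingWindow:
    def __init__(self, width):
        self.width = width
        self.events = deque()

    def push(self, timestamp, triple):
        self.events.append((timestamp, triple))
        # Evict events that have slid out of the window
        while self.events and self.events[0][0] <= timestamp - self.width:
            self.events.popleft()

    def contents(self):
        """The triples a continuous query would currently see."""
        return [t for (_, t) in self.events]

w = SlidingWindow(width=10)
w.push(1, ("ex:sensor1", "ex:reading", 20))
w.push(5, ("ex:sensor1", "ex:reading", 21))
w.push(14, ("ex:sensor1", "ex:reading", 25))  # evicts the t=1 event
print(w.contents())
```

An RSTREAM operator would emit the full window contents at each evaluation; ISTREAM and DSTREAM emit only the triples that entered or left the window since the previous evaluation.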

→ Stream Processing

Knowledge Graph Reasoning

Define logic rules and let Kolibrie derive new facts automatically using forward chaining, backward chaining, or semi-naive evaluation. Supports integrity constraints, inconsistency repair, and probabilistic reasoning.
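
Forward chaining itself is a simple fixpoint loop: apply every rule to the current facts, add whatever is new, and stop when a pass derives nothing. A minimal sketch in plain Python with a hypothetical rule format (a rule is a function from facts to derived facts) — not Kolibrie's rule syntax:

```python
# Naive forward chaining: apply rules until no new facts are derived.

def knows_is_symmetric(facts):
    """Example rule: if (a, knows, b) then (b, knows, a)."""
    return {(o, p, s) for (s, p, o) in facts if p == "knows"}

def forward_chain(facts, rules):
    facts = set(facts)
    while True:
        derived = set()
        for rule in rules:
            derived |= rule(facts) - facts
        if not derived:      # fixpoint reached: nothing new this pass
            return facts
        facts |= derived

facts = forward_chain({("Alice", "knows", "Bob")}, [knows_is_symmetric])
print(sorted(facts))
```

Semi-naive evaluation refines this loop by feeding each pass only the facts derived in the previous pass, avoiding re-deriving everything from scratch.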

→ Knowledge Graph & Reasoning

ML Integration

Call machine learning models directly inside reasoning rules using the ML.PREDICT() syntax. Predictions from Python ML frameworks become first-class facts in your knowledge graph.
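
The key idea — a model's output becoming an ordinary fact — can be sketched without any ML framework. Plain Python with a dummy "model" standing in for a real predictor (this is not the ML.PREDICT machinery, just the concept):

```python
# A dummy "model": derives a label from a numeric value.
def predict_risk(salary):
    return "high" if salary > 80000 else "low"

facts = {("ex:Alice", "ex:salary", 85000)}

# A rule that turns model predictions into new triples
derived = {(s, "ex:riskLevel", predict_risk(o))
           for (s, p, o) in facts if p == "ex:salary"}

facts |= derived
print(sorted(facts))
```

Once the prediction is a triple, later rules and queries can match on it like any hand-loaded fact.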

Deployment Options

Mode                  How
Native Rust library   Add as a Cargo dependency
Python bindings       pip install maturin && maturin develop
HTTP server + web UI  cargo run --bin kolibrie-http-server
Docker (CPU)          docker compose up --build
Docker (GPU/CUDA)     docker compose --profile gpu up --build

RDF-star / SPARQL-star

RDF-star (formerly RDF*) allows you to annotate existing triples — that is, use a triple itself as the subject or object of another triple. This is useful for provenance, confidence scores, and metadata.

Example: stating that Alice knows Bob with a confidence of 0.95:

@prefix ex: <http://example.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<< ex:Alice ex:knows ex:Bob >> ex:confidence "0.95"^^xsd:decimal .

Kolibrie parses and stores quoted triples natively. You can query them using SPARQL-star syntax in your WHERE clauses.
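
One way to picture a quoted triple is as a triple nested in the subject position of another triple. A plain-Python sketch (a conceptual model, not Kolibrie's internal representation):

```python
EX = "http://example.org/"

# The base triple being annotated
base = (EX + "Alice", EX + "knows", EX + "Bob")

# RDF-star: the triple itself is the subject of the annotation triple,
# mirroring  << ex:Alice ex:knows ex:Bob >> ex:confidence "0.95" .
annotation = (base, EX + "confidence", 0.95)

subject, predicate, confidence = annotation
print(subject)      # the quoted triple
print(confidence)
```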


Data Flow at a Glance

A typical Kolibrie workflow:

  1. Load — parse RDF data from files or strings in any supported format
  2. Query — run SPARQL SELECT, INSERT, or DELETE statements; or build queries programmatically using the QueryBuilder API
  3. Reason — optionally apply rules to derive new facts from existing ones
  4. Stream — for live data, register RSP-QL windows and push timestamped triples as events arrive
  5. Integrate — expose everything through the REST API, or consume results from Python

→ Continue to the SPARQL Tutorial