
Performance & Optimization

Table of Contents

  1. Query Optimization

  2. Cost-Based Query Optimizer

  3. Performance Tuning

  4. Benchmarks

  5. Best Practices


Query Optimization

Index Usage

Kolibrie automatically maintains three index types to cover common access patterns:

Index                              Pattern         Best For
SPO (Subject–Predicate–Object)     ?s ex:p ex:o    Subject-centric lookups
POS (Predicate–Object–Subject)     ex:p ex:o ?s    Predicate-based searches
OSP (Object–Subject–Predicate)     ex:o ?s ex:p    Object-centric traversal

Indexes are managed automatically — no manual configuration is required. After a large batch of inserts, you can explicitly rebuild all indexes:

db.build_all_indexes();
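
A minimal sketch of that workflow, assuming only the calls shown on this page; the file name and the Turtle triple are placeholders:

use kolibrie::SparqlDatabase;

let mut db = SparqlDatabase::new();

// Apply the whole batch of writes first ...
db.parse_rdf_from_file("large_dataset.rdf");
let turtle_batch = String::from("<http://example.org/s> <http://example.org/p> <http://example.org/o> .");
db.parse_turtle(&turtle_batch);

// ... then rebuild the SPO/POS/OSP indexes a single time
db.build_all_indexes();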

Query Planning

Kolibrie’s query planner optimizes SPARQL queries automatically by:

  1. Analyzing triple patterns — identifying the most selective patterns first
  2. Optimizing join order — minimizing the size of intermediate result sets
  3. Cost estimation — using pre-computed statistics to choose the cheapest execution path
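
For example, in the hypothetical query below (the example.org vocabulary is illustrative, not part of Kolibrie), the pattern binding a concrete email address is far more selective than the open-ended "knows" pattern, so the planner can evaluate it first and keep the intermediate result small:

use kolibrie::SparqlDatabase;
use kolibrie::execute_query::execute_query;

let mut db = SparqlDatabase::new();
db.parse_rdf_from_file("dataset.rdf");

// The selective email pattern is evaluated before the broader knows-join,
// so the intermediate result stays small.
let query = r#"
    SELECT ?person ?friend
    WHERE {
        ?person <http://example.org/email> "alice@example.org" .
        ?person <http://example.org/knows> ?friend .
    }
    LIMIT 100
"#;
let results = execute_query(query, &mut db);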

Cost-Based Query Optimizer

Kolibrie includes Streamertail, a cost-based optimizer that selects physical execution plans using a Volcano-style plan search. It builds statistics from your dataset on first use and caches them across queries in the same session.

Priming the Optimizer

For sessions with many queries, pre-build the statistics immediately after loading data:

use kolibrie::SparqlDatabase;
use kolibrie::execute_query::execute_query;

let mut db = SparqlDatabase::new();
db.parse_rdf_from_file("large_dataset.rdf");

// Build statistics once — cached for all subsequent queries
db.get_or_build_stats();

// Run many queries ...
for row in execute_query("SELECT ...", &mut db) { /* ... */ }

After bulk data modifications, invalidate the cache so the optimizer uses fresh statistics:

db.invalidate_stats_cache();
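
A minimal sketch of that sequence, using only the calls shown above; the Turtle content stands in for your bulk update:

// Bulk modification ...
let new_triples = String::from("<http://example.org/s> <http://example.org/p> <http://example.org/o> .");
db.parse_turtle(&new_triples);

// ... then drop the stale statistics; get_or_build_stats() (or the next query)
// rebuilds them from the updated data
db.invalidate_stats_cache();
db.get_or_build_stats();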

Advanced: Explicit Plan Control

Power users can interact with the optimizer directly:

use kolibrie::streamertail_optimizer::optimizer::Streamertail;

let mut optimizer = Streamertail::new(&db);
// Inspect or override the chosen physical plan for a specific query

Performance Tuning

Memory Management

For large datasets, use file-based loading instead of in-memory string parsing. parse_rdf_from_file() reads the file incrementally and is automatically parallelized:

use kolibrie::SparqlDatabase;

let mut db = SparqlDatabase::new();
db.parse_rdf_from_file("large_dataset.rdf");

For in-memory data, use the appropriate format-specific method:

db.parse_turtle(&turtle_string);
db.parse_ntriples_and_add(&ntriples_string);

Parallel Processing

Multi-threaded RDF parsing is automatic when using parse_rdf_from_file(). Kolibrie uses Rayon to distribute work across all available CPU cores with no configuration required.

For explicit multi-threaded query execution:

use kolibrie::execute_query::execute_query_rayon_parallel2_volcano;

// Distributes execution of the query across Rayon worker threads
let results = execute_query_rayon_parallel2_volcano(sparql_query, &mut db);

GPU Acceleration

CUDA-based GPU acceleration is available as an experimental feature for eligible join operations.

Recommended — Docker GPU profile:

docker compose --profile gpu up --build

Requires: NVIDIA GPU and NVIDIA Container Toolkit.

Manual build (Unix):

export LD_LIBRARY_PATH=<cuda_lib_path>:$LD_LIBRARY_PATH
cmake .
cmake --build .

Manual build (Windows):

cmake -G "NMake Makefiles" -DCMAKE_BUILD_TYPE=Release .
cmake --build .

Note: CUDA support is experimental. For production workloads, multi-threaded CPU processing is recommended.


Benchmarks

Kolibrie has been evaluated against established RDF systems on standard benchmarks.

Query Performance — WatDiv 10M

20 runs per query pattern on the WatDiv 10M triple dataset:

  • Sub-millisecond to low-millisecond query times across all query types (L, S, F, C patterns)
  • Consistently faster than Blazegraph, QLever, and Oxigraph (RocksDB) on all tested patterns

Reasoning Performance — Deep Taxonomy

Hierarchy reasoning at depths from 10 to 10,000 levels:

  • Logarithmic scaling with hierarchy depth
  • Sub-second response time at 10,000 hierarchy levels
  • Outperforms Apache Jena and the EYE reasoner at all tested depths

Best Practices

  1. Use specific predicates — clearly defined predicates allow the query planner to select the most efficient index.
  2. Always use LIMIT — limit result sets to reduce memory consumption and execution time for exploratory queries.
  3. Structure queries for index coverage — place the most selective triple pattern first in your WHERE clause.
  4. Batch INSERT operations — group inserts to avoid repeated index rebuilds.
  5. Prime statistics before query-heavy sessions — call db.get_or_build_stats() after loading data and before running many queries.
  6. Use parse_rdf_from_file() for large files — automatically parallelized; significantly faster than parsing strings for files over a few MB.
  7. Invalidate cache after bulk writes — call db.invalidate_stats_cache() after large INSERT or DELETE operations to keep the optimizer’s estimates accurate.
  8. Choose inference strategy by scale — use infer_new_facts_semi_naive_parallel() for large knowledge graphs; use infer_new_facts() for small graphs during development.
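
Several of these practices combine into a single loading-and-querying routine. The sketch below is illustrative only: the dataset path, the example vocabulary, and the Turtle snippet are placeholders, and it uses only the calls shown earlier on this page.

use kolibrie::SparqlDatabase;
use kolibrie::execute_query::execute_query;

let mut db = SparqlDatabase::new();

// (6) Parse large files straight from disk; parsing is parallelized automatically
db.parse_rdf_from_file("large_dataset.rdf");

// (5) Prime the optimizer statistics before a query-heavy session
db.get_or_build_stats();

// (1) and (2) A specific predicate and an explicit LIMIT
let query = r#"
    SELECT ?sensor
    WHERE { ?sensor <http://example.org/status> "active" . }
    LIMIT 100
"#;
let results = execute_query(query, &mut db);

// (4) and (7) Batch the writes, rebuild indexes once, and refresh statistics
let new_triples = String::from("<http://example.org/s> <http://example.org/p> <http://example.org/o> .");
db.parse_turtle(&new_triples);
db.build_all_indexes();
db.invalidate_stats_cache();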

→ Go to API Reference