Architecture · Master Class

Seven layers. Measured to the microsecond.

This is the technical showcase of Ri.NET. Not marketing. Not aspiration. The computational topology of a civic intelligence operating system, documented from ingestion substrate to interface orchestration, with measured performance characteristics, operational workflows, and forensic-grade traceability.

Design principles

Ri.NET was not architected to win benchmarks. It was architected to not lie. Every design choice in the following seven layers flows from one invariant: no layer accepts as input what the layer below cannot prove. The consequence is a system that is simultaneously slower than alternatives at the lowest level and faster than alternatives at the highest, because nothing above the ingestion substrate re-validates what the substrate already guaranteed.

Five design principles shape the architecture:

  • I
    Provenance over performance. Every output carries a verifiable chain of custody. When a query returns an answer, the layers below can reconstruct the fifteen documents, three transforms, and two entity resolutions that produced it. This costs roughly 12% in raw throughput. It is the reason the system is usable in regulatory contexts.
  • II
    Temporal by default, timeless never. Nothing in the graph is "true now." Everything is "true during interval [t1, t2]." A query about 2019 returns the 2019 world. A query without a time anchor returns the present world but explicitly labels itself as temporally unconstrained. The absence of a temporal qualifier is itself a qualifier.
  • III
    Resolution is reversible. When the entity engine merges two records into a single canonical identity, it retains the merge proof. If new evidence contradicts the merge, the engine unmerges, recomputes dependents, and rebuilds affected graph slices. The system does not accumulate identity errors.
  • IV
    Agents execute, humans decide. Six hundred and sixty-six autonomous agents operate the platform. Not one of them makes a judgment call. They ingest, they normalize, they flag, they escalate. Decisions remain human. This is a deliberate architectural choice, not a limitation of current AI capability.
  • V
    Zero-trust internal fabric. No service inside Ri.NET trusts any other service by default. Every internal call carries a cryptographic assertion of origin, scope, and intent. Lateral movement is architecturally impossible, not merely discouraged. The cost is approximately 8% end-to-end latency overhead. The benefit is that a compromise of any single layer does not propagate.

System topology

The full computational topology, rendered below, shows every layer, every data store, every agent class, and the critical pathways between them. Solid arrows indicate primary data flow. Dashed arrows indicate audit and telemetry feedback.

[Diagram, as text]
Sovereign Data Sources (40+ endpoints): Sudski reg. · e-Oglasna · RGFI / FINA · EU-SIMAP · EUR-Lex · NN · Zakoni · Cadastre · News APIs
01 Ingestion Substrate: collector agents · fault recovery · source-specific parsing · provenance anchoring (~180K rec/day)
02 Normalization & Ontology: 47 transform modules · schema harmonization · unit/currency conversion · encoding repair (HRK→EUR 100%)
03 Entity Resolution Engine: deterministic (OIB/MB/VAT) + probabilistic reconciliation · reversible merges (1.8M entities)
04 Ri.NET Vector Cortex: 7 semantic collections · multilingual embeddings · 6.87M indexed vectors · 1024-dim · domain-adapted rerankers (RAGAS 3.7/5 · hallucination 0.3%)
05 Temporal Graph Fabric: 164,583 nodes · 891,247 edges · every edge carries a validity interval · time-range queries at depth 3 (p50 186ms · p95 487ms)
06 Agent Swarm Fabric (600+): 1 orchestrator · 5 cognitive domains · 60 coordinators · 600 workers · sandboxed · immutable audit log · blockchain-anchored (Polygon PoS) · 24/7 autonomous
07 Interface & Orchestration: query routing · auth · rate limits · DABI conversational · REST API · frontends (p50 142ms)
Consumers: Regulators · Journalists · Government Officials · Citizens · API Consumers
Feedback path: audit / provenance
Ri.NET system topology — data flow (solid) and audit feedback (dashed)
01

Ingestion Substrate

Bottom of the stack

The ingestion layer runs continuously against more than forty sovereign data sources. It is the only layer in Ri.NET that touches the outside world. Everything above it assumes the outside world is, for practical purposes, a lie — and operates accordingly.

Each source is owned by a dedicated collector agent. The collector knows the source's API contract, pagination quirks, rate limits, authentication scheme, encoding anomalies, and failure modes. It does not assume stability. It assumes that any request can fail, any response can be malformed, and any source can silently change its schema on a Tuesday morning. The system survives all three.

Active sources: 40+ (sovereign endpoints)
Records/day: ~180K (validated + persisted)
Peak throughput: 14.8K (records per minute)
Retry policy: 3 × exp (then human escalation)
Source change detection: < 1h (schema drift alarm)
Provenance anchoring: 100% (every ingested row)

Ingestion workflow

  1. Source polling or webhook subscription. Each collector runs on its own schedule. High-frequency sources (news, procurement announcements) are webhook-driven. Low-frequency sources (statute amendments, annual financial filings) run on cron.
  2. Raw persistence before parsing. The raw response is persisted before any parsing attempt. If parsing fails, the raw bytes remain available for inspection and retry. This is non-negotiable.
  3. Source-specific parser. A collector-owned parser transforms the raw response into a canonical ingestion event. Parsers are individually tested, versioned, and rollbackable.
  4. Checksum + provenance record. Every ingestion event receives a deterministic checksum, a source URL, a timestamp, a parser version, and a collector identity. This bundle is the provenance anchor.
  5. Handoff to Normalization (Layer 02). Validated events are pushed onto a bounded queue consumed by the normalization layer. If Layer 02 is slow, ingestion backpressures rather than dropping. Nothing is ever lost silently.
  6. Failure escalation. If a collector fails three retries with exponential backoff, the failure is logged, a Telegram alert fires, and the source is marked as degraded. A human decides whether to patch the parser or wait for the source to recover.
Technical invariant: A collector that cannot parse a response does not guess. It persists the raw bytes, flags the failure with full context, and stops. Silent data corruption at the ingestion layer is the single most expensive class of bug in data platforms. Ri.NET architecturally cannot produce it.
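
The workflow above can be sketched in a few lines. This is a minimal illustration, not Ri.NET's collector code: the names `ProvenanceAnchor`, `fetch_with_backoff`, `ingest`, and `Degraded` are hypothetical, the raw store is a plain dictionary, and SHA-256 is assumed as the deterministic checksum. It shows the two ordering rules the text insists on: retry three times with exponential backoff before escalating, and persist raw bytes before any parse attempt.

```python
import hashlib
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ProvenanceAnchor:
    checksum: str        # SHA-256 of the raw bytes (assumed checksum function)
    source_url: str
    fetched_at: float    # Unix timestamp
    parser_version: str
    collector_id: str


class Degraded(Exception):
    """Raised when all retries are exhausted; a human takes over from here."""


def fetch_with_backoff(fetch: Callable[[], bytes], retries: int = 3,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> bytes:
    """One initial attempt plus `retries` retries, doubling the delay each time."""
    last_error: Optional[Exception] = None
    for attempt in range(retries + 1):
        try:
            return fetch()
        except Exception as exc:  # any failure counts; the source is untrusted
            last_error = exc
            if attempt < retries:
                sleep(base_delay * 2 ** attempt)
    raise Degraded(f"source failed after {retries} retries") from last_error


def ingest(fetch, parse, raw_store: dict, source_url: str,
           parser_version: str, collector_id: str, now=time.time):
    raw = fetch_with_backoff(fetch)
    checksum = hashlib.sha256(raw).hexdigest()
    raw_store[checksum] = raw   # persist raw bytes BEFORE parsing
    anchor = ProvenanceAnchor(checksum, source_url, now(),
                              parser_version, collector_id)
    event = parse(raw)          # may raise; the raw bytes are already safe
    return event, anchor
```

If `parse` raises, the exception propagates and nothing is guessed; the raw bytes stay in `raw_store` under their checksum for inspection and retry.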
02

Normalization & Ontology

Noise to substance

Raw records are unusable. A procurement contract from 2018 reports values in kuna. The same contract referenced from a 2024 statute uses euro. A corporate name appears as "d.o.o.," "d. o. o.," "DOO," or omitted entirely. Dates arrive as ISO-8601, US format, European format, and — in one memorable source — as Croatian text ("petnaesti svibnja dvije tisuće dvadeset i druge"). This layer harmonizes all of it.

Normalization operates through forty-seven independent transformation modules. Each module owns one narrow class of transformation: currency conversion (including the 2023 Croatian HRK → EUR transition applied retroactively across all pre-2023 financial records), company name canonicalization, date parsing, address geocoding, OIB checksum validation, encoding repair for historical records stored in CP1250 before UTF-8 became standard. Modules compose. The pipeline is explicit.
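
Two of the narrow transformations named above are easy to illustrate. The sketch below is an assumption about shape, not the actual modules: `hrk_to_eur`, `oib_is_valid`, and `normalize` are hypothetical names. The conversion uses the fixed rate of 7.53450 kuna per euro set for Croatia's euro adoption, and the OIB check uses the ISO 7064 MOD 11,10 scheme that the identifier's control digit is based on.

```python
from decimal import Decimal, ROUND_HALF_UP

# Fixed conversion rate set for Croatia's euro adoption on 1 January 2023.
HRK_PER_EUR = Decimal("7.53450")


def hrk_to_eur(amount_hrk: str) -> Decimal:
    """Convert a kuna amount to euro, rounded to cents."""
    eur = Decimal(amount_hrk) / HRK_PER_EUR
    return eur.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def oib_is_valid(oib: str) -> bool:
    """ISO 7064 MOD 11,10 check over an 11-digit OIB."""
    if len(oib) != 11 or not (oib.isascii() and oib.isdigit()):
        return False
    remainder = 10
    for ch in oib[:10]:
        remainder = (remainder + int(ch)) % 10 or 10  # a zero result becomes 10
        remainder = (remainder * 2) % 11
    control = (11 - remainder) % 10                   # maps 10 to 0
    return control == int(oib[10])


def normalize(record: dict, modules) -> dict:
    """Modules compose: each takes a record and returns a new record."""
    for module in modules:
        record = module(record)
    return record
```

Using `Decimal` rather than floats keeps the retroactive conversion reproducible to the cent, which matters when the same contract value is compared across pre-2023 and post-2023 documents.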

Transform modules: 47 (independent units)
Pre-2023 HRK→EUR: 100% (1.86M records)
Encoding repaired: 15+ tables (CP1250 → UTF-8)
Schema drift events caught: 340+ (in Q1 2026 alone)

The ontology discipline

Ontology in Ri.NET is not imposed. It is discovered. Entity types, relationship types, and constraints are derived from the actual statistical distribution of ingested data, then validated against annotations from domain experts, then frozen as the canonical schema. Nothing moves into the graph until the ontology has mapped it to a type the query layer already understands.

Operational note: When a new source produces a field that does not fit any existing ontology slot, normalization does not invent a new slot. It flags the field for human review. Schema evolution is deliberate. Schema sprawl is not.
03

Entity Resolution Engine

Reversible identity

The same company appears in a procurement record as "Tvornica XYZ d.o.o.," in a court register as "TVORNICA XYZ D.O.O.," in a financial statement as "Tvornica XYZ," in a news article as "Tvornica," in an older ownership disclosure as "Tvornica Y. i sinovi (prije: Tvornica XYZ)," and in three places where the OIB is misspelled by one digit. Entity resolution decides whether these six references describe one entity, two entities, or something in between.

Resolved entities: 1.8M (companies + institutions)
OIB-verified: 92% (deterministic chain)
Probabilistic resolution: 96.4% (precision @ 0.85 threshold)
Human-review flagged: ~3.6% (below confidence threshold)
Unmerge events: tracked (with full dependency rebuild)
Avg aliases per entity: 3.2 (across all sources)

Resolution signal hierarchy

  1. D.1
    Deterministic identifier (hardest signal). OIB, MB, VAT, passport, ISIN. Match → merge with confidence 1.0. Mismatch → refuse to merge regardless of other signals. No probabilistic overrides for hard identifiers.
  2. D.2
    Derivative identifier. Email domain, website, registered office address (street-level match). Weight: 0.75 when combined with D.3.
  3. P.1
    Name similarity (string level). Levenshtein + Jaro-Winkler + phonetic normalization. Suffix handling (d.o.o., j.d.o.o., d.d.) is non-distinguishing. Weight: 0.45 max, saturates quickly.
  4. P.2
    Temporal co-occurrence. Two records mentioned in the same document on the same date with similar names: strong signal. Weight: 0.30 per co-occurrence, capped at 0.65 total.
  5. P.3
    Structural co-occurrence. Shared board members, shared registered address, shared bank account (when disclosed). Weight: 0.20–0.40 depending on signal density.
  6. P.4
    Embedding-space proximity. Vector similarity in the semantic cortex over company descriptions, articles mentioning the entity, and declared activity codes. Weight: 0.15, used only as tiebreaker.
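
One way to read the hierarchy above as code is sketched here. The text gives the weights and caps but not how they combine, so the additive rule below is an assumption, as are the names `Signals` and `merge_confidence`. D.2 is left out because its combination rule is not fully specified, and the P.4 tiebreaker is interpreted as applying only when the other signals fall short of the 0.85 threshold.

```python
from dataclasses import dataclass
from typing import Optional

MERGE_THRESHOLD = 0.85  # threshold quoted alongside the precision figure


@dataclass
class Signals:
    # True/False when both records carry a hard identifier; None when absent.
    hard_id_match: Optional[bool] = None
    name_similarity: float = 0.0     # 0..1 from string comparison (P.1)
    temporal_cooccurrences: int = 0  # count of same-document, same-date hits (P.2)
    structural_weight: float = 0.0   # 0.20..0.40 when present (P.3)
    embedding_close: bool = False    # P.4


def merge_confidence(s: Signals) -> float:
    # D.1: hard identifiers decide outright, in either direction.
    if s.hard_id_match is True:
        return 1.0
    if s.hard_id_match is False:
        return 0.0
    # Assumed combination rule: capped contributions are summed.
    score = 0.45 * min(max(s.name_similarity, 0.0), 1.0)
    score += min(0.30 * s.temporal_cooccurrences, 0.65)
    score += min(max(s.structural_weight, 0.0), 0.40)
    # P.4 as tiebreaker: only consulted when the score is still short.
    if s.embedding_close and score < MERGE_THRESHOLD:
        score += 0.15
    return min(score, 1.0)
```

Under this reading, an identical name plus one co-occurrence scores about 0.75 and goes to human review; adding a shared registered address at weight 0.20 lifts it over the threshold.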
Reversibility guarantee: Every merge retains its proof trail. If new evidence contradicts a merge, the engine produces an unmerge operation, recomputes all downstream dependents (graph edges touching the merged node, vector embeddings that referenced the canonical form, query caches), and rebuilds affected slices. The system does not silently accumulate identity errors. It corrects them.
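
The guarantee can be pictured with a small in-memory model. This is an illustrative sketch under simplifying assumptions: `IdentityStore` and `MergeProof` are hypothetical names, merges are single-hop (no chains of merges), and "recompute dependents" is reduced to returning the set of identities whose downstream artifacts would need rebuilding.

```python
from dataclasses import dataclass, field


@dataclass
class MergeProof:
    canonical_id: str
    absorbed_id: str
    evidence: tuple      # the signals that justified the merge


@dataclass
class IdentityStore:
    canonical: dict = field(default_factory=dict)  # record id -> canonical id
    proofs: list = field(default_factory=list)     # retained proof trail

    def resolve(self, record_id: str) -> str:
        return self.canonical.get(record_id, record_id)

    def merge(self, keep: str, absorb: str, evidence) -> MergeProof:
        proof = MergeProof(keep, absorb, tuple(evidence))
        self.canonical[absorb] = keep
        self.proofs.append(proof)
        return proof

    def unmerge(self, proof: MergeProof) -> set:
        """Undo one merge and report which identities need dependents rebuilt."""
        self.proofs.remove(proof)
        del self.canonical[proof.absorbed_id]
        return {proof.canonical_id, proof.absorbed_id}
```

The point of keeping `MergeProof` objects is that an unmerge is an ordinary operation with a known blast radius, not a manual data repair.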
04

Ri.NET Vector Cortex

Semantic reasoning substrate

The cortex is not a retrieval index. It is a reasoning substrate. Retrieval is the first of several operations that happen at this layer before an answer leaves. The downstream layers (Graph Fabric, Agent Swarm, Interface) never see raw retrieval results. They see reconciled retrieval results: passages bound to resolved entities, reranked by domain-specific relevance, cross-validated against the temporal graph, and assembled into citation bundles.

Collections: 7 (domain-adapted)
Indexed vectors: 6.87M (1024-dim dense)
Languages: 47 (multilingual model)
Query latency: 94ms (p50, full pipeline)
RAGAS: 3.7/5 (6,400 eval pairs)
Hallucination rate: 0.3% (verified sampling)
Citation accuracy: 94.1% (correct article cited)
Reindex cycle: nightly (incremental, zero downtime)

The seven collections

Collection              Domain                                           Vectors
Legal                   Statutes, regulations, ordinances, amendments    307,329
Corporate Knowledge     Companies, institutions, roles, activities       4,690,000
Entities (v2)           Resolved entity descriptions + aliases           339,000
Procurement             Tender announcements, bids, awards, contracts    273,118
Press & News            Journalistic coverage, sentiment-tagged          847,000
Court & Compliance      Court notices, judgments, anomaly findings       312,000
Conversational Memory   DABI Q&A pairs, retrieval training set           10,664

Query pipeline inside the cortex

[Diagram, as text] Query (natural language + filters) → Embed (1024-dim query vector) → Route (to N collections) → Retrieve (top-K per collection, in parallel) → Rerank (domain-adapted cross-encoder) → Bind (to resolved entities + citation bundle) → Temporal check (Graph Fabric, Layer 05). Cortex query pipeline: median 94ms end-to-end for retrieve + rerank + bind + validate.
Cortex query pipeline — five operations, one round trip
What "reasoning substrate" means operationally: The cortex does not return "here are fifteen passages that look relevant." It returns "here are fifteen passages, each bound to the entity it describes, each cross-validated against the temporal graph to confirm the statement was true at the query's time anchor, each carrying a citation with document ID, page, paragraph, and extraction confidence." The distinction is the distinction between search and reasoning.
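
A reconciled result of the kind described above might be shaped as follows. This is a sketch only: `Citation`, `BoundPassage`, and `reconcile` are hypothetical names, the rerank score is taken as given rather than computed, and the temporal cross-validation is reduced to an interval check against the query's time anchor.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Citation:
    document_id: str
    page: int
    paragraph: int
    extraction_confidence: float


@dataclass(frozen=True)
class BoundPassage:
    text: str
    entity_id: str               # the resolved entity the passage describes
    valid_from: date
    valid_to: Optional[date]     # None means the statement still holds
    rerank_score: float
    citation: Citation


def reconcile(passages, as_of: date, top_k: int = 15):
    """Keep passages whose statement held at `as_of`, best-ranked first."""
    def held(p: BoundPassage) -> bool:
        return p.valid_from <= as_of and (p.valid_to is None or as_of <= p.valid_to)

    kept = [p for p in passages if held(p)]
    return sorted(kept, key=lambda p: p.rerank_score, reverse=True)[:top_k]
```

A passage that ranks highly but describes a relationship that had already ended at the time anchor never reaches the caller, which is the difference the paragraph above draws between search and reasoning.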
05

Temporal Graph Fabric

Time-first relationship store

Most graphs are timeless. They encode "A is connected to B." Ri.NET's graph encodes "A was connected to B from January 2019 until October 2022, then the connection type changed from 'board member' to 'advisor' until July 2024, at which point B exited A entirely." This changes everything about how queries work.

Nodes: 164,583 (typed + temporal)
Edges: 891,247 (all with intervals)
Edge types: 34 (ontology-defined)
p50 traversal: 186ms (depth 3, time-bounded)
p95 traversal: 487ms (complex patterns)
Time-range queries: native (no post-filtering)

Temporal edge schema

[Diagram, as text] Timeline 2019 · 2021 · 2023 · 2025 · now. Person (Ivo I.) → Company (Tvornica X): director_of, valid 2019-01-03 → 2022-10-14; advisor, valid 2022-10-15 → 2024-07-01 (closed). Legend: active relation · type transition · closed/archived.
A relationship between two nodes across six years — three distinct edges, all preserved

Query primitives

The graph exposes four primary query primitives. Each respects the temporal constraint by default.

  • Q1
    As-of traversal. "Who was on the board of Company X on 15 March 2021?" The graph returns edges whose validity interval contained that date. Closed edges that were active on that date are included. Edges that did not yet exist are excluded.
  • Q2
    Pattern match. "Find all cases where a director of company A is also a director of a winning bidder to a procurement by A, within the same calendar year." The graph matches structurally and temporally.
  • Q3
    Shortest path with time bounds. "What is the shortest connection between Person X and Institution Y using only relationships that were active between 2020 and 2024?" Returns path + validity timeline for every edge.
  • Q4
    Temporal diff. "What changed in Company X's beneficial ownership structure between 2022-01-01 and 2024-12-31?" Returns opened edges, closed edges, and type transitions in the interval.
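
Two of these primitives, Q1 and Q4, reduce to interval arithmetic over edges. The sketch below is a toy in-memory version, not the graph engine: `Edge`, `as_of`, and `temporal_diff` are hypothetical names, and a "type transition" is assumed to mean an edge between the same two nodes that opens the day after another of a different type closes.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    kind: str
    valid_from: date
    valid_to: Optional[date]   # None means the edge is still open


def as_of(edges, when: date):
    """Q1: edges whose validity interval contains `when`."""
    return [e for e in edges
            if e.valid_from <= when and (e.valid_to is None or when <= e.valid_to)]


def temporal_diff(edges, start: date, end: date):
    """Q4: edges opened, edges closed, and type transitions within [start, end]."""
    opened = [e for e in edges if start <= e.valid_from <= end]
    closed = [e for e in edges
              if e.valid_to is not None and start <= e.valid_to <= end]
    transitions = [(c, o) for c in closed for o in opened
                   if (c.src, c.dst) == (o.src, o.dst)
                   and c.kind != o.kind
                   and o.valid_from - c.valid_to == timedelta(days=1)]
    return opened, closed, transitions


# The two edges from the schema figure above.
director = Edge("Ivo I.", "Tvornica X", "director_of",
                date(2019, 1, 3), date(2022, 10, 14))
advisor = Edge("Ivo I.", "Tvornica X", "advisor",
               date(2022, 10, 15), date(2024, 7, 1))
```

With these two edges, `as_of([director, advisor], date(2021, 3, 15))` returns only the `director_of` edge, and a diff over 2022-01-01 to 2024-12-31 reports the director-to-advisor transition.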
06

Agent Swarm Fabric

600+ autonomous workers

Six hundred and sixty-six specialized autonomous agents operate Ri.NET under a single central orchestrator. The hierarchy is strict: one orchestrator, five cognitive domains (Legal, Financial, Civic, Sentinel, Journalist), sixty mid-level coordinators, six hundred task-specific workers. The number is not symbolic. It is the measured steady-state fleet size under current operational load.

Orchestrator: 1 (central router, DABI)
Cognitive domains: 5 (Legal/Fin/Civic/Sentinel/Jour.)
Coordinators: 60 (mid-level routing)
Workers: 600 (task-specific)
Total steady-state: 600+ (active agents)
Uptime: 24/7 (autonomous)
[Diagram, as text] DABI (×1) → Legal, Financial, Civic, Sentinel, Journalist (×1 each) → 12 coordinators per domain → 120 workers per domain. 1 + 5 + 60 + 600 = 666 agents · sandboxed · immutable audit log · blockchain-anchored (Polygon PoS)
Agent swarm hierarchy — hub-and-spoke with cognitive specialization
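
The fleet size follows directly from the hierarchy. The short calculation below uses the per-domain figures from the diagram (12 coordinators and 120 workers per domain); the function name `fleet_size` is illustrative.

```python
DOMAINS = ("Legal", "Financial", "Civic", "Sentinel", "Journalist")
COORDINATORS_PER_DOMAIN = 12
WORKERS_PER_DOMAIN = 120


def fleet_size() -> dict:
    """Agent count per tier, plus the total."""
    tiers = {
        "orchestrator": 1,
        "domains": len(DOMAINS),
        "coordinators": len(DOMAINS) * COORDINATORS_PER_DOMAIN,
        "workers": len(DOMAINS) * WORKERS_PER_DOMAIN,
    }
    tiers["total"] = sum(tiers.values())
    return tiers
```

The total comes to 1 + 5 + 60 + 600 = 666, matching the figure stated in the text.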
Why agents never decide: Every agent in Ri.NET has one job and a sharp boundary around that job. An ingestion agent ingests. A resolution agent resolves. An anomaly agent flags. None of them acts on findings. Findings route to humans with appropriate authority. This is not a limitation — it is an architectural commitment to a specific model of AI-assisted governance in which the AI finds, surfaces, cites, and explains, but the human decides. Anyone who promises you otherwise is selling you liability.
07

Interface & Orchestration

The only layer that speaks

Layer 07 is the only component of Ri.NET that external consumers — humans, API clients, frontends — ever touch directly. It routes queries, enforces authentication, applies rate limits, manages sessions, and presents the reasoning substrate beneath through three surfaces: the conversational interface (DABI), the structured REST API, and a set of domain-specific frontends.

End-to-end p50: 142ms (entity resolution)
End-to-end p95: 487ms (complex graph query)
Uptime: 99.94% (trailing 90 days)
Surfaces: 3 (DABI / API / UI)
Response caching: zero (staleness = correctness bug)
No response caching: The interface layer caches nothing. Every response is freshly computed from the current state of the substrate. This is a deliberate tradeoff — civic intelligence staleness is a correctness bug, not a performance concern. A two-hour-old answer to "who currently owns this company" is not a slightly slower correct answer. It is a wrong answer.

End-to-end data flow

A single ingestion event — say, a new court notice published on e-Oglasna at 08:14 — flows through the entire stack in median 2.3 seconds from source availability to query-ready state. The sequence:

  1. T+0ms
    Court notice published. Source webhook fires. Collector agent receives notification.
  2. T+180ms
    Raw fetch complete. Collector retrieves the document. Raw bytes persisted. Provenance record created.
  3. T+420ms
    Parse complete. Source-specific parser extracts: case number, parties (by name), court, filing date, notice type. Validated against source-specific schema.
  4. T+680ms
    Normalized. Dates harmonized, encoding repaired, party names canonicalized. Handoff to entity resolution.
  5. T+1,100ms
    Entities resolved. Each named party resolved to a canonical entity ID. Resolution confidence attached. Low-confidence matches flagged.
  6. T+1,450ms
    Cortex indexed. Document content embedded (1024-dim) and inserted into the Court & Compliance collection. Linked to resolved entities.
  7. T+1,900ms
    Graph updated. New edges added: (party_A → case_X), (party_B → case_X), with validity starting now. Temporal index updated.
  8. T+2,300ms
    Query-ready. The notice is now retrievable by semantic query, entity lookup, or graph traversal. Any query issued from T+2.3s onward sees the new data.
  9. T+2,600ms
    Agent scan triggered. Sentinel agents evaluate the new event against 127 anomaly patterns. If any pattern matches above threshold, an alert is queued for the appropriate human.
  10. T+∞
    Audit anchored. Every operation logged to the immutable audit trail. Nightly batch anchors the log root to Polygon PoS. The entire chain is verifiable by any third party.
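
The cumulative timestamps above imply a per-stage budget. The snippet below derives it from the listed checkpoints up to the query-ready state; `CHECKPOINTS_MS` and `stage_durations` are illustrative names, and the figures are the medians quoted in the sequence.

```python
# Cumulative checkpoints from the sequence above, in milliseconds.
CHECKPOINTS_MS = [
    ("published", 0),
    ("raw fetch", 180),
    ("parse", 420),
    ("normalized", 680),
    ("entities resolved", 1100),
    ("cortex indexed", 1450),
    ("graph updated", 1900),
    ("query-ready", 2300),
]


def stage_durations(checkpoints):
    """Time spent reaching each checkpoint from the previous one."""
    return [(name, t - prev_t)
            for (_, prev_t), (name, t) in zip(checkpoints, checkpoints[1:])]


durations = stage_durations(CHECKPOINTS_MS)
slowest = max(durations, key=lambda pair: pair[1])
```

The stage durations are 180, 240, 260, 420, 350, 450, and 400 ms, summing to the stated 2,300 ms; the graph update is the longest single step at 450 ms.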

Query lifecycle

Consider the query: "Which companies won procurement contracts above €100K from KBC Rijeka in 2023, where at least one board member was also affiliated with a losing bidder?" The following illustrates the path this question takes through the system.

[Diagram, as text]
Step 1 · Parse: intent detection (graph + filter + bind)
Step 2 · Cortex: resolve KBC Rijeka (entity + aliases)
Step 3 · Graph A: find winners (contracts >€100K, 2023)
Step 4 · Graph B: enumerate losers (for each tender)
Step 5 · Board: fetch boards 2023 (with temporal validity)
Step 6 · Intersect: overlap detection (board ∩ board)
Step 7 · Cite: assemble citations (per finding)
Step 8 · Return: structured response (JSON + provenance)
Total: 8 steps · median end-to-end 487ms (Layer 07 → 04 → 03 → 05 → 05 → 05 → 04 → 07)
Result example: 12 contracts · 3 matched overlap patterns · 47 documents cited · 2 flagged for deeper review
Each result row carries: tender ID, winning bidder, losing bidder, overlapping director name, board-role-valid-from, source URLs
Compound query lifecycle — one natural-language question, eight layer hops, assembled response with full provenance
What this illustrates: A question a regulator would take two analysts three weeks to answer by hand becomes a 487-millisecond operation. Not because Ri.NET is magic. Because every layer in the stack already did the preparation work.
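
The core of steps 3 to 6 is a set intersection over time-bounded board memberships. The sketch below is a toy version with hypothetical names (`board_during`, `find_overlaps`) and hypothetical record shapes: memberships are `(person, company, valid_from, valid_to)` tuples and tenders are dictionaries. It assumes "affiliated in 2023" means the membership interval overlaps the calendar year.

```python
from datetime import date


def active_in(valid_from, valid_to, start, end) -> bool:
    """True if [valid_from, valid_to] overlaps [start, end]; None = still open."""
    return valid_from <= end and (valid_to is None or valid_to >= start)


def board_during(memberships, company, start, end) -> set:
    return {person for person, comp, vf, vt in memberships
            if comp == company and active_in(vf, vt, start, end)}


def find_overlaps(tenders, memberships, year, min_value_eur=100_000):
    start, end = date(year, 1, 1), date(year, 12, 31)
    findings = []
    for t in tenders:
        # "Above €100K" is read as strictly greater than the threshold.
        if t["value_eur"] <= min_value_eur or t["award_date"].year != year:
            continue
        winners_board = board_during(memberships, t["winner"], start, end)
        for loser in t["losers"]:
            shared = winners_board & board_during(memberships, loser, start, end)
            for person in sorted(shared):
                findings.append((t["id"], t["winner"], loser, person))
    return findings
```

Each finding row mirrors part of the result shape listed in the diagram (tender ID, winning bidder, losing bidder, overlapping director); citations and validity dates would be attached in the later steps.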

Security topology

Ri.NET implements zero-trust as an architectural invariant, not a retrofit. No service trusts any other service. Every call carries a cryptographic assertion of origin, scope, and intent. Every operation is logged to an immutable audit trail. The audit trail is anchored nightly to a public blockchain.
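
The text says the nightly batch anchors a log root but does not say how that root is computed. A Merkle tree over the day's log entries is one common construction and is assumed here purely for illustration; `merkle_root` is a hypothetical name, and the on-chain submission step is omitted.

```python
import hashlib
from typing import Sequence


def merkle_root(entries: Sequence[bytes]) -> str:
    """SHA-256 Merkle root of the entries; an odd node is paired with itself."""
    if not entries:
        return hashlib.sha256(b"").hexdigest()
    level = [hashlib.sha256(e).digest() for e in entries]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()
```

Only the 32-byte root needs to be published. Anyone holding the log can recompute it, and a change to any single entry produces a different root.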

[Diagram, as text, outermost layer first]
External perimeter: TLS 1.3 · HSTS · Cloudflare DDoS · nginx rate limiting
Auth & Session: OAuth2 · JWT · per-session keys · MFA enforced
Zero-trust Internal: mTLS · SPIFFE identities · scope tokens on every call
Data: AES-256 at rest · per-tenant key derivation · audit anchored to Polygon PoS
Concentric security layers — each enforces independently, no single point of compromise
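
To make "a cryptographic assertion of origin, scope, and intent" concrete, here is a deliberately simplified stand-in. The platform itself is described as using mTLS with SPIFFE identities and scope tokens; the sketch below swaps in a shared-key HMAC from the Python standard library only to show the shape of the check. The function names, claim fields, and 30-second lifetime are all assumptions.

```python
import hashlib
import hmac
import json
import time


def sign_assertion(key: bytes, origin: str, scope: str, intent: str,
                   ttl_s: float = 30, now=None):
    """Return (payload, tag) binding origin, scope, intent, and an expiry."""
    issued = time.time() if now is None else now
    claims = {"origin": origin, "scope": scope,
              "intent": intent, "exp": issued + ttl_s}
    payload = json.dumps(claims, sort_keys=True, separators=(",", ":")).encode()
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload, tag


def verify_assertion(key: bytes, payload: bytes, tag: str,
                     required_scope: str, now=None) -> bool:
    """Reject on bad signature, wrong scope, or expiry."""
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, tag):
        return False
    claims = json.loads(payload)
    current = time.time() if now is None else now
    return claims["scope"] == required_scope and claims["exp"] > current
```

The receiving service checks the assertion on every call rather than trusting the network location of the caller, so a token scoped for one operation is useless for any other.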

Read the full DPIA summary on the DPIA page. Enterprise and sovereign deployments receive the complete seventeen-page DPIA package as part of onboarding.

Seen enough?

The documentation above describes the system as it operates today. The API exposes it. The demo shows it running against real data.
