Seven layers. Measured to the microsecond.
This is the technical showcase of Ri.NET. Not marketing. Not aspiration. The computational topology of a civic intelligence operating system, documented from ingestion substrate to interface orchestration, with measured performance characteristics, operational workflows, and forensic-grade traceability.
Design principles
Ri.NET was not architected to win benchmarks. It was architected to not lie. Every design choice in the following seven layers flows from one invariant: no layer accepts as input what the layer below cannot prove. The consequence is a system that is simultaneously slower than alternatives at the lowest level and faster than alternatives at the highest, because nothing above the ingestion substrate re-validates what the substrate already guaranteed.
Five design principles shape the architecture:
- I. Provenance over performance. Every output carries a verifiable chain of custody. When a query returns an answer, the layers below can reconstruct the fifteen documents, three transforms, and two entity resolutions that produced it. This costs roughly 12% in raw throughput. It is the reason the system is usable in regulatory contexts.
- II. Temporal by default, timeless never. Nothing in the graph is "true now." Everything is "true during interval [t1, t2]." A query about 2019 returns the 2019 world. A query without a time anchor returns the present world but explicitly labels itself as temporally unconstrained. The absence of a temporal qualifier is itself a qualifier.
- III. Resolution is reversible. When the entity engine merges two records into a single canonical identity, it retains the merge proof. If new evidence contradicts the merge, the engine unmerges, recomputes dependents, and rebuilds affected graph slices. The system does not accumulate identity errors.
- IV. Agents execute, humans decide. Six hundred and sixty-six autonomous agents operate the platform. Not one of them makes a judgment call. They ingest, they normalize, they flag, they escalate. Decisions remain human. This is a deliberate architectural choice, not a limitation of current AI capability.
- V. Zero-trust internal fabric. No service inside Ri.NET trusts any other service by default. Every internal call carries a cryptographic assertion of origin, scope, and intent. Lateral movement is architecturally impossible, not merely discouraged. The cost is approximately 8% end-to-end latency overhead. The benefit is that a compromise of any single layer does not propagate.
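Principle III is easier to reason about with a concrete shape in mind. The following is a minimal sketch, not Ri.NET's implementation: `MergeProof`, `IdentityRegistry`, and the record IDs are hypothetical names chosen for illustration, and recomputing dependents after an unmerge is left out.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MergeProof:
    """Evidence kept for one merge so that it can be undone (hypothetical shape)."""
    merged_id: str      # record that was folded into another identity
    canonical_id: str   # identity it was folded into
    evidence: tuple     # signals that justified the merge


@dataclass
class IdentityRegistry:
    links: dict = field(default_factory=dict)    # merged id -> canonical id
    proofs: dict = field(default_factory=dict)   # merged id -> MergeProof

    def resolve(self, record_id: str) -> str:
        # Follow merge links until reaching an identity that was never merged away.
        while record_id in self.links:
            record_id = self.links[record_id]
        return record_id

    def merge(self, record_id: str, into: str, evidence: list) -> None:
        source, target = self.resolve(record_id), self.resolve(into)
        if source == target:
            return  # already the same identity
        self.links[source] = target
        self.proofs[source] = MergeProof(source, target, tuple(evidence))

    def unmerge(self, merged_id: str) -> MergeProof:
        # Drop the link and hand back the proof so a caller can audit the reversal.
        del self.links[merged_id]
        return self.proofs.pop(merged_id)


registry = IdentityRegistry()
registry.merge("court:TVORNICA XYZ D.O.O.", into="proc:Tvornica XYZ d.o.o.",
               evidence=["name similarity"])
```

Because the proof is stored next to the link, undoing a merge is a lookup rather than a forensic exercise. A real engine would also need to invalidate whatever was derived from the merged identity.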
System topology
The full computational topology, rendered below, shows every layer, every data store, every agent class, and the critical pathways between them. Solid arrows indicate primary data flow. Dashed arrows indicate audit and telemetry feedback.
Ingestion Substrate
Bottom of the stack
The ingestion layer runs continuously against more than forty sovereign data sources. It is the only layer in Ri.NET that touches the outside world. Everything above it assumes the outside world is, for practical purposes, a lie, and operates accordingly.
Each source is owned by a dedicated collector agent. The collector knows the source's API contract, pagination quirks, rate limits, authentication scheme, encoding anomalies, and failure modes. It does not assume stability. It assumes that any request can fail, any response can be malformed, and any source can silently change its schema on a Tuesday morning. The system survives all three.
Ingestion workflow
- 1. Source polling or webhook subscription. Each collector runs on its own schedule. High-frequency sources (news, procurement announcements) are webhook-driven. Low-frequency sources (statute amendments, annual financial filings) run on cron.
- 2. Raw persistence before parsing. The raw response is persisted before any parsing attempt. If parsing fails, the raw bytes remain available for inspection and retry. This is non-negotiable.
- 3. Source-specific parser. A collector-owned parser transforms the raw response into a canonical ingestion event. Parsers are individually tested, versioned, and rollbackable.
- 4. Checksum + provenance record. Every ingestion event receives a deterministic checksum, a source URL, a timestamp, a parser version, and a collector identity. This bundle is the provenance anchor.
- 5. Handoff to Normalization (Layer 02). Validated events are pushed onto a bounded queue consumed by the normalization layer. If Layer 02 is slow, ingestion backpressures rather than dropping. Nothing is ever lost silently.
- 6. Failure escalation. If a collector fails three retries with exponential backoff, the failure is logged, a Telegram alert fires, and the source is marked as degraded. A human decides whether to patch the parser or wait for the source to recover.
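Steps 4 and 6 of the workflow above can be sketched in a few lines. This is an illustration under stated assumptions: SHA-256 stands in for the unspecified "deterministic checksum", "three retries" is read as one initial attempt plus three retries, and `ProvenanceAnchor`, `make_anchor`, and `fetch_with_backoff` are hypothetical names.

```python
import hashlib
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvenanceAnchor:
    checksum: str        # SHA-256 of the raw bytes (assumed algorithm)
    source_url: str
    fetched_at: str      # ISO-8601 timestamp supplied by the collector
    parser_version: str
    collector_id: str


def make_anchor(raw: bytes, source_url: str, fetched_at: str,
                parser_version: str, collector_id: str) -> ProvenanceAnchor:
    # The checksum depends only on the raw bytes, so re-ingesting the same
    # response always yields the same value.
    checksum = hashlib.sha256(raw).hexdigest()
    return ProvenanceAnchor(checksum, source_url, fetched_at, parser_version, collector_id)


def fetch_with_backoff(fetch, retries: int = 3, base_delay: float = 1.0, sleep=time.sleep):
    """Call fetch(); on failure retry with doubling delays.

    Returns None once all retries are exhausted, which the caller would treat
    as "mark the source degraded and alert a human".
    """
    for attempt in range(retries + 1):
        try:
            return fetch()
        except Exception:
            if attempt == retries:
                return None
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the retry logic testable without real waiting. Logging and alerting are omitted here.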
Normalization & Ontology
Noise to substance
Raw records are unusable. A procurement contract from 2018 reports values in kuna. The same contract referenced from a 2024 statute uses euro. A corporate name appears as "d.o.o.," "d. o. o.," "DOO," or omitted entirely. Dates arrive as ISO-8601, US format, European format, and, in one memorable source, as Croatian text ("petnaesti svibnja dvije tisuće dvadeset i druge", that is, 15 May 2022). This layer harmonizes all of it.
Normalization operates through forty-seven independent transformation modules. Each module owns one narrow class of transformation: currency conversion (including the 2023 Croatian HRK → EUR transition applied retroactively across all pre-2023 financial records), company name canonicalization, date parsing, address geocoding, OIB checksum validation, encoding repair for historical records stored in CP1250 before UTF-8 became standard. Modules compose. The pipeline is explicit.
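Three of the module classes named above are small enough to sketch. The code below is an illustration, not one of the forty-seven modules: the function names, the record layout, and the choice to upper-case canonical names are assumptions. The HRK to EUR rate is the official fixed conversion rate (7.53450 kuna per euro), and the OIB check follows the ISO 7064 MOD 11,10 scheme.

```python
import re
from decimal import Decimal, ROUND_HALF_UP

HRK_PER_EUR = Decimal("7.53450")  # fixed conversion rate for Croatia's euro adoption

# Trailing legal-form suffixes: d.o.o., d. o. o., DOO, j.d.o.o., d.d.
_SUFFIX = re.compile(r"[\s,]+(?:(?:j\.?\s*)?d\.?\s*o\.?\s*o\.?|d\.\s*d\.?)\s*$", re.IGNORECASE)


def hrk_to_eur(amount_hrk: Decimal) -> Decimal:
    # Decimal arithmetic avoids binary floating-point drift in monetary values.
    return (amount_hrk / HRK_PER_EUR).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def oib_is_valid(oib: str) -> bool:
    # An OIB has 11 digits; the last one is an ISO 7064 MOD 11,10 check digit.
    if not re.fullmatch(r"[0-9]{11}", oib):
        return False
    a = 10
    for ch in oib[:10]:
        a = (a + int(ch)) % 10 or 10
        a = (a * 2) % 11
    return (11 - a) % 10 == int(oib[10])


def canonical_name(name: str) -> str:
    # Drop the legal-form suffix, collapse whitespace, and upper-case (assumed convention).
    stripped = _SUFFIX.sub("", name.strip())
    return " ".join(stripped.split()).upper()


def normalize(record: dict) -> dict:
    # Modules compose: each one rewrites a single field and leaves the rest alone.
    out = dict(record)
    out["name"] = canonical_name(record["name"])
    out["oib_valid"] = oib_is_valid(record["oib"])
    if record["currency"] == "HRK":
        out["amount"], out["currency"] = hrk_to_eur(record["amount"]), "EUR"
    return out
```

Keeping each transformation a pure function of its input is one way to get the explicit, composable pipeline the text describes. Date parsing, geocoding, and encoding repair would slot in the same way.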
The ontology discipline
Ontology in Ri.NET is not imposed. It is discovered. Entity types, relationship types, and constraints are derived from the actual statistical distribution of ingested data, then validated against annotations from domain experts, then frozen as the canonical schema. Nothing moves into the graph until the ontology has mapped it to a type the query layer already understands.
Entity Resolution Engine
Reversible identity
The same company appears in a procurement record as "Tvornica XYZ d.o.o.," in a court register as "TVORNICA XYZ D.O.O.," in a financial statement as "Tvornica XYZ," in a news article as "Tvornica," in an older ownership disclosure as "Tvornica Y. i sinovi (prije: Tvornica XYZ)," and in three places where the OIB is misspelled by one digit. Entity resolution decides whether these six references describe one entity, two entities, or something in between.
Resolution signal hierarchy
- D.1 Deterministic identifier (hardest signal). OIB, MB, VAT, passport, ISIN. Match → merge with confidence 1.0. Mismatch → refuse to merge regardless of other signals. No probabilistic overrides for hard identifiers.
- D.2 Derivative identifier. Email domain, website, registered office address (street-level match). Weight: 0.75 when combined with D.3.
- P.1 Name similarity (string level). Levenshtein + Jaro-Winkler + phonetic normalization. Suffix handling (d.o.o., j.d.o.o., d.d.) is non-distinguishing. Weight: 0.45 max, saturates quickly.
- P.2 Temporal co-occurrence. Two records mentioned in the same document on the same date with similar names: strong signal. Weight: 0.30 per co-occurrence, capped at 0.65 total.
- P.3 Structural co-occurrence. Shared board members, shared registered address, shared bank account (when disclosed). Weight: 0.20–0.40 depending on signal density.
- P.4 Embedding-space proximity. Vector similarity in the semantic cortex over company descriptions, articles mentioning the entity, and declared activity codes. Weight: 0.15, used only as tiebreaker.
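The hierarchy above states weights but not how they combine, so the sketch below fills that gap with an explicit assumption: probabilistic signals are combined with a noisy-OR, and name similarity scales linearly up to its 0.45 cap. D.2 and the P.4 tiebreaker are left out because their combination rules are not specified. `Signals` and `merge_confidence` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Signals:
    hard_id_match: Optional[bool] = None  # True: same OIB/MB/..., False: conflicting, None: absent
    name_similarity: float = 0.0          # 0..1 from string-level comparison (P.1)
    temporal_cooccurrences: int = 0       # count of same-document, same-date mentions (P.2)
    structural_density: float = 0.0       # 0..1; 0 means no structural signal (P.3)


def merge_confidence(s: Signals) -> float:
    # D.1: hard identifiers decide on their own, in either direction.
    if s.hard_id_match is True:
        return 1.0
    if s.hard_id_match is False:
        return 0.0

    density = min(max(s.structural_density, 0.0), 1.0)
    weights = [
        0.45 * min(max(s.name_similarity, 0.0), 1.0),        # P.1, capped at 0.45
        min(0.30 * s.temporal_cooccurrences, 0.65),          # P.2, 0.30 each, capped at 0.65
        (0.20 + 0.20 * density) if density > 0 else 0.0,     # P.3, 0.20 to 0.40
    ]

    # Noisy-OR (assumed): each signal independently "explains" the match.
    miss = 1.0
    for w in weights:
        miss *= 1.0 - w
    return 1.0 - miss
```

With this combination rule no set of probabilistic signals can reach 1.0, which is consistent with the rule that only a deterministic identifier yields full confidence.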
Ri.NET Vector Cortex
Semantic reasoning substrate
The cortex is not a retrieval index. It is a reasoning substrate. Retrieval is the first of several operations that happen at this layer before an answer leaves. The downstream layers (Graph Fabric, Agent Swarm, Interface) never see raw retrieval results. They see reconciled retrieval results: passages bound to resolved entities, reranked by domain-specific relevance, cross-validated against the temporal graph, and assembled into citation bundles.
The seven collections
| Collection | Domain | Vectors |
|---|---|---|
| Legal | Statutes, regulations, ordinances, amendments | 307,329 |
| Corporate Knowledge | Companies, institutions, roles, activities | 4,690,000 |
| Entities (v2) | Resolved entity descriptions + aliases | 339,000 |
| Procurement | Tender announcements, bids, awards, contracts | 273,118 |
| Press & News | Journalistic coverage, sentiment-tagged | 847,000 |
| Court & Compliance | Court notices, judgments, anomaly findings | 312,000 |
| Conversational Memory | DABI Q&A pairs, retrieval training set | 10,664 |
Query pipeline inside the cortex
Temporal Graph Fabric
Time-first relationship store
Most graphs are timeless. They encode "A is connected to B." Ri.NET's graph encodes "A was connected to B from January 2019 until October 2022, then the connection type changed from 'board member' to 'advisor' until July 2024, at which point B exited A entirely." This changes everything about how queries work.
Temporal edge schema
Query primitives
The graph exposes four primary query primitives. Each respects the temporal constraint by default.
- Q1. As-of traversal. "Who was on the board of Company X on 15 March 2021?" The graph returns edges whose validity interval contained that date. Closed edges that were active on that date are included. Edges that did not yet exist are excluded.
- Q2. Pattern match. "Find all cases where a director of company A is also a director of a winning bidder to a procurement by A, within the same calendar year." The graph matches structurally and temporally.
- Q3. Shortest path with time bounds. "What is the shortest connection between Person X and Institution Y using only relationships that were active between 2020 and 2024?" Returns path + validity timeline for every edge.
- Q4. Temporal diff. "What changed in Company X's beneficial ownership structure between 2022-01-01 and 2024-12-31?" Returns opened edges, closed edges, and type transitions in the interval.
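Q1 and Q4 can be illustrated with an in-memory list of edges. This is a sketch, not the graph's storage model: `TemporalEdge` and its fields are hypothetical, intervals are treated as closed to match the "[t1, t2]" notation used earlier, and the example dates pick arbitrary days within the months mentioned in the text. Type transitions are not detected separately; they show up as one closed edge plus one opened edge.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TemporalEdge:
    source: str
    target: str
    kind: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the edge is still open

    def active_on(self, day: date) -> bool:
        return self.valid_from <= day and (self.valid_to is None or day <= self.valid_to)


def as_of(edges, target: str, day: date):
    # Q1: edges into `target` whose validity interval contains `day`.
    return [e for e in edges if e.target == target and e.active_on(day)]


def temporal_diff(edges, target: str, start: date, end: date):
    # Q4: edges into `target` that opened or closed inside [start, end].
    mine = [e for e in edges if e.target == target]
    opened = [e for e in mine if start <= e.valid_from <= end]
    closed = [e for e in mine if e.valid_to is not None and start <= e.valid_to <= end]
    return opened, closed


edges = [
    TemporalEdge("B", "A", "board member", date(2019, 1, 1), date(2022, 10, 31)),
    TemporalEdge("B", "A", "advisor", date(2022, 11, 1), date(2024, 7, 31)),
]
on_board = as_of(edges, "A", date(2021, 3, 15))
opened, closed = temporal_diff(edges, "A", date(2022, 1, 1), date(2024, 12, 31))
```

Here `on_board` contains only the board-member edge, while the diff reports the advisor edge as opened and both edges as closed within the window.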
Agent Swarm Fabric
600+ autonomous workers
Six hundred and sixty-six specialized autonomous agents operate Ri.NET under a single central orchestrator. The hierarchy is strict: one orchestrator, five cognitive domains (Legal, Financial, Civic, Sentinel, Journalist), sixty mid-level coordinators, six hundred task-specific workers. The number is not symbolic. It is the measured steady-state fleet size under current operational load.
Interface & Orchestration
The only layer that speaks
Layer 07 is the only component of Ri.NET that external consumers (humans, API clients, frontends) ever touch directly. It routes queries, enforces authentication, applies rate limits, manages sessions, and presents the reasoning substrate beneath through three surfaces: the conversational interface (DABI), the structured REST API, and a set of domain-specific frontends.
End-to-end data flow
A single ingestion event (say, a new court notice published on e-Oglasna at 08:14) flows through the entire stack in a median of 2.3 seconds from source availability to query-ready state. The sequence:
- T+0 ms: Court notice published. Source webhook fires. Collector agent receives notification.
- T+180 ms: Raw fetch complete. Collector retrieves the document. Raw bytes persisted. Provenance record created.
- T+420 ms: Parse complete. Source-specific parser extracts: case number, parties (by name), court, filing date, notice type. Validated against source-specific schema.
- T+680 ms: Normalized. Dates harmonized, encoding repaired, party names canonicalized. Handoff to entity resolution.
- T+1,100 ms: Entities resolved. Each named party resolved to a canonical entity ID. Resolution confidence attached. Low-confidence matches flagged.
- T+1,450 ms: Cortex indexed. Document content embedded (1024-dim) and inserted into the Court & Compliance collection. Linked to resolved entities.
- T+1,900 ms: Graph updated. New edges added: (party_A → case_X), (party_B → case_X), with validity starting now. Temporal index updated.
- T+2,300 ms: Query-ready. The notice is now retrievable by semantic query, entity lookup, or graph traversal. Any query issued from T+2.3 s onward sees the new data.
- T+2,600 ms: Agent scan triggered. Sentinel agents evaluate the new event against 127 anomaly patterns. If any pattern matches above threshold, an alert is queued for the appropriate human.
- T+∞: Audit anchored. Every operation logged to the immutable audit trail. Nightly batch anchors the log root to Polygon PoS. The entire chain is verifiable by any third party.
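The timeline above lists cumulative checkpoints. Per-stage cost is the difference between consecutive ones, which a few lines of arithmetic make explicit. The checkpoint values are taken from the list; `stage_durations` is a hypothetical helper name.

```python
# Cumulative checkpoints (ms) from source availability to query-ready state.
CHECKPOINTS_MS = [
    ("Court notice published", 0),
    ("Raw fetch complete", 180),
    ("Parse complete", 420),
    ("Normalized", 680),
    ("Entities resolved", 1100),
    ("Cortex indexed", 1450),
    ("Graph updated", 1900),
    ("Query-ready", 2300),
]


def stage_durations(checkpoints):
    # Pair each checkpoint with its predecessor and subtract the timestamps.
    return [(name, t - prev_t)
            for (_, prev_t), (name, t) in zip(checkpoints, checkpoints[1:])]


durations = stage_durations(CHECKPOINTS_MS)
slowest = max(durations, key=lambda item: item[1])
total_ms = sum(ms for _, ms in durations)
```

Read this way, the graph update (450 ms) and entity resolution (420 ms) are the two largest contributors to the 2,300 ms median, ahead of the raw fetch at 180 ms.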
Query lifecycle
Consider the query: "Which companies won procurement contracts above €100K from KBC Rijeka in 2023, where at least one board member was also affiliated with a losing bidder?" The following illustrates the path this question takes through the system.
Security topology
Ri.NET implements zero-trust as an architectural invariant, not a retrofit. No service trusts any other service. Every call carries a cryptographic assertion of origin, scope, and intent. Every operation is logged to an immutable audit trail. The audit trail is anchored nightly to a public blockchain.
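To make "a cryptographic assertion of origin, scope, and intent" concrete, here is a minimal sketch that signs the three claims with an HMAC. It is an assumption-heavy illustration: the text does not say which primitive is used, a production zero-trust fabric would more likely rely on asymmetric signatures or mutual TLS than a shared key, and `sign_assertion`, `verify_assertion`, and the example service names are hypothetical.

```python
import hashlib
import hmac
import json

CLAIM_KEYS = ("origin", "scope", "intent")


def _payload(claims: dict) -> bytes:
    # Canonical JSON so that signer and verifier hash exactly the same bytes.
    return json.dumps(claims, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_assertion(key: bytes, origin: str, scope: str, intent: str) -> dict:
    claims = {"origin": origin, "scope": scope, "intent": intent}
    tag = hmac.new(key, _payload(claims), hashlib.sha256).hexdigest()
    return {**claims, "tag": tag}


def verify_assertion(key: bytes, assertion: dict) -> bool:
    claims = {k: assertion.get(k) for k in CLAIM_KEYS}
    expected = hmac.new(key, _payload(claims), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the tag through timing differences.
    return hmac.compare_digest(expected, str(assertion.get("tag", "")))


key = b"demo-key-not-for-production"
call = sign_assertion(key, origin="normalization", scope="entity:read", intent="resolve-party")
```

A receiving service would reject any call whose tag fails verification, so changing the scope after signing invalidates the assertion.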
Read the full DPIA summary on the DPIA page. Enterprise and sovereign deployments receive the complete seventeen-page DPIA package as part of onboarding.
Seen enough?
The documentation above describes the system as it operates today. The API exposes it. The demo shows it running against real data.