SEARCHING FOR INFO: ≈ 3 h a day per worker · Coveo 2025
WORK ABOUT WORK: 58% of the time · Asana, Anatomy of Work 2023
ITALIAN FIRMS · AI USE: 16.4% (≥ 10 staff) · Istat 2025
AI PROJECTS · SMB vs LARGE: 8% vs 71% · PoliMi Observatory 2025
AI ACT ART. 50 · IN FORCE: 2 Aug 2026 · EU 2024/1689
GARANTE FINE · CAREGGI: €80,000 · Provv. 474/2025
AI ACT ART. 50 · FINE: up to €15M or 3% turnover · art. 99
ITALIAN AI MARKET 2025: €1.8bn · +50% · PoliMi Observatory
Lemnia BUSINESS
KNOWLEDGE GRAPH · ENGINE ANATOMY

A deterministic representation of the company.

NODES: 38,412
EDGES: 218,490
ENTITY-CLASSES: 14
QUERY-LATENCY: < 200 ms

Lemnia's knowledge graph is a deterministic structure spanning the whole organisation: documents, topics, processes and external entities (customers, suppliers, products) with their relations. It is built while the company works, continuously updated as the company changes, cited on every answer, and kept on the company's own servers. The per-customer dossier is one projection onto this graph.

This page explains Lemnia on two levels of reading. The first is in plain language, accessible to anyone without a technical background. The second, expandable on demand under each paragraph, holds the architectural detail for CTOs, DPOs, accountants and consultants.

§ 01 INTERACTIVE EXPLORER
DEMO DATA · CUSTOMER BIANCHI

A click on a node opens its provenance citations. Dragging repositions the view, and the mouse wheel adjusts the zoom. The dataset is illustrative: the real graph of a company is built from the data the company chooses to connect.

EXPLORER ARCHITECTURE

Rendered via @xyflow/react 12.10 with a precomputed static layout (mulberry32 deterministic seed for reproducibility across reloads). The example dataset is hand-annotated. In the real product the graph is populated by the ingestion engine and navigation consults the SQLite + sqlite-vec store directly over gRPC mTLS on the LAN.
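
As a minimal sketch of why a seeded generator keeps the layout stable across reloads, here is a Python port of the public-domain mulberry32 generator driving a toy node placement. The `layout` helper, the seed value and the canvas size are illustrative assumptions, not Lemnia's actual configuration.

```python
def mulberry32(seed: int):
    """Return a function yielding floats in [0, 1) from a 32-bit seed."""
    state = seed & 0xFFFFFFFF

    def next_float() -> float:
        nonlocal state
        # All arithmetic is masked to 32 bits to mirror the original integer maths.
        state = (state + 0x6D2B79F5) & 0xFFFFFFFF
        t = state
        t = ((t ^ (t >> 15)) * (t | 1)) & 0xFFFFFFFF
        t ^= (t + (((t ^ (t >> 7)) * (t | 61)) & 0xFFFFFFFF)) & 0xFFFFFFFF
        return ((t ^ (t >> 14)) & 0xFFFFFFFF) / 4294967296

    return next_float


def layout(node_ids, seed=42, width=800, height=600):
    """Hypothetical helper: assign each node a position drawn from the seeded stream."""
    rand = mulberry32(seed)
    return {n: (round(rand() * width, 2), round(rand() * height, 2)) for n in node_ids}
```

Calling `layout` twice with the same node list and seed returns identical coordinates, which is the property a precomputed static layout relies on.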

§ 02 PIPELINE · SIX STAGES
FROM CONNECTOR TO CITATION
STAGE 01

Connection

Lemnia reads the systems present in the company (ERPs, e-commerce, mail, calendar, support, document repositories such as Google Drive, OneDrive, SharePoint and network folders, and external MCP servers) using only the credentials needed for reading. Writing always requires the explicit consent of the administrator.

TECHNICAL TRACK

Signed connectors built on the published protocols of the source systems (OAuth2 for Microsoft 365 / Google Workspace, REST APIs for Zucchetti, TeamSystem, SAP B1, WhatsApp Business Cloud API, IMAP IDLE for mail, file-share and cloud-drive connectors for Google Drive / OneDrive / SharePoint / network folders, and the Model Context Protocol for external MCP servers). Each connector inherits the document-level ACLs of the source system: that permission set is mirrored into the graph and enforced at retrieval, before ranking, so that no result, snippet or citation surfaces to a user who lacks source-system permission. CST.494
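
The ordering "enforce permissions first, rank second" can be illustrated with a short sketch. The `Candidate` type, the `retrieve` function and the principal names are hypothetical; the point shown is only that a document the user cannot read never reaches the ranker, however high its score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    doc_id: str
    score: float
    allowed_principals: frozenset  # ACL mirrored from the source system (assumed shape)


def retrieve(candidates, user_principals, top_k=3):
    """Drop candidates the user cannot read, then rank what is left."""
    visible = [c for c in candidates if c.allowed_principals & user_principals]
    return sorted(visible, key=lambda c: c.score, reverse=True)[:top_k]
```

Filtering before ranking also keeps relevance scores of hidden documents from influencing what the user sees.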

Credentials live in the OS keychain (Keychain Access on macOS, Credential Manager on Windows). Never persisted in clear text on the application disk. Refresh tokens rotated every 24h. CST.375

STAGE 02

Extraction

Scheduled or on-demand reads. Documents, ledger rows, messages, orders. Everything pseudonymised in transit, encrypted at rest.

TECHNICAL TRACK

Streaming ingestion for systems that support webhooks + IMAP IDLE; batch polling for the rest (configurable interval, default 5 min). Sub-5 min p95 from signal to visibility in the dossier. CST.269

At-rest encryption AES-256-GCM with per-tenant DEK, KEK held by the OS Keychain. Optional PII pseudonymisation during ingestion for categories configured by the DPO.

STAGE 03

Understanding

Local Italian-trained models identify the entities (customers, suppliers, products, cases, documents) and the relations between them (invoice → order → customer). On edge cases, a human reviewer assists the disambiguation.

TECHNICAL TRACK

NER + RE trained on an Italian corpus (medical + business + tax). Backbone Qwen3.5-4B Q4 for intent comprehension, mDeBERTa-v3-base-italian-NLI for consistency check. CST.333

Disambiguation via Qwen3-Embedding-0.6B over local context, with HITL (human-in-the-loop) modal fallback for cases below confidence 0.7. Disambiguation decisions feed continuous tenant-local training data.
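
A minimal sketch of threshold-gated disambiguation follows. It assumes that the confidence is the cosine similarity between the mention embedding and each candidate entity embedding; the text does not say how confidence is computed, so that choice, the function names and the toy vectors are illustrative only. The 0.7 floor is the one quoted above.

```python
import math

HITL_THRESHOLD = 0.7  # confidence floor quoted in the text


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def disambiguate(mention_vec, candidates):
    """candidates: dict entity_id -> embedding.

    Returns (entity_id or None, best_score, needs_human).
    """
    if not candidates:
        return None, 0.0, True
    best_id, best = max(
        ((eid, cosine(mention_vec, vec)) for eid, vec in candidates.items()),
        key=lambda pair: pair[1],
    )
    if best < HITL_THRESHOLD:
        return None, best, True  # hand the case to a human reviewer
    return best_id, best, False
```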

SOTA references: ATOM (arXiv 2510.22590) for 5-tuple atomic-fact extraction, the ingestion-time fact unit of the graph; HippoRAG 2 for ranking via personalized PageRank on the graph.

STAGE 04

Storage

The local graph is signed, and every node, edge and property carries a provenance citation recording which document, which line and which timestamp the fact came from.

TECHNICAL TRACK

Dual-node schema: Entity (customer, supplier, product, case) + Source (document, email, ledger row). Every SUPPORTS edge ties a fact to a source with char-level range (offset_start, offset_end) and BLAKE3 hash of the source content. CST.30
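
The SUPPORTS edge described above can be sketched as a small record plus a resolver that refuses to quote a source whose content has changed. This is an illustration under stated assumptions: the field names other than `offset_start` and `offset_end` are hypothetical, and the hash is BLAKE2b from the Python standard library standing in for BLAKE3, which the standard library does not provide.

```python
import hashlib
from dataclasses import dataclass


def content_hash(text: str) -> str:
    # Stand-in: BLAKE2b, NOT the BLAKE3 named in the text.
    return hashlib.blake2b(text.encode("utf-8"), digest_size=32).hexdigest()


@dataclass(frozen=True)
class SupportsEdge:
    fact: str
    source_id: str
    offset_start: int  # character offsets into the source text
    offset_end: int
    source_hash: str


def cite(fact, source_id, source_text, quote):
    start = source_text.index(quote)  # raises ValueError if the quote is absent
    return SupportsEdge(fact, source_id, start, start + len(quote), content_hash(source_text))


def resolve(edge, source_text):
    """Return the cited span, or None if the source changed since citation."""
    if content_hash(source_text) != edge.source_hash:
        return None
    return source_text[edge.offset_start:edge.offset_end]
```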

Storage backend: SQLite with sqlite-vec 0.1.9 for embeddings, rusqlite for relational tables, RocksDB for the blob store of original documents. Append-only log with a BLAKE3 seal per transaction, exportable as proof for GDPR Art. 30 audit.
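
One common way to make an append-only log tamper-evident is to chain the seals, so that each seal covers the previous one. The sketch below assumes that construction; the text only states that there is one seal per transaction. The class and function names are hypothetical, and BLAKE2b from the standard library stands in for BLAKE3.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the chain


def seal(prev_seal: str, payload: dict) -> str:
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    # Stand-in hash: BLAKE2b, not BLAKE3.
    return hashlib.blake2b((prev_seal + body).encode("utf-8"), digest_size=32).hexdigest()


class AppendOnlyLog:
    def __init__(self):
        self._entries = []

    def append(self, payload: dict) -> str:
        prev = self._entries[-1]["seal"] if self._entries else GENESIS
        entry = {"payload": payload, "seal": seal(prev, payload)}
        self._entries.append(entry)
        return entry["seal"]

    def export(self):
        """Return a copy of the entries, e.g. for an auditor."""
        return [dict(e) for e in self._entries]


def verify(entries) -> bool:
    """Recompute every seal in order; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for e in entries:
        if seal(prev, e["payload"]) != e["seal"]:
            return False
        prev = e["seal"]
    return True
```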

Community detection: Leiden algorithm for cluster analysis on frequently co-occurring entities. Batched PageRank refresh, configurable by graph size. CST.129

STAGE 05

Retrieval

When queried, Lemnia walks the graph for at most 3 hops and 8 seconds. Retrieval is deterministic, with no agent that self-organises and no unpredictable behaviour.

TECHNICAL TRACK

Hybrid BM25 + dense pipeline: query parsing → seed nodes via embedding similarity → multi-hop traversal with PPR weighting → cross-encoder reranking (mDeBERTa). Depth cap: 3 hops biz default, 2 medical. Time cap: 8 s total, fallback to top-3 evidence on timeout. CST.389
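
The traversal step can be sketched as personalized PageRank restricted to the neighbourhood reachable within the hop cap. This is a simplified illustration, not the production ranker: the graph is a plain adjacency dict, the damping factor and iteration count are assumed values, seed weights are assumed positive, and the timeout simply stops iterating rather than implementing the top-3 fallback.

```python
import time
from collections import deque


def within_hops(graph, seeds, max_hops=3):
    """Breadth-first search: nodes reachable from the seeds in at most max_hops."""
    depth = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if depth[node] == max_hops:
            continue
        for nxt in graph.get(node, ()):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return set(depth)


def personalized_pagerank(graph, seed_weights, max_hops=3, damping=0.85,
                          iterations=50, time_budget_s=8.0):
    deadline = time.monotonic() + time_budget_s
    nodes = within_hops(graph, list(seed_weights), max_hops)
    total = sum(seed_weights.values())
    restart = {n: seed_weights.get(n, 0.0) / total for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        if time.monotonic() > deadline:
            break  # keep the best estimate so far
        nxt = {n: (1 - damping) * restart[n] for n in nodes}
        for n in nodes:
            out = [m for m in graph.get(n, ()) if m in nodes]
            if not out:
                # Dangling node: send its mass back to the seeds.
                for m in nodes:
                    nxt[m] += damping * rank[n] * restart[m]
                continue
            share = damping * rank[n] / len(out)
            for m in out:
                nxt[m] += share
        rank = nxt
    return rank
```

Nodes beyond the hop cap never enter the computation, so the cost is bounded by the size of the local neighbourhood rather than the whole graph.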

Multi-entity seed weighting: each seed weighted by (confidence, PageRank_prior, freshness). Query routes through a deterministic classifier (FetchTimeline | FetchMultiHopChain | FetchAggregate | FetchSummary); a TF-IDF + SVM router sends single-fact queries straight to a direct FTS5 path, skipping graph traversal. No agentic loop on the local model.
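
The text names the three inputs to a seed's weight but not how they are combined. The sketch below assumes a simple product, with freshness modelled as exponential decay; the 90-day half-life and both function names are hypothetical choices made only for illustration.

```python
def seed_weight(confidence, pagerank_prior, age_days, half_life_days=90.0):
    """Assumed combination: product of the three factors, freshness halving every half-life."""
    freshness = 0.5 ** (age_days / half_life_days)
    return confidence * pagerank_prior * freshness


def normalise(weights):
    """Scale weights so they sum to 1, ready to use as a restart distribution."""
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()} if total else {}
```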

SOTA reference: HippoRAG 2 (arXiv 2502.14802) supplies the personalized-PageRank ranker for the multi-hop traversal. Generation stays deterministic, bound by cite-or-refuse.

STAGE 06

Citation

The answer is generated with every sentence anchored to a source-document passage. When the source does not exist, Lemnia states so and declines to invent.

TECHNICAL TRACK

5-step cite-or-refuse pipeline: (1) decomposition of the answer into atomic claims; (2) substring match against evidence set; (3) mDeBERTa-NLI for entailment verification; (4) KG-consistency check via traversal; (5) strip-and-replace for non-entailed claims. CST.82
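
The five steps can be sketched as a single function in which the two model-backed checks are passed in as callables. Several simplifications are assumptions: claims are split per sentence, the NLI check runs only when the literal match fails, and `entails` and `kg_consistent` are hypothetical stand-ins for the mDeBERTa-NLI model and the graph traversal.

```python
import re


def split_claims(answer: str):
    """Step 1 (simplified): one claim per sentence."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]


def cite_or_refuse(answer, evidence, entails, kg_consistent,
                   refusal="[no source found]"):
    out = []
    for claim in split_claims(answer):
        needle = claim.rstrip(".!?").lower()
        # Step 2: literal substring match against the evidence set.
        supported = any(needle in doc.lower() for doc in evidence)
        # Step 3: entailment check, here only when the literal match fails.
        if not supported:
            supported = any(entails(doc, claim) for doc in evidence)
        # Step 4: the claim must also agree with the graph.
        if supported and not kg_consistent(claim):
            supported = False
        # Step 5: strip-and-replace anything left unsupported.
        out.append(claim if supported else refusal)
    return " ".join(out)
```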

Per-pack hedging whitelist: verosimilmente, presumibilmente, si stima and pare are exempt from entailment verification, though limited to paragraphs marked as hedged dossier. Strict sections (dunning, quotation, supplier-risk) admit no hedging.
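
A minimal sketch of the whitelist rule: a sentence skips entailment verification only if it contains a whitelisted hedge, sits in a paragraph marked as hedged, and is not in a strict section. The function name and the boolean flag are hypothetical, and treating strict sections as "always verify" is one reading of the rule above.

```python
import re

HEDGES = ("verosimilmente", "presumibilmente", "si stima", "pare")
STRICT_SECTIONS = {"dunning", "quotation", "supplier-risk"}


def needs_entailment(sentence, section, hedged_paragraph):
    """Return True when the sentence must pass entailment verification."""
    if section in STRICT_SECTIONS or not hedged_paragraph:
        return True
    text = sentence.lower()
    # Word boundaries keep "pare" from matching inside "parere".
    return not any(re.search(rf"\b{re.escape(h)}\b", text) for h in HEDGES)
```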

BLAKE3-signed processing register for every answer. Exportable as proof for court or DPO. Regulatory anchor: Trib. Siracusa 338/2026 Art. 96 c.p.c.

§ 03 ITALIAN → GRAPH → CITATION
NO LANGUAGE TO LEARN
IT  what happened with customer Bianchi this year?
──────────────────────────────────────────────────────
DSL
    MATCH (c:Customer {name: "Bianchi"})
          -[r:GENERATED|RECEIVED|SENT]-(e)
    WHERE r.timestamp >= "2026-01-01"
    RETURN e ORDERED BY r.timestamp
    LIMIT 50
──────────────────────────────────────────────────────
CITED ANSWER
    «In 2026 customer Bianchi received 3 orders
     (12 Feb, 4 May, 8 Aug), all paid. Last
     invoice is from 12 Nov [F-2026-247].»

Lemnia does not require learning a new language. The operator writes in Italian, and Lemnia translates the query into a structured graph query, executes it and returns a cited answer.

The model that performs the translation runs on the company's hardware, and nothing leaves the LAN at execution time.

FROM NATURAL LANGUAGE TO DSL

The IT→DSL parser is a deterministic classifier that maps the query to one of 4 retrieval forms (Timeline, MultiHopChain, Aggregate, Summary). Backbone Qwen3.5-4B Q4 for intent + slot filling, algorithmic validators for structured identifiers (codice fiscale, partita IVA, IBAN), HITL fallback on confidence < 0.7.
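
Two of the algorithmic validators can be sketched from their public checksum rules: the ISO 13616 mod-97 test for IBANs and the Luhn-style check digit of the 11-digit partita IVA. The function names are hypothetical, country-specific IBAN lengths are not checked, and the codice fiscale check is left out for brevity.

```python
def iban_is_valid(iban: str) -> bool:
    """Mod-97 check: move the first four characters to the end, map A-Z to 10-35."""
    s = iban.replace(" ", "").upper()
    if len(s) < 5 or not (s.isascii() and s.isalnum()):
        return False
    rearranged = s[4:] + s[:4]
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


def partita_iva_is_valid(piva: str) -> bool:
    """Luhn-style check over the first ten digits, compared with the eleventh."""
    if len(piva) != 11 or not (piva.isascii() and piva.isdigit()):
        return False
    total = 0
    for i, ch in enumerate(piva[:10]):
        d = int(ch)
        if i % 2 == 1:  # 2nd, 4th, ... digit counted from the left
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10 == int(piva[10])
```

Because both checks are pure arithmetic, a slot value extracted by the model can be accepted or rejected without any further model call.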

No execution of LLM-generated code. The intermediate DSL is just a typed AST that the graph engine executes. The LLM has no access to disk or network.
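
To illustrate what "a typed AST that the graph engine executes" can mean in practice, here is a sketch of the Timeline form as an immutable record and a tiny executor over in-memory edges. The field names and the edge tuple layout are hypothetical; they mirror the sample query shown earlier on this page. The model only fills in the record's fields, and the executor is ordinary, fixed code.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Timeline:
    entity_label: str
    entity_name: str
    relations: Tuple[str, ...]
    since: str  # ISO date, so string comparison orders correctly
    limit: int = 50


def run_timeline(query, edges):
    """edges: iterable of (label, name, relation, timestamp, event_id) tuples."""
    hits = [
        e for e in edges
        if e[0] == query.entity_label
        and e[1] == query.entity_name
        and e[2] in query.relations
        and e[3] >= query.since
    ]
    hits.sort(key=lambda e: e[3])
    return [e[4] for e in hits[:query.limit]]
```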

§ 04 E-R SCHEMA · FOURTEEN ENTITIES
GENERIC ONTOLOGY · ADAPTABLE PER NICHE
Customer
Supplier
Product
Order
Invoice
DDT
Case
Document
Email
Message
Ticket
Employee
Decision
Site

Fourteen base entity-classes span the whole organisation, covering documents, topics, processes and external entities (customers, suppliers, products) with their relations, in one continuously updated graph. The per-customer view is one projection of it. Each vertical (micro-enterprise, multi-channel, professional studio, SMB) inherits these classes and adds niche-specific ones. There are about 28 canonical relations, and every edge is annotated with provenance metadata and cardinality.

EXTENDING THE ONTOLOGY PER NICHE

The base schema lives in crates/lemnia-pack-business as Rust types. Every niche (T1-T4) can add classes via Cargo feature flags: e.g. T2 multi-channel adds SkuVariant, ExternalReview, Return; T3 professional studio adds Case, Filing, CalendarHearing.

The 28 canonical relations include GENERATED_BY, RECEIVES, SENDS, MENTIONS, REPLIES_TO, CONTAINS, INVOICED_FOR, LINKS_TO. Each relation carries confidence metadata (0-1), cardinality (1-1, 1-N, N-N) and a provenance citation pointing to the source that generated it.
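
The metadata carried by each relation can be pictured as a small validated record. The real schema is described above as Rust types; the Python sketch below only illustrates the three annotations, and every field name in it is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Cardinality(Enum):
    ONE_TO_ONE = "1-1"
    ONE_TO_MANY = "1-N"
    MANY_TO_MANY = "N-N"


@dataclass(frozen=True)
class Relation:
    kind: str            # e.g. "INVOICED_FOR"
    source: str
    target: str
    confidence: float    # 0-1
    cardinality: Cardinality
    provenance: str      # pointer to the source that generated the relation

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within 0-1")
        if not self.provenance:
            raise ValueError("every relation needs a provenance citation")
```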

§ 05 THREE DEPLOYMENT TOPOLOGIES
LOCAL-FIRST · ALWAYS
T1

Solo

Lemnia desktop on the owner's laptop, with the Qwen3.5-4B model local, a footprint of about 5 GB and zero outbound traffic at query time. It fits micro-enterprises, shops and artisans.

TECHNICAL TRACK

Stack: Tauri V2 + Rust workspace, llama.cpp for Qwen3.5-4B Q4_K_XL inference (~2.8 GB), sqlite-vec for local embeddings. Single-user, single-tenant. Ingestion via local webhook or polling. CST.333

T2

Studio

A Mac mini or NUC running the headless Lemnia service on the LAN. All collaborators see the same dossier from their own client apps, and queries stay local. It fits professional studios and multi-channel e-commerce.

TECHNICAL TRACK

T2 architecture: lemnia-server (gRPC mTLS binary) + desktop client tauri-business + mobile-business. Initial pairing via QR code, tenant-scoped certificates, RBAC roles inherited from the source system. Multi-seat (default 3-10 users). CST.403

Intra-LAN sync, never cloud. Pro mode (cloud-burst ingest) requires explicit per-batch consent and keeps a signed audit log.

T3

PMI Sovereign

A dedicated GPU appliance at the company premises, with the whole stack running on company hardware and unlimited users. It fits SMBs of €5-50M, 50-249 employees and regulated contexts.

TECHNICAL TRACK

T3 single-tenant on-prem: x86_64 server with NVIDIA RTX 6000 Ada or equivalent, vLLM 0.19+ for the local Qwen3.6-35B-A3B-FP8 model + DFlash drafter (2.5-2.9× speedup), Linux with AMD SEV-SNP for hardware attestation. CST.335

Compliance: AVR (Authorized Vendor Register) log, per-niche pre-signed DPIA, automatic export of GDPR Art. 30 artifacts. NIS2-ready: access log, patch management, separation of duties, BCP/DR exercise.

§ 06 TECHNICAL FOUNDATIONS
THE LITERATURE LEMNIA ABSORBS

Lemnia integrates the best of published research on KG-RAG, deterministic retrieval and cite-or-refuse, and optimises it for the Italian case (Italian-native, local-first, compliant), rather than inventing a new methodology.

BIBLIOGRAPHIC REFERENCES
  • HIPPORAG 2 · ARXIV 2502.14802 · OSU-NLP

    Personalized PageRank over a dual-node knowledge graph (chunks and entities). Lemnia takes the PPR weighting as its inter-hop ranker and leaves out the agentic components.

  • LAZYGRAPHRAG / BENCHMARKQED · MICROSOFT 2026

    Query-time community summarisation. Lemnia adopts it as a community summary cache, in place of eager cloud-side summarisation at ingestion.

  • GRAPHRAG-BENCH · 2026

    A query router (TF-IDF and SVM) that classifies single-fact queries. Lemnia uses it to route those queries to a direct FTS5 path, skipping graph traversal.

  • ATOM · ARXIV 2510.22590 · EACL 2026

    5-tuple atomic-fact extraction. Lemnia adopts the 5-tuple as its ingestion-time fact unit, with every atom bound by cite-or-refuse.

WHAT LEMNIA ADDS
  • Italian-native

    NER, RE and parsing trained on Italian corpora (business, tax, medical), rather than English processed via translation.

  • Mandatory citation

    A 5-step pipeline (decomposition, substring match, NLI entailment, graph consistency, strip-and-replace) with a per-pack hedging whitelist. Hallucinations are never tolerated.

  • Local at query time

    Optional cloud-burst only for heavy ingest and long generation, while query-time retrieval stays on-prem.

  • Compliance built-in

    A BLAKE3-signed register per query, a per-niche pre-signed DPIA, and eligibility for the Annex V 2026 hyper-amortisation.

AI ACT ART. 50 · GDPR · NIS2 · GARANTE PROVV. 474/2025 · EU HOSTED
FOUNDERS PROGRAMME · LIMITED PLACES

Lemnia running on the data of a real company.

A thirty-minute demonstration, calibrated to the company's sector. Lemnia composes the record of a real customer, cites the sources line by line and presents the signed register ready for the DPO.
