Technology & research
The foundations that make our solutions fast, sovereign, and verifiable — the packages, the ML engines, and the storage behind everything.
@msm-core — the engine layer
Thirteen pure, injectable packages. The same agent loop runs across OpenAI, Anthropic, Gemini, and local Ollama — with guards, a cost cap, and verify-and-adapt.
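As a rough illustration of how one loop can stay provider-neutral while enforcing guards, here is a minimal sketch. It assumes the "brain" is any injected callable; `Step`, `Brain`, `run_agent`, and the default limits are hypothetical names and values chosen for this example, not the @msm-core API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    """One brain turn: a tool request or a final answer (hypothetical shape)."""
    cost_usd: float
    final_answer: Optional[str] = None
    tool_call: Optional[str] = None


# A brain maps the transcript so far to the next Step. Provider adapters
# (OpenAI, Anthropic, Gemini, Ollama) would be injected behind this type.
Brain = Callable[[List[str]], Step]


def run_agent(brain: Brain, task: str, max_steps: int = 8,
              cost_cap_usd: float = 0.50) -> dict:
    transcript = [f"task: {task}"]
    spent = 0.0
    steps = 0
    # Guards: stop once the step limit or the cost cap is reached.
    while steps < max_steps and spent < cost_cap_usd:
        step = brain(transcript)
        steps += 1
        spent += step.cost_usd
        if step.final_answer is not None:
            return {"answer": step.final_answer, "spent": spent,
                    "steps": steps, "forced": False}
        # A real loop would execute the tool; here we only record the request.
        transcript.append(f"tool: {step.tool_call}")
    # Force-finalize: return a best-effort result instead of looping forever.
    return {"answer": "stopped: step or cost limit reached", "spent": spent,
            "steps": steps, "forced": True}
```

Because the loop only sees the `Brain` callable, swapping providers in this sketch means passing a different function; the guards and the force-finalize path stay the same.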
Brain-agnostic agent execution loop — guards, cost cap, force-finalize.
Tiered RAG context assembly (zero runtime deps).
Rules → embedding → LLM-judge output gate; fails safe to review.
CAS + idempotent durable job/mission engine, cron, HITL.
Verify-and-adapt planner — side-effect-aware retry or escalate.
Turns run outcomes into reusable lessons.
Five-tier memory: customer / episodic / semantic / procedural / reflection.
PII redaction (Luhn-checked) + prompt-injection + groundedness.
Human-in-the-loop approval gate + NL auto-approve.
Chunk · embed · search over injectable ports.
File → text: cloud-free parsers + born-digital PDF.
Real Word XML, RTL/Arabic-aware, letterhead.
Deterministic DCF: NPV / IRR / payback / sensitivity.
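The DCF metrics named in the last item are plain arithmetic, which is why they can be deterministic. The sketch below uses the textbook definitions (NPV as the sum of cash flows discounted from period 0, IRR as the rate where NPV is zero, payback as the first period where cumulative undiscounted cash flow is non-negative). The function names and the bisection bracket are assumptions for this example, not the package's interface.

```python
from typing import Dict, List, Optional


def npv(rate: float, cash_flows: List[float]) -> float:
    """Net present value; cash_flows[0] occurs at period 0 (undiscounted)."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows: List[float], lo: float = -0.99, hi: float = 10.0) -> Optional[float]:
    """Internal rate of return by bisection; None if NPV has no sign change in [lo, hi]."""
    f_lo, f_hi = npv(lo, cash_flows), npv(hi, cash_flows)
    if f_lo * f_hi > 0:
        return None
    for _ in range(100):  # fixed iteration count keeps the result reproducible
        mid = (lo + hi) / 2.0
        f_mid = npv(mid, cash_flows)
        if f_mid == 0:
            return mid
        if f_lo * f_mid < 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2.0


def payback_period(cash_flows: List[float]) -> Optional[int]:
    """First period at which cumulative (undiscounted) cash flow is >= 0."""
    cumulative = 0.0
    for t, cf in enumerate(cash_flows):
        cumulative += cf
        if cumulative >= 0:
            return t
    return None


def sensitivity(cash_flows: List[float], rates: List[float]) -> Dict[float, float]:
    """NPV recomputed across a range of discount rates."""
    return {rate: npv(rate, cash_flows) for rate in rates}
```

For example, `npv(0.10, [-1000, 500, 500, 500])` comes to about 243.43, and `payback_period` on the same flows returns 2, since the initial outlay is recovered at the end of the second period.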
Compute & ML workers
Heavy work ships as versioned Docker images — “packages, but for services.” Every prediction is a labelled opinion with confidence, never a hallucinated fact.
train / predict / forecast / anomaly (XGBoost, LightGBM)
scheduling & allocation (OR-Tools CP-SAT)
discrete-event what-if (SimPy)
predictive-maintenance data twin
Arabic-capable OCR (PaddleOCR)
cross-encoder reranking (ONNX, CPU)
bulk corpus: chunk → redact → embed
study → PPTX + XLSX
DXF / DWG CAD geometry
Baileys multi-session gateway (Node)
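The redact step in the bulk corpus worker, like the Luhn-checked PII redaction listed in the engine layer, could look roughly like this. It is a sketch under stated assumptions: the candidate regex, the `[REDACTED_CARD]` placeholder, and the function names are illustrative, and only card-like numbers are handled. The Luhn checksum itself is the standard algorithm.

```python
import re

# Candidate: 13 to 19 digits, optionally separated by single spaces or hyphens.
# (Illustrative pattern, not the worker's actual rule set.)
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Standard Luhn checksum: double every second digit from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact_cards(text: str) -> str:
    """Replace digit runs that pass the Luhn check; leave other numbers alone."""
    def replace(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group())
        return "[REDACTED_CARD]" if luhn_valid(digits) else match.group()

    return CARD_CANDIDATE.sub(replace, text)
```

The checksum is what keeps order numbers, phone numbers, and other long digit runs from being redacted by mistake: only sequences that are arithmetically plausible card numbers are replaced.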
monlite — the whole backend in one file
Vectors, full-text, key-value, queue, and cron in a single crash-tested SQLite file. The reason the stack runs anywhere with no infrastructure — and the same code scales up to Postgres.
Zero-dep embedded document DB (API-frozen 2.x).
sqlite-vec / pgvector + hybrid RAG.
FTS5 / tsvector full-text search.
Redis-like cache, locks, sorted sets, pub/sub.
Durable queue (SKIP LOCKED), retries/backoff.
Persisted 5-field scheduler.
SSE live queries with row-level deltas.
Same API on Postgres / JSONB.
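To make the "one file" idea concrete, the sketch below keeps a key-value table and a durable job queue with retry backoff in a single SQLite database, using only Python's built-in `sqlite3` module. The schema, function names, and backoff policy are assumptions for illustration and are not monlite's API. On SQLite the claim is serialized with `BEGIN IMMEDIATE`; on Postgres the usual equivalent is `SELECT ... FOR UPDATE SKIP LOCKED`.

```python
import sqlite3
import time
from typing import Optional, Tuple


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    # isolation_level=None means autocommit; transactions are started explicitly.
    db = sqlite3.connect(path, isolation_level=None)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT NOT NULL);
        CREATE TABLE IF NOT EXISTS jobs (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            payload TEXT NOT NULL,
            status TEXT NOT NULL DEFAULT 'pending',
            attempts INTEGER NOT NULL DEFAULT 0,
            run_after REAL NOT NULL DEFAULT 0
        );
    """)
    return db


def kv_set(db: sqlite3.Connection, key: str, value: str) -> None:
    db.execute("INSERT OR REPLACE INTO kv(key, value) VALUES (?, ?)", (key, value))


def kv_get(db: sqlite3.Connection, key: str) -> Optional[str]:
    row = db.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None


def enqueue(db: sqlite3.Connection, payload: str) -> int:
    return db.execute("INSERT INTO jobs(payload) VALUES (?)", (payload,)).lastrowid


def claim(db: sqlite3.Connection, now: Optional[float] = None) -> Optional[Tuple[int, str]]:
    """Atomically pick the oldest runnable job and mark it running."""
    now = time.time() if now is None else now
    db.execute("BEGIN IMMEDIATE")  # take the write lock so two workers cannot claim one row
    try:
        row = db.execute(
            "SELECT id, payload FROM jobs WHERE status = 'pending' AND run_after <= ? "
            "ORDER BY id LIMIT 1", (now,)).fetchone()
        if row is not None:
            db.execute("UPDATE jobs SET status = 'running', attempts = attempts + 1 "
                       "WHERE id = ?", (row[0],))
        db.execute("COMMIT")
        return row
    except Exception:
        db.execute("ROLLBACK")
        raise


def fail(db: sqlite3.Connection, job_id: int, now: Optional[float] = None,
         base_delay: float = 1.0, max_attempts: int = 3) -> None:
    """Requeue with exponential backoff, or mark dead after max_attempts."""
    now = time.time() if now is None else now
    attempts = db.execute("SELECT attempts FROM jobs WHERE id = ?", (job_id,)).fetchone()[0]
    if attempts >= max_attempts:
        db.execute("UPDATE jobs SET status = 'dead' WHERE id = ?", (job_id,))
    else:
        delay = base_delay * 2 ** (attempts - 1)
        db.execute("UPDATE jobs SET status = 'pending', run_after = ? WHERE id = ?",
                   (now + delay, job_id))
```

Passing a file path instead of `":memory:"` gives the same behavior backed by one file on disk, which is the property the section describes.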
IntentText — documents you can verify
A format where the file itself is the data: readable, queryable, and tamper-evident. Seal, sign, and verify offline — the trust layer a .docx cannot provide.
Parser, renderers, query, seal/sign/verify — zero deps.
Embeddable WYSIWYG React editor (TipTap).
Server-side PDF + PDF/A.
Ed25519 signatures.
PAdES ECDSA P-256 + X.509 (Adobe-recognized).
MCP server: parse / query / render for agents.
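The "seal" half of seal, sign, and verify can be pictured as a content digest stored with the document and rechecked offline. The sketch below is a generic illustration using SHA-256 from the Python standard library; the `seal-sha256:` trailer line and the function names are hypothetical and are not IntentText syntax or its API.

```python
import hashlib
import hmac

SEAL_PREFIX = "seal-sha256: "  # hypothetical trailer line, not IntentText syntax


def seal(body: str) -> str:
    """Append a digest of the body so later edits become detectable."""
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return f"{body}\n{SEAL_PREFIX}{digest}\n"


def verify(document: str) -> bool:
    """Recompute the digest offline and compare it with the stored seal."""
    body, sep, trailer = document.rstrip("\n").rpartition("\n")
    if not sep or not trailer.startswith(SEAL_PREFIX):
        return False
    expected = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return hmac.compare_digest(trailer[len(SEAL_PREFIX):], expected)
```

A bare digest only shows that body and seal still agree; anyone who can edit the file can recompute it. Binding the seal to an identity is the job of the signature step, for which the list above names Ed25519 and PAdES.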
Research track
Beyond the platform, a model-agnostic Arabic pipeline: a semantic tokenizer (CST), a hyperdimensional intent gate (nemo), and an orchestration layer (msm) — sub-millisecond, no GPU, full EN + AR parity.
CST
Contextual Semantic Tokenizer — reversible, multilingual, Arabic root × pattern algebra.
~35–46% smaller than BPE
nemo
Hyperdimensional intent gate — 10,000-dim vectors, nearest-prototype, self-learning.
sub-ms · no GPU · EN+AR
msm
Model-agnostic pipeline — translate → brain → validate, domains declared in YAML.
swap a model in one line
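For readers unfamiliar with hyperdimensional classification, the nearest-prototype idea behind an intent gate can be sketched in plain Python. This is a generic scheme chosen for illustration (random bipolar word vectors, bundling by addition, cosine similarity); the class name, threshold, and word-level encoding are assumptions and do not describe nemo's actual encoding.

```python
import math
import random
from typing import Dict, List, Optional, Tuple

DIM = 10_000  # dimensionality quoted in the text


def token_vector(token: str) -> List[int]:
    """Pseudo-random bipolar (+1/-1) vector, seeded by the token so it is repeatable."""
    rng = random.Random(token)
    return [1 if rng.getrandbits(1) else -1 for _ in range(DIM)]


def encode(text: str) -> List[int]:
    """Bundle token vectors by element-wise addition."""
    total = [0] * DIM
    for token in text.lower().split():
        for i, v in enumerate(token_vector(token)):
            total[i] += v
    return total


def cosine(a: List[int], b: List[int]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class IntentGate:
    """Nearest-prototype classifier with an 'unknown' threshold (illustrative)."""

    def __init__(self, threshold: float = 0.2) -> None:
        self.prototypes: Dict[str, List[int]] = {}
        self.threshold = threshold

    def learn(self, intent: str, example: str) -> None:
        # Learning is just adding the example into the intent's prototype.
        proto = self.prototypes.setdefault(intent, [0] * DIM)
        for i, v in enumerate(encode(example)):
            proto[i] += v

    def classify(self, text: str) -> Tuple[Optional[str], float]:
        query = encode(text)
        best, best_score = None, -1.0
        for intent, proto in self.prototypes.items():
            score = cosine(query, proto)
            if score > best_score:
                best, best_score = intent, score
        return (best, best_score) if best_score >= self.threshold else (None, best_score)
```

Training and inference here are integer additions and one similarity pass per prototype, which is why this family of methods needs no GPU. Unrelated random vectors in 10,000 dimensions are nearly orthogonal, so inputs that share no words with any prototype fall below the threshold and are reported as unknown.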