CLSplusplus
Brain-inspired persistent memory for LLMs. Model-agnostic — switch between Claude, GPT-4, Gemini, Llama without losing context. Sleep consolidation, belief protection, 4-layer memory hierarchy. Apache 2.0.
README
What is CLS++?
Every LLM in production today operates with amnesia. Sessions end, context windows clear, and the model forgets everything—preferences, corrections, facts established over months.
CLS++ is an external memory substrate that solves this at its root. Drawing from neuroscientific Complementary Learning Systems (CLS) theory, it implements:
| Feature | Description | |---------|-------------| | Four-store hierarchy | L0 (Working Buffer) → L1 (Indexing) → L2 (Schema Graph) → L3 (Deep Recess) | | Biological consolidation | Salience, Usage, Authority, Conflict, Surprise signals | | Sleep cycle | Nightly maintenance: rank, decay, deduplicate, consolidate | | Reconsolidation gate | Belief revision only with evidence quorum | | Model-agnostic | Any LLM plugs in via REST API—Claude, GPT-4, Gemini, Llama |
Memory is external to the model. Switch models anytime. No reset.
Quick Start
Install
pip install clsplusplus # Python (lightweight: only httpx + pydantic) npm install clsplusplus # JavaScript / TypeScript (zero dependencies)
Python SDK
from clsplusplus import Brain brain = Brain("alice") # Teach it anything in natural language brain.learn("I work at Google as a senior engineer") brain.learn("I prefer Python over JavaScript") # Ask it anything — semantic recall, not keyword matching brain.ask("What's my job?") # ["I work at Google as a senior engineer"] # Get LLM-ready context for any prompt brain.context("coding help") # "Known facts about this user:\n- I work at Google..." # Forget (GDPR right to be forgotten) brain.forget("I work at Google as a senior engineer")
JavaScript / TypeScript SDK
import { Brain } from "clsplusplus"; const brain = new Brain("alice"); await brain.learn("I work at Google as a senior engineer"); const facts = await brain.ask("What's my job?"); const context = await brain.context("coding help"); await brain.forget("I work at Google as a senior engineer");
Use with OpenAI
from clsplusplus import Brain brain = Brain("alice") # Wrap any LLM function — auto-injects memory, auto-learns @brain.wrap def chat(system_prompt, user_message): return openai.chat(system=system_prompt, user=user_message) response = chat("You are a helpful assistant", "Help me with Python") # Brain auto-recalls relevant memory, injects into prompt, # calls your LLM, learns from the exchange, returns response.
Full API
| Method | Description |
|--------|-------------|
| brain.learn(fact) | Teach a fact. Returns memory ID. |
| brain.ask(question) | Query for relevant facts. Returns list. |
| brain.context(topic) | Get LLM-ready context string. |
| brain.forget(fact) | Forget by text or ID. |
| brain.absorb(text) | Bulk-learn from document or conversation. |
| brain.who() | Auto-generated user profile. |
| brain.correct(old, new) | Update a belief. |
| brain.chat(message, llm) | Full conversation handler with memory. |
| brain.teach(dict) | Learn from structured data. |
| brain.watch(messages) | Learn from chat message history. |
| brain.wrap(fn) | Wrap any LLM function with auto-memory. |
Run the Full Server Locally
git clone https://github.com/rajamohan1950/CLSplusplus.git cd CLSplusplus pip install -e ".[server]" # Start infrastructure (Redis + PostgreSQL) docker compose up -d redis postgres # Start the API server uvicorn clsplusplus.api:create_app --factory --host 0.0.0.0 --port 8080
Try It Live
Try the demo — Tell Claude something, ask OpenAI. Same memory. No sign-up.
The Chrome extension (Web Store, v6.0.1) captures user messages from
ChatGPT, Claude, and Gemini chat pages automatically and feeds them through
the same memory pipeline. Host permissions: chatgpt.com,
chat.openai.com, claude.ai, gemini.google.com. The Link Account popup
differentiates 401 / 403 / network / unknown errors so you know whether the
key is wrong, the account is unlinked, or the server is unreachable.
Architecture
Browser (extension/capture.js) Any LLM client (SDK / REST)
↓ ↓
↓ www.clsplusplus.com (Vercel, Next.js)
↓ │ rewrites /api/v1/*, /api/admin/*
↓ ▼
└──────────────► Render-hosted FastAPI (clsplusplus-api)
│ middleware: auth, rate limit, abuse-guard
▼
┌─────────────────────────────────────────┐
│ CLS++ Core Service │
│ L0: Redis working buffer │ ← Prefrontal cortex
│ L1: PostgreSQL+pgvector episodic │ ← Hippocampus
│ L2: Schema graph (crystallized) │ ← Neocortex
│ L3: Deep archive │ ← Thalamus
│ PhaseMemoryEngine (gas→liquid→ │
│ solid→glass, auto tier-compression) │
│ SleepOrchestrator (replay + REM) │
│ ReconsolidationGate (belief revision) │
│ Weblab (PostHog flags + auto-rollback)│
│ Pricing control plane (memory-stored) │
└─────────────────────────────────────────┘
Every user message captured in the extension and every SDK learn() call
lands in L0, is promoted through the phase engine by the same thermodynamic
rules, and is persisted to L1 in the background. There is no separate
"explicit store" path — capture is continuous, tier compression is
automatic.
SaaS Mode (Memory-as-a-Service)
Enable API key auth and rate limiting for production:
export CLS_API_KEYS=cls_live_xxxxxxxxxxxxxxxxxxxxxxxx export CLS_REQUIRE_API_KEY=true export CLS_RATE_LIMIT_REQUESTS=100 export CLS_RATE_LIMIT_WINDOW_SECONDS=60 # Abuse-guard (env-tunable; defaults shown) export CLS_ABUSE_AUTHFAIL_THRESHOLD=60 # auth failures per IP per window export CLS_ABUSE_AUTHFAIL_WINDOW_SECONDS=600 # 10 minutes export CLS_ABUSE_WHITELIST_IPS= # comma-separated operator IPs
Requests carrying a valid API key are exempt from auth-fail flood counting
(see src/clsplusplus/abuse_guard.py).
Product endpoints: POST /v1/memories/encode, POST /v1/memories/retrieve, DELETE /v1/memories/forget, GET /v1/health/score. See SaaS docs.
Pricing
CLS++ launched in India and bills in INR via Razorpay (UPI / QR) — not
USD, not Stripe. Tiers: Pro ₹299, Business ₹999, Enterprise
₹4,999. The pricing control plane lives inside the CLS++ memory layer
itself (reserved namespace __cls_pricing__) and is operator-tunable at
runtime via POST /admin/pricing/config. The default config — currency,
margin floor/target, infra spend, dynamic-demand toggle — is seeded by
src/clsplusplus/pricing_store.py on first read. GET /v1/pricing is a
public, abuse-exempt endpoint.
Deployment
| Platform | Guide | |----------|-------| | Render (free tier) | Deploy in 1 click • Setup guide | | AWS Free Tier | CloudFormation • Step-by-step | | AWS | CloudFormation | | Azure | ARM template |
Documentation
| Document | Description | |----------|-------------| | API Reference | Endpoints, auth, examples | | API Blueprint | SaaS API playbook (DX, security, billing) | | SaaS Strategy | Memory-as-a-Service, pricing | | Marketplace Integration | AWS, Azure, GCP, OCI | | Productionization | Deployment, security, compliance | | Commercialization | Go-to-market, licensing |
Status
Phase 1 (Foundation) — Complete
- [x] Four stores (L0–L3) + Plasticity Engine
- [x] Write/Read API + Python SDK
- [x] Docker Compose + Render deploy (Dockerfile ships
/app/scripts/for operator tooling) - [x] Sleep cycle orchestrator
- [x] Reconsolidation gate
- [x] API key auth + rate limiting
- [x] SaaS product endpoints
- [x] Chrome extension (v6.0.1) capturing ChatGPT / Claude / Gemini
- [x] Abuse-guard with env-tunable thresholds and operator IP whitelist
- [x] PostHog Weblab — staged rollouts with auto-rollback (5xx > 2% or p95 > 3s)
- [x] INR pricing (Razorpay / UPI) with memory-resident control plane
Recent Architectural Decisions
- Auto-crystallization gated OFF by default. The Landauer
liquid→solid pipeline produced low-quality
[Schema: subject]token-soup entries that leaked into user-visible memory lists. Re-enable per process withCLS_ENABLE_AUTO_CRYSTALLIZATION=1. Melting still runs so existing schemas drain out. Seesrc/clsplusplus/memory_phase.py. - Pricing lives in the memory layer. Operator-tunable config (margin
floor/target, infra cost, dynamic-demand toggle) and the bucketed demand
history are stored as two fixed-id documents in namespace
__cls_pricing__, not in env vars. - Frontend ↔ backend topology. The Next.js frontend on Vercel
(
www.clsplusplus.com) rewrites/api/v1/*and/api/admin/*to the Render-hosted FastAPI. The bare onrender.com host is the internal upstream target — not the public surface. - Integration ownership.
POST /v1/integrationsinjectsowner_emailfrom the JWT-authed user; non-admin override returns 403. A backfill script handles legacy NULL rows.
Operator Runbooks
See docs/RUNBOOKS.md and
docs/LAUNCH_RUNBOOK.md for incident response,
deploy procedures, and the abuse-guard / weblab dashboards. The
container ships operator scripts at /app/scripts/ (see
scripts/admin_doctor.py, scripts/backfill_integration_owner_email.py).
Contributing
We welcome contributions. See CONTRIBUTING.md and the Wiki for details.
License
Provisional patent filed October 2025. Apache 2.0 (see LICENSE).
<p align="center"> <strong>AlphaForge AI Labs</strong> • <a href="https://github.com/rajamohan1950">Rajamohan Jabbala</a> • 2026 </p>