vebrOS Chat — Foundation Document
12 June 2026
The working blueprint for an embeddable AI chatbot SaaS, shown on-camera in episode 1 and shared here as a living reference.
Working codename. Branding TBD. Last updated: 12 June 2026
1. Essence
A tailored AI chatbot, trained on your content and ready in a few clicks — right there when your users need it.
2. Architecture Principle
One engine, many skins.
The core is a single engine with parameters. Every product surface is a different skin/delivery mechanism wrapped around that same engine. We are not building three products — we are building one and wrapping it.
3. Product Surfaces
Surface 1 — Website Chat (customer-facing) — SHIPS FIRST
- Embedded on a website
- Fresh session for every visitor, no cross-session memory
- Strictly scoped to the content the owner uploads
- Public-facing
Surface 2 — WordPress Plugin (later)
- Same engine, wrapped for the WordPress market
- Large existing market, easy distribution
Surface 3 — Browser Extension (enterprise, later)
- For internal company use
- Solo chats or team chats, scoped by team and role
- Persistent chat history (unlike the website embed's fresh sessions)
- Governed from a central web panel — admins control who sees what
- Enterprise-grade, higher value per customer
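As a rough illustration of "one engine, many skins", the surfaces above could be expressed as parameter presets passed to the same engine. This is a minimal sketch, not the actual design: the type names, field names, and the `paramsFor` helper are hypothetical, and the WordPress values are an assumption (the document only says it wraps the same engine).

```typescript
// Hypothetical parameter set for the shared engine. All names are illustrative.
type Surface = "website" | "wordpress" | "extension";

interface EngineParams {
  surface: Surface;
  memory: "per-session" | "persistent"; // fresh session per visitor vs. kept history
  audience: "public" | "internal";
  scope: "owner-content" | "team-and-role";
}

const SURFACE_DEFAULTS: Record<Surface, EngineParams> = {
  // Surface 1: public, fresh session per visitor, scoped to the owner's uploads.
  website: { surface: "website", memory: "per-session", audience: "public", scope: "owner-content" },
  // Surface 2: ASSUMED to mirror the website embed; the document does not specify this.
  wordpress: { surface: "wordpress", memory: "per-session", audience: "public", scope: "owner-content" },
  // Surface 3: internal, persistent, scoped by team and role.
  extension: { surface: "extension", memory: "persistent", audience: "internal", scope: "team-and-role" },
};

// Returns the preset for a surface, with optional per-customer overrides.
function paramsFor(
  surface: Surface,
  overrides: Partial<Omit<EngineParams, "surface">> = {}
): EngineParams {
  return { ...SURFACE_DEFAULTS[surface], ...overrides };
}
```

The point of the sketch is that a new skin adds a preset, not a new engine.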
4. Build Order
- Engine — ingest content, store it, serve scoped answers
- Website embed — first skin, first live demo, first users
- WordPress plugin + browser extension — same engine, wrapped later
5. The Buyer
Buyer differs per skin:
- Website embed → solo devs & small agencies. Building sites for clients, want a quick AI chat to drop in. Comfortable with a snippet/API, want clean docs and control. This is the primary launch buyer — and overlaps directly with the YouTube audience, so the channel feeds the product.
- WordPress plugin → site owners / non-technical. Install-and-go, no terminal, no API keys to wrangle.
- Browser extension → corporations. High security demands (SSO, data residency, audit logs). Long sales cycles, high value per customer.
Ladder from technical → non-technical → enterprise.
6. The Differentiator
Against alternatives (Kundo, Crisp, Intercom AI, or rolling your own with the raw API):
- Extremely easy setup — fastest path from signup to a live, working chat.
- Great UI, front and back — the embedded widget looks great; the admin backend is a pleasure to use.
- Very customizable — handles and levers to adjust everything... but great defaults first. Works beautifully out of the box; advanced levers are tucked away for those who want them. (Easy and customizable pull in opposite directions — we resolve this with strong defaults + optional depth.)
- Decent free tier, great features — generous enough to build trust and drive word of mouth.
North star: so good that people recommend it unprompted. Product-led growth.
Competitor note: Kundo is a heavy enterprise customer-service platform. Our angle is leaner, self-serve, AI-first, embed-anywhere.
7. Tech Stack
Framework
- App + admin panel: Next.js (React). Chosen for speed-to-ship, largest ecosystem, easy future hiring, and it's already familiar from Splitte/Grundo. Scales well into the millions of requests. Biggest risk to future-proofing is stalling, not the framework ceiling — Next.js minimizes that.
- Embed widget: vanilla JS — tiny, framework-agnostic bundle. Must load fast and never conflict with the host site. Explicitly NOT React.
Database
- MongoDB Atlas holds everything as documents: users, accounts, settings, chat logs. No second database needed.
- MongoDB Atlas Vector Search is available natively but NOT used in v1 (no embeddings in v1). Reserved for v2 retrieval work.
LLM & Retrieval
- Chat model: DeepSeek. Chosen primarily for cost — dramatically cheaper per token than frontier models, which matters when every customer chat runs through it and we want a generous free tier.
- v1 retrieval: content-stuffing. No embeddings in v1. Uploaded content is stored and injected directly into DeepSeek's context window per chat, scoped to the owner's content. Works well for single-site-sized knowledge (FAQ, product pages, a few docs). Simpler, faster first build.
- v2 retrieval: our own LLM-based methods. Semantic search / retrieval for customers whose content is too big to fit in context. Native MongoDB Atlas Vector Search is available when we get there. This is deliberately deferred — and makes for strong build-in-public content when we tackle it.
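A minimal sketch of what v1 content-stuffing could look like: the owner's stored content is concatenated into a system message and sent with every request. The function name, the `ContentDoc` shape, the character budget, and the prompt wording are all assumptions for illustration. The role-tagged message list follows the shape commonly used by chat-completion APIs, and the actual call to the model is left out.

```typescript
// Hypothetical shapes; the real storage schema is not defined in this document.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ContentDoc {
  title: string;
  text: string;
}

// Builds the message list for one request by stuffing stored content into the
// system prompt. maxContentChars is an assumed budget, not a measured limit.
function buildMessages(
  docs: ContentDoc[],
  history: ChatMessage[],
  question: string,
  maxContentChars: number = 200000
): ChatMessage[] {
  const parts: string[] = [];
  let used = 0;
  for (const doc of docs) {
    const block = "## " + doc.title + "\n" + doc.text;
    // No retrieval in v1: once the budget is reached, remaining docs are left out.
    if (used + block.length > maxContentChars) break;
    parts.push(block);
    used += block.length;
  }
  const system: ChatMessage = {
    role: "system",
    content:
      "Answer only from the content below. If the answer is not there, say so.\n\n" +
      parts.join("\n\n"),
  };
  // The full content is re-sent on every message, which is what drives per-chat cost.
  return [system, ...history, { role: "user", content: question }];
}
```

Content that overflows the budget is exactly the case v2 retrieval is meant to handle.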
Hosting
- Vercel. Zero-config Next.js deploys, auto-scaling, generous free tier; already familiar. Caveat: serverless execution time limits on lower tiers — fine for streaming chat responses in v1, worth monitoring.
Auth
- Better Auth. Open-source, data lives in our own MongoDB (not a third party), free. Has a MongoDB adapter. Handles email/password, social logins, sessions. Future-proofs the enterprise skin — org/roles/SSO plugins available for the browser-extension corporate buyers later, so nothing needs ripping out.
8. Business Model / Pricing
Metering
- Headline billing unit: chats/month (a chat = one conversation/session, ~20 messages avg). This is what's advertised and what tiers are built around.
- Hidden rail 1: messages-per-chat cap — prevents one conversation running forever. Soft "fair use per conversation" in docs, no scary number on pricing page.
- Hidden rail 2: tokens-per-message cap — prevents a single message blowing up cost/abuse.
- Model selection as a feature gate — free gets DeepSeek; premium models could unlock on paid (future).
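The two hidden rails above could be enforced with a check like the following before each message is sent to the model. This is a sketch under stated assumptions: the `Rails` shape, the function names, and the four-characters-per-token estimate are all hypothetical, and a real build would use the model's own tokenizer or usage figures.

```typescript
// Hypothetical per-tier limits for the two hidden rails.
interface Rails {
  maxMessagesPerChat: number;
  maxTokensPerMessage: number;
}

type RailResult =
  | { ok: true }
  | { ok: false; reason: "chat-too-long" | "message-too-long" };

// ASSUMPTION: rough heuristic of ~4 characters per token, for illustration only.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Checks an incoming message against both rails.
// messagesSoFar is the number of messages already in this chat.
function checkRails(messagesSoFar: number, incoming: string, rails: Rails): RailResult {
  if (messagesSoFar >= rails.maxMessagesPerChat) {
    return { ok: false, reason: "chat-too-long" };
  }
  if (estimateTokens(incoming) > rails.maxTokensPerMessage) {
    return { ok: false, reason: "message-too-long" };
  }
  return { ok: true };
}

// A chat is billed once, when its first message arrives.
function startsNewChat(messagesSoFar: number): boolean {
  return messagesSoFar === 0;
}
```

Keeping the rails separate from the billing unit means the pricing page only ever talks about chats.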
Two-track pricing (customer chooses)
Track A: Simple Tiers. Fixed monthly price with a bucket of included chats. Cheaper within expected volume because it's predictable. Overages are deliberately expensive: that is the price of choosing a fixed tier over usage-based billing.
Track B — Pay-as-you-go ramp: transparent graduated per-chat pricing that steps down with volume. Fully public, self-serve to any scale — NO "contact sales," no gatekeeping. A whale lands on the lowest rate without ever emailing us. Transparency is itself a selling point vs. competitors who hide pricing.
Example ramp (numbers illustrative, measure-first):
Chats per month      Rate
First N              included / base
N+1 – 2,000          $0.10 / chat
2,001 – 10,000       $0.07 / chat
10,001 – 50,000      $0.05 / chat
50,001+              $0.03 / chat
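To make the ramp concrete, here is a small sketch that prices a month of usage against the illustrative bands. It assumes "graduated" means each band is billed at its own rate (not that the whole volume drops to the lowest rate), and it picks a hypothetical N of 500 included chats. Both are assumptions, as are the names. Amounts are kept in integer cents to avoid floating-point drift.

```typescript
// One pricing band: chats up to and including `upTo` are billed at `centsPerChat`.
interface Band {
  upTo: number;
  centsPerChat: number;
}

// HYPOTHETICAL value for N; the document leaves it open.
const INCLUDED_CHATS = 500;

// Illustrative rates from the example ramp, in cents.
const BANDS: Band[] = [
  { upTo: 2000, centsPerChat: 10 },
  { upTo: 10000, centsPerChat: 7 },
  { upTo: 50000, centsPerChat: 5 },
  { upTo: Infinity, centsPerChat: 3 },
];

// Graduated cost in cents: each chat is billed at the rate of the band it falls in.
function rampCostCents(
  chats: number,
  included: number = INCLUDED_CHATS,
  bands: Band[] = BANDS
): number {
  let total = 0;
  let lower = included; // chats up to `lower` are already accounted for
  for (const band of bands) {
    if (chats <= lower) break;
    if (band.upTo <= lower) continue; // band lies entirely inside the included bucket
    const inBand = Math.min(chats, band.upTo) - lower;
    total += inBand * band.centsPerChat;
    lower = band.upTo;
  }
  return total;
}
```

Under these assumptions, 12,000 chats cost 1,500 × $0.10 + 8,000 × $0.07 + 2,000 × $0.05 = $810, so `rampCostCents(12000)` returns 81000.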
Free tier
- 50 chats/month — positioned as a taste/demo, expect fast upgrade.
- Graceful cap when exhausted — soft "demo limit reached" message, NEVER a dead/broken bot (protects the dev who installed it in front of their client).
- DeepSeek only.
- "Powered by vebrOS Chat" badge — free advertising + upgrade nudge. Every free bot is a billboard (product-led growth engine).
Platform floor
- $5/month to keep a bot active (applies to the pay-as-you-go track; Simple Tiers already have a price floor).
- Purpose is filtering + revenue predictability, NOT covering idle cost. Idle technical cost is pennies (storage ~2–3¢, embed serving negligible). The $5 floor filters out dead-weight/spam accounts (skin in the game) and clears Stripe's cut cleanly (~9% at $5).
Overage behavior
- Customer's choice in settings. Defaults to soft cap (bot pauses, no surprise charge). Opt into auto-overage for guaranteed uptime. Removes the #1 objection to usage-based pricing (fear of surprise bills).
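The free-tier cap and the overage setting above boil down to one decision when a new chat starts. The sketch below is illustrative only: the type names, field names, and message texts are hypothetical. The one property it tries to capture from the document is that the bot never goes dead; it either answers or returns a polite limit message.

```typescript
type OverageMode = "soft-cap" | "auto-overage";

// Hypothetical snapshot of an account's usage for the current month.
interface UsageState {
  plan: "free" | "paid";
  chatsUsed: number;
  chatsIncluded: number;
  overageMode: OverageMode; // defaults to "soft-cap" per the settings described above
}

type Decision =
  | { action: "answer"; billOverage: boolean }
  | { action: "limit-message"; text: string };

// Decides what happens when a visitor opens a new chat.
function decideNewChat(u: UsageState): Decision {
  if (u.chatsUsed < u.chatsIncluded) {
    return { action: "answer", billOverage: false };
  }
  // Only paid accounts that opted in keep answering past the allowance.
  if (u.plan === "paid" && u.overageMode === "auto-overage") {
    return { action: "answer", billOverage: true };
  }
  // Graceful cap: the widget still renders and explains itself.
  const text =
    u.plan === "free"
      ? "Demo limit reached. Please check back later."
      : "This assistant is paused for now. Please check back later.";
  return { action: "limit-message", text };
}
```

Free accounts never reach the overage branch, so a free bot cannot generate a charge.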
Cost reality (working assumptions — MEASURE FIRST)
- ~$0.03/chat working DeepSeek cost estimate (content-stuffing re-sends context every message, so chats cost more than bare API calls). This is a guess until measured on real traffic.
- Fixed monthly baseline: ~$35/mo (Vercel, MongoDB Atlas, domains, services) before any customer usage. The per-customer share falls quickly as customer count grows (about $0.35 each at 100 customers).
- Stripe ~2.9% + $0.30/transaction is the real per-charge floor: the $0.30 flat fee is why sub-$5 subscriptions don't work (about 33% of a $1 charge, about 9% at $5). By ~$9 the effective rate is near 6%, and it keeps falling toward 2.9% as the charge grows.
- Content storage capped per tier — makes idle cost bounded and predictable rather than open-ended.
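The working assumptions above are simple enough to check with a few lines of arithmetic. The sketch below uses the Stripe rate and the $35 baseline quoted in this section. The function names are hypothetical, and the token figures in `stuffedInputTokens` are placeholders to plug measurements into, not estimates of real traffic.

```typescript
// Share of a charge that goes to Stripe at 2.9% + $0.30 (rates as quoted above).
function stripeFeeShare(chargeUsd: number): number {
  return (chargeUsd * 0.029 + 0.3) / chargeUsd;
}

// Fixed monthly baseline spread across paying customers.
function baselinePerCustomer(customers: number, baselineUsd: number = 35): number {
  return baselineUsd / customers;
}

// Total input tokens for one chat under content-stuffing.
// ASSUMPTION: every turn re-sends the full content plus all earlier turns,
// and each completed turn adds `tokensPerTurn` (question + answer) to history.
function stuffedInputTokens(
  contextTokens: number,
  turns: number,
  tokensPerTurn: number
): number {
  const contextTotal = turns * contextTokens;
  const historyTotal = (tokensPerTurn * turns * (turns - 1)) / 2;
  return contextTotal + historyTotal;
}
```

For example, `stripeFeeShare(5)` is about 0.089, matching the ~9% quoted for the $5 floor. With placeholder values of 4,000 content tokens, 10 turns, and 200 tokens per turn, `stuffedInputTokens` gives 49,000 input tokens, of which 40,000 is the same content sent ten times. That repetition is why the per-chat cost has to be measured rather than guessed.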
Open / to revisit after first measurements
- Exact included-chat allowances per Simple Tier.
- Exact ramp breakpoints and rates.
- Real per-chat cost (the single biggest variable).
- Minimum billing threshold for pay-as-you-go (avoid processing tiny charges).
9. The Channel Tie-In
This product is built in public. The build is the content. Series premise: "Watch me build and sell a real AI chatbot SaaS from zero."