What you’re building
A chat interface where users type business questions (“What are the top 10 best-selling products?”, “Show me the monthly revenue trend”) and get answers from a real database. The app converts each question to SQL, validates it, runs it on a read-only MySQL connection, and sends the raw results back through the LLM for formatting. The pipeline:- User asks a question in plain English
- The LLM generates a SQL query based on the database schema
- The SQL guard validates the query (SELECT only, enforced LIMIT, no dangerous keywords)
- The query runs on a read-only MySQL endpoint
- Results go back to the LLM
- The LLM formats an answer with markdown tables and summaries
- The answer streams to the user
What changed from the chatbot
If you’ve read the chatbot tutorial (example 01), here’s what’s different. Gone:- Context stuffing (loading all Upsun docs into the prompt)
- Datamarking and message enclosure (security features for document context)
- Output filtering and n-gram leakage detection
- Prompt injection pattern detection
- The
build-context.tsscript andcontext.txtfile
- MySQL database with an ecommerce schema (products, customers, orders, carts, reviews)
- A seed script that populates the database with coherent fixture data on deploy
- A SQL guard that validates generated queries before execution
- Separate read-only and write database connection pools
- A two-step LLM flow (generate SQL, then format results)
- Upsun MySQL service with read-only and admin endpoints
Prerequisites
Node.js 22+, pnpm, an OpenAI API key from platform.openai.com, a local MySQL server for development, the Upsun CLI (docs.upsun.com/administration/cli), and Git.Project setup
Create the project:mysql2 is the MySQL client. It has built-in promise support, so no callback wrangling.
Configure TypeScript (tsconfig.json):
View source on GitHub
rootDir is set to . instead of ./src because the seed script lives in scripts/ and imports from src/. Both directories need to compile.
Add scripts to package.json:
The database layer
Database connection
Createsrc/database.ts:
View source on GitHub
getWritePool() connects through the DATABASE relationship (admin privileges), used only by the seed script. getReadPool() connects through the REPORTS relationship (read-only), used by the app at runtime.
Locally, both pools use the same MySQL config from your .env. On Upsun, they get separate credentials.
parseUpsunRelationship reads the environment variables that Upsun injects when you define a relationship. A relationship named database gives you DATABASE_HOST, DATABASE_PORT, DATABASE_USERNAME, DATABASE_PASSWORD, and DATABASE_PATH.
The schema definition
Createsrc/schema.ts:
View source on GitHub
information_schema at runtime because it’s simpler and the schema doesn’t change.
The SQL guard
The SQL guard sits between the LLM output and the database. LLMs generate text. They can be prompted to generate anything, includingDROP TABLE. The guard rejects anything that isn’t a safe read query.
Create src/sql-guard.ts:
View source on GitHub
- The query must start with
SELECTorWITH(for CTEs). Everything else is rejected. - Certain keywords are forbidden even inside a SELECT.
INSERT,UPDATE,DELETE,DROPare obvious.INTO OUTFILEandLOAD DATAprevent file system access.GRANTandREVOKEprevent privilege escalation. - Semicolons inside the query are blocked. This stops stacking attacks like
SELECT 1; DROP TABLE products. - If the LLM forgets a LIMIT, the guard appends one (capped at 500 rows). No accidental full table scans.
The chat logic
Createsrc/chat.ts:
View source on GitHub
<sql> tags, which makes extraction reliable.
The chat flow happens in two LLM calls:
Fixture data
The seed script generates deterministic ecommerce data using a seeded PRNG (mulberry32, seed 42). Same data every time, which makes testing predictable. Createscripts/seed.ts. The full script is in the repository. It generates:
- 10 categories (Electronics, Clothing, Home & Kitchen, Books, Sports, Beauty, Toys, Food, Office, Garden)
- 100 products, 10 per category, with realistic names, prices (300), and stock levels
- 500 customers, names from common first/last name pools, spread across 20 cities in 10 countries
- 750 orders with 1 to 5 items each, weighted toward “delivered” status (60%), totals calculated from actual product prices
- 100 active carts for random customers, 1 to 4 items each
- 800 reviews, ratings skewed toward 3 to 5 stars, one review per customer per product
The Upsun configuration
This is the interesting part. Create.upsun/config.yaml:
View source on GitHub
MySQL read-only endpoints
Theservices.mysql.configuration section defines two endpoints instead of one:
admin: full read/write access to themainschema. The seed script uses this during deploy.reader: read-only (ro) access to the same schema. The app uses this at runtime.
database relationship gives you DATABASE_HOST, DATABASE_PORT, DATABASE_USERNAME, DATABASE_PASSWORD, and DATABASE_PATH. The reports relationship gives you the same variables with the REPORTS_ prefix.
The reader endpoint gets its own MySQL user with only SELECT privileges. If the application sends a write query through the read pool, the database rejects it. Hard stop, at the infrastructure level, regardless of what the application code does.
This matters because you’re trusting a probabilistic text generator to follow your instructions. It usually does. “Usually” is not a word you want next to DROP TABLE. The SQL guard catches problems at the application level. The read-only endpoint catches everything the guard misses at the database level.
The deploy hook
The deploy hook runsnode dist/scripts/seed.js on every deployment. Since the seed script is idempotent, it only inserts data the first time. After that it detects existing data and exits.
The seed script uses the database (admin) relationship because it needs write access. The application server uses reports (reader) because it should never write.
Local development
Create a MySQL database for local development:.env:
http://localhost:3000 and try asking questions.
Deploying to Upsun
Initialize Git and push:Testing
Try these questions:- “What are the top 10 best-selling products?”
- “Which customers have spent the most money?”
- “Show me the monthly revenue trend”
- “What are the highest rated products by category?”
- “How many orders are still pending?”
- “What’s the average order value by country?”
- “Which products have the most reviews but the lowest ratings?”
How the safety layers stack up
Three layers between the user and a write query:- The system prompt tells the LLM to only generate SELECT queries. Works most of the time, but LLMs can be prompted to ignore instructions.
- The SQL guard validates the query in application code. Checks it starts with SELECT, has no forbidden keywords, no stacked statements, and a LIMIT. Deterministic, so it catches what the LLM gets wrong.
- The read-only MySQL endpoint rejects writes at the database level. If both layers above fail, the connection itself cannot modify data.