Backend Developer
~80% LLM cost ↓
AI Wisdom (AI Support Platform API)
Backend platform for AI customer-support chatbots using LLM (Ollama) and RAG (embeddings + vector search with pgvector).
Highlights
- Led architecture and implementation of RAG pipelines (ingestion, embeddings, retrieval, generation)
- Vector search on PostgreSQL (pgvector) to retrieve relevant context
- Reduced LLM response costs by ~80% via retrieval-first and context control
- Multi-tenant API: every request requires an API key + tenant identifier
RAG flow
Docs/KB → chunking → embeddings
↓
pgvector search (top-k)
↓
context assembly + prompt
↓
LLM (Ollama) response
Access
This API is multi-tenant and requires an API key + tenant identifier on every request (no public “try it”).