Adding an AI chatbot to your website sounds simple: install a widget, connect an API key, write a system prompt. Done. Except that approach produces a chatbot that handles 30–45% of queries and confuses or frustrates the rest of your visitors.
We have built chatbots across all four architecture tiers for clients ranging from single-location service businesses to mid-market SaaS platforms. The honest conclusion from our data: widget platforms resolve 30–45% of queries. Custom RAG chatbots resolve 60–80%. The gap is not the AI model quality. It is the knowledge base quality and retrieval architecture.
This guide walks through all four approaches, explains when each makes sense, and provides the technical depth to actually implement them.
The Four Architecture Options
Option 1: Widget Platforms (Intercom, Drift, Tidio)
Cost: $29–$449/month. Setup time: 30–90 minutes. Resolution rate: 30–45% on typical business queries.
These platforms provide an embeddable widget with AI built in. Intercom Fin, Drift AI, and Tidio's Lyro all work similarly: you upload your help documentation, connect your live chat team, and the AI handles what it can before escalating. They are genuinely good products for what they are.
The problem is cost at scale. Intercom charges $0.99 per AI resolution. At 2,000 resolutions/month, that is $1,980/month in resolution fees alone, plus the base platform fee. For high-volume use cases, this is economically irrational compared to a direct API implementation.
Use widget platforms when: you need something live in a day, you have under 500 conversations/month, or your team has no technical capacity for anything else.
Option 2: No-Code Builders (Voiceflow, Botpress)
Cost: $0–$749/month. Setup time: 1–5 days. Resolution rate: 35–55% depending on flow design.
Voiceflow and Botpress let you design conversation flows visually, add AI-powered intent detection, and embed the result on any website. They are significantly more powerful than simple widget platforms — you can design branching logic, integrate APIs, and build multi-step qualification flows.
The limitation is conversation intelligence. These platforms excel at structured flows ("press 1 for X, press 2 for Y" evolved into AI intent detection), but they struggle with open-ended questions that require genuine reasoning. Botpress improved significantly with their LLM integration, but the rigid flow structure fights against natural conversation.
Use no-code builders when: you need complex conversation flows with branching logic, your team is non-technical but has more time than money, or you need specific integrations (Salesforce, HubSpot) without custom development.
Option 3: API-First (OpenAI Assistants API + Custom Widget)
Cost: $50–$200/month at 1,000 conversations. Setup time: 3–7 days with a developer. Resolution rate: 55–70%.
Build a lightweight backend that calls the OpenAI Assistants API, create a React chat widget, and embed it on your website via a script tag. You get file search (vector store), code interpreter, function calling, and conversation memory out of the box from OpenAI. This is significantly faster than building a full custom RAG pipeline.
Option 4: Full Custom (Next.js + Vercel AI SDK + Your LLM)
Cost: $30–$100/month at 1,000 conversations plus development time. Setup time: 2–4 weeks. Resolution rate: 60–80%.
Full ownership of the entire stack. Your own RAG pipeline, your own vector store, your own conversation storage, your own analytics. Maximum control, maximum resolution rate, maximum development investment.
Use this approach when: you need maximum performance, have compliance requirements that prevent data going to third-party platforms, or are building a chatbot as a core product feature rather than a support tool.
Decision Framework
Answer these four questions to find your option:
- Team: Do you have a developer available? No → Option 1 or 2. Yes → Option 3 or 4.
- Budget: Is your monthly budget under $100? Option 1 (Tidio free tier or Botpress free) or Option 4 if developer available. Over $100/month in API costs acceptable? Any option.
- Conversation complexity: Mostly FAQ and simple questions? Option 1–3. Complex reasoning, multi-step processes, account-specific lookups? Option 3–4.
- Integration needs: Standard CRM/helpdesk integrations? Option 1–2. Custom APIs, internal databases, proprietary systems? Option 3–4.
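The four questions above can be sketched as a small function. All field names here are illustrative, and conflicting answers fall back to the team constraint, since Options 3–4 require a developer regardless of other needs:

```typescript
// Illustrative sketch of the decision framework. Field names are
// invented for this example, not from any library.
type Answers = {
  hasDeveloper: boolean;
  complexQueries: boolean;      // multi-step reasoning, account-specific lookups
  customIntegrations: boolean;  // internal databases, proprietary APIs
  needsBranchingFlows: boolean; // multi-step qualification, standard CRM hooks
};

function recommendOption(a: Answers): 1 | 2 | 3 | 4 {
  if (!a.hasDeveloper) {
    // No developer: widget platform, or a no-code builder when you
    // need branching flows and standard integrations.
    return a.needsBranchingFlows ? 2 : 1;
  }
  // Developer available: full custom only when both complexity and
  // integration needs justify the 2–4 week build; otherwise API-first.
  return a.complexQueries && a.customIntegrations ? 4 : 3;
}
```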
Option 3 Step-by-Step: API-First Implementation
This is the implementation we recommend most often. Here is how to build it.
Step 1: Set Up the OpenAI Assistant
In the OpenAI playground, create a new Assistant. Set the model to gpt-4o-mini (best cost/performance for support use cases). Enable "File Search" (not Code Interpreter unless needed — it costs extra). Write your system prompt with explicit boundaries as described in our ChatGPT customer service guide.
Upload your knowledge base documents to the vector store. OpenAI handles chunking and embedding automatically. For a 100-document knowledge base, this takes 5–10 minutes.
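The same configuration can also be done programmatically with the OpenAI Node.js SDK. Here is a minimal sketch; the assistant name and instructions are placeholders, and the client is passed in so the function can be exercised with a stub:

```typescript
// Sketch: create the support assistant via the SDK instead of the
// playground. `client` is an OpenAI SDK instance (or a stub in tests).
// The name and instructions below are placeholders, not a recommendation.
async function createSupportAssistant(client: any) {
  return client.beta.assistants.create({
    model: "gpt-4o-mini",             // best cost/performance for support
    name: "Support Assistant",
    instructions:
      "Answer only from the provided knowledge base. " +
      "Escalate to a human when you are unsure.",
    tools: [{ type: "file_search" }], // File Search only; no Code Interpreter
  });
}
```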
Step 2: Create the API Route
Build a server-side API route that manages thread creation and message sending. The pattern is: receive user message → create or retrieve thread → add message → create run → poll for completion → return assistant response. Using the OpenAI Node.js SDK:
- POST /api/chat — accepts {threadId, message}, returns {threadId, response}
- Thread IDs are stored client-side (localStorage or cookie) to maintain conversation continuity
- Server-side: validate input, enforce rate limiting, log conversations to your database
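The receive → thread → message → run → poll flow above can be sketched as one handler function. The OpenAI client is injected so the flow is testable; validation, rate limiting, and logging are elided here:

```typescript
// Sketch of the POST /api/chat flow. `client` is an OpenAI SDK
// instance (or a stub in tests). Error handling beyond the run
// status check is omitted for brevity.
async function handleChat(
  client: any,
  assistantId: string,
  threadId: string | null,
  message: string,
): Promise<{ threadId: string; response: string }> {
  // Create a thread on first contact, otherwise reuse the caller's.
  const tid = threadId ?? (await client.beta.threads.create()).id;

  // Add the user message, then run the assistant and wait for completion.
  await client.beta.threads.messages.create(tid, { role: "user", content: message });
  const run = await client.beta.threads.runs.createAndPoll(tid, {
    assistant_id: assistantId,
  });
  if (run.status !== "completed") throw new Error(`run ended as ${run.status}`);

  // The newest message (the assistant's reply) is first in the list.
  const msgs = await client.beta.threads.messages.list(tid);
  const reply = msgs.data[0].content[0].text.value;
  return { threadId: tid, response: reply };
}
```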
Step 3: Build the React Widget
A minimal chat widget needs: a toggle button (typically bottom-right), a message thread display, a text input, and typing indicators. Keep it under 200 lines of React. The critical UX elements are:
- Optimistic UI: show user message immediately, spinner while waiting for response
- Streaming (if you switch to Chat Completions): use Server-Sent Events for real-time text rendering
- Mobile-responsive: the widget must work on phones. Use a fixed-position container with max-height and scroll.
- Welcome message: shown on first open, sets expectations and offers quick-start options
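The optimistic-UI behavior can be isolated in two pure state transitions, which keeps the React component thin and makes the behavior testable outside the browser. Names here are illustrative:

```typescript
// Pure state transitions for the widget's message list. The React
// component just calls these from its event handlers.
type Msg = { role: "user" | "assistant"; text: string };
type ChatState = { messages: Msg[]; pending: boolean };

// User hits send: show their message immediately and start the spinner.
function sendOptimistic(s: ChatState, text: string): ChatState {
  return { messages: [...s.messages, { role: "user", text }], pending: true };
}

// Assistant reply arrives: append it and stop the spinner.
function receiveReply(s: ChatState, text: string): ChatState {
  return { messages: [...s.messages, { role: "assistant", text }], pending: false };
}
```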
Step 4: Embed on Any Website
Build the widget as a Next.js app, deploy to Vercel, and create a script tag embed:
- The widget loads from your domain as an iframe or shadow DOM component
- Pass configuration (company name, colors, initial message) via data attributes on the script tag
- This approach works on any HTML website — WordPress, Squarespace, Shopify, static HTML
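Reading the data attributes into widget config can look like the following sketch. In the browser you would call it with `document.currentScript?.dataset`; the attribute names and defaults here are illustrative:

```typescript
// Sketch: turn the embed script tag's data attributes into widget
// config, with fallbacks for anything the site owner omits.
// Attribute and field names are invented for this example.
type WidgetConfig = { company: string; color: string; welcome: string };

function parseConfig(dataset: Record<string, string | undefined>): WidgetConfig {
  return {
    company: dataset.company ?? "Support",
    color: dataset.color ?? "#0f62fe",
    welcome: dataset.welcome ?? "Hi! How can we help?",
  };
}
```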
Option 4 Step-by-Step: Full Custom with RAG
The full custom approach uses Next.js route handlers, the Vercel AI SDK, Supabase pgvector for the knowledge base, and your choice of LLM.
Backend Architecture
The core is a streaming API route using Vercel AI SDK's streamText():
- Receive user message via POST
- Generate embedding for the query using text-embedding-3-small
- Run vector similarity search against Supabase pgvector
- Retrieve top 4–6 relevant chunks
- Construct prompt: system instructions + retrieved context + conversation history + user message
- Call streamText() with the constructed messages array
- Return the stream to the client
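The prompt-construction step in the pipeline above is a pure function: system instructions plus retrieved chunks plus prior turns plus the new user message, in the messages-array shape that streamText() accepts. A sketch, with illustrative names:

```typescript
// Step 5 of the pipeline: assemble the messages array. Retrieved
// chunks are numbered so the model can ground answers in them.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

function buildMessages(
  system: string,
  chunks: string[],
  history: ChatMessage[],
  userMessage: string,
): ChatMessage[] {
  const context = chunks.map((c, i) => `[${i + 1}] ${c}`).join("\n\n");
  return [
    { role: "system", content: `${system}\n\nContext:\n${context}` },
    ...history,
    { role: "user", content: userMessage },
  ];
}
```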
Supabase pgvector Setup
Enable the vector extension and create a documents table with a content column (text) and an embedding column (vector(1536) for text-embedding-3-small). Add a metadata JSONB column for category, source_url, and last_updated. Create an HNSW index on the embedding column for fast approximate nearest neighbor search at scale:
- Enable extension: CREATE EXTENSION IF NOT EXISTS vector;
- Create index: CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
- Query with: ORDER BY embedding <=> query_embedding LIMIT 5;
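For intuition, the `<=>` operator with `vector_cosine_ops` computes cosine distance, i.e. 1 minus the cosine similarity of the two vectors. The same quantity in TypeScript:

```typescript
// What pgvector's <=> computes under vector_cosine_ops: cosine
// distance. 0 means identical direction, 1 means orthogonal.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(na) * Math.sqrt(nb));
}
```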
Knowledge Base Ingestion Pipeline
Build a server-side ingestion script that runs on a schedule or webhook trigger:
- Fetch updated documents from your content source (help center API, Notion API, local files)
- Chunk documents into 300–400 token segments with 50-token overlap
- Generate embeddings in batches of 100 (OpenAI rate limits apply)
- Upsert to Supabase using the source URL as a unique key to handle updates
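The chunking step can be sketched as follows. This approximates tokens by whitespace-separated words for simplicity; a real pipeline would count tokens with a tokenizer such as tiktoken:

```typescript
// Split a document into ~300–400 "token" chunks with 50-token overlap.
// Tokens are approximated by words here; swap in a real tokenizer
// for production-accurate chunk sizes.
function chunkDocument(text: string, size = 350, overlap = 50): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  // Advance by (size - overlap) so consecutive chunks share `overlap` words.
  for (let start = 0; start < words.length; start += size - overlap) {
    chunks.push(words.slice(start, start + size).join(" "));
    if (start + size >= words.length) break; // last chunk reached the end
  }
  return chunks;
}
```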
UI/UX Best Practices That Actually Affect Resolution Rate
These are not cosmetic suggestions. They directly affect how many conversations resolve successfully.
Widget Placement
Bottom-right is the expected position. Do not be creative here. Customers have been trained by every major website to find chat support there. Deviation hurts discoverability without any benefit.
Welcome Message A/B Testing
The welcome message is the highest-leverage optimization in a chatbot. We have tested 20+ variants across clients and the pattern is consistent: specific beats generic. "Hi! Ask me anything about our return policy, shipping times, or account setup." outperforms "Hi! How can I help you today?" by 23% in message initiation rate. Specific prompts reduce off-topic queries by 40%.
Quick Reply Chips
Offer 3–4 common question buttons below the welcome message. These remove the blank-input anxiety and route users to your highest-quality knowledge base content. They also represent your most common Tier 1 queries — serve them first.
Conversation History
Persist conversation history in localStorage with a 24-hour expiry. Returning users should not have to repeat context. This is table-stakes UX that most implementations miss.
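The 24-hour expiry logic can be kept as two small functions, with the raw string passed in so the same code works against localStorage in the browser and plain strings in tests:

```typescript
// Persist-and-expire helpers for conversation history. In the browser:
// localStorage.setItem("chat", saveHistory(msgs, Date.now())) and
// loadHistory(localStorage.getItem("chat"), Date.now()).
const TTL_MS = 24 * 60 * 60 * 1000; // 24-hour expiry

type Stored = { savedAt: number; messages: unknown[] };

function saveHistory(messages: unknown[], now: number): string {
  return JSON.stringify({ savedAt: now, messages });
}

function loadHistory(raw: string | null, now: number): unknown[] {
  if (!raw) return [];
  const stored: Stored = JSON.parse(raw);
  // Discard anything older than 24 hours so stale context never resurfaces.
  return now - stored.savedAt > TTL_MS ? [] : stored.messages;
}
```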
Mobile Responsiveness
Test on a 375px-wide viewport. The chat widget should occupy 85–95% of screen width on mobile and not overlap the page scroll. Common mistake: a widget sized for desktop that partially covers the mobile viewport on open.
Cost Comparison at 1,000 Conversations/Month
- Intercom Fin: $74 base + $0.99/resolution × ~650 resolutions = $717/month
- Tidio Lyro: $299/month for 200 conversations, $0.50/conversation overage = $699/month
- Botpress Pro: $445/month all-in at that volume
- OpenAI Assistants API (Option 3): ~$100 API costs + $20 hosting = $120/month
- Full Custom (Option 4): ~$50 API costs + $25 Supabase + $0 Vercel = $75/month
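The per-platform math above can be reproduced in integer cents to avoid floating-point drift. These use the prices quoted in this comparison (the Intercom line works out to $717.50, which the figure above rounds down):

```typescript
// Monthly cost models for the comparison above, in integer cents.
function intercomMonthlyCents(resolutions: number): number {
  return 74_00 + 99 * resolutions; // $74 base + $0.99 per resolution
}

function tidioMonthlyCents(conversations: number): number {
  const overage = Math.max(0, conversations - 200);
  return 299_00 + 50 * overage; // $299 covers 200, then $0.50 each
}
```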
The Metrics That Matter
Do not measure chatbot "engagement." That is a vanity metric. Measure:
- Resolution rate: Conversations fully handled without human escalation. Target 60%+ with good RAG.
- Deflection rate: Tickets that would have gone to your support team but were handled by the bot. Calculate value using your cost-per-ticket.
- Post-chat CSAT: Send a one-question survey after chat ends. 4–5 stars on 60%+ of responses is a healthy benchmark.
- Escalation rate: The percentage reaching a human. Below 30% suggests over-escalation (too conservative). Above 60% suggests under-performance.
- False resolution rate: Conversations marked as resolved where the customer returned within 24 hours with the same question. Target below 8%.
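The metrics above reduce to simple ratios over conversation counts. A sketch with illustrative field names:

```typescript
// The chatbot metrics above as calculations over raw tallies.
type Tally = {
  total: number;
  resolvedByBot: number;
  escalatedToHuman: number;
  falseResolutions: number; // customer returned within 24h, same question
};

function metrics(t: Tally) {
  return {
    resolutionRate: t.resolvedByBot / t.total,         // target 60%+
    escalationRate: t.escalatedToHuman / t.total,      // healthy: 30–60%
    falseResolutionRate: t.falseResolutions / t.resolvedByBot, // target < 8%
  };
}

// Deflection value: resolved conversations × your human cost-per-ticket.
function deflectionValue(resolved: number, costPerTicket: number): number {
  return resolved * costPerTicket;
}
```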
Ready to Build?
For businesses just starting out, our recommendation is to begin with the API-first approach (Option 3) and graduate to full custom when volume justifies the investment. The 3–7 day build time and $50–$200/month operational cost make it accessible to businesses of almost any size.
If you want a pre-built implementation deployed and customized for your specific knowledge base and use case, explore our AI chatbot service or read our complete guide to building an AI customer service agent.