The model that powers commerce-grade AI.
ChatCast runs on a proprietary intent classifier trained specifically for ecommerce queries. Smaller, faster, cheaper, and more accurate than any general-purpose LLM on the task that drives shopping conversions.
66M parameters, beating models with 1,000× more.
We ran 1,000 real query-product pairs through our model and seven frontier LLMs. The accuracy gap held on out-of-distribution categories, evidence that the model learned the structure of product relevance rather than memorizing the training set.
| Model | In-dist accuracy | Out-of-dist accuracy | Latency | Cost / query |
|---|---|---|---|---|
| ChatCast (DistilBERT, 66M params) | 81.0% | 80.5% | 37ms | $0.00 |
| Gemini 2.5 Pro | 70.0% | 72.5% | 2,285ms | $0.24 |
| GPT-4o | 69.5% | 67.5% | 803ms | $0.48 |
| Gemini 2.5 Flash Lite | 69.5% | 64.5% | 716ms | $0.02 |
| Claude Haiku 4.5 | 68.0% | 68.5% | 804ms | $0.23 |
| Claude Opus 4.5 | 67.0% | 64.5% | 1,516ms | $3.44 |
| GPT-4o-mini | 66.0% | 64.0% | 807ms | $0.03 |
| Claude Sonnet 4.5 | 65.0% | 62.5% | 1,212ms | $0.68 |
Lower latency and cost are better. Methodology and dataset details in the full benchmark.
Built on four design choices.
Trained on real shopping data
Public ecommerce datasets gave us scale. Synthetic data balanced the long tail. Proprietary query logs gave us accuracy on the queries that actually drive conversions.
66M parameters, fine-tuned
A DistilBERT base, fine-tuned for ecommerce intent classification. Small enough to run on CPU. Specialized enough to outperform models 1,000× larger on the task that matters.
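Under the hood the classifier's contract is simple: a (query, product) pair in, one of three intent labels out. A minimal sketch of that interface — the label names match the table below, but the token-overlap heuristic is a toy stand-in for the fine-tuned DistilBERT head, not our production model:

```python
from dataclasses import dataclass

INTENT_LABELS = ("exact_match", "substitute", "irrelevant")

@dataclass
class IntentResult:
    label: str         # one of INTENT_LABELS
    confidence: float  # score for the predicted label

def classify_intent(query: str, product_title: str) -> IntentResult:
    """Stand-in for the fine-tuned model: in production the (query, product)
    pair is encoded as one sequence and scored by a 3-way classification
    head. The token-overlap heuristic here only illustrates the contract."""
    q = set(query.lower().split())
    p = set(product_title.lower().split())
    overlap = len(q & p) / max(len(q), 1)
    if overlap > 0.6:
        return IntentResult("exact_match", overlap)
    if overlap > 0.2:
        return IntentResult("substitute", overlap)
    return IntentResult("irrelevant", 1.0 - overlap)

print(classify_intent("trail running shoes", "Nike trail running shoes").label)
# exact_match
```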
Composes with frontier LLMs
Our intent model routes the query. Frontier LLMs (Gemini, Claude, GPT) handle response synthesis where their generation strength wins. Best tool for each job.
Owned, not rented
We control the weights, the training data, and the deployment. No upstream pricing changes, rate limits, or model deprecations to scramble around.
93.7% on exact matches.
The classification that drives conversions: when the model says "this is the right product," it's almost always correct.
| Intent | In-dist accuracy | Out-of-dist accuracy |
|---|---|---|
| Exact Match | 93.7% | 91.1% |
| Substitute | 59.2% | 71.8% |
| Irrelevant | 65.2% | 42.9% |
The model picks the right product. You see the lift.
11 points more accurate than GPT-4o
Routes shoppers to the product they actually want. Higher conversion, fewer wrong-product abandonments.
20× faster than frontier LLMs
37ms vs. 700–2,300ms means the response feels instant. No spinner. No drop-off while waiting.
$0 per query
Run inference on every pageview, every search, every interaction without watching the meter. Frontier-LLM pricing kills always-on UX.
0.5pt drop on unseen categories
Most LLMs lose 2–5 points on out-of-distribution queries. Our model holds at 80.5% because it learned structure rather than memorizing examples.
Specialized for routing. Composes with frontier LLMs for generation.
Our intent model handles the high-volume, high-stakes work: classifying the shopper's intent and finding the right product. Frontier LLMs handle conversational generation where their strength wins. The result is faster, cheaper, and more accurate than relying on a generalist for both jobs.
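That division of labor can be sketched as a simple router: the local classifier triages every (query, product) pair, and only the conversational path pays for a frontier-LLM call. Function names and the stubbed dependencies below are illustrative, not our production API:

```python
from typing import Callable

def route(query: str, product: str,
          classify: Callable[[str, str], str],
          synthesize: Callable[[str, str], str]) -> str:
    """Cheap local classifier runs first; the expensive frontier-LLM call
    (synthesize) is only made on the conversational path."""
    intent = classify(query, product)
    if intent == "exact_match":
        # High-confidence match: link straight to the product, no LLM call.
        return f"Showing: {product}"
    if intent == "substitute":
        # Close but not exact: let a frontier LLM explain the trade-off.
        return synthesize(query, product)
    # Irrelevant: skip the product entirely.
    return "No matching product found."

# Usage with stubbed dependencies:
reply = route(
    "trail running shoes",
    "Nike Pegasus Trail",
    classify=lambda q, p: "exact_match",   # stand-in for the 66M model
    synthesize=lambda q, p: f"{p} is close to what you asked for.",
)
print(reply)  # Showing: Nike Pegasus Trail
```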
Owned weights, EU-hosted
We host our own model. Your queries never train a third-party LLM. GDPR-compliant by default.
No upstream surprises
No rate limits, no deprecated model IDs, no overnight pricing changes. The model that runs your store today runs it tomorrow.
Continuously improved
We retrain on aggregated shopping signals across our network. Every store benefits from every search.
See the model on your store.
Connect Shopify in 15 minutes. The intent classifier is live the moment your catalog syncs — no configuration required.