AI Gateway feature
Edge Models
In progress
Run lightweight models at the edge to add “reflexes” before calling an LLM: classification, redaction, enrichment, and routing decisions.
We’re actively building this. Talk to us if you want to help shape the first use cases.
Capabilities
- Edge-side classification and intent routing
- PII detection and redaction as a pre-inference step
- Lightweight enrichment (language detection, summarization, tagging)
- Policy hooks that can influence provider/model selection
Common use cases
- Detect “sensitive” prompts and route to providers with stricter data policies
- Classify requests and pick a cheaper model for low-stakes tasks
- Auto-redact PII before logging/export and before provider calls
- Gate tool usage by detecting intent and risk
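The classify-then-route pattern behind the first three use cases can be sketched as below. This is a minimal illustration, not the product's API: `classify`, `pickModel`, and the model IDs are hypothetical, and the keyword rules stand in for what would really be a small model running at the edge.

```typescript
// Hypothetical pre-inference router. The intent labels, regexes, and
// model IDs are illustrative only; a real deployment would use a small
// edge-side classifier model instead of keyword rules.
type Intent = "sensitive" | "low_stakes" | "default";

// Stand-in for an edge classifier model.
function classify(prompt: string): Intent {
  if (/\b(ssn|password|medical)\b/i.test(prompt)) return "sensitive";
  if (/\b(joke|caption|emoji)\b/i.test(prompt)) return "low_stakes";
  return "default";
}

// Route sensitive prompts to a stricter provider, low-stakes ones to a
// cheaper model, everything else to the default.
function pickModel(prompt: string): string {
  switch (classify(prompt)) {
    case "sensitive":
      return "strict-policy-provider/large";
    case "low_stakes":
      return "any-provider/small";
    default:
      return "default-provider/large";
  }
}
```

Because the decision happens before the provider call, the same hook can also gate tool usage or trigger redaction for the `sensitive` branch.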
Lower cost without losing quality
Use small models to compress, filter, or pre-process so large models see only what matters.
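One concrete reading of "compress, filter, or pre-process": score candidate context chunks against the query and forward only the top few to the large model. In this sketch the scoring function is a toy keyword-overlap stand-in for a small edge model's relevance score; the function names are hypothetical.

```typescript
// Toy relevance score: count query terms appearing in the chunk.
// A real edge deployment would use a small model here instead.
function scoreChunk(query: string, chunk: string): number {
  const terms = new Set(query.toLowerCase().split(/\W+/).filter(Boolean));
  return chunk
    .toLowerCase()
    .split(/\W+/)
    .filter((w) => terms.has(w)).length;
}

// Keep only the `keep` highest-scoring chunks so the large model's
// prompt stays short.
function compressContext(query: string, chunks: string[], keep = 2): string[] {
  return chunks
    .map((c) => ({ c, s: scoreChunk(query, c) }))
    .sort((a, b) => b.s - a.s)
    .slice(0, keep)
    .map((x) => x.c);
}
```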
Faster p95 latency
Make quick decisions at the edge, close to the user and to the provider, before committing to the expensive call.
Safer inputs by default
Apply privacy layers (e.g., PII detection/redaction) before prompts are forwarded.
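A privacy layer of this kind can be as simple as a redaction pass that runs before the prompt leaves the edge. The patterns below are a deliberately minimal illustrative set, not a complete PII detector, and the names are hypothetical:

```typescript
// Regex-based redaction as a pre-forwarding step. Illustrative
// patterns only; real PII detection needs a broader rule set or model.
const PII_PATTERNS: [RegExp, string][] = [
  [/\b\d{3}-\d{2}-\d{4}\b/g, "[SSN]"], // US SSN shape
  [/\b[\w.+-]+@[\w-]+\.[\w.]+\b/g, "[EMAIL]"], // email addresses
  [/\b(?:\d[ -]?){13,16}\b/g, "[CARD]"], // card-number-like digit runs
];

// Apply every pattern in order, replacing matches with placeholder tags.
function redact(prompt: string): string {
  return PII_PATTERNS.reduce((p, [re, tag]) => p.replace(re, tag), prompt);
}
```

Running the same pass before logging/export keeps stored prompts consistent with what providers actually received.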
FAQ
Answers reflect current direction and may evolve as the platform ships.