Private Models

Edgee AI Gateway feature (early access)

Run open-source LLMs as on-demand, serverless instances and route to them through the same Edgee gateway API alongside public providers.

Serverless provisioning and placement are still under development. Tell us which model you want to run and what your deployment constraints are.

Capabilities

  • On-demand, serverless OSS model instances
  • Routing policies that can include private deployments (see the policy sketch after this list)
  • Environment separation (dev/staging/prod) as the control plane matures
  • Model/version pinning and regional placement controls
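To make these capabilities concrete, here is a purely illustrative policy shape in Python. This is not Edgee's actual configuration schema; every field name, model, version, and region below is a hypothetical placeholder showing how pinning, placement, environments, and mixed routing might fit together.

    # Illustrative only: a hypothetical routing policy that pins a private
    # OSS deployment to a region and keeps a public model as a fallback.
    routing_policy = {
        "model_alias": "support-chat",             # stable name apps call
        "targets": [
            {
                "kind": "private",
                "model": "llama-3.1-8b-instruct",  # hypothetical OSS model
                "version": "2024-07-pin",          # model/version pinning
                "region": "eu-west",               # regional placement
            },
            {
                "kind": "public",
                "provider": "openai",
                "model": "gpt-4o-mini",            # public fallback target
            },
        ],
        "environments": ["dev", "staging", "prod"],  # environment separation
    }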

How it works

  1. You choose a model (and constraints like region/performance).
  2. Edgee provisions and exposes it behind a stable gateway model name.
  3. Your app calls Edgee as usual; routing can target the private model when appropriate (see the call sketch after this list).
  4. Observability and cost signals remain centralized at the gateway.
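As a sketch of step 3, this is what a call could look like, assuming the gateway exposes an OpenAI-compatible endpoint. The base URL and the "support-chat" alias are hypothetical placeholders, not documented Edgee values.

    import os
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.edgee.example/v1",  # hypothetical gateway endpoint
        api_key=os.environ["EDGEE_API_KEY"],      # one key for all models
    )

    # "support-chat" is the stable gateway alias from step 2; routing decides
    # whether a private deployment or a public provider serves the request.
    response = client.chat.completions.create(
        model="support-chat",
        messages=[{"role": "user", "content": "Summarize today's tickets."}],
    )
    print(response.choices[0].message.content)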

Common use cases

  • Workloads requiring stricter control of data path and deployment region
  • Latency-sensitive inference close to user populations
  • Hybrid deployments: a private baseline with a public fallback (sketched after this list)
  • Cost-sensitive tasks where OSS models are viable
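A hedged sketch of the hybrid pattern above: prefer the private baseline and fall back to a public model on error. In practice the gateway's routing policy could handle this server-side; the endpoint and both model names here are hypothetical.

    import os
    from openai import OpenAI, APIError

    client = OpenAI(
        base_url="https://api.edgee.example/v1",  # hypothetical endpoint
        api_key=os.environ["EDGEE_API_KEY"],
    )

    def complete(prompt: str) -> str:
        # Try the private baseline first, then the public fallback.
        for model in ("my-private-llama", "gpt-4o-mini"):
            try:
                resp = client.chat.completions.create(
                    model=model,
                    messages=[{"role": "user", "content": prompt}],
                )
                return resp.choices[0].message.content
            except APIError:
                continue  # move on to the next target
        raise RuntimeError("all targets failed")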

Sovereignty and control

Deploy closer to your users or inside specific regions to meet latency and compliance needs.

Unified integration

Your app keeps one API while you mix public models and your private deployments.
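To illustrate the single-API point, a minimal sketch that addresses a public model and a private deployment through the same client; as before, the URL and model names are placeholders.

    import os
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.edgee.example/v1",  # hypothetical endpoint
        api_key=os.environ["EDGEE_API_KEY"],
    )

    # Same client and call shape for both backends; only the name changes.
    for model in ("gpt-4o-mini", "my-private-llama"):
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": "Say hello in five words."}],
        )
        print(f"{model}: {resp.choices[0].message.content}")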

Operational simplicity

Provisioning and scaling are handled as a service, so there is no additional inference platform for your team to maintain.


Ship faster

Start with one key. Scale with policies.

Use Edgee’s unified access to get moving quickly, then add routing, budgets, and privacy controls as your AI usage grows.
