AI Gateway feature
Private Models
Early access
Run open-source LLMs as on-demand, serverless instances and route to them through the same Edgee gateway API you already use for public providers.
We’re building serverless provisioning and placement. Tell us your model and deployment constraints.
Capabilities
- On-demand, serverless OSS model instances
- Routing policies that can include private deployments
- Environment separation (dev/staging/prod) as the control plane matures
- Model/version pinning and regional placement controls (see the sketch after this list)
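To make these controls concrete, here is a minimal sketch of what a routing policy could look like, expressed as a TypeScript object. The shape and every field name (RoutingPolicy, targets, isPrivate, and so on) are illustrative assumptions, not Edgee's actual configuration schema.

```ts
// Hypothetical routing policy shape. Every field name here is an
// illustrative assumption, not Edgee's actual configuration schema.
interface RoutingTarget {
  model: string;       // stable gateway model name
  version?: string;    // model/version pinning
  region?: string;     // regional placement constraint
  isPrivate?: boolean; // true for a private, serverless deployment
}

interface RoutingPolicy {
  environment: "dev" | "staging" | "prod"; // environment separation
  targets: RoutingTarget[];                // evaluated in order
}

const policy: RoutingPolicy = {
  environment: "prod",
  targets: [
    // Private OSS deployment, pinned and placed in a specific region.
    { model: "my-private-llama", version: "3.1-70b", region: "eu-west-1", isPrivate: true },
    // Public provider as a secondary target.
    { model: "gpt-4o-mini" },
  ],
};
```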
How it works
- You choose a model and any constraints, such as region or performance targets.
- Edgee provisions the instance and exposes it behind a stable gateway model name.
- Your app calls Edgee as usual; routing can target the private model when appropriate (see the sketch after this list).
- Observability and cost signals remain centralized at the gateway.
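A minimal sketch of the calling side, assuming an OpenAI-compatible chat completions surface: the base URL, path, header names, and the my-private-llama model name are placeholders, not Edgee's documented API.

```ts
// Minimal sketch of a gateway call, assuming an OpenAI-compatible
// chat completions surface. The base URL, path, and model names are
// placeholders, not Edgee's documented API.
async function complete(prompt: string, model: string): Promise<string> {
  const res = await fetch("https://gateway.edgee.example/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.EDGEE_API_KEY}`,
    },
    body: JSON.stringify({
      model, // stable gateway model name, e.g. a private deployment
      messages: [{ role: "user", content: prompt }],
    }),
  });
  if (!res.ok) throw new Error(`Gateway error: ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}
```

Because the model name is the only routing input the app supplies, swapping a private deployment for a public provider does not require application code changes.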
Common use cases
- Workloads that require stricter control over the data path and deployment region
- Latency-sensitive inference close to user populations
- Hybrid deployments: a private baseline with a public fallback (sketched after this list)
- Cost-sensitive tasks where OSS models are viable
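Fallback would normally be expressed in the gateway's routing policy rather than in application code, but the private-baseline-plus-public-fallback pattern can be sketched client-side by reusing the complete() helper above; the model names remain illustrative.

```ts
// Client-side sketch of the private-baseline + public-fallback pattern,
// reusing the complete() helper above. In practice this would more
// likely live in a gateway routing policy; model names are illustrative.
async function completeWithFallback(prompt: string): Promise<string> {
  try {
    return await complete(prompt, "my-private-llama"); // private baseline
  } catch {
    return await complete(prompt, "gpt-4o-mini");      // public fallback
  }
}
```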
Sovereignty and control
Deploy closer to your users or inside specific regions to meet latency and compliance needs.
Unified integration
Your app keeps one API while you mix public models and your private deployments.
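Continuing the sketch above, mixing providers comes down to a different model string passed to the same helper; both names are illustrative.

```ts
// One API surface: only the model string differs (illustrative names).
const fromPublic = await complete("Summarize this ticket.", "gpt-4o-mini");
const fromPrivate = await complete("Summarize this ticket.", "my-private-llama");
```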
Operational simplicity
Provisioning and scaling are handled as a service rather than becoming another platform for you to maintain.
FAQ
Answers reflect current direction and may evolve as the platform ships.