Overview

OGC Inference Endpoints is the easiest way to deploy machine learning models as scalable API endpoints. It provides you with the flexibility to choose from a variety of pre-trained models, specify your compute requirements, and deploy them in specific locations to minimize latency and optimize performance.

It allows you to select two types of models:

  • Pre-trained models - Choose from Llama 3.1, Llama 3.2, Mistral, Qwen 2.5, and many more.
  • Custom Models - Coming soon.
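Once an endpoint is deployed, you typically interact with it by sending a JSON request over HTTPS. The sketch below shows what building such a request might look like; the endpoint URL, payload schema, and parameter names here are assumptions for illustration only, not the actual OGC Inference Endpoints API — consult the API reference for the real request format.

```python
import json

# Placeholder URL -- a real deployment would expose its own endpoint address.
ENDPOINT_URL = "https://example.invalid/v1/endpoints/my-llama-endpoint/infer"

def build_request(prompt: str, model: str = "llama-3.1-8b") -> dict:
    """Build a JSON-serializable inference request (assumed schema)."""
    return {
        "model": model,           # one of the pre-trained models listed above
        "input": prompt,
        "parameters": {
            "max_tokens": 256,    # hypothetical generation parameters
            "temperature": 0.7,
        },
    }

payload = build_request("Summarize what an inference endpoint does.")
body = json.dumps(payload)  # this is what an HTTP client would POST
print(body)
```

In practice you would POST `body` to the endpoint with your API credentials in a header; keeping request construction in a small helper like `build_request` makes it easy to swap models without touching the transport code.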