Ollama API
Ollama provides a REST API for running and managing large language models locally. The API supports text generation, chat completions, embeddings, model management, and streaming responses, and it is the primary interface to models served by the Ollama inference engine, which listens on http://localhost:11434 by default.
What You Can Do
POST /api/generate
Generate a completion
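A minimal sketch of calling this endpoint with Python's standard library; the model name `llama3.2` is an assumption, so substitute any model you have pulled locally:

```python
import json
import urllib.request

# Build a non-streaming generate request; Ollama listens on port 11434 by default.
payload = {"model": "llama3.2", "prompt": "Why is the sky blue?", "stream": False}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# With a server running, the reply text is:
#   json.load(urllib.request.urlopen(req))["response"]
```

With `"stream": true` (the default), the server instead returns newline-delimited JSON chunks until a final object with `"done": true`.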
POST /api/chat
Generate a chat completion
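Chat requests carry a message history rather than a single prompt. A sketch, again assuming a locally available model named `llama3.2`:

```python
import json
import urllib.request

# Each message has a role ("system", "user", or "assistant") and content.
payload = {
    "model": "llama3.2",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/chat",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# A non-streaming response places the reply under resp["message"]["content"].
```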
POST /api/embed
Generate embeddings
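The embed endpoint accepts either a single string or a list of strings as `input`. A sketch, assuming an embedding model such as `nomic-embed-text` has been pulled:

```python
import json
import urllib.request

# Batch two inputs in one request; each gets its own embedding vector.
payload = {"model": "nomic-embed-text", "input": ["hello", "world"]}
req = urllib.request.Request(
    "http://localhost:11434/api/embed",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# A running server responds with {"embeddings": [[...], [...]], ...}.
```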
GET /api/tags
List local models
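Listing local models is a plain GET with no body, for example:

```python
import urllib.request

# urllib issues a GET when no request body is supplied.
req = urllib.request.Request("http://localhost:11434/api/tags")
# With a server running, json.load(urllib.request.urlopen(req))["models"]
# lists each local model's name, size, and digest.
```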
POST /api/show
Show model information
POST /api/create
Create a model
POST /api/copy
Copy a model
POST /api/pull
Pull a model
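Pulling a model streams download progress as newline-delimited JSON status objects. A sketch; note the request field is `model` in current Ollama releases (older versions used `name`):

```python
import json
import urllib.request

# Request that the server download "llama3.2" from the model library.
payload = {"model": "llama3.2"}
req = urllib.request.Request(
    "http://localhost:11434/api/pull",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# With a server running, each streamed line reports progress:
#   for line in urllib.request.urlopen(req):
#       print(json.loads(line).get("status"))
```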
POST /api/push
Push a model
DELETE /api/delete
Delete a model
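Unusually for a DELETE endpoint, this one takes a JSON body naming the model, so the HTTP method must be set explicitly (urllib defaults to POST when a body is present):

```python
import json
import urllib.request

# Name the model to remove in the request body.
payload = {"model": "llama3.2"}
req = urllib.request.Request(
    "http://localhost:11434/api/delete",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="DELETE",  # override urllib's POST default for bodied requests
)
```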
GET /api/ps
List running models
POST /api/blobs/{digest}
Create a blob
GET /api/version
Get Ollama version
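The version endpoint is a simple GET and doubles as a liveness probe for a local Ollama server:

```python
import urllib.request

# No body, no parameters; just GET the version endpoint.
req = urllib.request.Request("http://localhost:11434/api/version")
# With a server running, json.load(urllib.request.urlopen(req))
# returns an object like {"version": "..."}.
```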
MCP Tools
generatecompletion
Generate a completion
generatechatcompletion
Generate a chat completion
generateembeddings
Generate embeddings
listmodels
List local models (read-only, idempotent)
showmodelinfo
Show model information
createmodel
Create a model
copymodel
Copy a model
pullmodel
Pull a model
pushmodel
Push a model
deletemodel
Delete a model (idempotent)
listrunningmodels
List running models (read-only, idempotent)
createblob
Create a blob
getversion
Get Ollama version (read-only, idempotent)