Gravitee LLM Proxy Bridge

Routes Naftiko-side LLM calls through Gravitee's LLM Proxy (the Enterprise AI Agent Management module fronting OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Mistral, Hugging Face). Naftiko capability LLM calls automatically pick up Gravitee's prompt-token tracking, prompt guard-rails, semantic caching, and PII redaction policies without per-capability LLM-vendor wiring.

Tags: Naftiko · Gravitee · Partnership · LLM-Proxy · AI-Gateway · Token-Tracking · Guard-Rails

What You Can Do

POST /v1/chat/completions · Chat completion
POST /v1/completions · Completion
POST /v1/embeddings · Embedding
GET /v1/models · List models
GET /llm/usage · Get token usage
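
The endpoint paths above come straight from the capability spec; everything else in this sketch (base URL, bearer token, model name, message content) is a placeholder assumption, not part of the bridge:

```python
import json

# Placeholder values: in a real deployment the token comes from the
# gravitee-env bind, and the host/port match the bridge's REST surface
# (port 8080 in the spec below).
BASE = "http://localhost:8080"   # assumed bridge address
TOKEN = "example-token"          # hypothetical bearer token

def chat_completion_request(model: str, messages: list) -> tuple:
    """Build an OpenAI-compatible chat-completion request for the bridge."""
    url = f"{BASE}/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"model": model, "messages": messages})
    return url, headers, body

url, headers, body = chat_completion_request(
    "gpt-4o-mini",  # any model id returned by GET /v1/models
    [{"role": "user", "content": "Summarize the incident report."}],
)
print(url)  # → http://localhost:8080/v1/chat/completions
```

Because the surface is OpenAI-compatible, existing OpenAI client libraries should also work by pointing their base URL at the bridge instead of the vendor.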

MCP Tools

chat-completion
Run an OpenAI-compatible chat completion through Gravitee LLM Proxy (with token-tracking + guard-rails + semantic cache).

completion
Run an OpenAI-compatible text completion through Gravitee LLM Proxy.

embedding · read-only
Compute embeddings through Gravitee LLM Proxy.

list-models · read-only
List the LLM models available through Gravitee LLM Proxy.

get-token-usage · read-only
Get token consumption stats from Gravitee LLM Proxy (per model, per period).
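
As a sketch of what an agent sends to the MCP server (port 3010 in the spec), the frame below uses the standard MCP `tools/call` method; the tool name and the `since`/`model` argument names come from the spec, while the argument values and request id are illustrative:

```python
import json

# Minimal JSON-RPC 2.0 frame for invoking the get-token-usage tool.
frame = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get-token-usage",
        "arguments": {
            "since": "2026-05-01T00:00:00Z",  # ISO 8601 lower bound
            "model": "mistral-large",         # hypothetical model filter
        },
    },
}
print(json.dumps(frame, indent=2))
```

Both arguments are optional, so an agent can omit them to fetch usage across all models and the full retained period.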

Capability Spec

gravitee-llm-proxy-bridge.yaml
naftiko: "1.0.0-alpha2"

info:
  title: Gravitee LLM Proxy Bridge
  description: >-
    Routes Naftiko-side LLM calls through Gravitee's LLM Proxy (the
    Enterprise AI Agent Management module fronting OpenAI, Anthropic,
    Azure OpenAI, AWS Bedrock, Mistral, Hugging Face). Naftiko capability
    LLM calls automatically pick up Gravitee's prompt-token tracking,
    prompt guard-rails, semantic caching, and PII redaction policies
    without per-capability LLM-vendor wiring.
  tags:
    - Naftiko
    - Gravitee
    - Partnership
    - LLM-Proxy
    - AI-Gateway
    - Token-Tracking
    - Guard-Rails
  created: '2026-05-15'
  modified: '2026-05-15'

binds:
  - namespace: gravitee-env
    description: Gravitee LLM Proxy endpoint + token.
    keys:
      GRAVITEE_LLM_BASE: GRAVITEE_LLM_BASE
      GRAVITEE_LLM_TOKEN: GRAVITEE_LLM_TOKEN

capability:
  consumes:
    - namespace: gravitee-llm
      type: http
      baseUri: '{{GRAVITEE_LLM_BASE}}'
      authentication:
        type: bearer
        token: '{{GRAVITEE_LLM_TOKEN}}'
      resources:
        - name: chat-completions
          path: '/v1/chat/completions'
          operations:
            - name: chat-completion
              method: POST
        - name: completions
          path: '/v1/completions'
          operations:
            - name: completion
              method: POST
        - name: embeddings
          path: '/v1/embeddings'
          operations:
            - name: embedding
              method: POST
        - name: list-models
          path: '/v1/models'
          operations:
            - name: list-models
              method: GET
        - name: get-token-usage
          path: '/llm/usage'
          operations:
            - name: get-token-usage
              method: GET
              inputParameters:
                - { name: since, in: query, type: string, required: false }
                - { name: model, in: query, type: string, required: false }

  exposes:
    - type: rest
      address: 0.0.0.0
      port: 8080
      namespace: gravitee-llm-proxy-bridge-rest
      description: OpenAI-compatible REST surface backed by Gravitee LLM Proxy with safety + caching layered in.
      resources:
        - name: chat-completion
          path: '/v1/chat/completions'
          operations:
            - name: chat-completion
              method: POST
              call: gravitee-llm.chat-completion
        - name: completion
          path: '/v1/completions'
          operations:
            - name: completion
              method: POST
              call: gravitee-llm.completion
        - name: embedding
          path: '/v1/embeddings'
          operations:
            - name: embedding
              method: POST
              call: gravitee-llm.embedding
        - name: list-models
          path: '/v1/models'
          operations:
            - name: list-models
              method: GET
              call: gravitee-llm.list-models
        - name: get-token-usage
          path: '/llm/usage'
          operations:
            - name: get-token-usage
              method: GET
              inputParameters:
                - { name: since, in: query, type: string, required: false }
                - { name: model, in: query, type: string, required: false }
              call: gravitee-llm.get-token-usage

    - type: mcp
      address: 0.0.0.0
      port: 3010
      namespace: gravitee-llm-proxy-bridge-mcp
      description: MCP server exposing the Gravitee LLM Proxy as agent-callable LLM tools.
      tools:
        - name: chat-completion
          description: Run an OpenAI-compatible chat completion through Gravitee LLM Proxy (with token-tracking + guard-rails + semantic cache).
          hints: { destructiveHint: false }
          call: gravitee-llm.chat-completion
        - name: completion
          description: Run an OpenAI-compatible text completion through Gravitee LLM Proxy.
          hints: { destructiveHint: false }
          call: gravitee-llm.completion
        - name: embedding
          description: Compute embeddings through Gravitee LLM Proxy.
          hints: { readOnly: true }
          call: gravitee-llm.embedding
        - name: list-models
          description: List the LLM models available through Gravitee LLM Proxy.
          hints: { readOnly: true }
          call: gravitee-llm.list-models
        - name: get-token-usage
          description: Get token consumption stats from Gravitee LLM Proxy (per model, per period).
          hints: { readOnly: true }
          inputParameters:
            - { name: since, type: string, required: false, description: ISO 8601 lower bound. }
            - { name: model, type: string, required: false, description: Filter to a specific model. }
          call: gravitee-llm.get-token-usage
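
The gravitee-env bind above maps two environment variables into the `{{GRAVITEE_LLM_BASE}}` and `{{GRAVITEE_LLM_TOKEN}}` placeholders. A rough sketch of that resolution for the upstream get-token-usage call (the host and token values here are assumptions; the `/llm/usage` path and the `since`/`model` query parameters come from the spec):

```python
import os
from urllib.parse import urlencode

# Stand-ins for the real deployment environment.
os.environ["GRAVITEE_LLM_BASE"] = "https://gw.example.com"  # assumption
os.environ["GRAVITEE_LLM_TOKEN"] = "example-token"          # assumption

base = os.environ["GRAVITEE_LLM_BASE"]    # fills {{GRAVITEE_LLM_BASE}}
token = os.environ["GRAVITEE_LLM_TOKEN"]  # fills {{GRAVITEE_LLM_TOKEN}}

# get-token-usage forwards its optional query parameters upstream unchanged.
params = {"since": "2026-05-01T00:00:00Z", "model": "mistral-large"}
url = f"{base}/llm/usage?{urlencode(params)}"
headers = {"Authorization": f"Bearer {token}"}  # bearer auth, per the consumes block
print(url)
```

Since both the REST surface and the MCP tools dispatch to the same `gravitee-llm.*` operations, every call path inherits the bearer-authenticated upstream connection configured here.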