Spider · Capability

Spider Cloud API — Scraping

Extract content from individual web pages, transform raw HTML or PDF into clean output, and bypass anti-bot protections using residential proxies and fingerprint rotation. This is a self-contained Naftiko capability covering the scraping business surface.

This Naftiko capability is published by Spider and is one of 5 capabilities that the APIs.io network indexes for this provider. It bundles 3 operations, all of which use the POST method.

All 3 operations are marked read-only. The lead operation extracts content from a single web page in the requested output format. The capability can be deployed as a REST endpoint, MCP tool, or Agent Skill via Naftiko.

Tagged areas include Spider Cloud, Scraping, Transform, and Unblocker.


What You Can Do

POST /v1/scrape
Scrape: Extract content from a single web page.

POST /v1/unblocker
Unblocker: Bypass anti-bot protections.

POST /v1/transform
Transform: Convert raw HTML or PDF into clean output.
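
The three operations above all take a JSON body over POST, so one small helper can cover them. The following is a minimal sketch, not an official client. It assumes the Naftiko REST adapter is reachable at `http://localhost:8080` (the port comes from the capability spec; the host is an assumption), and the body fields `url` and `return_format` are hypothetical, since the spec only declares an opaque JSON request body. The helper builds the request without sending it.

```python
import json
import urllib.request

# Assumption: the REST adapter runs locally on the port declared in the spec.
ADAPTER_BASE = "http://localhost:8080"


def build_request(path: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for one operation."""
    return urllib.request.Request(
        ADAPTER_BASE + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical body: these field names are illustrative, not taken from the spec.
scrape_req = build_request(
    "/v1/scrape",
    {"url": "https://example.com", "return_format": "markdown"},
)
```

To actually send the request, pass it to `urllib.request.urlopen(scrape_req)` and decode the JSON response; the same helper works for `/v1/unblocker` and `/v1/transform`.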

MCP Tools

scrape-url
Extract content from a single web page in the requested output format.
Hints: read-only

unblocker-fetch
Bypass anti-bot protections using residential proxies and fingerprint rotation.
Hints: read-only

transform-content
Convert raw HTML or PDF into clean markdown, JSON, or plain text.
Hints: read-only, idempotent
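
MCP clients invoke tools like these with a JSON-RPC 2.0 `tools/call` request. The sketch below only constructs such a payload. Wrapping the input under a `body` argument is an assumption inferred from the `tools.body` mapping in the capability spec, and the `data` and `return_format` fields are hypothetical placeholders, because the spec does not define the body schema.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def tool_call(name: str, body: dict) -> dict:
    """Build a JSON-RPC 2.0 payload for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        # Assumption: the tool expects its input under a "body" argument.
        "params": {"name": name, "arguments": {"body": body}},
    }


# Hypothetical body for the transform tool; field names are illustrative.
payload = tool_call(
    "transform-content",
    {"data": [{"html": "<h1>Hello</h1>"}], "return_format": "markdown"},
)
print(json.dumps(payload, indent=2))
```

Because `transform-content` is the only tool hinted as idempotent, a client could reasonably retry this particular call on a transport error, whereas the other two tools carry no such hint.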

Capability Spec

naftiko: 1.0.0-alpha2
info:
  label: Spider Cloud API — Scraping
  description: 'Spider Cloud API — Scraping. Extract content from individual web pages, transform raw HTML
    or PDF into clean output, and bypass anti-bot protections using residential proxies and fingerprint
    rotation. Self-contained Naftiko capability covering the scraping business surface.'
  tags:
    - Spider Cloud
    - Scraping
    - Transform
    - Unblocker
  created: '2026-05-25'
  modified: '2026-05-25'
binds:
  - namespace: env
    keys:
      SPIDER_CLOUD_API_KEY: SPIDER_CLOUD_API_KEY
capability:
  consumes:
    - type: http
      namespace: spider-cloud-scraping
      baseUri: https://api.spider.cloud
      description: Spider Cloud API — Scraping business capability. Self-contained, no shared references.
      resources:
        - name: scrape
          path: /scrape
          operations:
            - name: scrape
              method: POST
              description: Extract content from a single web page in the requested output format.
              outputRawFormat: json
              outputParameters:
                - name: result
                  type: object
                  value: $.
              inputParameters:
                - name: body
                  in: body
                  type: object
                  description: Request body (JSON).
                  required: true
        - name: unblocker
          path: /unblocker
          operations:
            - name: unblocker
              method: POST
              description: Bypass anti-bot protections using residential proxies, stealth headers, and
                fingerprint rotation.
              outputRawFormat: json
              outputParameters:
                - name: result
                  type: object
                  value: $.
              inputParameters:
                - name: body
                  in: body
                  type: object
                  description: Request body (JSON).
                  required: true
        - name: transform
          path: /transform
          operations:
            - name: transform
              method: POST
              description: Convert raw HTML or PDF into clean markdown, JSON, or plain text.
              outputRawFormat: json
              outputParameters:
                - name: result
                  type: object
                  value: $.
              inputParameters:
                - name: body
                  in: body
                  type: object
                  description: Request body (JSON).
                  required: true
      authentication:
        type: bearer
        token: '{{env.SPIDER_CLOUD_API_KEY}}'
  exposes:
    - type: rest
      namespace: spider-cloud-scraping-rest
      port: 8080
      description: REST adapter for Spider Cloud API — Scraping.
      resources:
        - path: /v1/scrape
          name: scrape
          description: REST surface for scrape.
          operations:
            - method: POST
              name: scrape
              description: Extract content from a single web page.
              call: spider-cloud-scraping.scrape
              with:
                body: rest.body
              outputParameters:
                - type: object
                  mapping: $.
        - path: /v1/unblocker
          name: unblocker
          description: REST surface for unblocker.
          operations:
            - method: POST
              name: unblocker
              description: Bypass anti-bot protections.
              call: spider-cloud-scraping.unblocker
              with:
                body: rest.body
              outputParameters:
                - type: object
                  mapping: $.
        - path: /v1/transform
          name: transform
          description: REST surface for transform.
          operations:
            - method: POST
              name: transform
              description: Convert raw HTML or PDF into clean output.
              call: spider-cloud-scraping.transform
              with:
                body: rest.body
              outputParameters:
                - type: object
                  mapping: $.
    - type: mcp
      namespace: spider-cloud-scraping-mcp
      port: 9090
      transport: http
      description: MCP adapter for Spider Cloud API — Scraping.
      tools:
        - name: scrape-url
          description: Extract content from a single web page in the requested output format.
          hints:
            readOnly: true
            destructive: false
            idempotent: false
          call: spider-cloud-scraping.scrape
          with:
            body: tools.body
          outputParameters:
            - type: object
              mapping: $.
        - name: unblocker-fetch
          description: Bypass anti-bot protections using residential proxies and fingerprint rotation.
          hints:
            readOnly: true
            destructive: false
            idempotent: false
          call: spider-cloud-scraping.unblocker
          with:
            body: tools.body
          outputParameters:
            - type: object
              mapping: $.
        - name: transform-content
          description: Convert raw HTML or PDF into clean markdown, JSON, or plain text.
          hints:
            readOnly: true
            destructive: false
            idempotent: true
          call: spider-cloud-scraping.transform
          with:
            body: tools.body
          outputParameters:
            - type: object
              mapping: $.
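
The spec above routes each exposed `/v1/...` path to a consumed resource on `https://api.spider.cloud` and authenticates with a bearer token read from `SPIDER_CLOUD_API_KEY`. The sketch below mirrors that mapping in plain Python to make the routing concrete. It is an illustration of what the spec declares, not Naftiko's actual implementation; the function name `upstream_request`, the placeholder key, and the `url` body field are hypothetical.

```python
import json
import os
import urllib.request

BASE_URI = "https://api.spider.cloud"  # baseUri from the consumes block

# Exposed REST path -> consumed resource path, as listed in the spec.
ROUTES = {
    "/v1/scrape": "/scrape",
    "/v1/unblocker": "/unblocker",
    "/v1/transform": "/transform",
}


def upstream_request(rest_path, body, api_key=None):
    """Build (but do not send) the upstream POST with bearer authentication."""
    token = api_key or os.environ.get("SPIDER_CLOUD_API_KEY", "")
    if not token:
        raise ValueError("SPIDER_CLOUD_API_KEY is not set")
    return urllib.request.Request(
        BASE_URI + ROUTES[rest_path],
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder key and hypothetical body field, for illustration only.
req = upstream_request(
    "/v1/unblocker", {"url": "https://example.com"}, api_key="placeholder-key"
)
```

Keeping the token lookup inside the function means the key is read at call time, which matches the spec's `env` binding and avoids embedding credentials in code.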