Rate Limits

Current limit

30 requests per minute per partner API key.

This is a hard ceiling enforced at the Zuplo edge before the origin receives the request.

How it's measured

Requests are counted in a sliding 60-second window per partner API key; this is the same window the X-RateLimit-* headers (below) report on.
What happens on breach

HTTP 429 Too Many Requests with a Retry-After response header; the value is the number of seconds to wait before the next request is safe.

HTTP/1.1 429 Too Many Requests
Retry-After: 17
Content-Type: application/json

<Zuplo-emitted rate-limit body — see partnerdocs.collectivex.health/api for schema>

The exact body is Zuplo-emitted, so its shape is documented in the auto-generated OpenAPI reference at partnerdocs.collectivex.health/api rather than here. The Retry-After header is what your code should key off of — not the body.

Minimum viable handling

Honor Retry-After literally:

if r.status_code == 429:
    time.sleep(int(r.headers["Retry-After"]))
    # Then retry with same request_id

That's correct but pessimistic at low concurrency.

Exponential backoff with jitter (preferred)

For request bursts or batch jobs, wrap retries in exponential backoff with full jitter to avoid thundering-herd:

import time
import random

import httpx  # third-party HTTP client; any client with a status_code/headers response works the same way

def backoff_with_jitter(attempt: int, retry_after: int | None) -> float:
    """Base case: honor Retry-After. Fallback: exponential with jitter."""
    if retry_after is not None:
        return retry_after
    base = min(2 ** attempt, 32)  # 1, 2, 4, 8, 16, 32 (cap)
    return random.uniform(0, base)  # full jitter

def call_with_retry(request_body, max_retries=5):
    for attempt in range(max_retries + 1):
        r = httpx.post(
            f"{CXH_BASE_URL}/v1/oura/recommendation",
            headers={"Authorization": f"ApiKey {CXH_API_KEY}"},
            json=request_body,
        )
        if r.status_code != 429:
            return r
        retry_after = int(r.headers.get("Retry-After", 0)) or None
        sleep_s = backoff_with_jitter(attempt, retry_after)
        time.sleep(sleep_s)
    raise RuntimeError(f"Exceeded {max_retries} 429 retries")

Node.js / JavaScript

async function callWithRetry(body, { maxRetries = 5 } = {}) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const r = await fetch(`${CXH_BASE_URL}/v1/oura/recommendation`, {
      method: "POST",
      headers: {
        "Authorization": `ApiKey ${CXH_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    });
    if (r.status !== 429) return r;
    const retryAfter = parseInt(r.headers.get("Retry-After") || "0", 10);
    const base = Math.min(2 ** attempt, 32);
    const sleepMs = (retryAfter || Math.random() * base) * 1000;
    await new Promise((resolve) => setTimeout(resolve, sleepMs));
  }
  throw new Error(`Exceeded ${maxRetries} 429 retries`);
}

Planning your throughput

At 30 req/min sustained: one request every 2 seconds, or 1,800 requests/hour and 43,200 requests/day.

For batch workloads (e.g. nightly digest, re-processing), rate-limit your own dispatcher at 25 req/min (leaving 5 req/min of headroom for user-initiated traffic). A simple fixed-interval pacer works:

# Pace sends at 25 req/min = one request every 2.4 seconds
MIN_INTERVAL_S = 60.0 / 25  # 2.4
last_send = 0.0
for item in batch:
    elapsed = time.monotonic() - last_send
    if elapsed < MIN_INTERVAL_S:
        time.sleep(MIN_INTERVAL_S - elapsed)
    send(item)
    last_send = time.monotonic()
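The loop above enforces fixed spacing between sends. A classic token bucket additionally permits short bursts while holding the same long-run average; a sketch (the burst capacity of 5 is an illustrative choice, not a documented limit):

```python
import time

class TokenBucket:
    """Token bucket: refills at `rate` tokens/sec up to `capacity`.

    Allows bursts of up to `capacity` requests, then enforces the
    average rate.
    """
    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity        # start full: first burst goes straight through
        self.last = time.monotonic()

    def acquire(self) -> None:
        """Block until one token is available, then consume it."""
        while True:
            now = time.monotonic()
            # Refill proportionally to elapsed time, capped at capacity
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return
            # Sleep just long enough for the next token to accrue
            time.sleep((1 - self.tokens) / self.rate)

# 25 req/min average, bursts of up to 5
bucket = TokenBucket(rate=25 / 60, capacity=5)
```

Call `bucket.acquire()` before each `send(item)`: up to 5 requests can go out back-to-back, after which dispatch settles to one every 2.4 seconds.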

Getting a higher limit

30 req/min is the default for sandbox integration. Prod cutover may have a different ceiling — decided per partner contract.

To request a higher limit:

  1. Open a ticket via support with subject Rate limit increase — <tenant-id>.
  2. Include: current peak req/min (from your own metrics), projected peak, and a justification (user growth, feature launch, batch workload).
  3. Response time: typically 2–3 business days. Rate-limit changes require CollectiveX-side review of origin capacity.
  4. Sandbox limits can be raised temporarily for load testing — request a time window (e.g. "weekdays 10am–12pm UTC for 2 weeks").

Anti-patterns

  - Retrying a 429 immediately, or on a fixed short delay, instead of honoring Retry-After.
  - Retrying from many concurrent workers without jitter, so they all come back at the same instant (thundering herd).
  - Parsing the 429 response body instead of the Retry-After header; the body is Zuplo-emitted and not a stable contract.
  - Dispatching batch jobs at the full 30 req/min, leaving no headroom for user-initiated traffic.
Monitoring your own headroom

Every 200/4xx response carries:

X-RateLimit-Remaining: 27
X-RateLimit-Limit: 30
X-RateLimit-Reset: 42        # seconds until the sliding window resets

Use X-RateLimit-Remaining to drive client-side throttling — back off proactively when remaining drops below your safety margin (e.g. 5 or 6) instead of waiting for the 429.
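One way to sketch that proactive throttle (the margin of 5 and the even-pacing strategy are illustrative choices, using the header names shown above):

```python
SAFETY_MARGIN = 5  # back off proactively below this many remaining requests

def proactive_delay(headers: dict) -> float:
    """Seconds to pause before the next request, from rate-limit headers.

    Returns 0 while headroom is comfortable; once remaining drops to the
    safety margin, spreads the rest of the budget evenly across what's
    left of the window instead of waiting for a 429.
    """
    remaining = int(headers.get("X-RateLimit-Remaining", SAFETY_MARGIN + 1))
    reset_s = int(headers.get("X-RateLimit-Reset", 0))
    if remaining > SAFETY_MARGIN:
        return 0.0
    if remaining <= 0:
        return float(reset_s)      # budget exhausted: wait out the window
    return reset_s / remaining     # pace the last few requests evenly

# after each response:
# time.sleep(proactive_delay(r.headers))
```

Checked after every response, this keeps you comfortably inside the window instead of bouncing off the ceiling.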

A 429 indicates you exceeded the window by at least 1 request. Treat hitting any 429 as a sign your dispatch logic is too aggressive and tighten the throttle.