Rate Limiting

Horizon enforces rate limits on a per-API-key basis to ensure fair usage and platform stability. Every API key has a configurable rate limit that defaults to 100 requests per minute.

Each API key tracks the number of requests made within a sliding one-minute window. When the limit is reached, subsequent requests receive a 429 Too Many Requests response until the window resets.
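A client can keep its own count of recent requests to avoid hitting the limit in the first place. The following is a minimal client-side sketch of a sliding one-minute window; it is not Horizon's server implementation, and `SlidingWindowLimiter` and its methods are hypothetical names introduced for illustration.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side request counter over a sliding time window (illustrative only)."""

    def __init__(self, limit=100, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self._timestamps = deque()  # send times of requests still inside the window

    def _evict_expired(self, now):
        # Drop requests that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()

    def try_acquire(self):
        """Record a request and return True if it fits in the window, else False."""
        now = self.clock()
        self._evict_expired(now)
        if len(self._timestamps) >= self.limit:
            return False
        self._timestamps.append(now)
        return True

    def seconds_until_available(self):
        """Return how long to wait before the next request would fit."""
        now = self.clock()
        self._evict_expired(now)
        if len(self._timestamps) < self.limit:
            return 0.0
        return self.window_seconds - (now - self._timestamps[0])
```

Calling `try_acquire()` before each request, and sleeping for `seconds_until_available()` when it returns `False`, keeps a single client under its configured limit. The server's count remains authoritative, so a 429 can still occur if several processes share one key.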

Rate limits are set at key creation time and can be customized per key. For example, a key used by an internal backend service might have a higher limit than one issued to a third-party integration.

{
  "client_name": "high-volume-service",
  "scopes": ["quickbooks", "conversations"],
  "rate_limit": 500
}

Every API response includes headers that report the current rate limit status for your key:

Header                   Description
X-RateLimit-Limit        The maximum number of requests allowed per minute for this key.
X-RateLimit-Remaining    The number of requests remaining in the current window.
X-RateLimit-Reset        Unix timestamp (in seconds) when the current window resets.

Example response headers:

HTTP/1.1 200 OK
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 73
X-RateLimit-Reset: 1742310360
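These headers can be read on every response to decide whether to slow down before the limit is reached. A minimal sketch, assuming the headers arrive as a plain name-to-value mapping; `RateLimitStatus` and `parse_rate_limit_headers` are hypothetical helper names, not part of any Horizon SDK.

```python
import time
from dataclasses import dataclass


@dataclass
class RateLimitStatus:
    limit: int      # X-RateLimit-Limit
    remaining: int  # X-RateLimit-Remaining
    reset_at: int   # X-RateLimit-Reset, Unix timestamp in seconds

    def seconds_until_reset(self, now=None):
        """Seconds until the window resets, never negative."""
        now = time.time() if now is None else now
        return max(0.0, self.reset_at - now)


def parse_rate_limit_headers(headers):
    """Build a RateLimitStatus from a mapping of response headers.

    Returns None if any of the three headers is missing or not an integer.
    """
    # Normalize names, since HTTP header names are case-insensitive.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return RateLimitStatus(
            limit=int(lowered["x-ratelimit-limit"]),
            remaining=int(lowered["x-ratelimit-remaining"]),
            reset_at=int(lowered["x-ratelimit-reset"]),
        )
    except (KeyError, ValueError):
        return None
```

With the example headers above, `parse_rate_limit_headers(...)` yields a status with `limit=100`, `remaining=73`, and `reset_at=1742310360`. A client might pause for `seconds_until_reset()` once `remaining` reaches zero.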

When you exceed your rate limit, the API returns a 429 status code with a JSON error body:

{
  "error": "rate_limit_exceeded",
  "message": "Rate limit exceeded. Maximum 100 requests per minute.",
  "retry_after": 23
}

The retry_after field gives the number of seconds to wait before retrying. The response also includes the Retry-After HTTP header, as in this example:

HTTP/1.1 429 Too Many Requests
Retry-After: 23
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1742310360
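Because the wait time appears both in the Retry-After header and in the retry_after body field, a client can read whichever is available. A minimal sketch, assuming the header holds a number of seconds as in the example above; `retry_delay_seconds` and its 5-second default are hypothetical choices for illustration.

```python
import json


def retry_delay_seconds(status_code, headers, body_text, default=5.0):
    """Return seconds to wait before retrying, or 0.0 if the response is not a 429."""
    if status_code != 429:
        return 0.0
    # Prefer the Retry-After header (header names are case-insensitive).
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(0.0, float(value))
            except ValueError:
                break  # unparseable header: fall through to the body
    # Fall back to the retry_after field in the JSON error body.
    try:
        payload = json.loads(body_text)
        return max(0.0, float(payload["retry_after"]))
    except (ValueError, KeyError, TypeError):
        # Body is missing, not JSON, or lacks a numeric retry_after.
        return default
```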

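The recommended approach is to implement exponential backoff with jitter when you receive a 429 response. The examples below show the simplest form of this: they wait for the number of seconds given in the Retry-After header, plus a small random jitter, before each retry.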

# Check the Retry-After header and wait before retrying
# (-i prints the response headers; -w appends the status code)
curl -i -w "\n%{http_code}" -X GET https://api.horizonplatform.ai/api/conversations \
  -H "x-api-key: hz_live_abc123def456"
# If 429, wait for Retry-After seconds and retry

// JavaScript: retry on 429, waiting for Retry-After plus random jitter
async function horizonRequest(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);
    if (response.status !== 429) {
      return response;
    }
    // Retry-After is in seconds; fall back to 5 seconds if it is missing
    const retryAfter = parseInt(response.headers.get('Retry-After') || '5', 10);
    const jitter = Math.random() * 1000; // up to 1 second, in milliseconds
    const delay = retryAfter * 1000 + jitter;
    console.warn(`Rate limited. Retrying in ${Math.round(delay / 1000)}s...`);
    await new Promise(resolve => setTimeout(resolve, delay));
  }
  throw new Error('Max retries exceeded due to rate limiting');
}

const response = await horizonRequest(
  'https://api.horizonplatform.ai/api/conversations',
  { headers: { 'x-api-key': 'hz_live_abc123def456' } }
);

# Python: retry on 429, waiting for Retry-After plus random jitter
import time
import random
import requests

def horizon_request(url, headers, max_retries=3):
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)
        if response.status_code != 429:
            return response
        # Retry-After is in seconds; fall back to 5 seconds if it is missing
        retry_after = int(response.headers.get('Retry-After', 5))
        jitter = random.uniform(0, 1)  # up to 1 second
        delay = retry_after + jitter
        print(f"Rate limited. Retrying in {delay:.1f}s...")
        time.sleep(delay)
    raise Exception("Max retries exceeded due to rate limiting")

response = horizon_request(
    'https://api.horizonplatform.ai/api/conversations',
    headers={'x-api-key': 'hz_live_abc123def456'}
)
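
The examples above wait for Retry-After plus a fixed amount of jitter on every attempt. To make the backoff exponential, the jitter range can double with each attempt. The following is one possible sketch, not a prescribed Horizon algorithm; `backoff_delay`, the 1-second base, and the 60-second cap are assumptions chosen for illustration.

```python
import random


def backoff_delay(attempt, retry_after=0.0, base=1.0, cap=60.0, rng=random.random):
    """Seconds to sleep before retry number `attempt` (0-based).

    The server-provided retry_after is always honored in full. On top of it,
    a random extra wait is drawn from a range that doubles with each attempt
    (base, 2*base, 4*base, ...) up to `cap`, so that many clients limited at
    the same moment do not all retry at the same instant.
    """
    exponential = min(cap, base * (2 ** attempt))
    jitter = exponential * rng()  # uniform in [0, exponential)
    return retry_after + jitter
```

For example, with `retry_after=23` the third retry (`attempt=2`) sleeps somewhere between 23 and 27 seconds. The result can replace the `delay` computation in the Python example above.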

If you need to make many requests, consider batching your work to stay within limits. For example, instead of fetching conversations one at a time, use the list endpoint with a higher limit parameter to retrieve multiple records per request.
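
The saving from batching is simple arithmetic. A small sketch, where `requests_needed` is a hypothetical helper and the page size of 50 is an assumed value (the maximum the limit parameter accepts is not stated here):

```python
import math


def requests_needed(total_records, page_size):
    """Number of list requests needed to fetch total_records at page_size per request."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    return math.ceil(total_records / page_size)


one_at_a_time = requests_needed(1000, 1)  # 1000 requests
batched = requests_needed(1000, 50)       # 20 requests
```

Fetching 1,000 conversations one at a time takes 1,000 requests, ten times the default limit of 100 per minute, while fetching 50 per request takes 20 and fits comfortably within a single window.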

Webhook trigger endpoints (POST /api/webhooks/agent/:webhookToken) have their own rate limiting independent of API keys. Rate limits for webhooks are configured on the webhook itself and are enforced based on the webhook token rather than an API key.

If your integration requires a higher rate limit, you can:

  1. Create a new API key with a higher rate_limit value via POST /admin/api-keys.
  2. Update an existing key’s limit by revoking and recreating it with the desired rate limit.
  3. Contact support for platform-level limit increases if the per-key maximum does not meet your needs.