Async Jobs

For large batches or very slow targets, submit a job and poll for the result instead of holding a synchronous connection open. Async jobs survive timeouts and let you fire many requests without managing thousands of concurrent sockets.

info

Async jobs run on the Redis-backed queue. If your deployment runs the gateway without a queue, the async endpoint returns HTTP 503; use the synchronous /v1/scrape endpoint instead. The hosted API at api.omniscrape.io always supports async.

1. Submit a job

POST https://api.omniscrape.io/v1/scrape/async

The body is identical to /v1/scrape. The response returns immediately (HTTP 202) with a job id and the URL to poll:

{
  "success": true,
  "job_id": "3f9a2c8d-1e7b-4a06-9c2d-8f2a91c3aa10",
  "status": "queued",
  "poll": "/v1/jobs/3f9a2c8d-1e7b-4a06-9c2d-8f2a91c3aa10"
}
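The submit step, together with the 503 fallback described in the note above, can be sketched as follows. This is a minimal illustration, not the official client: `submit_scrape` and the injected `post` callable (which takes a URL and a JSON body and returns the HTTP status plus the parsed body) are hypothetical names, and the sketch assumes `poll` is a path relative to the API host, as in the sample response.

```python
from typing import Any, Callable, Dict, Tuple

BASE = "https://api.omniscrape.io"

# Hypothetical transport signature: post(url, json_body) -> (http_status, parsed_json_body)
Post = Callable[[str, Dict[str, Any]], Tuple[int, Dict[str, Any]]]


def submit_scrape(payload: Dict[str, Any], post: Post, base: str = BASE) -> Dict[str, Any]:
    """Try the async endpoint first; fall back to /v1/scrape on HTTP 503."""
    status, body = post(f"{base}/v1/scrape/async", payload)
    if status == 202:
        # Assumption: "poll" is a path relative to the API host, as in the sample response.
        return {"mode": "async", "job_id": body["job_id"], "poll_url": base + body["poll"]}
    if status == 503:
        # No queue behind this gateway: the same body is sent to the synchronous endpoint.
        _, body = post(f"{base}/v1/scrape", payload)
        return {"mode": "sync", "result": body}
    raise RuntimeError(f"unexpected HTTP {status} from async submit")
```

In practice `post` would wrap an HTTP client call that sends the X-API-Key header. Injecting it keeps the fallback logic testable without network access.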

2. Poll for status

GET https://api.omniscrape.io/v1/jobs/{job_id}

curl https://api.omniscrape.io/v1/jobs/3f9a2c8d-1e7b-4a06-9c2d-8f2a91c3aa10 \
  -H "X-API-Key: $OMNISCRAPE_KEY"

The job details are returned under data. While processing:

{
  "success": true,
  "data": {
    "job_id": "3f9a2c8d-1e7b-4a06-9c2d-8f2a91c3aa10",
    "status": "processing",
    "url": "https://example.com",
    "created_at": "2026-06-23T10:30:00Z",
    "updated_at": "2026-06-23T10:30:02Z"
  }
}

When finished, status becomes completed and the full scrape response is included under data.result:

{
  "success": true,
  "data": {
    "job_id": "3f9a2c8d-1e7b-4a06-9c2d-8f2a91c3aa10",
    "status": "completed",
    "result": {
      "success": true,
      "data": { "content": "...", "status_code": 200 },
      "metadata": { "elapsed_time": 7.4 },
      "billing": { "charged": 0.0035 }
    }
  }
}
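Because the scrape response is nested two levels down (data.result.data), a small helper can make the unwrapping explicit. The sketch below uses only the field names shown in the sample response. `unwrap_result` is a hypothetical name, and treating metadata and billing as optional is an assumption of the sketch, not documented behavior.

```python
from typing import Any, Dict, Optional


def unwrap_result(job_response: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return a flat summary of a completed job, or None if it is not completed."""
    job = job_response["data"]
    if job["status"] != "completed":
        return None
    result = job["result"]  # the full /v1/scrape response
    return {
        "content": result["data"]["content"],
        "status_code": result["data"]["status_code"],
        # .get() because this sketch does not assume every result carries these blocks
        "elapsed_time": result.get("metadata", {}).get("elapsed_time"),
        "charged": result.get("billing", {}).get("charged"),
    }
```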

Job statuses

Status      Meaning
queued      Accepted, waiting for a worker.
processing  Being processed now.
completed   Done. result is available.
failed      Could not complete. See error.
timed_out   Exceeded the job timeout.
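One way to act on these statuses in client code is to treat queued and processing as "keep waiting", completed as success, and the other two as errors. The sketch below takes the object found under data in the poll response. `resolve` and `JobFailed` are hypothetical names, and the location of the error details (an "error" key next to "status") is an assumption, since the page does not show a failed response.

```python
from typing import Any, Dict, Optional

PENDING = {"queued", "processing"}
TERMINAL = {"completed", "failed", "timed_out"}


class JobFailed(Exception):
    """Raised for failed or timed_out jobs (hypothetical helper type)."""


def resolve(job: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the result if completed, None if still pending, raise otherwise."""
    status = job["status"]
    if status in PENDING:
        return None
    if status == "completed":
        return job["result"]
    if status in TERMINAL:
        # Assumption: error details live under an "error" key next to "status".
        raise JobFailed(f"{status}: {job.get('error', 'no error details')}")
    raise ValueError(f"unknown job status: {status!r}")
```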

Polling tips

  • Poll every 2–5 seconds. There is no benefit to polling faster than once per second.
  • Completed job results are retained for 24 hours, then purged.
  • A failed core unlock is not billed, the same as synchronous requests.
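
import os
import time

import requests

H = {"X-API-Key": os.environ["OMNISCRAPE_KEY"]}

# Submit the job; the response carries the job id used for polling.
job = requests.post(
    "https://api.omniscrape.io/v1/scrape/async",
    headers=H, json={"url": "https://example.com"},
).json()["job_id"]

# Poll every 3 seconds until the job reaches a terminal status.
while True:
    r = requests.get(f"https://api.omniscrape.io/v1/jobs/{job}", headers=H).json()
    data = r["data"]
    if data["status"] in ("completed", "failed", "timed_out"):
        break
    time.sleep(3)

# result is available once the job has completed; otherwise this prints None.
print(data.get("result"))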
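
The loop above waits indefinitely if a job never reaches a terminal status. A bounded variant, sketched below, clamps the interval to the suggested 2 to 5 second window and gives up after a client-side deadline. `poll_job` and its parameters are hypothetical, and the 300 second default is an arbitrary choice for illustration, not a documented limit. `fetch` is any zero-argument callable that returns the parsed poll response.

```python
import time
from typing import Any, Callable, Dict

TERMINAL = {"completed", "failed", "timed_out"}


def poll_job(
    fetch: Callable[[], Dict[str, Any]],
    interval: float = 3.0,
    max_wait: float = 300.0,  # arbitrary client-side limit, not an API value
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Call fetch() until the job is terminal or max_wait seconds have passed."""
    interval = min(max(interval, 2.0), 5.0)  # keep within the suggested 2 to 5 s window
    deadline = clock() + max_wait
    while True:
        job = fetch()["data"]
        if job["status"] in TERMINAL:
            return job
        if clock() + interval > deadline:
            raise TimeoutError(f"job still {job['status']} after {max_wait} s")
        sleep(interval)
```

With the names from the example above, `fetch` could be `lambda: requests.get(f"https://api.omniscrape.io/v1/jobs/{job}", headers=H).json()`. The `sleep` and `clock` parameters exist only so the loop can be exercised in tests without real waiting.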