POST /v1/extract/schema
Extract Schema Endpoint
curl --request POST \
  --url https://api.extract.page/v1/extract/schema \
  --header 'Content-Type: application/json' \
  --header 'X-API-KEY: <api-key>' \
  --data '
{
  "url": "<string>",
  "schema": {},
  "strict": true,
  "auto_schema": false,
  "extract_images": true
}
'
{
  "values": {},
  "evidence": {},
  "page_count": 123,
  "ungrounded_fields": [
    "<string>"
  ],
  "generated_schema": {}
}

Authorizations

X-API-KEY: string, header, required

Body

application/json

Input to the schema-extraction pipeline.

Given a document and an arbitrary user-supplied JSON schema, the pipeline fills the schema's fields from the document and cites where each value came from. The model only ever emits chunk indices and a verbatim quote; the backend remaps each index to {page, bbox, text} and verifies the quote, so a value can be grounded to a real bounding box without the model emitting (or hallucinating) coordinates.
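The remap-and-verify step described above can be sketched roughly as follows. This is an illustration only, not the service's actual implementation: the `Chunk` and `Citation` types, the `ground` function, and the whitespace-normalized substring check are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Chunk:
    """A piece of document text with a known location (hypothetical shape)."""
    page: int
    bbox: Tuple[float, float, float, float]
    text: str


@dataclass
class Citation:
    """What the model emits: a chunk index and a verbatim quote, no coordinates."""
    chunk_index: int
    quote: str


def _normalize(s: str) -> str:
    # Assumption: collapse whitespace before comparing the quote to chunk text.
    return " ".join(s.split())


def ground(citation: Citation, chunks: List[Chunk]) -> Optional[dict]:
    """Remap index -> {page, bbox, text}; return None if the quote does not verify."""
    if not 0 <= citation.chunk_index < len(chunks):
        return None  # the model pointed at a chunk that does not exist
    chunk = chunks[citation.chunk_index]
    quote = _normalize(citation.quote)
    if not quote or quote not in _normalize(chunk.text):
        return None  # quote is empty or not found verbatim in the cited chunk
    return {"page": chunk.page, "bbox": chunk.bbox, "text": chunk.text}
```

Because the coordinates come from the chunk table rather than from the model, a citation that fails verification can simply be treated as ungrounded instead of producing a made-up box.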

url: string | null
schema: object | null
strict: boolean, default: true
auto_schema: boolean, default: false
extract_images: boolean, default: true
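As a rough sketch, the request body could be assembled from Python's standard library like this. The endpoint, header names, and field names come from this page; the `build_extract_request` helper, the example schema, and the example document URL are hypothetical, and the sketch assumes `url` is the location of the document to extract from.

```python
import json
import urllib.request

ENDPOINT = "https://api.extract.page/v1/extract/schema"


def build_extract_request(api_key: str, document_url: str, schema: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the schema-extraction endpoint."""
    body = {
        "url": document_url,
        "schema": schema,
        # The values below restate the documented defaults explicitly.
        "strict": True,
        "auto_schema": False,
        "extract_images": True,
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-KEY": api_key},
        method="POST",
    )


# Hypothetical schema and document URL, for illustration only.
example_schema = {
    "type": "object",
    "properties": {"invoice_number": {"type": "string"}},
}
request = build_extract_request("<api-key>", "https://example.com/invoice.pdf", example_schema)
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the payload before any network call is made.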

Response

Successful Response

values: object, required
evidence: object, required
page_count: integer, required
ungrounded_fields: string[]
generated_schema: object | null
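One way a client might consume this response is to separate values that were grounded from those that were not. The sketch below assumes that `ungrounded_fields` lists top-level keys of `values` whose citations could not be verified; this page does not spell out that relationship, so treat it as an assumption. The `split_by_grounding` helper and the sample response are hypothetical.

```python
from typing import Any, Dict, Tuple


def split_by_grounding(response: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (grounded, flagged) views of response["values"].

    Assumption: each entry of "ungrounded_fields" names a top-level key in "values".
    """
    ungrounded = set(response.get("ungrounded_fields") or [])
    values = response["values"]
    grounded = {k: v for k, v in values.items() if k not in ungrounded}
    flagged = {k: v for k, v in values.items() if k in ungrounded}
    return grounded, flagged


# Hypothetical response, shaped like the example on this page.
sample = {
    "values": {"invoice_number": "INV-001", "total": "42.00"},
    "evidence": {},
    "page_count": 2,
    "ungrounded_fields": ["total"],
    "generated_schema": None,
}
grounded, flagged = split_by_grounding(sample)
```

Flagged values can then be routed to manual review rather than being trusted as-is.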