pb.documents ingests source files and reads what was extracted from them. A
document is the developer-facing handle on the source/file substrate; see
Sources for the provenance view of the same material.
Ingest a document
upload is the high-level path: it requests a signed URL, uploads the bytes, and
registers the file in one call.
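The one-call path can be sketched as below. This is a minimal illustration, not the documented signature: the input fields (`filename`, `contentType`, `bytes`) and the `DocumentsUploader` interface are assumptions. Only the returned document id and status come from the method table on this page.

```typescript
// Hypothetical input shape: these field names are assumptions for illustration.
interface UploadInput {
  filename: string;
  contentType: string;
  bytes: Uint8Array;
}

// The method table says upload returns the document id and status.
interface UploadResult {
  id: string;
  status: string;
}

// The minimal slice of pb.documents this sketch relies on.
interface DocumentsUploader {
  upload(input: UploadInput): Promise<UploadResult>;
}

// Upload one file in a single call and hand back its id and status.
async function ingestFile(
  documents: DocumentsUploader,
  filename: string,
  contentType: string,
  bytes: Uint8Array,
): Promise<UploadResult> {
  return documents.upload({ filename, contentType, bytes });
}
```

With a real client this would be called as `ingestFile(pb.documents, "report.pdf", "application/pdf", bytes)`, provided the client's `upload` accepts an input of this shape.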
The lower-level path splits this into three steps: createSignedUploadUrl issues
the URL and storage path, you upload the bytes yourself, and register then
records the storage path.
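The three manual steps might be wired together as follows. This is a sketch under stated assumptions: the field names `url` and `storagePath`, the `filename` input, and the `ManualDocuments` interface are hypothetical. How the bytes reach the signed URL depends on the storage backend, so the transfer is passed in as a caller-supplied function rather than assumed.

```typescript
// Hypothetical result shape: `url` and `storagePath` are assumed field names.
interface SignedUpload {
  url: string;
  storagePath: string;
}

// The minimal slice of pb.documents this sketch relies on (inputs are assumptions).
interface ManualDocuments {
  createSignedUploadUrl(input: { filename: string }): Promise<SignedUpload>;
  register(input: { storagePath: string }): Promise<{ id: string; status: string }>;
}

// Caller-supplied transfer step: sends the bytes to the signed URL.
type SendBytes = (url: string, bytes: Uint8Array) => Promise<void>;

async function ingestManually(
  documents: ManualDocuments,
  sendBytes: SendBytes,
  filename: string,
  bytes: Uint8Array,
): Promise<{ id: string; status: string }> {
  // Step 1: ask for a signed URL and the storage path it writes to.
  const signed = await documents.createSignedUploadUrl({ filename });
  // Step 2: upload the bytes yourself.
  await sendBytes(signed.url, bytes);
  // Step 3: register the uploaded file by its storage path.
  return documents.register({ storagePath: signed.storagePath });
}
```

Keeping the transfer step separate is what makes this path useful: the caller can stream, retry, or report progress on the upload before anything is registered.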
Methods
| Method | Description |
|---|---|
| pb.documents.upload(input) | Signed-URL upload + register, in one call. Returns the document id and status. |
| pb.documents.createSignedUploadUrl(input) | Issue a signed upload URL and storage path. |
| pb.documents.register(input) | Register an already-uploaded file by storage path. |
| pb.documents.create(input) | Ingest a document from content or a reference. |
| pb.documents.search(input) | RAG search over the document’s source chunks. |
| pb.documents.list(input?) | List documents in the project. |
| pb.documents.get(id) | Read a single document. |
| pb.documents.extract(id, input?) | Run extraction over the document through a shape. |
Search documents
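A search call could look like the sketch below. The input fields (`query`, `limit`), the hit fields (`documentId`, `text`, `score`), and the assumption that `search` resolves to an array of hits are all hypothetical. This page only states that search runs over source chunks.

```typescript
// Hypothetical input shape: `query` and `limit` are assumed field names.
interface SearchInput {
  query: string;
  limit?: number;
}

// Hypothetical hit shape: one matching source chunk per hit.
interface SearchHit {
  documentId: string;
  text: string;
  score: number;
}

// The minimal slice of pb.documents this sketch relies on.
interface DocumentsSearcher {
  search(input: SearchInput): Promise<SearchHit[]>;
}

// Return the text of the matching chunks, in the order the search returned them.
async function topChunks(
  documents: DocumentsSearcher,
  query: string,
  limit: number = 5,
): Promise<string[]> {
  const hits = await documents.search({ query, limit });
  return hits.map((hit) => hit.text);
}
```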
Over REST these are the /v1/documents endpoints: upload corresponds to the
signed-URL + register flow, and search maps to POST /v1/documents/search.
Related
Sources
The provenance view: read entities scoped to the source they came from.
Capture and extract
Run a document through a shape.