
Consuming Content

With the complete Packt knowledge graph alongside the content, API responses can become quite large. 1 MB of source content could carry 3-5 MB of associated context.

We therefore recommend choosing the endpoint that matches your needs, or using field selectors to return only the content you are interested in.

Endpoints

Content is available via three HTTP endpoints:

Abridged (default)

GET /v1/content/{id}

Returns the document in abridged Packt Content Format with entity IDs only — entity metadata is not expanded. This is the smallest JSON response and is suitable for applications that either do not need entity data or resolve entities separately.

Complete

GET /v1/content/{id}/complete

Returns the full Packt Content Format with all entity data expanded inline. Use this when your application needs the complete knowledge graph context alongside the content.

Markdown

GET /v1/content/{id}/md

Returns the document as plain Markdown. Useful for integrating Content Lake content directly into third-party systems.
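The three endpoints differ only in their path suffix, so a client can pick the variant per call. The following is a minimal sketch, not an official client: the host `https://api.example.com` is a placeholder (the real base URL is not stated here), and the helper name `content_url` and the variant labels are hypothetical.

```python
from urllib.parse import quote

# Assumption: placeholder host; substitute the real Content Lake API base URL.
BASE_URL = "https://api.example.com"

# Path suffix for each documented response variant.
VARIANT_SUFFIX = {
    "abridged": "",           # GET /v1/content/{id}
    "complete": "/complete",  # GET /v1/content/{id}/complete
    "markdown": "/md",        # GET /v1/content/{id}/md
}


def content_url(content_id: str, variant: str = "abridged") -> str:
    """Build the URL for one of the three documented content endpoints."""
    if variant not in VARIANT_SUFFIX:
        raise ValueError(f"unknown variant: {variant!r}")
    # Percent-encode the ID so unusual characters cannot alter the path.
    encoded_id = quote(content_id, safe="")
    return f"{BASE_URL}/v1/content/{encoded_id}{VARIANT_SUFFIX[variant]}"
```

Defaulting to the abridged variant mirrors the API's own default and keeps the smallest response as the path of least resistance.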

Consider the knowledge graph

A significant part of the Content Lake's value is the knowledge graph backing the content. When you use plain Markdown, that value is lost. Consider whether your usage would benefit from consuming Packt content with knowledge graph entities, adapting the content to fit your product, and then passing the result as Markdown to your downstream client.

HTTP response format

Most other Packt API responses are JSON and include Content-Type: application/json. The Markdown endpoint uses Content-Type: text/markdown. Make sure your client supports this.
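A client that calls both JSON and Markdown endpoints can branch on the Content-Type header rather than on the URL. The sketch below assumes the Markdown body is UTF-8 encoded, which is not stated above; the function name `decode_body` is hypothetical.

```python
import json
from typing import Any


def decode_body(content_type: str, body: bytes) -> Any:
    """Decode a response body according to its Content-Type header."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/json":
        return json.loads(body)
    if media_type == "text/markdown":
        # Assumption: the Markdown body is UTF-8 encoded.
        return body.decode("utf-8")
    raise ValueError(f"unexpected Content-Type: {content_type!r}")
```

Raising on an unknown media type surfaces a misconfigured request early instead of passing Markdown text to a JSON parser.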

Versioning

All endpoints accept an optional version, given as a path segment after the content ID. If you do not specify a version, the latest is returned.

GET /v1/content/{id}              — latest version
GET /v1/content/{id}/latest       — equivalent
GET /v1/content/{id}/{version-id} — specific version

See Versioning for details on the version model and compaction.
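The version path forms listed above can be sketched as a small URL builder. This is an illustration only: the host is a placeholder, the helper name `versioned_content_url` is hypothetical, and the sketch covers only the default (abridged) endpoint because the way a version combines with the other endpoints is not shown here.

```python
from typing import Optional
from urllib.parse import quote

# Assumption: placeholder host; substitute the real Content Lake API base URL.
BASE_URL = "https://api.example.com"


def versioned_content_url(content_id: str, version_id: Optional[str] = None) -> str:
    """Return the URL for a specific version, or for the latest when none is given."""
    url = f"{BASE_URL}/v1/content/{quote(content_id, safe='')}"
    if version_id is None:
        # Omitting the version is documented as equivalent to /latest.
        return url
    return f"{url}/{quote(version_id, safe='')}"
```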

Streaming

All API endpoints stream responses. Clients do not need to wait for the whole response before starting to process data.

However, the native Packt Content Format is a single JSON document in which each block is an element of a content array. A client must therefore wait for every block to arrive before it can parse the response with confidence.
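To illustrate why a partially received JSON document cannot be parsed with a standard parser, the sketch below feeds a truncated body to Python's `json` module. The `content` and `text` field names are invented for the example and are not the real Packt Content Format schema.

```python
import json
from typing import Any, Optional


def try_parse(text: str) -> Optional[Any]:
    """Return the parsed document, or None if the JSON is incomplete or invalid."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


# Illustrative payloads only; the field names are assumptions.
complete_body = '{"content": [{"text": "a"}, {"text": "b"}]}'
partial_body = '{"content": [{"text": "a"}, '  # stream cut off after the first block

parsed = try_parse(complete_body)
still_waiting = try_parse(partial_body) is None
```

Here `parsed` holds both blocks, while `still_waiting` is true because the array and the enclosing object were never closed.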

JSONL / NDJSON

For real-time processing (e.g. text-to-speech pipelines), a JSONL / NDJSON response option streams each block as a newline-delimited element. This response option contains only Packt Content Format (PCF) blocks; it omits the metadata that normal API responses include. It does, however, allow you to build real-time experiences.

For example, to offer a Content Lake document as an audio experience, you could stream the document to a text-to-speech model that also supports streaming and, within 100 milliseconds, begin streaming the output audio to your end users.
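Consuming the newline-delimited stream can be sketched as a generator that yields each block as soon as its line has been read. The block fields (`type`, `text`) are invented for illustration and are not the real PCF schema, and the in-memory stream stands in for a streamed HTTP response body.

```python
import io
import json
from typing import Any, BinaryIO, Dict, Iterator


def iter_blocks(stream: BinaryIO) -> Iterator[Dict[str, Any]]:
    """Yield one parsed block per NDJSON line as soon as that line is read."""
    for raw_line in stream:
        line = raw_line.strip()
        if not line:
            continue  # Tolerate blank lines between records.
        yield json.loads(line)


# Stand-in for a streamed HTTP body. The "type" and "text" fields are
# assumptions made for this example only.
fake_body = io.BytesIO(
    b'{"type": "heading", "text": "Intro"}\n'
    b'{"type": "paragraph", "text": "Hello."}\n'
)

for block in iter_blocks(fake_body):
    # A real pipeline might hand the block text to a streaming TTS model here.
    print(block["text"])
```

Because each line is a complete JSON value, the first block can be processed before the rest of the document has arrived. Any binary file-like object that iterates by line can be passed in, such as the response object returned by `urllib.request.urlopen`.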

If your product experience requires both streaming and content metadata, make one request to the normal endpoint with field selectors for metadata only, and a separate request to the NDJSON endpoint for the content stream.

Caching Guidance

Application type        Recommended cache duration
Editing applications    No caching
Internal tools          < 60 seconds
End-user applications   Up to 1 hour

For end-user applications, longer caches are acceptable only for content ingested more than 24 hours ago, as that content should have been fully processed within that time window. We still caution against caches longer than 1 hour.
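The guidance above could be encoded as a small helper that picks a cache lifetime. This is a sketch under stated assumptions: the application type labels and the function name are hypothetical, "< 60 seconds" is read as at most 59 seconds, and the 60-second lifetime for recently ingested end-user content is an assumed value, since the guidance does not give one.

```python
from datetime import datetime, timedelta

# Upper bounds taken from the caching table, in seconds.
MAX_TTL_SECONDS = {
    "editing": 0,      # No caching.
    "internal": 59,    # "< 60 seconds", read here as at most 59.
    "end_user": 3600,  # Up to 1 hour.
}

# Content ingested within this window may not be fully processed yet.
PROCESSING_WINDOW = timedelta(hours=24)

# Assumption: no figure is given for recently ingested content; 60 s is a guess.
FRESH_CONTENT_TTL_SECONDS = 60


def cache_ttl_seconds(app_type: str, ingested_at: datetime, now: datetime) -> int:
    """Return a cache lifetime in seconds that stays within the guidance."""
    ttl = MAX_TTL_SECONDS[app_type]
    if app_type == "end_user" and now - ingested_at <= PROCESSING_WINDOW:
        # Recently ingested content may still change, so cache it briefly.
        ttl = min(ttl, FRESH_CONTENT_TTL_SECONDS)
    return ttl
```

Passing `now` explicitly keeps the function deterministic and easy to test; callers would typically supply the current time in the same timezone convention as `ingested_at`.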