Creating Products
Products on the Packt Platform are lightweight wrappers around Content Lake content. This page covers the format generation pipeline for Content Lake-backed products, product assets, and how products are organised into series, bundles, and collections.
Format Generation
Format generation is the core differentiator for Content Lake-backed products. The design principle is that no rendered output is stored: every format request triggers a real-time generation and streaming pipeline.
    Request
       │
       ▼
┌──────────────┐   ┌──────────────┐   ┌──────────────┐
│ CL Document  │──▶│    Asset     │──▶│   Format     │
│  Retrieval   │   │  Injection   │   │  Rendering   │
│ (streaming)  │   │              │   │ (streaming)  │
└──────────────┘   └──────────────┘   └──────────────┘
                                             │
                                             ▼
                                      Response Stream
1. Content Lake Document Retrieval
Documents are fetched from the Content Lake using its JSONL streaming endpoint, which delivers each block as a newline-delimited JSON element with sub-100ms time-to-first-byte. For pinned documents, the specific version is requested. For latest-tracking documents, the latest version is fetched.
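As a rough illustration of consuming a JSONL stream, the sketch below parses one block per line as it arrives and picks the version to request. The block field names (`type`, `level`, `text`), the helper names, and the `"latest"` selector are assumptions made for the example, not the Content Lake's actual schema or API.

```python
import json
from typing import Iterable, Iterator, Optional


def iter_blocks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed block per JSONL line, without waiting for the full document."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        yield json.loads(line)


def version_to_request(pinned_version: Optional[str]) -> str:
    """Pinned documents request their pin; latest-tracking ones request 'latest' (assumed selector)."""
    return pinned_version if pinned_version is not None else "latest"


# Hypothetical sample stream; the field names are illustrative assumptions.
sample_lines = [
    '{"type": "heading", "level": 1, "text": "Introduction"}\n',
    "\n",
    '{"type": "paragraph", "text": "Welcome."}\n',
]
blocks = list(iter_blocks(sample_lines))
```

Because `iter_blocks` is a generator, a caller can start working on the first block before the last one has been received, which is the property the streaming endpoint is meant to preserve.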
2. Asset Injection
Product assets (covers, imprint) are injected at the appropriate positions. The table of contents is generated from document heading structure (H1, H2, H3 blocks). The index is generated from entity and keyword data in the Content Lake's knowledge graph.
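The following sketch shows one way the table of contents and asset placement could work on a sequence of blocks. The block shape, the function names, and the placement order (front cover, imprint, content, back cover) are assumptions for illustration; the platform's actual positions are not specified here.

```python
from typing import Iterable, Iterator, List, Tuple


def build_toc(blocks: Iterable[dict]) -> List[Tuple[int, str]]:
    """Collect (level, text) pairs from H1-H3 heading blocks, in document order."""
    return [
        (b["level"], b["text"])
        for b in blocks
        if b.get("type") == "heading" and b.get("level") in (1, 2, 3)
    ]


def inject_assets(
    content: Iterable[dict], front_cover: dict, imprint: dict, back_cover: dict
) -> Iterator[dict]:
    """Wrap the content stream with product assets (placement order is an assumption)."""
    yield front_cover
    yield imprint
    yield from content
    yield back_cover


# Hypothetical document blocks used only to exercise the helpers.
doc_blocks = [
    {"type": "heading", "level": 1, "text": "Getting Started"},
    {"type": "paragraph", "text": "Some body text."},
    {"type": "heading", "level": 2, "text": "Installation"},
    {"type": "heading", "level": 4, "text": "Too deep for the ToC"},
]
toc = build_toc(doc_blocks)
assembled = list(
    inject_assets(
        doc_blocks,
        front_cover={"type": "asset", "name": "front_cover"},
        imprint={"type": "asset", "name": "imprint"},
        back_cover={"type": "asset", "name": "back_cover"},
    )
)
```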
3. Format Rendering
The assembled content stream is rendered into the target format (PDF, ePub, InDesign) and streamed back to the client. The client receives bytes as they are rendered — there is no buffering of the complete output.
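The unbuffered behaviour can be sketched with Python generators. The renderer below is a plain-text stand-in, not a real PDF or ePub renderer, and all names are hypothetical; the point is only that the first output chunk is produced after a single input block has been read.

```python
from typing import Iterable, Iterator, List

consumed: List[int] = []  # records which source blocks have been pulled so far


def tracked_source() -> Iterator[dict]:
    """Hypothetical upstream stream that notes each block as it is pulled."""
    for i in range(3):
        consumed.append(i)
        yield {"text": f"block {i}"}


def render_stream(blocks: Iterable[dict]) -> Iterator[bytes]:
    """Stand-in renderer: emits one chunk of bytes per block as soon as it is available."""
    for block in blocks:
        yield (block["text"] + "\n").encode("utf-8")


stream = render_stream(tracked_source())
first_chunk = next(stream)            # only the first block has been pulled here
consumed_at_first_chunk = list(consumed)
remaining = list(stream)              # draining the stream pulls the rest
```

A real renderer would likely need some look-ahead for layout, so this shows the streaming shape of the pipeline rather than how any particular format is produced.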
Determinism
For products where all documents are pinned to specific versions, the generated output is deterministic — the same request always produces the same file. This is important for downstream consumers that need reproducible artifacts.
For products with latest-tracking documents, the output reflects the Content Lake state at generation time.
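To make the distinction concrete, the sketch below checks whether a product is fully pinned and, if so, derives a stable fingerprint for a request. The `DocumentRef` type and the fingerprint scheme are hypothetical, one possible way a downstream consumer might label a reproducible artifact, and are not part of the platform.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DocumentRef:
    doc_id: str
    pinned_version: Optional[str] = None  # None means latest-tracking


def is_reproducible(refs: List[DocumentRef]) -> bool:
    """True only when every document is pinned to a specific version."""
    return all(ref.pinned_version is not None for ref in refs)


def request_fingerprint(refs: List[DocumentRef], target_format: str) -> str:
    """Hash the format plus the ordered (document, version) pairs."""
    if not is_reproducible(refs):
        raise ValueError("latest-tracking documents have no stable fingerprint")
    parts = [target_format] + [f"{r.doc_id}@{r.pinned_version}" for r in refs]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


pinned = [DocumentRef("doc-a", "3"), DocumentRef("doc-b", "7")]
mixed = [DocumentRef("doc-a", "3"), DocumentRef("doc-b")]
```

Document order is kept in the hash on purpose, since reordering the same documents would produce a different file.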
Performance
The pipeline is designed for low latency:
- Content Lake streaming: sub-100ms time-to-first-byte
- Asset injection and ToC generation operate on the stream as it flows through
- End-to-end time-to-first-byte: sub-second target for typical book products
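A client could check the time-to-first-byte target with a small helper like the one below. This is a minimal sketch: `slow_stream` is a hypothetical stand-in for a real response stream, and the one-second threshold simply mirrors the sub-second target stated above.

```python
import time
from typing import Iterable, Iterator, Tuple


def time_to_first_byte(stream: Iterable[bytes]) -> Tuple[float, bytes]:
    """Return (seconds until the first chunk arrived, the first chunk)."""
    start = time.perf_counter()
    first_chunk = next(iter(stream))
    return time.perf_counter() - start, first_chunk


def slow_stream() -> Iterator[bytes]:
    """Hypothetical response stream with a small delay before the first chunk."""
    time.sleep(0.01)
    yield b"%PDF-"
    yield b"rest of the output"


elapsed, first = time_to_first_byte(slow_stream())
within_target = elapsed < 1.0  # sub-second target from the list above
```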
Product Assets
| Asset | Source | Stored |
|---|---|---|
| Front Cover | Uploaded image | Yes |
| Back Cover | Uploaded image | Yes |
| Imprint | Generated from product/publisher metadata | No |
| Table of Contents | Generated from CL document headings | No |
| Index | Generated from CL entity/keyword data | No |
For Content Lake-backed products, cover images are the only binary assets stored. The imprint, table of contents, and index are generated dynamically during format generation and never persisted.
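The table can be expressed as data, which makes the stored/generated split easy to query. The `AssetKind` type and helper names are assumptions made for this sketch; the rows mirror the table above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AssetKind:
    name: str
    source: str
    stored: bool  # True if a binary is persisted, False if generated per request


ASSET_KINDS = [
    AssetKind("Front Cover", "Uploaded image", True),
    AssetKind("Back Cover", "Uploaded image", True),
    AssetKind("Imprint", "Generated from product/publisher metadata", False),
    AssetKind("Table of Contents", "Generated from CL document headings", False),
    AssetKind("Index", "Generated from CL entity/keyword data", False),
]


def stored_assets(kinds: List[AssetKind]) -> List[str]:
    """Names of assets that are persisted."""
    return [k.name for k in kinds if k.stored]


def generated_assets(kinds: List[AssetKind]) -> List[str]:
    """Names of assets that are produced during format generation."""
    return [k.name for k in kinds if not k.stored]
```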
Series, Bundles, and Collections
Series
A series is an ordered collection of products that share a series identifier and metadata. Each product retains its own lifecycle, pricing, and metadata. Series membership is an association, not a structural dependency.
Bundles
A bundle packages multiple products for sale as a unit with bundle-specific pricing. Bundles have their own lifecycle and can be published and retired independently of their constituent products.
Collections
Collections are curated groupings for merchandising and editorial purposes. Unlike series (which have a fixed order and shared identity) and bundles (which have pricing), collections are lightweight — simply named lists of product references.
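The three grouping types can be contrasted in a small data model. All class and field names below are hypothetical, and prices are held in integer cents purely for the example; the sketch only encodes the distinctions described above (ordered association, independently priced unit with its own lifecycle, and plain named list).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Product:
    product_id: str
    price_cents: int
    status: str = "published"


@dataclass
class Series:
    series_id: str
    product_ids: List[str] = field(default_factory=list)  # order is significant


@dataclass
class Bundle:
    bundle_id: str
    product_ids: List[str]
    price_cents: int       # bundle-specific price, independent of member prices
    status: str = "draft"  # the bundle has its own lifecycle

    def publish(self) -> None:
        self.status = "published"

    def retire(self) -> None:
        self.status = "retired"


@dataclass
class Collection:
    name: str
    product_ids: List[str] = field(default_factory=list)  # just references


products = {
    "p1": Product("p1", 2999),
    "p2": Product("p2", 3499),
}
series = Series("learning-x", ["p1", "p2"])
bundle = Bundle("starter-pack", ["p1", "p2"], price_cents=4999)
collection = Collection("Editor's Picks", ["p2"])

bundle.publish()
bundle.retire()  # retiring the bundle leaves its constituent products untouched
```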