Documents are the primary unit of information in Statiq and are a combination of content and metadata.
Documents are immutable and once their content is changed or a value is added to their metadata, it can never be removed (though it can be overwritten by a cloned document). Though the documentation often talks about documents being "transformed" or "manipulated" by modules, this isn't strictly accurate. Instead modules return a new cloned copy of the document with different content and/or additional metadata, while maintaining all the metadata the original document had.
For example, this visualizes a single document that contains some content as well as two metadata values:
It's tempting to think of documents as being one-to-one with files in the filesystem, but they're much more than that. While files are often the primary way documents are created, they can come from other sources too. It's better to think of documents and being part of a database. In this mode of thought Statiq is like a document database, with Statiq documents being analogous to the document concept from other document databases.
About Metadata
Along with their content, every document also contains metadata. As with documents, metadata is immutable and you must clone a document to add additional metadata. Several modules, such as SetMetadata
, are designed to allow you to manipulate document metadata as part of your pipeline.
Creating and Cloning Documents
There are two ways to get a new document: you can create one from scratch or you can clone an existing one.
To create a document you typically call one of the CreateDocument()
method overloads on the current execution context. These methods let you provide the initial metadata and/or content that the new document should contain.
To clone an existing document and replace or add new content and/or metadata you can call one of the Clone()
methods on the document itself. If you're unsure whether you have a null document, the execution context also provides several CloneOrCreateDocument()
overloads that either clones an existing document or creates a new one depending on if the provided document reference is null or not.
If your module creates or manipulates documents, follow these guidelines and tips on document creation and working with documents:
- Call
Clone()
on existing documents to clone with new properties. - Call
Engine.SetDefaultDocumentType<TDocument>()
to change the default document type. - Call
CreateDocument()
(engine or execution context) to create a new document of the default document type. - Call
CreateDocument<TDocument>()
(engine or execution context) to create a new document of the specified document type. - Call
CloneOrCreateDocument()
(engine or execution context) to either clone or create a new document of the default document type depending on if a passed-in document exists (isnull
) or not. - Call
CloneOrCreateDocument<TDocument>()
(engine or execution context) to either clone or create a new document of the specified document type depending on if a passed-in document exists (isnull
) or not.
Statiq is very flexible with what can be considered a document. You may find that a custom document type better represents your data than creating a standard document. If you already have an existing data element (such as the result of an API call), it might also be helpful to wrap that object as a document instead of copying it’s data to a default document object. Follow these guidelines and tips when working with alternate document types:
- Use base classes:
- Implementing
IDocument
is the minimum requirement, but it’s not recommended to implement this interface directly. - Override
Document<TDocument>
to derive a custom document type with built-in metadata support. - Override
IDocument.Clone()
in custom document types as needed. The default behavior is to perform a member-wise clone.
- Implementing
- Convert an existing object of any type into a
IDocument
using.ToDocument()
extensions:- This wraps the object in an
ObjectDocument<T>
.
- This wraps the object in an
Document Properties
In addition to metadata, every document has a few core properties.
Document ID
Every document has an Id
property. This is a Guid
that uniquely identifies the document within a given execution. Once a document is created every cloned copy of that document, regardless of whether the content or metadata is changed, will have the same ID. This lets you identify the "same" document even after it's been cloned a number of times.
Source and Destination
All documents have two properties that relate to file location: Source
and Destination
. Source
is an absolute path and indicates where on disk the document came from (assuming it came from disk). Destination
is a relative path and indicates where in the output folder the document should be written. Not all documents are intended for output (some are just for conveying data), so not all documents will have a source or destination property.
Content Provider
The content of a document is accessed through a content provider (an instance of IContentProvider
). This lets the framework control access to content and ensures a consistent experience regardless of content source. You should provide a content provider when creating or cloning a document if you want to set it's content. The most common way of getting a content provider for a particular type of content (such as a string
or a Stream
) is to call one of the GetContentProvider()
methods from the current execution context.
Accessing Documents
During execution you can access all the documents generated by each pipeline using the IPipelineOutputs
interface, which is available via the Outputs
property of the current execution context. You can also access documents generated by a given pipeline using various modules such as ConcatDocuments
which are useful when setting up multi-pipeline document flows.
Child Pages
Metadata Values
There are lots of different ways of defining metadata as atomic values, complex objects, or powerful lazily evaluated scripts and delegates.
Accessing Metadata
Every document acts like a dictionary and implements IReadOnlyDictionary<string, object>
for easy access. Metadata key/value pairs can be accessed through this interface.