Processor #
A processor performs specific actions on the input document when it flows through the pipeline:
[Doc input] --> [Processor] --> [Doc output]
Pipeline #
A pipeline is a sequence of processors chained together, where the output of one processor becomes the input of the next. Chaining processors lets users build sophisticated document processing:
[Doc input] --> [Processor A] --> [Processor B] --> [Processor C] --> [Doc output]
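The chaining idea can be sketched in a few lines of Python. This is only an illustration, not the product's actual API: the `Doc` type, the `run_pipeline` helper, and the two example processors are hypothetical names chosen for this sketch, and a document is assumed to be a plain dictionary.

```python
from typing import Any, Callable, Dict, List

# Assumption: a document is modeled as a plain dictionary.
Doc = Dict[str, Any]
# Assumption: a processor is any function that takes a document and returns one.
Processor = Callable[[Doc], Doc]


def lowercase_title(doc: Doc) -> Doc:
    """Hypothetical processor: return a copy of the document with a lowercased title."""
    return {**doc, "title": doc["title"].lower()}


def add_word_count(doc: Doc) -> Doc:
    """Hypothetical processor: return a copy of the document with a word count of its body."""
    return {**doc, "word_count": len(doc["body"].split())}


def run_pipeline(processors: List[Processor], doc: Doc) -> Doc:
    """Feed the document through each processor in order."""
    for processor in processors:
        doc = processor(doc)
    return doc


result = run_pipeline(
    [lowercase_title, add_word_count],
    {"title": "Hello World", "body": "one two three"},
)
print(result)
```

Because every processor in this sketch has the same input and output type, processors can be reordered or reused across pipelines without changing their code.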
Processor categories #
Processors are grouped by the type of input they expect.
Document Processors #
Operate on documents. Each pipeline message carries a serialized document.
Attachment Processors #
Operate on attachments. Each pipeline message carries the serialized attachment metadata. Attachment processors load the binary content themselves when they need it.
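The difference between the two categories can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the JSON message encoding, the `id` and `size` fields, and the `load_content` callback are hypothetical and are not taken from the real message format.

```python
import json
from typing import Callable, Dict

# Assumption: a stand-in for wherever attachment binaries actually live.
FAKE_BLOB_STORE: Dict[str, bytes] = {"att-1": b"hello"}


def document_processor(message: bytes) -> dict:
    """The message carries the whole serialized document, so no extra lookup is needed."""
    doc = json.loads(message)
    doc["processed"] = True
    return doc


def attachment_processor(message: bytes, load_content: Callable[[str], bytes]) -> dict:
    """The message carries only metadata; the binary content is loaded on demand."""
    meta = json.loads(message)
    content = load_content(meta["id"])  # hypothetical lookup by attachment id
    meta["size"] = len(content)
    return meta


doc_out = document_processor(json.dumps({"title": "Report"}).encode())
att_out = attachment_processor(
    json.dumps({"id": "att-1", "name": "note.txt"}).encode(),
    FAKE_BLOB_STORE.__getitem__,
)
print(doc_out)
print(att_out)
```

Keeping only metadata in the message keeps pipeline messages small; the cost is that an attachment processor needs access to the underlying storage whenever it reads the content.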