Offline Processor

Pipeline

What Is a Pipeline?

A pipeline is a combination of functions used to process tasks offline. It follows the pipeline design pattern, just as online request filters do. A processor is the basic unit of a pipeline: each processor focuses on one task, and processors can be flexibly assembled, plugged in, and removed as required.

Pipeline Definition

A typical pipeline service is defined as follows:

pipeline:
- name: request_logging_index
  auto_start: true
  keep_running: true
  processor:
    - json_indexing:
        index_name: "gateway_requests"
        elasticsearch: "dev"
        input_queue: "request_logging"
        idle_timeout_in_seconds: 1
        worker_size: 1
        bulk_size_in_mb: 10 #in MB

The configuration above defines a pipeline named request_logging_index. The processor parameter lists the processing units of the pipeline, which are executed in sequence. In this example the list contains a single json_indexing processor.

Parameter Description

Parameters related to pipeline definition are described as follows:

| Name | Type | Description |
| --- | --- | --- |
| name | string | Name of the pipeline, which must be unique |
| auto_start | bool | Whether the pipeline starts automatically when the gateway starts, that is, whether the task is executed immediately |
| keep_running | bool | Whether the gateway starts executing the task again after an execution completes |
| singleton | bool | Whether the task is a singleton, so that only one node instance in a cluster is allowed to run it |
| max_running_in_ms | int | Maximum running time of the task, in milliseconds. The default value is 60000. |
| retry_delay_in_ms | int | Minimum waiting time before the task is re-executed, in milliseconds. The default value is 5000. |
| processor | array | List of processors to be executed by the pipeline in sequence |
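
As a rough illustration of how these parameters fit together, the sketch below extends the earlier request_logging_index definition with the optional scheduling parameters. The values for max_running_in_ms and retry_delay_in_ms simply restate the documented defaults, and singleton: true is an assumption chosen for the example rather than a recommended setting.

```yaml
pipeline:
- name: request_logging_index   # must be unique
  auto_start: true              # start together with the gateway
  keep_running: true            # execute again after each run completes
  singleton: true               # assumption for this sketch: one node instance per cluster
  max_running_in_ms: 60000      # documented default, written out explicitly
  retry_delay_in_ms: 5000       # documented default, written out explicitly
  processor:                    # processors are executed in sequence
    - json_indexing:
        index_name: "gateway_requests"
        elasticsearch: "dev"
        input_queue: "request_logging"
```

Because the last two values match the defaults, leaving them out should give the same behavior. They are spelled out here only to show where they belong in the definition.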

Processor List

Task Scheduling

Event Processing

Index Writing

Index Diff

Request Replay
