S3 Connector #
Register S3 Connector #
curl -XPUT "http://localhost:9000/connector/s3?replace=true" -d '{
"name" : "S3 对象存储连接器",
"description" : "提取 S3 云端文件元数据信息。",
"category" : "cloud_storage",
"icon" : "/assets/icons/connector/s3/icon.png",
"tags" : [
"s3",
"storage"
],
"url" : "http://coco.rs/connectors/s3",
"assets" : {
"icons" : {
"default" : "/assets/icons/connector/s3/icon.png"
}
}
}'
Use
s3
as a unique identifier, as it is a builtin connector.
Update coco-server’s config #
Below is an example configuration for enabling the S3 Connector in coco-server:
connector:
s3:
enabled: true
queue:
name: indexing_documents
interval: 30s
Explanation of Config Parameters #
Field | Type | Description |
---|---|---|
enabled | boolean | Enables or disables the S3 connector. Set totrue to activate it. |
interval | string | Specifies the time interval (e.g.,60s ) at which the connector will check for updates. |
queue.name | string | Defines the name of the queue where indexing tasks will be added. |
Use the S3 Connector #
The S3 Connector allows you to index data from your S3 service into your system. Follow these steps to set it up:
Configure S3 Client #
To configure your S3 connection, you’ll need to provide several key parameters. This setup allows your application to securely authenticate with and access objects within a specified S3 bucket.
First, you’ll need your S3 credentials:
access_key_id
: This is your public identifier, similar to a username.
secret_access_key
: This is your confidential key, like a password, used to sign requests and prove your identity. Keep this highly secure.
Next, specify the location of your data:
bucket
: This is the name of the S3 bucket where your objects are stored and which you intend to interact with.
endpoint
: This is the URL of your S3 service. It could be an AWS S3 endpoint (e.g., s3.us-west-2.amazonaws.com) or the address of another S3-compatible service (e.g., a MinIO instance).
Finally, you have a few optional parameters to fine-tune data access:
use_ssl
: A boolean flag (defaults to true) to enable or disable SSL/TLS (HTTPS) for secure communication. It’s strongly recommended to keep this enabled.
prefix
: A string that acts as a filter. If provided, the system will only process objects whose keys (paths) start with this string. For instance, a prefix of documents/ would limit access to files within that “folder.”
extensions
: An array of strings defining specific file extensions to include (e.g., [“pdf”, “docx”]). If this list is empty or omitted, all file types in the specified path will be considered.
Example Request #
Here is an example request to configure the Notion Connector:
curl -H 'Content-Type: application/json' -XPOST "http://localhost:9000/datasource/" -d '
{
"name":"My S3 Documents",
"type":"connector",
"connector":{
"id":"s3",
"config":{
"access_key_id": "your-access_key_id",
"secret_access_key":"your-secret_access_key",
"bucket":"your-bucket",
"endpoint":"your-s3-service-endpoint",
"use_ssl":true,
"prefix":"",
"extensions": [ "pdf", "docx", "txt" ]
}
}
}''
Supported Config Parameters for S3 Connector #
Below are the configuration parameters supported by the S3 Connector:
Field | Type | Description |
---|---|---|
access_key_id | string | Your S3 access key ID. |
secret_access_key | string | Your S3 secret access key. |
bucket | string | The name of the S3 bucket to index. |
endpoint | string | The S3 service endpoint URL. |
use_ssl | bool | Optional. Whether to use SSL for the connection. Defaults totrue . |
prefix | string | Optional. A prefix to filter objects in the bucket (e.g.,documents/ ). |
extensions | []string | Optional. An array of file extensions to include (e.g., pdf, docx). If omitted or empty, all files will be indexed. |