S3

S3 Connector #

Register S3 Connector #

curl -XPUT "http://localhost:9000/connector/s3?replace=true" -d '{
  "name" : "S3 对象存储连接器",
  "description" : "提取 S3 云端文件元数据信息。",
  "category" : "cloud_storage",
  "icon" : "/assets/icons/connector/s3/icon.png",
  "tags" : [
    "s3",
    "storage"
  ],
  "url" : "http://coco.rs/connectors/s3",
  "assets" : {
  "icons" : {
    "default" : "/assets/icons/connector/s3/icon.png"
    }
  }
}'

Use s3 as a unique identifier, as it is a builtin connector.

Update coco-server’s config #

Below is an example configuration for enabling the S3 Connector in coco-server:

connector:
  s3:
    enabled: true
    queue:
      name: indexing_documents
    interval: 30s

Explanation of Config Parameters #

FieldTypeDescription
enabledbooleanEnables or disables the S3 connector. Set totrue to activate it.
intervalstringSpecifies the time interval (e.g.,60s) at which the connector will check for updates.
queue.namestringDefines the name of the queue where indexing tasks will be added.

Use the S3 Connector #

The S3 Connector allows you to index data from your S3 service into your system. Follow these steps to set it up:

Configure S3 Client #

To configure your S3 connection, you’ll need to provide several key parameters. This setup allows your application to securely authenticate with and access objects within a specified S3 bucket.

First, you’ll need your S3 credentials:

access_key_id: This is your public identifier, similar to a username.

secret_access_key: This is your confidential key, like a password, used to sign requests and prove your identity. Keep this highly secure.

Next, specify the location of your data:

bucket: This is the name of the S3 bucket where your objects are stored and which you intend to interact with.

endpoint: This is the URL of your S3 service. It could be an AWS S3 endpoint (e.g., s3.us-west-2.amazonaws.com) or the address of another S3-compatible service (e.g., a MinIO instance).

Finally, you have a few optional parameters to fine-tune data access:

use_ssl: A boolean flag (defaults to true) to enable or disable SSL/TLS (HTTPS) for secure communication. It’s strongly recommended to keep this enabled.

prefix: A string that acts as a filter. If provided, the system will only process objects whose keys (paths) start with this string. For instance, a prefix of documents/ would limit access to files within that “folder.”

extensions: An array of strings defining specific file extensions to include (e.g., [“pdf”, “docx”]). If this list is empty or omitted, all file types in the specified path will be considered.

Example Request #

Here is an example request to configure the Notion Connector:

curl -H 'Content-Type: application/json' -XPOST "http://localhost:9000/datasource/" -d '
{
    "name":"My S3 Documents",
    "type":"connector",
    "connector":{
        "id":"s3",
         "config":{
            "access_key_id": "your-access_key_id",
            "secret_access_key":"your-secret_access_key",
            "bucket":"your-bucket",
            "endpoint":"your-s3-service-endpoint",
            "use_ssl":true,
            "prefix":"",
            "extensions": [ "pdf", "docx", "txt" ]
        }
    }
}''

Supported Config Parameters for S3 Connector #

Below are the configuration parameters supported by the S3 Connector:

FieldTypeDescription
access_key_idstringYour S3 access key ID.
secret_access_keystringYour S3 secret access key.
bucketstringThe name of the S3 bucket to index.
endpointstringThe S3 service endpoint URL.
use_sslboolOptional. Whether to use SSL for the connection. Defaults totrue.
prefixstringOptional. A prefix to filter objects in the bucket (e.g.,documents/).
extensions[]stringOptional. An array of file extensions to include (e.g., pdf, docx). If omitted or empty, all files will be indexed.
Edit Edit this page