S3 (s3)
The s3 connector indexes objects under a bucket prefix. It's S3-compatible, so
the same connector covers AWS S3, Cloudflare R2, Google Cloud Storage (S3
interop), and MinIO — the only difference is endpoint_url.
How MFS sees it
The tree mirrors object keys under the configured bucket and prefix:
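As an illustration only, the sketch below turns a few made-up object keys into the kind of nested listing a key-mirroring tree implies. The keys reuse the acme-docs example from this page; render_tree is a hypothetical helper, not an MFS function, and the real listing format may differ.

```python
def render_tree(keys, prefix=""):
    """Return an indented listing of the keys that fall under `prefix`."""
    lines = []
    seen_dirs = set()
    # Sorting keeps every "directory" and its contents contiguous.
    for key in sorted(k for k in keys if k.startswith(prefix)):
        parts = key.split("/")
        for depth in range(len(parts) - 1):
            directory = "/".join(parts[: depth + 1])
            if directory not in seen_dirs:
                seen_dirs.add(directory)
                lines.append("  " * depth + parts[depth] + "/")
        if parts[-1]:  # keys ending in "/" are directory markers, not files
            lines.append("  " * (len(parts) - 1) + parts[-1])
    return lines


# Made-up keys for illustration.
keys = [
    "engineering/rfc/rfc-001.md",
    "engineering/rfc/rfc-001.pdf",
    "engineering/rfc/drafts/rfc-002.md",
    "marketing/plan.md",  # outside the configured prefix, so not listed
]
lines = render_tree(keys, prefix="engineering/rfc/")
print("\n".join(lines))
```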
Objects are classified by extension exactly like the file connector:
documents and code are converted and embedded; structured text is browse/grep
only; other types are browse/export only.
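A rough sketch of the three tiers described above. The extension sets here are assumptions chosen for illustration; the authoritative lists are whatever the file connector uses, and classify is a hypothetical name.

```python
from pathlib import PurePosixPath

# ASSUMED extension sets, for illustration only.
EMBEDDED = {".md", ".pdf", ".docx", ".py"}   # documents and code
STRUCTURED = {".json", ".csv", ".yaml"}      # structured text


def classify(key):
    """Map an object key to one of the three handling tiers."""
    ext = PurePosixPath(key).suffix.lower()
    if ext in EMBEDDED:
        return "convert+embed"
    if ext in STRUCTURED:
        return "browse/grep"
    return "browse/export"


tiers = {k: classify(k) for k in ["rfc/rfc-001.md", "rfc/data.csv", "rfc/logo.png"]}
print(tiers)
```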
Credentials
Pick the path for your provider:
- AWS S3: an IAM access key (or STS temporary credentials). boto3 reads
  AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY from the environment automatically,
  so you can omit them from the TOML. The minimum IAM policy allows
  s3:ListBucket and s3:GetObject on the scoped bucket and prefix.
- Cloudflare R2: an R2 API token with Object Read; set
  endpoint_url = "https://<account-id>.r2.cloudflarestorage.com" and
  region = "auto".
- GCS (S3 interop): an HMAC key; set
  endpoint_url = "https://storage.googleapis.com".
- MinIO: the service's access/secret key; set endpoint_url to your MinIO URL.
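For the AWS path, a least-privilege read policy could look like the sketch below. It is built in Python so the bucket and prefix can be substituted; minimal_read_policy is a hypothetical helper, and the statement layout is one common way to scope s3:ListBucket and s3:GetObject rather than a policy taken from the connector itself.

```python
import json


def minimal_read_policy(bucket, prefix=""):
    """Build an IAM policy document allowing list + read under one prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # ListBucket is granted on the bucket ARN and narrowed by prefix.
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}*"]}},
            },
            {
                # GetObject is granted on object ARNs under the prefix.
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
        ],
    }


policy = minimal_read_policy("acme-docs", "engineering/rfc/")
print(json.dumps(policy, indent=2))
```

Attach the printed JSON to the IAM user or role whose access key the connector uses.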
Configuration
bucket = "acme-docs"
prefix = "engineering/rfc/"
region = "us-west-2"
access_key_id = "env:AWS_ACCESS_KEY_ID"
secret_access_key = "env:AWS_SECRET_ACCESS_KEY"
# endpoint_url = "https://<account-id>.r2.cloudflarestorage.com" # R2/GCS/MinIO
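To make the "env:" values concrete, here is a minimal sketch of how such a table might be turned into boto3 client arguments. It assumes that "env:NAME" means "read NAME from the environment", which is an inference from the example above and not a documented rule; resolve, client_kwargs, and FIELD_MAP are hypothetical names, and cfg stands for the already-parsed TOML table.

```python
import os

# Config field -> boto3.client keyword argument.
FIELD_MAP = {
    "region": "region_name",
    "access_key_id": "aws_access_key_id",
    "secret_access_key": "aws_secret_access_key",
    "endpoint_url": "endpoint_url",
}


def resolve(value, environ=os.environ):
    """Expand "env:NAME" strings (ASSUMED semantics); pass others through."""
    if isinstance(value, str) and value.startswith("env:"):
        return environ[value[len("env:"):]]
    return value


def client_kwargs(cfg, environ=os.environ):
    """Keep only the fields boto3 needs, renamed to its argument names."""
    return {
        dst: resolve(cfg[src], environ)
        for src, dst in FIELD_MAP.items()
        if src in cfg
    }


cfg = {
    "bucket": "acme-docs",
    "prefix": "engineering/rfc/",
    "region": "us-west-2",
    "access_key_id": "env:AWS_ACCESS_KEY_ID",
    "secret_access_key": "env:AWS_SECRET_ACCESS_KEY",
}
fake_env = {"AWS_ACCESS_KEY_ID": "AKIAEXAMPLE", "AWS_SECRET_ACCESS_KEY": "example-secret"}
kwargs = client_kwargs(cfg, fake_env)
print(kwargs)
# With boto3 installed: boto3.client("s3", **kwargs)
```

For R2, GCS, or MinIO, adding endpoint_url to cfg passes it straight through to the client.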
Sync and freshness
The connector uses each object's etag as its cursor, so re-syncs only re-process
changed objects; deletions are caught by full_scan. Versioned buckets expose
only the latest version.
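The ETag comparison can be pictured with the sketch below. It assumes the connector keeps a key-to-ETag map from the previous run and that deletions are only computed when a full scan is requested; plan_sync and its arguments are hypothetical, not connector internals.

```python
def plan_sync(previous, current, full_scan=False):
    """Compare key -> ETag maps from the last run and the current listing.

    Returns (changed, deleted). `deleted` is only filled in on a full scan,
    mirroring the ASSUMPTION that incremental runs do not look for removals.
    """
    changed = sorted(k for k, etag in current.items() if previous.get(k) != etag)
    deleted = sorted(k for k in previous if k not in current) if full_scan else []
    return changed, deleted


previous = {"rfc/rfc-001.md": "aaa", "rfc/rfc-002.md": "bbb", "rfc/old.md": "ccc"}
current = {"rfc/rfc-001.md": "aaa", "rfc/rfc-002.md": "bbb2", "rfc/rfc-003.md": "ddd"}

incremental = plan_sync(previous, current)
full = plan_sync(previous, current, full_scan=True)
print(incremental)
print(full)
```

Unchanged objects (same ETag) never appear in either list, which is what keeps re-syncs cheap.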
Search and browse
mfs add s3://acme-docs --config ./s3.toml
mfs search "retention policy" s3://acme-docs/engineering/rfc/
mfs cat s3://acme-docs/engineering/rfc/rfc-001.md --range 1:80
mfs export s3://acme-docs/engineering/rfc/rfc-001.pdf /tmp/rfc-001.pdf
Pitfalls
- prefix is exact: use a trailing slash when you mean a directory-like prefix.
- IAM must allow both ListBucket and GetObject for the scoped bucket/prefix.
- Very large PDFs or Office files can be expensive to convert.
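The first pitfall is easiest to see with plain string matching, which is how S3 prefixes behave. The keys below are made up, and under_prefix is a hypothetical stand-in for a prefix-filtered listing.

```python
def under_prefix(keys, prefix):
    """Return the keys an exact string-prefix match would select."""
    return [k for k in keys if k.startswith(prefix)]


keys = [
    "engineering/rfc/rfc-001.md",
    "engineering/rfc-archive/rfc-000.md",
]

without_slash = under_prefix(keys, "engineering/rfc")
with_slash = under_prefix(keys, "engineering/rfc/")
print(without_slash)  # both keys match
print(with_slash)     # only the key inside engineering/rfc/
```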