BigQuery (`bigquery`)¶

The bigquery connector indexes BigQuery table rows as searchable records, with a schema summary per table. Use it to search large analytical tables — a knowledge base, an events table — by meaning.

How MFS sees it¶

bigquery://analytics/
└── events/
    └── tables/
        └── user_events/
            ├── rows.jsonl     table_rows    → one searchable chunk per row
            └── schema.json    table_schema  → searchable column summary

Each row becomes its own chunk, and a table's rows become searchable only once an [[objects]] entry sets text_fields for that table.
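The layout above can be read as `bigquery://<root>/<dataset>/tables/<table>/<file>`. The following is a minimal sketch of splitting such a path into its parts, useful when scripting around the search commands. The function name `parse_table_uri` and the key names are hypothetical, and the segment pattern is inferred from the tree shown here rather than from a published specification.

```python
from urllib.parse import urlsplit


def parse_table_uri(uri):
    """Split a bigquery:// path into root, dataset, table, and file.

    Hypothetical helper: the segment layout is inferred from the tree above.
    """
    parts = urlsplit(uri)
    if parts.scheme != "bigquery":
        raise ValueError(f"not a bigquery:// URI: {uri}")
    # Drop empty segments produced by leading or trailing slashes.
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) != 4 or segments[1] != "tables":
        raise ValueError(f"expected <dataset>/tables/<table>/<file>: {uri}")
    dataset, _, table, filename = segments
    return {
        "root": parts.netloc,
        "dataset": dataset,
        "table": table,
        "file": filename,
    }
```

Called on `bigquery://analytics/events/tables/user_events/rows.jsonl`, it yields the root `analytics`, dataset `events`, table `user_events`, and file `rows.jsonl`.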

Credentials¶

BigQuery uses Application Default Credentials (ADC) — there is no token in the TOML; the credentials must be visible to the server process. Three common paths:

Service account JSON (production): create a service account with roles/bigquery.dataViewer on the target datasets, then point the server at it:
```
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/sa.json
mfs-server run
```
gcloud auth application-default login (dev / single-user) — writes ~/.config/gcloud/application_default_credentials.json.
Workload Identity on GKE / Cloud Run — ADC is automatic.

Make sure the BigQuery API is enabled on the project.
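Because the credentials have to be visible to the server process, it can help to check what that process would see before starting it. The sketch below uses only the standard library and looks at the two file-based paths described above: the `GOOGLE_APPLICATION_CREDENTIALS` variable and the file written by `gcloud auth application-default login`. The function name `adc_sources` is hypothetical, the check is a simplification of the real ADC lookup, and it does not cover Workload Identity, which is resolved through the platform rather than a file.

```python
import os
from pathlib import Path


def adc_sources(env=None, home=None):
    """Return the file-based ADC sources visible to this process.

    Simplified, hypothetical check; it does not contact any Google service.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    found = []

    # Path 1: key file named by the environment variable.
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_path and Path(key_path).is_file():
        found.append(("env_var", key_path))

    # Path 2: file written by `gcloud auth application-default login`
    # (Linux/macOS location, as given in the text above).
    user_file = home / ".config" / "gcloud" / "application_default_credentials.json"
    if user_file.is_file():
        found.append(("gcloud_user", str(user_file)))

    return found
```

Run it under the same user and environment as `mfs-server`. An empty list in a production deployment that is not on GKE or Cloud Run suggests the server will not find credentials.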

Configuration¶

project = "analytics-prod"
datasets = ["events", "kb"]
max_read_rows = 100000

[[objects]]
match = "/kb/tables/articles"
text_fields = ["title", "body_markdown"]
locator_fields = ["article_id"]
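To make the roles of `text_fields` and `locator_fields` concrete, here is a minimal sketch of how one row could be turned into a searchable chunk. It is an illustration under assumptions, not the connector's implementation: the function name `row_to_chunk`, the blank-line join between fields, and the output shape are all hypothetical. Only the JSON locator form is taken from the `--locator` example in this page.

```python
import json


def row_to_chunk(row, text_fields, locator_fields):
    """Build one chunk from one row (hypothetical shape, for illustration)."""
    # Only the configured text fields contribute searchable text.
    parts = [str(row[f]) for f in text_fields if row.get(f) not in (None, "")]
    if not parts:
        return None  # no text: nothing to search in this row

    # The locator identifies the row later; a missing field raises KeyError.
    locator = {f: row[f] for f in locator_fields}
    return {
        "text": "\n\n".join(parts),
        "locator": json.dumps(locator, separators=(",", ":")),
    }


row = {
    "article_id": "a-123",
    "title": "Refunds",
    "body_markdown": "How to request a refund.",
    "views": 10,
}
chunk = row_to_chunk(row, ["title", "body_markdown"], ["article_id"])
```

Here `chunk["locator"]` is the string `{"article_id":"a-123"}`, and the `views` column is ignored because it is listed in neither setting.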

Sync and freshness¶

The connector uses the table's modified time as its cursor; deletions are caught by full_scan. It reads rows via list_rows, so max_read_rows caps how many rows it reads from a large table.
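The three mechanisms in this section can be sketched in a few lines. This is a simplified model under assumptions, not the connector's code: the helper names are hypothetical, and treating a full scan as a set difference between indexed and current locators is one plausible reading of how deletions are caught.

```python
from datetime import datetime, timezone
from itertools import islice


def needs_sync(table_modified, cursor):
    """Re-read a table only if it changed after the stored cursor."""
    return cursor is None or table_modified > cursor


def read_capped(rows, max_read_rows):
    """Take at most max_read_rows rows from any row iterator."""
    return list(islice(rows, max_read_rows))


def deleted_locators(indexed, current):
    """Full-scan style deletion check: indexed locators no longer present."""
    return set(indexed) - set(current)


cursor = datetime(2024, 1, 1, tzinfo=timezone.utc)
modified = datetime(2024, 1, 2, tzinfo=timezone.utc)
if needs_sync(modified, cursor):
    rows = read_capped(iter(range(250_000)), max_read_rows=100_000)
    gone = deleted_locators({"a-1", "a-2", "a-3"}, {"a-1", "a-3"})
```

The modified-time cursor keeps incremental syncs cheap, but it only signals that something in the table changed, which is why removed rows need the separate full-scan comparison.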

Search and browse¶

mfs add bigquery://analytics --config ./bigquery.toml

mfs search "refund event" bigquery://analytics/events/tables/user_events/rows.jsonl
mfs search "email column" bigquery://analytics --kind schema_summary
mfs cat bigquery://analytics/kb/tables/articles/rows.jsonl --locator '{"article_id":"a-123"}'

Pitfalls¶

ADC must be visible to the server process, not just the CLI shell.
BigQuery has no primary key for most tables — choose stable locator_fields explicitly.
Rows need text_fields to be searchable.
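Since most tables have no primary key, it is worth confirming that a candidate set of `locator_fields` really identifies one row before committing it to the config. The sketch below counts duplicate locator values over a sample of rows held as dictionaries. The function name `duplicate_locators` is hypothetical, and how the sample is fetched is left out.

```python
from collections import Counter


def duplicate_locators(rows, locator_fields):
    """Return locator tuples that occur more than once, with their counts."""
    counts = Counter(
        tuple(row.get(f) for f in locator_fields) for row in rows
    )
    return {key: n for key, n in counts.items() if n > 1}


sample = [
    {"article_id": "a-1", "lang": "en"},
    {"article_id": "a-1", "lang": "de"},
    {"article_id": "a-2", "lang": "en"},
]
by_id = duplicate_locators(sample, ["article_id"])
by_id_lang = duplicate_locators(sample, ["article_id", "lang"])
```

In this sample `article_id` alone collides on `a-1`, while the pair `article_id` and `lang` is unique, so the pair would be the safer choice.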
