HTTP API¶

The MFS HTTP API is the /v1 control plane between the CLI, generated SDKs, and the Python FastAPI server. Use it when you want to integrate directly with MFS without shelling out to mfs.

protocol/openapi.yaml is the source of truth for endpoint paths, methods, operation IDs, and typed schemas. protocol/schemas/openapi.json mirrors the same contract for JSON consumers. Server behavior such as authentication, runtime error wrapping, and a few query-parameter details lives in server/python/src/mfs_server/api/app.py. For local protocol-change checks and SDK regeneration steps, see Development.

The examples below use MFS_URL so they work with local, Docker, and remote servers:

export MFS_URL=http://127.0.0.1:13619

Authentication¶

When the server is configured with auth_token, every request except GET /healthz must include a bearer token:

export MFS_TOKEN=replace-with-your-token
curl -sS -H "Authorization: Bearer $MFS_TOKEN" "$MFS_URL/v1/server/info"

If auth_token is not configured on the server, the bearer header is not required. Do not infer auth behavior from generated SDK README text; the server middleware is authoritative.

GET /healthz is intentionally outside /v1 and is exempt from bearer auth so liveness probes can run without access to API credentials. It returns only:

{ "status": "ok" }

CLI token fallback

The Rust CLI can read MFS_API_TOKEN, profile tokens, or the local server.token file. That fallback is CLI behavior, not an HTTP API requirement. Direct API clients should send Authorization: Bearer <token> whenever their target server requires it.

For the full server, CLI, and direct-client token map, see Auth and Secrets.

Workflow Matrix¶

Workflow	Endpoint	Operation ID	Request shape	Response shape
Server info	`GET /v1/server/info`	`getServerInfo`	No body	`ServerInfo`
Server status	`GET /v1/status`	`status`	No body	`StatusResponse`
Add or sync a source	`POST /v1/add`	`addSource`	`AddRequest` JSON	`AddResponse`
Upload a tar stream	`POST /v1/upload?name=...&process=...`	`uploadSource`	Raw tar or tar.gz body	`AddResponse`
File manifest step	`POST /v1/files/manifest`	`filesManifest`	`ManifestRequest` JSON	`ManifestResponse`
File upload step	`PUT /v1/files/upload?client_id=...&root=...&process=...&full=...`	`filesUpload`	Raw tar or tar.gz body	`AddResponse`
List jobs	`GET /v1/jobs?limit=...`	`listJobs`	Query parameters	`JobResponse[]`
Poll one job	`GET /v1/jobs/{job_id}`	`getJob`	Path parameter	`JobResponse`
Cancel a job	`POST /v1/jobs/{job_id}/cancel`	`cancelJob`	Path parameter	`CancelResponse`
Probe a connector	`POST /v1/connectors/probe`	`probeConnector`	`ProbeRequest` JSON	`ProbeResponse`
Estimate a connector	`POST /v1/connectors/estimate`	`estimateConnector`	`ProbeRequest` JSON	`EstimateResponse`
Inspect a connector	`GET /v1/connectors/inspect?target=...`	`inspectConnector`	Query parameter	JSON summary
Remove a connector	`DELETE /v1/connectors?target=...`	`removeConnector`	Query parameter	`RemoveResponse`
Search indexed content	`GET /v1/search?q=...`	`search`	Query parameters	`SearchResponse`
Grep a path	`GET /v1/grep?pattern=...&path=...`	`grep`	Query parameters	`GrepResponse`
List a path	`GET /v1/ls?path=...`	`ls`	Query parameter	`LsResponse`
Read an object	`GET /v1/cat?path=...`	`cat`	Query parameters	`CatResponse` or metadata
Read the first entries	`GET /v1/head?path=...&n=...`	`head`	Query parameters	`CatResponse`
Read the last entries	`GET /v1/tail?path=...&n=...`	`tail`	Query parameters	`CatResponse`
Export full content	`GET /v1/export?path=...`	`export`	Query parameter	`CatResponse`

GET /v1/connectors/inspect currently has an empty response schema in OpenAPI. Treat it as a connector/server JSON summary and avoid depending on fields that are not modeled in the protocol.

Minimal Examples¶

Check server info¶

curl -sS -H "Authorization: Bearer $MFS_TOKEN" \
  "$MFS_URL/v1/server/info"

{
  "version": "0.4.0",
  "machine_id": "mfs-host",
  "namespace": "default"
}

Add a source and poll the job¶

Set process=false when you want the request to return a job_id for worker processing. Set process=true when your client wants the server call to run indexing inline before returning.

curl -sS -H "Authorization: Bearer $MFS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"target":"/data/project","process":false}' \
  "$MFS_URL/v1/add"

{ "job_id": "8a6c6d4e3f9a4d1bb7d2b7c3f0e1a234" }

Poll the job by ID:

curl -sS -H "Authorization: Bearer $MFS_TOKEN" \
  "$MFS_URL/v1/jobs/8a6c6d4e3f9a4d1bb7d2b7c3f0e1a234"

{
  "id": "8a6c6d4e3f9a4d1bb7d2b7c3f0e1a234",
  "status": "succeeded",
  "op_kind": "sync",
  "trigger": "manual",
  "error": null,
  "total_objects": 42,
  "succeeded_objects": 42,
  "failed_objects": 0,
  "cancelled_objects": 0,
  "started_at": "2026-06-03T10:00:00Z",
  "finished_at": "2026-06-03T10:00:15Z"
}

The API models status as a string. Clients should display unknown status values instead of hard-coding a closed enum.

Send process=false to queue the work (a worker drains it) or process=true to index inline before the response returns. For job status meanings see Jobs.

Search¶

q is required. path, mode, top_k, and collapse are optional in OpenAPI. The server also accepts kind as a comma-separated chunk-kind filter.

curl -G -sS -H "Authorization: Bearer $MFS_TOKEN" \
  "$MFS_URL/v1/search" \
  --data-urlencode "q=release checklist" \
  --data-urlencode "path=/data/project" \
  --data-urlencode "top_k=5"

{
  "results": [
    {
      "source": "/data/project/README.md",
      "content": "Release checklist...",
      "score": 0.82,
      "locator": { "lines": [10, 18] },
      "metadata": { "chunk_kind": "body" }
    }
  ]
}

Browse and read¶

List a path:

curl -G -sS -H "Authorization: Bearer $MFS_TOKEN" \
  "$MFS_URL/v1/ls" \
  --data-urlencode "path=/data/project"

Read a bounded range:

curl -G -sS -H "Authorization: Bearer $MFS_TOKEN" \
  "$MFS_URL/v1/cat" \
  --data-urlencode "path=/data/project/README.md" \
  --data-urlencode "range=1:120"

Read by locator when a search hit includes one:

curl -G -sS -H "Authorization: Bearer $MFS_TOKEN" \
  "$MFS_URL/v1/cat" \
  --data-urlencode "path=/data/project/README.md" \
  --data-urlencode 'locator={"lines":[10,18]}'

Use meta=true when you need object metadata instead of content:

curl -G -sS -H "Authorization: Bearer $MFS_TOKEN" \
  "$MFS_URL/v1/cat" \
  --data-urlencode "path=/data/project/README.md" \
  --data-urlencode "meta=true"

Key Schemas¶

AddRequest¶

Field	Type	Notes
`target`	string	Required path or connector URI to register and index.
`config`	object or null	Optional connector configuration. The CLI loads this from a TOML file, but API clients send JSON.
`full`	boolean	Force full re-indexing and ignore caches or fingerprints.
`since`	string or null	Cursor or date for connectors that support incremental sync.
`process`	boolean	Controls inline processing versus worker processing. Send it explicitly if your client depends on either behavior.
`update`	boolean	Apply config to an existing connector.

ResultEnvelope¶

Field	Type	Notes
`source`	string	Object URI or path that can be sent to browse endpoints.
`content`	string	Snippet or matched content.
`score`	number or null	Ranking score when available.
`locator`	object or null	Per-hit identity, such as line bounds or a structured connector key.
`metadata`	object	Connector and chunk metadata.

LsEntry¶

Field	Type	Notes
`name`	string	Entry name.
`type`	string	`file` or `dir`.
`media_type`	string or null	Media type when known.
`size_hint`	integer or null	Approximate or known size hint.
`path`	string or null	Full object URI or path for `cat`, `search`, `head`, or `export`.
`search_status`	string or null	Indexing status such as `indexed`, `partial`, or `not_indexed` when known.
`indexable`	boolean or null	Whether the object is eligible for indexing.

JobResponse¶

Field	Type	Notes
`id`	string	Job ID returned by ingest endpoints.
`status`	string	Current job status. Treat as an open string.
`op_kind`	string or null	Operation kind when stored.
`trigger`	string or null	Trigger source when stored.
`error`	string or null	Error text or code when the job fails.
`total_objects`	integer or null	Total objects planned or observed.
`succeeded_objects`	integer or null	Object count completed successfully.
`failed_objects`	integer or null	Object count that failed.
`cancelled_objects`	integer or null	Object count cancelled.
`started_at`	string or null	Start timestamp when available.
`finished_at`	string or null	Finish timestamp when available.

StatusResponse¶

Field	Type	Notes
`connectors`	`ConnectorRow[]`	Registered connectors with `root_uri`, `type`, and `status`.
`jobs`	object	Counts grouped by job status.

Connector Requests and Responses¶

Schema	Fields	Notes
`ProbeRequest`	`target`, `config`	Used by probe and estimate endpoints.
`ProbeResponse`	`target`, `type`, `ok`, `detail`	Reports whether the server can probe the connector target.
`EstimateResponse`	`target`, `type`, `objects`, `sampled_objects`, `est_chunks`, `est_tokens`	Zero-billing pre-flight estimate based on metadata and local dry-run work.
`RemoveResponse`	`target`, `removed`	Returned after a registered connector root is removed. Child paths and unregistered targets return an error envelope.

Errors¶

Runtime API errors use a stable envelope:

{
  "code": "object_too_large_for_cat",
  "detail": "...",
  "suggestions": ["head", "cat --range", "export"]
}

Clients should switch on code, not detail. The server wraps HTTPException, request-validation failures, and uncaught exceptions into this shape. Some OpenAPI responses still reference FastAPI validation schemas, but the runtime validation handler returns:

{
  "code": "validation_error",
  "detail": "...",
  "suggestions": ["fix request shape"]
}

Common codes include:

Code	HTTP status	Typical cause
`unauthorized`	401	Missing or invalid bearer token when auth is enabled.
`validation_error`	422	Malformed request shape, invalid enum value, or unknown query parameter.
`not_found`	404	Missing path, object, connector, or job.
`sync_already_running`	409	A sync is already in flight for the connector.
`connector_removing`	409	Connector removal is already in progress.
`object_too_large_for_cat`	400	`cat` without a bounded range on a large object.
`is_directory`	400	`cat` was requested for a directory.
`range_unsupported`	400	Range reads are not supported for the object.
`density_unsupported`	400	Requested density is unsupported for the object.
`tail_unsupported`	400	The object has no stable ordering for tail reads.
`locator_not_found`	404	The requested locator is no longer present.
`connector_unhealthy`	502	Source connectivity or credentials failed.
`internal_error`	500	Unhandled server exception.

chunk_max_exceeded is not a hard error. It is surfaced through partial search availability or search_status: partial, so search may still return results with incomplete recall.

For the full code-to-recovery matrix, including CLI first actions and deeper workflow links, see Error Codes.

SDKs¶

Python and TypeScript SDK source trees are generated from protocol/openapi.yaml by sdks/generate.sh:

Language	Directory	Generator
Python	`sdks/python/`	OpenAPI Generator `python` client with `urllib3`
TypeScript	`sdks/typescript/`	OpenAPI Generator `typescript-fetch` client

The checked-in SDK directories currently expose generated clients for the server, ingest, retrieval, and browse API groups. The OpenAPI spec also contains connector-management operations; after protocol changes, regenerate the SDKs and check the generated API classes before documenting a client method name.

Treat generated README install, authorization, and default-host snippets as scaffolding unless your release process verifies them. For the docs-site SDK entry point, see SDKs.