Export & Bundles

One endpoint that merges two to five remote files into a single virtual dataset through configurable joins.

| Method | Endpoint | Purpose |
| --- | --- | --- |
| `POST` | `/v1/datasets/bundle` | Join 2–5 remote data sources into a single named dataset |

Python SDK Examples

Join an orders CSV with a customers Parquet file

from toolkitapi import Analytics

with Analytics(api_key="tk_...") as analytics:
    bundle = analytics.create_bundle({
        "sources": [
            {
                "alias": "orders",
                "data_url": "https://storage.example.com/orders.csv",
                "file_type": "csv",
            },
            {
                "alias": "customers",
                "data_url": "https://storage.example.com/customers.parquet",
                "file_type": "parquet",
            },
        ],
        "joins": [
            {
                "left_alias": "orders",
                "right_alias": "customers",
                "left_key": "customer_id",
                "right_key": "id",
                "join_type": "INNER",
            }
        ],
    })

dataset_id = bundle["dataset_id"]
print(dataset_id)

Analyze the bundled dataset immediately

from toolkitapi import Analytics

with Analytics(api_key="tk_...") as analytics:
    bundle = analytics.create_bundle({
        "sources": [
            {"alias": "sales", "data_url": "https://example.com/sales.csv", "file_type": "csv"},
            {"alias": "regions", "data_url": "https://example.com/regions.json", "file_type": "json"},
        ],
        "joins": [
            {
                "left_alias": "sales",
                "right_alias": "regions",
                "left_key": "region_code",
                "right_key": "code",
                "join_type": "LEFT",
            }
        ],
    })

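    result = analytics.analyze({
        # Handle returned by create_bundle; the "dataset_id" field name is
        # assumed from the note below and is not confirmed by this page.
        "dataset_id": bundle["dataset_id"],
        "data_url": "",           # not used when dataset_id is supplied
        "prompt": "What is total revenue by region name?",
        "file_type": "auto",
    })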

Note

Pass the `dataset_id` returned from `create_bundle` directly to `/v1/analyze`, `/v1/visualize`, or `/v1/validate-chart`; there is no separate "attach" step.

Request Parameters

POST /v1/datasets/bundle

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `sources` | array | Yes | Between 2 and 5 source objects. Each `alias` must be unique |
| `joins` | array | Yes | One or more join definitions linking sources together |
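The constraints in this table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the SDK: `check_bundle_request` is a hypothetical helper, and the rule that every join must reference a declared alias is an assumption inferred from the field descriptions rather than a documented validation.

```python
def check_bundle_request(payload: dict) -> list:
    """Return a list of problems; an empty list means the payload looks valid."""
    problems = []
    sources = payload.get("sources", [])
    joins = payload.get("joins", [])

    # Documented: between 2 and 5 source objects.
    if not 2 <= len(sources) <= 5:
        problems.append(f"expected 2 to 5 sources, got {len(sources)}")

    # Documented: each alias must be unique.
    aliases = [source.get("alias") for source in sources]
    duplicates = sorted({a for a in aliases if a is not None and aliases.count(a) > 1})
    if duplicates:
        problems.append(f"duplicate aliases: {duplicates}")

    # Documented: one or more join definitions.
    if not joins:
        problems.append("at least one join is required")

    # Assumption: a join may only refer to aliases declared in "sources".
    known = set(aliases)
    for index, join in enumerate(joins):
        for side in ("left_alias", "right_alias"):
            if join.get(side) not in known:
                problems.append(
                    f"joins[{index}].{side} refers to unknown alias {join.get(side)!r}"
                )
    return problems
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem in one pass.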

`sources` item fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `alias` | string | Yes | Short unique name for this source, used as a column-name prefix in the merged schema (e.g. `orders.revenue`) |
| `data_url` | string | Yes | Publicly reachable URL to the data file. Pre-signed object-storage URLs are supported |
| `file_type` | string | No | One of `csv`, `json`, `parquet`, or `tsv`. Inferred from the URL extension when omitted |
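When `file_type` is omitted, the service infers it from the URL extension. The exact server-side rule is not described here, so the sketch below only approximates it on the client: `guess_file_type` is a hypothetical helper, and it assumes the extension is read from the URL path so that the query string of a pre-signed URL is ignored.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# The four values documented for `file_type`.
SUPPORTED_TYPES = {"csv", "json", "parquet", "tsv"}


def guess_file_type(data_url: str) -> Optional[str]:
    """Guess a file type from the URL path, or return None if it is unclear."""
    path = urlparse(data_url).path  # drops the query string and fragment
    extension = PurePosixPath(path).suffix.lower().lstrip(".")
    return extension if extension in SUPPORTED_TYPES else None
```

If the helper returns `None`, setting `file_type` explicitly in the source object avoids relying on inference at all.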

`joins` item fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `left_alias` | string | Yes | Alias of the left-hand source |
| `right_alias` | string | Yes | Alias of the right-hand source |
| `left_key` | string | Yes | Column name in the left source to join on (without alias prefix) |
| `right_key` | string | Yes | Column name in the right source to join on (without alias prefix) |
| `join_type` | string | No | `INNER` (default), `LEFT`, `LEFT ANTI`, or `CROSS` |
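To make the four `join_type` values concrete, the sketch below reproduces their conventional SQL meaning on small in-memory rows. It is an illustration only: `join_rows` is a hypothetical function, the sample rows are made up, and the service may differ in details this page does not specify, such as how null keys are matched or which columns a `LEFT ANTI` join returns.

```python
JOIN_TYPES = {"INNER", "LEFT", "LEFT ANTI", "CROSS"}


def join_rows(left, right, left_alias, right_alias, left_key, right_key, join_type="INNER"):
    """Join two lists of dict rows, prefixing column names with their alias."""
    if join_type not in JOIN_TYPES:
        raise ValueError(f"unsupported join_type: {join_type!r}")

    def prefixed(alias, row):
        return {f"{alias}.{name}": value for name, value in row.items()}

    right_columns = sorted({name for row in right for name in row})
    result = []
    for left_row in left:
        if join_type == "CROSS":
            matches = list(right)  # every right row pairs with every left row
        else:
            matches = [r for r in right if r[right_key] == left_row[left_key]]

        if join_type == "LEFT ANTI":
            # Keep only left rows that have no match; right columns are dropped.
            if not matches:
                result.append(prefixed(left_alias, left_row))
            continue

        for right_row in matches:
            result.append({**prefixed(left_alias, left_row), **prefixed(right_alias, right_row)})

        if join_type == "LEFT" and not matches:
            # Unmatched left rows survive, with right-hand columns set to None.
            padding = {f"{right_alias}.{name}": None for name in right_columns}
            result.append({**prefixed(left_alias, left_row), **padding})
    return result


# Made-up sample rows: the order for customer 9 has no matching customer.
orders = [
    {"customer_id": 1, "total": 10},
    {"customer_id": 2, "total": 20},
    {"customer_id": 9, "total": 5},
]
customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bo"}]

for kind in ("INNER", "LEFT", "LEFT ANTI", "CROSS"):
    rows = join_rows(orders, customers, "orders", "customers", "customer_id", "id", kind)
    print(kind, len(rows))
```

With these rows, `INNER` drops the unmatched order, `LEFT` keeps it with empty customer columns, `LEFT ANTI` returns only that order, and `CROSS` pairs every order with every customer.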

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| `dataset_id` | string | Unique handle for the merged dataset. Pass it directly to `/v1/analyze`, `/v1/visualize`, or `/v1/validate-chart` |
| `sources` | array of strings | Aliases of the sources included in the bundle |
| `columns` | array | Merged schema. Each entry has `name` (with alias prefix), `type`, and `nullable` |
| `schema_fingerprint` | string | Fingerprint of the merged schema for cache and drift detection |
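One way to use `schema_fingerprint` for drift detection is to store the response from an earlier bundle call and compare it with a fresh one. The sketch below assumes that equal fingerprints mean an unchanged schema; `schema_changes` is a hypothetical helper, and the fingerprint strings and type names in the usage example are invented placeholders, not real service output.

```python
def schema_changes(previous: dict, current: dict) -> dict:
    """Compare two bundle responses and summarise how the merged schema moved."""
    # Cheap path: identical fingerprints are taken to mean an identical schema.
    if previous["schema_fingerprint"] == current["schema_fingerprint"]:
        return {"changed": False, "added": [], "removed": [], "retyped": []}

    before = {column["name"]: column["type"] for column in previous["columns"]}
    after = {column["name"]: column["type"] for column in current["columns"]}
    shared = before.keys() & after.keys()
    return {
        "changed": True,
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "retyped": sorted(name for name in shared if before[name] != after[name]),
    }


# Invented placeholder responses for illustration.
last_week = {
    "schema_fingerprint": "fp-old",
    "columns": [
        {"name": "orders.revenue", "type": "number", "nullable": False},
        {"name": "customers.region", "type": "string", "nullable": True},
    ],
}
today = {
    "schema_fingerprint": "fp-new",
    "columns": [
        {"name": "orders.revenue", "type": "string", "nullable": False},
        {"name": "customers.tier", "type": "string", "nullable": True},
    ],
}
print(schema_changes(last_week, today))
```

Comparing the fingerprint first keeps the common no-change case cheap; the column diff only runs when something has actually moved.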

Tip

Column names in the merged schema are prefixed with their source alias (e.g. `orders.revenue`, `customers.region`) to prevent collisions. Use these prefixed names when building query prompts or chart specs against the bundle.
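Working with prefixed names is easier if the `columns` array from the response is grouped by alias first. The following is a small sketch under two assumptions: aliases contain no dots, so the first dot separates alias from column, and the sample `columns` list is made up for illustration. `columns_by_alias` is a hypothetical helper, not an SDK function.

```python
def columns_by_alias(columns: list) -> dict:
    """Group merged-schema column names by the alias prefix before the first dot."""
    grouped = {}
    for column in columns:
        alias, _, name = column["name"].partition(".")
        grouped.setdefault(alias, []).append(name)
    return grouped


# Made-up excerpt of a bundle response's `columns` field.
columns = [
    {"name": "orders.revenue", "type": "number", "nullable": False},
    {"name": "orders.customer_id", "type": "number", "nullable": False},
    {"name": "customers.region", "type": "string", "nullable": True},
]

grouped = columns_by_alias(columns)
print(grouped)

# Build a prompt that refers to columns by their prefixed names.
prompt = "What is total orders.revenue by customers.region?"
```

Grouping by alias also makes it easy to spot which source contributed a column when two sources share an unprefixed name.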
