Combine data from multiple sources with different join types, multi-key support, and strong type safety. Learn how to merge datasets effectively.
Joining DataFrames lets you combine data from multiple sources. tidy-ts supports all standard join types with multi-key support and strong type safety.
import { createDataFrame } from "@tidy-ts/dataframe";
const employees = createDataFrame([
{ emp_id: 1, name: "Alice", dept_id: 10 },
{ emp_id: 2, name: "Bob", dept_id: 20 },
{ emp_id: 3, name: "Charlie", dept_id: 10 },
]);
const departments = createDataFrame([
{ dept_id: 10, dept_name: "Engineering" },
{ dept_id: 20, dept_name: "Marketing" },
]);
const joined = employees.innerJoin(departments, "dept_id");
joined.print("Inner join result:");
import { createDataFrame } from "@tidy-ts/dataframe";
const employees = createDataFrame([
{ emp_id: 1, name: "Alice", dept_id: 10 },
{ emp_id: 2, name: "Bob", dept_id: 20 },
{ emp_id: 3, name: "Charlie", dept_id: 10 },
]);
const departments = createDataFrame([
{ dept_id: 10, dept_name: "Engineering" },
{ dept_id: 20, dept_name: "Marketing" },
]);
const joined = employees.innerJoin(departments, "dept_id");
joined.print("Inner join result:");
Single key joins are the most common type. Each join type preserves different combinations of records from the left and right DataFrames.
const employees = createDataFrame([
{ emp_id: 1, name: "Alice", dept_id: 10 },
{ emp_id: 2, name: "Bob", dept_id: 20 },
{ emp_id: 3, name: "Charlie", dept_id: 30 }, // No matching department
]);
const departments = createDataFrame([
{ dept_id: 10, dept_name: "Engineering" },
{ dept_id: 20, dept_name: "Marketing" },
]);
const joined = employees.leftJoin(departments, "dept_id");
joined.print("Left join result:");
const employees = createDataFrame([
{ emp_id: 1, name: "Alice", dept_id: 10 },
{ emp_id: 2, name: "Bob", dept_id: 20 },
{ emp_id: 3, name: "Charlie", dept_id: 30 }, // No matching department
]);
const departments = createDataFrame([
{ dept_id: 10, dept_name: "Engineering" },
{ dept_id: 20, dept_name: "Marketing" },
]);
const joined = employees.leftJoin(departments, "dept_id");
joined.print("Left join result:");
Multi-key joins are great for complex data relationships where you need to match on multiple criteria. tidy-ts provides strong typing for all join scenarios.
const sales = createDataFrame([
{ year: 2023, quarter: "Q1", product: "Widget A", sales: 1000 },
{ year: 2023, quarter: "Q2", product: "Widget B", sales: 1500 },
]);
const targets = createDataFrame([
{ year: 2023, quarter: "Q1", product: "Widget A", target: 1200 },
{ year: 2023, quarter: "Q2", product: "Widget B", target: 1400 },
]);
const joined = sales.innerJoin(targets, ["year", "quarter", "product"]);
joined.print("Multi-key join result:");
const sales = createDataFrame([
{ year: 2023, quarter: "Q1", product: "Widget A", sales: 1000 },
{ year: 2023, quarter: "Q2", product: "Widget B", sales: 1500 },
]);
const targets = createDataFrame([
{ year: 2023, quarter: "Q1", product: "Widget A", target: 1200 },
{ year: 2023, quarter: "Q2", product: "Widget B", target: 1400 },
]);
const joined = sales.innerJoin(targets, ["year", "quarter", "product"]);
joined.print("Multi-key join result:");
Real-world data often has different column names for the same concept. tidy-ts allows you to specify different column names for left and right DataFrames.
const orders = createDataFrame([
{ order_id: 1, order_region: "North", order_product: "A", quantity: 10 },
{ order_id: 2, order_region: "South", order_product: "B", quantity: 20 },
]);
const inventory = createDataFrame([
{ inv_region: "North", inv_product: "A", stock: 100 },
{ inv_region: "South", inv_product: "B", stock: 200 },
]);
const joined = orders.innerJoin(inventory, {
keys: {
left: ["order_region", "order_product"],
right: ["inv_region", "inv_product"],
},
});
joined.print("Join with different column names:");
const orders = createDataFrame([
{ order_id: 1, order_region: "North", order_product: "A", quantity: 10 },
{ order_id: 2, order_region: "South", order_product: "B", quantity: 20 },
]);
const inventory = createDataFrame([
{ inv_region: "North", inv_product: "A", stock: 100 },
{ inv_region: "South", inv_product: "B", stock: 200 },
]);
const joined = orders.innerJoin(inventory, {
keys: {
left: ["order_region", "order_product"],
right: ["inv_region", "inv_product"],
},
});
joined.print("Join with different column names:");
For complex data integration scenarios, you can use suffixes to distinguish between columns with the same name and join on multiple keys simultaneously.
const orders = createDataFrame([
{ order_id: 1, order_region: "North", order_product: "A", quantity: 10 },
{ order_id: 2, order_region: "South", order_product: "B", quantity: 20 },
]);
const inventory = createDataFrame([
{ inv_region: "North", inv_product: "A", stock: 100 },
{ inv_region: "South", inv_product: "B", stock: 200 },
]);
const joined = orders.innerJoin(inventory, {
keys: {
left: ["order_region", "order_product"],
right: ["inv_region", "inv_product"],
},
});
joined.print("Join with different column names:");
const orders = createDataFrame([
{ order_id: 1, order_region: "North", order_product: "A", quantity: 10 },
{ order_id: 2, order_region: "South", order_product: "B", quantity: 20 },
]);
const inventory = createDataFrame([
{ inv_region: "North", inv_product: "A", stock: 100 },
{ inv_region: "South", inv_product: "B", stock: 200 },
]);
const joined = orders.innerJoin(inventory, {
keys: {
left: ["order_region", "order_product"],
right: ["inv_region", "inv_product"],
},
});
joined.print("Join with different column names:");
Join Type | Result Type Pattern | Description |
---|---|---|
Inner Join | L ∪ (R \ K) | All fields required - only matching records |
Left Join | L ∪ (R \ K)? | Right non-key fields: T | undefined |
Right Join | (L \ K)? ∪ R | Left non-key fields: T | undefined |
Outer Join | (L \ K)? ∪ (R \ K)? | Both sides: T | undefined |
Cross Join | L ∪ R | All fields required (Cartesian product) |
Where: L = Left DataFrame, R = Right DataFrame, K = Join keys
Note: All joins use explicit undefined unions (T | undefined), never optional properties (T?)