Tidy-TS

Statistical Analysis

Tidy-TS provides a statistical toolkit with 80+ functions across descriptive stats, hypothesis testing, and probability distributions.

Descriptive Statistics

Basic Descriptive Statistics

Essential statistical measures for understanding your data

The stats module covers the standard descriptive statistics: sum(), mean(), median(), mode(), stdev(), variance(), min(), max(), and range().

import { createDataFrame, stats as s } from "@tidy-ts/dataframe";

const sampleData = createDataFrame([
  { id: 1, value: 10, category: "A", score: 85 },
  { id: 2, value: 20, category: "B", score: 92 },
  { id: 3, value: 15, category: "A", score: 78 },
  { id: 4, value: 25, category: "B", score: 88 },
  { id: 5, value: 12, category: "A", score: 95 },
  { id: 6, value: 30, category: "C", score: 82 },
  { id: 7, value: 18, category: "B", score: 90 },
  { id: 8, value: 22, category: "A", score: 87 },
]);

const values = sampleData.value;

console.log("Values:", values);
console.log("Sum:", s.sum(values));
console.log("Mean:", s.mean(values));
console.log("Median:", s.median(values));
console.log("Min:", s.min(values));
console.log("Max:", s.max(values));
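
To make clear what these measures compute, here is a minimal plain-TypeScript sketch over the same values. It is not the Tidy-TS implementation; the helper names (sumOf, meanOf, medianOf, sampleVarianceOf) are hypothetical, and the n - 1 denominator for the variance is an assumption about what stdev() and variance() use.

```typescript
// Plain TypeScript sketch; these helpers are illustrative, not part of Tidy-TS.
const sampleValues: number[] = [10, 20, 15, 25, 12, 30, 18, 22];

function sumOf(xs: number[]): number {
  return xs.reduce((acc, x) => acc + x, 0);
}

function meanOf(xs: number[]): number {
  return sumOf(xs) / xs.length;
}

function medianOf(xs: number[]): number {
  // Sort a copy so the caller's array is left untouched.
  const sorted = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Assumption: sample variance with the n - 1 denominator.
function sampleVarianceOf(xs: number[]): number {
  const m = meanOf(xs);
  return sumOf(xs.map((x) => (x - m) * (x - m))) / (xs.length - 1);
}

console.log("Sum:", sumOf(sampleValues));       // 152
console.log("Mean:", meanOf(sampleValues));     // 19
console.log("Median:", medianOf(sampleValues)); // 19
console.log("Sample stdev:", Math.sqrt(sampleVarianceOf(sampleValues)));
```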

Quantiles and Percentiles

Statistical measures for data distribution analysis

Quantile and ranking functions: quantile(), quartiles(), iqr(), percentileRank(), rank(), denseRank()

const quartiles = s.quartiles(values);
const q25 = s.quantile(values, 0.25);
const q75 = s.quantile(values, 0.75);

console.log("Quartiles [Q25, Q50, Q75]:", quartiles);
console.log("25th percentile:", q25);
console.log("75th percentile:", q75);
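
Quantiles between two data points depend on an interpolation rule. The sketch below uses linear interpolation between order statistics (the rule R uses by default, "type 7"). Whether Tidy-TS uses the same rule is an assumption, so its output may differ slightly; quantileLinear is a hypothetical helper, not a library function.

```typescript
// Linear-interpolation quantile (R's default "type 7" rule).
// Assumption: Tidy-TS may use a different interpolation rule.
function quantileLinear(xs: number[], p: number): number {
  if (xs.length === 0) throw new Error("quantileLinear: empty input");
  if (p < 0 || p > 1) throw new Error("quantileLinear: p must be in [0, 1]");
  const sorted = [...xs].sort((a, b) => a - b);
  const h = (sorted.length - 1) * p; // fractional index into the sorted data
  const lo = Math.floor(h);
  const hi = Math.ceil(h);
  return sorted[lo] + (h - lo) * (sorted[hi] - sorted[lo]);
}

const quantileInput = [10, 20, 15, 25, 12, 30, 18, 22];
const lowerQuartile = quantileLinear(quantileInput, 0.25);
const upperQuartile = quantileLinear(quantileInput, 0.75);

console.log("Q25:", lowerQuartile);                 // 14.25
console.log("Q75:", upperQuartile);                 // 22.75
console.log("IQR:", upperQuartile - lowerQuartile); // 8.5
```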

Cumulative Functions

Calculate running totals and cumulative statistics

Cumulative operations for time series: cumsum(), cummean(), cummin(), cummax(), cumprod()

const cumsum = s.cumsum(values);
const cummax = s.cummax(values);

console.log("Cumulative sum:", cumsum);
console.log("Cumulative max:", cummax);
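
A cumulative function returns an array of the same length as its input, where position i summarizes elements 0 through i. The plain-TypeScript sketch below shows that idea for a running sum and running maximum; runningSum and runningMax are hypothetical helpers, not the library's cumsum() and cummax().

```typescript
// Illustrative helpers; not the Tidy-TS implementation.
function runningSum(xs: number[]): number[] {
  let total = 0;
  return xs.map((x) => (total += x));
}

function runningMax(xs: number[]): number[] {
  let best = -Infinity;
  return xs.map((x) => (best = Math.max(best, x)));
}

const cumulativeInput = [10, 20, 15, 25, 12, 30, 18, 22];

console.log(runningSum(cumulativeInput)); // [10, 30, 45, 70, 82, 112, 130, 152]
console.log(runningMax(cumulativeInput)); // [10, 20, 20, 25, 25, 30, 30, 30]
```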

Window Functions

Lag, lead, and other window operations

Window functions for time series analysis: lag(), lead(). The example below uses rank() inside mutate() to add each row's rank within the value column.

const withRanking = sampleData
  .mutate({
    value_rank: (row, _index, df) => s.rank(df.value, row.value),
  });

withRanking.print("Data with ranking information:");
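
Since the example above shows ranking rather than lag or lead, here is a small plain-TypeScript sketch of what shifting a series means. The helpers lagBy and leadBy are hypothetical, and filling the vacated positions with undefined is an assumption; the library's lag() and lead() may take different arguments or use a different fill value.

```typescript
// Illustrative helpers; not the Tidy-TS lag()/lead() signatures.
// Assumption: positions with no neighbor are filled with undefined.
function lagBy<T>(xs: T[], k: number = 1): (T | undefined)[] {
  return xs.map((_, i) => (i - k >= 0 ? xs[i - k] : undefined));
}

function leadBy<T>(xs: T[], k: number = 1): (T | undefined)[] {
  return xs.map((_, i) => (i + k < xs.length ? xs[i + k] : undefined));
}

const readings = [10, 20, 15, 25];
const previous = lagBy(readings);
const upcoming = leadBy(readings);

// Period-over-period change: current value minus the lagged value.
const changes = readings.map((x, i) => {
  const prev = previous[i];
  return prev === undefined ? undefined : x - prev;
});

console.log(previous); // [undefined, 10, 20, 15]
console.log(upcoming); // [20, 15, 25, undefined]
console.log(changes);  // [undefined, 10, -5, 10]
```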

Probability Distributions

Tidy-TS includes 16 probability distributions, each with functions for random sampling, density, cumulative probability, quantiles, and data generation.
Continuous: normal, beta, gamma, exponential, chi-square, t, F, uniform, Weibull, log-normal, Wilcoxon
Discrete: binomial, Poisson, geometric, negative binomial, hypergeometric

Probability Distribution Functions

Individual distribution functions for statistical analysis

Each distribution provides random(), density(), probability(), and quantile() functions. You can also generate distribution data for visualization.

import { s } from "@tidy-ts/dataframe";

// Individual distribution functions
const randomValue = s.dist.normal.random({ mean: 0, standardDeviation: 1, sampleSize: 10 }); // Random sample of 10 draws
const density = s.dist.normal.density({ at: 0, mean: 0, standardDeviation: 1 }); // PDF at x = 0
const probability = s.dist.normal.probability({ at: 1.96, mean: 0, standardDeviation: 1 }); // CDF: P(X <= 1.96), about 0.975
const quantile = s.dist.normal.quantile({ probability: 0.975, mean: 0, standardDeviation: 1 }); // Critical value, about 1.96

// Generate distribution data for visualization
const normalPDFData = s.dist.normal.data({
  mean: 0,
  standardDeviation: 2,
  type: "pdf",
  range: [-4, 4],
  points: 100,
});

// Other distributions: beta, gamma, exponential, chi-square, t, f, uniform,
// weibull, binomial, poisson, geometric, hypergeometric, and more
const betaSample = s.dist.beta.random({ alpha: 2, beta: 5 });
const chiSquareQuantile = s.dist.chiSquare.quantile({ probability: 0.95, degreesOfFreedom: 1 });
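
To show what density() and probability() return for the normal distribution, the sketch below evaluates the PDF in closed form and the CDF through a standard polynomial approximation of the error function (Abramowitz and Stegun 7.1.26, accurate to roughly 1e-7). This is not the Tidy-TS implementation; normalPdf, normalCdf, and erfApprox are hypothetical names.

```typescript
// Illustrative helpers; not the Tidy-TS implementation.
function normalPdf(x: number, mean: number, sd: number): number {
  const z = (x - mean) / sd;
  return Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2 * Math.PI));
}

// Error function via Abramowitz and Stegun formula 7.1.26.
function erfApprox(x: number): number {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t +
      0.254829592) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

// Lower-tail probability P(X <= x).
function normalCdf(x: number, mean: number, sd: number): number {
  return 0.5 * (1 + erfApprox((x - mean) / (sd * Math.SQRT2)));
}

console.log(normalPdf(0, 0, 1));    // about 0.3989
console.log(normalCdf(1.96, 0, 1)); // about 0.975
```

The CDF value is a lower-tail probability; a two-sided p-value for z = 1.96 would be 2 * (1 - normalCdf(1.96, 0, 1)), which is about 0.05.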

Hypothesis Testing

Statistical tests are available through two approaches: a Compare API that guides you to the right test, and direct access to specific tests. All tests are validated against R.

Compare API - Intuitive Statistical Comparisons

Custom-designed API to help you perform the analysis best suited to your needs

Every test is checked against results from R on randomly generated data. The Compare API steers you to an appropriate test through descriptive function names and options such as "auto".

import { stats as s } from "@tidy-ts/dataframe";

// Compare API - designed to help you perform the analysis best suited to your needs
const heights = [170, 165, 180, 175, 172, 168];
const testResult = s.compare.oneGroup.centralTendency.toValue({
  data: heights,
  hypothesizedValue: 170,
  parametric: "parametric" // Use "auto" for help deciding if parametric or non-parametric is best
}); 
console.log(testResult);

// {
//   test_name: "One-sample t-test",
//   p_value: 0.47...,
//   effect_size: { value: 0.31..., name: "Cohen's D" },
//   test_statistic: { value: 0.76..., name: "T-Statistic" },
//   confidence_interval: {
//     lower: 166.08...,
//     upper: 177.24...,
//     confidence_level: 0.95
//   },
//   degrees_of_freedom: 5,
//   alpha: 0.05
// } 

const group1 = [23, 45, 67, 34, 56, 78, 29, 41, 52, 38]; // Hours spent studying per week
const group2 = [78, 85, 92, 73, 88, 95, 69, 81, 89, 76]; // Final exam scores
const groupComparison = s.compare.twoGroups.association.toEachOther({
  x: group1,
  y: group2,
  method: "pearson", // Use "auto" for help choosing the right correlation test
});
console.log(groupComparison);

// Two-group comparison result: {
//   test_name: "Pearson correlation test",
//   p_value: 0.0003...,
//   effect_size: { value: 0.90..., name: "Pearson's R" },
//   test_statistic: { value: 5.95..., name: "T-Statistic" },
//   confidence_interval: {
//     lower: 0.63...,
//     upper: 0.97...,
//     confidence_level: 0.95
//   },
//   degrees_of_freedom: 8,
//   alpha: 0.05
// }
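
The statistics reported in both results above can be reproduced by hand from the textbook formulas. The plain-TypeScript sketch below recomputes the one-sample t-statistic and Cohen's d for the heights data, and Pearson's r with its t-statistic for the study-hours data. It is an illustration of the formulas, not the library's code; p-values and confidence intervals are left out because they need the t distribution. All helper names are hypothetical.

```typescript
// Illustrative helpers; not the Tidy-TS implementation.
function avg(xs: number[]): number {
  return xs.reduce((acc, x) => acc + x, 0) / xs.length;
}

function sampleSd(xs: number[]): number {
  const m = avg(xs);
  const ss = xs.reduce((acc, x) => acc + (x - m) * (x - m), 0);
  return Math.sqrt(ss / (xs.length - 1));
}

// One-sample t-test: t = (mean - mu) / (sd / sqrt(n)), d = (mean - mu) / sd.
function oneSampleT(xs: number[], mu: number) {
  const diff = avg(xs) - mu;
  const sd = sampleSd(xs);
  return {
    tStatistic: diff / (sd / Math.sqrt(xs.length)),
    cohensD: diff / sd,
    degreesOfFreedom: xs.length - 1,
  };
}

// Pearson correlation: r = Sxy / sqrt(Sxx * Syy), t = r * sqrt(n - 2) / sqrt(1 - r^2).
function pearsonR(xs: number[], ys: number[]) {
  if (xs.length !== ys.length) throw new Error("pearsonR: length mismatch");
  const mx = avg(xs);
  const my = avg(ys);
  let sxy = 0;
  let sxx = 0;
  let syy = 0;
  for (let i = 0; i < xs.length; i++) {
    sxy += (xs[i] - mx) * (ys[i] - my);
    sxx += (xs[i] - mx) * (xs[i] - mx);
    syy += (ys[i] - my) * (ys[i] - my);
  }
  const r = sxy / Math.sqrt(sxx * syy);
  const df = xs.length - 2;
  return { r, tStatistic: (r * Math.sqrt(df)) / Math.sqrt(1 - r * r), degreesOfFreedom: df };
}

const heightSample = [170, 165, 180, 175, 172, 168];
const studyHours = [23, 45, 67, 34, 56, 78, 29, 41, 52, 38];
const examScores = [78, 85, 92, 73, 88, 95, 69, 81, 89, 76];

console.log(oneSampleT(heightSample, 170)); // t about 0.77, d about 0.31, df 5
console.log(pearsonR(studyHours, examScores)); // r about 0.90, t about 5.95, df 8
```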

Available Compare API Functions

Complete reference of comparison functions available

Each function has various options to help both new and experienced users feel confident in what they're getting.

// Functions exposed by the Compare API (arguments elided):
s.compare.oneGroup.centralTendency.toValue(...)
s.compare.oneGroup.proportions.toValue(...)
s.compare.oneGroup.distribution.toNormal(...)
s.compare.twoGroups.centralTendency.toEachOther(...)
s.compare.twoGroups.association.toEachOther(...)
s.compare.twoGroups.proportions.toEachOther(...)
s.compare.twoGroups.distributions.toEachOther(...)
s.compare.multiGroups.centralTendency.toEachOther(...)
s.compare.multiGroups.proportions.toEachOther(...)

Specific Test API

Direct access to specific statistical tests if you prefer

If you'd prefer to have the specific test instead, we provide that via the test API as well. All tests return detailed, typed results.

import { stats as s } from "@tidy-ts/dataframe";

const data = [170, 165, 180, 175, 172, 168];
const before = [23, 25, 28, 30, 32, 29, 27];
// The paired differences must vary: a constant difference has zero variance and leaves the t-statistic undefined.
const after = [26, 27, 31, 32, 35, 31, 30];
const group1 = [23, 25, 28, 30, 32, 29, 27];
const group2 = [18, 20, 22, 24, 26, 21, 19];
const group3 = [15, 17, 19, 21, 23, 18, 16];

const oneSampleT = s.test.t.oneSample({ data, mu: 170, alternative: "two-sided", alpha: 0.05 });
const independentT = s.test.t.independent({ x: group1, y: group2, alpha: 0.05 });
const pairedT = s.test.t.paired({ x: before, y: after, alpha: 0.05 });
const anovaResult = s.test.anova.oneWay([group1, group2, group3], 0.05);
const mannWhitney = s.test.nonparametric.mannWhitney(group1, group2, 0.05);
const kruskalWallis = s.test.nonparametric.kruskalWallis([group1, group2], 0.05);
const pearsonTest = s.test.correlation.pearson(group1, group2, "two-sided", 0.05);
const shapiroWilk = s.test.normality.shapiroWilk(data, 0.05);
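
As a reference point for what the one-way ANOVA computes, the sketch below derives the F-statistic from the between-group and within-group sums of squares, using the same three groups as above. It is a plain-TypeScript illustration with hypothetical names (oneWayAnovaF, AnovaSketch), not the library's implementation, and it omits the p-value because that needs the F distribution.

```typescript
// Illustrative helper; not the Tidy-TS implementation.
interface AnovaSketch {
  fStatistic: number;
  dfBetween: number;
  dfWithin: number;
}

function groupMean(xs: number[]): number {
  return xs.reduce((acc, x) => acc + x, 0) / xs.length;
}

function oneWayAnovaF(groups: number[][]): AnovaSketch {
  const allValues = groups.reduce<number[]>((acc, g) => acc.concat(g), []);
  const grandMean = groupMean(allValues);

  let ssBetween = 0; // variation of group means around the grand mean
  let ssWithin = 0;  // variation of observations around their own group mean
  for (const g of groups) {
    const m = groupMean(g);
    ssBetween += g.length * (m - grandMean) * (m - grandMean);
    for (const x of g) ssWithin += (x - m) * (x - m);
  }

  const dfBetween = groups.length - 1;
  const dfWithin = allValues.length - groups.length;
  return {
    fStatistic: ssBetween / dfBetween / (ssWithin / dfWithin),
    dfBetween,
    dfWithin,
  };
}

const anovaGroups = [
  [23, 25, 28, 30, 32, 29, 27],
  [18, 20, 22, 24, 26, 21, 19],
  [15, 17, 19, 21, 23, 18, 16],
];

console.log(oneWayAnovaF(anovaGroups)); // F about 18.76 with 2 and 18 degrees of freedom
```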

Import Options

Flexible import patterns for different coding styles

Import as 'stats' for clarity or 's' for brevity. Both provide access to the same statistical functionality.

// The options below are alternatives; use one per file.

// Option 1: Import with full name for clarity
import { stats } from "@tidy-ts/dataframe";
const mean1 = stats.mean([1, 2, 3, 4, 5]);
const randomNormal1 = stats.dist.normal.random({ mean: 0, standardDeviation: 1 });
const tTest1 = stats.test.t.oneSample({ data: [1, 2, 3], mu: 2, alternative: "two-sided", alpha: 0.05 });

// Option 2: Import with short alias for brevity
import { stats as s } from "@tidy-ts/dataframe";
const mean2 = s.mean([1, 2, 3, 4, 5]);
const randomNormal2 = s.dist.normal.random({ mean: 0, standardDeviation: 1 });
const tTest2 = s.test.t.oneSample({ data: [1, 2, 3], mu: 2, alternative: "two-sided", alpha: 0.05 });

// Option 3: Both imports (they reference the same object)
import { stats, s } from "@tidy-ts/dataframe";
console.log("Same object:", stats === s); // true

// Option 4: Default import
import stats from "@tidy-ts/dataframe";
const mean4 = stats.mean([1, 2, 3, 4, 5]);

// All approaches provide access to 80+ statistical functions:
// - Descriptive statistics: mean, median, stdev, variance, etc.
// - Distribution functions: stats.dist.normal.random, stats.dist.beta.density, etc.
// - Statistical tests: stats.test.t.oneSample, stats.test.anova.oneWay, etc.

Function Reference

Complete list of 80+ statistical functions organized by category

Descriptive Statistics

• s.sum() - Sum of values
• s.mean() - Arithmetic mean
• s.median() - Median value
• s.mode() - Most frequent value
• s.stdev() - Standard deviation
• s.variance() - Variance
• s.min() - Minimum value
• s.max() - Maximum value
• s.range() - Range (max - min)
• s.product() - Product of values

Statistical Functions

• s.quantile() - Quantiles and percentiles
• s.quartiles() - Quartiles [Q25, Q50, Q75]
• s.iqr() - Interquartile range
• s.percentileRank() - Percentile rank
• s.rank() - Ranking values
• s.denseRank() - Dense ranking
• s.unique() - Unique values
• s.uniqueCount() - Count of unique values
• s.corr() - Correlation coefficient
• s.covariance() - Covariance
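
The difference between ordinary ranking and dense ranking shows up only when values tie. The plain-TypeScript sketch below is an illustration with hypothetical helpers (averageRank, denseRankOf). Giving tied values the average of their positions is an assumption; s.rank() may resolve ties differently.

```typescript
// Illustrative helpers; not the Tidy-TS implementation.
// Assumption: ties receive the average of the positions they occupy.
function averageRank(xs: number[]): number[] {
  return xs.map((x) => {
    const smaller = xs.filter((y) => y < x).length;
    const equal = xs.filter((y) => y === x).length;
    return smaller + (equal + 1) / 2;
  });
}

// Dense rank: ties share a rank and the next distinct value gets the next integer.
function denseRankOf(xs: number[]): number[] {
  const distinct = Array.from(new Set(xs)).sort((a, b) => a - b);
  return xs.map((x) => distinct.indexOf(x) + 1);
}

const tiedValues = [10, 20, 20, 30];

console.log(averageRank(tiedValues)); // [1, 2.5, 2.5, 4]
console.log(denseRankOf(tiedValues)); // [1, 2, 2, 3]
```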

Cumulative Functions

• s.cumsum() - Cumulative sum
• s.cumprod() - Cumulative product
• s.cummin() - Cumulative minimum
• s.cummax() - Cumulative maximum
• s.cummean() - Cumulative mean

Window & Utility Functions

• s.lag() - Lag values
• s.lead() - Lead values
• s.round() - Round to decimal places
• s.floor() - Floor values
• s.ceiling() - Ceiling values
• s.countValue() - Count specific values

Distribution Functions

• s.dist.normal.density() - Normal density
• s.dist.normal.probability() - Normal CDF
• s.dist.normal.quantile() - Normal quantiles
• s.dist.normal.random() - Normal random samples
• s.dist.beta.density() - Beta density
• s.dist.beta.random() - Beta random samples
• s.dist.gamma.density() - Gamma density
• s.dist.binomial.random() - Binomial random samples
• ...16 distributions × 4 functions each = 64 functions

Statistical Tests

• s.test.t.oneSample() - One-sample t-test
• s.test.t.independent() - Two-sample t-test
• s.test.anova.oneWay() - One-way ANOVA
• s.test.correlation.pearson() - Pearson correlation test
• s.test.nonparametric.mannWhitney() - Mann-Whitney U test
• s.test.categorical.chiSquare() - Chi-square test
• s.test.normality.shapiroWilk() - Normality test
• s.test.nonparametric.kruskalWallis() - Kruskal-Wallis test
• ...8 categories with 20+ statistical tests