From raw data to APA-formatted PDF, DOCX, and HTML — now with 15+ data formats, AI-powered chat, and autonomous robustness checks.
Each stage executes via asyncio.to_thread, streaming real-time progress events to the CLI while offloading CPU-bound computation to separate threads.
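As a rough illustration of this execution model, the sketch below runs blocking stages one after another through asyncio.to_thread and records a progress event before and after each. The stage names, run_stage, and run_pipeline are hypothetical stand-ins, not StatForge's actual functions.

```python
import asyncio
import time


def run_stage(name: str) -> str:
    """Hypothetical stand-in for a blocking pipeline stage."""
    time.sleep(0.01)  # simulate blocking work
    return f"{name}: done"


async def run_pipeline(stages: list[str]) -> list[str]:
    """Run stages in order, each in a worker thread, collecting progress events."""
    events = []
    for name in stages:
        events.append(f"{name}: started")
        # to_thread keeps the event loop free while the stage blocks
        result = await asyncio.to_thread(run_stage, name)
        events.append(result)
    return events


events = asyncio.run(run_pipeline(["ingest", "assumptions", "model"]))
```

Threads keep the event loop responsive, but pure-Python CPU-bound code still contends for the interpreter lock; numerical libraries such as NumPy often release it during heavy computation.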
15+ formats: CSV, TSV, JSON, Excel, Parquet, Feather, SPSS, Stata, SAS, HDF5, SQLite, and remote URLs. Automatic format detection with lazy dependency loading.
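A minimal sketch of extension-based detection with lazy imports might look like the following. The FORMATS table, the function names, and the choice of backing packages are assumptions made for illustration; the source does not say which libraries StatForge uses for each format.

```python
import importlib
from pathlib import Path

# Hypothetical mapping: file extension -> (format name, optional module).
FORMATS = {
    ".csv": ("csv", None),
    ".tsv": ("tsv", None),
    ".json": ("json", None),
    ".parquet": ("parquet", "pyarrow"),
    ".sav": ("spss", "pyreadstat"),
    ".h5": ("hdf5", "tables"),
}


def detect_format(path: str) -> str:
    """Infer the data format from the file extension (case-insensitive)."""
    suffix = Path(path).suffix.lower()
    if suffix not in FORMATS:
        raise ValueError(f"Unsupported file extension: {suffix!r}")
    return FORMATS[suffix][0]


def lazy_import(module_name: str):
    """Import an optional dependency only when needed, with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"Reading this format needs '{module_name}'. "
            f"Try: pip install {module_name}"
        ) from exc
```

Deferring the import means a user who only reads CSV files never needs the heavier optional packages installed.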
Shapiro-Wilk normality and Levene homoscedasticity tests with borderline detection (0.04 < p < 0.06). Results cached via joblib.Memory keyed on SHA-256 data hashes.
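The borderline rule and the hash-based cache key can be sketched with the standard library alone. This is an illustrative assumption about how the pieces fit together: classify_assumption and data_hash are hypothetical names, the 0.05 threshold is assumed, and the joblib.Memory layer itself is omitted.

```python
import hashlib


def classify_assumption(p_value: float, alpha: float = 0.05,
                        low: float = 0.04, high: float = 0.06) -> str:
    """Label an assumption test result; the band (low, high) is exclusive."""
    if low < p_value < high:
        return "borderline"
    return "met" if p_value >= alpha else "violated"


def data_hash(rows: list[tuple]) -> str:
    """SHA-256 digest of the rows, usable as a cache key for test results."""
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
        digest.update(b"\n")  # separator so row boundaries affect the hash
    return digest.hexdigest()
```

Keying the cache on a content hash rather than a file name means identical data loaded from a different path still hits the cache, while any edit to the data invalidates it.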
Decision tree that maps group count and assumption results to a ranked list of tests. Routes to t_test, mann_whitney, anova, kruskal_wallis, or regression.
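One way such a decision tree could be written is shown below. The branching rules are assumptions for illustration only, in particular the choice to route fewer than two groups to regression; the actual StatForge logic may differ.

```python
def rank_tests(n_groups: int, normal: bool, equal_variance: bool) -> list[str]:
    """Return candidate tests, most appropriate first (hypothetical rules)."""
    if n_groups < 2:
        # Assumption: no grouping variable means a regression-style analysis.
        return ["regression"]
    if n_groups == 2:
        parametric, nonparametric = "t_test", "mann_whitney"
    else:
        parametric, nonparametric = "anova", "kruskal_wallis"
    if normal and equal_variance:
        return [parametric, nonparametric]
    # Any failed assumption promotes the rank-based alternative.
    return [nonparametric, parametric]
```

For example, rank_tests(3, normal=False, equal_variance=True) places kruskal_wallis ahead of anova.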
Plugin registry that dispatches to models registered via the @register decorator. Ships with frequentist (ANOVA, t-test) and Bayesian (PyMC t-test, ANOVA, regression) models.
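The decorator-based registry pattern can be sketched as follows. The signature of register (taking a name string), the MODEL_REGISTRY dictionary, and the dispatch helper are assumptions; only the existence of an @register decorator comes from the description above.

```python
from typing import Callable

# Hypothetical registry: model name -> model class.
MODEL_REGISTRY: dict[str, type] = {}


def register(name: str) -> Callable[[type], type]:
    """Class decorator that records a model class under the given name."""
    def decorator(cls: type) -> type:
        if name in MODEL_REGISTRY:
            raise ValueError(f"Model {name!r} is already registered")
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator


@register("t_test")
class TTest:
    """Placeholder model; a real plugin would implement the analysis."""

    def fit(self, a, b):
        return "t_test fitted"


def dispatch(name: str):
    """Instantiate the model registered under name."""
    try:
        return MODEL_REGISTRY[name]()
    except KeyError:
        raise KeyError(f"No model registered under {name!r}") from None
```

Because registration happens at import time, adding a model only requires importing the module that defines it; the dispatcher needs no changes.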
APA7-compliant table builder with effect sizes (η², Cohen's d, R²). Formats p-values per APA convention (< .001 or exact to three decimals).
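The p-value convention described here is simple enough to show directly. The function below is a sketch with a hypothetical name, format_p, and is not taken from the StatForge codebase; it drops the leading zero, as APA style does for values that cannot exceed 1.

```python
def format_p(p: float) -> str:
    """Format a p-value as '< .001' or exact to three decimals, no leading zero."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")
```

With this sketch, format_p(0.0314) gives "p = .031" and format_p(0.0004) gives "p < .001".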
Jinja2 template orchestrator generating PDF, DOCX, or HTML output. Supports pluggable journal templates (APA7, Vancouver, IEEE, Nature).
Three headline features expand StatForge from a pipeline runner into a full research companion.
Load CSV, TSV, JSON, Excel, Parquet, Feather, SPSS, Stata, SAS, HDF5, SQLite databases, and remote URLs. Optional dependencies are imported lazily with clear install hints.
Run statforge chat data.csv to explore your dataset interactively. Each row becomes a searchable document (microgpt philosophy). Connects to Claude API or uses a built-in rule engine.
Pass --auto to the run command. When borderline assumptions are detected (0.04 < p < 0.06), both parametric and non-parametric tests run automatically and results are compared.
Every feature maps directly to the implemented Python codebase. No aspirational claims.
The statforge validate command performs preliminary data quality checks: flagging missing values, detecting data type anomalies, and screening for outliers via IQR methods — entirely decoupled from the statistical models.
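For readers unfamiliar with IQR screening, here is a minimal standard-library sketch. The function names, the conventional 1.5 multiplier, and the quartile method (the default of statistics.quantiles) are assumptions; the source does not state which variant the validate command uses.

```python
import statistics


def count_missing(values: list) -> int:
    """Count entries recorded as None."""
    return sum(v is None for v in values)


def iqr_outliers(values: list[float], k: float = 1.5) -> list[float]:
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


outliers = iqr_outliers([10, 12, 11, 13, 12, 11, 95])
```

In this example the quartiles are 11 and 13, so the fences sit at 8 and 16 and only the value 95 is flagged.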
Auto-suggests data-driven weakly informative priors (Normal with μ = observed mean, σ = 2× observed SD). Documents the rationale for peer review. Runs automated sensitivity analysis across uninformative, weakly informative, and informative prior variants.
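The weakly informative prior rule can be sketched as below. NormalPrior, suggest_prior, and prior_variants are hypothetical names, the sample standard deviation is assumed for "observed SD", and the scale factors for the uninformative and informative variants (10 and 0.5) are illustrative guesses; only the factor of 2 comes from the description above.

```python
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalPrior:
    mu: float
    sigma: float
    rationale: str


def suggest_prior(values: list[float], scale: float = 2.0) -> NormalPrior:
    """Normal prior centred on the observed mean, sigma = scale * observed SD."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # assumption: sample SD
    return NormalPrior(
        mu=mean,
        sigma=scale * sd,
        rationale=f"Normal(mean, {scale:g} x SD) derived from observed data",
    )


def prior_variants(values: list[float]) -> dict[str, NormalPrior]:
    """Prior set for a sensitivity analysis (scale factors are assumptions)."""
    scales = {"uninformative": 10.0, "weakly_informative": 2.0, "informative": 0.5}
    return {label: suggest_prior(values, s) for label, s in scales.items()}
```

Refitting the model under each variant and comparing the posteriors shows how strongly the conclusions depend on the prior.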
The MethodsBuilder synthesizes journal-ready prose specifying the exact test name, StatForge version, assumption rationale, significance threshold (α), and effect size metric with interpretation scale (per Cohen, 1988).
StatForge is free and open-source. Professional services are available for institutions requiring bespoke configurations.