EDA Suite
The EDA suite runs a read-only health check on your dataset and returns a suitability score, a quick indicator of how model-ready your data is. Your data is never modified.
It is designed for regression datasets with numeric targets. The time column is optional.
Three ways to run EDA
As a programmatic object eda()
Returns a result object you can access in code. Use this when you want to check the suitability score, loop through recommendations, or conditionally branch based on data quality.
rep = fount_eda.eda(df, target="Units", datetime_col="date")
# Access results
print(rep.suitability_score)
print(rep.recommendations)
As markdown text eda_md()
Returns a single markdown string. Use this when you want to save the report to a file, include it in a pull request, or paste it into documentation or notes.
md = fount_eda.eda_md(df, target="Units", datetime_col="date")
# Save to a file
with open("eda_report.md", "w", encoding="utf-8") as f:
    f.write(md)
As a console print eda_print()
Prints a readable summary directly to the terminal or notebook output. Use this for a quick sense-check during exploration.
fount_eda.eda_print(df, target="Units", datetime_col="date")
Full report and score eda_report_and_score()
Generates a complete report alongside the suitability score in a single call.
fount_eda.eda_report_and_score(df)
Targets and time column
Target column:
- Pass a single numeric column name as a string: target="Units"
- Pass multiple numeric column names as a list: target=["Units_A", "Units_B"]
Time column:
- Optional. Pass the name of your date column: datetime_col="date"
- If included, make sure the column is already parsed as a datetime; use fount_eda.to_datetime() first if needed
- If your dataset has no date column, omit the parameter entirely
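The rules above (a string or a list of strings for the target, and no datetime_col key at all when there is no date column) can be sketched as a small helper. Note that build_eda_kwargs is a hypothetical name introduced here for illustration; it is not part of fount_eda, and the validation it performs is an assumption about what a careful caller might want.

```python
def build_eda_kwargs(target, datetime_col=None):
    """Assemble keyword arguments for an EDA call.

    Hypothetical helper for illustration; not part of fount_eda.
    """
    if isinstance(target, str):
        if not target:
            raise ValueError("target must be a non-empty column name")
    elif isinstance(target, (list, tuple)):
        if not target or not all(isinstance(t, str) and t for t in target):
            raise ValueError("target list must contain non-empty column names")
        target = list(target)
    else:
        raise TypeError("target must be a string or a list of strings")

    kwargs = {"target": target}
    # Leave datetime_col out entirely when the dataset has no date column.
    if datetime_col is not None:
        kwargs["datetime_col"] = datetime_col
    return kwargs


# Intended use, assuming df is your dataset:
# rep = fount_eda.eda(df, **build_eda_kwargs("Units", datetime_col="date"))
print(build_eda_kwargs(["Units_A", "Units_B"]))
```

Building the arguments in one place keeps the "omit the parameter" rule out of every call site.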
What the suitability score means
The suitability score is a 0–100 indicator of how model-ready your dataset is. A higher score generally means the data is cleaner and easier to model.
Use it to:
- Compare different versions or variants of your dataset
- Spot when a cleanup step has improved data quality
- Get a quick baseline before investing time in feature engineering
The score is a directional signal, not a guarantee of model performance. A high score means your data is structurally sound; it does not tell you whether the features are predictive.
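One way to use the score for comparing dataset variants is to collect a score per variant and rank them against a baseline. The sketch below assumes you have already obtained each score (for example from rep.suitability_score); compare_scores is a hypothetical helper, and the numbers are placeholders, not real results.

```python
def compare_scores(scores, baseline):
    """Return (name, score, delta_vs_baseline) tuples, highest score first."""
    base = scores[baseline]
    rows = [(name, score, score - base) for name, score in scores.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Placeholder values for illustration. In practice each one would come from
# something like fount_eda.eda(variant_df, target="Units").suitability_score
scores = {"raw": 61.0, "imputed": 68.5, "imputed_clipped": 74.0}

for name, score, delta in compare_scores(scores, baseline="raw"):
    print(f"{name:<16} {score:5.1f} {delta:+5.1f}")
```

Because the score is directional, the size of the delta matters less than its sign: a cleanup step that lowers the score is worth a second look.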
What to do with the results
The EDA suite highlights common issues and suggests next steps. Here's how to act on the most frequent findings:
| Finding | Suggested action |
|---|---|
| High missing values | Impute missing values or drop columns/rows where appropriate |
| Large variance in scales | Standardize numeric features before modeling |
| Many outliers | Try robust scaling, clipping, or winsorizing |
| Very rare or very many categories | Consider grouping low-frequency categories or revisiting encoding choices |
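As a rough illustration of the first three rows, here is a standard-library sketch of median imputation, quantile clipping, and standardization on a plain list of numbers. The function names are hypothetical and this is not how the EDA suite itself works; on a pandas DataFrame you would more likely reach for Series.fillna and Series.clip.

```python
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def clip_to_quantiles(values, n=20):
    """Clip values to the first and last of n quantile cut points.

    With n=20 this is roughly the 5th and 95th percentiles.
    """
    cuts = statistics.quantiles(values, n=n, method="inclusive")
    low, high = cuts[0], cuts[-1]
    return [min(max(v, low), high) for v in values]


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


raw = [1, 2, None, 4, 100]
cleaned = standardize(clip_to_quantiles(impute_median(raw)))
print(cleaned)
```

The order matters here: imputing first gives the quantile step complete data, and clipping before standardizing keeps a single extreme value from dominating the mean and standard deviation.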
Quick reference
| Method | Use case |
|---|---|
| fount_eda.eda(df, target=..., datetime_col=...) | Programmatic access to results |
| fount_eda.eda_md(df, target=..., datetime_col=...) | Markdown string for saving or sharing |
| fount_eda.eda_print(df, target=..., datetime_col=...) | Quick terminal or notebook print |
| fount_eda.eda_report_and_score(df) | Full report and score in one call |
| rep.suitability_score | 0–100 model-readiness score |
| rep.recommendations | List of suggested improvements |
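The two result attributes in the table can be combined into a simple quality gate, for instance in a pipeline step that should stop when the data is not ready. This is a minimal sketch: check_quality and the threshold of 70 are assumptions made for illustration, and a SimpleNamespace stands in for the object returned by fount_eda.eda() so the example runs on its own.

```python
from types import SimpleNamespace


def check_quality(rep, min_score=70):
    """Return True if the score meets min_score; otherwise list recommendations."""
    if rep.suitability_score >= min_score:
        return True
    print(f"Suitability score {rep.suitability_score} is below {min_score}:")
    for rec in rep.recommendations:
        print(f"  - {rec}")
    return False


# Stand-in for the result of fount_eda.eda(df, target="Units"); illustration only.
demo_rep = SimpleNamespace(
    suitability_score=55,
    recommendations=["Impute missing values"],
)

if not check_quality(demo_rep):
    print("Clean the data before modeling.")
```

Pick the threshold from your own baseline runs rather than treating 70 as meaningful in itself.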