EDA Suite

The EDA suite runs a read-only health check on your dataset and returns a suitability score, a quick indicator of how model-ready your data is. Your data is never modified.

It is designed for regression datasets with numeric targets. The time column is optional.

Ways to run EDA

As a programmatic object: eda()

Returns a result object you can access in code. Use this when you want to check the suitability score, loop through recommendations, or conditionally branch based on data quality.

rep = fount_eda.eda(df, target="Units", datetime_col="date")

# Access results
print(rep.suitability_score)
print(rep.recommendations)
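
One way to branch on data quality is a small gate in front of your modeling code. The sketch below is illustrative only: quality_gate, the threshold of 70, and the stand-in rep object are assumptions, not part of fount_eda. It relies only on the suitability_score and recommendations attributes shown above, and assumes recommendations is an iterable of strings.

```python
from types import SimpleNamespace


def quality_gate(rep, min_score=70):
    """Return True if the report clears min_score; otherwise print the recommendations.

    min_score=70 is an arbitrary illustrative threshold, not a fount_eda default.
    """
    if rep.suitability_score >= min_score:
        return True
    print(f"Score {rep.suitability_score} is below {min_score}. Suggested fixes:")
    for item in rep.recommendations:
        print(f"  - {item}")
    return False


# Stand-in for the object returned by fount_eda.eda(...), with made-up values,
# so the sketch runs on its own.
rep = SimpleNamespace(
    suitability_score=62,
    recommendations=["Impute missing values in 'price'"],
)
ready = quality_gate(rep)
```

In a real run you would pass the object returned by fount_eda.eda(...) instead of the stand-in, and pick a threshold that suits your project.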

As markdown text: eda_md()

Returns a single markdown string. Use this when you want to save the report to a file, include it in a pull request, or paste it into documentation or notes.

md = fount_eda.eda_md(df, target="Units", datetime_col="date")

# Save to a file
with open("eda_report.md", "w", encoding="utf-8") as f:
    f.write(md)

As a console print: eda_print()

Prints a readable summary directly to the terminal or notebook output. Use this for a quick sense-check during exploration.

fount_eda.eda_print(df, target="Units", datetime_col="date")

Full report and score: eda_report_and_score()

Generates a complete report alongside the suitability score in a single call.

fount_eda.eda_report_and_score(df)

Targets and time column

Target column:

  • Pass a single numeric column name as a string: target="Units"
  • Pass multiple numeric column names as a list: target=["Units_A", "Units_B"]

Time column:

  • Optional. Pass the name of your date column: datetime_col="date"
  • If included, make sure the column is already parsed as a datetime; use fount_eda.to_datetime() first if needed
  • If your dataset has no date column, omit the parameter entirely
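
If you are unsure whether the date column is already a datetime, a quick dtype check before calling EDA can help. This sketch uses pandas' own pd.api.types.is_datetime64_any_dtype and pd.to_datetime. This page does not describe how fount_eda.to_datetime() behaves, so treat the pandas calls as an assumed equivalent, and the sample frame as made-up data.

```python
import pandas as pd

# Made-up sample data; the "date" column starts out as plain strings.
df = pd.DataFrame({
    "date": ["2024-01-01", "2024-01-02", "2024-01-03"],
    "Units": [10.0, 12.5, 9.0],
})

# Parse only if the column is not already a datetime dtype.
if not pd.api.types.is_datetime64_any_dtype(df["date"]):
    df["date"] = pd.to_datetime(df["date"])

# True once the column holds datetimes and is safe to pass as datetime_col.
is_parsed = pd.api.types.is_datetime64_any_dtype(df["date"])
```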

What the suitability score means

The suitability score is a 0–100 indicator of how model-ready your dataset is. A higher score generally means the data is cleaner and easier to model.

Use it to:

  • Compare different versions or variants of your dataset
  • Spot when a cleanup step has improved data quality
  • Get a quick baseline before investing time in feature engineering
Note: The score is a directional signal, not a guarantee of model performance. A high score means your data is structurally sound; it does not tell you whether the features are predictive.
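
Comparing dataset variants can be as simple as collecting one score per variant and looking at the difference. The sketch below uses made-up scores and hypothetical variant names; in practice each number would come from rep.suitability_score on a different version of the dataset.

```python
def score_change(before, after):
    """Return the change in suitability score between two runs (positive = improved)."""
    return after - before


# Made-up scores for illustration only.
scores = {"raw": 58, "imputed": 71, "imputed_and_clipped": 76}

# Pick the variant with the highest score and measure its gain over the raw data.
best_variant = max(scores, key=scores.get)
improvement = score_change(scores["raw"], scores[best_variant])
print(f"Best variant: {best_variant} (+{improvement} vs raw)")
```

Because the score is directional, a small difference between variants is weaker evidence than a large one.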

What to do with the results

The EDA suite highlights common issues and suggests next steps. Here's how to act on the most frequent findings:

| Finding | Suggested action |
| --- | --- |
| High missing values | Impute missing values or drop columns/rows where appropriate |
| Large variance in scales | Standardize numeric features before modeling |
| Many outliers | Try robust scaling, clipping, or winsorizing |
| Very rare or very many categories | Consider grouping low-frequency categories or revisiting encoding choices |
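
Three of these actions can be sketched in a few lines. The example works on plain Python lists to stay dependency-free; with a DataFrame you would more likely reach for the pandas or scikit-learn equivalents. The helper names, the clipping bounds, and min_count are arbitrary illustrative choices, not fount_eda behavior.

```python
from collections import Counter
from statistics import mean, pstdev


def standardize(values):
    """Rescale to zero mean and unit variance (population standard deviation).

    Assumes the values are not all identical, otherwise sigma is zero.
    """
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def clip(values, lower, upper):
    """Limit each value to the [lower, upper] range."""
    return [min(max(v, lower), upper) for v in values]


def group_rare(categories, min_count=2, other="Other"):
    """Replace categories seen fewer than min_count times with a single label."""
    counts = Counter(categories)
    return [c if counts[c] >= min_count else other for c in categories]


# Made-up values for illustration.
units = [10.0, 12.0, 11.0, 250.0]
clipped = clip(units, lower=0.0, upper=20.0)   # tames the 250.0 outlier
scaled = standardize(clipped)                  # centered on zero
regions = group_rare(["north", "north", "south", "south", "east"])
```

Re-running EDA after each step shows whether the change moved the suitability score.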

Quick reference

| Method | Use case |
| --- | --- |
| fount_eda.eda(df, target=..., datetime_col=...) | Programmatic access to results |
| fount_eda.eda_md(df, target=..., datetime_col=...) | Markdown string for saving or sharing |
| fount_eda.eda_print(df, target=..., datetime_col=...) | Quick terminal or notebook print |
| fount_eda.eda_report_and_score(df) | Full report and score in one call |
| rep.suitability_score | 0–100 model-readiness score |
| rep.recommendations | List of suggested improvements |