Tuning

Learn how to fine-tune machine learning models using the Python SDK's tune() function and manage tuning jobs programmatically.

The Python SDK's tuning module lets you fine-tune machine learning models on your own datasets, with configurable parameters for time series forecasting and other predictive tasks. It provides a programmatic interface for creating and managing tuning jobs.

Quick Start

The Python SDK's tune() function creates a tuning job that optimizes model parameters for your dataset and requirements. You configure the job in the tune() call, start it with run(), and read the results with metrics() once the job has completed.

Prerequisites

Before using the tuning functionality, ensure you have:

  • Installed the Python SDK: pip install fount-core
  • Configured your client with API credentials
  • Uploaded a dataset using the SDK's upload functionality
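
With the prerequisites in place, the arguments for a tuning call can be collected and checked before anything is sent to the service. The sketch below is a hypothetical helper, not part of the SDK: build_tune_kwargs and its default values are assumptions, and the parameter names are the ones used elsewhere in this guide. It treats 0.0 and 1.0 as invalid validation_split values, which is also an assumption.

```python
def build_tune_kwargs(dataset, model_name, target_columns, categorical_cols=None,
                      date_column=None, time_granularity="non-timeseries",
                      validation_split=0.2):
    """Hypothetical helper (not part of the SDK): check and collect tune() arguments."""
    if dataset is None:
        raise ValueError("dataset is required; upload one before tuning")
    if not isinstance(model_name, str) or not model_name:
        raise ValueError("model_name must be a non-empty string")

    # Column arguments must be lists of strings, even for a single column.
    column_args = {"target_columns": target_columns}
    if categorical_cols is not None:
        column_args["categorical_cols"] = categorical_cols
    for label, cols in column_args.items():
        if not isinstance(cols, list) or not all(isinstance(c, str) for c in cols):
            raise TypeError(f"{label} must be a list of strings")

    # Assumption: the bounds themselves are rejected because they leave one split empty.
    if not 0.0 < validation_split < 1.0:
        raise ValueError("validation_split must be between 0.0 and 1.0")
    if time_granularity != "non-timeseries" and date_column is None:
        raise ValueError("time-series granularity needs a date_column")

    kwargs = {
        "dataset": dataset,
        "model_name": model_name,
        "time_granularity": time_granularity,
        "validation_split": validation_split,
        **column_args,
    }
    if date_column is not None:
        kwargs["date_column"] = date_column
    return kwargs


# Usage sketch; `client` and `dataset` come from the prerequisites above.
# kwargs = build_tune_kwargs(dataset, "q4_sales_tuned", ["Sales"],
#                            categorical_cols=["Region"], date_column="Date",
#                            time_granularity="weekly", validation_split=0.1)
# tuning_job = client.tune(**kwargs)
# tuning_job.run(wait=True, poll_interval=30)
# metrics_df = tuning_job.metrics()
```

Failing locally on these checks is cheaper than waiting for the corresponding errors described under Troubleshooting.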

Key Features

Automated Validation

Automatically split your data into training and validation sets with customizable ratios using the SDK

Time Series Support

Built-in support for various time granularities (daily, weekly, monthly, etc.) through the SDK interface

Flexible Configuration

Configure categorical columns, target variables, and other parameters directly in your Python code

Job Management

Monitor progress, retrieve metrics, and control job execution programmatically through SDK methods

Troubleshooting

All configuration and job errors below also apply to client.train(). See Training > Troubleshooting for the full data quality and column error reference. This section highlights the most common issues specific to tuning workflows.

Configuration Errors

ValidationError: Field required: dataset

dataset was not passed or is None. Upload a dataset and confirm it is valid before calling tune():

dataset = client.upload_dataframe(df, name="my_data")
print(dataset.id)  # must not be None

ValidationError: Field required: model_name

Pass a descriptive string name:

tuning_job = client.tune(dataset=dataset, model_name="q4_sales_tuned", ...)

ValidationError: extra fields not permitted

Only use documented SDK parameter names. Remove any backend-only fields from the tune() call.

TypeError: tune() got an unexpected keyword argument ...

Internal backend fields (e.g. s3_bucket_name, device) were passed to the SDK. Remove them.

Column & Parameter Errors

categorical_cols must be List[str] / target_columns must be List[str]

Always pass a list, even for a single column:

categorical_cols=["Region"]     # correct
target_columns=["Sales"]        # correct

validation_split must be between 0.0 and 1.0

Use a fraction such as 0.1 or 0.2. Very small validation splits are especially problematic in tuning since each trial needs enough validation data to evaluate:

validation_split=0.1   # minimum recommended for tuning

time_granularity must be one of {...}

Use exact lowercase strings: daily, weekly, monthly, quarterly, non-timeseries, etc. Avoid aliases like "day" or "week".
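
If granularity values come from user input or a config file, they can be normalized before the call. This is a minimal sketch: normalize_granularity and the alias table are hypothetical, and the table only covers values named in this guide. It does not reject unknown values, since the full set of accepted strings is not listed here.

```python
# Hypothetical alias table; extend it to match the values your SDK version accepts.
GRANULARITY_ALIASES = {
    "day": "daily",
    "week": "weekly",
    "month": "monthly",
    "quarter": "quarterly",
}


def normalize_granularity(value):
    """Lowercase, trim, and expand common aliases; leave other values untouched."""
    cleaned = value.strip().lower()
    return GRANULARITY_ALIASES.get(cleaned, cleaned)


print(normalize_granularity(" Week "))  # weekly
```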

Date column '...' not found in the data / Target columns [...] not found in the data / Categorical columns [...] not found in the data

Column names must exactly match the dataset (case-sensitive, no trailing spaces):

print(df.columns.tolist())
assert date_column in df.columns
assert all(c in df.columns for c in target_columns)
assert all(c in df.columns for c in categorical_cols)
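
When an assertion like these fails, the cause is often a case or whitespace mismatch rather than a truly missing column. The sketch below uses a hypothetical helper, diagnose_columns, that works on plain lists of names (for example df.columns.tolist()) and suggests the closest existing column for each missing one.

```python
def diagnose_columns(available, requested):
    """Map each requested name that is missing to a near match, or None if there is none."""
    # Index existing columns by their trimmed, lowercased form.
    normalized = {name.strip().lower(): name for name in available}
    report = {}
    for name in requested:
        if name in available:
            continue  # exact match, nothing to report
        report[name] = normalized.get(name.strip().lower())
    return report


columns = ["Date", "Region ", "sales", "Price"]
print(diagnose_columns(columns, ["Date", "Region", "Sales", "Units"]))
# {'Region': 'Region ', 'Sales': 'sales', 'Units': None}
```

A non-None suggestion means the column exists under a slightly different spelling; None means it is absent from the dataset.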

No input features found in the data

After removing target, date, and categorical columns, no feature columns remain. Ensure the dataset has at least one numeric input feature.
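
The same subtraction can be done locally before tuning. This is a sketch with a hypothetical helper, remaining_features, that mirrors the description above on plain lists of column names; the SDK's actual feature selection may differ.

```python
def remaining_features(columns, target_columns, categorical_cols, date_column=None):
    """Return the columns left after removing target, categorical, and date columns."""
    excluded = set(target_columns) | set(categorical_cols)
    if date_column is not None:
        excluded.add(date_column)
    return [c for c in columns if c not in excluded]


features = remaining_features(
    ["Date", "Region", "Sales", "Price"],
    target_columns=["Sales"],
    categorical_cols=["Region"],
    date_column="Date",
)
print(features)  # ['Price']
```

An empty result indicates the dataset would trigger this error.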

time series granularity but no date column

Set time_granularity="non-timeseries" for cross-sectional data, or provide a valid date_column for time-series tuning.

Data Quality Errors

could not convert string to float

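A non-categorical feature column contains text values. Inspect the dtypes of the remaining feature columns; any column reported as object or string holds text. Add it to categorical_cols or remove it: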

print(df.drop(columns=target_columns + [date_column] + categorical_cols).dtypes)
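
To see which value is causing the failure, each feature column can be scanned for the first entry that does not convert to float. This is a sketch with a hypothetical helper, non_numeric_columns, that works on a list of row dictionaries such as the output of df.to_dict("records").

```python
def non_numeric_columns(rows, feature_cols):
    """Return {column: first value that float() rejects} for the given feature columns."""
    offenders = {}
    for col in feature_cols:
        for row in rows:
            try:
                float(row[col])
            except (TypeError, ValueError):
                offenders[col] = row[col]
                break  # one example per column is enough
    return offenders


rows = [
    {"Price": 9.5, "Channel": "web"},
    {"Price": "n/a", "Channel": "store"},
]
print(non_numeric_columns(rows, ["Price", "Channel"]))
# {'Price': 'n/a', 'Channel': 'web'}
```

Note that numeric strings such as "1.5" pass this check, because float() accepts them.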

Input contains NaN, infinity or a value too large

Impute or drop missing/infinite values before upload:

df = df.replace([float("inf"), float("-inf")], float("nan"))  # fillna() ignores infinities
df = df.fillna(df.median(numeric_only=True))

Training very slow / high cardinality categorical feature

Tuning is especially slow with high-cardinality columns since each trial re-encodes them. Remove or bin ID-like columns before tuning.
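
ID-like columns can be spotted by comparing the number of distinct values with the number of rows. The sketch below is a hypothetical helper working on a list of row dictionaries (for example df.to_dict("records")); the 0.5 threshold is an arbitrary assumption, not an SDK limit.

```python
def high_cardinality_columns(rows, categorical_cols, max_ratio=0.5):
    """Return {column: distinct count} where distinct values exceed max_ratio of the rows."""
    flagged = {}
    total = len(rows)
    for col in categorical_cols:
        distinct = len({row[col] for row in rows})
        if total and distinct / total > max_ratio:
            flagged[col] = distinct
    return flagged


rows = [
    {"Region": "N", "OrderId": "A1"},
    {"Region": "N", "OrderId": "A2"},
    {"Region": "S", "OrderId": "A3"},
    {"Region": "S", "OrderId": "A4"},
]
print(high_cardinality_columns(rows, ["Region", "OrderId"]))  # {'OrderId': 4}
```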

Found array with 0 sample(s)

The dataset is empty or the validation split leaves no rows. Check row counts after preprocessing.
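
A quick estimate shows whether a given split can leave the validation set empty. This sketch assumes the split is a simple fraction of rows rounded down; the SDK's actual splitting rule is not documented here and may differ. split_row_counts is a hypothetical helper.

```python
def split_row_counts(n_rows, validation_split):
    """Estimate (training rows, validation rows), assuming a floor-rounded fractional split."""
    n_validation = int(n_rows * validation_split)
    return n_rows - n_validation, n_validation


print(split_row_counts(100, 0.2))  # (80, 20)
print(split_row_counts(5, 0.1))    # (5, 0): no validation rows left
```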

Job Execution & Monitoring Errors

TypeError: sleep length must be non-negative / poll_interval must be int

Use a positive integer poll interval:

tuning_job.run(wait=True, poll_interval=30)

Results not available / Job is not completed

Call metrics() only after the job status is Completed:

tuning_job.run(wait=True)
metrics_df = tuning_job.metrics()

AttributeError: 'bool' object has no attribute 'metrics'

run() returns a boolean, not the job object. Do not overwrite the tuning job object with the return value of run():

success = tuning_job.run(wait=True)   # store separately
metrics_df = tuning_job.metrics()      # call on original object
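
The two rules above (wait for completion, keep the job object) can be wrapped in one function. This is a sketch under stated assumptions: run_and_fetch_metrics is a hypothetical helper, run() is assumed to return a truthy value on success, and status() is assumed to return a dictionary with a "status" key, as in the stop example in this guide.

```python
def run_and_fetch_metrics(job, poll_interval=30):
    """Run a tuning job to completion and return its metrics, or raise if it did not complete."""
    # Keep the boolean separate so `job` still refers to the tuning job object.
    succeeded = job.run(wait=True, poll_interval=poll_interval)
    status = job.status()["status"]
    if not succeeded or status != "Completed":
        raise RuntimeError(f"tuning job ended with status {status!r}")
    return job.metrics()


# Usage sketch, with tuning_job created by client.tune(...):
# metrics_df = run_and_fetch_metrics(tuning_job)
```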

Cannot stop job in Completed/Failed state

Only stop jobs that are Pending or Running:

if tuning_job.status()["status"] in ["Pending", "Running"]:
    tuning_job.stop()