Snowflake + Fount

This guide explains how to connect to Snowflake from Python, read a Snowflake table into a pandas DataFrame, and train a forecasting model with Fount. It covers Snowflake account setup, warehouse and database configuration, table creation, package installation, authentication, and the Python code that reads and processes the data in a Jupyter Notebook.

Prerequisites

You need:

  • Snowflake Account
  • Warehouse
  • Database
  • Schema
  • Uploaded table

Step-by-Step Setup

Create Snowflake Account

https://signup.snowflake.com/

Setup Steps

  • Create Database
  • Create Schema
  • Create Warehouse
  • Upload a CSV file as a table (the script below reads one named INVENTORY_DATA)
  • Copy the Account Identifier from your Snowflake URL
  • Use your Snowflake login credentials in the Python script
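
Two of the steps above can also be done in code. The following is a minimal sketch, not an official procedure: `account_identifier_from_url`, `setup_statements`, and `run_setup` are hypothetical helper names, the sketch assumes the URL has the form `https://<account_identifier>.snowflakecomputing.com`, and the warehouse size and auto-suspend values are illustrative choices. The object names match the ones used in the script later in this guide.

```python
from urllib.parse import urlparse

# Assumption: the account URL uses the hostname form
# <account_identifier>.snowflakecomputing.com
SNOWFLAKE_SUFFIX = ".snowflakecomputing.com"


def account_identifier_from_url(url):
    """Return the part of the hostname before '.snowflakecomputing.com'."""
    host = urlparse(url).hostname or ""
    if not host.endswith(SNOWFLAKE_SUFFIX):
        raise ValueError(f"not a <account>.snowflakecomputing.com URL: {url!r}")
    return host[: -len(SNOWFLAKE_SUFFIX)]


def setup_statements(database, schema, warehouse):
    """Build the SQL that creates the warehouse, database, and schema.

    The names are interpolated directly, so pass only trusted identifiers.
    XSMALL and AUTO_SUSPEND = 60 (seconds) are illustrative settings.
    """
    return [
        f"CREATE WAREHOUSE IF NOT EXISTS {warehouse} "
        "WITH WAREHOUSE_SIZE = 'XSMALL' AUTO_SUSPEND = 60 AUTO_RESUME = TRUE",
        f"CREATE DATABASE IF NOT EXISTS {database}",
        f"CREATE SCHEMA IF NOT EXISTS {database}.{schema}",
    ]


def run_setup(cursor, statements):
    """Execute each statement on a DB-API cursor (for example conn.cursor())."""
    for statement in statements:
        cursor.execute(statement)


if __name__ == "__main__":
    for statement in setup_statements("TEST_DB", "PUBLIC", "COMPUTE_WH"):
        print(statement)
```

With an open connection, `run_setup(conn.cursor(), setup_statements("TEST_DB", "PUBLIC", "COMPUTE_WH"))` would create the objects, provided the login role is allowed to create them. Uploading the CSV file is still done separately.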

Required Packages

pip install pandas snowflake-connector-python fount_core aiohttp
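
For the authentication setup, one common option is to keep credentials out of the notebook and read them from environment variables. The sketch below is only an illustration: the variable names (`SNOWFLAKE_USER`, `SNOWFLAKE_PASSWORD`, and so on) and the helper `load_snowflake_config` are assumptions, not something Snowflake or Fount requires. The defaults mirror the warehouse, database, and schema names used in this guide.

```python
import os

# Hypothetical environment variable names; pick whatever suits your setup.
REQUIRED_VARS = ("SNOWFLAKE_USER", "SNOWFLAKE_PASSWORD", "SNOWFLAKE_ACCOUNT")


def load_snowflake_config(env=None):
    """Build keyword arguments for snowflake.connector.connect().

    Reads from os.environ unless another mapping is passed in.
    Raises RuntimeError if a required variable is missing or empty.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "user": env["SNOWFLAKE_USER"],
        "password": env["SNOWFLAKE_PASSWORD"],
        "account": env["SNOWFLAKE_ACCOUNT"],
        # Defaults below are the example names used in this guide.
        "warehouse": env.get("SNOWFLAKE_WAREHOUSE", "COMPUTE_WH"),
        "database": env.get("SNOWFLAKE_DATABASE", "TEST_DB"),
        "schema": env.get("SNOWFLAKE_SCHEMA", "PUBLIC"),
    }
```

The result can then be passed to the connector as `snowflake.connector.connect(**load_snowflake_config())`, using the same keyword arguments as the script in this guide.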

Python Code

"""
Snowflake + Fount Integration

This script:
1. Connects to Snowflake Warehouse
2. Reads a Snowflake table into pandas
3. Uploads the DataFrame to Fount
4. Trains a forecasting model using Fount
"""

import snowflake.connector
import pandas as pd
from fount import Fount

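# =========================
# SNOWFLAKE CONFIG
# =========================

# Replace the placeholders with your own values. Avoid committing real
# credentials to a notebook or repository.
conn = snowflake.connector.connect(
    user='YOUR_USERNAME',
    password='YOUR_PASSWORD',
    account='YOUR_ACCOUNT_IDENTIFIER',  # copied from your Snowflake URL
    warehouse='COMPUTE_WH',
    database='TEST_DB',
    schema='PUBLIC'
)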

# =========================
# READ DATA FROM SNOWFLAKE
# =========================

query = """
SELECT *
FROM INVENTORY_DATA
"""

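# pandas may warn that it only officially supports SQLAlchemy connections;
# the query still runs through the Snowflake DB-API connection.
try:
    df = pd.read_sql(query, conn)
finally:
    conn.close()  # release the Snowflake session once the data is loaded

print(df.head())
print(df.shape)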

# =========================
# UPLOAD TO FOUNT AND TRAIN
# =========================

client = Fount()

dataset = client.upload_dataframe(
    df,
    name="Snowflake_Dataset"
)

train_job = client.train(
    dataset=dataset,
    series_id_cols=["category"],
    categorical_cols=[
        "Seasonality Factors",
        "External Factors",
        "Demand Trend",
        "Customer Segments",
    ],
    model_name="snowflake_inventory_model",
    date_column="Date",
    target_columns=["Sales Quantity"],
    validation_data_required=True,
    validation_split=0.2,
    time_granularity="daily",
    machine="ml.g5.12xlarge",
)

print(train_job)
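
The training call above refers to columns by exact name, so it may help to confirm that the DataFrame read from Snowflake actually contains them before uploading. Snowflake stores unquoted identifiers in upper case, which means a column created as `category` can come back as `CATEGORY`. The sketch below is a hypothetical pre-flight check (the helper `check_columns` is not part of Fount or the Snowflake connector) that reports columns that are missing outright and columns that differ only in letter case.

```python
# Column names taken from the training call in this guide.
REQUIRED_COLUMNS = [
    "category",
    "Date",
    "Sales Quantity",
    "Seasonality Factors",
    "External Factors",
    "Demand Trend",
    "Customer Segments",
]


def check_columns(available, required):
    """Compare required column names against the available ones.

    Returns (missing, case_mismatch):
      missing        list of required names with no match at all
      case_mismatch  dict mapping a required name to the available name
                     that matches it only when letter case is ignored
    """
    available = list(available)
    present = set(available)
    by_lower = {name.lower(): name for name in available}
    missing, case_mismatch = [], {}
    for name in required:
        if name in present:
            continue
        match = by_lower.get(name.lower())
        if match is not None:
            case_mismatch[name] = match
        else:
            missing.append(name)
    return missing, case_mismatch
```

In the notebook this could be called as `missing, renamed = check_columns(df.columns, REQUIRED_COLUMNS)`. If `renamed` is non-empty, something like `df = df.rename(columns={actual: wanted for wanted, actual in renamed.items()})` would align the names before the upload.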