AWS S3 + Fount
This guide explains how to read a CSV file from Amazon Web Services (AWS) S3 storage in Python and pass it to Fount for model training. It covers AWS account setup, S3 bucket creation, IAM user configuration, package installation, and Python code that reads a file from S3 using an Access Key ID and Secret Access Key.
Prerequisites
You need:
- AWS Account
- S3 Bucket
- Access Key
- Secret Key
- CSV file uploaded to S3
Step-by-Step Setup
Create AWS Account
https://aws.amazon.com/
Setup Steps
- Login to AWS Console
- Search for S3 in the search bar
- Create Bucket → Enter unique bucket name → Create
- Upload CSV file into the bucket
- Search IAM → Users → Create User
- Attach permission: AmazonS3FullAccess (AmazonS3ReadOnlyAccess is sufficient if the user only needs to read files)
- Go to Security Credentials → Create Access Key
- Copy Access Key ID and Secret Access Key
- Use them in Python code
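The last step leaves open how the keys reach the script. A minimal sketch, assuming the keys are exported as the standard `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables rather than pasted into the source file; the helper name `load_aws_credentials` is hypothetical and not part of boto3 or Fount:

```python
import os


def load_aws_credentials(env=None):
    """Return boto3-style credential keyword arguments read from the environment.

    Raises RuntimeError if either variable is missing or empty, so the
    script fails early with a clear message.
    """
    # Allow a plain dict to be passed in (useful for testing); default to os.environ.
    env = os.environ if env is None else env
    required = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
    }
```

The returned dictionary can be unpacked into the client call, for example `boto3.client("s3", **load_aws_credentials())`. boto3 also reads these two variables on its own when no keys are passed explicitly.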
Required Packages
pip install pandas boto3 fount_core aiohttp
Python Code
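Before running the script, it can help to confirm that the required packages are importable in the active environment. A small sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the import name `fount` is taken from the script's own `from fount import Fount` line, which may differ from the pip package name:

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names used by the script below (assumed from its import lines).
    needed = ["pandas", "boto3", "fount", "aiohttp"]
    missing = missing_modules(needed)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```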
"""
AWS S3 + Fount Integration
This script:
1. Connects to AWS S3
2. Reads a CSV file from an S3 bucket
3. Converts the data into a pandas DataFrame
4. Uploads the DataFrame to Fount
5. Trains a forecasting model using Fount
"""
import boto3
import pandas as pd
from fount import Fount
# =========================
# AWS CONFIG
# =========================
# Placeholders: replace with your own values and never commit real keys to version control.
aws_access_key_id = "YOUR_ACCESS_KEY"
aws_secret_access_key = "YOUR_SECRET_KEY"
bucket_name = "your-bucket-name"
file_key = "inventory_data.csv"
# =========================
# READ DATA FROM AWS S3
# =========================
s3 = boto3.client(
    's3',
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key
)
obj = s3.get_object(
    Bucket=bucket_name,
    Key=file_key
)
df = pd.read_csv(obj['Body'])
print(df.head())
print(df.shape)
# =========================
# UPLOAD TO FOUNT AND TRAIN
# =========================
client = Fount()
dataset = client.upload_dataframe(
    df,
    name="AWS_S3_Dataset"
)
train_job = client.train(
    dataset=dataset,
    series_id_cols=["category"],
    categorical_cols=[
        'Seasonality Factors',
        'External Factors',
        'Demand Trend',
        'Customer Segments'
    ],
    model_name="aws_inventory_model",
    date_column="Date",
    target_columns=["Sales Quantity"],
    validation_data_required=True,
    validation_split=0.2,
    time_granularity="daily",
    machine="ml.g5.12xlarge"
)
print(train_job)
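The training call refers to several columns by name, so a CSV with different headers would only fail once the job is submitted. A minimal sketch of a check that could run before the upload, assuming the column names are exactly those passed to `client.train` above; the helpers `missing_columns` and `check_training_columns` are hypothetical and not part of Fount or pandas. Note that `category` is lowercase while the other names are capitalized, which is the kind of mismatch this check would surface:

```python
def missing_columns(available, required):
    """Return required column names absent from available, preserving order."""
    present = set(available)
    return [name for name in required if name not in present]


# Column names the train() call in the script refers to.
REQUIRED_COLUMNS = [
    "Date",
    "Sales Quantity",
    "category",
    "Seasonality Factors",
    "External Factors",
    "Demand Trend",
    "Customer Segments",
]


def check_training_columns(available):
    """Raise ValueError listing any column the training call would not find."""
    missing = missing_columns(available, REQUIRED_COLUMNS)
    if missing:
        raise ValueError("CSV is missing columns: " + ", ".join(missing))
```

In the script, this could be called as `check_training_columns(df.columns)` right after `pd.read_csv`, before the DataFrame is uploaded.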