A predictive model
for a food chain
to predict traffic
on their recipe menu

traffic


        "We have noticed that traffic to the rest of the website goes up by as much as 40% if I pick a popular recipe. But I don't know how to decide if a recipe will be popular. More traffic means more subscriptions so this is really important to the company."

Column	Description	Type
recipe	Unique identifier (ID only — dropped before modelling)	int
calories	Calories per serving — 52 missing, imputed with median (306.2 kcal)	float
carbohydrate	Carbohydrates in grams — 52 missing, imputed with median (37.1g)	float
sugar	Sugar in grams — 52 missing, imputed with median (8.8g)	float
protein	Protein in grams — 52 missing, imputed with median (24.5g)	float
category	Recipe type — 11 categories, no missing values	str
servings	Number of servings — mixed types, converted to numeric	str→int
high_traffic	Target — "High" if popular; 373 missing → treated as "Low"	binary

Column	Issue	Resolution
recipe	✅ No issues	Used as ID tracker, dropped before modelling
calories	⚠️ 52 missing	Rows with ALL 4 nutrition fields missing removed (52); remaining imputed with median
carbohydrate	⚠️ 52 missing	Same rows as calories — removed with all-missing rows
sugar	⚠️ 52 missing	Same as above — median imputation after row removal
protein	⚠️ 52 missing	Same as above — median imputation after row removal
category	✅ No missing	11 unique values confirmed; one-hot encoded for modelling
servings	⚠️ Mixed types	"4 as a snack" → coerced to numeric; 1 NaN imputed with median
high_traffic	⚠️ 373 missing	Conservative: filled as "Low" (unexplored = likely not high)

A predictive model
for a food chain
to predict traffic
on their recipe menu

The Business Problem

Business Goal

Expected Impact

Current Process

Our Solution

Understanding the Dataset

What Makes a Recipe Popular?

High-Traffic vs Low-Traffic Recipes: Nutritional Differences

Two Models Tested

⚙ Threshold Tuning

Performance & Business Impact

vs Random Selection

Estimated Revenue Impact

Top 30 Predicted High-Traffic Recipes

Libraries, Models & Frameworks

Analysis Pipeline

Load Data

Validate

Clean

EDA

Split

Train

Evaluate

Tune & Deploy

How to Measure Success

Monitoring Dashboard

Alert Thresholds

Deployment Roadmap

Deploy & Tune

A/B Test Setup

Measure Impact

Full Rollout

Retrain & Improve

A predictive modelfor a food chainto predict trafficon their recipe menu

The Business Problem

Business Goal

Expected Impact

Current Process

Our Solution

Understanding the Dataset

What Makes a Recipe Popular?

High-Traffic vs Low-Traffic Recipes: Nutritional Differences

Two Models Tested

⚙ Threshold Tuning

Performance & Business Impact

vs Random Selection

Estimated Revenue Impact

Top 30 Predicted High-Traffic Recipes

Libraries, Models & Frameworks

Analysis Pipeline

Load Data

Validate

Clean

EDA

Split

Train

Evaluate

Tune & Deploy

How to Measure Success

Monitoring Dashboard

Alert Thresholds

Deployment Roadmap

Deploy & Tune

A/B Test Setup

Measure Impact

Full Rollout

Retrain & Improve

A predictive model
for a food chain
to predict traffic
on their recipe menu