Connecting Data with Google Meridian for MMM Using Windsor.ai [Python Tutorial]

Google Meridian is an open-source library developed by Google to implement Bayesian Marketing Mix Modeling (MMM). It helps marketers understand the effectiveness of their advertising channels by estimating how different media inputs (like ad spend or impressions) impact a key business metric (KPI), such as conversions or revenue.
Meridian utilizes probabilistic modeling to provide insights into return on investment (ROI), channel attribution, and forecasted outcomes, enabling businesses to make data-driven decisions about budget allocation.
The Windsor.ai data integration platform lets you automatically fetch, transform, and prepare your marketing data for use with Meridian. With just a few API calls, Windsor.ai connectors aggregate data from various ad platforms so that it can feed directly into the MMM pipeline in Meridian.
How to use Windsor.ai data with Google Meridian for MMM
This tutorial explains how to connect Windsor.ai’s marketing data with Google’s Meridian library to build a Marketing Mix Model (MMM).
You will learn how to:
- Collect and fetch marketing data from Windsor.ai
- Prepare the data in the proper format for Meridian
- Configure the Meridian model
- Fit the model to analyze your marketing performance
- Generate a summary report
Prerequisites:
- Python 3.11+
- Libraries: tensorflow, tensorflow_probability, psutil, pandas, requests, and meridian
- A Windsor.ai API URL with key, date_preset, and fields
- Knowledge of Google Meridian MMM
How to run the source code
- Install the required libraries using the requirements.txt file.
pip install -r requirements.txt
- Run the model.py file.
python model.py
- Now, the model will start and ask for inputs for the following parameters:
- API Key: This is the API key for your Windsor.ai data (find it in your app dashboard).
- KPI: Revenue or Conversions. Enter 1 for Revenue and 0 for Conversions.
- Conversion Events: If you select conversions as KPI, it will ask for the events that should be considered as conversions. You’ll see a list of available events; just enter the necessary events (comma-separated).
- After the model execution is completed, raw_data.csv, processed_data.csv, and summary_output.html files will be created.
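The prompts above can be handled in several ways. The following is only a minimal sketch of how such answers might be parsed; the function names parse_kpi_choice and parse_events are hypothetical and are not taken from model.py.

```python
def parse_kpi_choice(raw: str) -> str:
    """Map the prompt answer to a KPI name: '1' -> revenue, '0' -> conversions."""
    choice = raw.strip()
    if choice == "1":
        return "revenue"
    if choice == "0":
        return "conversions"
    raise ValueError(f"Expected 1 or 0, got {raw!r}")


def parse_events(raw: str, available: list[str]) -> list[str]:
    """Split a comma-separated answer and reject event names that are not available."""
    requested = [name.strip() for name in raw.split(",") if name.strip()]
    unknown = [name for name in requested if name not in available]
    if unknown:
        raise ValueError(f"Unknown conversion events: {unknown}")
    return requested


if __name__ == "__main__":
    # Hypothetical event names, used only for illustration.
    print(parse_kpi_choice("0"))
    print(parse_events("purchase, signup", ["purchase", "signup", "lead"]))
```

Rejecting unknown event names early avoids fitting a model on an empty or misspelled conversion column.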
Step 1: Fetch data from Windsor.ai
You can get the Windsor.ai API URL for your data from the dashboard and use it to fetch daily performance metrics across your marketing channels.
These metrics typically include:
- Date
- Clicks
- Spend
- Impressions
- Conversions
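As a rough illustration of how the request URL could be assembled, the sketch below joins a base URL with the key, date_preset, and fields parameters mentioned in the prerequisites. The helper name build_windsor_url, the query parameter name api_key, and the placeholder base URL are assumptions; copy the real URL from your Windsor.ai dashboard.

```python
from urllib.parse import urlencode


def build_windsor_url(base_url: str, api_key: str, date_preset: str, fields: list[str]) -> str:
    """Assemble a request URL from its parts (parameter names are assumptions)."""
    query = urlencode(
        {
            "api_key": api_key,
            "date_preset": date_preset,
            "fields": ",".join(fields),
        },
        safe=",",  # keep the comma-separated field list readable
    )
    return f"{base_url}?{query}"


if __name__ == "__main__":
    # Placeholder base URL and key, not a real endpoint.
    url = build_windsor_url(
        "https://example.invalid/all",
        "YOUR_API_KEY",
        "last_30d",
        ["date", "clicks", "spend", "impressions", "conversions"],
    )
    print(url)
```

The resulting URL can then be passed to requests.get(url, timeout=30); the exact shape of the response is not shown here because it depends on your account and selected fields.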
Important:
Aggregate your data by date (daily frequency), and fill any missing values with zero to ensure compatibility with Meridian.
In our example, we use data from Google, Facebook, Bing, and Reddit Ads, and the key performance indicator (KPI) we model is conversions in relation to ad spend.
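The aggregation and zero-filling described above could look roughly like the following pandas sketch. The column names (date, clicks, spend, impressions, conversions) and the helper name to_daily_frame are assumptions for illustration; the actual preprocessing in data.py may differ.

```python
import pandas as pd

METRICS = ["clicks", "spend", "impressions", "conversions"]


def to_daily_frame(rows: pd.DataFrame) -> pd.DataFrame:
    """Sum metrics per day and insert zero rows for days with no data."""
    df = rows.copy()
    df["date"] = pd.to_datetime(df["date"])
    # Sum every metric per calendar day (missing values count as 0 in the sum).
    daily = df.groupby("date")[METRICS].sum()
    # Reindex onto a gap-free daily calendar; days without rows become 0.
    full_range = pd.date_range(daily.index.min(), daily.index.max(), freq="D")
    daily = daily.reindex(full_range, fill_value=0)
    daily.index.name = "date"
    return daily.reset_index()


if __name__ == "__main__":
    # Made-up rows, used only to show the shape of the result.
    raw = pd.DataFrame(
        {
            "date": ["2024-01-01", "2024-01-01", "2024-01-03"],
            "clicks": [10, 5, 7],
            "spend": [2.0, 1.5, 3.0],
            "impressions": [100, 50, 70],
            "conversions": [1, 0, 2],
        }
    )
    print(to_daily_frame(raw))
```

In this made-up input, 2024-01-02 has no rows, so the output contains a row of zeros for that day rather than a gap.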
Step 2: Set up Meridian input format
To use your Windsor data with Meridian, you need to do the following things:
- Define how the columns in your dataset map to what Meridian expects (e.g., date, KPI, media spend, etc.).
- Indicate the media channels and their corresponding spend columns.
- Load the data into Meridian using this script:
Note: The getDataFromWindsor() function in the data.py file returns an object that contains the data used by the Meridian model.
The key values in the returned object are:
- data: The processed data prepared for the MMM.
- kpi: Your selected KPI (revenue or conversions).
- media: The list of columns in the data to be used as media for the model, e.g., ['bing_clicks', 'facebook_clicks'], depending on the media channels you have in the data.
- media_spend: The list of columns in the data to be used as media spend, e.g., ['bing_spend', 'facebook_spend'], depending on the media channels you have in the data.
- media_to_channel: An object that maps your media columns to the correct channels, e.g., {"facebook_clicks": "Facebook", "bing_clicks": "Bing"}, depending on the media channels you have in the data.
- media_spend_to_channel: An object that maps your media spend columns to the correct channels, e.g., {"facebook_spend": "Facebook", "bing_spend": "Bing"}, depending on the media channels you have in the data.
- controls: The list of columns in the data to be used as controls in the Meridian MMM, e.g., ['gqv'].
- start_date: Start date of the data.
- end_date: End date of the data.
from meridian.data import load
from data import getDataFromWindsor  # helper defined in data.py

# -------------------------------
# STEP 1: Load Data (from Windsor.ai in this case)
# -------------------------------
# This function returns the processed CSV data, media, media spend, the mappings of media
# and media spend to channels, the start and end dates of the data, and the KPI.
# The data has daily values for clicks, spend, impressions and conversions.
model_data = getDataFromWindsor()
print("Media Channels:", model_data['media'])
csv_data = model_data['data']

# -------------------------------
# STEP 2: Define Mapping from Your CSV Columns to Meridian's Format
# -------------------------------
# Coordinate-to-columns mapping: tells Meridian how to interpret the input columns.
# Replace the column names with your own if you're using a different dataset.
coord_to_columns = load.CoordToColumns(
    time="date",                            # Timestamp column
    geo="geo",                              # Optional: region/location (if available, else keep as None or a dummy column)
    controls=model_data['controls'],        # Any external control variables (economic indicators, organic clicks, etc.)
    kpi=model_data['kpi'],                  # Your Key Performance Indicator (target variable for prediction)
    revenue_per_kpi=None,                   # Optional: revenue generated per conversion (for revenue models)
    media=model_data['media'],              # Media signals (e.g., impressions, clicks)
    media_spend=model_data['media_spend'],  # Spend per media channel
)

# Maps each media column to its corresponding media channel (important for attribution and budgeting)
correct_media_to_channel = model_data['media_to_channel']

# Maps each spend column to the correct media channel
correct_media_spend_to_channel = model_data['media_spend_to_channel']

# -------------------------------
# STEP 3: Load Data into Meridian
# -------------------------------
# This loads and prepares the dataset based on the mappings and parameters defined above.
loader = load.CsvDataLoader(
    csv_path=csv_data,       # CSV data from Windsor or your source
    kpi_type='non_revenue',  # Use 'revenue' if you provide revenue_per_kpi
    coord_to_columns=coord_to_columns,
    media_to_channel=correct_media_to_channel,
    media_spend_to_channel=correct_media_spend_to_channel,
)
data = loader.load()
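Before handing the object to the loader, it can help to confirm that the media and spend mappings agree with each other. The sketch below is one possible sanity check; check_model_data is a hypothetical helper, not part of Meridian or the Windsor.ai source code, and the sample dictionary only mirrors the keys described above.

```python
def check_model_data(model_data: dict) -> list[str]:
    """Return a list of problems; an empty list means the mappings line up."""
    problems = []
    media_map = model_data["media_to_channel"]
    spend_map = model_data["media_spend_to_channel"]

    # Every media and spend column should have a channel assigned.
    for column in model_data["media"]:
        if column not in media_map:
            problems.append(f"media column {column!r} has no channel mapping")
    for column in model_data["media_spend"]:
        if column not in spend_map:
            problems.append(f"spend column {column!r} has no channel mapping")

    # Both mappings should cover the same set of channels.
    media_channels = set(media_map.values())
    spend_channels = set(spend_map.values())
    if media_channels != spend_channels:
        mismatch = sorted(media_channels ^ spend_channels)
        problems.append(f"channels present in only one mapping: {mismatch}")
    return problems


if __name__ == "__main__":
    # Illustrative values only.
    sample = {
        "media": ["bing_clicks", "facebook_clicks"],
        "media_spend": ["bing_spend", "facebook_spend"],
        "media_to_channel": {"bing_clicks": "Bing", "facebook_clicks": "Facebook"},
        "media_spend_to_channel": {"bing_spend": "Bing", "facebook_spend": "Facebook"},
    }
    print(check_model_data(sample))
```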
Important note:
For more optimized results:
- Consider adding control variables like:
- Seasonality and Time Effects
- Economic and Market Indicators
- Competitor & Market Activity
- Product or Price Changes
- Organic Demand Proxies
- Consider adding organic media and non-media treatments (promo, etc.)
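As one example of the seasonality and time-effect controls listed above, the sketch below derives an annual sine/cosine pair and a weekend flag from the date column. The column names season_sin, season_cos, and is_weekend and the helper add_seasonality_controls are illustrative assumptions; which controls are appropriate depends on your business.

```python
import numpy as np
import pandas as pd


def add_seasonality_controls(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with simple calendar-based control columns added."""
    out = df.copy()
    dates = pd.to_datetime(out["date"])
    # Position within the year expressed as an angle, so late December is close to early January.
    angle = 2 * np.pi * dates.dt.dayofyear / 365.25
    out["season_sin"] = np.sin(angle)
    out["season_cos"] = np.cos(angle)
    # 1 for Saturday/Sunday, 0 otherwise.
    out["is_weekend"] = (dates.dt.dayofweek >= 5).astype(int)
    return out


if __name__ == "__main__":
    frame = pd.DataFrame({"date": ["2024-01-06", "2024-01-08"], "conversions": [3, 5]})
    print(add_seasonality_controls(frame))
```

The new column names would then be added to the controls list that is passed to the column mapping.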
Step 3: Configure model priors
Set prior expectations for ROI using a LogNormal distribution according to your dataset.
import tensorflow_probability as tfp
from meridian import constants
from meridian.model import model, prior_distribution, spec

# ROI prior: used to inform the model of expected returns on ad spend.
# LogNormal is used since ROI is strictly positive.
# Assign roi_mu and roi_sigma values according to your dataset.
roi_mu = 0.2
roi_sigma = 0.9  # Uncertainty/spread in the ROI prior
prior = prior_distribution.PriorDistribution(
    roi_m=tfp.distributions.LogNormal(roi_mu, roi_sigma, name=constants.ROI_M)
)

# Wrap the prior into the model specification
model_spec = spec.ModelSpec(prior=prior)

# Create a Meridian model object using the input data and model specification
mmm = model.Meridian(input_data=data, model_spec=model_spec)

# First, sample from the prior distribution (before seeing data). Good for understanding priors.
# Adjust these values according to your dataset.
mmm.sample_prior(100)  # 100 samples is enough for exploration

# Then, sample from the posterior (after observing the data).
# Smaller values (n_chains=2, n_adapt=100, n_burnin=100) make it faster for small data or experimentation.
mmm.sample_posterior(
    n_chains=2,    # Number of independent chains
    n_adapt=100,   # Adaptation steps for tuning the sampler
    n_burnin=100,  # Number of warm-up iterations (burn-in)
    n_keep=200,    # Number of posterior samples to keep
    seed=1,        # Random seed for reproducibility
)
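To get a feel for what a given roi_mu and roi_sigma imply, you can compute the median, mean, and a central interval of the LogNormal prior directly from its closed-form expressions. This is a standalone sketch using only the Python standard library; lognormal_summary is a hypothetical helper and is not part of Meridian.

```python
import math
from statistics import NormalDist


def lognormal_summary(mu: float, sigma: float, mass: float = 0.9) -> dict:
    """Median, mean and central interval of LogNormal(mu, sigma)."""
    # z-score that leaves (1 - mass) / 2 probability in each tail.
    z = NormalDist().inv_cdf(0.5 + mass / 2)
    return {
        "median": math.exp(mu),
        "mean": math.exp(mu + sigma**2 / 2),
        "lower": math.exp(mu - z * sigma),
        "upper": math.exp(mu + z * sigma),
    }


if __name__ == "__main__":
    for mu, sigma in [(0.2, 0.9), (0.0, 0.5)]:
        summary = lognormal_summary(mu, sigma)
        print(mu, sigma, {k: round(v, 3) for k, v in summary.items()})
```

Comparing the two printed rows shows how a lower mu and a smaller sigma pull the prior toward smaller ROI values and narrow its spread.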
Important note:
If your baseline is negative, the model is attributing more of the KPI to media channels than was actually observed. You can address this by:
- Tightening ROI priors for paid channels – Lower and narrower ROI priors to reduce over-attribution.
- Adding control variables – Include seasonality, organic demand, and economic factors to explain non-media effects.
Step 4: Summarize and export results
After fitting the model, you can generate a full performance summary that covers:
- Channel effectiveness
- ROI estimates
- Attribution breakdowns
from meridian.analysis import summarizer

# Generate an HTML report with parameter summaries, media channel attribution, ROI, and more.
# The date range must fall within the dataset used to fit the model.
mmm_summarizer = summarizer.Summarizer(mmm)
file_path = './'  # Save location for summary report
start_date = model_data['start_date']  # Analysis start date (adjust based on your data)
end_date = model_data['end_date']      # Analysis end date

# Export the model summary to an HTML file
mmm_summarizer.output_model_results_summary('summary_output.html', file_path, start_date, end_date)
The summary helps you understand how well each marketing channel is performing and whether it justifies a major investment.
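If you want to report on a custom window instead of the full dataset, the requested dates need to stay within the data. The sketch below shows one way to clamp a requested window; it assumes dates are ISO strings in YYYY-MM-DD form, and clamp_report_window is a hypothetical helper rather than a Meridian function.

```python
from datetime import date


def clamp_report_window(
    requested_start: str, requested_end: str, data_start: str, data_end: str
) -> tuple[str, str]:
    """Trim a requested date window so it lies inside the dataset's date range."""
    start = max(date.fromisoformat(requested_start), date.fromisoformat(data_start))
    end = min(date.fromisoformat(requested_end), date.fromisoformat(data_end))
    if start > end:
        raise ValueError("Requested window does not overlap the dataset")
    return start.isoformat(), end.isoformat()


if __name__ == "__main__":
    # Illustrative dates only.
    print(clamp_report_window("2023-12-01", "2024-03-01", "2024-01-01", "2024-02-15"))
```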
Conclusion
That’s all it takes to use Windsor.ai data connectors with Google Meridian for marketing mix modeling.
Ready to optimize your marketing spend? Get started with Windsor.ai to automatically connect your data to Google Meridian and unlock valuable insights for smarter decisions!
Explore the source code in our GitHub repository to try it on your own: https://github.com/windsor-ai/WindsorMeridian.