From 04e41e8d16855788116535293b75a8526e19c7b4 Mon Sep 17 00:00:00 2001
From: DorelyMS <38868912+DorelyMS@users.noreply.github.com>
Date: Fri, 11 Jul 2025 13:57:32 -0600
Subject: [PATCH] Test 1, 5 channels
---
Copia_de_Meridian_Getting_Started.ipynb | 725 ++++++++++++++++++++++++
1 file changed, 725 insertions(+)
create mode 100644 Copia_de_Meridian_Getting_Started.ipynb
diff --git a/Copia_de_Meridian_Getting_Started.ipynb b/Copia_de_Meridian_Getting_Started.ipynb
new file mode 100644
index 0000000..41256d7
--- /dev/null
+++ b/Copia_de_Meridian_Getting_Started.ipynb
@@ -0,0 +1,725 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "view-in-github",
+ "colab_type": "text"
+ },
+ "source": [
+ ""
+ ]
+ },
+ {
+ "metadata": {
+ "id": "yuQtvbG_vILv"
+ },
+ "cell_type": "markdown",
+ "source": [
+ ""
+ ]
+ },
+ {
+ "metadata": {
+ "id": "KqSiFABximWU"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "# **Introduction to Meridian Demo**"
+ ]
+ },
+ {
+ "metadata": {
+ "id": "ckR-pavwis-Q"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "Welcome to the Meridian end-to-end demo. This simplified demo showcases the fundamental functionalities and basic usage of the library, including working examples of the major modeling steps:\n",
+ "\n",
+ "\n",
+ "\n",
+ "- Install\n",
+ "- Load the data\n",
+ "- Configure the model\n",
+ "- Run model diagnostics\n",
+ "- Generate model results & two-page output\n",
+ "- Run budget optimization & two-page output\n",
+ "- Save the model object\n",
+ "\n",
+ "\n",
+ "Note that this notebook skips all of the exploratory data analysis and preprocessing steps. It assumes that you have completed these tasks before reaching this point in the demo.\n",
+ "\n",
+ "This notebook utilizes sample data. As a result, the numbers and results obtained might not accurately reflect what you encounter when working with a real dataset."
+ ]
+ },
+ {
+ "metadata": {
+ "id": "GicRPam0mUhF"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "\n",
+ "## Step 0: Install"
+ ]
+ },
+ {
+ "metadata": {
+ "id": "pDdX9WofM2fx"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "1\\. Make sure you are using one of the available GPU Colab runtimes, which is **required** to run Meridian. You can change your notebook's runtime under `Runtime > Change runtime type` in the menu. All users can use the T4 GPU runtime free of charge, and it is sufficient to run the demo colab. Users who have purchased one of Colab's paid plans have access to premium GPUs (such as the V100, A100, or L4 Nvidia GPUs)."
+ ]
+ },
+ {
+ "metadata": {
+ "id": "nFYRTDuesa1P"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "2\\. Install the latest version of Meridian, and verify that a GPU is available."
+ ]
+ },
+ {
+ "metadata": {
+ "id": "h1jAk386jF3k"
+ },
+ "cell_type": "code",
+ "source": [
+ "# Install meridian: from PyPI @ latest release\n",
+ "!pip install --upgrade google-meridian[colab,and-cuda]\n",
+ "\n",
+ "# Install meridian: from PyPI @ specific version\n",
+ "# !pip install google-meridian[colab,and-cuda]==1.1.1\n",
+ "\n",
+ "# Install meridian: from GitHub @HEAD\n",
+ "# !pip install --upgrade \"google-meridian[colab,and-cuda] @ git+https://github.com/google/meridian.git@main\""
+ ],
+ "outputs": [],
+ "execution_count": null
+ },
+ {
+ "metadata": {
+ "id": "Fhwt1wzgLwpZ"
+ },
+ "cell_type": "code",
+ "source": [
+ "import arviz as az\n",
+ "import IPython\n",
+ "from meridian import constants\n",
+ "from meridian.analysis import analyzer\n",
+ "from meridian.analysis import formatter\n",
+ "from meridian.analysis import optimizer\n",
+ "from meridian.analysis import summarizer\n",
+ "from meridian.analysis import visualizer\n",
+ "from meridian.data import data_frame_input_data_builder\n",
+ "from meridian.data import test_utils\n",
+ "from meridian.model import model\n",
+ "from meridian.model import prior_distribution\n",
+ "from meridian.model import spec\n",
+ "import numpy as np\n",
+ "import pandas as pd\n",
+ "# check if GPU is available\n",
+ "from psutil import virtual_memory\n",
+ "import tensorflow as tf\n",
+ "import tensorflow_probability as tfp\n",
+ "\n",
+ "ram_gb = virtual_memory().total / 1e9\n",
+ "print('Your runtime has {:.1f} gigabytes of total RAM\\n'.format(ram_gb))\n",
+ "print(\n",
+ " 'Num GPUs Available: ',\n",
+ " len(tf.config.experimental.list_physical_devices('GPU')),\n",
+ ")\n",
+ "print(\n",
+ " 'Num CPUs Available: ',\n",
+ " len(tf.config.experimental.list_physical_devices('CPU')),\n",
+ ")"
+ ],
+ "outputs": [],
+ "execution_count": null
+ },
+ {
+ "metadata": {
+ "id": "kiM0UrN6qbIP"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "\n",
+ "## Step 1: Load the data"
+ ]
+ },
+ {
+ "metadata": {
+ "id": "z18Mo-22x0lY"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "Load the [simulated dataset in CSV format](https://github.com/google/meridian/blob/main/meridian/data/simulated_data/csv/geo_all_channels.csv) as follows."
+ ]
+ },
+ {
+ "metadata": {
+ "id": "tZd-ik8NbjK6"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "1\\. Read the data into a Pandas DataFrame."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "from google.colab import drive\n",
+ "\n",
+ "drive.mount('/content/drive/')\n",
+ "\n",
+ "filepath = '/content/drive/MyDrive/testmeridian'\n",
+ "\n"
+ ],
+ "metadata": {
+ "id": "cAp1CZZfjOh7"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "df=pd.read_csv(\n",
+ " \"/content/drive/MyDrive/testmeridian/Simulated_Data.csv\"\n",
+ ")\n",
+ "df.head()"
+ ],
+ "metadata": {
+ "id": "OgTeru20jwDB"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "df.info()"
+ ],
+ "metadata": {
+ "id": "2CBdTS6dkuJT"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "df.describe()"
+ ],
+ "metadata": {
+ "id": "PFK5W14Ek8bt"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "corrmatrix=df.select_dtypes(include=np.number).corr()\n",
+ "corrmatrix.sort_values(by='conversions', ascending=False)"
+ ],
+ "metadata": {
+ "id": "ziuuC3f8noQK"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [],
+ "metadata": {
+ "id": "FDZv1MMYrwOS"
+ }
+ },
+ {
+ "metadata": {
+ "id": "7sV1ChiEYuyD"
+ },
+ "cell_type": "code",
+ "source": [
+ "#df = pd.read_csv(\n",
+ "# \"https://raw.githubusercontent.com/google/meridian/refs/heads/main/meridian/data/simulated_data/csv/geo_all_channels.csv\"\n",
+ "#)"
+ ],
+ "outputs": [],
+ "execution_count": null
+ },
+ {
+ "metadata": {
+ "id": "8JBDZzl80BrY"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "2\\. Create a DataFrameInputDataBuilder instance."
+ ]
+ },
+ {
+ "metadata": {
+ "id": "4qdTSk4a0znn"
+ },
+ "cell_type": "code",
+ "source": [
+ "builder = data_frame_input_data_builder.DataFrameInputDataBuilder(\n",
+ " kpi_type='non_revenue'\n",
+ ")"
+ ],
+ "outputs": [],
+ "execution_count": null
+ },
+ {
+ "metadata": {
+ "id": "LNr75vQL1Zru"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "3\\. Offer the components to the builder. Note that the components may be offered all at once or piecewise."
+ ]
+ },
+ {
+ "metadata": {
+ "id": "udaLGvwl1U8B"
+ },
+ "cell_type": "code",
+ "source": [
+ "builder = (\n",
+ " builder.with_kpi(df, kpi_col=\"conversions\")\n",
+ " .with_revenue_per_kpi(df, revenue_per_kpi_col=\"revenue_per_conversion\")\n",
+ " .with_population(df)\n",
+ " .with_controls(\n",
+ " df, control_cols=[\"Competitor_Discount\", \"GQV\", \"Geo_GDP\"],\n",
+ " time_col=\"time\",\n",
+ " geo_col=\"geo\"\n",
+ " )\n",
+ ")\n",
+ "\n",
+ "channels = [\"Channel0\", \"Channel1\", \"Channel2\", \"Channel3\", \"Channel4\", \"Channel5\"]\n",
+ "builder = builder.with_media(\n",
+ " df,\n",
+ " media_cols=[f\"{channel}_impression\" for channel in channels],\n",
+ " media_spend_cols=[f\"{channel}_spend\" for channel in channels],\n",
+ " media_channels=channels,\n",
+ ")\n",
+ "\n",
+ "data = builder.build()"
+ ],
+ "outputs": [],
+ "execution_count": null
+ },
+ {
+ "metadata": {
+ "id": "DlF5vs8vb8Wn"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "Note that the simulated data here does not contain reach and frequency. We recommend including reach and frequency data whenever they are available. For information about the advantages of utilizing reach and frequency, see [Bayesian Hierarchical Media Mix Model Incorporating Reach and Frequency Data](https://research.google/pubs/bayesian-hierarchical-media-mix-model-incorporating-reach-and-frequency-data/#:~:text=By%20incorporating%20R%26F%20into%20MMM,based%20on%20optimal%20frequency%20recommendations.). For a code snippet that loads reach and frequency data, see [Load geo-level data with reach and frequency](https://developers.google.com/meridian/docs/user-guide/load-geo-data-with-rf).\n",
+ "\n",
+ "The documentation provides guidance for instances where reach and frequency data is accessible for specific channels. Additionally, for information about how to load other data types and formats, including data with reach and frequency, see [Supported data types and formats](https://developers.google.com/meridian/docs/user-guide/supported-data-types-formats)."
+ ]
+ },
+ {
+ "metadata": {
+ "id": "FO6pDd6f2V1L"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "\n",
+ "## Step 2: Configure the model"
+ ]
+ },
+ {
+ "metadata": {
+ "id": "a_mQI7HzxxK4"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "Meridian uses a Bayesian framework and Markov chain Monte Carlo (MCMC) algorithms to sample from the posterior distribution.\n",
+ "\n",
+ "1\\. Initialize the `Meridian` class by passing the loaded data and the customized model specification. One advantage of Meridian lies in its capacity to calibrate the model directly through ROI priors, as described in [Media Mix Model Calibration With Bayesian Priors](https://research.google/pubs/media-mix-model-calibration-with-bayesian-priors/). In this example, every media channel shares the same ROI prior, Lognormal(0.2, 0.9)."
+ ]
+ },
+ {
+ "metadata": {
+ "id": "8XNDd7HX1qTn"
+ },
+ "cell_type": "code",
+ "source": [
+ "roi_mu = 0.2 # Mu for ROI prior for each media channel.\n",
+ "roi_sigma = 0.9 # Sigma for ROI prior for each media channel.\n",
+ "prior = prior_distribution.PriorDistribution(\n",
+ " roi_m=tfp.distributions.LogNormal(roi_mu, roi_sigma, name=constants.ROI_M)\n",
+ ")\n",
+ "model_spec = spec.ModelSpec(prior=prior)\n",
+ "\n",
+ "mmm = model.Meridian(input_data=data, model_spec=model_spec)"
+ ],
+ "outputs": [],
+ "execution_count": null
+ },
+ {
+ "metadata": {
+ "id": "kPQBPlX8cmEv"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "2\\. Use the `sample_prior()` and `sample_posterior()` methods to obtain samples from the prior and posterior distributions of model parameters. If you are using the T4 GPU runtime, this step may take about 10 minutes for the provided data set."
+ ]
+ },
+ {
+ "metadata": {
+ "id": "KVB3avRdcRNz"
+ },
+ "cell_type": "code",
+ "source": [
+ "%%time\n",
+ "mmm.sample_prior(200) #mmm.sample_prior(500)\n",
+ "mmm.sample_posterior(\n",
+ " n_chains=5, n_adapt=2000, n_burnin=500, n_keep=1000, seed=1 # n_chains=10, n_adapt=2000, n_burnin=500, n_keep=1000, seed=1\n",
+ ")"
+ ],
+ "outputs": [],
+ "execution_count": null
+ },
+ {
+ "metadata": {
+ "id": "5WUM2V26cspo"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "For more information about configuring the parameters and using a customized model specification, such as setting different ROI priors for each media channel, see [Configure the model](https://developers.google.com/meridian/docs/user-guide/configure-model)."
+ ]
+ },
+ {
+ "metadata": {
+ "id": "t9oECJwUdJTm"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "\n",
+ "## Step 3: Run model diagnostics"
+ ]
+ },
+ {
+ "metadata": {
+ "id": "kSzK6JeMxrV6"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "After the model is built, you must assess convergence, debug the model if needed, and then assess the model fit.\n",
+ "\n",
+ "1\\. Assess convergence. Run the following code to generate R-hat statistics. R-hat values close to 1.0 indicate convergence. R-hat < 1.2 indicates approximate convergence and is a reasonable threshold for many problems."
+ ]
+ },
+ {
+ "metadata": {
+ "id": "rFuc7B86yLvM"
+ },
+ "cell_type": "code",
+ "source": [
+ "model_diagnostics = visualizer.ModelDiagnostics(mmm)\n",
+ "model_diagnostics.plot_rhat_boxplot()"
+ ],
+ "outputs": [],
+ "execution_count": null
+ },
+ {
+ "metadata": {
+ "id": "nCwt5SGYxlaE"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "2\\. Assess the model's fit by comparing the expected sales against the actual sales."
+ ]
+ },
+ {
+ "metadata": {
+ "id": "7Z4zJtHyyhif"
+ },
+ "cell_type": "code",
+ "source": [
+ "model_fit = visualizer.ModelFit(mmm)\n",
+ "model_fit.plot_model_fit()"
+ ],
+ "outputs": [],
+ "execution_count": null
+ },
+ {
+ "metadata": {
+ "id": "76IBQcWLu980"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "For more information and additional model diagnostics checks, see [Modeling diagnostics](https://developers.google.com/meridian/docs/user-guide/model-diagnostics)."
+ ]
+ },
+ {
+ "metadata": {
+ "id": "zGUOFFbCdOtl"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "\n",
+ "## Step 4: Generate model results & two-page output"
+ ]
+ },
+ {
+ "metadata": {
+ "id": "puHjkyvZEOEg"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "To export the two-page HTML summary output, initialize the `Summarizer` class with the model object. Then pass in the filename, filepath, start date, and end date to `output_model_results_summary` to run the summary for that time duration and save it to the specified file."
+ ]
+ },
+ {
+ "metadata": {
+ "id": "keOpq1qKNbq0"
+ },
+ "cell_type": "code",
+ "source": [
+ "mmm_summarizer = summarizer.Summarizer(mmm)"
+ ],
+ "outputs": [],
+ "execution_count": null
+ },
+ {
+ "metadata": {
+ "id": "Ltr4uP80YQe7"
+ },
+ "cell_type": "code",
+ "source": [
+ "from google.colab import drive\n",
+ "\n",
+ "drive.mount('/content/drive')"
+ ],
+ "outputs": [],
+ "execution_count": null
+ },
+ {
+ "metadata": {
+ "id": "qbgNaDYpIfQl"
+ },
+ "cell_type": "code",
+ "source": [
+ "filepath = '/content/drive/MyDrive/testmeridian'\n",
+ "start_date = '2021-01-25'\n",
+ "end_date = '2024-01-15'\n",
+ "mmm_summarizer.output_model_results_summary(\n",
+ " 'summary_output.html', filepath, start_date, end_date\n",
+ ")"
+ ],
+ "outputs": [],
+ "execution_count": null
+ },
+ {
+ "metadata": {
+ "id": "j9sBxuvidmr8"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "Here is a preview of the two-page output based on the simulated data:"
+ ]
+ },
+ {
+ "metadata": {
+ "id": "vaUe7uZRfJPm"
+ },
+ "cell_type": "code",
+ "source": [
+ "IPython.display.HTML(filename='/content/drive/MyDrive/testmeridian/summary_output.html')"
+ ],
+ "outputs": [],
+ "execution_count": null
+ },
+ {
+ "metadata": {
+ "id": "PphWMfKdwPIw"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "For a customized two-page report, model results summary table, and individual visualizations, see [Model results report](https://developers.google.com/meridian/docs/user-guide/generate-model-results-report) and [plot media visualizations](https://developers.google.com/meridian/docs/user-guide/plot-media-visualizations).\n",
+ "\n",
+ "\n",
+ "\n"
+ ]
+ },
+ {
+ "metadata": {
+ "id": "msqwz2MN5mTq"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "\n",
+ "## Step 5: Run budget optimization & generate an optimization report"
+ ]
+ },
+ {
+ "metadata": {
+ "id": "khCL6Q2sS-iy"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "You can choose which scenario to run for the budget allocation. In the default scenario, you find the optimal allocation across channels for a given budget to maximize the return on investment (ROI).\n",
+ "\n",
+ "1\\. Instantiate the `BudgetOptimizer` class and run the `optimize()` method without any customization to run the library's default Fixed Budget Scenario, which maximizes ROI."
+ ]
+ },
+ {
+ "metadata": {
+ "id": "38lhqyLvHf51"
+ },
+ "cell_type": "code",
+ "source": [
+ "%%time\n",
+ "budget_optimizer = optimizer.BudgetOptimizer(mmm)\n",
+ "optimization_results = budget_optimizer.optimize()"
+ ],
+ "outputs": [],
+ "execution_count": null
+ },
+ {
+ "metadata": {
+ "id": "fLOMqDmCRKRO"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "2\\. Export the two-page HTML optimization report, which contains optimized spend allocations and ROI."
+ ]
+ },
+ {
+ "metadata": {
+ "id": "at7V7YEh_zwZ"
+ },
+ "cell_type": "code",
+ "source": [
+ "filepath = '/content/drive/MyDrive/testmeridian'\n",
+ "optimization_results.output_optimization_summary(\n",
+ " 'optimization_output.html', filepath\n",
+ ")"
+ ],
+ "outputs": [],
+ "execution_count": null
+ },
+ {
+ "metadata": {
+ "id": "jq_mcrj1STDU"
+ },
+ "cell_type": "code",
+ "source": [
+ "IPython.display.HTML(filename='/content/drive/MyDrive/testmeridian/optimization_output.html')"
+ ],
+ "outputs": [],
+ "execution_count": null
+ },
+ {
+ "metadata": {
+ "id": "kIWTubaN0RKC"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "For information about customized optimization scenarios, such as flexible budget scenarios, see [Budget optimization scenarios](https://developers.google.com/meridian/docs/user-guide/budget-optimization-scenarios). For more information about optimization results summary and individual visualizations, see [optimization results output](https://developers.google.com/meridian/docs/user-guide/generate-optimization-results-output) and [optimization visualizations](https://developers.google.com/meridian/docs/user-guide/plot-optimization-visualizations)."
+ ]
+ },
+ {
+ "metadata": {
+ "id": "3m98O3a_TrVg"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "\n",
+ "## Step 6: Save the model object"
+ ]
+ },
+ {
+ "metadata": {
+ "id": "2Zjh64YG8Dti"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "We recommend that you save the model object for future use. This helps you to avoid repetitive model runs and saves time and computational resources. After the model object is saved, you can load it at a later stage to continue the analysis or visualizations without having to re-run the model.\n"
+ ]
+ },
+ {
+ "metadata": {
+ "id": "1kamZpyv8KMh"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "Run the following code to save the model object:"
+ ]
+ },
+ {
+ "metadata": {
+ "id": "FfaQQ8-fTw0K"
+ },
+ "cell_type": "code",
+ "source": [
+ "file_path = '/content/drive/MyDrive/testmeridian/saved_mmm.pkl'\n",
+ "model.save_mmm(mmm, file_path)"
+ ],
+ "outputs": [],
+ "execution_count": null
+ },
+ {
+ "metadata": {
+ "id": "k2v_s2uS8PgA"
+ },
+ "cell_type": "markdown",
+ "source": [
+ "Run the following code to load the saved model:"
+ ]
+ },
+ {
+ "metadata": {
+ "id": "ZGUmiYI48epA"
+ },
+ "cell_type": "code",
+ "source": [
+ "mmm = model.load_mmm(file_path)"
+ ],
+ "outputs": [],
+ "execution_count": null
+ },
+ {
+ "cell_type": "code",
+ "source": [],
+ "metadata": {
+ "id": "kVxCmuIEqOgd"
+ },
+ "execution_count": null,
+ "outputs": []
+ }
+ ],
+ "metadata": {
+ "accelerator": "GPU",
+ "colab": {
+ "gpuType": "L4",
+ "private_outputs": true,
+ "provenance": [],
+ "machine_shape": "hm",
+ "include_colab_link": true
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
+ },
+ "language_info": {
+ "name": "python"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
\ No newline at end of file