{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n# Parkinson's Disease Progression Modeling with Leaspy\n\nThis example demonstrates how to use Leaspy to model the progression of Parkinson's disease using synthetic data.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The following imports bring in the required modules and load the synthetic dataset from Leaspy.\nThe dataset contains repeated measurements for multiple subjects over time.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "from leaspy.datasets import load_dataset\nfrom leaspy.io.data import Data\n\ndf = load_dataset(\"parkinson\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The first few rows of the dataset provide an overview of its structure.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "df.head()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The total number of unique subjects present in the dataset is shown below.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "n_subjects = df.index.get_level_values(\"ID\").unique().shape[0]\nprint(f\"{n_subjects} subjects in the dataset.\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The dataset is separated into a training set and a test set.\nThe first portion of the data is used for training and the remaining portion for testing.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "df_train = df.loc[:\"GS-160\"][[\"MDS1_total\", \"MDS2_total\", \"MDS3_off_total\"]]\ndf_test = df.loc[\"GS-161\":][[\"MDS1_total\", \"MDS2_total\", \"MDS3_off_total\"]]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The pandas DataFrames are converted to Leaspy `Data` objects for further modeling.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "data_train = Data.from_dataframe(df_train)\ndata_test = Data.from_dataframe(df_test)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The logistic model is imported and initialized.\nA two-dimensional source space is chosen to represent disease progression trajectories.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "from leaspy.models import LogisticModel\n\nmodel = LogisticModel(name=\"test-model\", source_dimension=2)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Visualization utilities from Leaspy and Matplotlib are imported.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import matplotlib.pyplot as plt\nfrom leaspy.io.logs.visualization.plotting import Plotting\n\nleaspy_plotting = Plotting(model)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Data that will be used to fit the model can be illustrated as follows:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "ax = leaspy_plotting.patient_observations(data_train, alpha=0.7, figsize=(14, 6))\nax.set_ylim(0, 0.8)  # The y-axis is adjusted for better visibility.\nplt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The model is fitted to the training data using the MCMC-SAEM algorithm.\nA fixed seed is used for reproducibility and 100 iterations are performed.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "model.fit(\n    data_train,\n    \"mcmc_saem\",\n    seed=0,\n    n_iter=100,\n    progress_bar=False,\n)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The average trajectory estimated by the model is displayed.\nThis figure shows the mean disease progression curves for all features.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "ax = leaspy_plotting.average_trajectory(\n    alpha=1, figsize=(14, 6), n_std_left=2, n_std_right=8\n)\nplt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Individual parameters are obtained for the test data using the personalization step.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "ip = model.personalize(data_test, \"scipy_minimize\", seed=0, progress_bar=False, use_jacobian=False)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The test data with individually re-parametrized ages is plotted below.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "ax = leaspy_plotting.patient_observations_reparametrized(\n    data_test, ip, alpha=0.7, linestyle=\"-\", figsize=(14, 6)\n)\nplt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The test data with the true ages (without re-parametrization) is plotted below.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "ax = leaspy_plotting.patient_observations(\n    data_test,\n    alpha=0.7,\n    linestyle=\"-\",\n    figsize=(14, 6),\n)\nplt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Observations for a specific subject are extracted to demonstrate reconstruction.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import numpy as np\n\nobservations = df_test.loc[\"GS-187\"]\nprint(f\"Seen ages: {observations.index.values}\")\nprint(\"Individual Parameters : \", ip[\"GS-187\"])\n\ntimepoints = np.linspace(60, 100, 100)\nreconstruction = model.estimate({\"GS-187\": timepoints}, ip)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The reconstructed trajectory along with the actual observations for selected subjects is displayed.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "ax = leaspy_plotting.patient_trajectories(\n    data_test,\n    ip,\n    patients_idx=[\"GS-187\"],\n    labels=[\"MDS1\", \"MDS2\", \"MDS3 (off)\"],\n    figsize=(16, 6),\n    factor_future=5,\n)\nax.set_xlim(45, 120)\nplt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "This concludes the Parkinson's disease progression modeling example using Leaspy.\nLeaspy is also capable of handling various other types of models, as the Joint Models,\nwhich will be explored in the [next section](./plot_03_joint).\n"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.11.12"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}