Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
Download

Aulas do Curso de Modelagem matemática IV da FGV-EMAp

3327 views
License: GPL3
default
Kernel: Python 3 (system-wide)

Estimando parâmetros de modelos EDO

Um dos problemas centrais da modelagem é como descobrir os valores "corretos" dos parâmetros do modelo. Já exploramos metodologias de otimização para este fim. Neste notebook vamos tratar este problema como um problema de inferência. Para isso vamos assumir que nossos parâmetros são variáveis aleatórias. Seja θ\theta o conjunto de parâmetros de nosso modelo e q(θ)q(\theta) a distribuição de probabilidade conjunta destes parâmetros. seja M\mathcal{M} o nosso modelo, e ϕ\phi o conjunto das saídas deste modelo: {xi(t)}i=1n\{x_i(t)\}_{i=1}^{n}, onde n é a dimensão do nosso modelo. ϕ=M(θ)\phi=\mathcal{M}(\theta)

Assumir que os parametros θ\theta do modelo são variáveis aleatórias implica que as saídas do modelo também passam a possuir uma distribuição de probabilidade. Vamos resolver este problema de inferência por meio de inferência Bayesiana. Vamos começar por atribuir uma distribuição a priori para θ\theta, que chamaremos de q(θ)q(\theta) que induz uma distribuição sobre as saídas do modelo, q(ϕ)q(\phi). Usaremos a fórmula de Bayes para estimar simultaneamente a distribuição posterior de θ\theta, π(θ)\pi(\theta) e de ϕ\phi, π(ϕ)\pi(\phi).

π(θdados)p(dadosθ)q(θ)\pi(\theta|dados)\propto p(dados|\theta)q(\theta)

Para estimar este modelo precisaremos lançar mão de métodos numéricos como algoritmos de Markov-chain Monte-Carlo (MCMC).

Para saber mais

  1. Coelho et al. (2011) A Bayesian Framework for Parameter Estimation in Dynamical Models

Nesta aula, vamos utilizar a biblioteca PyMC para fazer a estimação de parâmetros de um modelo SIR Para executar este notebook vamos precisar instalar o PyMC

pip install pymc
!pip install pymc
Defaulting to user installation because normal site-packages is not writeable Requirement already satisfied: pymc in /usr/local/lib/python3.10/dist-packages (4.1.7) Requirement already satisfied: scipy>=1.4.1 in /usr/lib/python3/dist-packages (from pymc) (1.8.0) Requirement already satisfied: aeppl==0.0.35 in /usr/local/lib/python3.10/dist-packages (from pymc) (0.0.35) Requirement already satisfied: numpy>=1.15.0 in /usr/local/lib/python3.10/dist-packages (from pymc) (1.23.2) Requirement already satisfied: cloudpickle in /usr/local/lib/python3.10/dist-packages/cloudpickle-2.1.0-py3.10.egg (from pymc) (2.1.0) Requirement already satisfied: pandas>=0.24.0 in /home/fccoelho/.local/lib/python3.10/site-packages (from pymc) (1.4.3) Requirement already satisfied: cachetools>=4.2.1 in /usr/local/lib/python3.10/dist-packages (from pymc) (5.2.0) Requirement already satisfied: aesara==2.8.2 in /usr/local/lib/python3.10/dist-packages (from pymc) (2.8.2) Requirement already satisfied: typing-extensions>=3.7.4 in /usr/local/lib/python3.10/dist-packages (from pymc) (4.3.0) Requirement already satisfied: fastprogress>=0.2.0 in /usr/local/lib/python3.10/dist-packages (from pymc) (1.0.3) Requirement already satisfied: arviz>=0.12.0 in /usr/local/lib/python3.10/dist-packages (from pymc) (0.12.1) Requirement already satisfied: etuples in /usr/local/lib/python3.10/dist-packages (from aesara==2.8.2->pymc) (0.3.7) Requirement already satisfied: logical-unification in /usr/local/lib/python3.10/dist-packages (from aesara==2.8.2->pymc) (0.4.5) Requirement already satisfied: setuptools>=48.0.0 in /usr/lib/python3/dist-packages (from aesara==2.8.2->pymc) (59.6.0) Requirement already satisfied: miniKanren in /usr/local/lib/python3.10/dist-packages (from aesara==2.8.2->pymc) (1.0.3) Requirement already satisfied: cons in /usr/local/lib/python3.10/dist-packages (from aesara==2.8.2->pymc) (0.4.5) Requirement already satisfied: filelock in /usr/lib/python3/dist-packages (from aesara==2.8.2->pymc) (3.6.0) Requirement already satisfied: xarray-einstats>=0.2 in /usr/local/lib/python3.10/dist-packages (from arviz>=0.12.0->pymc) (0.3.0) Requirement already satisfied: xarray>=0.16.1 in /usr/local/lib/python3.10/dist-packages (from arviz>=0.12.0->pymc) (2022.6.0) Requirement already satisfied: packaging in /home/fccoelho/.local/lib/python3.10/site-packages (from arviz>=0.12.0->pymc) (21.3) Requirement already satisfied: matplotlib>=3.0 in /home/fccoelho/.local/lib/python3.10/site-packages (from arviz>=0.12.0->pymc) (3.5.3) Requirement already satisfied: netcdf4 in /usr/local/lib/python3.10/dist-packages (from arviz>=0.12.0->pymc) (1.6.0) Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas>=0.24.0->pymc) (2022.2.1) Requirement already satisfied: python-dateutil>=2.8.1 in /usr/local/lib/python3.10/dist-packages (from pandas>=0.24.0->pymc) (2.8.2) Requirement already satisfied: pyparsing>=2.2.1 in /usr/lib/python3/dist-packages (from matplotlib>=3.0->arviz>=0.12.0->pymc) (2.4.7) Requirement already satisfied: fonttools>=4.22.0 in /usr/lib/python3/dist-packages (from matplotlib>=3.0->arviz>=0.12.0->pymc) (4.29.1) Requirement already satisfied: cycler>=0.10 in /usr/lib/python3/dist-packages (from matplotlib>=3.0->arviz>=0.12.0->pymc) (0.11.0) Requirement already satisfied: kiwisolver>=1.0.1 in /usr/lib/python3/dist-packages (from matplotlib>=3.0->arviz>=0.12.0->pymc) (1.3.2) Requirement already satisfied: pillow>=6.2.0 in /usr/lib/python3/dist-packages (from matplotlib>=3.0->arviz>=0.12.0->pymc) (9.0.1) Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil>=2.8.1->pandas>=0.24.0->pymc) (1.16.0) Requirement already satisfied: toolz in /usr/local/lib/python3.10/dist-packages/toolz-0.12.0-py3.10.egg (from logical-unification->aesara==2.8.2->pymc) (0.12.0) Requirement already satisfied: multipledispatch in /usr/local/lib/python3.10/dist-packages (from logical-unification->aesara==2.8.2->pymc) (0.6.0) Requirement already satisfied: cftime in /usr/local/lib/python3.10/dist-packages (from netcdf4->arviz>=0.12.0->pymc) (1.6.1) --- Logging error --- Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/pip/_internal/utils/logging.py", line 177, in emit self.console.print(renderable, overflow="ignore", crop=False, style=style) File "/usr/local/lib/python3.10/dist-packages/pip/_vendor/rich/console.py", line 1673, in print extend(render(renderable, render_options)) File "/usr/local/lib/python3.10/dist-packages/pip/_vendor/rich/console.py", line 1305, in render for render_output in iter_render: File "/usr/local/lib/python3.10/dist-packages/pip/_internal/utils/logging.py", line 134, in __rich_console__ for line in lines: File "/usr/local/lib/python3.10/dist-packages/pip/_vendor/rich/segment.py", line 249, in split_lines for segment in segments: File "/usr/local/lib/python3.10/dist-packages/pip/_vendor/rich/console.py", line 1283, in render renderable = rich_cast(renderable) File "/usr/local/lib/python3.10/dist-packages/pip/_vendor/rich/protocol.py", line 36, in rich_cast renderable = cast_method() File "/usr/local/lib/python3.10/dist-packages/pip/_internal/self_outdated_check.py", line 130, in __rich__ pip_cmd = get_best_invocation_for_this_pip() File "/usr/local/lib/python3.10/dist-packages/pip/_internal/utils/entrypoints.py", line 58, in get_best_invocation_for_this_pip if found_executable and os.path.samefile( File "/usr/lib/python3.10/genericpath.py", line 101, in samefile s2 = os.stat(f2) FileNotFoundError: [Errno 2] Arquivo ou diretório inexistente: '/usr/bin/pip' Call stack: File "/usr/local/bin/pip", line 8, in <module> sys.exit(main()) File "/usr/local/lib/python3.10/dist-packages/pip/_internal/cli/main.py", line 70, in main return command.main(cmd_args) File "/usr/local/lib/python3.10/dist-packages/pip/_internal/cli/base_command.py", line 101, in main return self._main(args) File "/usr/local/lib/python3.10/dist-packages/pip/_internal/cli/base_command.py", line 223, in _main self.handle_pip_version_check(options) File "/usr/local/lib/python3.10/dist-packages/pip/_internal/cli/req_command.py", line 190, in handle_pip_version_check pip_self_version_check(session, options) File "/usr/local/lib/python3.10/dist-packages/pip/_internal/self_outdated_check.py", line 236, in pip_self_version_check logger.warning("[present-rich] %s", upgrade_prompt) File "/usr/lib/python3.10/logging/__init__.py", line 1489, in warning self._log(WARNING, msg, args, **kwargs) File "/usr/lib/python3.10/logging/__init__.py", line 1624, in _log self.handle(record) File "/usr/lib/python3.10/logging/__init__.py", line 1634, in handle self.callHandlers(record) File "/usr/lib/python3.10/logging/__init__.py", line 1696, in callHandlers hdlr.handle(record) File "/usr/lib/python3.10/logging/__init__.py", line 968, in handle self.emit(record) File "/usr/local/lib/python3.10/dist-packages/pip/_internal/utils/logging.py", line 179, in emit self.handleError(record) Message: '[present-rich] %s' Arguments: (UpgradePrompt(old='22.2.2', new='22.3'),)
!pip install graphviz
%matplotlib inline import pymc as pm from pymc.ode import DifferentialEquation import numpy as np import matplotlib.pyplot as plt from scipy.integrate import odeint import arviz as az import aesara as ae plt.style.use('seaborn-darkgrid')
/usr/lib/python3/dist-packages/pkg_resources/__init__.py:116: PkgResourcesDeprecationWarning: 1.16.0-unknown is an invalid version and will not be supported in a future release warnings.warn( /usr/lib/python3/dist-packages/pkg_resources/__init__.py:116: PkgResourcesDeprecationWarning: 0.1.43ubuntu1 is an invalid version and will not be supported in a future release warnings.warn( /usr/lib/python3/dist-packages/pkg_resources/__init__.py:116: PkgResourcesDeprecationWarning: 1.1build1 is an invalid version and will not be supported in a future release warnings.warn(

def SIR(y, t, p): ds = -p[0] * y[0] * y[1] di = p[0] * y[0] * y[1] - p[1] * y[1] return [ds, di]

times = np.arange(0, 5, 0.25)

beta,gamma = 4,1.0

Gerando curvas simuladas

y = odeint(SIR, t=times, y0=[0.99, 0.01], args=((beta, gamma),), rtol=1e-8)

Simulando dados Assumindo uma distribuição log-normal com média igual às séries simuladas

yobs = np.random.lognormal(mean=np.log(y[1::]), sigma=[0.2, 0.3])

plt.plot(times[1::], yobs, marker='o', linestyle='none') plt.plot(times, y[:, 0], color='C0', alpha=0.5, label=f'S(t)S(t)') plt.plot(times, y[:, 1], color='C1', alpha=0.5, label=f'I(t)I(t)'); plt.legend();

def SIR(y, t, p): ds = -p[0] * y[0] * y[1] di = p[0] * y[0] * y[1] - p[1] * y[1] return [ds, di] times = np.arange(0, 5, 0.25) beta,gamma = 4,1.0 # Gerando curvas simuladas y = odeint(SIR, t=times, y0=[0.99, 0.01], args=((beta, gamma),), rtol=1e-8) # Simulando dados Assumindo uma distribuição log-normal com média igual às séries simuladas yobs = np.random.lognormal(mean=np.log(y[1::]), sigma=[0.2, 0.3]) plt.plot(times[1::], yobs, marker='o', linestyle='none') plt.plot(times, y[:, 0], color='C0', alpha=0.5, label=f'$S(t)$') plt.plot(times, y[:, 1], color='C1', alpha=0.5, label=f'$I(t)$'); plt.legend();
Image in a Jupyter notebook

Abaixo deifinimos nosso modelo usando a classe DifferentialEquation do PyMC3

sir_model = DifferentialEquation( func=SIR, times=np.arange(0.25, 5, 0.25), n_states=2, n_theta=2, t0=0, )

Construindo o modelo de Inferência

Agora precisamos definir as distribuições a priori dos parâmetros do modelo e a verossimilhança de ϕ\phi.

with pm.Model() as model: sigma = pm.HalfCauchy('sigma', 1, shape=2) # Distribuições a priori # R0 é limitada inferiormente em 1 para sempre termos uma epidemia. R0 = pm.Truncated('R0',pm.Normal.dist(2,3), lower=1) gam = pm.Lognormal('gamma', pm.math.log(2), 2) beta = pm.Deterministic('beta', gam * R0) sir_curves = sir_model(y0=[0.99, 0.01], theta=[beta, gam]) Y = pm.Lognormal('Y', mu=pm.math.log(sir_curves), sigma=sigma, observed=yobs) # db = pm.backends.HDF5('traces.h5') # Salva as amostras e assim evita de manter tudo na memória trace = pm.sample(2000, tune=1000)#, cores=4)
Auto-assigning NUTS sampler... Initializing NUTS using jitter+adapt_diag... Multiprocess sampling (4 chains in 4 jobs) NUTS: [sigma, R0, gamma]
Sampling 4 chains for 1_000 tune and 2_000 draw iterations (4_000 + 8_000 draws total) took 1386 seconds.
pm.model_to_graphviz(model)
Image in a Jupyter notebook
# data = az.from_pymc3(trace=trace) # data trace
az.plot_posterior(trace, round_to=2, hdi_prob=0.95);
Image in a Jupyter notebook
az.plot_trace(trace);
Image in a Jupyter notebook
az.summary(trace.posterior, kind='stats')
az.summary(trace.posterior, kind='diagnostics')

Relembrando de onde partimos: Amostrando da Priori

with model: prior_idata = pm.sample_prior_predictive()
Sampling: [R0, Y, gamma, sigma]
prior_idata
(Output Hidden)

A partir destas amostras temos que realizar simulações para reconstruir a distribuição a priori induzida da saída do nossso modelo. Para isso vamos escrever uma simples função que além de realizar as simulações também as adiciona a um objeto DataArray da biblioteca Xarray.

import xarray as xr
def simul_output(idata, group='posterior'): idata = idata.prior if group == "prior" else idata.posterior i = 0 for b, g in zip(np.array(idata['beta'])[0],np.array(idata['gamma'])[0]): y = odeint(SIR, t=np.arange(0.25, 5, 0.25), y0=[0.99, 0.01], args=((b, g),), rtol=1e-8) if i == 0: sims = xr.DataArray(y,coords=[range(y.shape[0]),['S','I']], dims=["time", "state"]) else: y = xr.DataArray(y,coords=[range(y.shape[0]),['S','I']], dims=["time", "state"]) sims = xr.concat([sims,y], "reps") i +=1 return sims sims = simul_output(prior_idata, "prior")

Agora que temos as 500 simulações podemos computar a mediana e o intervalo de credibilidade de 95%:

sims.quantile([0.025,0.5,0.975],'reps')#.plot.line(x='time');

Agora só nos resta plotar as curvas:

def plot_bands(xarr): bands = xarr.quantile([0.025,0.5,0.975],'reps') plt.fill_between(range(19),bands.sel(quantile=0.025, state='S'),bands.sel(quantile=0.975, state='S'),alpha=0.3) plt.plot(range(19), bands.sel(quantile=0.5, state='S'), label='S') plt.fill_between(range(19),bands.sel(quantile=0.025, state='I'),bands.sel(quantile=0.975, state='I'),alpha=0.3) plt.plot(range(19), bands.sel(quantile=0.5, state='I'), label='I') plt.legend() plot_bands(sims)
Image in a Jupyter notebook

Amostrando da Distribuição Preditiva Posterior

Agora faremos o mesmo para a distribuição posterior dos parâmetros.

with model: trace.extend(pm.sample_posterior_predictive(trace, random_seed=3546))
Sampling: [Y]
trace
WARNING: Some output was deleted.
post_sims = simul_output(trace) post_sims
plot_bands(post_sims)
Image in a Jupyter notebook
%load_ext watermark %watermark -n -u -v -iv -w
The watermark extension is already loaded. To reload it, use: %reload_ext watermark Last updated: Wed Oct 19 2022 Python implementation: CPython Python version : 3.10.6 IPython version : 7.34.0 xarray : 2022.6.0 pymc : 4.2.2 arviz : 0.12.1 aesara : 2.8.7 matplotlib: 3.5.3 numpy : 1.23.2 Watermark: 2.3.1