PyMC3の序数予測変数をモデル化するときの最適化の落とし穴を回避する

Doing Bayesian Data Analysisの第23章のアプローチに基づいてPyMC3を使用して序数予測変数をモデル化しようとしています。 find_MAPを使用して良い開始値を決定したいと思いますが、最適化エラーが発生しています。 PyMC3の序数予測変数をモデル化するときの最適化の落とし穴を回避する

モデル：

Applied interval-transform to sigma and added transformed sigma_interval_ to model. 
Applied stickbreaking-transform to thresh and added transformed thresh_stickbreaking_ to model. 
Traceback (most recent call last): 
    File "cm_net_log.v1-for_so.py", line 53, in <module> 
    start = pm.find_MAP() 
    File "/usr/local/lib/python3.5/site-packages/pymc3/tuning/starting.py", line 133, in find_MAP 
    specific_errors) 
ValueError: Optimization error: max, logp or dlogp at max have non-finite values. Some values may be outside of distribution support. max: {'thresh_stickbreaking_': array([-1.04298465, -0.48661088, -0.84326554, -0.44833646]), 'sigma_interval_': array(-2.220446049250313e-16), 'mu': array(7.68422528308479)} logp: array(-3506.530143064723) dlogp: array([ 1.61013190e-06,    nan, -6.73994118e-06, 
     -6.93873894e-06, 6.03358122e-06, 3.18954680e-06])Check that 1) you don't have hierarchical parameters, these will lead to points with infinite density. 2) your distribution logp's are properly specified. Specific issues:

私の質問：

import pymc3 as pm import numpy as np import theano import theano.tensor as tt # Some helper functions def cdf(x, location=0, scale=1): epsilon = np.array(1e-32, dtype=theano.config.floatX) location = tt.cast(location, theano.config.floatX) scale = tt.cast(scale, theano.config.floatX) div = tt.sqrt(2 * scale ** 2 + epsilon) div = tt.cast(div, theano.config.floatX) erf_arg = (x - location)/div return .5 * (1 + tt.erf(erf_arg + epsilon)) def percent_to_thresh(idx, vect): return 5 * tt.sum(vect[:idx + 1]) + 1.5 def full_thresh(thresh): idxs = tt.arange(thresh.shape[0] - 1) thresh_mod, updates = theano.scan(fn=percent_to_thresh, sequences=[idxs], non_sequences=[thresh]) return tt.concatenate([[-1 * np.inf, 1.5], thresh_mod, [6.5, np.inf]]) def compute_ps(thresh, location, scale): f_thresh = full_thresh(thresh) return cdf(f_thresh[1:], location, scale) - cdf(f_thresh[:-1], location, scale) # Generate data real_ps = [0.05, 0.05, 0.1, 0.1, 0.2, 0.3, 0.2] data = np.random.choice(7, size=1000, p=real_ps) # Run model with pm.Model() as model: mu = pm.Normal('mu', mu=4, sd=3) sigma = pm.Uniform('sigma', lower=0.1, upper=70) thresh = pm.Dirichlet('thresh', a=np.ones(5)) cat_p = compute_ps(thresh, mu, sigma) results = pm.Categorical('results', p=cat_p, observed=data) with model: start = pm.find_MAP() trace = pm.sample(2000, start=start)

これを実行している、私は次のエラーが表示さdlogpがNaNである理由

にはどうすればいいかを決定することができますある時点で？
dlogpを避けるためにこのモデルを表現できる方法はありますか？

も注目に値する：私はfind_MAPとメトロポリスサンプラーを使用しない場合

このモデルは、細かい動作します。しかし、私はこのモデルがより複雑になるので、他のサンプラーを使用する柔軟性を持っていたいと思います。
私はこの問題は閾値と正規分布との関係によるものではないかと疑いがありますが、最適化のためにそれらを解き分ける方法はわかりません。

出典

2016-11-04 Matthew Cox

質問2について：私は序数予測変数（単一グループ）のモデルを異なって表現しました。私は結果の確率を計算する関数にTheano @as_opデコレータを使用しました。これはまた、私がfind_MAP()またはグラデーションベースのサンプラーを使用できない理由を説明しています。Theanoはカスタム関数のグラデーションを計算できません。（http://pymc-devs.github.io/pymc3/notebooks/getting_started.html#Arbitrary-deterministics）

# Number of outcomes 
nYlevels = df.Y.cat.categories.size 

thresh = [k + .5 for k in range(1, nYlevels)] 
thresh_obs = np.ma.asarray(thresh) 
thresh_obs[1:-1] = np.ma.masked 

@as_op(itypes=[tt.dvector, tt.dscalar, tt.dscalar], otypes=[tt.dvector]) 
def outcome_probabilities(theta, mu, sigma): 
    out = np.empty(nYlevels) 
    n = norm(loc=mu, scale=sigma)  
    out[0] = n.cdf(theta[0])   
    out[1] = np.max([0, n.cdf(theta[1]) - n.cdf(theta[0])]) 
    out[2] = np.max([0, n.cdf(theta[2]) - n.cdf(theta[1])]) 
    out[3] = np.max([0, n.cdf(theta[3]) - n.cdf(theta[2])]) 
    out[4] = np.max([0, n.cdf(theta[4]) - n.cdf(theta[3])]) 
    out[5] = np.max([0, n.cdf(theta[5]) - n.cdf(theta[4])]) 
    out[6] = 1 - n.cdf(theta[5]) 
    return out 

with pm.Model() as ordinal_model_single:  

    theta = pm.Normal('theta', mu=thresh, tau=np.repeat(.5**2, len(thresh)), 
         shape=len(thresh), observed=thresh_obs, testval=thresh[1:-1]) 

    mu = pm.Normal('mu', mu=nYlevels/2.0, tau=1.0/(nYlevels**2)) 
    sigma = pm.Uniform('sigma', nYlevels/1000.0, nYlevels*10.0) 

    pr = outcome_probabilities(theta, mu, sigma) 

    y = pm.Categorical('y', pr, observed=df.Y.cat.codes.as_matrix())

http://nbviewer.jupyter.org/github/JWarmenhoven/DBDA-python/blob/master/Notebooks/Chapter%2023.ipynb

出典

2017-04-17 21:11:19 Jordi

PyMC3の序数予測変数をモデル化するときの最適化の落とし穴を回避する

答えて

関連する問題