Bayesian Estimation

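An estimation paradigm that treats parameters as unknown random variables and combines a prior distribution with the likelihood of the data through Bayes' theorem to derive the posterior distribution. Because it yields the entire posterior distribution rather than a single point estimate, it handles parameter uncertainty and respondent heterogeneity naturally. For nonlinear and hierarchical models, where a closed-form posterior is rarely available, inference proceeds by drawing samples from the posterior with Markov chain Monte Carlo (MCMC).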

Type: parametric, probabilistic paradigm, simulation-based estimation
Key assumptions: an appropriate choice of prior, a correctly specified likelihood, convergence of the MCMC chain
Main variants: hierarchical Bayes, empirical Bayes, variational Bayes, Gibbs sampling, Metropolis-Hastings, Hamiltonian Monte Carlo
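The prior-times-likelihood update described above can be illustrated with a minimal sketch. It assumes a toy problem that is not in the source: a binomial success probability `theta` with a Beta(2, 2) prior, evaluated on a grid so that no sampling is needed. The function names `grid_posterior` and `summarize` and all numeric defaults are hypothetical choices for illustration only.

```python
import math


def grid_posterior(successes, trials, prior_a=2.0, prior_b=2.0, n_grid=2001):
    """Approximate p(theta | y) on a grid for a binomial likelihood and a Beta prior.

    The Beta(prior_a, prior_b) prior and the grid size are illustrative assumptions.
    """
    # Midpoints of equal-width cells; this avoids log(0) at theta = 0 or 1.
    grid = [(k + 0.5) / n_grid for k in range(n_grid)]
    # Unnormalized log posterior = log likelihood + log prior (constants dropped).
    log_post = [
        (prior_a - 1 + successes) * math.log(t)
        + (prior_b - 1 + trials - successes) * math.log(1 - t)
        for t in grid
    ]
    peak = max(log_post)  # subtract the peak for numerical stability
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)  # plays the role of the normalizing constant p(y)
    return grid, [w / total for w in weights]


def summarize(grid, probs, level=0.95):
    """Return the posterior mean, the grid mode (MAP), and an equal-tailed credible interval."""
    mean = sum(t * p for t, p in zip(grid, probs))
    mode = grid[max(range(len(grid)), key=probs.__getitem__)]
    lo_q, hi_q = (1 - level) / 2, 1 - (1 - level) / 2
    cum, lower, upper = 0.0, None, None
    for t, p in zip(grid, probs):
        cum += p
        if lower is None and cum >= lo_q:
            lower = t
        if upper is None and cum >= hi_q:
            upper = t
    return mean, mode, (lower, upper)


if __name__ == "__main__":
    grid, probs = grid_posterior(successes=7, trials=10)
    mean, mode, interval = summarize(grid, probs)
    print(f"posterior mean={mean:.3f}, MAP={mode:.3f}, 95% interval={interval}")
```

Grid evaluation only works for one or two parameters. For the nonlinear and hierarchical models mentioned above, the same quantities are usually computed from MCMC draws instead.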

Overview

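Bayes (1763), An Essay towards solving a Problem in the Doctrine of Chances, is the starting point, but until the late twentieth century Bayesian methods were overshadowed by frequentist estimation because posterior distributions were hard to compute. The Metropolis-Hastings algorithm of Hastings (1970) and the Gibbs sampling of Geman-Geman (1984) made simulation-based estimation via Markov chain Monte Carlo feasible, and applications grew rapidly from the 1990s. Gelfand-Smith (1990) brought Gibbs sampling into the statistical mainstream, and Allenby-Rossi (1999) and Train (2003), Discrete Choice Methods with Simulation, made hierarchical Bayes estimation of discrete choice models a standard tool in marketing, transportation, and environmental economics. For stated preference data, hierarchical Bayes captures respondent-level preference heterogeneity (consumer heterogeneity) by estimating the respondent-level random parameters jointly with the hyperparameters of their distribution, which made it a natural alternative to simulated maximum likelihood estimation of the mixed logit.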

Key Equations and Definitions

For data $y$ and parameter $\theta$, Bayes' theorem defines the posterior distribution as follows.

$$p(\theta \mid y) = \frac{p(y \mid \theta) \, p(\theta)}{p(y)} \propto L(\theta; y) \cdot p(\theta)$$

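Here $p(\theta)$ is the prior, $L(\theta; y) = p(y \mid \theta)$ is the likelihood, and $p(y) = \int p(y \mid \theta) \, p(\theta) \, d\theta$ is the normalizing constant. Point estimates are taken as the posterior mean $\hat\theta = \int \theta \, p(\theta \mid y) \, d\theta$ or the posterior mode (MAP). Interval estimates are credible intervals formed from posterior quantiles, which are the Bayesian counterpart of, and not the same thing as, frequentist confidence intervals. In a hierarchical model, the parameter $\theta_i$ of respondent $i$ is assumed to be drawn from a distribution $p(\theta_i \mid \phi)$ governed by the hyperparameter $\phi$. A Gibbs sampler alternates between drawing $\theta_i \mid \phi, y$ and $\phi \mid \{\theta_i\}, y$, and the draws retained after convergence approximate the posterior.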
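The alternating draws described above can be sketched for the simplest hierarchical case. This sketch assumes a model that is not taken from the source: observations $y_{ij} \sim N(\theta_i, \sigma^2)$, respondent means $\theta_i \sim N(\mu, \tau^2)$, a flat prior on the hyperparameter $\mu$ (playing the role of $\phi$), and known $\sigma^2$ and $\tau^2$. Under these assumptions both full conditionals are normal, and $y$ drops out of the conditional for $\mu$ once the $\theta_i$ are given. The function names are hypothetical.

```python
import random
import statistics


def gibbs_hierarchical_normal(y, sigma2=1.0, tau2=1.0, n_iter=3000, burn_in=500, seed=0):
    """Gibbs sampler for y_ij ~ N(theta_i, sigma2), theta_i ~ N(mu, tau2), flat prior on mu.

    sigma2 and tau2 are treated as known (a simplifying assumption for this sketch).
    Returns post-burn-in draws of mu and of the vector of theta_i.
    """
    rng = random.Random(seed)
    n_resp = len(y)
    counts = [len(obs) for obs in y]
    means = [statistics.fmean(obs) for obs in y]
    mu = statistics.fmean(means)  # starting value
    mu_draws, theta_draws = [], []
    for it in range(n_iter):
        # Step 1: theta_i | mu, y is normal with a precision-weighted mean
        # that shrinks each respondent's sample mean toward mu.
        theta = []
        for n_i, ybar_i in zip(counts, means):
            prec = n_i / sigma2 + 1.0 / tau2
            cond_mean = (n_i * ybar_i / sigma2 + mu / tau2) / prec
            theta.append(rng.gauss(cond_mean, (1.0 / prec) ** 0.5))
        # Step 2: mu | {theta_i} is normal around the average of the theta_i.
        mu = rng.gauss(statistics.fmean(theta), (tau2 / n_resp) ** 0.5)
        if it >= burn_in:  # discard early draws taken before convergence
            mu_draws.append(mu)
            theta_draws.append(theta)
    return mu_draws, theta_draws


def credible_interval(draws, level=0.95):
    """Equal-tailed credible interval from the empirical quantiles of the draws."""
    s = sorted(draws)
    lo = s[int((1 - level) / 2 * (len(s) - 1))]
    hi = s[int((1 + level) / 2 * (len(s) - 1))]
    return lo, hi


if __name__ == "__main__":
    # Synthetic data for illustration only: 30 respondents, 8 observations each.
    rng = random.Random(1)
    y = [[rng.gauss(t, 1.0) for _ in range(8)]
         for t in (rng.gauss(2.0, 1.0) for _ in range(30))]
    mu_draws, _ = gibbs_hierarchical_normal(y)
    print("posterior mean of mu:", statistics.fmean(mu_draws))
    print("95% credible interval:", credible_interval(mu_draws))
```

The posterior mean and the credible interval are read directly off the retained draws. A fixed burn-in is used here for brevity; in practice convergence would be checked with diagnostics across several chains.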

TEMEP Line

The conjoint analysis and stated preference work of Jongsu Lee (이종수) and Yeonbae Kim (김연배) is the main application area for Bayesian estimation. "An Analysis of Consumer Preferences among Wireless LAN and Mobile Internet Services" combines a rank-ordered logit model with hierarchical Bayes to capture respondent-level preference heterogeneity and marks a methodological step up for the line, and "Consumer preferences for alternative fuel vehicles in South Korea" brings hierarchical Bayes into policy application by estimating demand for alternative fuel vehicles. "Analysis on the Business Strategy and Policy for the Alternative Fuel Vehicle: Using Stated Preference Data" and "Effects of consumer preferences on the convergence of mobile telecommunications devices" use Bayesian discrete choice models to estimate the preference structures behind convergence devices and business strategy, and "Estimating the extent of potential competition in the Korean mobile telecommunications market: Switching costs and number portability" applies a Bayesian discrete choice model to the analysis of mobile number portability policy.

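The shared motivation of this line is that the representative-agent assumption of the simple multinomial logit discards the rich respondent-level information in stated preference data. Hierarchical Bayes estimation jointly estimates the distribution of respondent-level random parameters and yields individual-level posteriors of willingness-to-pay, which have served as the quantitative anchor for new-technology demand forecasting, market segmentation, and policy acceptance analysis.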
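An individual-level willingness-to-pay posterior can be summarized directly from MCMC draws. The sketch below assumes a utility that is linear in price, so that WTP for an attribute is $-\beta_{\text{attr}} / \beta_{\text{price}}$ computed draw by draw. This functional form, the function name `wtp_summary`, and the synthetic draws are illustrative assumptions, not the specification used in the papers above.

```python
import random
import statistics


def wtp_summary(beta_attr_draws, beta_price_draws, level=0.95):
    """Summarize WTP = -beta_attr / beta_price over one respondent's posterior draws.

    Assumes linear-in-price utility and price coefficient draws that stay away from zero.
    The median is reported because a ratio of draws can have heavy tails.
    """
    wtp = sorted(-a / p for a, p in zip(beta_attr_draws, beta_price_draws))
    lo = wtp[int((1 - level) / 2 * (len(wtp) - 1))]
    hi = wtp[int((1 + level) / 2 * (len(wtp) - 1))]
    return statistics.median(wtp), (lo, hi)


if __name__ == "__main__":
    # Hypothetical posterior draws for one respondent (not real estimates).
    rng = random.Random(0)
    attr = [rng.gauss(0.8, 0.1) for _ in range(4000)]
    price = [rng.gauss(-0.4, 0.03) for _ in range(4000)]
    median, interval = wtp_summary(attr, price)
    print("posterior median WTP:", median)
    print("95% credible interval:", interval)
```

Computing the ratio per draw, and not as a ratio of posterior means, carries the joint uncertainty in both coefficients through to the WTP interval.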

Adjacent Graph

130 one-hop neighbors

People 2
Methodologies 89
Concepts 7
Topics 3
Categories 2
Papers 27

