Computational Statistics in Data Science

Author: Various authors
Publisher: John Wiley & Sons Limited
Genre: Mathematics
ISBN: 9781119561088
… $\mathbf{x} = (x_1, \dots, x_N)$. We specify our model for the data with a likelihood function $\pi(\mathbf{x} \mid \boldsymbol{\theta}) = \prod_{n=1}^{N} \pi(x_n \mid \boldsymbol{\theta})$ and use a prior distribution with density function $\pi(\boldsymbol{\theta})$ to characterize our belief about the value of the $P$-dimensional parameter vector $\boldsymbol{\theta}$ a priori. The target of Bayesian inference is the posterior distribution of $\boldsymbol{\theta}$ conditioned on $\mathbf{x}$:

$$\pi(\boldsymbol{\theta} \mid \mathbf{x}) = \frac{\pi(\mathbf{x} \mid \boldsymbol{\theta})\,\pi(\boldsymbol{\theta})}{\int \pi(\mathbf{x} \mid \boldsymbol{\theta})\,\pi(\boldsymbol{\theta})\,\mathrm{d}\boldsymbol{\theta}}.$$

      The denominator's multidimensional integral quickly becomes impractical as $P$ grows large, so we choose to use the Metropolis–Hastings (M–H) algorithm to generate a Markov chain with stationary distribution $\pi(\boldsymbol{\theta} \mid \mathbf{x})$ [19, 20]. We begin at an arbitrary position $\boldsymbol{\theta}^{(0)}$ and, for each iteration $s = 0, \dots, S$, randomly generate the proposal state $\boldsymbol{\theta}^{*}$ from the transition distribution with density $q(\boldsymbol{\theta}^{*} \mid \boldsymbol{\theta}^{(s)})$. We then accept proposal state $\boldsymbol{\theta}^{*}$ with probability

$$\min\left\{1,\ \frac{\pi(\boldsymbol{\theta}^{*} \mid \mathbf{x})\, q(\boldsymbol{\theta}^{(s)} \mid \boldsymbol{\theta}^{*})}{\pi(\boldsymbol{\theta}^{(s)} \mid \mathbf{x})\, q(\boldsymbol{\theta}^{*} \mid \boldsymbol{\theta}^{(s)})}\right\}.$$
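
The M–H iteration can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the chapter's implementation: the proposal is taken to be a symmetric Gaussian random walk (so the $q$ terms cancel in the acceptance ratio), the toy model is a normal mean with unit variance and a $N(0, 10^2)$ prior, and the names `metropolis_hastings`, `log_post`, and `step` are hypothetical. Because the posterior enters the ratio only through a quotient, its normalizing integral cancels, and the sketch works with the log posterior up to an additive constant.

```python
import numpy as np

def metropolis_hastings(log_post, theta0, n_iter, step=0.5, rng=None):
    """Random-walk M-H; log_post is the log posterior up to an additive constant."""
    rng = np.random.default_rng() if rng is None else rng
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    current = log_post(theta)
    chain = np.empty((n_iter + 1, theta.size))
    chain[0] = theta
    accepted = 0
    for s in range(n_iter):
        # Assumed proposal: symmetric Gaussian random walk, so q cancels in the ratio.
        proposal = theta + step * rng.standard_normal(theta.size)
        candidate = log_post(proposal)
        # Accept with probability min(1, pi(proposal | x) / pi(theta | x)).
        if np.log(rng.uniform()) < candidate - current:
            theta, current = proposal, candidate
            accepted += 1
        chain[s + 1] = theta  # on rejection the chain repeats the current state
    return chain, accepted / n_iter

# Toy model (an assumption for illustration): x_n ~ N(theta, 1), theta ~ N(0, 10^2).
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.0, size=50)

def log_post(theta):
    log_lik = -0.5 * np.sum((x - theta[0]) ** 2)
    log_prior = -0.5 * theta[0] ** 2 / 10.0 ** 2
    return log_lik + log_prior

chain, rate = metropolis_hastings(log_post, theta0=[0.0], n_iter=5000, rng=rng)
posterior_mean = chain[1000:].mean(axis=0)  # discard an initial burn-in
```

For this conjugate toy model the posterior mean is available in closed form, $\sum_n x_n / (N + 1/100)$, which gives a simple check on the sampler's output. The step size of 0.5 is an arbitrary choice and would normally be tuned.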

      2.2 Big P

      One of the simplest models for big $P$ problems is ridge regression [23], but computing can become expensive even in this classical setting. Ridge regression estimates the coefficient $\boldsymbol{\theta}$ by minimizing the distance between the observed and predicted values $\mathbf{y}$ and $\mathbf{X}\boldsymbol{\theta}$ along with a weighted square norm of $\boldsymbol{\theta}$:

$$\hat{\boldsymbol{\theta}} = \operatorname{argmin}\left\{ \lVert \mathbf{y} - \mathbf{X}\boldsymbol{\theta} \rVert^{2} + \lVert \boldsymbol{\Phi}^{1/2}\boldsymbol{\theta} \rVert^{2} \right\} = \left(\mathbf{X}^{\intercal}\mathbf{X} + \boldsymbol{\Phi}\right)^{-1}\mathbf{X}^{\intercal}\mathbf{y}$$
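
The closed-form ridge estimate can be computed with a linear solve rather than an explicit matrix inverse. The following is a minimal NumPy sketch, assuming $\boldsymbol{\Phi}$ is symmetric positive definite; the function name `ridge_estimate`, the simulated data, and the choice $\boldsymbol{\Phi} = 2\mathbf{I}$ are hypothetical and serve only as an example. Forming $\mathbf{X}^{\intercal}\mathbf{X}$ costs on the order of $NP^2$ operations and the solve on the order of $P^3$, which suggests why computing becomes expensive as $P$ grows.

```python
import numpy as np

def ridge_estimate(X, y, Phi):
    """Return (X^T X + Phi)^{-1} X^T y via a linear solve, not an explicit inverse."""
    gram = X.T @ X + Phi               # P x P matrix, O(N P^2) to form
    rhs = X.T @ y                      # length-P vector
    return np.linalg.solve(gram, rhs)  # O(P^3) dense solve

# Simulated data (an assumption for illustration only).
rng = np.random.default_rng(1)
N, P = 200, 10
X = rng.standard_normal((N, P))
y = X @ rng.standard_normal(P) + 0.1 * rng.standard_normal(N)
Phi = 2.0 * np.eye(P)

theta_hat = ridge_estimate(X, y, Phi)

# Half the gradient of the ridge objective; it should vanish at the minimizer.
gradient = X.T @ (X @ theta_hat - y) + Phi @ theta_hat
```

The `gradient` check ties the two sides of the displayed equation together: the solution of the linear system is exactly the point where the penalized least-squares objective is stationary.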