Artificial Intelligence and Quantum Computing for Advanced Wireless Networks. Savo G. Glisic.

Author: Savo G. Glisic
Publisher: John Wiley & Sons Limited
Genre: Software
ISBN: 9781119790310
Information gain (ig) is an impurity-based criterion that uses the entropy (e) measure (originating from information theory) as the impurity measure:

$$ig(a_i, S) = e(y, S) - \sum_{v_{i,j} \in \operatorname{dom}(a_i)} \frac{|\sigma_{a_i = v_{i,j}} S|}{|S|} \cdot e(y, \sigma_{a_i = v_{i,j}} S)$$

      where

      $$e(y, S) = \sum_{c_j \in \operatorname{dom}(y)} - \frac{|\sigma_{y = c_j} S|}{|S|} \log_2 \frac{|\sigma_{y = c_j} S|}{|S|}. \tag{2.14}$$
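
As an illustration, the entropy and information gain defined above might be computed as in the following minimal sketch. The representation of the sample S as two parallel lists, the function names, and the toy data (a channel state attribute and a link quality class) are assumptions made for this example only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """e(y, S): base-2 entropy of the class labels in the sample."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """ig(a_i, S): entropy of S minus the weighted entropy of each subset
    sigma_{a_i = v} S obtained by splitting on the attribute values."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy sample: attribute a_i = channel state, target y = link quality.
channel = ["clear", "clear", "busy", "busy"]
quality = ["good", "good", "bad", "good"]

print(f"e(y, S)    = {entropy(quality):.4f}")
print(f"ig(a_i, S) = {information_gain(channel, quality):.4f}")
```

In this toy sample the "clear" subset is pure, so only the "busy" subset contributes to the weighted entropy that is subtracted.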

      Gini index: This is an impurity‐based criterion that measures the divergence between the probability distributions of the target attribute’s values. The Gini (G) index is defined as

      $$G(y, S) = 1 - \sum_{c_j \in \operatorname{dom}(y)} \left( \frac{|\sigma_{y = c_j} S|}{|S|} \right)^2 \tag{2.15}$$

      Consequently, the evaluation criterion for selecting the attribute $a_i$ is defined as the Gini gain (GG):

      $$GG(a_i, S) = G(y, S) - \sum_{v_{i,j} \in \operatorname{dom}(a_i)} \frac{|\sigma_{a_i = v_{i,j}} S|}{|S|} \cdot G(y, \sigma_{a_i = v_{i,j}} S). \tag{2.16}$$
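
The Gini index and the Gini gain could be sketched along the same lines. The list-based representation of S, the function names, and the toy data are illustrative assumptions, not part of the original text.

```python
from collections import Counter


def gini(labels):
    """G(y, S): one minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_gain(values, labels):
    """GG(a_i, S): Gini index of S minus the weighted Gini index of each
    subset sigma_{a_i = v} S."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    return gini(labels) - sum(len(g) / n * gini(g) for g in groups.values())


# Hypothetical toy sample: attribute a_i = channel state, target y = link quality.
channel = ["clear", "clear", "busy", "busy"]
quality = ["good", "good", "bad", "good"]

print(f"G(y, S)    = {gini(quality):.4f}")
print(f"GG(a_i, S) = {gini_gain(channel, quality):.4f}")
```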

      Likelihood ratio chi-squared statistics: The likelihood ratio (lr) is defined as

      $$lr(a_i, S) = 2 \cdot \ln(2) \cdot |S| \cdot ig(a_i, S). \tag{2.17}$$

      This ratio is useful for measuring the statistical significance of the information gain criterion. The null hypothesis ($H_0$) is that the input attribute and the target attribute are conditionally independent. If $H_0$ holds, the test statistic is distributed as $\chi^2$ with $(|\operatorname{dom}(a_i)| - 1) \cdot (|\operatorname{dom}(y)| - 1)$ degrees of freedom.
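
A minimal sketch of the likelihood ratio statistic and its degrees of freedom follows. It assumes the sample is held as two parallel lists and takes the degrees of freedom from the number of distinct values observed in the sample; the helper names and the toy data are hypothetical.

```python
from collections import Counter
from math import log, log2


def entropy(labels):
    """e(y, S): base-2 entropy of the class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """ig(a_i, S): entropy reduction obtained by splitting on the attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    return entropy(labels) - sum(len(g) / n * entropy(g) for g in groups.values())


def likelihood_ratio(values, labels):
    """Return (lr, dof), where lr = 2 * ln(2) * |S| * ig(a_i, S) and dof is the
    chi-squared degrees of freedom based on the observed domain sizes."""
    stat = 2.0 * log(2) * len(labels) * information_gain(values, labels)
    dof = (len(set(values)) - 1) * (len(set(labels)) - 1)
    return stat, dof


# Hypothetical toy sample: attribute a_i = channel state, target y = link quality.
channel = ["clear", "clear", "busy", "busy"]
quality = ["good", "good", "bad", "good"]

stat, dof = likelihood_ratio(channel, quality)
print(f"lr = {stat:.4f}, degrees of freedom = {dof}")
```

With one degree of freedom the usual 5% critical value of the chi-squared distribution is about 3.84, so a statistic below that value would not lead to rejecting $H_0$. A sample as small as this toy one is far too small for the chi-squared approximation to be reliable and serves only to show the arithmetic.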

      Gain ratio (gr): This ratio “normalizes” the information gain (ig) as follows: $gr(a_i, S) = ig(a_i, S)/e(a_i, S)$. Note that this ratio is not defined when the denominator is zero. Also, the ratio may tend to favor attributes for which the denominator is very small. Consequently, it is suggested that the selection be carried out in two stages. First, the information gain is calculated for all attributes. Then, considering only the attributes whose information gain is at least as high as the average information gain, the attribute with the best gain ratio is selected. It has been shown that the gain ratio tends to outperform the simple information gain criterion in terms of both accuracy and classifier complexity.
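
The two-stage selection described above could be sketched as follows. The dictionary of attribute columns, the function names, the small floating-point tolerance, and the toy data are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Base-2 entropy of a list of labels or attribute values."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())


def information_gain(values, labels):
    """ig(a_i, S): entropy reduction obtained by splitting on the attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    return entropy(labels) - sum(len(g) / n * entropy(g) for g in groups.values())


def select_by_gain_ratio(attributes, labels):
    """Stage 1: compute ig for every attribute. Stage 2: among attributes whose
    ig is at least the average, return the one with the best ig / e(a_i, S)."""
    gains = {name: information_gain(vals, labels) for name, vals in attributes.items()}
    average = sum(gains.values()) / len(gains)
    best_name, best_ratio = None, float("-inf")
    for name, vals in attributes.items():
        split_entropy = entropy(vals)  # e(a_i, S), the denominator of the ratio
        # Skip below-average attributes (with a small tolerance for rounding)
        # and attributes for which the ratio is undefined.
        if gains[name] < average - 1e-12 or split_entropy == 0:
            continue
        ratio = gains[name] / split_entropy
        if ratio > best_ratio:
            best_name, best_ratio = name, ratio
    return best_name


# Hypothetical toy sample with three candidate attributes.
labels = ["good", "good", "good", "bad", "bad", "bad"]
attributes = {
    "packet_id": [1, 2, 3, 4, 5, 6],  # unique per record
    "channel": ["clear", "clear", "clear", "busy", "busy", "busy"],
    "band": ["a", "b", "a", "b", "a", "b"],
}
print(select_by_gain_ratio(attributes, labels))
```

In this toy sample the plain information gain cannot separate `packet_id` from `channel`, since both split the sample into pure subsets. The gain ratio prefers `channel` because the many-valued `packet_id` attribute has a much larger denominator.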

      Distance measure: Similar to the gain ratio, this measure also normalizes the impurity measure. However, the method used is different:

$$DM(a_i, S) = \frac{\Delta\Phi(a_i, S)}{- \sum_{v_{i,j} \in \operatorname{dom}(a_i)} \sum_{c_k \in \operatorname{dom}(y)} b \cdot \log_2 b}$$

      where

      $$b = \frac{|\sigma_{a_i = v_{i,j} \text{ AND } y = c_k} S|}{|S|} \tag{2.18}$$
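
The following sketch shows how the distance measure might be evaluated. It assumes that the impurity reduction ΔΦ in the numerator is taken to be the information gain, which is a choice made only for this example, and it skips empty cells of the joint count table (where b = 0). The function names and toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Base-2 entropy of a list of hashable items."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())


def information_gain(values, labels):
    """Entropy reduction obtained by splitting on the attribute values."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    return entropy(labels) - sum(len(g) / n * entropy(g) for g in groups.values())


def distance_measure(values, labels):
    """DM(a_i, S) with the numerator assumed to be the information gain.
    The denominator sums -b * log2(b) over all observed (value, class) pairs,
    where b is the fraction of records falling in that pair."""
    n = len(labels)
    joint_counts = Counter(zip(values, labels))
    denominator = -sum((c / n) * log2(c / n) for c in joint_counts.values())
    if denominator == 0:
        return 0.0  # single (value, class) cell: nothing to normalize
    return information_gain(values, labels) / denominator


# Hypothetical toy sample: attribute a_i = channel state, target y = link quality.
channel = ["clear", "clear", "busy", "busy"]
quality = ["good", "good", "bad", "good"]
print(f"DM(a_i, S) = {distance_measure(channel, quality):.4f}")
```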

      Binary criteria: These are used for creating binary decision trees. These measures are based on the division of the input attribute domain into two subdomains.

      Let $\beta(a_i, d_1, d_2, S)$ denote the binary criterion value for attribute $a_i$ over sample $S$ when $d_1$ and $d_2$ are its corresponding subdomains. The value obtained for the optimal division of the attribute domain into two mutually exclusive and exhaustive subdomains is used for comparing attributes, namely

      $$\begin{aligned} \beta^*(a_i, S) = {} & \max \beta(a_i, d_1, d_2, S) \\ \text{s.t. } & d_1 \cup d_2 = \operatorname{dom}(a_i) \\ & d_1 \cap d_2 = \emptyset. \end{aligned} \tag{2.19}$$
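
One straightforward way to evaluate the maximization above is to enumerate every division of the attribute domain into two non-empty subdomains. The sketch below does this exhaustively and, as an assumption made only for illustration, uses a binary Gini gain as the criterion β. The function names and toy data are hypothetical, and the enumeration grows exponentially with the domain size, so it is practical only for small domains.

```python
from collections import Counter
from itertools import combinations


def gini(labels):
    """Gini index of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def binary_gini_gain(values, labels, d1, d2):
    """Example beta(a_i, d1, d2, S): Gini gain of the two-way split."""
    left = [y for v, y in zip(values, labels) if v in d1]
    right = [y for v, y in zip(values, labels) if v in d2]
    n = len(labels)
    return gini(labels) - len(left) / n * gini(left) - len(right) / n * gini(right)


def best_binary_split(values, labels, beta):
    """Return (beta*, d1, d2) over all divisions of the observed domain into
    two non-empty, disjoint, exhaustive subdomains.
    Returns (-inf, None, None) if the domain has fewer than two values."""
    domain = sorted(set(values))
    first, rest = domain[0], domain[1:]
    best = (float("-inf"), None, None)
    # Keep the first value in d1 so that each division is visited only once;
    # taking fewer than len(rest) extra values keeps d2 non-empty.
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            d1 = {first, *extra}
            d2 = set(domain) - d1
            score = beta(values, labels, d1, d2)
            if score > best[0]:
                best = (score, d1, d2)
    return best


# Hypothetical toy sample: attribute a_i = signal level, target y = link quality.
level = ["low", "mid", "high", "low", "mid", "high"]
quality = ["bad", "good", "good", "bad", "good", "good"]

score, d1, d2 = best_binary_split(level, quality, binary_gini_gain)
print(f"beta* = {score:.4f}, d1 = {sorted(d1)}, d2 = {sorted(d2)}")
```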

      Twoing criterion: The Gini index may encounter problems when the domain of the target attribute is relatively wide. In this case, it has been suggested to use the binary criterion called the twoing (tw) criterion. This criterion is defined as

      $$tw(a_i, d_1, d_2, S) = 0.25 \, \frac{|\sigma_{a_i \in d_1} S|}{|S|} \cdot \frac{|\sigma_{a_i \in d_2} S|}{|S|} \left( \sum_{c_i \in \operatorname{dom}(y)} \left| \frac{|\sigma_{a_i \in d_1 \cap y = c_i} S|}{\cdots} \right. \right. \cdots \tag{2.20}$$
      [the remainder of this equation is cut off in the source]

