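CHAPTER 3 Logistic Regression

Logistic regression resembles the linear regression discussed in Chapter 2 in that both build the model on a linear function. Linear regression assumes that the model output follows a Gaussian distribution, so it produces continuous predictions and addresses the regression problems of machine learning. Logistic regression instead assumes that the model output follows a Bernoulli distribution: it passes the linear model through a nonlinear function such as the Sigmoid function, so that the model yields discrete predictions and addresses the classification problems of machine learning.

3.1 Logistic Regression Model

Although its name contains the word "regression", logistic regression is a widely used classification algorithm. In a classification problem the model is expected to output a definite class, for example true or false, that is, a binary classification with labels 1 and 0. Practical examples include deciding whether an email is spam, whether a financial transaction is fraudulent, or, as in the earlier tumor classification problem, whether a tumor is malignant or benign. Recognizing whether a handwritten digit is 0, 1, 2, 3, and so on is a multi-class problem with ten classes.

The simplest case, binary classification, can be defined as

$$y = g(x), \quad y \in \{0, 1\} \tag{3-1}$$

where 0 denotes the negative class and 1 denotes the positive class.

The logistic regression classifier built on a linear function is defined as

$$H_\theta(x) = g(\theta^T x) \tag{3-2}$$

where x is the input feature vector and g is the logistic function, which keeps the model output between 0 and 1. A commonly used logistic function is the S-shaped Sigmoid function:

$$g(z) = \frac{1}{1 + e^{-z}} \tag{3-3}$$

Its graph is shown in Figure 3.1.

Figure 3.1 Graph of the Sigmoid function

The hypothesis function of the logistic regression model can therefore be written as

$$H_\theta(x) = \frac{1}{1 + e^{-\theta^T x}} \tag{3-4}$$

Equation (3-4) can be read as follows: for a given input x, the chosen parameters determine the estimated probability that the output equals 1, that is,

$$H_\theta(x) = P(y = 1 \mid x; \theta) \tag{3-5}$$

For example, if for a given feature vector x and chosen parameters θ we compute H_θ(x) = 0.7, then the sample x belongs to the positive class 1 with probability 70% and to the negative class with probability 1 − 0.7 = 0.3 = 30%.

The rule that turns this probability into a class decision is called the decision boundary. For the Sigmoid function it can be written in pseudocode as:

    if H_θ(x) ≥ 0.5 then: output positive class "1"
    if H_θ(x) < 0.5 then: output negative class "0"

Because g(z) ≥ 0.5 exactly when z ≥ 0, the boundary itself is the set of points where θ^T x = 0. The decision boundary can be a simple straight line or a more complex quadratic curve such as a circle; different sample distributions call for different boundary functions.

3.2 Cost Function of Logistic Regression

The cost function in logistic regression plays the same role as in linear regression: it is the objective that is optimized in order to estimate the parameters θ. Linear regression uses the sum of squared errors of the model as its cost function. In principle logistic regression could do the same, but the result would be a non-convex function. Such a cost function can have many local minima, which makes it very hard for gradient descent to find the global minimum. The cost function of logistic regression is therefore redefined as

$$J(\theta) = \frac{1}{n} \sum_{i=1}^{n} \mathrm{Cost}(H_\theta(x_i), y_i) \tag{3-6}$$

where

$$\mathrm{Cost}(H_\theta(x), y) = \begin{cases} -\log(H_\theta(x)), & y = 1 \\ -\log(1 - H_\theta(x)), & y = 0 \end{cases} \tag{3-7}$$

so that the cost function of logistic regression can be written as

$$J(\theta) = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log H_\theta(x_i) + (1 - y_i) \log(1 - H_\theta(x_i)) \right] \tag{3-8}$$

With this cost function, gradient descent can be used to find the parameters that minimize it. Gradient descent for logistic regression is:

Repeat: {

$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta) = \theta_j - \frac{\alpha}{n} \sum_{i=1}^{n} (H_\theta(x_i) - y_i)\, x_{i,j}$$

}

Here x_{i,j} is the j-th feature of the i-th sample and α is the learning rate. The convexity analysis of this cost function is left to the relevant mathematics references; here we only state the conclusion that the cost function of logistic regression is convex and has no local optima other than the global one. As in linear regression, all parameters θ_j must be updated simultaneously in each gradient descent step. The update rule looks the same as the one for linear regression, but the hypothesis H_θ(x) is different, so the two gradient descent algorithms are in fact not the same. In addition, if the input features span very different ranges, feature scaling by normalization is still necessary before running gradient descent.

Besides gradient descent, the cost function of logistic regression can be minimized with faster and more sophisticated algorithms, such as conjugate gradient, the quasi-Newton BFGS method (Broyden-Fletcher-Goldfarb-Shanno), and limited-memory BFGS (L-BFGS).

3.3 Optimization Methods

To minimize the cost function, one can iterate with gradient descent, which uses first derivatives, or use the Hessian matrix of second derivatives. The most common second-order method is Newton's method. Newton's method approximately solves equations over the real and complex numbers. Its main advantage over gradient descent is fast convergence. Because it relies on second derivatives, its drawback, shared with the other sophisticated algorithms mentioned in Section 3.2, is a high computational cost per iteration. As hardware performance keeps improving in line with Moore's law, these faster-converging but more complex algorithms are increasingly used in engineering practice.

For the logistic regression cost function J(θ), both gradient descent and Newton's method aim to find the θ that minimizes J(θ), which can be expressed as

$$\frac{\partial}{\partial \theta_j} J(\theta) \overset{\text{set}}{=} 0, \quad \theta \in \mathbb{R}^{m+1} \tag{3-9}$$

Gradient descent or Newton's method is used to find the θ that comes closest to satisfying (3-9). If we assume θ ∈ R, the condition can be written as

$$f(\theta) = \frac{d}{d\theta} J(\theta) = 0 \tag{3-10}$$

The idea of Newton's method is to linearize f(θ) around some point θ_0:

$$f(\theta) \approx f(\theta_0) + f'(\theta_0)(\theta - \theta_0) \tag{3-11}$$

Setting the left-hand side of (3-11) to zero gives

$$\theta = \theta_0 - \frac{f(\theta_0)}{f'(\theta_0)} \tag{3-12}$$

Repeating this step produces a sequence {θ_0, θ_1, θ_2, θ_3, …}, and Newton's method converges quickly.

Expressed in terms of the cost function, Newton's method is

$$\theta_{t+1} = \theta_t - \frac{J'(\theta_t)}{J''(\theta_t)} \tag{3-13}$$

where t is the iteration number. For θ ∈ R^{m+1}, Newton's method is written formally as

$$\theta_{t+1} = \theta_t - \frac{\nabla J(\theta_t)}{H J(\theta_t)} \tag{3-14}$$

where the quotient is shorthand for multiplying the gradient by the inverse of the Hessian matrix, as made precise in (3-17).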
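The scalar update (3-13) can be sketched in a few lines of Python before moving on to the multivariate case. This is a minimal illustration under stated assumptions, not code from the book: the function name `newton_minimize_1d` and the toy cost J(θ) = e^θ − 2θ are hypothetical, chosen only because the cost is convex with a known minimizer θ = ln 2.

```python
import math

def newton_minimize_1d(d1, d2, theta0, tol=1e-10, max_iter=50):
    """Iterate theta <- theta - J'(theta) / J''(theta), as in (3-13).

    d1 and d2 are callables returning the first and second derivative of J.
    Returns the final theta and the list of all iterates.
    """
    theta = theta0
    history = [theta]
    for _ in range(max_iter):
        step = d1(theta) / d2(theta)   # Newton step for solving J'(theta) = 0
        theta -= step
        history.append(theta)
        if abs(step) < tol:            # stop once the update is negligible
            break
    return theta, history

# Hypothetical toy cost J(theta) = exp(theta) - 2*theta (convex, minimum at ln 2)
d1 = lambda t: math.exp(t) - 2.0       # J'(theta)
d2 = lambda t: math.exp(t)             # J''(theta), always positive
theta_star, history = newton_minimize_1d(d1, d2, theta0=0.0)
print("theta:", theta_star, "iterations:", len(history) - 1)
```

Printing `history` shows the error shrinking roughly quadratically from one iterate to the next, which is the fast convergence the text refers to. The sketch has no safeguard for J''(θ) ≤ 0 or for a poor starting point.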
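In the multivariate Newton update (3-14), ∇J(θ) plays the role that the first derivative plays in the scalar case. It collects the first-order partial derivatives and is called the gradient vector:

$$\nabla J(\theta) = \begin{bmatrix} \dfrac{\partial J(\theta)}{\partial \theta_0} \\ \dfrac{\partial J(\theta)}{\partial \theta_1} \\ \dfrac{\partial J(\theta)}{\partial \theta_2} \\ \vdots \\ \dfrac{\partial J(\theta)}{\partial \theta_m} \end{bmatrix}_{(m+1) \times 1} \tag{3-15}$$

Likewise, HJ(θ) plays the role of the second derivative. It collects the second-order partial derivatives and is called the Hessian matrix:

$$H J(\theta) = \begin{bmatrix} \dfrac{\partial^2 J(\theta)}{\partial \theta_0 \partial \theta_0} & \dfrac{\partial^2 J(\theta)}{\partial \theta_0 \partial \theta_1} & \dfrac{\partial^2 J(\theta)}{\partial \theta_0 \partial \theta_2} & \cdots & \dfrac{\partial^2 J(\theta)}{\partial \theta_0 \partial \theta_m} \\ \dfrac{\partial^2 J(\theta)}{\partial \theta_1 \partial \theta_0} & \dfrac{\partial^2 J(\theta)}{\partial \theta_1 \partial \theta_1} & \dfrac{\partial^2 J(\theta)}{\partial \theta_1 \partial \theta_2} & \cdots & \dfrac{\partial^2 J(\theta)}{\partial \theta_1 \partial \theta_m} \\ \dfrac{\partial^2 J(\theta)}{\partial \theta_2 \partial \theta_0} & \dfrac{\partial^2 J(\theta)}{\partial \theta_2 \partial \theta_1} & \dfrac{\partial^2 J(\theta)}{\partial \theta_2 \partial \theta_2} & \cdots & \dfrac{\partial^2 J(\theta)}{\partial \theta_2 \partial \theta_m} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial^2 J(\theta)}{\partial \theta_m \partial \theta_0} & \dfrac{\partial^2 J(\theta)}{\partial \theta_m \partial \theta_1} & \dfrac{\partial^2 J(\theta)}{\partial \theta_m \partial \theta_2} & \cdots & \dfrac{\partial^2 J(\theta)}{\partial \theta_m \partial \theta_m} \end{bmatrix}_{(m+1) \times (m+1)} \tag{3-16}$$

Newton's formula is therefore usually written as

$$\theta_{t+1} = \theta_t - H^{-1}(J(\theta_t))\, \nabla J(\theta_t) \tag{3-17}$$

where −H^{-1}(J(θ_t)) ∇J(θ_t) is called the Newton direction. When the Hessian matrix is positive definite, the Newton direction is guaranteed to be a descent direction.

Newton's method converges quadratically and therefore quickly, but the Hessian matrix adds complexity: when the dimension is large, inverting the Hessian is very expensive. If the Hessian is not invertible, the Newton step is undefined. If the Hessian is not positive definite, that is, the function is not strictly convex, the algorithm may fail to converge. If the initial value is too far from the minimizer, the algorithm may also fail to converge. In other words, the basic Newton's method is not a global optimization algorithm.

Table 3.1 compares the strengths and weaknesses of Newton's method and gradient descent.

Table 3.1 Comparison of Newton's method and gradient descent

| Gradient descent | Newton's method |
|---|---|
| Simpler | More complex |
| Requires tuning parameters such as the learning rate | No parameters to tune |
| More iterations | Fewer iterations |
| Low cost per iteration, O(m), where m is the number of features | High cost per iteration, O(m^3), where m is the number of features |
| Recommended when m is large; a reasonable choice for m > 10000 | Recommended when m is small; for m < 1000 the Hessian is easy to compute and Newton's method is a reasonable choice |

Most machine learning algorithms amount to building an optimization model and optimizing an objective function in order to train the best model. Besides the common gradient descent and Newton's method, there are variants of gradient descent such as Stochastic Gradient Descent (SGD) and Batch Gradient Descent (BGD), and Quasi-Newton Methods that improve on Newton's method. There are also heuristic optimization methods based on rules of thumb that people use when solving problems. Examples include Simulated Annealing (SA) for combinatorial optimization, inspired by the cooling of solids in physics; the Genetic Algorithm (GA), a bio-inspired method based on natural heredity for complex nonlinear optimization problems; the more recent Differential Evolution (DE) algorithm, which is similar to GA; and methods that mimic the foraging of populations, such as Particle Swarm Optimization (PSO) and the Artificial Bee Colony (ABC) algorithm. See the related chapters later in the book for details.

3.4 Solving Classification Problems with Logistic Regression

3.4.1 Example 1: Logistic Regression with Newton's Method

1. Problem description

1) Data

Consider a data set of high school students, of whom 40 were admitted to college and 40 were rejected. Each training sample (x_i, y_i) contains a student's scores on two standardized exams and a label indicating whether the student was admitted. The task is to build a binary classification model that predicts the probability of admission from the two exam scores. In the training data:

(1) the matrix x holds the first exam scores of all students in its first column and the second exam scores in its second column;
(2) the vector y labels admitted students with 1 and rejected students with 0.

The data files ex341x.dat and ex341y.dat are in the data and source listing that accompanies the book.

2) Plotting

Load the training data and add the constant term x_0 = 1 to the matrix x. Before applying Newton's method, plot the data using different symbols for the two classes. In MATLAB/Octave the positive and negative samples can be separated with the following commands:

    % Find the row indices of the samples labelled 1 and 0
    pos = find(y == 1); neg = find(y == 0);
    % After adding the constant column, the features are in columns 2 and 3 of x
    plot(x(pos, 2), x(pos, 3), 'r+'); hold on;
    plot(x(neg, 2), x(neg, 3), 'bo');

Running the program produces the plot shown in Figure 3.2.

Figure 3.2 Training data

3) Newton's method

Recall that the hypothesis function of logistic regression is

$$H_\theta(x) = g(\theta^T x) = P(y = 1 \mid x; \theta)$$

where g(·) is the activation function, commonly the Sigmoid function. MATLAB/Octave has no built-in Sigmoid function, but since its mathematical expression is

$$g(z) = \frac{1}{1 + e^{-z}}$$

the simplest approach is to define it as a one-line anonymous function:

    % Define the Sigmoid function (anonymous function; inline() is deprecated)
    g = @(z) 1.0 ./ (1.0 + exp(-z));

Recall the definition of the cost function J(θ):

$$J(\theta) = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log H_\theta(x_i) + (1 - y_i) \log(1 - H_\theta(x_i)) \right]$$

Newton's method is used to minimize this cost function. Recall its update rule:

$$\theta_{t+1} = \theta_t - H^{-1}(J(\theta_t))\, \nabla J(\theta_t)$$

For logistic regression the gradient vector and the Hessian matrix are

$$\nabla J(\theta) = \frac{1}{n} \sum_{i=1}^{n} (H_\theta(x_i) - y_i)\, x_i$$

$$H(\theta) = \frac{1}{n} \sum_{i=1}^{n} \left[ H_\theta(x_i)\,(1 - H_\theta(x_i))\, x_i x_i^T \right]$$

Note: these formulas are in vector form. Explicitly, H_θ(x_i) and y_i are real scalars, while x_i ∈ R^{m+1} and x_i x_i^T ∈ R^{(m+1)×(m+1)}.

4) Implementing the classifier

Now implement Newton's method in code, starting from the initial value θ = 0 (the zero vector). To determine how many iterations are needed, compute J(θ) at every iteration and plot the result. Convergence usually takes 5 to 15 iterations; if more are needed, check the implementation for errors. Once the algorithm has converged, use θ to find the decision boundary of the classification problem. The boundary is defined as the line

$$P(y = 1 \mid x; \theta) = g(\theta^T x) = 0.5$$

Plotting the boundary is equivalent to plotting the line θ^T x = 0. The result should look like Figure 3.3.

Figure 3.3 Decision boundary

5) Questions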
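Based on the theory above, answer the following two questions:

(1) What values of θ are obtained? How many iterations are needed for convergence?
(2) For a student who scores 20 on the first exam and 80 on the second, what is the probability of being admitted to college?

2. Reference solution

Use the following solution to check your implementation and answers. If you obtain different results with the same parameters and function definitions, debug your code until the results agree.

The source file ex341.m is in the source listing that accompanies the book.

The final values of θ from Newton's method should be

$$\theta_0 = -16.38, \quad \theta_1 = 0.1483, \quad \theta_2 = 0.1589$$

The plot of the cost function should resemble Figure 3.4.

Figure 3.4 Cost function curve

Figure 3.4 shows that Newton's method converges after about 5 iterations. In fact, printing the values of J shows that J changes by less than 10^{-7} between the fourth and fifth iterations. Gradient descent typically needs hundreds or even thousands of iterations to converge, so Newton's method is considerably faster in terms of iteration count.

For a student who scores 20 on the first exam and 80 on the second, the predicted probability of being rejected is 0.668; equivalently, the probability of admission is 1 − 0.668 = 0.332.

3.4.2 Example 2: Binary Classification with Logistic Regression

1. Problem description

1) Data

Suppose a class has 20 students. For each student we record the time spent reviewing the course (in hours) and the review efficiency. The exam result is labelled 1 for pass and 0 for fail. The data are shown in Table 3.2.

Table 3.2 Review and exam result data for the class

| Sample | Time (hours) | Efficiency | Result |
|---|---|---|---|
| Student 0# | 1 | 0.1 | 0 |
| Student 1# | 2 | 0.9 | 0 |
| Student 2# | 2 | 0.4 | 0 |
| Student 3# | 4 | 0.9 | 1 |
| Student 4# | 5 | 0.4 | 0 |
| Student 5# | 6 | 0.4 | 0 |
| Student 6# | 6 | 0.8 | 1 |
| Student 7# | 6 | 0.7 | 1 |
| Student 8# | 7 | 0.2 | 0 |
| Student 9# | 7 | 0.8 | 1 |
| Student 10# | 7 | 0.9 | 1 |
| Student 11# | 8 | 0.1 | 0 |
| Student 12# | 8 | 0.6 | 1 |
| Student 13# | 8 | 0.8 | 1 |
| Student 14# | 3 | 0.9 | 0 |
| Student 15# | 8 | 0.5 | 1 |
| Student 16# | 7 | 0.2 | 0 |
| Student 17# | 4 | 0.5 | 0 |
| Student 18# | 4 | 0.7 | 1 |
| Student 19# | 2 | 0.9 | 1 |

2) Task

Build a logistic regression model that classifies the review and exam result data set. Implement it with the LogisticRegression class of the Python machine learning library Sklearn, report the model accuracy and parameters, and display the confusion matrix.

2. Reference solution

The Python source listing for the logistic regression classifier is as follows:

    import numpy as np
    from prettytable import PrettyTable
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score, confusion_matrix
    from sklearn.model_selection import train_test_split

    # Features from Table 3.2: [review time in hours, review efficiency]
    X = np.array([
        [1, 0.1], [2, 0.9], [2, 0.4], [4, 0.9], [5, 0.4],
        [6, 0.4], [6, 0.8], [6, 0.7], [7, 0.2], [7, 0.8],
        [7, 0.9], [8, 0.1], [8, 0.6], [8, 0.8], [3, 0.9],
        [8, 0.5], [7, 0.2], [4, 0.5], [4, 0.7], [2, 0.9],
    ])
    # Labels: 1 = pass, 0 = fail
    y_true = np.array([0, 0, 0, 1, 0, 0, 1, 1, 0, 1,
                       1, 0, 1, 1, 0, 1, 0, 0, 1, 1])

    # Split the data into training and test sets at a ratio of 8:2
    X_train, X_test, y_train, y_test = train_test_split(X, y_true, test_size=0.2)
    reg = LogisticRegression(C=1e5, solver='lbfgs')
    # Train
    reg.fit(X_train, y_train)
    # Test
    y_pred = reg.predict(X_test)
    print('test accuracy:\n', accuracy_score(y_test, y_pred))  # model accuracy
    print('weights:\n', reg.coef_, '\nbias:\n', reg.intercept_)  # model parameters
    pre = reg.predict(X)
    cm = confusion_matrix(y_true, pre)
    print("confusion_matrix:")  # confusion matrix over the full data set
    cm_table = PrettyTable(["", "predict: 0 class", "predict: 1 class"])
    cm_table.add_row(["true: 0 class", cm[0, 0], cm[0, 1]])
    cm_table.add_row(["true: 1 class", cm[1, 0], cm[1, 1]])
    print(cm_table)

Because train_test_split shuffles the data randomly, the numbers vary from run to run. In the run reported here, the test accuracy of the logistic regression model is 75%, and the linear part of the fitted model is

$$\theta^T x = 1.276\,x_0 + 14.395\,x_1 - 15.153$$

with H(x) = g(θ^T x), where x_0 and x_1 here denote the review time and the review efficiency. The confusion matrix shows that 8 samples with label 0 and 10 samples with label 1 are classified correctly, while 2 samples with label 0 are misclassified as 1.

3.5 Regularization

Regularization introduces additional information into a machine learning model in order to prevent overfitting and improve the generalization ability of the model. Training a machine learning algorithm can lead to three outcomes. The first is underfitting, also called high bias: the model cannot fit the training data well. It typically appears early in training, and the algorithm parameters must be adjusted repeatedly to improve the model's capacity to represent the data. The second is overfitting, also called high variance: the model captures the training data too thoroughly and fits almost every point in the training set, so that noise in the data is learned as if it were a feature. It typically appears late in training, or when the training set is small and the model is complex. When overfitting occurs, the model predicts new samples poorly, that is, accuracy on the test set drops sharply. The third outcome is the "just right" model and parameters that we want to keep: the model represents the features of the training data well and adapts well to the test data, so it generalizes well.

When a machine learning model overfits, remedies can be sought from both the data set and the model. If the training set is too small, enlarge it. If there are too many features, discard the redundant ones and drop those that contribute little to model accuracy. On the model side, overfitting can be addressed with regularization, which adds a regularization term to the cost function.

Recall the cost function of the linear regression model:

$$J(\theta) = \frac{1}{2n} \sum_{i=1}^{n} \left[ H_\theta(x_i) - y_i \right]^2$$

With a regularization term added, the cost function of linear regression becomes

$$J(\theta) = \frac{1}{2n} \left[ \sum_{i=1}^{n} \left[ H_\theta(x_i) - y_i \right]^2 + \lambda \sum_{j=1}^{m} \theta_j^2 \right]$$

where λ is the regularization parameter. It controls the trade-off between fitting the training data and keeping the parameter values small, which keeps the fitted model relatively simple and thus helps avoid overfitting. The regularized cost function of linear regression can likewise be optimized with gradient descent or with the least-squares normal equation.

Recall the cost function of the logistic regression model:

$$J(\theta) = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log H_\theta(x_i) + (1 - y_i) \log(1 - H_\theta(x_i)) \right]$$

With a regularization term added, the cost function of logistic regression becomes

$$J(\theta) = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log H_\theta(x_i) + (1 - y_i) \log(1 - H_\theta(x_i)) \right] + \frac{\lambda}{2n} \sum_{j=1}^{m} \theta_j^2 \tag{3-18}$$

The regularized cost function of logistic regression can also be optimized with gradient descent or Newton's method.

3.6 Worked Examples of Regularized Linear and Logistic Regression

This section uses examples to analyze how linear regression and logistic regression with a regularization term solve regression and classification problems. The problem is described as follows.

1) Data

The data consist of two data sets: ex361x.dat and ex361y.dat for linear regression, and ex362x.dat and ex362y.dat for logistic regression.

2) Plotting

The data correspond to the variables x and y handled by the program. The input x is a single feature, so the label y can be plotted against x in two dimensions. Referring to the source code that accompanies the book, write the MATLAB/Octave code to plot the data yourself. The resulting plot is shown in Figure 3.5.

Figure 3.5 Linear regression data

Figure 3.5 suggests that a straight line is too simple to approximate the data. A higher-order polynomial can be tried instead, so that the fit follows the variation of the data points more closely. For example, a fifth-order polynomial:

$$H_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3 + \theta_4 x^4 + \theta_5 x^5$$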
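This amounts to a hypothesis with 6 features, because (x^0, x^1, x^2, x^3, x^4, x^5) are all the features of the regression. Note that although a polynomial is used to approximate the data, this is still a linear regression problem, because the hypothesis function is linear in each feature.

Fitting a fifth-order polynomial to a data set of only 7 points is very likely to overfit. To prevent overfitting, the model must be regularized.

Recall that the goal of regularization is to minimize, with respect to θ, the cost function

$$\min_\theta: \; J(\theta) = \frac{1}{2n} \left[ \sum_{i=1}^{n} \left[ H_\theta(x_i) - y_i \right]^2 + \lambda \sum_{j=1}^{m} \theta_j^2 \right]$$

where λ is the regularization parameter that controls the fit. As the magnitudes of the fitted parameters grow, the regularization penalty in the cost function grows with them. The penalty depends on both the squares of the parameters and the size of λ. Note that the regularization sum does not include θ_0^2.

3.6.1 Example 1: Regularized Linear Regression with the Least-Squares Normal Equation

The cost function of a linear regression model can be minimized with gradient descent or with the least-squares normal equation. Because the training set is small, this example uses the normal equation to minimize the regularized cost function.

Recall the least-squares normal equation of linear regression:

$$\theta = (X^T X)^{-1} X^T Y$$

For the regularized cost function the normal equation is

$$\theta = (X^T X + \lambda A)^{-1} X^T Y, \qquad A = \begin{bmatrix} 0 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{bmatrix}_{(m+1) \times (m+1)}$$

The regularization matrix A is an (m+1)×(m+1) diagonal matrix whose top-left diagonal entry is 0 and whose other diagonal entries are all 1. Here m is the number of features, not counting the constant term. The definitions of the input matrix X and the label vector Y are unchanged. The input matrix is built with the following code:

    x = [ones(n, 1), x, x.^2, x.^3, x.^4, x.^5];

This produces an n×(m+1) input matrix X containing n training samples, m features, and one constant term. The data set of this example provides only the first-order feature, so the higher-order features have to be computed with the code above.

Next, solve for θ with the normal equation for different values of the regularization parameter, for example the three cases λ = 0, λ = 1, and λ = 10. Once θ has been found, check it against the values in the reference solution. Then, following the code in the source listing, plot the polynomial fit for each value of λ yourself; the result should resemble Figure 3.6. Study Figure 3.6 and summarize how the regularization parameter λ affects the model.

Figure 3.6 Comparison of models for different regularization parameters (panels (a), (b), (c))

3.6.2 Example 2: Regularized Logistic Regression with Newton's Method

The second part of the worked examples uses Newton's method to optimize the regularized logistic regression model. First load the logistic regression training set, which contains two features. To avoid confusion with the linear regression example above, the two features in the input data x are denoted u and v, and the label is denoted y. Because this is a binary classification problem, y is either 1 or 0. Take the feature u as the horizontal axis and the feature v as the vertical axis, mark samples with label 1 as positive and samples with label 0 as negative, and write the following MATLAB/Octave code yourself:

    x = load('ex362x.dat'); y = load('ex362y.dat');
    figure;
    pos = find(y == 1); neg = find(y == 0);
    plot(x(pos, 1), x(pos, 2), '+'); hold on;
    plot(x(neg, 1), x(neg, 2), 'o');

Running this code produces the data plot shown in Figure 3.7.

Figure 3.7 Logistic regression data

Figure 3.7 shows that the data form two concentric layers, one inside the other. We now build a regularized logistic regression model to classify this data set. First design the input matrix. Taking all monomials of u and v up to the sixth degree, that is, every term that can appear in a sixth-order polynomial, gives the input feature vector in (3-19). Here x is a vector of 28 features, as listed in (3-20).

$$x = \begin{bmatrix} 1 & u & v & u^2 & uv & v^2 & u^3 & u^2 v & u v^2 & v^3 & u^4 & u^3 v & u^2 v^2 & u v^3 & v^4 & \cdots & v^6 \end{bmatrix}^T \tag{3-19}$$

$$(x_0 = 1,\; x_1 = u,\; x_2 = v,\; x_3 = u^2,\; x_4 = uv,\; x_5 = v^2,\; \ldots,\; x_{27} = v^6) \tag{3-20}$$

Note: the features fed to the logistic regression model are not u and v but x_0, x_1, x_2, ….

To avoid the tedium of enumerating the terms of x by hand, call the function provided in the source listing that maps the raw input data to the feature vector. The file is named map_feature.m.

The function in the reference code can transform a single training sample or the entire training set. Copy the file to the working directory and call the function as follows:

    x = map_feature(u, v);

The function returns the transformed input features. The implementation assumes that the raw feature data are stored in the column vectors u and v, so if only one training sample is passed in, each column vector is a scalar.

Note that when using this function, the two inputs must be column vectors of the same length.

Before building the model, keep in mind that the goal is to minimize the cost function of the regularized logistic regression model. Recall its expression:

$$\min_\theta: \; J(\theta) = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log H_\theta(x_i) + (1 - y_i) \log(1 - H_\theta(x_i)) \right] + \frac{\lambda}{2n} \sum_{j=1}^{m} \theta_j^2$$

This cost function is the same as that of unregularized logistic regression except for the regularization term added at the end, so gradient descent or Newton's method can still be used to find the minimizing parameters.

Recall the update rule of Newton's method:

$$\theta_{t+1} = \theta_t - H^{-1}(J(\theta_t))\, \nabla J(\theta_t)$$

With the regularization term, the gradient vector and the Hessian matrix become

$$\nabla J(\theta) = \begin{bmatrix} \dfrac{1}{n} \sum_{i=1}^{n} (H_\theta(x_i) - y_i)\, x_{i,0} \\ \dfrac{1}{n} \sum_{i=1}^{n} (H_\theta(x_i) - y_i)\, x_{i,1} + \dfrac{\lambda}{n} \theta_1 \\ \dfrac{1}{n} \sum_{i=1}^{n} (H_\theta(x_i) - y_i)\, x_{i,2} + \dfrac{\lambda}{n} \theta_2 \\ \vdots \\ \dfrac{1}{n} \sum_{i=1}^{n} (H_\theta(x_i) - y_i)\, x_{i,m} + \dfrac{\lambda}{n} \theta_m \end{bmatrix}$$

$$H(\theta) = \frac{1}{n} \sum_{i=1}^{n} \left[ H_\theta(x_i)\,(1 - H_\theta(x_i))\, x_i x_i^T \right] + \frac{\lambda}{n} \begin{bmatrix} 0 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{bmatrix}_{(m+1) \times (m+1)}$$

Here x_{i,j} is the j-th feature of the i-th sample. If the regularization parameter is λ = 0, these formulas reduce to the unregularized gradient vector and Hessian matrix. In the formulas, x_i is an (m+1)×1 vector, which in this example is 28×1, so the gradient vector ∇J(θ) is also 28×1. The product x_i x_i^T has the same dimensions as the Hessian matrix, (m+1)×(m+1), which in this example is 28×28. Both y_i and H_θ(x_i) are scalars.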
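The regularized gradient and Hessian above can be combined into a complete Newton iteration. The following NumPy code is a minimal sketch under stated assumptions and is not the book's ex362.m: the function name `newton_logistic`, the synthetic data, and the degree-2 feature map are hypothetical stand-ins for the ex362 data set and the degree-6 map_feature.m, which are not reproduced here.

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def newton_logistic(X, y, lam=1.0, n_iter=15):
    """Regularized Newton's method; X is n x (m+1) with a leading column of ones."""
    n, d = X.shape
    A = np.eye(d)
    A[0, 0] = 0.0                      # theta_0 is not penalized
    theta = np.zeros(d)
    costs = []
    for _ in range(n_iter):
        h = sigmoid(X @ theta)
        # Regularized cost J(theta), recorded before the update
        cost = (-np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))
                + lam / (2 * n) * np.sum(theta[1:] ** 2))
        costs.append(cost)
        grad = X.T @ (h - y) / n + (lam / n) * (A @ theta)
        # Sum over samples of h_i (1 - h_i) x_i x_i^T, plus the penalty term
        hess = (X.T * (h * (1 - h))) @ X / n + (lam / n) * A
        theta = theta - np.linalg.solve(hess, grad)   # Newton step
    return theta, np.array(costs)

# Hypothetical synthetic data: points inside the unit circle are labelled 1,
# with 10% of the labels flipped so the classes are not perfectly separable.
rng = np.random.default_rng(0)
u, v = rng.normal(size=(2, 200))
y = (u ** 2 + v ** 2 < 1.0).astype(float)
flip = rng.random(200) < 0.1
y = np.where(flip, 1.0 - y, y)

# Hypothetical degree-2 feature map (the text's map_feature uses degree 6)
X = np.column_stack([np.ones_like(u), u, v, u ** 2, u * v, v ** 2])

theta, costs = newton_logistic(X, y, lam=1.0)
print("norm(theta):", np.linalg.norm(theta))
print("cost per iteration:", np.round(costs, 4))
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse Hessian explicitly, which is cheaper and numerically safer than computing H^{-1} and multiplying. The sketch runs a fixed number of iterations rather than testing for convergence, so the printed costs are what indicate whether the iteration has levelled off.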
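The regularization matrix in the Hessian formula is a 28×28 diagonal matrix whose top-left entry is 0 and whose remaining diagonal entries are all 1. Next, run Newton's method to solve for θ with different values of the regularization parameter, for example the three cases λ = 0, λ = 1, and λ = 10. To check whether Newton's method has converged, print the value of the cost function J(θ) at every iteration and inspect the plot: over the last few iterations the cost should have levelled off. If it is still decreasing there, check that the cost function is defined correctly, check the definitions of the gradient vector and the Hessian matrix, and make sure the regularization part contains no errors. Once Newton's method has converged, use the resulting θ to find the decision boundary of the classification problem, which is defined by

$$P(y = 1 \mid x; \theta) = 0.5 \;\Rightarrow\; \theta^T x = 0$$

Plotting the decision boundary is harder than plotting the best-fit curve of linear regression, because the implicit curve θ^T x = 0 has to be drawn as a contour line. Concretely, evaluate θ^T x on a grid over the raw input plane, with u on the horizontal axis and v on the vertical axis, and then draw the curve on which θ^T x equals 0. The plots produced in MATLAB/Octave are shown in Figure 3.8. For the best presentation, use the axis ranges shown in the figure.

Figure 3.8 Comparison of models for different regularization parameters (panels (a), (b), (c))

The implementation, which calls the map_feature function, is:

    u = linspace(-1, 1.5, 200);
    v = linspace(-1, 1.5, 200);
    z = zeros(length(u), length(v));
    for i = 1:length(u)
        for j = 1:length(v)
            z(i,j) = map_feature(u(i), v(j))*theta;
        end
    end

To draw the decision boundary:

    % Transpose z before calling contour to display the plot
    z = z';
    contour(u, v, z, [0, 0], 'LineWidth', 1)

Adjust the value of λ and run the code; the three values of λ give the results shown in Figure 3.8.

Finally, because θ has 28 elements, the reference solution does not list an element-by-element comparison. In MATLAB/Octave the command norm(theta) computes the L2 norm of θ, which can be compared with the value in the reference solution to check whether your implementation is correct.

3.6.3 Reference Solutions

After completing the examples above, use the following solutions to check whether your results are correct. If you obtain different results with the same parameters and functions as below, debug your code until it reproduces the results given here. The implementations of regularized linear regression and logistic regression, in the source files ex361.m and ex362.m, are in the code listing that accompanies the book.

1. Example 1: results of regularized linear regression with the least-squares normal equation (see Table 3.3)

Table 3.3 Results of regularized linear regression with the least-squares normal equation

| Parameter | λ=0 | λ=1 | λ=10 |
|---|---|---|---|
| θ0 | 0.4725 | 0.3976 | 0.5205 |
| θ1 | 0.6814 | -0.4207 | -0.1825 |
| θ2 | -1.3801 | 0.1296 | 0.0606 |
| θ3 | -5.9777 | -0.3975 | -0.1482 |
| θ4 | 2.4417 | 0.1753 | 0.0743 |
| θ5 | 4.7371 | -0.3394 | -0.1280 |
| norm(θ) | 8.1687 | 0.8098 | 0.5931 |

Note: as λ increases, the norm of θ decreases. This is because a larger λ penalizes large fitted parameters more heavily. Adjusting λ therefore gives finer control over how closely the model fits the data.

In Figure 3.6(a), λ = 0, so the fit is identical to unregularized linear regression. The optimization seeks the smallest squared error, and the curve matches this particular data set, but it may not represent the underlying trend of the data well. This is overfitting.

In Figure 3.6(b), raising the regularization parameter to λ = 1 greatly reduces the overfitting. The fitted function is still a fifth-order polynomial, but the curve is simpler than the one in Figure 3.6(a).

In Figure 3.6(c), λ is too large and underfitting appears: the curve no longer follows the trend of the data points as well as before.

2. Example 2: results of regularized logistic regression with Newton's method (see Table 3.4)

The following are the norms of θ after Newton's method has converged. For λ = 0, convergence takes 15 iterations; for λ = 1 and λ = 10 it takes 5 iterations or fewer.

Table 3.4 Results of regularized logistic regression with Newton's method

| Parameter | λ=0 | λ=1 | λ=10 |
|---|---|---|---|
| norm(θ) | 7.1727e+03 | 4.2400 | 0.9384 |

Note: as λ increases, the norm of θ decreases, and the corresponding plots show a clear change in the fit.

In Figure 3.8(a), the algorithm tries to find a boundary that separates the positive and negative samples very precisely, so an isolated island labelled y = 0 appears inside a large y = 1 region. Such a classification is too exact: it overfits and does not capture the generalizable trend that the model is meant to find.

In Figure 3.8(b), for λ = 1, the plot shows a simple decision boundary that separates the positive and negative points quite well.

In Figure 3.8(c), for λ = 10, the regularization parameter is already quite large, so the decision boundary cannot follow the trend of the data well. This is especially visible in the lower-left part of the figure, where underfitting appears.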
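The trend noted for both tables, that the norm of θ shrinks as λ grows, can be reproduced directly with the regularized normal equation of Section 3.6.1. The following is a minimal sketch under stated assumptions: the helper `ridge_normal_equation` and the 7-point data set are hypothetical and are not the book's ex361 data, so the printed numbers will not match Table 3.3.

```python
import numpy as np

def ridge_normal_equation(x, y, lam, degree=5):
    """Solve theta = (X^T X + lam * A)^(-1) X^T y for polynomial features of x."""
    X = np.vander(x, degree + 1, increasing=True)  # columns 1, x, ..., x^degree
    A = np.eye(degree + 1)
    A[0, 0] = 0.0                                  # do not penalize theta_0
    return np.linalg.solve(X.T @ X + lam * A, X.T @ y)

# Hypothetical 7-point data set (NOT the book's ex361x.dat / ex361y.dat)
x = np.array([-1.0, -0.7, -0.4, 0.0, 0.3, 0.6, 1.0])
y = np.array([2.1, 1.2, 0.6, 0.5, 0.3, 0.9, 0.2])

penalized_norms = []
for lam in (0.0, 1.0, 10.0):
    theta = ridge_normal_equation(x, y, lam)
    penalized_norms.append(np.linalg.norm(theta[1:]))  # norm of theta_1..theta_5
    print(f"lambda={lam:>4}: norm(theta)={np.linalg.norm(theta):.4f}")
```

The list `penalized_norms` tracks only the penalized coefficients θ_1 to θ_5, because it is that part of θ which the penalty forces toward zero as λ grows; the unpenalized intercept θ_0 is free to move in either direction, as it does in Table 3.3.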
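In summary, regularization is an effective technique for securing the generalization ability of machine learning models. Many regularization methods exist, such as data augmentation, L0 and L1 regularization, L2 regularization, Dropout, and early stopping. Data augmentation extends the training set, for example by rotating, cropping, or applying color space transformations to the original images. L1 and L2 regularization are the most commonly used methods: L1 builds the regularization term from the sum of the absolute values of the parameters, and L2 builds it from the sum of their squares. Dropout is common in neural network training: it temporarily drops a subset of neurons and their connections, and dropping neurons at random with a given probability is effective against overfitting. Early stopping limits the number of iterations used to minimize the cost function. Too few iterations tend to underfit, and too many tend to overfit; early stopping addresses this by choosing a suitable number of iterations.

3.7 Exercises

1. Regularization is an effective way to avoid overfitting. Added to the cost function of a model as a penalty term, it improves the generalization ability of the model. The regularization function can take many forms. It is generally a monotonically increasing function of model complexity, so that the more complex the model, the larger the regularization value. For example, the regularization term can be a norm of the model parameter vector. Briefly describe the specific effects of L0, L1, and L2 norm regularization terms on a logistic regression model.

2. Logistic regression is a simple and effective classification model. It can be used for simple binary classification problems and can also be extended to multi-class problems. Referring to the MATLAB/Octave logistic regression code studied in this chapter, implement a logistic regression model in C/C++ or Python, visualize the training iteration curves and test curves of the model without regularization and with L2 regularization, and discuss the effect of L2 regularization on model performance.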