Introduction

CONTENTS
Part-1 : Learning, Types of Learning, Well Defined Learning Problems, Designing a Learning System
Part-2 : History of ML, Introduction of Machine Learning Approaches : Artificial Neural Network, Clustering, Reinforcement Learning, Decision Tree Learning, Bayesian Network, Support Vector Machine, Genetic Algorithm
Part-3 : Issues in Machine Learning and Data Science Vs Machine Learning

PART-1 : Learning, Types of Learning.

Que 1.1. Define the term learning. What are the components of a learning system ?

Answer
1. Learning refers to the change in a subject's behaviour to a given situation brought by repeated experiences in that situation, provided that the behaviour changes cannot be explained on the basis of native response tendencies, maturation or temporary states of the subject.
2. A learning agent can be thought of as containing a performance element that decides what actions to take and a learning element that modifies the performance element so that it makes better decisions.
3. The design of a learning element is affected by three major issues :
   a. Components of the performance element.
   b. Feedback of components.
   c. Representation of the components.

Fig. 1.1.1 : General learning model.

The important components of learning are :
1. Acquisition of new knowledge :
   a. One component of learning is the acquisition of new knowledge.
   b. Simple data acquisition is easy for computers, even though it is difficult for people.
2. Problem solving :
   a. The other component of learning is the problem solving that is required both to integrate into the system new knowledge that is presented to it and to deduce new information when required facts are not presented.

Que 1.2. Write down the performance measures for learning.

Answer
Following are the performance measures for learning :
1. Generality :
   a. The most important performance measure for learning methods is the generality or scope of the method.
   b. Generality is a measure of the ease with which the method can be adapted to different domains of application.
   c. A completely general algorithm is one which is a fixed or self-adjusting configuration that can learn or adapt in any environment or application domain.
2. Efficiency :
   a. The efficiency of a method is a measure of the average time required to construct the target knowledge structures from some specified initial structures.
   b. Since this measure is often difficult to determine and is meaningless without some standard comparison time, a relative efficiency index can be used instead.
3. Robustness :
   a. Robustness is the ability of a learning system to function with unreliable feedback and with a variety of training examples, including noisy ones.
   b. A robust system must be able to build tentative structures which are subjected to modification or withdrawal if later found to be inconsistent with statistically sound structures.
4. Efficacy : The efficacy of a system is a measure of the overall power of the system. It is a combination of the factors generality, efficiency and robustness.
5. Ease of implementation :
   a. Ease of implementation relates to the complexity of the programs and data structures, and the resources required to develop the given learning system.
   b. Lacking good complexity metrics, this measure will often be somewhat subjective.

Que 1.3. Discuss supervised and unsupervised learning.

Answer
Supervised learning :
1. Supervised learning is also known as associative learning, in which the network is trained by providing it with input and matching output patterns.
2. Supervised training requires the pairing of each input vector with a target vector representing the desired output.
3. The input vector together with the corresponding target vector is called a training pair.

Fig. 1.3.1 : Block diagram of supervised learning.

4. During the training session an input vector is applied to the network, and it results in an output vector.
5. This response is compared with the target response.
6. If the actual response differs from the target response, the network will generate an error signal.
7. This error signal is then used to calculate the adjustment that should be made in the synaptic weights so that the actual output matches the target output.
8. The error minimization in this kind of training requires a supervisor or teacher, hence the name supervised learning.
9. These input-output pairs can be provided by an external teacher, or by the system which contains the neural network (self-supervised).
10. Supervised training methods are used to perform non-linear mapping in pattern classification networks, pattern association networks and multilayer neural networks.
11. Supervised learning generates a global model that maps input objects to desired outputs. In some cases, the map is implemented as a set of local models, such as in case-based reasoning or the nearest neighbour algorithm.
12. In order to solve a given problem of supervised learning, the following steps are considered :
    i. Determine the type of training examples.
    ii. Gather a training set.
    iii. Determine the input feature representation of the learned function.
    iv. Determine the structure of the learned function and the corresponding learning algorithm.
    v. Complete the design.
Unsupervised learning :
1. It is a learning in which an output unit is trained to respond to clusters of patterns within the input.
2. Unsupervised training is employed in self-organising neural networks.
3. This training does not require a teacher.
4. In this method of training, the input vectors of similar types are grouped without the use of training data to specify how a typical member of each group looks or to which group a member belongs.
5. During training the neural network receives input patterns and organizes these patterns into categories.
6. When a new input pattern is applied, the neural network provides an output response indicating the class to which the input pattern belongs.
7. If a class cannot be found for the input pattern, a new class is generated.
8. Though unsupervised training does not require a teacher, it requires certain guidelines to form groups.
9. Grouping can be done based on colour, shape and any other property of the object.
10. It is a method of machine learning where a model is fit to observations. It is distinguished from supervised learning by the fact that there is no a priori output.
11. In this, a data set of input objects is gathered. It treats input objects as a set of random variables. It can be used in conjunction with Bayesian inference to produce conditional probabilities.
12. Unsupervised learning is useful for data compression and clustering.

Fig. 1.3.2 : Block diagram of unsupervised learning.

13. In unsupervised learning, the system is supposed to discover statistically salient features of the input population.
14. Unlike the supervised learning paradigm, there is no a priori set of categories into which the patterns are to be classified; rather the system must develop its own representation of the input stimuli.
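The error-correction training just described (points 4-8 of the supervised learning answer) can be sketched in a few lines of Python. This is a minimal illustration, not from the text : the NAND-style training pairs, the learning rate and the epoch count are all invented for the sketch.

    # Minimal sketch of supervised (error-correction) training:
    # a single linear unit adjusts its weights from the error signal.
    # Data, learning rate and epoch count are illustrative choices.

    # Training pairs: (input vector, target) for a NAND-like function
    pairs = [((0, 0), 1), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

    w = [0.0, 0.0]   # synaptic weights
    b = 0.0          # bias
    lr = 0.1         # learning rate

    for epoch in range(20):
        for x, target in pairs:
            # forward pass: threshold activation gives the actual response
            actual = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
            error = target - actual          # error signal (teacher feedback)
            # adjust weights in proportion to the error signal
            w[0] += lr * error * x[0]
            w[1] += lr * error * x[1]
            b += lr * error

    print(w, b)  # weights after supervised training

After a few epochs the error signal becomes zero on every training pair, which is exactly the stopping condition implied by points 6-7 above.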
Que 1.4. Describe briefly reinforcement learning.

Answer
1. Reinforcement learning is the study of how artificial systems can learn to optimize their behaviour in the face of rewards and punishments.
2. Reinforcement learning algorithms have been developed that are closely related to methods of dynamic programming, which is a general approach to optimal control.
3. Reinforcement learning phenomena have been observed in psychological studies of animal behaviour, and in neurobiological investigations of neuromodulation and addiction.

Fig. 1.4.1 : Block diagram of reinforcement learning (the environment supplies a primary reinforcement signal; the learning system selects actions).

4. The task of reinforcement learning is to use observed rewards to learn an optimal policy for the environment.
5. An optimal policy is a policy that maximizes the expected total reward.
6. Without some feedback about what is good and what is bad, the agent will have no grounds for deciding which move to make.
7. The agent needs to know that something good has happened when it wins and that something bad has happened when it loses.
8. This kind of feedback is called a reward or reinforcement.
9. Reinforcement learning is very valuable in the field of robotics, where the tasks to be performed are frequently complex enough to defy encoding as programs and no training data is available.
10. The robot's task consists of finding out, through trial and error, which actions are good in a certain situation and which are not.
11. In many cases humans learn in a similar way; for example, a child learns to walk, and this usually happens without a teacher, simply through reinforcement.
12. Successful attempts at walking are rewarded by forward progress, and unsuccessful attempts are penalized by often painful falls.
13. Positive and negative reinforcement are also important factors in successful learning.
14. In many complex domains, reinforcement learning is the only feasible way to train a program to perform at a high level.
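A common concrete form of reward-driven learning is tabular Q-learning. The following sketch is illustrative only — the 5-state corridor environment and all constants (learning rate, discount, exploration rate) are assumptions, not from the text — but it shows an agent improving its policy purely from reward feedback, as described above.

    import random

    # Toy sketch of reinforcement learning (tabular Q-learning) on a
    # 5-state corridor: move left/right, reward 1 only at the right end.
    N_STATES, ACTIONS = 5, (0, 1)          # actions: 0 = left, 1 = right
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    alpha, gamma, epsilon = 0.5, 0.9, 0.1  # learning rate, discount, exploration

    for episode in range(200):
        s = 0
        while s != N_STATES - 1:
            # epsilon-greedy: mostly exploit, sometimes explore
            a = random.choice(ACTIONS) if random.random() < epsilon \
                else max(ACTIONS, key=lambda a: Q[s][a])
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == N_STATES - 1 else 0.0   # reward signal
            # update the action-value estimate toward reward + discounted future
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2

    # The learned policy: best action in each state (should prefer "right")
    print([max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES - 1)])

No teacher ever tells the agent which move was correct; the policy emerges from the reward alone, which is the defining property discussed in points 6-8.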
Que 1.5. What are the steps used to design a learning system ?

Answer
Steps used to design a learning system are :
1. Specify the learning task.
2. Choose a suitable set of training data to serve as the training experience.
3. Divide the training data into groups or classes and label them accordingly.
4. Determine the type of knowledge representation to be learned from the training experience.
5. Choose a learner classifier that can generate general hypotheses from the training data.
6. Apply the learner classifier to the test data.
7. Compare the performance of the system with that of an expert human.

Fig. 1.5.1 : Steps in designing a learning system.

PART-2 : Well Defined Learning Problems, Designing a Learning System.

Que 1.6. Write short note on well defined learning problem with example.

Answer
Well defined learning problem : A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Three features in a learning problem :
1. The class of tasks (T).
2. The measure of performance to be improved (P).
3. The source of experience (E).
For example :
1. A checkers learning problem :
   a. Task (T) : Playing checkers.
   b. Performance measure (P) : Percent of games won against opponents.
   c. Training experience (E) : Playing practice games against itself.
2. A handwriting recognition learning problem :
   a. Task (T) : Recognizing and classifying handwritten words within images.
   b. Performance measure (P) : Percent of words correctly classified.
   c. Training experience (E) : A database of handwritten words with given classifications.
3. A robot driving learning problem :
   a. Task (T) : Driving on public four-lane highways using vision sensors.
   b. Performance measure (P) : Average distance travelled before an error (as judged by a human overseer).
   c. Training experience (E) : A sequence of images and steering commands recorded while observing a human driver.

Que 1.7. Describe well defined learning problems role in machine learning.

Answer
Well defined learning problems play the following roles in machine learning :
1. Learning to recognize spoken words :
   a. Successful speech recognition systems employ machine learning in some form.
   b. For example, the SPHINX system learns speaker-specific strategies for recognizing the primitive sounds (phonemes) and words from the observed speech signal.
   c. Neural network learning methods and methods for learning hidden Markov models are effective for automatically customizing to individual speakers, vocabularies, microphone characteristics, background noise, etc.
2. Learning to drive an autonomous vehicle :
   a. Machine learning methods have been used to train computer-controlled vehicles to steer correctly when driving on a variety of road types.
   b. For example, the ALVINN system has used its learned strategies to drive unassisted on public highways among other cars.
   c. This is used in many applications like system automation.
Que 1.8. What are the applications of machine learning ?

Answer
Following are the applications of machine learning :
1. Speech recognition :
   a. Speech Recognition (SR) is the translation of spoken words into text.
   b. It is also known as Automatic Speech Recognition (ASR), computer speech recognition, or Speech To Text (STT).
2. Medical diagnosis :
   a. ML provides methods, techniques, and tools that can help in solving diagnostic and prognostic problems in a variety of medical domains.
   b. It is being used for the analysis of the importance of clinical parameters and their combinations for prognosis.
3. Statistical arbitrage :
   a. In finance, statistical arbitrage refers to automated trading strategies that are typically short-term and involve a large number of securities.
   b. In such strategies, the user tries to implement a trading algorithm for a set of securities on the basis of quantities such as historical correlations and general economic variables.
4. Learning associations : Learning associations is the process of discovering relations between variables in large databases.
5. Information extraction :
   a. Information Extraction (IE) is another application of machine learning.
   b. It is the process of extracting structured information from unstructured data.

Que 1.9. What are the advantages and disadvantages of machine learning ?

Answer
Advantages of machine learning are :
1. Easily identifies trends and patterns :
   a. Machine learning can review large volumes of data and discover specific trends and patterns that would not be apparent to humans.
   b. For an e-commerce website like Flipkart, it serves to understand the browsing behaviours and purchase histories of its users to help cater to the right products, deals, and reminders relevant to them. It uses the results to reveal relevant advertisements to them.
2. No human intervention needed (automation) : Machine learning does not require physical force i.e., no human intervention is needed.
3. Continuous improvement :
   a. As ML algorithms gain experience, they keep improving in accuracy and efficiency.
   b. As the amount of data keeps growing, algorithms learn to make accurate predictions faster.
4. Handling multi-dimensional and multi-variety data : Machine learning algorithms are good at handling data that are multi-dimensional and multi-variety, and they can do this in dynamic or uncertain environments.
Disadvantages of machine learning are :
1. Data acquisition : Machine learning requires massive data sets to train on, and these should be inclusive/unbiased, and of good quality.
2. Time and resources :
   a. ML needs enough time to let the algorithms learn and develop enough to fulfill their purpose with a considerable amount of accuracy and relevancy.
   b. It also needs massive resources to function.
3. Interpretation of results : To accurately interpret results generated by the algorithms, we must carefully choose the algorithms for our purpose.
4. High error-susceptibility :
   a. Machine learning is autonomous but highly susceptible to errors.
   b. It takes time to recognize the source of the issue, and even longer to correct it.

Que 1.10. What are the advantages and disadvantages of different types of machine learning algorithms ?

Answer
Advantages of supervised machine learning algorithms :
1. Classes represent the features on the ground.
2. Training data is reusable unless features change.
Disadvantages of supervised machine learning algorithms :
1. Classes may not match spectral classes.
2. Cost and time are involved in selecting training data.
Advantages of unsupervised machine learning algorithms :
1. No previous knowledge of the image area is required.
2. The opportunity for human error is minimised.
3. It is relatively easy and fast to carry out.
Disadvantages of unsupervised machine learning algorithms :
1. The spectral classes do not necessarily represent the features on the ground.
2. It does not consider spatial relationships in the data.
3. It can take time to interpret the spectral classes.
Advantages of semi-supervised machine learning algorithms :
1. It is easy to understand.
2. It reduces the amount of annotated data used.
3. It is stable and fast convergent.
4. It is simple.
5. It has high efficiency.
Disadvantages of semi-supervised machine learning algorithms :
1. Iteration results are not stable.
2. It is not applicable to network-level data.
3. It has low accuracy.
Advantages of reinforcement learning algorithms :
1. Reinforcement learning is used to solve complex problems that cannot be solved by conventional techniques.
2. This technique is preferred to achieve long-term results which are very difficult to achieve.
3. This learning model is very similar to the learning of human beings. Hence, it is close to achieving perfection.
Disadvantages of reinforcement learning algorithms :
1. Too much reinforcement learning can lead to an overload of states which can diminish the results.
2. Reinforcement learning is not preferable for solving simple problems.
3. Reinforcement learning needs a lot of data and a lot of computation.
4. The curse of dimensionality limits reinforcement learning for real physical systems.

Que 1.11. Write short note on Artificial Neural Network (ANN).

Answer
1. Artificial Neural Networks (ANN) or neural networks are computational algorithms intended to simulate the behaviour of biological systems composed of neurons.
2. ANNs are computational models inspired by an animal's central nervous system. They are capable of machine learning as well as pattern recognition.
3. A neural network is an oriented graph. It consists of nodes which in the biological analogy represent neurons, connected by arcs. The arcs correspond to dendrites and synapses, and each arc is associated with a weight.
4. A neural network is a machine learning algorithm based on the model of a human neuron. The human brain consists of millions of neurons.
5. The brain sends and processes signals in the form of electrical and chemical signals.
6. These neurons are connected with a special structure known as synapses. Synapses allow neurons to pass signals.
7. An Artificial Neural Network is an information processing technique. It works the way the human brain processes information.
8. ANN includes a large number of connected processing units that work together to process information. They also generate meaningful results from it.
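The weighted-arc-and-node picture in the answer above maps directly onto a single artificial neuron : a weighted sum of inputs plus a bias, squashed by an activation function. A minimal sketch, with arbitrary example weights :

    import math

    # Sketch of a single artificial neuron: weighted sum of inputs plus a
    # bias, passed through an activation function. Weights are arbitrary.
    def neuron(inputs, weights, bias):
        # each arc (synapse) carries a weight; the node sums the products
        total = sum(x * w for x, w in zip(inputs, weights)) + bias
        # sigmoid activation squashes the sum into (0, 1)
        return 1.0 / (1.0 + math.exp(-total))

    print(neuron([0.5, 0.3], [0.8, -0.2], bias=0.1))  # one forward pass

A full network is just many such units wired together, with learning algorithms (such as the error-correction rule of Que 1.3) adjusting the arc weights.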
Que 1.12. Write short note on clustering.

Answer
1. Clustering is a division of data into groups of similar objects. Each group or cluster consists of objects that are similar among themselves and dissimilar to objects of other groups, as shown in Fig. 1.12.1.

Fig. 1.12.1 : Clusters.

2. A cluster is a collection of data objects that are similar to one another within the same cluster and are dissimilar to the objects in the other clusters.
3. Clusters may be described as connected regions of a multi-dimensional space containing a relatively high density of points, separated from other such regions by a region containing a relatively low density of points.
4. From the machine learning perspective, clustering can be viewed as unsupervised learning of concepts : it groups data objects without the help of known class labels.
5. In clustering, the class labels are not present in the training data simply because they are not known to begin with. Hence it is a type of unsupervised learning.
6. For this reason, clustering is a form of learning by observation rather than learning by example.
7. There are certain situations where clustering is useful. These include :
   a. The collection and classification of training data can be costly and time consuming, so it is difficult to collect a labelled training data set, and a large number of training samples are not easily labelled. We can train a supervised classifier with a small portion of training data and then use clustering procedures to tune the classifier on the large, unclassified data set.
   b. For data mining, it can be useful to search for groupings among the data and then recognize the clusters.
   c. The properties of feature vectors can change over time. Then, supervised classification is not reasonable, because the test feature vectors may have completely different properties.
   d. Clustering can be useful when it is required to search for good parametric families for the class conditional densities, in case of supervised classification.

Que 1.13. What are the applications of clustering ?

Answer
Following are the applications of clustering :
1. Data reduction :
   a. In many cases, the amount of available data is very large and its processing becomes complicated.
   b. Cluster analysis can be used to group the data into a number of clusters and then process each cluster as a single entity.
   c. In this way, data compression is achieved.
2. Hypothesis generation :
   a. In this case, cluster analysis is applied to a data set to infer hypotheses that concern the nature of the data.
   b. Clustering is used here to suggest hypotheses that must be verified using other data sets.
3. Hypothesis testing : In this context, cluster analysis is used for the verification of the validity of a specific hypothesis.
4. Prediction based on groups :
   a. In this case, cluster analysis is applied to the available data set, and the resulting clusters are characterized based on the characteristics of the patterns by which they are formed.
   b. If an unknown pattern is then given, we can determine the cluster to which it most likely belongs and characterize it based on the characterization of the respective cluster.

Que 1.14. Differentiate between clustering and classification.

Answer

S. No. | Clustering | Classification
1. | Clustering analyzes data objects without known class labels. | In classification, data are grouped by analyzing the data objects whose class labels are known.
2. | There is no prior knowledge of the attributes of the data to form clusters. | There is some prior knowledge of the attributes of each classification.
3. | It is done by grouping only the input data, because the output is not predefined. | Classification is done by classifying output based on the values of the input data.
4. | The number of clusters is not known before clustering. These are identified after the completion of clustering. | The number of classes is known before classification, as there is predefined output based on input data.
5. | It is considered as unsupervised learning because there is no prior knowledge of the class labels. | It is considered as supervised learning because class labels are known beforehand.
Que 1.15. What are the various clustering techniques ?

Answer
1. Clustering techniques are used for combining observed examples into clusters or groups which satisfy two main criteria :
   a. Each group or cluster is homogeneous, i.e., examples that belong to the same group are similar to each other.
   b. Each group or cluster should be different from other clusters, i.e., examples that belong to one cluster should be different from the examples of the other clusters.
2. Depending on the clustering technique, clusters can be expressed in different ways :
   a. Identified clusters may be exclusive, so that any example belongs to only one cluster.
   b. They may be overlapping, i.e., an example may belong to several clusters.
   c. They may be probabilistic, i.e., an example belongs to each cluster with a certain probability.
   d. Clusters might have hierarchical structure.

Fig. 1.15.1 : Types of clustering.

3. Once a criterion function has been selected, clustering becomes a well-defined problem in discrete optimization : we find those partitions of the set of samples that extremize the criterion function.
4. Since the sample set is finite, there are only a finite number of possible partitions; the clustering problem can always be solved by exhaustive enumeration.
Major classifications of clustering techniques are :
1. Hierarchical clustering :
   a. This method works by grouping data objects into a tree of clusters.
   b. This method can be further classified depending on whether the hierarchical decomposition is formed in a bottom-up (merging) or top-down (splitting) fashion.
   Following are the two types of hierarchical clustering :
   i. Agglomerative hierarchical clustering : This bottom-up strategy starts by placing each object in its own cluster and then merges these atomic clusters into larger and larger clusters, until all of the objects are in a single cluster.
   ii. Divisive hierarchical clustering : This top-down strategy does the reverse of the agglomerative strategy by starting with all objects in one cluster. It subdivides the cluster into smaller and smaller pieces, until each object forms a cluster on its own.
2. Partitional clustering :
   a. This method first creates an initial set of partitions, where each partition represents a cluster.
   b. The clusters are formed to optimize an objective partition criterion, such as a dissimilarity function based on distance, so that the objects within a cluster are similar whereas the objects of different clusters are dissimilar.
   Following are the types of partitioning methods :
   i. Centroid-based clustering : It takes the input parameter and partitions a set of objects into a number of clusters so that the resulting intracluster similarity is high but the intercluster similarity is low. Cluster similarity is measured in terms of the mean value of the objects in the cluster, which can be viewed as the cluster's centroid or center of gravity.
   ii. Model-based clustering : This method hypothesizes a model for each of the clusters and finds the best fit of the data to that model.
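Centroid-based partitional clustering, described above, is commonly implemented as k-means. The following is a bare-bones sketch (the 2-D points and the value of k are invented for illustration) : assign each point to its nearest centroid, recompute each centroid as the mean of its cluster, repeat.

    import random

    # Sketch of centroid-based (partitional) clustering: bare-bones
    # k-means on 2-D points. Data and k are illustrative.
    def kmeans(points, k, iters=10):
        centroids = random.sample(points, k)          # initial partition seeds
        clusters = []
        for _ in range(iters):
            clusters = [[] for _ in range(k)]
            for p in points:                          # assign to nearest centroid
                i = min(range(k), key=lambda i: (p[0] - centroids[i][0]) ** 2
                                              + (p[1] - centroids[i][1]) ** 2)
                clusters[i].append(p)
            for i, c in enumerate(clusters):          # recompute mean (centroid)
                if c:
                    centroids[i] = (sum(p[0] for p in c) / len(c),
                                    sum(p[1] for p in c) / len(c))
        return centroids, clusters

    pts = [(1, 1), (1.5, 2), (8, 8), (9, 9), (0.5, 1.2)]
    centroids, clusters = kmeans(pts, k=2)
    print(centroids)   # one centroid near (1, 1.4), one near (8.5, 8.5)

Each iteration raises intracluster similarity (points move to nearer centroids) while keeping the clusters apart, which is exactly the criterion named for centroid-based methods above.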
Que 1.16. Describe reinforcement learning.

Answer
1. Reinforcement learning is the study of how animals and artificial systems can learn to optimize their behaviour in the face of rewards and punishments.
2. Reinforcement learning is related to methods of dynamic programming, which is a general approach to optimal control.
3. Reinforcement learning phenomena have been observed in psychological studies of animal behaviour, and in neurobiological investigations of neuromodulation and addiction.
4. The task of reinforcement learning is to use observed rewards to learn an optimal policy for the environment. An optimal policy is a policy that maximizes the expected total reward.

Que 1.17. Write short note on decision tree.

Answer
1. A decision tree is a flowchart-like structure in which each internal node represents a test on a feature, each leaf node represents a class label, and branches represent conjunctions of features that lead to those class labels.
2. The paths from root to leaf represent classification rules.
3. Fig. 1.17.1 illustrates the basic flow of a decision tree for decision making with labels Rain(Yes) and No Rain(No).

Fig. 1.17.1 : A decision tree on the outlook (Sunny, Overcast, Rain).

4. Decision tree is the predictive modelling approach used in statistics, data mining and machine learning.
5. Decision trees are constructed via an algorithmic approach that identifies the ways to split a data set based on different conditions.
6. Decision trees are a non-parametric supervised learning method used for both classification and regression tasks.
7. Classification trees are the tree models where the target variable can take a discrete set of values.
8. Regression trees are the decision trees where the target variable can take a continuous set of values.

Que 1.18. What are the steps used for making decision tree ?

Answer
Steps used for making a decision tree are :
1. Get the list of rows (dataset) which are taken into consideration for making the decision tree (recursively at each node).
2. Calculate the uncertainty of our dataset, or Gini impurity, i.e., how much our data is mixed up.
3. Generate the list of all questions which need to be asked at that node.
4. Partition the rows into True rows and False rows based on each question asked.
5. Calculate the information gain based on the Gini impurity and the partition of data from the previous step.
6. Update the highest information gain based on each question asked.
7. Update the best question based on information gain (higher information gain).
8. Divide the node on the best question.
9. Repeat again from step 1 until we get pure nodes (leaf nodes).
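Steps 2 and 5 above (Gini impurity and information gain) can be computed as follows. The label lists are invented for illustration; the formulas are the standard ones : Gini(S) = 1 − Σ pᵢ², and the gain is the parent's impurity minus the weighted impurity of the two partitions.

    from collections import Counter

    # Sketch of steps 2 and 5: Gini impurity of a set of labels, and the
    # information gain of a candidate partition. Labels are illustrative.
    def gini(labels):
        counts = Counter(labels)
        n = len(labels)
        return 1.0 - sum((c / n) ** 2 for c in counts.values())

    def info_gain(parent, true_rows, false_rows):
        # weighted impurity of the children, subtracted from the parent's
        p = len(true_rows) / len(parent)
        return gini(parent) - p * gini(true_rows) - (1 - p) * gini(false_rows)

    parent = ['yes', 'yes', 'no', 'no', 'yes']
    print(gini(parent))                                   # 0.48, a mixed node
    print(info_gain(parent, ['yes', 'yes', 'yes'], ['no', 'no']))  # 0.48, pure split

The question with the highest such gain is the one chosen in steps 6-8 to divide the node.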
Que 1.19. What are the advantages and disadvantages of decision tree method ?

Answer
Advantages of decision tree method are :
1. Decision trees are able to generate understandable rules.
2. Decision trees perform classification without requiring much computation.
3. Decision trees are able to handle both continuous and categorical variables.
4. Decision trees provide a clear indication of which fields are most important for prediction or classification.
Disadvantages of decision tree method are :
1. Decision trees are less appropriate for estimation tasks where the goal is to predict the value of a continuous attribute.
2. Decision trees are prone to errors in classification problems with many classes and a relatively small number of training examples.
3. Decision trees are computationally expensive to train. At each node, each candidate splitting field must be sorted before its best split can be found.
4. In decision tree algorithms, combinations of fields are used and a search must be made for optimal combining weights. Pruning algorithms can also be expensive since many candidate sub-trees must be formed and compared.

Que 1.20. Write short note on Bayesian belief network.

Answer
1. Bayesian belief networks specify joint conditional probability distributions.
2. They are also known as belief networks, Bayesian networks, or probabilistic networks.
3. A belief network allows class conditional independencies to be defined between subsets of variables.
4. It provides a graphical model of causal relationships, on which learning can be performed.
5. We can use a trained Bayesian network for classification.
6. There are two components that define a Bayesian belief network :
a. Directed acyclic graph :
   i. Each node in a directed acyclic graph represents a random variable.
   ii. These variables may be discrete or continuous valued.
   iii. These variables may correspond to the actual attributes given in the data.
   Directed acyclic graph representation :
   i. The following diagram shows a directed acyclic graph for six Boolean variables.

Fig. 1.20.1 : A directed acyclic graph (FamilyHistory and Smoker are parents of LungCancer and Emphysema; LungCancer leads to PositiveXRay and Dyspnea).

   ii. The arcs in the diagram allow representation of causal knowledge.
   iii. For example, lung cancer is influenced by a person's family history of lung cancer, as well as whether or not the person is a smoker.
   iv. It is worth noting that the variable PositiveXRay is independent of whether the patient has a family history of lung cancer or is a smoker, given that we know the patient has lung cancer.
b. Conditional probability table : The conditional probability table for the values of the variable LungCancer (LC), showing each possible combination of the values of its parent nodes, FamilyHistory (FH) and Smoker (S), is as follows :

     | FH, S | FH, ~S | ~FH, S | ~FH, ~S
LC   | 0.8   | 0.5    | 0.7    | 0.1
~LC  | 0.2   | 0.5    | 0.3    | 0.9
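The conditional probability table above can be queried directly once written as a lookup structure. A small sketch using the table's values :

    # The conditional probability table above, written as a Python dict.
    # Keys are (family_history, smoker); values are P(LungCancer = yes | FH, S),
    # taken from the table in the answer.
    cpt_lc = {
        (True,  True):  0.8,
        (True,  False): 0.5,
        (False, True):  0.7,
        (False, False): 0.1,
    }

    def p_lung_cancer(fh, smoker, lc=True):
        p_yes = cpt_lc[(fh, smoker)]
        return p_yes if lc else 1.0 - p_yes

    print(p_lung_cancer(fh=True, smoker=False))              # 0.5
    print(p_lung_cancer(fh=False, smoker=False, lc=False))   # 0.9

Each column sums to 1 over the values of LC, which is what makes the table a valid conditional distribution given its parents.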
Que 1.21. Write short note on support vector machine.

Answer
1. A Support Vector Machine (SVM) is a machine learning algorithm that analyzes data for classification and regression analysis.
2. SVM is a supervised learning method that looks at data and sorts it into one of two categories.
3. An SVM outputs a map of the sorted data with the margins between the two as far apart as possible.
4. Applications of SVM :
   i. Text and hypertext classification.
   ii. Image classification.
   iii. Recognizing handwritten characters.
   iv. Biological sciences, including protein classification.

Que 1.22. Explain genetic algorithm with flow chart.

Answer
Genetic algorithm (GA) :
1. The genetic algorithm is a method for solving both constrained and unconstrained optimization problems that is based on natural selection.
2. The genetic algorithm repeatedly modifies a population of individual solutions.
3. At each step, the genetic algorithm selects individuals at random from the current population to be parents and uses them to produce the children for the next generation.
4. Over successive generations, the population evolves toward an optimal solution.
Flow chart : The genetic algorithm uses three main types of rules at each step to create the next generation from the current population :
a. Selection rules : Selection rules select the individuals, called parents, that contribute to the population at the next generation.
b. Crossover rules : Crossover rules combine two parents to form children for the next generation.
c. Mutation rules : Mutation rules apply random changes to individual parents to form children.

Fig. 1.22.1 : Flow chart of the genetic algorithm (Initialization -> Initial population -> Selection -> New population -> Terminate ?).
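The selection / crossover / mutation loop above can be sketched compactly. Everything concrete here is an illustrative assumption — the all-ones bit-string fitness function, the population size, and the rates — but the three rules appear exactly as named in the answer.

    import random

    # Bare-bones sketch of the GA loop: evolve bit-strings toward all-ones
    # (a stand-in fitness function). Sizes and rates are illustrative.
    def fitness(ind):
        return sum(ind)                       # count of 1-bits

    pop = [[random.randint(0, 1) for _ in range(10)] for _ in range(20)]
    for generation in range(30):
        # selection rule: keep the fitter half as parents
        pop.sort(key=fitness, reverse=True)
        parents = pop[:10]
        children = []
        while len(children) < 10:
            a, b = random.sample(parents, 2)
            cut = random.randint(1, 9)        # crossover rule: splice two parents
            child = a[:cut] + b[cut:]
            if random.random() < 0.1:         # mutation rule: flip one random bit
                i = random.randrange(10)
                child[i] = 1 - child[i]
            children.append(child)
        pop = parents + children              # new population

    print(max(pop, key=fitness))              # best individual found

After a few dozen generations the population converges on (or very near) the all-ones string, illustrating point 4 above.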
Y= 0sbX, +X, +02, + OX Where Y¥=Toe variable we you are eying o predic (dependent variable © variabla hat we arising to pred ¥ independent variable) QueBT | Describe brie tinear regression aa DAL CTS) Linear regression is «supervised mackie laring algorithm where ‘he predicted outputs continuous ands constant lope Wa usd to predict values within a continuous range, or xara Sales, pice) rather than tring elseif them nto ctogerie or ‘ample et do Machine Lenring Techniques 23L (CST Sem) Following are th types oflinear regression Simple regression ‘Simple linear regression ts traditional lop intercept form to prod, securate prediction, = mc-+b rere, mand’ ae the variables, "creprogents our input data and y represents our prediton, 1b Muluivariable regression : Amulti-rarabe linear equation given below, where wrepeerentsthe ‘oecients, or weighte ifs a) = wget ay Hue The variables «, , 2 reprosont the attsibutes, or distinct piecos of ‘nformation Ut, we have about each observation ‘i, For sees predictions, these attributos might include a compenye avertsing spend on rao, TV, and newspapers. Selee= 1, Radio +0, TV +, Newspapers GEEET |] weptain logistics regression. Taower 1. _Logisticropresonis a sepervivedlearningclasifeaton lgpithm used topredict the prebabiityofatargt variable. 2 Thenaturcaftarget or dependent variahiois dichotomous, which means ‘here would be aly two poribe casecs. ‘8. The dependent vrlbleis binary in nsture having data cod as either 1 (stands for sucetv/ye) oO (Stands for failure) A. Alogisticrogresion model predicts P= 1 asa function ofX His one ‘ofthe simplest ML slgorthms thst ean be used fr various elasifiestion problems suchas spam detection, diabetes prediction, cancer detection QaeBA | What are the types of logistics regression ? ‘Anewer Logistis regression cane divided int following types: 1. Binary (Binomial) Regression : ‘In thisclasieation, a dependent variable willhave only two pase types either 1and0, 1 Forexample, these variables may represent succes or flure, y= foro, win arlass ee 2 Multinomial regression ‘In thin claefcaion, dependent variable can have three or mor® pasible uaordored types or the typos having no quantitative Fignifcane. b, Forevample then variables may represent “Type. x Type “Type PAL (CSMT-Sem-5) Regrosson & Hayesian Learning 3 Ondinal regression : f2 Inthis clasifcation,dopondeot variable can have three or more eile dared types or te types having quite sen bb Forexampl, those variables may represent "por" ¢ yonder) od, “Excellent” and eachestgy cn have the srestibe 1.2. ‘TaEEE. | Ditterentiate between linear regression and logistion regression. Taser [S.No] Linear regression 1. [linear ropression is supervised oprossion model ‘Logiaties regression Teste rogrossion ampere stnsfetion mode | In Logistic regression, we predict thevalveby Loe0, Ta Linear repression. we prediet t valet an ingore mumber, ‘| Nosctivation fretionie wed t ‘uatonte thelist rogreion ceuation. No threshold valuisnoeded “&_|Atreshold vais we, 5 | lisbased onthe least square "The dapendent variable consis only ro eta, Logistic regrnsin is used to Calculate the probability of an 8 | Einoar regression i wed to fostimate. the dependent arable acne of cnet Independent variable. 7 [Tanwar regres assumes the Aitribation othe dependent variable Togste regression assures the | binomial distribution of the Aepondent vail, Lt a Ba ro ont Lm ain ri ee es Cas apse Pt Oo re da eve Learsing Teeniqves Macive Werke | Esplain Bayesian learning. Explain two cat lasitiation. 
“ Tawer [Bayesian learning: ‘rn ami a ndamentl ais aprsch the peg 2 Ths apprsch s hase on quantiffing the tradeoff between tats chaseeston dese sing probability and coat that accom Secon mre ecu the decison rele colton the basis of term : Jon the basi probabil boc te assumed that all the relevant probit are haown 4 Fer th we define the state of nature af the th 2 of nature of the things presen in th partial pattern We denote the state of nature Teo category classification he the two cates of the patterns, 1 is assumed that thea a ad po, are known a6 ot kowwen, they can easily be estimated from the Tepe bie training patterns and N,N of tha ‘hen pin) = Ny and ploy! = YN. functions pla wei = 1, 28 ae mr phch desert the trite of he ea taal wil be denoted Br lnerte vale a6 which wo a 261 (CST Som-5) Regression Bayesian Learning {8 Now, the Bayes lasifeation ule canbe defined as Hep, |2)> pCa, 2) is cassified tow, b.Tplo,|x)Pbx| wnlo b. pel a,)p%o,)

pixloy) fe plel oy pngla) Ry ple 278) 1 union of the regions R,, R, covers all the space, we have siptarde [ploy Leyes (275) P= los) | (play|s)- peng hen ponds. 2.8) 11 Thus the probability af error is minimized if isthe region of space winds Then Fe, becomes region where the reverse it 2-8L(CSIT-Sem- Regression & Bayesian Learning 42, Inaclassifiation task with M clases, yn unknown pattern, represented by the feature vector xis assigned to class if ph, |2)> Hajlov Jet ‘Guede. | Consider the Bayesian el Aistributed classes, where fier for the uniformly fa + relay ay] Posey) = ‘muuition ata elbebl © muuttion sults for some values for @ and b ‘Typical easosare presentedin the Fig. 28.1. Pel) aa = ae a wig 281. QueBTT | Detine Bayes classifier. Explain how classification is done by using Bayes classifier. Anawer 1. ABayes classifier iea simple probabilistic classifier based on applying Bayes theorem (from Bayesian statistics) with strong (Naivel independence assimptions ‘Machine Learning Techniques 29L CT Sem 2 ANaive Baye clasier assumes that the presence (or ab prea ature of dass is unrelated to the presence fon fy other feature Depending on the precne nature of tho probity mode azine ined ey een in asuportaed oe eee 4 Inmany practia applications, parameter estination for mats ipe even pfantosn tong na can work with the Naive Bayes model without believing in Bs maa probability or using any Bayesian methods. oe “An aulvantage ofthe Nave Bayes classifiers that ire ammunt of taining data 1o estimate the parameters (aecas ‘arlances ofthe variables) necessary for cleetcation, 6 The perceptron bears a certain relationship to elas classifier known as the Bayes classifier, up a lamiel ptr enc bean ap ‘When the environment is Gaussian, the Bayes class Fhe th nv the Bayes classifier reduces toa In the Bayes lanier, or Bayes bypotheis texting proce minimize the average tisk, denoted by 8: For a te-daseprelem, ‘represented by classes Cy and Co, the average risk ie defined Be GA] PialCpterCaRfRearoae souk | Rete f RerOe where the various terms are defined as follows: , = Prior probability that the observation vector x ie drawn from subspace H, with =1,2,and P, +P, C, = Cost of deciding in favour of class C, represented by subypace H, sthen class Ci true, with fj 1,2 P, (iC,) Conditional probability density function of therandam vectarX Pig 26.2a)depict able digram representation ofthe Bayes eassiir ‘Tn important points in this Block dlagram are twofold ‘The data processing in designing the Bayes classifiers confined entirely tothe computation of the likelihood ratio nx by. Thiscomputation incompletely invariant to the valves assigned to the per probailities and involved inthe decision-making proces These quantities merely afect the values of the threshold x. From acomputational point of view, we find it more convenient to ‘work with logsrithan of the likelihood ratio rather than the Telibood ratio tee 01 (CstTson 8) inet tipeiaiandier vector . | Likelihood Asin x to class 6) apne, | he [|i = MEE es ean o #@ aiess ion, a Mpa eS | ratio FE“ comparator be 08 88> lag ae iets ane o mt ‘Fig. 29.1, Two equivalent implementations of the Bayes dassifir : Neh Likeond rat text (5) Lagabod ratio et ‘Gur Hoy] Discuss Bayes eassifier using some example in detail, awe ‘Bayes canter: Rafer Q.28, Page 2, Unit 2 Forexample: 1. Let Dhe etesning sto taresand the nscale. 
ach feature is reprosented by an timensional atribste vector ater sy ct) doping measurements made on the esture rom ‘atte cetpoctve fy Ayn 4 ‘Suppace that there ae clases, C, Cyr Cy Gama feta the Shsaier wil pede tha tong to the Sass vig he hhet foster probit, ndnned oo. That laser predetsthatX longs ln fan aly HC R0> pl ND for 157m fet ‘Toon we maximize AC, Tela Cfo wich, 0 emai read the mas posterior hypothe By Bape hearer, ic) qs = BEIGE es, only POC| C) PIC) nowd to be ‘hatimed. IF tho elas prior proebitis ae not known thes 38 Commonly assweed that the classes are equally Uikely 1 HC =p = aC, and therefore peX|C) rained Ober Pax|6)G) is maximized. ion data sets with many ateributes ville extromely expensive 1% To reduce computation in evsating 2X16) acs conditional independence is made 8. As pO) is constant for al clas the comptation fPXIC) tho asumption of Me retngues canny ig Lat sat ameeey hx preume hat the vas ofthe atts women Tsar one another, vento cla abel ote a? ts, piX1C) Tlpealca Ploy Cy) *P UXq| Cale... x PGE, 1C,) ses Th pects [CPs Py Gon IG) are easy ting eran Ol ry ig sete fete codbed hehe the senken 4 tna vad An tempat pC) we om Tras isoprene mur of ayo ae eae thease didely [ono sae dase ib 1. haeotoerssaned igen ised ott etemdtshareu Gisasandtbon wilt ‘liter eviaden seeBn0dby, EB ef] cae oO Tas otha 1) tg ‘i Theres eneetocmpute the mean and the standard devin tthe ale af tribute yf tring a of clas C, Te Sales reused testinte 16) vit For example, let X = (35, Rs. 40,000) where A, and A, are the sets age and ince respctv Lat he lat abel Sebarecompater ‘Vid. The associated clase label for X’ wya-oorey = as abe fi Xe yd, bus compet Latssarpe hat apa ate eran na three it stecotimoonvsieedatribute Suppose thet from tho training set, we find that customer in. whe Acer are 012 urs, ntr words HE ee and thi clase, we have ye5B and = 12 Inari t pret Prodi the class abel of XC) pICis eval lass; The cant edt predict hat the cas label of is POX|CD PC)» ‘The predicted reat ca IC pC for 1s jm. 04, ee abe the ean, for which X16) MC) # 2-121 (CSITSom5) ah a Regression & Bayesian Learning FHT] Let blue, green, and red be three clamses of objects with peor probabicon given by Phi) = 4 green) «1, eed) = Vt erative three types of objects pencils, pens, ane paper Lat the irre J er gual pbs ot tese objects given follow Use Bersmcen) 8 Pepenireen) an rune 6 Bees Ptpenircen= 73 Prey Pigenbivey= Wo Pepaperfb) = 13 Pee us, Plpeneed) iS) Papaperred)= 1 Tver | AeperBayes rule: igen ree) aren) Mereentenci = Epecil area) Rr) + PpendiZ Be) Pitoe + Pipencl ed) Pied) 1a 1 ite a" 2*a"a*e a Pipenciv be) Fae) ‘Pibtueipenell = “Pipencl/ green) P(green) + Plpencil bi Pius) + Pipa ed Pee) ay 244 wos oat = 0.5050 ed) Pleed) Pheaa rod Prod) + Pipencii/Bive) Pius) + pon grees) green) at ad 6X4 wo, = “tan ~o5 “8 ‘Since, Pgreen/penll has the highest value therefore pencil belongs to lass green. Pipeu 1) Pigrees) Pigroon/pen) = pircay green) Pigreen) + Ppen/ blue) Piblue) + pen red) Pred) aca Kerrie Boe PISL(CSAT Sema y 14 L(CSIT-SEmD) Regression & Bayesian Learning Pipe/ boo) Se pala ret + Roa 7 Neer a | MBN sare dais ment emmen gen ivr edad we Nae eeTcae AS sence Cth ot wh to predic sha pba © ae oe arb estislowee aon Pe ttt Tell are een ume he etre ‘Pipen/ green) P(green) + Pipen’ blue) 7 ri Pu) ge ol Pod) . 4,4 2 ica a og75 ~ o376 ~ 0%? 
§ os Sine Mephesto, pn gen aol Bon igreenpaper) = -——_Pibaper/ green) Pigresn)___ 7 psp oe an papa) fos secanaee Pius Pge Pon Bos raters Prbtutpapee = Since, Predipaper asthe highest 409 Since. highest valu therefore, paper belongs to BEER] expinin Naive Bayes class paper! be) Pie) green) Pigreen) + Pipapee/ Blac) ue) + paper! red) Pir) ___ sper re Pei) Fipaper/gren) Pron’ « pope Has) Pibtae)« Pipapen’ ed Pied) “ym oo ig. 2.231. TheJaring carve for Nsw Bayes learn, 4, Atcoming Boolean variables the parameters are ‘O= AC = tre, 0, = PUX,= true} = true, og" PAX =trve | C= Fale) 15. Naive Bayes models can be viewed as Bayesian networks in which each [Xchas Ca he ole part and Clas no parents 5. ANaive Bayer model with gaussian POX, |C) inequivalent to amixture ‘fgauesiane with diagonal eovariance matrices, 1. While isture ofgaussans are used or dnsty estimation in continous tomains, Naive Beyos models wed in dacrte and mixed domains 8. Naive Bayes modes allow fr very ofcient inference of marginal and conditional distributions 8. Naive Bayes learning has no dificlty with noiy data and can give ore appopriata probable predtions, GaeRAT, | Consider a tworsiats (Tasty or non-Tasty) problem with the following training data, Use Naive Bayes classifier to classify the pattorn = "Cook = Asa, Health-Status = Bad, Culsine = Continental”. Machine Learning Techniques 301 (00en gression & Bayesinn Learning Cook | Health Status | Cuisine ‘asa Bed | tadian 2.6 aoa ‘a Good Continental eo 0N Sa = i Linstead tno = 0x 24344 20 Sia (Good | Indian rarer Usha Bad Indian - “Therefore the pritonis tasty, Usha Bd Continental { SE] Pxptain EM algorithm with steps Sie Bed “Continental | No lhe _ Si Good Continental Yoo Anawer Usha Good Tian Yea] 1 The Egcatin eintion seri ent — 1 Bat ream ietond estimate for nde parameters when the Ushe (oe Continental No Eetaincompleteor has misang dt posto some variate 2 EMchogees random value for the ising data pints ad eatin aaa prow act of daa 5, These new vals are then recursively used to estimate a eltr ret ye | wo Yes |e Lew Tne] ‘+ These are the two asc ep othe FM agri mn} 2 fo fae [2 [a [indin [a [a] * Mtmeeniey’ seta [s]ow [et Jemamal spo] * Rabntybanlecere maa a — it |i ‘Thén for those given parameter values, estimate the value of vente £18 |i. Tnitjslize the mean 1, eeceea yal as Heat ea ‘ho mixing velit sitet ‘by random values, (or other values) re [ve Fis[oe] Comparten lesa ‘Asha [26 [0 | Bed | 2/6 [8/8 [indian [ale | 1/4 ii Again estimate all the parameters using the current values Sean far| Gm as female poe]. Gonpali etin See EE 2 'v. Put some convergence criterion. Ey 1 PeEeimiaramecmensiomne ie F (orifall the parameters converge to some values) then stoP, wo] a0 Machine Leaening Techniques QaeHAE | Describe the usage, advantages and disadvantages o EM algorithm. “Anewer Usage of EM algorithm 1. Tteanbeuced te il the missing data ina sample. 2 Itcanbo used asthe bass of unsupervised learning of clusters 3. can be used forthe purpose of ectimating the parameters of Hidden ‘Markov Model (MD. 4 Teean be used for discovering the value of latent variables. Advantages of EM algorithm are: 2. Teisalwaye guaranteed that likelihood willinrease with ench iterate, 2 TheE-step and Motep are often pretty eagy fr many problems in terms of implementation. 3, Solutions tothe M-steps often exit in the clased form, Disadvantages of EM algorithm are: 1. Tehae slow convergence 2 Temakes convergence to the acl optima only. 3. 
It requires both the probablitics, forward and backward (numerical ‘optimization requires only forward probability), ‘QueAG.| Write a short note on Bayestan network. or Explain Bayesian networkby takingan example. How isthe Bayesian network powerful representation for uncertainty knowledge ? aa ‘A Hayesian network is directed acyclic graph in which esch node i ‘notated with quantitative probability information, The specifcationis as fellows: Ast of random variables makes up the nodes ofthe network Variables may be discrete or continuous A mt of directed links or arrows connect pirsof nodes, there is om from to node i eaid Wo be parent oly Each node o, han & conditional probability distribution ‘Ps |parents tba quantifies the effect of parvatson the sade The graph hax na directed eyees and ene ina directed cycle DAG 217 L CSITSem, 'SIT-Sem-5) CSHT-Sems) 2-181 (Cs Regression & Bayesian Learning ‘A Bayesian network provides complote description of the domain, ‘very entry i the full ont probability dstihtion canbe called ftom the information in the network 4 Bayesian networks provide a concise way to xepresent conditional {ndapondence relationships nthe domain [A Bayesian network soften exponentially smaller than tefl oint ‘strut, For example 1. Suppose we wantto determin the posbiliy of ras geting wet or dey digo tothe oearrence of diffrent seasons. 2. ‘Tho weather has three states: Sunny, Cloudy, an Reiny. There are two poesblities forthe grass: Weto Dey. ‘8. Thesprinkler canbe on or off. Iie rainy, the grat gets wet but ifitis funy, we ean rake grass wet by pouring Water from a sprinkler. “4. Suppose that the gras i wet. This ral be contributed by one ofthe {wo Toscana: Fry itis reining Secondly, the sprinklers are turned 5. Using the Baye's rule, we can deduce the most contributing factor towards the wot grass a Conon Sprinkder Baia Wet go “Fig 2463. Bayesian network possesses the following merits in uncertainty knowledge representation. 1L- Bayesian network can conveniently handle incomplete data. 2 Bayesian network can learn the essual relation of variales. In dat nalyes, astal relation ishelpfl fr field knowledge understanding, t ‘analeo easily lead to precise prediction even under much interference 3. ‘The combination of bayesian network and bayesian statistics can take full advantage of fold knowledge and information from data, 4. ‘Thecombination of bayesian network and other models can effectively void averting problem.
