The document summarizes a machine learning mini project that applies random forest, logistic regression, SVM, and neural network algorithms to classify a breast cancer dataset. It reports the confusion matrix, accuracy, precision, and recall of each algorithm and plots an accuracy comparison across them. It then describes the logistic regression, SVM, random forest, and neural network algorithms in more detail.
VALLETI LOKESWARA REDDY - 20BCI7130
T.BHAVITEJA REDDY - 20BCI7053

• Apply random forest, logistic regression, SVM, and a neural network to the breast cancer classification dataset; find the confusion matrix, accuracy, precision, and recall for each; and draw a plot comparing the accuracy of all the algorithms.

LOGISTIC REGRESSION
• Logistic regression is the appropriate regression analysis to conduct when the dependent variable is binary. Like other types of regression, it is a predictive technique. It is used to evaluate the relationship between one binary dependent variable and one or more independent variables. It outputs a probability between 0 and 1, which is then thresholded (commonly at 0.5) to give a discrete class.
• A simple example of logistic regression: do calorie intake, weather, and age have any influence on the risk of having a heart attack? The question has a discrete answer, either "yes" or "no".

SUPPORT VECTOR MACHINE ALGORITHM
• Support vector machine, or SVM, is one of the most popular supervised learning algorithms. It can be used for both classification and regression problems, but in machine learning it is used primarily for classification.
• The goal of the SVM algorithm is to find the best line or decision boundary that separates n-dimensional space into classes, so that a new data point can easily be put in the correct category in the future. This best decision boundary is called a hyperplane.
• SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are called support vectors, and hence the algorithm is termed a support vector machine.
• [Diagram: two different categories separated by a decision boundary or hyperplane]

RANDOM FOREST ALGORITHM
• Random forest is a popular machine learning algorithm that belongs to the supervised learning technique. It can be used for both classification and regression problems in ML. It is based on the concept of ensemble learning, which is the process of combining multiple classifiers to solve a complex problem and improve the performance of the model.

WHY USE RANDOM FOREST?
• Below are some points that explain why we should use the random forest algorithm:
• It often takes less training time than other algorithms.
• It predicts output with high accuracy, and it runs efficiently even on large datasets.
• It can also maintain accuracy when a large proportion of data is missing.

NEURAL NETWORKS
• Neural networks are artificial systems that were inspired by biological neural networks. These systems learn to perform tasks by being exposed to various datasets and examples, without any task-specific rules. The idea is that the system derives identifying characteristics from the data it is given, without being programmed with a prior understanding of those datasets.
• Neural networks are based on computational models for threshold logic. Threshold logic is a combination of algorithms and mathematics. Work on neural networks focuses either on the study of the brain or on the application of neural networks to artificial intelligence. The work has led to improvements in finite automata theory.

https://colab.research.google.com/drive/1407-EG72akVVryGeKo11Lx9cyhJ4I_vJ#scrollTo=n3sIIfaFoT8I
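
The task stated on the first slide (train the four algorithms, then report the confusion matrix, accuracy, precision, and recall for each) could be sketched roughly as follows. This is a minimal illustration, not the authors' notebook: it assumes scikit-learn is installed and uses its bundled Wisconsin breast cancer data as a stand-in for the project's dataset. The train/test split, the feature scaling, and every hyperparameter shown are illustrative choices that do not come from the slides.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: scikit-learn's bundled dataset stands in for the project's data.
# In this dataset, label 1 means benign and label 0 means malignant.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hyperparameters are illustrative. Scaling is added for the models that are
# sensitive to feature magnitude; random forest does not need it.
models = {
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "Neural network": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
    ),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        # Rows are true classes, columns are predicted classes.
        "confusion_matrix": confusion_matrix(y_test, y_pred),
        "accuracy": accuracy_score(y_test, y_pred),
        # Precision and recall are computed for label 1 (benign) by default.
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
    }

for name, scores in results.items():
    print(f"{name}: accuracy={scores['accuracy']:.3f} "
          f"precision={scores['precision']:.3f} recall={scores['recall']:.3f}")
    print(scores["confusion_matrix"])
```

The accuracy values collected in `results` are what an accuracy comparison plot would draw on, for example as one bar per algorithm. Passing `pos_label=0` to `precision_score` and `recall_score` would report the metrics for the malignant class instead.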