1SJ18CS030 Devireddy Sreelatha

VISVESVARAYA TECHNOLOGICAL UNIVERSITY
"Jnana Sangama", Belagavi-590 018, Karnataka, India

An Internship Report On
“Loan Sanction Amount Prediction”

Submitted in partial fulfillment of the requirement for the award of the degree of
BACHELOR OF ENGINEERING IN COMPUTER SCIENCE AND ENGINEERING

Submitted By
Devireddy Sreelatha
1SJ18CS030

Carried out at
QUANT MASTERS TECHNOLOGIES PRIVATE LTD, BANGALORE

Under the guidance of
Internal Guide: Prof. Ajay N, Assistant Professor, Dept. of CSE, SJCIT
External Guide: Shashank T, Technical Trainer, Quant Masters Technologies Private Ltd.

S J C INSTITUTE OF TECHNOLOGY
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
CHIKKABALLAPUR-562101
2021-2022

|| Jai Sri Gurudev ||
Sri Adichunchanagiri Shikshana Trust ®
S J C INSTITUTE OF TECHNOLOGY, Chickballapur - 562101
Department of Computer Science and Engineering

CERTIFICATE
This is to certify that the Internship work entitled “Loan Sanction Amount Prediction” was carried out by DEVIREDDY SREELATHA, bearing USN: 1SJ18CS030, a bonafide student of Sri Jagadguru Chandrashekaranatha Institute of Technology, in partial fulfilment for the award of Bachelor of Engineering in Computer Science and Engineering of Visvesvaraya Technological University, Belgaum, during the year 2021-22. It is certified that all corrections / suggestions indicated for internal assessment have been incorporated in the report deposited in the departmental library. The Internship report has been approved as it satisfies the academic requirements in respect of Internship work prescribed for the said Degree.

Signature of Guide: Prof. Ajay N, Assistant Professor, Dept. of CSE, SJCIT
Signature of HOD: Dr. Manjunatha Kumar B H, Professor & HOD, Dept. of CSE, SJCIT
Signature of Principal: Dr. G T Raju, Principal, SJCIT, Chickballapur

External Examiners: Name of the Examiners, Signature with Date

COMPANY CERTIFICATE
Quant Masters Certificate of Completion
Devireddy Sreelatha is hereby awarded this certificate for the successful completion of the Artificial Intelligence and Machine Learning internship conducted by Quant Masters Technologies Pvt. Ltd. from 01/09/2021 to 28/09/2021.

DECLARATION
I, DEVIREDDY SREELATHA, student of VIII semester B.E in Computer Science & Engineering at S J C Institute of Technology, Chickballapur, hereby declare that the Internship work entitled “Loan Sanction Amount Prediction” has been independently carried out by me under the supervision of Prof. Ajay N, Assistant Professor, Department of CSE, SJCIT, and the coordinator Prof. Swetha T, Assistant Professor, Department of CSE, SJCIT, and is submitted in partial fulfillment of the course requirement for the award of degree in Bachelor of Engineering in Computer Science & Engineering of Visvesvaraya Technological University, Belagavi, during the year 2021-2022. I further declare that the report has not been submitted to any other University for the award of any other degree.

PLACE: Chickaballapur
Date:
DEVIREDDY SREELATHA
USN: 1SJ18CS030

ABSTRACT
Loan sanction amount prediction is a very important process in banking organizations. A considerable number of distinct attributes are examined for reliable and accurate prediction. The respective performances of different algorithms are compared to find the one that best suits the available data set. The final prediction model is evaluated using test data and the R2 score obtained.
Given a dataset containing numerous rows and columns of specific details of many customers, such as location, loan amount request, credit score, number of defaults, property location and property price, we utilize Machine Learning algorithms to predict the loan sanction amount, determine the accuracy of the predictive models, and plot appropriate graphs to demonstrate the possible relationships between different parameters. In this project, we use three machine learning algorithms to predict the loan sanction amount:
1. Linear Regression Model
2. Random Forest Regression Model
3. Decision Tree Regression Model
Using these three algorithms, we find the algorithm best suited for prediction of the loan sanction amount.

ACKNOWLEDGEMENT
With reverential pranam, I express my sincere gratitude and salutations to the feet of his holiness Byravaikya Padmabhushana Sri Sri Sri Dr. Balagangadharanatha Maha Swamiji, his holiness Jagadguru Sri Sri Sri Dr. Nirmalanandanatha Swamiji and Paramapoojya Sri Sri Mangalanatha Swamiji of Sri Adichunchanagiri Mutt for their unlimited blessings.
First and foremost, I wish to express my deep and sincere gratitude to our institution, Sri Jagadguru Chandrashekaranatha Swamiji Institute of Technology, for providing me an opportunity to complete my internship work successfully.
I extend a deep sense of sincere gratitude to Dr. G T Raju, Principal, S J C Institute of Technology, Chickballapur, for providing an opportunity to complete the Internship Work.
I extend special, heartfelt and sincere gratitude to our HOD Dr. Manjunatha Kumar B H, Professor and Head of the Department, Computer Science and Engineering, S J C Institute of Technology, Chickballapur, for his constant support and valuable guidance of the Internship Work.
I convey my sincere thanks to Internship Internal Guide Prof. Ajay N, Assistant Professor, Department of Computer Science and Engineering, S J C Institute of Technology, for his constant support, valuable guidance and suggestions for the Internship Work.
I am thankful to Internship External Guide Shashank T, Technical Trainer, Quant Masters Technologies Private Ltd, Bengaluru, for providing valuable guidance and encouragement for the Internship Work.
I also feel immense pleasure in expressing deep and profound gratitude to our Internship Coordinator Prof. Swetha T, Assistant Professor, Department of Computer Science and Engineering, S J C Institute of Technology, for her guidance and suggestions for the Internship Work.
Finally, I would like to thank all faculty members of the Department of Computer Science and Engineering, S J C Institute of Technology, Chickballapur, for their support. I also thank all those who extended their support and co-operation while bringing out this Internship Report.
Devireddy Sreelatha (1SJ18CS030)

CONTENTS
Declaration
Abstract
Acknowledgement
Contents
List of Figures
1 COMPANY PROFILE
1.1 History of the Organization
1.1.1 Objectives
1.1.2 Operations of the Organization
1.2 Major Milestones
1.3 Structure of the Organization
1.4 Services Offered
2 ABOUT THE DEPARTMENT
2.1 Specific Functionalities of the Department
2.2 Process Adopted
2.3 Testing
2.4 Structure of the Department
2.5 Roles and Responsibilities of the Department
3 TASK PERFORMED
3.1 Overview
3.2 Problem Statement
3.3 Technologies Used
4 REFLECTION NOTES
4.1 Experience
4.2 Technical Outcomes
4.3 System Analysis and Design
4.3.1 Existing System
4.3.2 Disadvantages of the Existing System
4.3.3 Proposed System
4.3.4 Advantages of the Proposed System
4.4 System Architecture
4.4.1 System Architecture
4.4.2 Use case Diagram
4.4.3 Class Diagram
4.5 Implementation
4.5.1 Modules
4.6 Screen Shots
5 CONCLUSION

LIST OF FIGURES
Figure 2.1 Process Adopted in SDLC
Figure 2.2 Department Structure
Figure 4.4.1 System Architecture
Figure 4.4.2 Use case Diagram for Loan Sanction Amount Prediction
Figure 4.4.3 Class Diagram for Loan Sanction Amount Prediction
Figure 4.5.1 Importing Python Libraries
Figure 4.5.2 Training and Testing Dataset
Figure 4.5.3 Linear Regression Algorithm
Figure 4.5.4 Decision Tree Algorithm
Figure 4.5.5 Random Forest Algorithm

CHAPTER - 1
COMPANY PROFILE
1.1 History of the Organization
Himanshu Sharma, a native of Bengaluru, founded the company “Quant Masters Training Services” in 2019 with just 2 employees. The services offered by the company were aimed at providing offline placement training to undergraduates, ranging from Quantitative, Logical and Verbal training to HR interview preparation. In the early 2020s, the shift from offline to online training took place due to the pandemic, with the first batch consisting of just 30 students.
In 2021, Quant Masters not only achieved a place among MSMEs and became “Quant Masters Technologies Private Limited” but also trained over 1000 students using the online platform, with highly qualified educators and mentors guiding them throughout. When it started, Quant Masters reached only the students of the Bengaluru region. With its dedicated training and quality services, over 700 students now enroll in our batches every month from all over the country. The training helps them in getting placed, and many students have likewise brought laurels to Quant Masters.
A brief profile of the founder:
1. Himanshu Sharma
i. Founder & Director, Quant Masters
ii. Cleared CDS, AFCAT, RBI GRADE B, CAFs, IB, AMCAT (99.99%)
iii. Recommended as Pilot in the Indian Air Force
iv. Oracle Certified Java Programmer - OCJP (95%)
v. Former Software Developer - Grade 4 @NTT DATA
1.1.1 Objectives
i. The essential objective of QUANT MASTERS is to improve the quality of training and enhance the learning process.
ii. Most importantly, to create engaging and effective learning experiences and provide a variety of technological information and ideas that encourage curiosity, stimulate self-confidence through knowledge and develop practical skills.
Loan Sanction Amount Prediction Company Profile
1.1.2 Operation of the Organization
i. The organization trains students in various new technologies and helps them get ready for the IT industry.
ii. The organization acts as a bridge between college life and the IT industry.
iii. It offers various services to students and to individuals who want to upgrade their careers with new skills.
1.2 Major Milestones
i. Within 1 year, more than 2000 students of QUANT MASTERS have been placed in Service, Product and Technology based companies
ii. 1000+ students placed in TATA CONSULTANCY SERVICES
iii. 83% placed in ACCENTURE
iv. 82% placed in INFOSYS
v. 76% placed in CAPGEMINI
vi. 78% placed in LTI
1.3 Structure of the Organization
Himanshu Sharma: CEO & Founder
Teams: HR Operations Team, Personality Development Training, Technical Training Team
Shashank T: Technical Lead
Deepshika Raina: HR Operations Head
Harshitha Aliveli: Aptitude and Logical Trainer
Anudeep M P: Aptitude and Logical Trainer
Dinesh Gosai: Soft skills Trainer
Ritu Dhudoria: Verbal Ability Trainer
Figure 1.1 Structure of the organization
Dept. of CSE, SJCIT 2021-22
On-going projects: We start a new placement training batch every 1.5 months. Currently we are working towards giving quality training cum internships to the students and giving them the practical implications of the related projects. The training provided by us is also helpful to students from different branches (Engineering, Humanities, Commerce, Arts, Management, etc.) who are preparing for competitive exams. We will soon be launching services related to various new technological advancements and certification courses.
1.4 Services Offered
i. Quantitative Aptitude: Quantitative aptitude is an inseparable and integral part of aptitude exams. It tests quantitative skills along with logical and analytical skills.
ii. Technical Training: Technical training refers to specific vocational training, meaning the hard skills that employees need to perform their daily job tasks and that managers can clearly measure in terms of proficiency.
iii. Verbal Aptitude: Verbal aptitude is the ability to use the written language and to understand concepts presented through words.
iv. Logical Training: A logical reasoning test measures your ability or aptitude to reason logically. Generally, logical reasoning tests measure non-verbal abilities.
v. Soft skills / Communication Skills: Soft skills include attributes and personality traits that help employees interact with others and succeed in the workplace.
Soft skills include the ability to communicate with prospective clients and mentor your coworkers.
vi. Resume Building: A resume is the gateway for entering any organization. A good resume opens many gates for a fresher starting his/her career. Hence resume building is one of the major tasks for a fresher.
vii. LinkedIn Networking: LinkedIn is the fastest growing social media platform, bringing employers and employees to one place. We help you build a strong network on LinkedIn by providing tips and suggestions on how to improve your connection rate.
viii. AI and ML Internship: We offer training cum internship in the domain of Artificial Intelligence and Machine Learning to students who are interested in the domain of AI & ML. In this program we make them work on real-life projects.
ix. Group Discussion Preparation: As Group Discussion plays a major role when you enter any corporate industry, students need to train themselves on how to behave in a Group Discussion and what to do to get through tough Group Discussions. We help them by providing mock interviews.

CHAPTER - 2
ABOUT THE DEPARTMENT
2.1 Specific Functionalities of the Department
The department has around 18 members who specialize in a variety of fields including IoT, skill development, ML, AI, project consultancy and hardware design. I worked in the Machine Learning domain, which is the scientific study of algorithms and statistical models that computer systems use to perform a specific task without using explicit instructions, relying on patterns and inference instead. It is seen as a subset of artificial intelligence.
2.2 Process Adopted
SDLC is a process followed for a software project within a software organization. It consists of a detailed plan describing how to develop, maintain, replace and alter or enhance specific software.
The life cycle defines a methodology for improving the quality of software and the overall development process. An SDLC process has the following steps:
+ Designing
+ Building
+ Testing
+ Deployment
Figure 2.1: Process Adopted: SDLC
2.3 Testing
The various testing techniques used by the department can be summarized as follows:
1. Functionality Testing of a Website: This is a process that includes several testing parameters such as user interface, APIs, database testing, security testing, client and server testing and basic website functionalities. Functional testing is very convenient and it allows users to perform both manual and automated testing. It is performed to test the functionalities of each feature on the website.
2. Usability Testing: This type of testing includes testing the site navigation and contents of the website.
3. Interface Testing: The three areas to be tested here are the Application, Web and Database Servers.
4. Database Testing: The database is one critical component of your web application, and stress must be laid on testing it thoroughly. Testing activities will include:
+ Test if any errors are shown while executing queries.
+ Check that data integrity is maintained while creating, updating or deleting data in the database.
+ Check the response time of queries and fine-tune them if necessary.
+ Test that data retrieved from your database is shown accurately in your web application.
5. Compatibility Testing: Compatibility tests ensure that your web application displays correctly across different devices. This would include the Browser Compatibility Test: the same website will display differently in different browsers. You need to test whether your web application is displayed correctly across browsers and whether JavaScript, AJAX and authentication are working fine.
6. Pipeline Testing: After compatibility testing, it is time to test all the micro-services in the pipeline together to check their compatibility and message passing.
Thus all the services/functionalities are kept in the pipeline and tested together. Afterwards the whole pipeline is pushed to the deployment server.
2.4 Structure of the Department
The structure of the department is depicted in the following figure:
Figure 2.2: Department structure
Any organization will have a specific structure to function as a whole. The hierarchy of an organization is as shown above. There are multiple levels in an organization hierarchy, starting from high level to low level.
i. In the Computer Science Department of any organization, the project manager is the top-level person responsible for delivering projects on time.
ii. Tech Leads work at the level next to project managers; they provide technical assistance to the peers below them and check the work at regular intervals.
iii. Senior Developers and Junior Developers work at the next two consecutive levels.
2.5 Roles and Responsibilities of Individuals
The different roles and responsibilities of individuals are:
1. Project Manager: Project Managers play the lead role in planning, executing, monitoring, controlling and closing projects. They are expected to deliver a project on time, within budget and to the brief, while keeping everyone informed and happy.
2. Tech Leads: A Technical Lead, as the name states, is solely responsible for leading a development team, which is not an easy task. The Technical Lead is the one who creates a technical vision and turns it into reality with the help of the team.
3. HR Manager: The Human Resource Manager leads and directs the routine functions of the Human Resources (HR) department, including hiring and interviewing staff, administering pay, benefits and leave, and enforcing company policies and practices.
4. Senior Developer: Develops software solutions by studying information needs, conferring with users, studying systems flow, data usage and work processes, investigating problem areas, and following the software development lifecycle. A senior developer may manage a team of developers and is expected to encourage creativity and efficiency throughout complex digital projects. Due to the pressurised nature of the role, a robust and organised approach to the work is needed to produce the best solutions.
5. Junior Developer: Junior Software Developers are entry-level software developers who assist the development team with all aspects of software design and coding. Their primary role is to learn the codebase, attend design meetings, write basic code, fix bugs, and assist the Development Manager in all design-related tasks.

CHAPTER - 3
TASK PERFORMED
3.1 Overview
Deep learning is a sub-field of machine learning which works on the concepts of artificial neural networks to perform specific tasks. Artificial neural networks draw inspiration from the human brain. They are called artificial neural networks because they can complete precise tasks with a desirable accuracy without being explicitly programmed with any specific rules.
To complete a deep learning project successfully we need a well-defined plan, as any deep learning project requires many iterations before the final model is settled.
Steps in a deep learning project:
1. Defining your architecture
2. Compiling your model
3. Fitting the model
4. Evaluating and making predictions
5. Deploying the model
In the internship, to understand the implementation of machine learning algorithms, I was made to work on a project titled “Loan Sanction Amount Prediction”. The project is divided into 3 modules:
1. Defining the architecture
2. Fitting the model
3. Deploying the model
Methodology:
i. Data Collection: The dataset collected for predicting loan default customers is divided into a training set and a testing set. Generally an 80:20 ratio is applied to split the training set and the testing set. The data model created using Decision tree is fitted on the training set and, based on the accuracy of the test result, the test set prediction is done.
ii. Preprocessing: The collected data may contain missing values that may lead to inconsistency. To gain better results, the data needs to be preprocessed so as to improve the effectiveness of the algorithm. We should remove the outliers and we need to convert the variables. To overcome these issues we use the map function.
iii. Train model on Training Dataset: Now we should train the model on the training dataset and make predictions for the test dataset. We can divide our train dataset into two parts: train and validation. We can train the model on the training part and use it to make predictions for the validation part. In this way, we can validate our predictions, as we have the true values for the validation part (which we do not have for the test dataset).
iv. Correlating attributes: Based on the correlation among attributes, it was observed which applicants are more likely to pay back their loans. The attributes that are individually significant can include property area, education, loan amount and, lastly, credit history, which by intuition is considered important.
The correlation among attributes can be identified using a correlation plot and a boxplot in Python.
3.2 Problem Statement
The company wants to automate the loan eligibility process (in real time) based on the customer details provided while filling in the online application form. These details are Gender, Marital Status, Education, Number of Dependents, Income, Loan Amount, Credit History and others. To automate this process, they have posed the problem of identifying the customer segments that are eligible for a loan amount, so that they can specifically target these customers. Here they have provided a partial data set.
3.3 Technologies Used
a. Google Colab: Colaboratory, or “Colab” for short, is a product from Google Research. Colab allows anybody to write and execute arbitrary Python code through the browser, and is especially well suited to machine learning, data analysis and education. More technically, Colab is a hosted Jupyter notebook service that requires no setup to use, while providing access free of charge to computing resources including GPUs.
b. Python Programming Language: Python is a high-level, interpreted, general-purpose programming language. Its design philosophy emphasizes code readability with the use of significant indentation. Python is dynamically typed and garbage-collected. It supports multiple programming paradigms, including structured, object-oriented and functional programming. It is often described as a “batteries included” language due to its comprehensive standard library.
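
As a small illustration of how these Python tools apply to the methodology of Section 3.1, the following is a minimal sketch and not the code used in the internship. It assumes pandas, NumPy and scikit-learn are available (as they are by default in Colab); the toy data and the column names credit_score, loan_amount_request, property_location and loan_sanction_amount are hypothetical stand-ins for the real dataset. It fills missing values, converts a categorical variable with map, applies the 80:20 split, holds out a validation part of the training data, and computes the correlation among attributes.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data; the column names are illustrative assumptions,
# not the columns of the internship dataset.
rng = np.random.default_rng(42)
n = 100
df = pd.DataFrame({
    "credit_score": rng.integers(550, 850, size=n).astype(float),
    "loan_amount_request": rng.uniform(10_000, 200_000, size=n),
    "property_location": rng.choice(["Rural", "Semi-Urban", "Urban"], size=n),
})
df["loan_sanction_amount"] = (
    0.7 * df["loan_amount_request"] + 50.0 * (df["credit_score"] - 550)
)

# Blank out a few values to imitate missing data in a collected dataset.
df.loc[df.index[:5], "credit_score"] = np.nan

# Preprocessing: fill missing numeric values with the column median and
# convert the categorical variable to integers with map.
df["credit_score"] = df["credit_score"].fillna(df["credit_score"].median())
df["property_location"] = df["property_location"].map(
    {"Rural": 0, "Semi-Urban": 1, "Urban": 2}
)

# 80:20 split into training and testing sets.
train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)

# Split the training set again into a fitting part and a validation part,
# so predictions can be checked against known target values.
fit_df, val_df = train_test_split(train_df, test_size=0.25, random_state=42)

# Correlation of every attribute with the target, computed on training data.
corr = train_df.corr()
print(corr["loan_sanction_amount"].sort_values(ascending=False))
```

The correlation matrix corr could then be passed to a plotting function such as a heatmap to produce the correlation plot mentioned in the methodology.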

CHAPTER - 4
REFLECTION NOTES
4.1 Experience
The internship has been a really useful experience for me, in which I gained a lot of new knowledge that will definitely be useful for my future studies. I am grateful that my assignments had a lot of variety instead of focusing on just one specific area. This allowed me to learn more and also to challenge myself to overcome the many different kinds of difficulties encountered during my internship. Having many assignments also required me to manage my work time efficiently, prioritizing the urgent tasks. Some tasks required me to do research with little online documentation available; other tasks required me to attempt work I had never experienced before, learning only from documentation. Although the tasks were sometimes difficult and overwhelming, I was really excited to push my skills to the limit and carry out the tasks assigned to me.
Besides technical skills, I also observed and learned a lot of soft skills from my supervisors and my co-workers, such as professional communication and teamwork. I have also learned a lot from my supervisor, who was always willing to help me when I faced difficulties and to share his knowledge and wisdom from his past experience.
My internship experience has definitely improved my hard skills in IT and sharpened my soft skills a lot more than I expected. It has shaped a better mindset in me and motivated me to keep on exploring and challenging myself in the world of information technology.
4.2 Technical Outcomes
i. Learned the basics of AI and its subdomain Machine Learning.
ii. Understood a wide variety of learning algorithms.
iii. Understood how to evaluate models generated from data.
iv. Applied the algorithms to a real problem.
v. Optimized the models learned and reported on the expected accuracy that can be achieved by applying the models.
4.3 System Analysis and Design
4.3.1 Existing System
Bank employees check the details of each applicant manually and give the loan to eligible applicants. Checking the details of all applicants takes a lot of time.
In prior work, an artificial neural network model has been used to predict the credit risk of a bank, with a feed-forward back-propagation neural network used to forecast credit default. Another method combines two or more classifiers to produce an ensemble model for better prediction. The authors used bagging and boosting techniques and then the random forest technique. Combining classifiers improves performance on the data and gives better efficiency. In this work, the authors describe various ensemble techniques for binary classification and also for multi-class classification. The new ensemble technique described by the authors is COB, which gives effective classification performance but is compromised by noise and outlier data. Finally, they concluded that the ensemble-based algorithm improves the results for the training data set.
4.3.2 Disadvantages of the Existing System
Checking the details of all applicants consumes a lot of time and effort. There is a chance of human error when all details are checked manually.
There is a possibility of assigning a loan to an ineligible applicant.
4.3.3 Proposed System
To deal with the problem, we developed automatic loan prediction using machine learning techniques. We train the machine with a previous dataset so that it can analyse and understand the process. The machine then checks for eligible applicants and gives us the result.
4.3.4 Advantages of the Proposed System
i. The time period for loan sanction will be reduced.
ii. The whole process will be automated, so human error will be avoided.
iii. Eligible applicants will be sanctioned a loan without any delay.
4.4 System Architecture
Figure 4.1 gives a brief flowchart of the proposed system.
Figure 4.1: System Architecture
First, take the dataset and divide it into training and testing datasets. Then perform data cleaning and data preprocessing. After preprocessing the data, build the model with the machine learning algorithms and pick the best algorithm based on accuracy. Finally, give the input and get the result.
Figure 4.2: Use case Diagram
Figure 4.3: Class Diagram for Loan Sanction Amount Prediction
Figure 4.3 shows loan sanction features such as Employment, Applicant, Bank loan, credit reference, vehicle, person details and so on. We use these features to decide on the loan. If all these features qualify the applicant, the loan can be sanctioned after verifying the details of the person.
4.5 Implementation
The project is implemented in the following modules:
i. Preprocessing the dataset
ii. Training the model using the dataset
iii. Evaluating the trained model and finding the best algorithm for the project
4.5.1 Modules
Module 1: Preprocessing the dataset
i. This step is performed using the sklearn.preprocessing package.
ii. In general, learning algorithms benefit from standardization of the data set. If some outliers are present in the set, robust scalers or transformers are more appropriate.
iii. So, StandardScaler is used to transform the dataset.
Module 2: Training the model using different ML algorithms
i. The algorithms used for training are Linear Regression, Decision Tree Regression and Random Forest Regression.
ii. These three algorithms are individually trained and tested.
Module 3: Evaluating the trained model and finding the best algorithm for the project
i. The trained model is evaluated using the testing dataset.
ii. The best algorithm is found by calculating the accuracy of each individual model.
iii. Because the target is a continuous amount, the accuracy is measured as the R2 score, available as the r2_score() function in the sklearn.metrics package; the accuracy_score() function in the same package applies only to classification.
4.6 Screenshots
    import pandas as pd
    import matplotlib.pyplot as plt
    %matplotlib inline
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import LabelEncoder
    from sklearn.metrics import confusion_matrix, classification_report, accuracy_score
    from sklearn.manifold import TSNE
    from sklearn.feature_extraction.text import TfidfVectorizer
    from keras.preprocessing.text import Tokenizer
    from keras.preprocessing.sequence import pad_sequences
    from keras.models import Sequential
    from keras.layers import Activation, Dense, Dropout, Embedding, Flatten, Conv1D, MaxPooling1D, LSTM
    from keras import utils
    from keras.callbacks import ReduceLROnPlateau, EarlyStopping
    import nltk
    from nltk.corpus import stopwords
    from nltk.stem import SnowballStemmer
    import gensim
    import re
    import numpy as np
    import os
    from collections import Counter
    import logging
    import time
    import pickle
    import itertools
    logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)
Figure 4.4 Importing Python Libraries
For training and testing, we separate the columns of the data and assign them to x_train, y_train, x_test and y_test.
    df_train, df_test = train_test_split(df, test_size=1-TRAIN_SIZE, random_state=42)
    print("TRAIN size:", len(df_train))
    print("TEST size:", len(df_test))
    TEST size: 220008
Figure 4.5 Training and Testing dataset
Training and testing the model using Linear Regression
Figure 4.6 Linear Regression Algorithm
Linear Regression models the target as a weighted sum of the input features and outputs a continuous value, which suits the prediction of a loan sanction amount. Logistic Regression, by contrast, is a classification algorithm used to assign observations to a discrete set of classes: it transforms its output using the logistic sigmoid function to return a probability value, which can then be mapped to two or more discrete classes.
Training and testing the model using Decision Tree Regression
Figure 4.7 Decision Tree Algorithm
Decision Tree is a supervised learning technique that can be used for both classification and regression problems, although it is mostly preferred for solving classification problems. It is a tree-structured model in which internal nodes represent the features of a dataset, branches represent the decision rules and each leaf node represents the outcome. In a Decision Tree there are two kinds of nodes, the Decision Node and the Leaf Node. Decision nodes are used to make a decision and have multiple branches, whereas Leaf nodes are the output of those decisions and do not contain any further branches.
Training and testing the model using Random Forest Regression
Figure 4.8 Random Forest Algorithm
Random Forest, as its name implies, contains a large number of individual decision trees that act as a group to decide the output.
Each tree in a random forest makes its own prediction. For classification, the result is the class predicted by the most trees; for regression, as in this project, the result is the average of the trees' predictions. Random Forest performs well because the trees protect each other from their individual errors. Although some trees may predict the wrong answer, many other trees will correct the final prediction, so as a group the trees move in the right direction. Random Forests reduce overfitting by combining many weak learners, each of which underfits because it only uses a subset of the training samples. Random Forests can also handle a large number of variables in a data set.

CHAPTER-5
CONCLUSION

I have studied various approaches to loan sanction amount prediction using machine learning techniques such as Linear Regression, Decision Tree and Random Forest. Researchers have also used machine learning for event summarization, real-time event detection and sentence-based sentiment classification, accurately and efficiently. Random forest regression is insensitive to unbalanced data, which gives more accurate results. From a proper analysis of the positive points and constraints of the component, it can be safely concluded that the product is a considerably productive component. The application is working properly and meets all banker requirements. The component can be easily plugged into many other systems. There have been a number of cases of computer glitches and errors in content and, most importantly, the weight of each feature is fixed in the automated prediction system. In the near future, the software could therefore be made more secure and reliable, with dynamic weight adjustment. The prediction module can also be integrated with the module of an automated processing system. The system is currently trained on an old training dataset; in future, the software can be made such that new testing data also becomes part of the training data after some fixed time.
