Proposal Document Orignal 240603 101836
University of Sialkot
I certify that the project titled An Automated CV / Resume Analyzer Via Using Machine
__________________________________
Ma’am Shimza
Department of Software Engineering
Faculty of Computing & Information Technology
University of Sialkot, Punjab, Pakistan.
Dated: 03-June-2024
List of Tables
Table 1: Literature Review Comparison
Table 2: Task Division Table
Table 3: Bill of Material (Calculation I)
INTRODUCTION
After completing their education, individuals typically enter the job market, often
using their Curriculum Vitae (CV) or Resume as their primary representation. However,
many individuals start working before finishing their formal education. In the modern job
search landscape, technology has made the process more efficient. Yet, the abundance of
applicants for each position can overwhelm employers, making CV assessment
challenging. Some companies mandate specific formats for applicants to ease the process,
but it remains tedious and prone to errors. Automated CV analyzers, leveraging
technologies like natural language processing and machine learning, offer a solution by
improving efficiency and accuracy in candidate evaluation, addressing volume,
inconsistency, and bias issues.
PROBLEM STATEMENT
The traditional process of manually reviewing CVs is inefficient and error-prone due to
the lack of standardization, creating challenges for HR professionals, especially in larger
organizations handling numerous applications. There's a clear need for a more efficient,
accurate, and unbiased candidate evaluation method. Implementing an automated CV
analyzer can address this need, improving recruitment speed and fairness, facilitating
prompt hiring of top talent. This solution is crucial for enterprises, recruitment agencies,
and SMEs, given the substantial market size in HR software and recruitment outsourcing.
The issue is particularly critical when managing high application volumes and requires
continuous attention. Addressing this problem is vital for enhancing efficiency, fairness,
talent acquisition speed, and overall organizational effectiveness.
Project Motivation
While job search processes have evolved, CV/Resume
evaluation still relies heavily on manual methods. Advancements in
Natural Language Processing (NLP) and Machine Learning (ML) offer a promising
solution. These technologies are already part of daily routines such as email and online
shopping, and that familiarity highlights their potential to automate and streamline
candidate selection.
Project/Product Scope
The project targets the creation of a web-based application to automate CV analysis using
Natural Language Processing (NLP) and Machine Learning (ML). It will assess and rank
CVs based on relevance across multiple domains, aiming for a comprehensive candidate
© Department of Software Engineering
Faculty of Computing & IT
University of Sialkot
9
profile. However, the computational requirements may present challenges for smaller
organizations. Nonetheless, the project's success promises to greatly improve the
efficiency and speed of candidate selection processes.
PROJECT OVERVIEW/GOAL
The project aims to create an automated CV analyzer system using advanced
technologies like Natural Language Processing (NLP) and Machine Learning (ML) to
transform recruitment processes. By addressing inefficiencies and biases in traditional
CV evaluation methods, the system offers a more efficient, accurate, and fair approach to
candidate selection. Key features include comprehensive analysis, adaptability across
domains, efficiency, accuracy, and instant candidate feedback. The final product will be a
web-based application providing recruiters with ranked candidate lists based on job
requirements, leveraging NLP and ML algorithms for CV parsing and analysis. The product
is expected to be packaged as a cloud-based web solution accessible via standard browsers.
Its hardware components include servers and storage. Its software components include
programming languages such as Python, frameworks such as TensorFlow, libraries such as
spaCy and scikit-learn, web development technologies, and database systems for efficient
data management.
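As a rough illustration of how a ranked candidate list could be produced from job requirements, the sketch below scores each CV against a job description using TF-IDF weighting and cosine similarity. This is an assumed stand-in technique, not the method the proposal commits to, and it uses only the Python standard library rather than spaCy or scikit-learn so that it stays self-contained. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_cvs`) and the sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word-like tokens."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (a dict of term -> weight) per document."""
    tokenized = [Counter(tokenize(doc)) for doc in documents]
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for counts in tokenized for term in counts)
    # Smoothed IDF, so terms present in every document keep a small positive weight.
    idf = {term: math.log((1 + n_docs) / (1 + df)) + 1 for term, df in doc_freq.items()}
    return [{term: tf * idf[term] for term, tf in counts.items()} for counts in tokenized]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_cvs(job_description, cvs):
    """Return (candidate, score) pairs sorted from most to least similar."""
    vectors = tfidf_vectors([job_description] + list(cvs.values()))
    job_vec, cv_vecs = vectors[0], vectors[1:]
    scores = {name: cosine(job_vec, vec) for name, vec in zip(cvs, cv_vecs)}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical inputs, for illustration only.
job = "Python developer with machine learning and SQL experience"
cvs = {
    "candidate_a": "Graphic designer skilled in Photoshop and illustration",
    "candidate_b": "Software engineer: Python, machine learning, SQL, Flask",
}
ranking = rank_cvs(job, cvs)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A lexical score like this only rewards shared vocabulary. A production system would likely need the parsing and learned models described in this proposal to handle synonyms and CV structure.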
2. Segmentation Component:
• Identifies segments in the HTML using headings, font size, and structure,
facilitated by parse tree and font analysis.
4. Evaluation Component:
• Applies the ID3 decision tree algorithm to classify and rank CVs, calculating
entropy and information gain to identify the best attributes for evaluation.
Compares performance with logistic regression for potential enhancements.
5. Training Component:
• Continuously trains the system using collected data to refine decision-making
and enhance accuracy, updating training data based on feedback from decision
tree evaluations.
Each of these components plays a vital role in processing, analyzing, and evaluating CVs
to automate candidate selection in a structured, efficient manner.
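The entropy and information gain calculation that the Evaluation Component relies on can be sketched as follows. This is a minimal illustration of the attribute-selection step in ID3, assuming categorical attributes. The attribute names (`degree`, `python`), the toy rows, and the `shortlisted` labels are hypothetical and are not taken from the project's data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the rows on one attribute."""
    total = len(labels)
    # Group the class labels by the value each row takes for the attribute.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the partitions after the split.
    remainder = sum(len(part) / total * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """ID3 picks the attribute with the highest information gain at each node."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy data: each row describes one CV with two categorical attributes.
cv_rows = [
    {"degree": "BS", "python": "yes"},
    {"degree": "BS", "python": "no"},
    {"degree": "MS", "python": "yes"},
    {"degree": "MS", "python": "no"},
]
shortlisted = ["yes", "no", "yes", "no"]

chosen = best_attribute(cv_rows, shortlisted, ["degree", "python"])
print(chosen)
```

In this toy data the `python` attribute separates the two classes perfectly, while `degree` leaves each partition evenly mixed, so ID3 would split on `python` first. A full tree would repeat the same selection recursively on each partition.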
1. System Design:
• File Conversion: Convert CV/Resume from formats like PDF/DOCX to HTML
to retain original formatting and font size information.
2. Segmentation:
• Font-Based Segmentation: Segment the CV/Resume based on font sizes and
styles identified in the HTML, since headings and key sections usually have
larger or bold fonts.
• Syntax Analysis: Utilize syntax trees and pattern recognition to identify and
finalize segments accurately.
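The font-based segmentation step can be sketched with Python's standard `html.parser` module. The example assumes that the converted HTML carries inline `font-size` styles in points, which depends on the conversion tool actually used. The `FontSegmenter` class, the 14pt heading threshold, and the sample markup are hypothetical, and font styles such as bold are not handled here.

```python
import re
from html.parser import HTMLParser

FONT_SIZE = re.compile(r"font-size:\s*(\d+(?:\.\d+)?)pt")
# Void elements never receive a closing tag, so they must not touch the stack.
VOID_TAGS = {"br", "hr", "img", "meta", "link", "input"}


class FontSegmenter(HTMLParser):
    """Splits CV HTML into segments, starting a new one at each heading-sized run."""

    def __init__(self, heading_pt=14.0):
        super().__init__()
        self.heading_pt = heading_pt  # assumed threshold for "heading" text
        self.size_stack = [0.0]       # font size inherited from enclosing tags
        self.segments = []            # list of (heading, [body lines]) tuples

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        match = FONT_SIZE.search(style)
        # A tag without its own font size inherits the size of its parent.
        size = float(match.group(1)) if match else self.size_stack[-1]
        self.size_stack.append(size)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and len(self.size_stack) > 1:
            self.size_stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self.size_stack[-1] >= self.heading_pt:
            self.segments.append((text, []))      # large text opens a new segment
        elif self.segments:
            self.segments[-1][1].append(text)     # body text joins the current segment
        # Body text that appears before the first heading is ignored in this sketch.


# Hypothetical converted CV fragment.
sample_html = (
    '<p><span style="font-size:16pt">Education</span></p>'
    '<p><span style="font-size:11pt">BS Software Engineering</span></p>'
    '<p><span style="font-size:16pt">Skills</span></p>'
    '<p><span style="font-size:11pt">Python, SQL</span></p>'
)
segmenter = FontSegmenter()
segmenter.feed(sample_html)
for heading, body in segmenter.segments:
    print(heading, body)
```

A fixed point-size threshold is the simplest possible rule. Comparing each run against the document's most common font size would probably be more robust across differently formatted CVs.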
GANTT CHART
Sample Gantt chart
BUSINESS PLAN
1. Executive Summary:
The project aims to create a web app that uses NLP and ML to analyze CVs, helping
recruiters evaluate and rank them based on relevance. It targets large enterprises,
recruitment agencies, and SMEs to simplify their hiring process. The outcome should be
a fairer, more efficient method of candidate selection, offering ranked lists of candidates
aligned with job criteria.
2. Business Description:
The project aims to tackle inefficiencies and biases in traditional CV evaluation. Its
mission is to transform recruitment by introducing an automated CV analyzer for a fairer,
ETHICAL CONSIDERATION
Ethical considerations in research cover key principles such as
informed consent, confidentiality, non-maleficence, beneficence, justice, and respect for
autonomy. Informed consent ensures participants are fully informed and consent
voluntarily, while confidentiality protects their privacy and data. Non-maleficence means
avoiding harm to participants, while beneficence seeks to maximize benefits and minimize risks.
Justice demands fair treatment and equitable distribution, and respect for autonomy
upholds participants' right to make their own decisions.
WORK DIVISION
Table 2: Task Division Table
Feasibility Study
Backend Development
Backend Development
Database Development
Effort Calculation:
• Effort (Person-Months) = a * (KLOC)^b
• Effort = 2.4 * (5)^1.05
• Effort ≈ 13.01 person-months
Time Calculation:
• Time (Months) = c * (Effort)^d
• Time = 2.5 * (13.01)^0.38
• Time ≈ 6.63 months
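The effort and time figures above can be checked with a few lines of Python. The sketch assumes the coefficients shown in the calculation (a = 2.4, b = 1.05, c = 2.5, d = 0.38), which match the basic COCOMO organic-mode values, and the 5 KLOC size estimate. The function name `cocomo_basic` is hypothetical.

```python
def cocomo_basic(kloc, a=2.4, b=1.05, c=2.5, d=0.38):
    """Return (effort in person-months, development time in months)."""
    effort = a * kloc ** b          # Effort = a * (KLOC)^b
    time_months = c * effort ** d   # Time = c * (Effort)^d
    return effort, time_months


effort, time_months = cocomo_basic(5)
print(f"Effort = {effort:.2f} person-months")
print(f"Time   = {time_months:.2f} months")
```

Results may differ from hand-calculated figures in the last decimal place, depending on where intermediate values are rounded.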
Person Calculation: