Mango Yield Prediction - Thirukkuvalai

DATE: 19-05-2002 SUBJECT: PROJECT WORK
AI AGRI AID (AAA): MANGO YIELD PREDICTION USING MACHINE LEARNING TECHNIQUES
PROJECT MEMBERS
• GIRIJA.S (REG NO: 822219104011)
• MANIKANDAN.M (REG NO: 822219104020)
• MEENACHI MADHUMITHA.V (REG NO: 822219104022)
Under the guidance of
Mr. S. MADHAN, B.E., M.TECH.,
ASSISTANT PROFESSOR
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
MANGO YIELD PREDICTION USING MACHINE LEARNING
AIM
• The aim of mango yield prediction is to forecast the amount of mango production that can be expected in a
given season or year.
• Mango yield prediction is important for farmers, agricultural companies, and government agencies, as it can
help with planning and resource allocation, such as deciding how much land to allocate for mango farming.
SCOPE
Mango yield prediction has many practical applications. It can be used to:
• Help farmers and growers plan their harvest and manage their crops more efficiently.
• Estimate the production of mangoes at a regional or national level, which can help inform market demand and supply.
• Assist in developing pricing strategies for mangoes, based on expected yield and demand.
• Help to identify potential areas of risk for crop failure, allowing for preemptive action to be taken to mitigate this risk.
• Aid in crop insurance, by providing accurate yield predictions to insurance companies.
• Provide valuable information to government agencies responsible for agricultural policy, such as planning and
implementing agricultural subsidies or price support schemes.
EXISTING SYSTEM
• In the agriculture industry, yield prediction is crucial for crops like fruits and vegetables, and
it also benefits the producers.
• In order to predict current yields, the Chuping Meteorological Department in Perlis collected
and compared yield data from the Harumanis orchard, which is overseen by the Perlis
Agriculture Department.
• In the current method, yield prediction is carried out solely using temperature and rainfall data
and is accomplished by data mining techniques.
PROPOSED SYSTEM
• By predicting mango yield, farmers and growers can plan their harvest, manage their crops more efficiently, and
optimize their resources such as labor, fertilizer, and water.
• Our proposed system uses Support Vector Machines (SVM) to predict mango yields. SVM is a machine learning
algorithm that is widely used for classification and regression tasks, and it has been reported to be effective in
predicting agricultural yields.
• To develop our system, we collected data on various environmental factors that influence mango growth, such as
temperature, rainfall, humidity, and soil quality.
• We also gathered historical yield data from previous years to use as training data for our SVM model.
LITERATURE SURVEY
S.NO | TITLE | AUTHOR | JOURNAL | YEAR | METHODOLOGY
1 | Crop Prediction Based on Characteristics of the Agricultural Environment Using Various Feature Selection Techniques and Classifiers | S. P. RAJA, BARBARA SAWICKA, ZORAN STAMENKOVIC | IEEE | 2022 | Machine learning
2 | Ensemble Machine Learning Techniques Using Computer Simulation Data for Wild Blueberry Yield Prediction | HAYAM R. SEIREG, YASSER M. K. OMAR | IEEE | 2022 | Machine learning with ensemble learning techniques
3 | A Data-Driven Model for Pedestrian Behavior Classification and Trajectory Prediction | VASILEIA PAPATHANASOPOULOU, IOANNA SPYROPOULOU | IEEE | 2022 | Neural network with LSTM (long short-term memory)
4 | Prediction of Typhoon Track and Intensity Using a Generative Adversarial Network With Observational and Meteorological Data | MARIO RÜTTGERS, SOOHWAN JEON | IEEE | 2022 | Deep learning techniques with typhoon prediction
RANDOM FOREST
• Random Forest is an ensemble machine learning algorithm, built from decision trees, that is used for
classification, regression, and other tasks. In simple terms, Random Forest builds a large number of decision trees
and combines their outputs to make predictions.
• To build a Random Forest model, the algorithm trains each decision tree on a random (bootstrap) sample of the
training data and considers only a random subset of features when choosing each split.
• This randomization helps reduce overfitting and typically improves the accuracy of the model. Then, when
making a prediction for a new data point, the Random Forest algorithm aggregates the predictions of
all the decision trees (a majority vote for classification, an average for regression) to make a final prediction.
• Random Forest is a powerful algorithm that can handle complex and high-dimensional datasets, as
well as categorical and numerical data. It also provides an importance score for each feature, which
can be used to identify the most important features in the dataset.
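The points above can be sketched in a few lines of scikit-learn. This is only an illustration, not the project's code: the data is synthetic, and the feature names (rainfall, temperature, humidity, soil_ph) and the assumed relationship between them and yield are invented for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (NOT the project's dataset): 200 samples, 4 features.
rng = np.random.default_rng(0)
feature_names = ["rainfall", "temperature", "humidity", "soil_ph"]  # hypothetical names
X = rng.uniform(0.0, 1.0, size=(200, 4))
# Assumed relationship: yield depends mostly on rainfall, a little on temperature.
y = 10.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(0.0, 0.1, size=200)

# Each tree is fit on a bootstrap sample of the rows; max_features limits how
# many features are considered at each split, which decorrelates the trees.
forest = RandomForestRegressor(n_estimators=100, max_features=2, random_state=0)
forest.fit(X, y)

# For regression, the forest's prediction is the mean of the per-tree predictions.
x_new = np.array([[0.5, 0.5, 0.5, 0.5]])
per_tree = np.array([tree.predict(x_new)[0] for tree in forest.estimators_])
forest_prediction = forest.predict(x_new)[0]
print(f"mean of trees: {per_tree.mean():.3f}, forest: {forest_prediction:.3f}")

# Importance scores sum to 1; larger means the feature contributed more to the splits.
importances = dict(zip(feature_names, forest.feature_importances_))
for name, score in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

On this synthetic data the rainfall column should rank first, because it was given the largest effect when the data was generated.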
RANDOM FOREST
[Figure: Mango yield prediction with Random Forest. The Train and Test stages each show four trees: Tree 1 (seed type), Tree 2 (rainfall), Tree 3 (temp), Tree 4 (pesticides).]
DATA FLOW DIAGRAM
[Figure: data flow diagram. Boxes: Dataset; X-train, Y-train, X-test, Y-test; Random forest; Trained random forest model; New data input; Predicted yield.]
SVM (Support Vector Machine)
• Support Vector Machine (SVM) is a powerful machine learning algorithm that can be used for classification or regression tasks.
• It works by finding a hyperplane that separates the data points into different classes or, for regression, that fits a
continuous output variable.
• For classification, the hyperplane is chosen to maximize the margin to the closest data points of each class (the support
vectors), which tends to improve robustness to noise and reduce overfitting.
• In classification tasks, SVM aims to find a decision boundary that separates the data points of different classes with the largest
possible margin.
• This decision boundary is then used to classify new data points based on which side of the boundary they fall on. SVM can handle
both linearly separable and non-linearly separable data by using a kernel function that maps the input data to a higher-dimensional
space where it can be separated by a hyperplane.
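• In regression tasks (support vector regression, SVR), SVM predicts a continuous output variable by fitting a function that
keeps most training points within a tolerance margin (epsilon) of the prediction, approximating the relationship
between the input features and the output variable.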
• SVM can handle non-linear relationships by using a kernel function that maps the input data to a higher-dimensional space where
it can be approximated by a hyperplane.
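The effect of the kernel in regression can be seen in a small sketch. It assumes scikit-learn's SVR and uses a synthetic one-dimensional sine curve rather than any mango data; the values chosen for C and epsilon are arbitrary illustrations.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic non-linear data (illustration only): y = sin(x) plus small noise.
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0.0, 6.0, size=(120, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(0.0, 0.05, size=120)

# Same C and epsilon for both models; only the kernel differs.
linear_svr = SVR(kernel="linear", C=10.0, epsilon=0.1).fit(X, y)
rbf_svr = SVR(kernel="rbf", C=10.0, epsilon=0.1, gamma="scale").fit(X, y)

# R^2 on the training data: the linear kernel cannot follow the curve,
# while the RBF kernel fits it in the implicit higher-dimensional space.
linear_r2 = linear_svr.score(X, y)
rbf_r2 = rbf_svr.score(X, y)
print(f"linear kernel R^2: {linear_r2:.3f}")
print(f"RBF kernel R^2:    {rbf_r2:.3f}")

# Points lying strictly inside the epsilon tube do not become support vectors.
print(f"support vectors used by RBF model: {len(rbf_svr.support_)} of {len(X)}")
```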
SVM BLOCK DIAGRAM
[Figure: SVM block diagram. Training path: Dataset -> Pre-processing -> Feature extraction -> Trained SVM model. Prediction path: New test input -> Feature extraction -> Trained SVM model -> Predicted output.]
SYSTEM REQUIREMENTS
• Front end: Python
• Back end: CSV (comma-delimited) files
• Operating system: Windows
• System type: 64-bit OS
• IDE: Python 3.6.5 IDLE

MODULES
• Install the necessary Python libraries: You need to install the following libraries: pandas,
scikit-learn, numpy, matplotlib, seaborn, and scipy.
• Load and pre-process the dataset: Load the dataset into a pandas DataFrame and perform any
necessary pre-processing steps such as removing missing data, scaling, and encoding categorical
variables.
• Split the dataset into training and testing sets: Split the pre-processed dataset into two sets,
one for training the model and one for testing the model. A common split ratio is 80:20 or 70:30
for training and testing, respectively.
• Feature selection: Identify the most important features for mango yield prediction. This can be
done using techniques such as correlation analysis, feature importance ranking, or principal
component analysis.
• Model Training: Train the SVM and Random Forest models using the training set. Set the hyperparameters
of the models such as kernel, regularization parameter, or number of trees.
• Model Evaluation: Evaluate the performance of the models using metrics suited to the task. Because yield is a
continuous value, regression metrics such as mean absolute error (MAE), root mean squared error (RMSE), and R²
apply; accuracy, precision, recall, and F1-score apply only if yield is binned into classes. You can also use
cross-validation techniques to check that the models are not overfitting the training data.
• Hyperparameter Tuning: Optimize the hyperparameters of the models using techniques such as grid
search or random search.
• Prediction: Use the trained models to predict the mango yield for new data points.
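The module steps above can be strung together roughly as follows. This is a minimal sketch under stated assumptions, not the project's implementation: the DataFrame is generated synthetically in place of the CSV file, and every column name (location, rainfall_mm, temperature_c, soil_ph, yield_t_per_ha) is hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the CSV data; all column names are hypothetical.
rng = np.random.default_rng(42)
n = 300
df = pd.DataFrame({
    "location": rng.choice(["north", "south", "east"], size=n),
    "rainfall_mm": rng.uniform(500, 1500, size=n),
    "temperature_c": rng.uniform(22, 38, size=n),
    "soil_ph": rng.uniform(5.5, 7.5, size=n),
})
df["yield_t_per_ha"] = (6.0 + 0.006 * df["rainfall_mm"]
                        - 0.05 * (df["temperature_c"] - 28) ** 2
                        + rng.normal(0.0, 0.3, size=n))
# Simulate some missing readings, then remove the incomplete rows.
df.loc[df.sample(frac=0.05, random_state=0).index, "soil_ph"] = np.nan
df = df.dropna()

# 80:20 split into training and testing sets.
X = df.drop(columns="yield_t_per_ha")
y = df["yield_t_per_ha"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Encode the categorical column and scale the numeric ones inside the pipeline,
# so the same transformations are applied to new data at prediction time.
preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["location"]),
    ("num", StandardScaler(), ["rainfall_mm", "temperature_c", "soil_ph"]),
])
model = Pipeline([
    ("prep", preprocess),
    ("rf", RandomForestRegressor(n_estimators=200, random_state=0)),
])
model.fit(X_train, y_train)

# Regression metrics on the held-out test set.
pred = model.predict(X_test)
mae = mean_absolute_error(y_test, pred)
rmse = mean_squared_error(y_test, pred) ** 0.5
r2 = r2_score(y_test, pred)
print(f"MAE={mae:.3f}  RMSE={rmse:.3f}  R^2={r2:.3f}")

# Predict the yield for a new, unseen data point.
new_point = pd.DataFrame([{"location": "south", "rainfall_mm": 900.0,
                           "temperature_c": 30.0, "soil_ph": 6.5}])
new_prediction = model.predict(new_point)
print(f"predicted yield: {new_prediction[0]:.2f}")
```

Keeping the pre-processing inside the pipeline is a design choice: it avoids fitting the scaler or encoder on test data by accident.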
SOFTWARE LIBRARIES
• PYTHON
• PANDAS
• NUMPY
• SCIKIT-LEARN
• PYQT5
PYTHON
• Python is a popular programming language for machine learning due to its simplicity, flexibility, and extensive
library support. Python provides a range of libraries for machine learning tasks, such as scikit-learn, TensorFlow,
PyTorch, Keras, and many others.
• These libraries provide tools for data pre-processing, model training and evaluation, hyperparameter tuning, and
prediction. Python also provides easy-to-use data structures and syntax, making it easier for developers to write
code for machine learning tasks. Additionally, Python has a large and active community of users, who contribute to
open source projects, provide support, and share knowledge through various forums and communities. Overall,
Python is a powerful and versatile language for machine learning, suitable for both beginners and experts in the
field.
PANDAS
Pandas is a popular open-source data manipulation and analysis library in Python that is commonly used in
machine learning projects. Pandas provides powerful data structures such as Series (1-dimensional labelled array)
and DataFrame (2-dimensional labelled table) for handling tabular data, time-series data, and other structured
data types.
Pandas is particularly useful for machine learning projects because it provides functions and vectorized
operations for data pre-processing and cleaning, such as removing missing data, encoding categorical variables,
scaling numeric features, and handling outliers. These operations can help to ensure that the input data is in a
suitable format for machine learning models.
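As a small illustration of these cleaning steps, the sketch below works on a five-row table made up for the example. The column names, the variety labels used as sample categories, and the 3000 mm clipping limit are assumptions, not values from the project.

```python
import numpy as np
import pandas as pd

# Tiny made-up table with a missing category, a missing number, and an outlier.
raw = pd.DataFrame({
    "variety": ["Alphonso", "Banganapalli", "Alphonso", None, "Neelum"],
    "rainfall_mm": [820.0, np.nan, 1010.0, 760.0, 5000.0],
    "temperature_c": [29.5, 31.0, 30.2, 28.8, 30.0],
})

# Remove rows that have any missing value.
clean = raw.dropna().copy()

# Handle outliers: clip rainfall to an assumed plausible range.
clean["rainfall_mm"] = clean["rainfall_mm"].clip(lower=0, upper=3000)

# Scale numeric features to zero mean and unit (sample) standard deviation.
numeric_cols = ["rainfall_mm", "temperature_c"]
clean[numeric_cols] = (clean[numeric_cols] - clean[numeric_cols].mean()) / clean[numeric_cols].std()

# Encode the categorical column as one indicator column per category.
encoded = pd.get_dummies(clean, columns=["variety"])
print(encoded)
```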
NUMPY
• NumPy is a Python library for numerical computing that is widely used in machine learning applications.
NumPy provides a powerful N-dimensional array object that can be used to store and manipulate large
amounts of numerical data efficiently.
• NumPy provides a wide range of functions for performing mathematical operations on arrays, including
linear algebra, Fourier analysis, and random number generation. NumPy arrays are also the standard
input format for scikit-learn and interoperate with other machine learning libraries such as TensorFlow.
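A short sketch of the array operations mentioned above, using random number generation, vectorized arithmetic, and a least-squares fit from the linear algebra module. The rainfall and yield numbers are generated for the example and the linear relationship is an assumption.

```python
import numpy as np

# Random number generation: synthetic rainfall and yield values (illustration only).
rng = np.random.default_rng(1)
rainfall = rng.uniform(500.0, 1500.0, size=50)
yield_t = 2.0 + 0.004 * rainfall + rng.normal(0.0, 0.2, size=50)

# Vectorized arithmetic: standardize the whole array without a Python loop.
z = (rainfall - rainfall.mean()) / rainfall.std()

# Linear algebra: least-squares fit of yield = intercept + slope * rainfall.
A = np.column_stack([np.ones_like(rainfall), rainfall])
coef, residuals, rank, singular_values = np.linalg.lstsq(A, yield_t, rcond=None)
intercept, slope = coef
print(f"intercept={intercept:.3f}, slope={slope:.5f}")
```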
SCIKIT-LEARN
• Scikit-learn is a popular Python library for machine learning. It provides a range of supervised and
unsupervised learning algorithms, as well as tools for model selection, evaluation, and
pre-processing of data.
• Scikit-learn is built on top of other Python scientific computing libraries such as NumPy, SciPy, and
matplotlib. It provides a simple and consistent API for building machine learning models and
working with data, making it a popular choice for both beginners and experts in the field.
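The consistent API means different estimators can be swapped into the same evaluation code. The sketch below, on synthetic data with an assumed non-linear target, runs 5-fold cross-validation for an SVR pipeline and a Random Forest through one loop; the hyperparameter values are arbitrary.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic data (illustration only): 150 samples, 3 features.
rng = np.random.default_rng(7)
X = rng.uniform(0.0, 1.0, size=(150, 3))
y = 5.0 * X[:, 0] + np.sin(6.0 * X[:, 1]) + rng.normal(0.0, 0.1, size=150)

# Both objects expose the same fit/predict interface, so one loop handles both.
models = {
    "svr": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0)),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# 5-fold cross-validated R^2 for each model.
cv_scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2")
    for name, model in models.items()
}
for name, scores in cv_scores.items():
    print(f"{name}: mean R^2 = {scores.mean():.3f} (+/- {scores.std():.3f})")
```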
PYQT5
• PyQt5 is a Python library that provides a comprehensive set of GUI (Graphical User Interface)
components for desktop applications. It allows developers to create visually appealing and responsive
applications with a wide range of features, such as buttons, menus, toolbars, tables, and dialog boxes.
• When it comes to machine learning, PyQt5 can be used to develop GUI-based applications for various
tasks such as data visualization, model training, data pre-processing, and data analysis. With PyQt5, you
can create interactive visualizations that allow users to explore datasets, adjust model parameters, and
visualize the results.
RESULT AND DISCUSSION
The use of machine learning algorithms such as SVM and Random Forest models to predict mango yield based on
environmental factors such as location, pH, temperature, pesticides, and fertilizer has the potential to provide more
accurate and reliable predictions than traditional methods. These models can learn from historical data and identify
patterns and correlations that may not be apparent to human analysts.
The discussion of the project's results can focus on the accuracy and reliability of the chosen model for predicting
mango yield. The best-performing model can be chosen based on its performance on the testing set, and its
accuracy can be further evaluated using cross-validation techniques. The project's outcomes can be used to
optimize mango farming practices by providing farmers with predictions of mango yield based on key
environmental factors. It's worth noting that the accuracy of the predictions may depend on the quality and
quantity of the data used to train the models. Therefore, it's important to collect and use high-quality data to
improve the models' accuracy and reliability. Additionally, the models can be further improved by tuning
hyperparameters such as the kernel function in SVM or the number of estimators in the Random Forest model.
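Tuning of this kind could look like the following grid search over the SVM kernel and regularization parameter. It is a sketch on synthetic data, assuming scikit-learn's GridSearchCV; the grid values are arbitrary, and the comparison against a predict-the-mean baseline is added only to give the error a reference point.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic data (illustration only).
rng = np.random.default_rng(3)
X = rng.uniform(0.0, 1.0, size=(200, 3))
y = 5.0 * X[:, 0] + np.sin(6.0 * X[:, 1]) + rng.normal(0.0, 0.1, size=200)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Scaling lives inside the pipeline so it is re-fit on each cross-validation fold.
pipe = Pipeline([("scale", StandardScaler()), ("svr", SVR())])
param_grid = {
    "svr__kernel": ["linear", "rbf"],
    "svr__C": [1.0, 10.0, 100.0],
}
search = GridSearchCV(pipe, param_grid, cv=5, scoring="neg_mean_absolute_error")
search.fit(X_train, y_train)
print("best parameters:", search.best_params_)

# Evaluate the refit best model on the held-out test set.
test_mae = mean_absolute_error(y_test, search.predict(X_test))
# Baseline: always predict the mean training yield.
baseline_mae = mean_absolute_error(y_test, np.full_like(y_test, y_train.mean()))
print(f"test MAE: {test_mae:.3f} (baseline: {baseline_mae:.3f})")
```

The same pattern applies to the Random Forest by replacing the estimator and searching over n_estimators instead.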
SCREENSHOT
[Figure: application screenshot; image not reproduced here.]
CONCLUSIONS
In conclusion, our proposed system for predicting mango yields using the Random Forest algorithm has
shown promising results. By utilizing various environmental factors and historical yield data, we were able to
train an accurate and efficient model for predicting mango yields. The Random Forest algorithm
has proven to be effective in predicting agricultural yields, and our system
provides a valuable tool for mango farmers to make informed decisions about their crop management practices.
Future work for our system could include incorporating additional factors such as pest and disease prevalence, as
well as exploring ways to improve the accuracy of the model. Overall, our system has the potential to assist mango
farmers in increasing their yields and improving their profitability.
FUTURE ENHANCEMENTS
• Incorporating remote sensing data
• Integration with IoT devices
• Integration with blockchain technology
• Geographic Information System (GIS) integration
• Multi-crop yield prediction

