Visvesvaraya Technological University: Bachelor of Engineering in Computer Science & Engineering

VISVESVARAYA TECHNOLOGICAL UNIVERSITY
“Jnana Sangama”, Machhe, Belagavi, Karnataka-590018
A Project Report
On
“Sunflower Yield Prediction Using Machine Learning”
Submitted in partial fulfillment of the requirements for the award of the degree of
Bachelor of Engineering
in
Computer Science & Engineering
Submitted by
MANASA M 4GW18CS041
ANUPAMA P 4GW18CS008
MEGHANA S D 4GW18CS048
Under the Guidance of

Mrs.Harshitha B
Assistant Professor
Dept. of CSE, GSSSIETW
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

(Accredited by NBA, New Delhi, Validity 01.07.2017 to 30.06.2020 & 01.07.2020 – 30.06.2023)
GSSS INSTITUTE OF ENGINEERING & TECHNOLOGY FOR WOMEN

(Affiliated to VTU, Belagavi, Approved by AICTE, New Delhi & Govt. of Karnataka)
K.R.S ROAD, METAGALLI, MYSURU-570016, KARNATAKA
Accredited with Grade “A” by NAAC
2021- 2022
Geetha Shishu Shikshana Sangha (R)
CERTIFICATE
Certified that the 8th Semester Project titled “Sunflower Yield Prediction Using Machine
Learning” is a bonafide work carried out by Manasa M (4GW18CS041), Anupama P
(4GW18CS008) and Meghana S D (4GW18CS048) in partial fulfilment for the award of
degree of Bachelor of Engineering in Computer Science & Engineering of the Visvesvaraya
Technological University, Belagavi, during the year 2021-22. The Project report has been
approved as it satisfies the academic requirements with respect to the project work prescribed
for Bachelor of Engineering Degree.
Signature of the Guide: Mrs. Harshitha B
Signature of the HOD: Dr. S. Meenakshi Sundaram
Signature of the Principal: Dr. Shivakumar M
External Viva
Name of the Examiners Signature with Date
1.
2.
ACKNOWLEDGEMENT
The joy and satisfaction that accompany the successful completion of any task would be incomplete without mentioning the people who made it possible.
First and foremost, we offer our sincere thanks to Smt. Vanaja B Pandit, Honorary Secretary, GSSS(R) and the Management of GSSSIETW, Mysuru for providing help and support to carry out the project.
We would like to express our gratitude to our Principal, Dr. Shivakumar M, for providing us a congenial environment for engineering studies and also for having shown us the way to carry out the project.
We consider it a privilege and honour to express our sincere thanks to Dr. S. Meenakshi Sundaram, Professor and Head, Department of Computer Science & Engineering, for his support and invaluable guidance throughout the tenure of this project.
We would like to thank our Guide Mrs. Harshitha B, Assistant Professor, Department of Computer Science & Engineering, for her support, guidance, motivation and encouragement towards the successful completion of this project.
We would like to thank our Project Co-ordinators Smt. Madhu M Nayak, Assistant Professor, and Smt. Usha Rani J, Assistant Professor, Department of Computer Science & Engineering, for their constant monitoring, guidance and motivation throughout the tenure of this project.
We thank all the teaching and non-teaching staff of our Computer Science & Engineering department for their immense help and co-operation.
Finally, we would like to express our gratitude to our parents and friends, who always stood with us to complete this work successfully.
Manasa M (4GW18CS041)
Anupama P (4GW18CS008)
Meghana S D (4GW18CS048)
ABSTRACT
Sunflower production is the main agricultural activity in India. More than 350,000 Indian families depend on the sunflower harvest. Since sunflower rust disease was first reported in the country in 1983, these families have had to face severe consequences. Recently, machine learning approaches have been used to build datasets for monitoring sunflower rust incidence that involve weather conditions and physical sunflower properties. This background encouraged us to build a dataset for sunflower rust detection in Colombian sunflowers through a data mining process, the Cross Industry Standard Process for Data Mining (CRISP-DM). In this report we define suitable data for generating accurate models; once the dataset is built, it is tested using two methods: KNN and Linear Regression. For issues such as weather, temperature and several other factors, there is currently no proper solution or technology to overcome the situation farmers face. In India there are several ways to increase economic growth in the field of agriculture, and multiple ways to increase and improve the yield and quality of sunflowers. Data mining is also useful for predicting sunflower yield.
TABLE OF CONTENTS
Acknowledgement
Abstract
List of Figures
List of Tables
1 INTRODUCTION
1.1 Overview
1.2 Existing System
1.3 Proposed System
1.4 Objective
1.5 Problem Statement
2 LITERATURE SURVEY
3 SYSTEM REQUIREMENTS AND DESIGN
3.1 Requirements
3.1.1 Functional Requirements
3.1.2 Non-Functional Requirements
3.1.3 Hardware Requirements
3.1.4 Software Requirements
3.2 Design
3.2.1 ER and Schema Diagram
3.2.2 Design Description
4 IMPLEMENTATION
4.1 Methodology
4.2 Different Modules
4.3 Datasets
4.4 Implementation
5 TESTING
5.1 Introduction to Testing
6 RESULT AND DISCUSSION
6.1 Snapshots
CONCLUSION
FUTURE ENHANCEMENTS
REFERENCE
LIST OF FIGURES
FIGURE NUMBER    DESCRIPTION
Figure 3.1       Use Case Diagram
Figure 3.2.1a    Activity Diagram
Figure 3.2.1b    Sequence Diagram
Figure 3.2.2a    Architecture Diagram
Figure 3.2.2b    Flow Diagram
Figure 3.2.2c    Dataflow Diagram
Figure 4.1.1     KNN flow chart
Figure 4.1.2     Linear Regression flow chart
Figure 4.2       Dataset
Figure 6.1       Front page
Figure 6.2       Prediction method
Figure 6.3       Select input details
Figure 6.4       Prediction using Linear regression
Figure 6.5       Prediction using KNN
Figure 6.6       Comparison result of both Linear regression and KNN
Figure 6.7       Prediction page for new users
Figure 6.8       Output prediction by KNN for new users
LIST OF TABLES
TABLE NUMBER     DESCRIPTION
Table 5.1        Testing
Sunflower Yield Prediction Using Machine Learning
Chapter 1
INTRODUCTION
1.1 Overview
Six states, led by Karnataka, are the major producers of sunflower in the country. Karnataka, with a production of 3.04 lakh tonnes from an area of 7.94 lakh hectares, is followed by Andhra Pradesh, Maharashtra, Bihar, Orissa and Tamil Nadu. In India, sunflower cultivation occupies about 1.48 M ha with an average yield of 0.6 MT/acre. Sunflower production carries a systemic weather risk, as about 80 per cent of the area is under rain-fed production.
In terms of productivity, Bihar leads with 1402 kg/ha, followed by Tamil Nadu with 1328.7 kg/ha, although both states have less than 25000 hectares under sunflower, most of it irrigated. The average productivity at the all-India level was 900 kg/ha, depending on climatic conditions and irrigation, which are critical factors for high yields.
This work uses the K-Nearest Neighbor (KNN) algorithm and Linear Regression. KNN does not have a separate learning phase, because it uses the training set every time a classification is performed. The assumption behind the k-nearest neighbor algorithm is that similar samples produce similar classifications. The number of similar known samples used to assign a classification to an unknown sample is given by the parameter K. The results obtained by the KNN algorithm are compared with those obtained using Linear Regression. Linear Regression is a linear approach for modelling the relationship between a scalar dependent variable y and one or more explanatory variables (or independent variables) denoted x.
1.2 Existing System

Nowadays, many agricultural agencies provide guidance to farmers regarding sunflower yield. These agencies consider soil pH, nitrogen and other fertilizers to predict the sunflower yield and prescribe measures to increase it. Most of the time, weather conditions such as temperature and rainfall are not considered, so the predicted results are less accurate, and all of the prediction is done manually. There is no automation for the prediction of “Sunflower Yield”.
1.3 Proposed System

Prediction of sunflower yield has become a major issue in the agricultural field and is an area of concern. The prediction gives farmers a brief idea of how much yield can be expected on their plot, based on the analysis of previous years' records. The prediction is done using a classification algorithm. The proposed system automates sunflower yield prediction using the classification technique “KNN Algorithm”. The result obtained by the KNN Algorithm is compared with the result obtained using Linear Regression.
1.4 Objective
1 “The sunflower yield prediction” helps to predict the yield based on different attributes such as temperature, rainfall, soil pH level and nitrogen.
2 “The sunflower yield prediction” predicts yield based on an individual farmer's past yield records for a particular plot.
3 “The sunflower yield prediction” predicts the yield using the “KNN Algorithm”, and a comparison study is given based on “Linear Regression”.
4 “The sunflower yield prediction” is implemented using Java technology.
1.5 Problem Statement

“Sunflower Yield prediction” has become a global issue and is an area of concern. Not all farmers know how much yield can be produced on their plot under given soil and weather conditions.

Chapter 2
LITERATURE SURVEY
Agriculture is the most important sector influencing the economy of India. It contributes 18% of India's Gross Domestic Product (GDP) and gives employment to 50% of the population of India. People in India have practised agriculture for years, but the results are never satisfying because of the various factors that affect the sunflower yield.
To fulfil the needs of around 1.2 billion people, it is very important to have a good yield of sunflowers. The yield is directly influenced by factors such as soil type, precipitation, seed quality and lack of technical facilities. Hence, new technologies are necessary to satisfy the growing need, and farmers must work smartly by adopting new technologies rather than relying on traditional methods. The first paper surveyed focuses on implementing a sunflower yield prediction system using data mining techniques to analyse an agriculture dataset. Four classifiers, namely J48, LWL, LAD Tree and IBK, are used for prediction, and the performance of each is compared using the WEKA tool.
Accuracy is used as one of the factors for evaluating performance. The classifiers are further compared using the values of Root Mean Squared Error (RMSE), Mean Absolute Error (MAE) and Relative Absolute Error (RAE); the lower the error, the more accurate the algorithm. The result is based on a comparison among the classifiers. Food production in India is largely dependent on cereal crops including rice, wheat and various pulses. The sustainability and productivity of rice-growing areas depend on suitable climatic conditions.
Variability in seasonal climate conditions can have a detrimental effect, with incidents of drought reducing production. Developing better techniques to predict crop productivity in different climatic conditions can assist farmers and other stakeholders in making better decisions on agronomy and crop choice. Machine learning techniques can be used to improve prediction of crop yield under different climatic scenarios.
This paper presents a review of the use of such machine learning techniques for Indian rice cropping areas. It discusses the experimental results obtained by applying the SMO classifier, using the WEKA tool, to a dataset covering 27 districts of Maharashtra state, India. The dataset considered for rice crop yield prediction was sourced from publicly available Indian Government records. The parameters considered for the study were precipitation, minimum temperature, average temperature, maximum temperature, reference crop evapotranspiration, area, production and yield for the Kharif season (June to November) for the years 1998 to 2002.
For that study the mean absolute error (MAE), root mean squared error (RMSE), relative absolute error (RAE) and root relative squared error (RRSE) were calculated. The experimental results showed that other techniques performed much better than SMO on the same dataset. Another paper focuses on the applications of data mining techniques in the agricultural field. Techniques such as K-Means, K-Nearest Neighbor (KNN), Artificial Neural Networks (ANN) and Support Vector Machines (SVM) are covered for very recent applications in agriculture. The authors consider the problem of predicting yield production. The work aims at finding suitable data models that achieve high accuracy and high generality in terms of yield prediction capabilities. For this purpose, different types of data mining techniques were evaluated on different data sets.
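The four error measures named in this survey can be written down compactly. The following is a minimal sketch using their usual textbook definitions, not code from any of the surveyed papers; the class name ErrorMetrics is hypothetical, and the relative measures are returned as fractions rather than the percentages that tools such as WEKA print.

```java
// Hypothetical helper: standard regression error measures.
class ErrorMetrics {

    // Mean absolute error: average of |predicted - actual|.
    static double mae(double[] actual, double[] predicted) {
        check(actual, predicted);
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(predicted[i] - actual[i]);
        }
        return sum / actual.length;
    }

    // Root mean squared error: square root of the average squared error.
    static double rmse(double[] actual, double[] predicted) {
        check(actual, predicted);
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double e = predicted[i] - actual[i];
            sum += e * e;
        }
        return Math.sqrt(sum / actual.length);
    }

    // Relative absolute error: total absolute error divided by the total
    // absolute error of always predicting the mean of the actual values.
    // The result is not finite when all actual values are equal.
    static double rae(double[] actual, double[] predicted) {
        check(actual, predicted);
        double mean = mean(actual);
        double num = 0.0;
        double den = 0.0;
        for (int i = 0; i < actual.length; i++) {
            num += Math.abs(predicted[i] - actual[i]);
            den += Math.abs(actual[i] - mean);
        }
        return num / den;
    }

    // Root relative squared error: the squared-error counterpart of rae().
    static double rrse(double[] actual, double[] predicted) {
        check(actual, predicted);
        double mean = mean(actual);
        double num = 0.0;
        double den = 0.0;
        for (int i = 0; i < actual.length; i++) {
            num += (predicted[i] - actual[i]) * (predicted[i] - actual[i]);
            den += (actual[i] - mean) * (actual[i] - mean);
        }
        return Math.sqrt(num / den);
    }

    private static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    private static void check(double[] actual, double[] predicted) {
        if (actual.length == 0 || actual.length != predicted.length) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
    }
}
```

A value of RAE or RRSE below 1 means the model does better than always predicting the mean of the observed values.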
A further paper presents some of the most used data mining techniques in the field of agriculture. In the near future, the penetration of Information Technology into agriculture is expected to make this an even more interesting area of research. The main aim of that work is to improve and substantiate the validity of yield prediction, which is useful for farmers. Agricultural crop production depends on various factors such as biology, climate, economy and geography. These factors have different impacts on agriculture, which can be quantified using appropriate statistical methodologies. Agronomic traits such as yield can be affected by a large number of variables. In this survey, the authors analysed data mining methods such as clustering and classification models to select the most relevant method for the purpose.
The proposed research aims to develop a predictive model that provides a cultivation plan for farmers to get a high yield of sunflower crops using data mining techniques. Unlike statistical approaches, data mining techniques extract hidden knowledge through data analysis. The data set used in this research for the mining process is real data collected from farmers cultivating sunflower along the Thamirabarani river basin.
K-means clustering and various decision tree classifiers are applied to meteorological and agronomic data for the sunflower crop. The performance of the various classifiers is validated and compared. Based on experimentation and evaluation, it has been concluded that the random forest classifier outperforms the other classification methods. Moreover, classification of clustered data provides good classification accuracy. The outcome of this research is the identification of different combinations of traits for achieving high yield in the sunflower crop. The final rules extracted by this research are useful for farmers to make proactive and knowledge-driven decisions before harvest.

Chapter 3
SYSTEM REQUIREMENT AND DESIGN
3.1 REQUIREMENT
3.1.1 Functional Requirements
The functions that a system should provide to its users are known as functional requirements.
Figure 3.1: Use case diagram

The introduction of the Software Requirements Specification (SRS) provides an overview of the entire SRS, covering its purpose, scope, definitions, acronyms, abbreviations and references. The aim of this document is to gather, analyse and give an in-depth insight into the complete “Sunflower Prediction” system by defining the problem statement in detail. The detailed requirements of the user-related functions are provided in this document.

PURPOSE:
The purpose of the Software Requirements Specification is to provide the technical, functional and non-functional features required to develop the web application. The entire application is designed to provide user flexibility for finding the shortest and/or most time-saving path. In short, the purpose of this SRS document is to provide a detailed overview of our software product, its parameters and goals. This document describes the project's target audience and its user interface, hardware and software requirements. It defines how our client, team and audience see the product and its functionality.
3.1.2 Non-functional requirements

The conditions under which the system should operate are specified as non-functional requirements. They are:
1. Availability: The application will be available to all farmers, who can predict the yield using the built model. The application can be deployed to servers for further use.
2. Usability: Usability testing is done from an end user's perspective to determine whether the system is easily usable. It is generally the practice of testing how easy a design is to use with a group of representative users. In this application, the whole system design is tested to check whether it fulfils farmers' requirements.
3. Efficiency: Efficiency testing measures the amount of resources required by a program to perform a specific function. The application has good efficiency in predicting the yield.
3.1.3 HARDWARE REQUIREMENTS
The hardware requirements describe the operating system requirements or compatibility of the project.
• Hard Disk : 50GB
• RAM : 512GB
• Processor : i5 core

3.1.4 SOFTWARE REQUIREMENTS
The software requirements describe the features and functionalities of the project.
• Front End : HTML, CSS, Bootstrap
• Back End : MySQL
• Tool : JSP, Servlets, Ajax, JSON
• IDE : NetBeans
3.2 DESIGN
3.2.1 Schema Diagram
The main entities and how they relate to one another are shown in the diagram below.
Figure 3.2.1a: Activity diagram

The control flow diagram shows how the user moves through the system and how the user's data flows. It shows how the user input is converted to the output, depending on what the user wants to do, and the decisions that the system makes to produce the desired output. The intermediate phases are also shown in the diagram.
Figure 3.2.1b: Sequence Diagram
3.2.2 Design Description
The Software Design will be used to aid software development for the web application by providing the details of how the application should be built. Within the Software Design, specifications are narrative and graphical documentation of the software design for the project, including use case models, sequence diagrams and other supporting requirement information.

Figure 3.2.2a: Architecture diagram
Stage 1:
There are 4 features and 1 class label for prediction of sunflower yield; the attributes listed include soil pH, nitrogen, phosphorus, rainfall and WPI, as shown in the table.
Stage2:
Data Cleaning:
The data can have many irrelevant and missing parts. Data cleaning is done to handle these; it involves handling missing data, noisy data, etc.
Missing Data:
This situation arises when some data is missing in the data. It can be handled in various ways.
Some of them are:

1. Ignore the tuples: This approach is suitable only when the dataset we have is quite large and
multiple values are missing within a tuple.
2. Fill the Missing values: There are various ways to do this task. You can choose to fill the
missing values manually, by attribute mean or the most probable value.
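The second option, filling a missing value with the attribute mean, can be sketched as follows. This is only an illustration under stated assumptions: missing readings are assumed to be encoded as Double.NaN, and the class MeanImputer and its method are hypothetical names that do not come from the project code.

```java
import java.util.Arrays;

// Hypothetical helper: fills missing values (encoded as NaN) with the column mean.
class MeanImputer {

    // Returns a copy of the column in which every NaN is replaced by the mean
    // of the values that are present. If no value is present, the copy is
    // returned unchanged.
    static double[] fillWithMean(double[] column) {
        double sum = 0.0;
        int count = 0;
        for (double v : column) {
            if (!Double.isNaN(v)) {
                sum += v;
                count++;
            }
        }
        double[] filled = Arrays.copyOf(column, column.length);
        if (count == 0) {
            return filled; // nothing to compute a mean from
        }
        double mean = sum / count;
        for (int i = 0; i < filled.length; i++) {
            if (Double.isNaN(filled[i])) {
                filled[i] = mean;
            }
        }
        return filled;
    }

    public static void main(String[] args) {
        // Illustrative soil pH readings with one missing entry.
        double[] soilPh = {6.0, Double.NaN, 7.0};
        System.out.println(Arrays.toString(fillWithMean(soilPh))); // prints [6.0, 6.5, 7.0]
    }
}
```

Ignoring the tuple instead would simply mean dropping any row whose array contains a NaN before training.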
Stage 3:
The data obtained from Stage 2 is taken into consideration; the model is then trained using the decision tree, and the result obtained is analysed and shown in a graph using the Highcharts library. System architecture is a conceptual model that defines the structure, behaviour and other views of a system. A system architecture can comprise system components and the sub-systems developed, which work together to implement the overall system.
[Flow diagram: Data Preparation (Missing Value, Numeric, Nominal) → Modeling (KNN Algorithm) → Visualization/Result Analysis (Bar graph and Pie chart)]
Fig 3.2.2b: Flow diagram

A data flow diagram (DFD) maps out the flow of information for any process or system. It uses
defined symbols like rectangles, circles and arrows, plus short text labels, to show data inputs,
outputs, storage points and the routes between each destination. Data flowcharts can range from
simple, even hand-drawn process overviews, to in-depth, multi-level DFDs that dig progressively
deeper into how the data is handled. They can be used to analyse an existing system or model a new
one.
Data preprocessing: The dataset is passed to the preprocessing step
a) Input: crop dataset
b) Process: Preprocessing finds missing values and also removes unneeded features
c) Output: preprocessed dataset
d) Error handling: Raised if the input file is not a valid one
Feature selection: Selection of the data from a dataset
a) Input: preprocessed dataset
b) Process: Selects only the important data that is required
c) Output: The selected data is displayed
Splitting of the Data: Training data and Test Data
a) Input: Feature selected data
b) Process: Splits the data into the train set and the test set
c) Output: The dataset is displayed as train set and test set; it is then tested with the specific algorithms and a performance analysis is carried out.
Prediction: Training data and Test Data
a) Input: Soil pH, nitrogen, phosphorus, rainfall
b) Process: The datasets are stored and
c) Output: Regression algorithms are applied and the result is shown in a visualization format.
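The "Splitting of the Data" step above could look like the following sketch. The report does not state the split ratio or how rows are chosen, so the 80/20 ratio in the usage note, the fixed shuffle seed and the class name TrainTestSplit are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper: random split of feature-selected rows into train and test sets.
class TrainTestSplit {

    // Holds the two partitions produced by split().
    static final class Split<T> {
        final List<T> train;
        final List<T> test;

        Split(List<T> train, List<T> test) {
            this.train = train;
            this.test = test;
        }
    }

    // Shuffles a copy of the rows with a fixed seed, then puts the first
    // round(trainRatio * n) rows into the train set and the rest into the test set.
    static <T> Split<T> split(List<T> rows, double trainRatio, long seed) {
        if (trainRatio <= 0.0 || trainRatio >= 1.0) {
            throw new IllegalArgumentException("trainRatio must be strictly between 0 and 1");
        }
        List<T> shuffled = new ArrayList<>(rows);
        Collections.shuffle(shuffled, new Random(seed));
        int cut = (int) Math.round(shuffled.size() * trainRatio);
        return new Split<>(
                new ArrayList<>(shuffled.subList(0, cut)),
                new ArrayList<>(shuffled.subList(cut, shuffled.size())));
    }
}
```

Calling TrainTestSplit.split(rows, 0.8, 42L) on 10 rows yields 8 training rows and 2 test rows; the fixed seed keeps the partition reproducible between runs.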

DFD 0:
DFD 1:
Figure 3.2.2c: Data Flow Diagram

Chapter 4
IMPLEMENTATION
The project is implemented in Java, an object-oriented programming language. Object-oriented programming is an approach that provides a way of modularizing programs by creating partitioned memory areas for both data and functions, which can be used as templates for creating copies of such modules on demand.
Java is statically typed and garbage-collected. It supports multiple programming paradigms, including object-oriented, procedural and functional programming, and it ships with a comprehensive standard library. Machine learning techniques are used in this project.
Machine Learning overview:
Machine learning is a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed. Machine learning focuses on the development of computer programs that can change when exposed to new data. This chapter covers the basics of machine learning and the implementation of a simple machine learning algorithm in Java.
Machine learning involves training a computer on a given data set and using this training to predict the properties of new data. For example, we can train a computer by feeding it 1000 images of cats and 1000 images that are not of cats, telling the computer each time whether the picture shows a cat or not. If we then show the computer a new image, it should be able to tell from this training whether the new image shows a cat or not.
The process of training and prediction involves the use of specialized algorithms. We feed the training data to an algorithm, and the algorithm uses this training data to give predictions on new test data. One such algorithm is K-Nearest-Neighbor classification (KNN classification). It takes a test data point and finds the k nearest data values to it in the training data set. It then selects the class that occurs most frequently among those neighbors and gives it as the prediction result.

4.1 Methodology
KNN [K-Nearest Neighbors]
KNN is a lazy algorithm (as opposed to an eager algorithm). This means that it does not use the training data points to do any generalization. In other words, there is no explicit training phase, or it is very minimal, so the training phase is very fast. Lack of generalization means that KNN keeps all the training data; to be more exact, all (or most) of the training data is needed during the testing phase.
The KNN algorithm is based on feature similarity: how closely out-of-sample features resemble the training set determines how a given data point is classified. KNN can be used for classification, where the output is a class membership (a discrete value). An object is classified by a majority vote of its neighbors, with the object being assigned to the class most common among its k nearest neighbors. It can also be used for regression, where the output is a value for the object (a continuous value). This value is the average (or median) of the values of its k nearest neighbors.
The training examples are vectors in a multidimensional feature space, each with a class label. The training phase of the algorithm consists only of storing the feature vectors and class labels of the training samples. In the classification phase, k is a user-defined constant, and an unlabeled vector (a query or test point) is classified by assigning the label that is most frequent among the k training samples nearest to that query point.
A commonly used distance metric for continuous variables is Euclidean distance. For discrete variables, such as in text classification, another metric can be used, such as the overlap metric (or Hamming distance). In the context of gene expression microarray data, for example, k-NN has been employed with correlation coefficients, such as Pearson and Spearman, as a metric. Often, the classification accuracy of k-NN can be improved significantly if the distance metric is learned with specialized algorithms such as Large Margin Nearest Neighbor or Neighbourhood Components Analysis.
A drawback of the basic "majority voting" classification occurs when the class distribution is skewed. That is, examples of a more frequent class tend to dominate the prediction of the new example, because they tend to be common among the k nearest neighbors due to their large number.
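The two distance metrics mentioned above can be sketched in a few lines. This is an illustrative sketch only; the class name DistanceMetrics is hypothetical, and discrete attributes are assumed to be represented as strings.

```java
// Hypothetical helper: distance metrics commonly used with k-NN.
class DistanceMetrics {

    // Euclidean distance between two continuous feature vectors of equal length.
    static double euclidean(double[] a, double[] b) {
        requireSameLength(a.length, b.length);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Hamming (overlap) distance: number of positions at which the
    // discrete attribute values differ.
    static int hamming(String[] a, String[] b) {
        requireSameLength(a.length, b.length);
        int differences = 0;
        for (int i = 0; i < a.length; i++) {
            if (!a[i].equals(b[i])) {
                differences++;
            }
        }
        return differences;
    }

    private static void requireSameLength(int n, int m) {
        if (n != m) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
    }
}
```

Because Euclidean distance adds raw differences, attributes measured on large scales (for example rainfall in millimetres) can dominate attributes on small scales (for example soil pH) unless the features are rescaled first.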

One way to overcome this problem is to weight the classification, taking into account the distance
from the test point to each of its k nearest neighbors. The class (or value, in regression problems) of
each of the k nearest points is multiplied by a weight proportional to the inverse of the distance from
that point to the test point. Another way to overcome skew is by abstraction in data representation. For
example, in a self-organizing map (SOM), each node is a representative (a center) of a cluster of
similar points, regardless of their density in the original training data. K-NN can then be applied to the
SOM.
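The inverse-distance weighting described above can be illustrated for the regression case, where the prediction is a weighted average of the neighbors' values. This is a sketch under assumptions, not the project's implementation: the class name WeightedKnn is hypothetical, Euclidean distance is assumed, and an exact match (distance zero) simply returns that neighbor's value to avoid dividing by zero.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch: inverse-distance weighted k-NN for a continuous target.
class WeightedKnn {

    // Predicts the target for the query as the weighted average of the targets
    // of the k nearest training rows, with weight 1/distance for each row.
    static double predict(double[][] features, double[] targets, double[] query, int k) {
        if (features.length == 0 || features.length != targets.length || k < 1) {
            throw new IllegalArgumentException("need matching, non-empty training data and k >= 1");
        }
        double[] dist = new double[features.length];
        Integer[] order = new Integer[features.length];
        for (int i = 0; i < features.length; i++) {
            dist[i] = euclidean(features[i], query);
            order[i] = i;
        }
        // Sort row indices by ascending distance to the query point.
        Arrays.sort(order, Comparator.comparingDouble(row -> dist[row]));

        double weightedSum = 0.0;
        double weightTotal = 0.0;
        int limit = Math.min(k, order.length);
        for (int n = 0; n < limit; n++) {
            int row = order[n];
            if (dist[row] == 0.0) {
                return targets[row]; // exact match: avoid division by zero
            }
            double weight = 1.0 / dist[row];
            weightedSum += weight * targets[row];
            weightTotal += weight;
        }
        return weightedSum / weightTotal;
    }

    private static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }
}
```

With equal distances the result reduces to the plain average; closer neighbors pull the prediction towards their own value.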
The best choice of k depends upon the data; generally, larger values of k reduce the effect of noise on the classification, but make boundaries between classes less distinct. A good k can be selected by various heuristic techniques. The special case where the class is predicted to be the class of the closest training sample (i.e. when k = 1) is called the nearest neighbor algorithm.
Input: The previous years' data records of the farmer's plot
Output: The predicted sunflower yield
Working: Outputs the predicted result based on the previous years' datasets.
1. Scan the datasets (storage servers): retrieve the data required for mining from servers such as databases.
2. Set the parameter K = 3 (number of nearest neighbors).
3. Calculate the distance between the target yield and all the training examples (yield acquired by the farmer).
4. Sort the distances and determine the nearest neighbors based on the K-th minimum distance.
5. Gather the top 3 values of the nearest neighbors.
6. Use the simple majority of the category of the nearest neighbors as the predicted value of the sunflower yield.
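Steps 2 to 6 above can be sketched as follows. This is a minimal illustration, not the project's actual code: the Record type, the yield category labels used in the usage note and the class name KnnYieldClassifier are assumptions, Euclidean distance is assumed, and the database retrieval of step 1 is replaced by an in-memory list.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of k-NN classification by simple majority vote.
class KnnYieldClassifier {

    // One past record: a feature vector plus its yield category.
    static final class Record {
        final double[] features;
        final String yieldClass;

        Record(double[] features, String yieldClass) {
            this.features = features;
            this.yieldClass = yieldClass;
        }
    }

    // Returns the yield category that is most frequent among the k nearest records.
    // On a tie, the category that first reaches the top count (nearest first) wins.
    static String predict(List<Record> training, double[] query, int k) {
        if (training.isEmpty() || k < 1) {
            throw new IllegalArgumentException("need training data and k >= 1");
        }
        // Steps 3 and 4: compute distances and sort the records by distance.
        List<Record> sorted = new ArrayList<>(training);
        sorted.sort(Comparator.comparingDouble(r -> distance(r.features, query)));

        // Steps 5 and 6: take the k nearest records and count votes per category.
        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        int bestVotes = 0;
        for (Record r : sorted.subList(0, Math.min(k, sorted.size()))) {
            int count = votes.merge(r.yieldClass, 1, Integer::sum);
            if (count > bestVotes) {
                bestVotes = count;
                best = r.yieldClass;
            }
        }
        return best;
    }

    private static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }
}
```

With K = 3 as in step 2, a call such as KnnYieldClassifier.predict(records, query, 3) returns the category held by at least two of the three nearest records whenever such a majority exists.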

Figure 4.1.1: KNN Algorithm
LINEAR REGRESSION ALGORITHM
Linear regression is a linear approach to modeling the relationship between a scalar response (or dependent variable) and one or more explanatory variables (or independent variables). The case of one explanatory variable is called simple linear regression. For more than one explanatory variable, the process is called multiple linear regression. In linear regression, the relationships are modeled using linear predictor functions whose unknown model parameters are estimated from the data. Such models are called linear models [3]. Most commonly, the conditional mean of the response given the values of the explanatory variables (or predictors) is assumed to be an affine function of those values; less commonly, the conditional median or some other quantile is used.
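For the simple case of one explanatory variable, the model parameters can be estimated by ordinary least squares. The sketch below uses the standard closed-form estimates (slope = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)², intercept = ȳ − slope · x̄); the class name SimpleLinearRegression is hypothetical, and the report does not say how the project itself fits its regression.

```java
// Hypothetical sketch: simple linear regression fitted by ordinary least squares.
class SimpleLinearRegression {

    final double slope;
    final double intercept;

    private SimpleLinearRegression(double slope, double intercept) {
        this.slope = slope;
        this.intercept = intercept;
    }

    // Estimates slope and intercept from paired observations (x[i], y[i]).
    static SimpleLinearRegression fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need at least two paired observations");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0.0; // sum of (x - meanX) * (y - meanY)
        double sxx = 0.0; // sum of (x - meanX)^2
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("x values must not all be equal");
        }
        double slope = sxy / sxx;
        return new SimpleLinearRegression(slope, meanY - slope * meanX);
    }

    // Predicted response for a new value of the explanatory variable.
    double predict(double x) {
        return intercept + slope * x;
    }
}
```

For example, fitting x = {1, 2, 3, 4} against y = {3, 5, 7, 9} recovers slope 2 and intercept 1, so predict(5) gives 11.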

Figure 4.1.2: Linear Regression Algorithm
4.2 Different Modules
Module 1
The data considered for this work consists of Target Yield datasets, which are compared with the real-time data collected from the farmer. In this work, approximately 1,600 data records have been considered for target yield in each Taluk, and 11,340 records overall for the seven Taluks. Along with these, there are approximately 1,400 records of Rainfall and Temperature data. The soil analysis report consists of soil pH and Nitrogen values, with approximately 2,279 records.

Module 2
We compare the architectures of various self-developed and pre-trained deep neural networks and machine learning algorithms, and their corresponding performance, for the task of predicting the sunflower yield. We also train the model to produce predictions for new users using the KNN classifier.
Module 3
Development of a User interface with the following technologies:
● HTML
● CSS
● JavaScript
Basic Algorithms:
➢ KNN : KNN is a lazy algorithm (as opposed to an eager algorithm). This means that it does not use the training data points to do any generalization. In other words, there is no explicit training phase, or it is very minimal, so the training phase is very fast.
➢ Linear Regression : Linear regression is a linear approach to modeling the relationship between a scalar response (or dependent variable) and one or more explanatory variables (or independent variables). The case of one explanatory variable is called simple linear regression.
4.3 Dataset
The data considered for this work consists of Target Yield datasets, which are compared with the real-time data collected from the farmer. In this work, approximately 1,600 data records have been considered for target yield in each Taluk, and 11,340 records overall for the seven Taluks. Along with these, there are approximately 1,400 records of Rainfall and Temperature data. The soil analysis report consists of soil pH and Nitrogen values, with approximately 2,279 records.

Figure 4.2: Dataset

4.4 Implementation Code
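package com.database.util;

import java.sql.Connection;
import java.sql.DriverManager;

// Singleton that holds one shared JDBC connection to the MySQL database.
public class DBsingletone1 {

    private static final DBsingletone1 only_one = new DBsingletone1();
    private static Connection con;

    // Opens the connection once, when the class is first loaded.
    static {
        try {
            Class.forName("com.mysql.jdbc.Driver");
            con = DriverManager.getConnection(
                    "jdbc:mysql://localhost:3306/cropyieldprediction", "root", "root");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Private constructor so that no second instance can be created.
    private DBsingletone1() {
    }

    // Returns the single shared instance.
    public static DBsingletone1 getDbSingletone() {
        return only_one;
    }

    // Returns the shared connection; it is null if the static block failed.
    public Connection getConnection() {
        System.out.println("connection made ");
        return con;
    }
}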

import com.database.util.DBsingletone1;
import java.io.IOException;
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.json.JSONArray;
import org.json.JSONObject;

@WebServlet(urlPatterns = {"/getAllSurveyNumbersByFID"})
public class getAllSurveyNumbersByFID extends HttpServlet {

    // Writes the distinct survey numbers of one farmer (FID) as a JSON array.
    protected void processRequest(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        response.setContentType("text/html;charset=UTF-8");
        PrintWriter out = response.getWriter();
        String fid = request.getParameter("fid");

        DBsingletone1 db = DBsingletone1.getDbSingletone();
        Connection con = db.getConnection();
        // Bind fid as a parameter instead of concatenating it into the SQL text.
        String sql = "SELECT DISTINCT `SurNo` FROM `soilanalysisreport` WHERE `FID` = ?";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, fid);
            try (ResultSet rs = ps.executeQuery()) {
                JSONArray jaray = new JSONArray();
                while (rs.next()) {
                    JSONObject json = new JSONObject();
                    json.put("name", rs.getString("SurNo"));
                    jaray.put(json);
                }
                out.print(jaray);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        processRequest(request, response);
    }

    @Override
    protected void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        processRequest(request, response);
    }

    @Override
    public String getServletInfo() {
        return "Short description";
    }
}
<!DOCTYPE html>

<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">

<meta name="description" content="">
<meta name="author" content="">
<link rel="icon" href="favicon.ico">
<title>Sunflower Yield</title>

<link href="css/bootstrap.min.css" rel="stylesheet">
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.4.0/css/font-awesome.min.css">

<link href="css/owl.carousel.css" rel="stylesheet">
<link href="css/owl.theme.default.min.css" rel="stylesheet">
<link href="css/style.css" rel="stylesheet">


<script src="js/ie-emulation-modes-warning.js"></script>


</head>
<body id="page-top">



<nav class="navbar navbar-default navbar-fixed-top">
<div class="container">

<div class="navbar-header page-scroll">
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#bs-example-navbar-collapse-1">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand page-scroll" href="#page-top"><img
src="images/logo.png" alt="Lattes theme logo"></a>
</div>

<div class="collapse navbar-collapse" id="bs-example-navbar-collapse-1">
<ul class="nav navbar-nav navbar-right">
<li class="hidden">
<a href="#page-top"></a>
</li>
<li>
<a
class="page-scroll" href="#about">About</a>

<script src="js/jquery.min.js"></script>

<script src="js/jquery.easing.1.3.js"></script>

<script src="js/bootstrap.min.js"></script>


<script src="js/jquery.waypoints.min.js"></script>
<script src="js/sticky.js"></script>

<script src="js/owl.carousel.min.js"></script>

<script src="js/jquery.countTo.js"></script>


<script src="js/jquery.stellar.min.js"></script>


<script src="js/jquery.magnific-popup.min.js"></script>
<script src="js/magnific-popup-options.js"></script>


<script src="js/main.js"></script>
</body>
</html>
<!DOCTYPE HTML>

<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<title>Sunflower Yield</title>

<meta name="viewport" content="width=device-width, initial-scale=1">

<meta name="description" content="Free HTML5 Website Template by GetTemplates.co"
/>
<meta name="keywords" content="free website templates, free html5, free template, free bootstrap, free website template, html5, css3, mobile first, responsive" />
<meta name="author" content="GetTemplates.co" />


<meta property="og:title" content=""/>
<meta property="og:image" content=""/>
<meta property="og:url" content=""/>
<meta property="og:site_name" content=""/>
<meta property="og:description" content=""/>
<meta name="twitter:title" content="" />
<meta name="twitter:image" content="" />
<meta name="twitter:url" content="" />
<meta name="twitter:card" content="" />
<link href="https://fonts.googleapis.com/css?family=Open+Sans:300,400,700"
rel="stylesheet">


<link rel="stylesheet" href="css/animate.css">

<link rel="stylesheet" href="css/icomoon.css">

<link rel="stylesheet" href="css/themify-icons.css">

<link rel="stylesheet" href="css/bootstrap.css">


<link rel="stylesheet" href="css/magnific-popup.css">


<link rel="stylesheet" href="css/owl.carousel.min.css">
<link rel="stylesheet" href="css/owl.theme.default.min.css">


<link rel="stylesheet" href="css/style.css">


<script src="js/modernizr-2.6.2.min.js"></script>


</head>
<body>
<!DOCTYPE html>

<html>
<head>
<title>Sunflower Yield Prediction</title>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link href="css/bootstrap.css" rel="stylesheet" type="text/css"/>

<style>
#container {
margin: 0 auto;
}
</style>
</head>
<body style="background-color: ghostwhite">
<br>
<br>
<a
href="http://localhost:8080/CottonYieldPrediction/Fasal Prapti/prediction.html">
<h5>BACK</h5></a>
<div class="container" style="margin-top: 5%">

<div class="row"><br>
<div class="col-lg-2">
<select id="taluk" class="form-control" onchange="getFarmers()" style="border: solid black 1px">
<option value="-1">---Select Taluk---</option>
<option value="H D Kote">H D Kote</option>
<option value="K R Nagar">K R Nagar</option>
<option value="Hunsur">Hunsur</option>
<option value="Mysore">Mysore</option>
<option value="Nanjanagud">Nanjanagud</option>
<option value="Periyapatna">Periyapatna</option>
</select>
</div>
<div class="col-lg-2">
<select id="farmer" class="form-control" onchange="getSurvey()" style="border: solid black 1px">
</select>
</div>
<div class="col-lg-2">
<select id="survey" onchange="getSeason()" class="form-control" style="border: solid black 1px">
<option value="-1">---Select Survey No---</option>
</select>
</div>
<div class="col-lg-2">
<select id="season" onchange="getGraph()" class="form-control" style="border: solid black 1px">
<option value="-1">---Select Season---</option>
<option value="FH">Summer</option>
<option value="SH">Kharif</option>
</select>
</div>
</div>
</div><br>
<script src="js/jquery-3.1.1.min.js" type="text/javascript"></script>
<script src="js/highcharts.js" type="text/javascript"></script>
<script src="js/highcharts-more.js" type="text/javascript"></script>
<script src="js/exporting.js" type="text/javascript"></script>
<script src="js/highcharts-3d.js" type="text/javascript"></script>
<div class="container">
<div class="row">
<div class="col-md-12 col-sm-12" id="container"></div>
<div class="col-lg-6" id="container1"></div>

<div class="col-lg-3" ></div>

<div class="col-lg-6" style="margin-top: 2%" id="container2"></div>
<div class="col-lg-6" style="margin-top: 2%" id="container3"></div>
</div>
</div>
<script>

Chapter 5
TESTING
5.1 Introduction to Testing

Verification and validation is the generic name given to the checking processes that ensure that the
software conforms to its specification and meets the needs of its users.
• Validation
Are we building the right product? Validation involves checking that the program as
implemented meets the requirements of the users.
• Verification
Are we building the product right? Verification involves checking that the program conforms
to its specification.
Stages in the Implementation of Testing
Unit testing:
Individual components are tested to ensure that they operate correctly.
Module testing:
A module is a collection of dependent components, such as functions. A module encapsulates
related components, so it can be tested without other system modules.
Sub-system testing:
This phase involves testing collections of modules that have been integrated into sub-systems.
Sub-systems may be independently designed and implemented.
System testing:
The sub-systems are integrated to make up the entire system. Errors that result from
unanticipated interactions between sub-systems and system components are found and removed.
Acceptance testing:
This is the final stage in the testing process before the system is accepted for operational use.
Any requirement problems or requirement definition problems revealed by acceptance testing are
considered and corrected.
Test Plan:
Careful planning is needed to get the most out of testing and to control testing costs.

| Test Case No | Negative Scenario | Required Input | Expected Result | Actual Result | Test Pass/Fail |
|---|---|---|---|---|---|
| 1 | Install software | NetBeans IDE 8.0, MySQL | Run application successful | Run application not successful | Fail |
| 2 | Prediction of sunflower yield | Click on Submit without filling all the fields | Graph will be displayed as a result of prediction and error message will be displayed | No graph will be displayed as a result of prediction and error message will be displayed | Fail |
| 3 | Prediction of sunflower yield | Click on Update Target without adding new data | It does not output an error message when clicked | It outputs an error message when clicked | Fail |
| 4 | Prediction of sunflower yield | Click on Update Report without adding new data | It does not output an error message when clicked | It outputs an error message when clicked | Fail |

| Test Case No | Positive Scenario | Required Input | Expected Result | Actual Result | Test Pass/Fail |
|---|---|---|---|---|---|
| 1 | Prediction of sunflower yield | Select the survey number for which the prediction has to be done and submit | Two graphs are displayed to show the predicted result using the KNN and Linear Regression algorithms | Two graphs are displayed to show the predicted result using the KNN and Linear Regression algorithms | Pass |
| 2 | Prediction using KNN algorithm | Dataset from Summary report | Predicted result of sunflower yield | Predicted result of sunflower yield | Pass |
| 3 | Prediction using Linear Regression algorithm | Input X and Y array values | Predicted result of sunflower yield | Predicted result of sunflower yield | Pass |
| 4 | Prediction using KNN classification algorithm | Dataset from Summary report | Predicted result of sunflower yield | Predicted result of sunflower yield | Pass |
Table 5.1: Testing
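Negative test case 2 relies on the form being rejected when a field is left unselected. The check behind it can be sketched as below. This is a hypothetical helper, not the project's code: the class name SelectionValidatorSketch and the method missingFields are assumptions, while the field names and the "-1" placeholder follow the select elements of the prediction page.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SelectionValidatorSketch {

    /** Value of the "---Select ...---" placeholder options in the form. */
    public static final String UNSELECTED = "-1";

    /** Field ids of the four drop-downs on the prediction page. */
    private static final String[] REQUIRED = {"taluk", "farmer", "survey", "season"};

    /** Returns the required fields that are absent, blank or still on the placeholder. */
    public static List<String> missingFields(Map<String, String> selection) {
        List<String> missing = new ArrayList<>();
        for (String field : REQUIRED) {
            String value = selection.get(field);
            if (value == null || value.trim().isEmpty() || UNSELECTED.equals(value)) {
                missing.add(field);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical submission in which the survey number and season were not chosen.
        Map<String, String> selection = new LinkedHashMap<>();
        selection.put("taluk", "Hunsur");
        selection.put("farmer", "F001");
        selection.put("survey", "-1");

        List<String> missing = missingFields(selection);
        if (missing.isEmpty()) {
            System.out.println("All fields selected; prediction can run.");
        } else {
            System.out.println("Please select: " + missing);
        }
    }
}
```

An empty result means the request can proceed to prediction; otherwise the list can be used to build the error message shown to the user.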

Chapter 6
RESULTS AND DISCUSSION
6.1 Snapshots
Figure 6.1: Front Page

The front page provides the prediction, classification and about details.
Figure 6.2: Prediction method page

On the prediction method page, the user can select the prediction method.
Figure 6.3: Select input details

The user can select the required details from the drop-down lists.
Figure 6.4: Prediction using linear regression

Figure 6.5: Prediction using KNN
Figure 6.6: Comparison result of both Linear Regression and KNN

Figure 6.7: Prediction page for new users

New users can enter the input details and get a prediction.
Figure 6.8: Output prediction by KNN for new users

CONCLUSION
• This project is an agricultural sector application that helps farmers predict
the sunflower yield based on the datasets of previous years.
• Farmers can check the predicted sunflower yield for their plot by entering the past data of
their plot.
• It automates sunflower yield prediction and is efficient, economical and fast.
• This is accomplished by applying the KNN algorithm for sunflower yield
prediction and the Linear Regression algorithm to provide a comparison result for the KNN
algorithm.
• Classification is a data mining technique. These algorithms take
the previous datasets as input and predict the sunflower yield from them.

FUTURE ENHANCEMENT
• The present system is developed for the prediction of sunflower yield only. In future, a system
can be developed to predict the yield of other crops, vegetables,
flowers and so on.
• The present system is developed for the seven taluks of Mysuru only. A future
enhancement would be a system in which prediction can be done for
different cities and their taluks.
• The present system outputs results based on the KNN algorithm and Linear Regression.
A future system could be developed by fusing these algorithms, along with a
comparison against the KNN algorithm.

