THESIS TOPIC Soil Prediction Arsenic Version 1
About the problem: High levels of heavy metals such as arsenic, which may be
present in agricultural land, can have serious consequences for crop yield and
for the quality and safety of the crops produced.
Note: The only way to know how well any statistical model or machine
learning method will perform is to try it. The usual strategy is to begin with
simple models (linear regression, logistic regression, linear discriminant
analysis, ...) and proceed to more complex models (such as random
forest) if the solution requirements are not met.
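The strategy in the note above can be sketched as follows. This is a minimal illustration, not a result: the data are synthetic, the pH-to-arsenic relationship is made up purely so the code has something to fit, and the names `fit_simple_linear`, `rmse`, and `REQUIRED_RMSE` are hypothetical. It fits the simplest model first, compares it with a predict-the-mean reference, and only flags a move to a more complex model if the requirement is not met.

```python
import random
import statistics


def fit_simple_linear(xs, ys):
    """Ordinary least squares with one predictor: y = intercept + slope * x."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return (sum(squared) / len(squared)) ** 0.5


# SYNTHETIC data for illustration only; not real soil measurements.
rng = random.Random(0)
soil_ph = [rng.uniform(4.5, 8.5) for _ in range(200)]
arsenic = [2.0 + 1.5 * ph + rng.gauss(0, 0.5) for ph in soil_ph]

# Hold out the last 50 samples so the error is measured on unseen data.
train_x, test_x = soil_ph[:150], soil_ph[150:]
train_y, test_y = arsenic[:150], arsenic[150:]

# Step 1: simplest model.
intercept, slope = fit_simple_linear(train_x, train_y)
baseline_rmse = rmse(test_y, [intercept + slope * x for x in test_x])

# Reference point: always predict the training mean.
mean_rmse = rmse(test_y, [statistics.fmean(train_y)] * len(test_y))

# Step 2: escalate only if the (hypothetical) requirement is not met.
REQUIRED_RMSE = 1.0
needs_complex_model = baseline_rmse > REQUIRED_RMSE

print(f"linear RMSE={baseline_rmse:.3f}, mean-only RMSE={mean_rmse:.3f}")
print("try a more complex model (e.g. random forest):", needs_complex_model)
```

With real data, the same held-out split would be reused for the more complex model so that the two errors are directly comparable.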
Providing context: Analyzing soil properties is one of the vital parts of crop
production and management. Crops depend heavily on soil quality, which affects how
well a crop grows in a given amount of time and the quality of what it produces.
Performing this analysis manually, with handwritten records, would be overwhelming.
Accuracy is critically important when handling such large amounts of crucial raw data
and processing them computationally. Automation is needed to reduce the repetitive
work of processing such data when identifying the soil’s current status, in particular
whether it is polluted with heavy metals such as arsenic.
Test selected areas of agricultural land in Davao by extracting and testing soil samples
to identify heavy metal pollution, specifically soil arsenic pollution, through soil
mapping and by measuring the soil’s pH, nutrient, and carbonate levels.
(i) Measurement error in the soil property: how to incorporate measurement error in
calibration data into a machine learning model to improve model calibration and
prediction; (ii) Incorporation of expert knowledge into a model: how to incorporate
expert knowledge about soil properties into machine learning models; (iii) How to
perform multivariate soil property prediction with machine learning models; (iv) How to
model compositional soil data with machine learning models. (Based on the article
Enhancement of the use of machine learning in digital soil mapping.)
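For point (iv), one common way to prepare compositional data (parts that sum to a constant, such as fractions of a whole) for a machine learning model is the centered log-ratio (clr) transform. The sketch below is only an illustration of that general technique and is not necessarily the approach taken in the cited article; the function names and the three-part example composition are hypothetical.

```python
import math


def clr(composition):
    """Centered log-ratio transform of a strictly positive composition.

    Each part is divided by the total, log-transformed, and centered by
    subtracting the mean of the logs, so the result sums to zero.
    """
    total = sum(composition)
    if total <= 0 or any(part <= 0 for part in composition):
        raise ValueError("clr requires strictly positive parts")
    logs = [math.log(part / total) for part in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def clr_inverse(coordinates):
    """Map clr coordinates back to proportions that sum to 1."""
    exps = [math.exp(c) for c in coordinates]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical three-part composition given as percentages (sums to 100).
example = [40.0, 35.0, 25.0]
coords = clr(example)
recovered = clr_inverse(coords)

print("clr coordinates:", [round(c, 4) for c in coords])
print("recovered proportions:", [round(p, 4) for p in recovered])
```

A model would be trained on the clr coordinates, and its predictions mapped back with `clr_inverse` so that the predicted parts are positive and sum to 1. Zero-valued parts need separate handling, because the logarithm is undefined there.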
Research questions:
1. What would be the outcomes and contributions of widespread
identification of soil arsenic pollution in the agricultural lands of Davao?
2. How accurately can the random forest machine learning algorithm identify
potential soil arsenic pollution in agricultural lands in Davao?
4. If random forest proves inaccurate for soil mapping, what alternative
machine learning algorithms could be used for soil mapping to identify
arsenic-polluted soil?
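To make "how accurately" in question 2 measurable, the predictions of any classifier can be compared with laboratory-confirmed labels using standard metrics. The sketch below assumes a binary label (1 = polluted, 0 = not polluted); the label lists are made-up placeholders, not study results, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = polluted)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, and recall; undefined ratios become 0.0."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Placeholder labels for illustration only.
observed = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]

metrics = classification_metrics(observed, predicted)
print(metrics)
```

Recall matters most if missing a polluted site is the costlier error, while precision matters if false alarms are expensive to follow up. Reporting the same metrics for every candidate algorithm also gives a basis for the comparison asked for in question 4.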
My research references:
Potential RRLs:
https://www.sciencedirect.com/science/article/pii/S0341816222004854
https://www.sciencedirect.com/science/article/pii/S0016706121005267
https://www.sciencedirect.com/science/article/pii/S0048969722062702
https://www.sciencedirect.com/science/article/pii/S0048969722064865
https://ieeexplore.ieee.org/document/9725758
2. Sentiment Analysis Using Sigmoid and Tanh Activation Functions in a
Recurrent Neural Network to Determine the Overall Rating of a Product/Service
a. One of the common issues in today’s environment is that we suffer
from data overload, and it is humanly impossible to analyze
customer feedback manually without some kind of bias.
b.
i. How accurately can a machine assign appropriate ratings based on a
comment?
ii. How effective are Google Play reviews for training a machine to
classify sentiments?
iii. What combination of activation functions in a recurrent neural
network provides the most accurate analysis of sentiments?
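As a rough illustration of where the two activation functions sit in a recurrent network, the sketch below runs a toy scalar RNN forward pass: the hidden state uses a configurable activation (tanh by default) and the output uses a sigmoid to produce a score between 0 and 1. The weights and input sequence are arbitrary placeholders, nothing here is trained, and `rnn_forward` is a hypothetical helper rather than a library function.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def rnn_forward(inputs, w_in, w_rec, w_out, hidden_activation=math.tanh):
    """Scalar recurrent network.

    h_t = hidden_activation(w_in * x_t + w_rec * h_{t-1}), with h_0 = 0
    score = sigmoid(w_out * h_T), a value strictly between 0 and 1
    """
    hidden = 0.0
    for x in inputs:
        hidden = hidden_activation(w_in * x + w_rec * hidden)
    return sigmoid(w_out * hidden)


# Arbitrary placeholder sequence and weights (illustrative, untrained).
sequence = [0.5, 1.0, -0.2]
score_tanh = rnn_forward(sequence, w_in=0.8, w_rec=0.5, w_out=1.2)
score_sigmoid = rnn_forward(sequence, w_in=0.8, w_rec=0.5, w_out=1.2,
                            hidden_activation=sigmoid)

print(f"tanh hidden state:    score={score_tanh:.4f}")
print(f"sigmoid hidden state: score={score_sigmoid:.4f}")
```

The two functions are closely related, since tanh(x) = 2 * sigmoid(2x) - 1. The practical difference is the output range: tanh is centered on zero with range (-1, 1), while sigmoid has range (0, 1). Comparing combinations, as question iii proposes, would mean training the same architecture with each choice and evaluating on held-out reviews.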