SE 416 Data Mining Term Project
Term Project
Deadlines: Intermediate: April 15th, 2019; Final: May 6th, 2019
In this term project, you are expected to design and develop a predictive data mining model;
that is, you will perform a classification (or, if you prefer, regression) task. You are then
expected to document your work in a formal report and to present it in front of the class.
2. Dataset
Your dataset must not be very small: it should contain at least about 1000 objects with
at least about 10 attributes. In addition, the dataset should contain both categorical and
numerical attributes so that you can apply several transformations.
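As a quick sanity check on a candidate dataset, the requirements above can be sketched in Python with pandas. The function name `check_dataset`, the threshold defaults, and the demo data are assumptions made for illustration; the handout says "at least about", so treat the thresholds as approximate, and note that whether the target column counts as an attribute is not specified.

```python
import pandas as pd


def check_dataset(df: pd.DataFrame, min_rows: int = 1000, min_cols: int = 10) -> dict:
    """Summarize whether a DataFrame roughly meets the size and type requirements."""
    # Columns with a numeric dtype are treated as numerical attributes;
    # everything else (strings, booleans, categories) is treated as categorical.
    numeric_cols = df.select_dtypes(include="number").columns.tolist()
    categorical_cols = df.select_dtypes(exclude="number").columns.tolist()
    return {
        "n_rows": len(df),
        "n_cols": df.shape[1],
        "enough_rows": len(df) >= min_rows,
        "enough_cols": df.shape[1] >= min_cols,
        "has_numeric": len(numeric_cols) > 0,
        "has_categorical": len(categorical_cols) > 0,
    }


# Tiny hypothetical example; it is deliberately far too small for the project.
demo = pd.DataFrame({"age": [25, 40, 31], "city": ["Izmir", "Ankara", "Izmir"]})
report = check_dataset(demo)
print(report)
```

Integer-coded categories (for example, a class label stored as 0/1/2) would be counted as numerical by this check, so the result should be reviewed by hand.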
3. Algorithms
You can use any algorithms you like. However, you are expected to try as many algorithms
as possible, each with several options, and then compare their predictive performance in
your report. Your goal is to create a model with the best predictive performance. To achieve
this, you may apply any data preprocessing methods that may help.
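One possible way to organize such a comparison is sketched below in Python with scikit-learn. The synthetic data, the column names, the three candidate classifiers, and their parameter values are all assumptions chosen for illustration; they are not prescribed by this handout. Each candidate is wrapped in the same preprocessing pipeline and scored with 5-fold cross-validation.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (hypothetical): two numerical and one categorical attribute.
rng = np.random.default_rng(0)
n = 300
X = pd.DataFrame({
    "income": rng.normal(50, 15, n),
    "age": rng.integers(18, 70, n),
    "region": rng.choice(["north", "south", "west"], n),
})
y = ((X["income"] > 50) ^ (X["region"] == "south")).astype(int)

# Preprocessing: scale numerical columns, one-hot encode the categorical one.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["income", "age"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["region"]),
])

# Candidate algorithms; the parameter values are arbitrary starting points.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Mean 5-fold cross-validated accuracy for each candidate.
scores = {}
for name, model in candidates.items():
    pipe = Pipeline([("prep", preprocess), ("model", model)])
    scores[name] = cross_val_score(pipe, X, y, cv=5, scoring="accuracy").mean()

best = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("best:", best)
```

Keeping the preprocessing inside the pipeline means it is refit on each training fold, which avoids leaking information from the validation fold into the model.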
You must use Python, R, or Julia to perform everything in this project, including data
processing, model development, and testing. You are free to use any common data science
libraries. If you like, you can use GUI tools such as WEKA during your work, but in the end
you must have working code that does the whole job when you run it. You must also append
your code to your report.
There will be two reporting and presentation deliveries: an intermediate and a final one.
In the intermediate delivery, you have to prepare a report and a presentation with the
following outline:
1. Introduction
Provide a small introduction to the project.
2. Problem
Explain the problem you are going to solve in this study.
3. Dataset
Explain the dataset, giving relevant statistics, correlation charts, etc.
Explain the possible preprocessing that may be helpful.
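For the Dataset item of the outline above, a minimal sketch of how the statistics and correlations might be computed in Python with pandas follows. The data frame and its column names (`height_cm`, `weight_kg`, `group`) are synthetic placeholders, not part of this assignment.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (hypothetical columns).
rng = np.random.default_rng(1)
n = 200
df = pd.DataFrame({
    "height_cm": rng.normal(170, 10, n),
    "group": rng.choice(["a", "b", "c"], n),
})
df["weight_kg"] = 0.9 * df["height_cm"] - 85 + rng.normal(0, 5, n)

# Summary statistics of the numerical attributes (count, mean, std, quartiles, ...).
numeric_summary = df.describe()

# Frequency of each value of the categorical attribute.
category_counts = df["group"].value_counts()

# Missing values per column, often the first hint about needed preprocessing.
missing_per_column = df.isna().sum()

# Pairwise Pearson correlations between the numerical attributes.
correlations = df.select_dtypes(include="number").corr()

print(numeric_summary)
print(category_counts)
print(missing_per_column)
print(correlations)
```

The correlation matrix can then be drawn as a chart with a plotting library, for example as a heatmap.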
In the final delivery, you have to extend your report and presentation with the following titles:
4. Data Mining Algorithms Used
Explain the algorithms you have used, including why you have selected them, and their
parameters.
5. Experiments and Results
Explain the details of the experiments, and present your results in a clear and organized way
with tables and charts.
Interpret the results.
6. Conclusion
Appendix
Add your project code in this section.
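For the Experiments and Results item of the outline above, one way to collect results into a table is sketched below in Python with scikit-learn and pandas. The generated data, the two classifiers, and the chosen metrics (accuracy and F1 on a held-out test set) are assumptions for illustration only; a real report should use the project's own dataset and whichever metrics suit the problem.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Generated binary classification data standing in for a real dataset.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical candidates; any set of fitted models could be tabulated this way.
models = {
    "naive_bayes": GaussianNB(),
    "knn_k5": KNeighborsClassifier(n_neighbors=5),
}

# One row per model, one column per metric, all measured on the same test set.
rows = []
for name, model in models.items():
    pred = model.fit(X_train, y_train).predict(X_test)
    rows.append({
        "model": name,
        "accuracy": accuracy_score(y_test, pred),
        "f1": f1_score(y_test, pred),
    })

results = pd.DataFrame(rows).set_index("model").round(3)
print(results.sort_values("accuracy", ascending=False))
```

Evaluating every model on the same split keeps the rows of the table directly comparable.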
6. Project Teams