Industrial Copper Modeling Project Explanation

Project Title: Industrial Copper Modeling

Objective: Use machine learning to improve pricing decisions and lead
classification in the copper industry, which often deals with skewed and noisy
sales and pricing data. Develop regression models for predicting selling prices and
classification models for lead status (Won or Lost).

Skills Gained:

Python scripting
Data preprocessing
Exploratory Data Analysis (EDA)
Streamlit for creating interactive web pages

Domain: Manufacturing

Problem Statement:
The copper industry faces challenges in manual predictions due to skewed and noisy
data. The goal is to build machine learning models that handle these issues and
improve pricing decisions. Additionally, a lead classification model is needed to
evaluate leads based on the likelihood of converting them into customers.

Data Overview:
The dataset includes information like item date, quantity, customer details,
country, status, item type, and selling price.

Approach:

Data Understanding: Identify variable types and distributions. Treat garbage
values, convert references to categorical variables, and discard unhelpful columns.
Data Preprocessing: Handle missing values, treat outliers, address skewness, and
encode categorical variables.
EDA: Visualize outliers and skewness before and after treatment using Seaborn.
Perform feature engineering and drop highly correlated columns.
Model Building: Split the dataset, train and evaluate classification models (e.g.,
ExtraTreesClassifier, XGBClassifier) and regression models. Optimize model
hyperparameters.
Model GUI with Streamlit: Create an interactive web page to input values and
predict selling price or lead status.
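
The skewness and outlier treatment named in the preprocessing step can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the brief does not say which techniques are used, so the log transform and the IQR clipping rule below are assumed choices, the price values are made up, and the helper names skewness and clip_iqr are hypothetical.

```python
import math
import statistics


def skewness(values):
    """Population skewness: mean of cubed standardized deviations."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)


def clip_iqr(values, k=1.5):
    """Clip each value to the range [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


# Made-up selling prices with a long right tail (assumption, not project data).
prices = [520.0, 610.0, 585.0, 640.0, 700.0, 655.0, 590.0, 630.0,
          4800.0, 12500.0]

# Assumed skewness treatment: log(1 + x) compresses the right tail.
log_prices = [math.log1p(p) for p in prices]

# Assumed outlier treatment: clip what still lies outside the IQR fences.
treated = clip_iqr(log_prices)

print(f"skewness before treatment: {skewness(prices):.2f}")
print(f"skewness after log1p:      {skewness(log_prices):.2f}")
print(f"largest value after clip:  {max(treated):.2f}")
```

In a real run the same two steps would typically be applied per numeric column before encoding and model training, and a model trained on the log-transformed selling price would need its predictions mapped back with math.expm1 before being shown in the Streamlit page.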

Learning Outcomes:

Proficiency in Python and data analysis libraries.
Experience in data preprocessing and EDA techniques.
Application of advanced machine learning techniques (regression, classification).
Building and optimizing machine learning models.
Creating web applications using Streamlit.

Evaluation Metrics:
Code should be modular, maintainable, and portable. Follow Python coding standards
(PEP 8). Maintain a public GitHub repo with a proper README file. Include a demo
video on LinkedIn.

Summary:
This project equips you with practical skills in Python, data analysis, machine
learning, and web application development. It provides a foundation to tackle real-
world challenges in the manufacturing domain.
