
SVKM’s NMIMS

Mukesh Patel School of Technology Management & Engineering


Computer Engineering Department
Program: B. Tech SEM-V

Course: Data Wrangling


Experiment No.07

PART A
(PART A : TO BE REFERRED BY STUDENTS)

A.1 Aim:

Write a program in Python to implement the given scenario for Data Loading.

A.2 Prerequisite:

1. Knowledge of a programming language

2. Fundamental concepts of Python programming

A.3 Outcome:
After successful completion of this experiment, students will be able to:

1. Understand how to load data into a system

A.4 Theory:

Data loading is the step in extraction-and-loading workflows that copies data
from a source into a destination application or database. Typically, the data
is loaded into the destination in a different format from the one used at the
original source location.

For example, when data is copied from a word processing file to a database
application, the data format changes from .doc or .txt to .csv or .dat.
Usually, this is done as the last phase of the Extract, Transform and Load
(ETL) process: the data is extracted from an external source, transformed
into a format the destination application supports, and then loaded into it.
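
The format change described above can be sketched in a few lines of Python. This is only an illustration using the standard `csv` module; the function name `tsv_to_csv` and the sample text are hypothetical, and a tab-delimited .txt export is assumed as the source format.

```python
import csv
import io


def tsv_to_csv(source_text):
    """Convert tab-delimited text (e.g. a .txt export) to comma-separated text."""
    reader = csv.reader(io.StringIO(source_text), delimiter="\t")
    out = io.StringIO()
    # The writer quotes any field that itself contains a comma.
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


# Hypothetical sample: a header row and one record, separated by tabs.
sample = "id\tname\tnote\n1\tAsha\tpaid, in full\n"
converted = tsv_to_csv(sample)
print(converted)
```

The third field contains a comma, so the writer wraps it in double quotes to keep the row at three columns.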

The load phase loads the data into the end target, which can be any data
store, from a simple delimited flat file to a data warehouse. Depending
on the requirements of the organization, this process varies widely. Some
data warehouses overwrite existing information with cumulative
information, with updates typically made on a daily, weekly, or monthly
basis. Other data warehouses (or even other parts of the same data
warehouse) add new data in historical form at regular intervals, for
example, hourly. To understand this, consider a data warehouse that
is required to maintain sales records for the last year. This data warehouse
overwrites any data older than a year with newer data. Within that
one-year window, however, entries are appended in historical order. The
timing and scope of replacing or appending are strategic design choices
that depend on the time available and the business needs. More complex
systems can maintain a history and audit trail of all changes to the data
loaded into the data warehouse.

Because the load phase interacts with a database, the constraints
defined in the database schema, as well as triggers activated upon data
load, apply (for example, uniqueness, referential integrity, and mandatory
fields). These constraints also contribute to the overall data quality of
the ETL process.
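
The append, overwrite, and constraint behaviour described above can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch, not a data warehouse: SQLite stands in for the target store, and the `sales` table, its columns, and the helper function names are all hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the load target (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sales (
        sale_id   INTEGER PRIMARY KEY,  -- uniqueness constraint
        sale_date TEXT NOT NULL,        -- mandatory field
        amount    REAL NOT NULL         -- mandatory field
    )
    """
)


def load_append(rows):
    """Append rows one by one; return the rows rejected by schema constraints."""
    rejected = []
    for row in rows:
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute("INSERT INTO sales VALUES (?, ?, ?)", row)
        except sqlite3.IntegrityError:
            rejected.append(row)
    return rejected


def load_overwrite(rows):
    """Replace any existing row that shares a sale_id with an incoming row."""
    with conn:
        conn.executemany("INSERT OR REPLACE INTO sales VALUES (?, ?, ?)", rows)


def purge_older_than(cutoff_date):
    """Drop records that have fallen out of the retention window."""
    with conn:
        conn.execute("DELETE FROM sales WHERE sale_date < ?", (cutoff_date,))


load_append([(1, "2022-09-30", 120.0), (2, "2023-10-01", 80.0)])
# The duplicate sale_id and the missing date both violate constraints.
rejected = load_append([(2, "2023-10-02", 95.0), (3, None, 40.0)])
load_overwrite([(2, "2023-10-02", 95.0)])
purge_older_than("2022-10-07")
remaining = conn.execute(
    "SELECT sale_id, amount FROM sales ORDER BY sale_id"
).fetchall()
print(rejected, remaining)
```

Appending rejects rows that break the schema rules, while the overwrite path deliberately replaces the existing record. Which of the two a real load uses is the design choice the paragraph above describes.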

● For example, a financial institution might have information on a
customer in several departments, and each department might have
that customer's information listed in a different way. The
membership department might list the customer by name, whereas
the accounting department might list the customer by number. ETL
can bundle all of these data elements and consolidate them into a
uniform presentation, such as for storing in a database or data
warehouse.
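
The consolidation in the financial-institution example might look like the following sketch. All records are made up, and the `customer_index` lookup that links customer numbers to names is an assumption: some such cross-reference is needed before name-keyed and number-keyed records can be joined.

```python
# Hypothetical department records: membership is keyed by name,
# accounting is keyed by customer number.
membership = [
    {"name": "R. Mehta", "tier": "Gold"},
    {"name": "S. Iyer", "tier": "Silver"},
]
accounting = [
    {"customer_no": 1001, "balance": 2500.0},
    {"customer_no": 1002, "balance": 310.5},
]
# Assumed cross-reference between the two identifiers.
customer_index = {1001: "R. Mehta", 1002: "S. Iyer"}


def consolidate(membership, accounting, customer_index):
    """Merge both views into one uniform record per customer."""
    tier_by_name = {m["name"]: m["tier"] for m in membership}
    unified = []
    for acct in accounting:
        name = customer_index.get(acct["customer_no"])
        unified.append(
            {
                "customer_no": acct["customer_no"],
                "name": name,
                "tier": tier_by_name.get(name),  # None if no membership record
                "balance": acct["balance"],
            }
        )
    return unified


customers = consolidate(membership, accounting, customer_index)
print(customers)
```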
A.5 Tasks / Procedures:

TASK 1:

Write a program in Python to implement the given scenario for Data Loading.

1. Check the existing directory.
2. Change the current directory to a new directory.
3. Save the CSV file without the index.
4. Separator: use a custom delimiter for the CSV output.
5. While saving the CSV file, replace missing values with a placeholder
such as "Unknown".
6. While saving the CSV file, format floating-point numbers (rounded to 2
decimal places).
7. Export the CSV without column names.
8. Export the CSV with specific columns only.
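
One possible way to work through the tasks above uses `os` for the directory steps and pandas `DataFrame.to_csv` for the exports. This is a sketch under assumptions: the handout does not show the lab dataset, so the sample DataFrame, the file names, and the temporary working directory are all hypothetical.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical sample data with some missing values.
df = pd.DataFrame(
    {
        "name": ["Asha", "Ben", "Chen"],
        "score": [91.256, np.nan, 78.5],
        "city": ["Mumbai", None, "Pune"],
    }
)

# Check the existing working directory.
start_dir = os.getcwd()
print("Current directory:", start_dir)

# Change to a new directory (a temporary one, to avoid touching real folders).
work_dir = tempfile.mkdtemp()
os.chdir(work_dir)

# Save without the index column.
df.to_csv("no_index.csv", index=False)

# Use a custom delimiter instead of the default comma.
df.to_csv("pipe_sep.csv", index=False, sep="|")

# Replace missing values with a placeholder string.
df.to_csv("na_filled.csv", index=False, na_rep="Unknown")

# Write floating-point numbers rounded to two decimal places.
df.to_csv("rounded.csv", index=False, float_format="%.2f")

# Export without the column names (header row).
df.to_csv("no_header.csv", index=False, header=False)

# Export only specific columns.
df.to_csv("subset.csv", index=False, columns=["name", "score"])

# Return to the original directory once the exports are done.
os.chdir(start_dir)
```

Each call changes one `to_csv` argument at a time, so the effect of that argument can be seen by opening the corresponding file.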

PART B
(PART B : TO BE COMPLETED BY STUDENTS)

Students must submit the soft copy as per the following segments within two hours of the practical.
The soft copy must be uploaded on the portal at the end of the practical. The filename should be
DS_batch_rollno_experimentno (example: DS_E7_E001_Exp7).

Roll No.: L010 Name: Raj Chaudhary

Class : CSDS 311 Batch : B1

Date of Experiment: 07-10-2023 Date of Submission

Grade : Time of Submission:

Date of Grading:

B.1 Brief about all steps written by student:


(Paste your code completed during the 2 hours of practical in the lab here)

Task 1:
Input:

Output:
Export without index
Export without index and different separator

Export without index and fill NA with “No”


B.4 Conclusion:
(Students must write the conclusion as per the attainment of individual outcome listed above and
learning/observation noted in section B.3. It must be purely based on personal learning only.)

Conducted data loading tasks in Python.

B.5 Question of Curiosity


(To be answered by student based on the practical performed and learning/observations. Students
are free to explore the websites, books etc to answer these questions)

Q1. Explain how to load a CSV file into a MySQL database (steps with code).

To import a CSV file into a MySQL database, you can use the LOAD DATA INFILE
statement. This statement reads a text file and imports it into a database table.

Here are the steps to import a CSV file into a MySQL database using Workbench:
1. Install MySQL Workbench and connect to the database.
2. Create a blank table.
3. Import the CSV file.
4. Check the import results.

You can also import a CSV file into a MySQL database using the command line:
1. Access MySQL Shell.
2. Create a MySQL table for the CSV import.
3. Import the CSV into the MySQL table.

The LOAD DATA INFILE statement specifies the location of the input CSV file and the
table into which the data is to be loaded. The statement looks like this:

LOAD DATA INFILE '{folder_structure}/{csv_file_name}'
INTO TABLE table_name
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;
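
The same statement can be issued from Python. The sketch below assumes the third-party `mysql-connector-python` package is installed and that the MySQL server is permitted to read the file (server settings such as `secure_file_priv` may restrict this). The function names, the example path, and the `sales` table are hypothetical.

```python
def build_load_statement(csv_path, table_name):
    """Return a LOAD DATA INFILE statement for a comma-separated file with a header row."""
    # Table names cannot be sent as query parameters, so accept only simple names.
    if not table_name.isidentifier():
        raise ValueError(f"unsafe table name: {table_name!r}")
    # Escape backslashes and single quotes so the path stays inside the SQL string.
    safe_path = csv_path.replace("\\", "\\\\").replace("'", "\\'")
    return (
        f"LOAD DATA INFILE '{safe_path}' INTO TABLE {table_name} "
        "FIELDS TERMINATED BY ',' ENCLOSED BY '\"' "
        "LINES TERMINATED BY '\\n' IGNORE 1 ROWS"
    )


def load_csv(conn_params, csv_path, table_name):
    """Run the statement against a MySQL server (needs mysql-connector-python)."""
    import mysql.connector  # imported here so the builder works without the package

    conn = mysql.connector.connect(**conn_params)
    try:
        cursor = conn.cursor()
        cursor.execute(build_load_statement(csv_path, table_name))
        conn.commit()
    finally:
        conn.close()


# Hypothetical path and table name, shown only to illustrate the output.
statement = build_load_statement("/var/lib/mysql-files/sales.csv", "sales")
print(statement)
```

A call such as `load_csv({"host": "localhost", "user": "...", "password": "...", "database": "..."}, path, "sales")` would then perform the import, provided the target table already exists with matching columns.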
************************
