
Data using

Analytics
Data Wrangling



Data Wrangling
Data wrangling is the process of converting and mapping raw data into a
form that is ready for analysis.

tidyverse & friends
At its core, the tidyverse is a collection of packages designed to work together
as a full pipeline covering every stage of data analysis on tidy data. It is an
alternative to the built-in base R functions.
dplyr library
The hflights Dataset

Step 1: Filter for flights originating from IAH airport
Step 2: Count total flights and delayed flights by each carrier
Step 3: Convert it to a Delayed per thousand (DPH) metric
Step 4: Sort the result by DPH in descending order
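
The four steps above can be sketched as a single dplyr pipeline. This is a minimal illustration, not the deck's own code: it uses a tiny hypothetical data frame in place of hflights (the column names Origin, UniqueCarrier and DepDelay follow the real dataset), it assumes a flight counts as delayed when DepDelay > 0, and it follows the "per thousand" wording by scaling the ratio by 1000.

```r
library(dplyr)

# Toy stand-in for hflights: hypothetical rows, real column names
flights <- data.frame(
  Origin        = c("IAH", "IAH", "IAH", "IAH", "HOU", "IAH"),
  UniqueCarrier = c("AA",  "AA",  "CO",  "CO",  "WN",  "CO"),
  DepDelay      = c(20,    -3,    0,     45,    10,    NA)
)

delay_summary <- flights %>%
  filter(Origin == "IAH") %>%                 # Step 1: IAH departures only
  group_by(UniqueCarrier) %>%
  summarise(                                  # Step 2: counts per carrier
    total_flights   = n(),
    # Assumption: "delayed" means a positive departure delay
    delayed_flights = sum(DepDelay > 0, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  mutate(dph = delayed_flights / total_flights * 1000) %>%  # Step 3: scale
  arrange(desc(dph))                          # Step 4: highest DPH first

print(delay_summary)
```

To run the same pipeline on the real data, replace `flights` with `hflights::hflights` after installing the hflights package.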
Data Manipulation using R
The Pipe %>%
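
A small sketch of what the pipe does: `x %>% f()` passes `x` as the first argument of `f`, so a nested call can be rewritten as a left-to-right chain. The vector `x` and the choice of functions here are illustrative only.

```r
library(dplyr)  # dplyr re-exports %>% from magrittr

x <- c(4, 9, 16)

# Nested form: has to be read inside-out
nested <- round(mean(sqrt(x)), 1)

# Piped form: reads left-to-right, one step per call
piped <- x %>% sqrt() %>% mean() %>% round(1)

identical(nested, piped)
```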
tidyr
Reshaping Your Data with tidyr
Our Data

Overview

The goal of tidyr is to help you create tidy data. Tidy data is data where:

1. Every column is a variable.
2. Every row is an observation.
3. Every cell is a single value.
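
As a rough illustration of these three rules, the sketch below reshapes a wide table into tidy form with `tidyr::pivot_longer()` (available in tidyr 1.0.0 and later). The table and the names `wide`, `long`, `year` and `cases` are hypothetical, since the deck's own data is not shown here.

```r
library(tidyr)

# Hypothetical wide table: the variable "year" is hidden in the column headers
wide <- data.frame(
  country = c("A", "B"),
  `2019`  = c(10, 30),
  `2020`  = c(20, 40),
  check.names = FALSE   # keep "2019"/"2020" as literal column names
)

# Tidy form: one column per variable, one row per country-year observation
long <- pivot_longer(
  wide,
  cols      = c(`2019`, `2020`),
  names_to  = "year",
  values_to = "cases"
)

print(long)
```

After reshaping, `year` is a column in its own right, so it can be filtered, grouped or plotted like any other variable.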
Hands on
