Lab 1a September 2118

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 3

LAB 1A ​NAME _________Gisselle Ramirez________

Data, Code & RStudio


Directions: Record your responses to the lab questions in the spaces provided.

HIGHLIGHT AND WRITE ALL COMPUTER CODES, AND HIGHLIGHT YOUR


ANSWERS. ​INCLUDING THE “LOAD_LABS”…

Welcome to the labs!

So let's get started!


•​ ​Describe the data that appeared after running View(cdc):

The data that appeared running was a chart of people and information based on a survey
•​ ​Who​ is the information about?

The information is about young adults.


•​ ​What sorts of information about them was collected?

The information collected was their gender, age, race and more. Based upon personally information.

Data: Variables & Observations


•​ ​How are our ​observations​ represented in our data?

The observations are represented by being organized and labeling everything by rows and
columns.
•​ ​What does the first column tell us about our observations?

The first column represents their age.


•​ ​How often did our first observation wear a seatbelt while riding in a car?

The first observation always wore a seatbelt.

Uncovering our Data's Structure


•​ ​How many students are in our cdc data set?

There are 15,624 students in the data set.


•​ ​How many variables were measured for each student?

There were 33 variables that were measured for each student.


Type the following commands into the
console
•​ ​Which of these functions tell us the number of observations in our data?
Each of the functions tells different information such as dim(cdc) which has 33, and
nrow(cdc) which also has 33. Once i typed in the name(cdc) more information
appears about their race, gender and age etc​.

•​ ​Which of these functions tell us the number of variables?

Function dim(cdc) and nrow(cdc).

First Steps

Syntax matters
•​ ​Run the following commands and write down what happens after each. Which
does R understand?
The first two commands reports as “ERROR”. Once i typed in the third command all
31 information appears. Once i typed in the last command “ERROR” appeared
again.

R's most important syntax

Syntax in action
•​ ​Which one of these plots would be useful for answering the question: ​Is it

unusual for students in the CDC dataset to be taller than 1.8 meters?
Histogram(-​height,data=cdc)would be useful
•​ ​Do you think it's unusual for students in the data to be taller than 1.8 meters?

Why or why not?


Most likely because that means you can be taller than 6 feet.
On your own:
•​ ​What is ​public health​ and do we collect data about it?

Public Health is the health population as a whole. We collect data about it by


medical.
•​ ​How do you think our data was collected? Does it include every high school

aged student in the US?


No, I don’t think it includes every high school student in the US possibly the ones
who took a survey.
•​ ​How might the CDC use this data? Who else could benefit from using this data?

People who are doing projects can use this data to see how many different
race,genders and ages are in high school.
•​ ​Write the code to visualize the distribution of weights of the students in the

CDC data with a histogram. What is the ​typical​ weight?


The weights of the students in the CDC data are different some that aren’t supposed
to be their average weight in a certain age.
•​ ​Write the code to create a barplot to visualize the distribution of how often

students wore a helmet while bike riding. About how many students never wore
a helmet?
The code could be “ helmet(cdc) “

You might also like