1. Project Objective

The objective is to explore the dataset and practice extracting basic observations from it. The aim is to get comfortable working in R. It is an open-ended problem, and one is free to explore it from any angle, i.e. any line of questioning.

Context - The data pertains to the houses found in a given California district and some summary statistics about them, based on the 1990 census data. The columns are as follows; their names are largely self-explanatory:

1. longitude

2. latitude

3. housing_median_age

4. total_rooms

5. total_bedrooms

6. population

7. households

8. median_income

9. median_house_value
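
A quick way to confirm that a loaded data frame actually has the columns listed above is to compare the names. The following is a minimal sketch in base R; the data frame name housing and its two rows are made up for illustration and are not real census values.

```r
# Column names as given in the list above.
expected_cols <- c("longitude", "latitude", "housing_median_age",
                   "total_rooms", "total_bedrooms", "population",
                   "households", "median_income", "median_house_value")

# Toy stand-in for the real data: made-up rows, NOT actual census values.
housing <- data.frame(
  longitude          = c(-122.2, -118.3),
  latitude           = c(37.9, 34.1),
  housing_median_age = c(41, 25),
  total_rooms        = c(880, 2100),
  total_bedrooms     = c(129, 400),
  population         = c(322, 1500),
  households         = c(126, 380),
  median_income      = c(8.3, 3.1),
  median_house_value = c(452600, 210000)
)

# Columns that are expected but absent, and columns present but not expected.
missing_cols <- setdiff(expected_cols, names(housing))
extra_cols   <- setdiff(names(housing), expected_cols)

# str() shows each column with its type and first few values.
str(housing)
```

With the real file, an empty missing_cols means every listed column was found; anything in extra_cols is a column the list above does not mention.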


2. Assumptions

The sample is assumed to cover a large area of California and to represent different areas within the state.


3. Exploratory Data Analysis

3.1 The setwd() command was used to set the working directory. The California data file was then imported from CSV with read.csv.
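
Step 3.1 could look roughly like the sketch below. The folder (a temporary directory) and the file name housing.csv are assumptions, since the report does not give them, and the toy rows are written out only so that the example runs on its own.

```r
# Hypothetical folder and file name; substitute the real ones.
data_dir <- tempdir()
csv_name <- "housing.csv"

# Toy rows so the sketch runs end to end; these are NOT real census values.
toy <- data.frame(longitude          = c(-122.2, -118.3, -119.6),
                  latitude           = c(37.9, 34.1, 36.7),
                  median_house_value = c(350000, 210000, 95000))
write.csv(toy, file.path(data_dir, csv_name), row.names = FALSE)

old_wd  <- setwd(data_dir)     # setwd() returns the previous working directory
housing <- read.csv(csv_name)  # path is resolved relative to the working directory
setwd(old_wd)                  # restore the original working directory

dim(housing)   # number of rows and columns
head(housing)  # first few rows
```

Passing a full path to read.csv, e.g. read.csv(file.path(data_dir, csv_name)), avoids changing the working directory at all.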

3.2 The psych package was loaded so that the describe command could be used to view summary statistics for all the variables.
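
A minimal sketch of step 3.2 follows. It assumes a data frame called housing (a made-up name, filled here with toy values rather than census data) and falls back to base R if the psych package is not installed.

```r
# Toy stand-in for the real data: made-up values, NOT actual census figures.
housing <- data.frame(
  longitude     = c(-122.2, -121.9, -119.6, -118.3, -117.1),
  latitude      = c(37.9, 37.3, 36.7, 34.1, 32.7),
  median_income = c(8.3, 5.6, 2.4, 3.1, 4.0)
)

if (requireNamespace("psych", quietly = TRUE)) {
  # describe() returns one row per variable (n, mean, sd, median, min, max, ...).
  stats <- psych::describe(housing)
} else {
  # Base R fallback covering a few of the same statistics.
  stats <- data.frame(
    n    = sapply(housing, function(x) sum(!is.na(x))),
    mean = sapply(housing, mean, na.rm = TRUE),
    sd   = sapply(housing, sd, na.rm = TRUE)
  )
}

print(stats)
```

If psych is missing, install.packages("psych") installs it once; library(psych) then makes describe available without the psych:: prefix.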


The small standard deviations of longitude and latitude indicate that the areas covered by the census lie close together, even though the sample is large (20,640 observations). This can be seen in the boxplots of longitude and latitude below, which show no outliers.
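
The check described above can be sketched in base R as follows. The data frame housing and its values are toy assumptions, not the census data, so the numbers printed here say nothing about the real sample.

```r
# Toy stand-in for the real data: made-up coordinates, NOT actual census values.
housing <- data.frame(
  longitude = c(-122.2, -121.9, -119.6, -118.3, -117.1),
  latitude  = c(37.9, 37.3, 36.7, 34.1, 32.7)
)

# Spread of each coordinate.
coord_sd <- sapply(housing, sd)
print(coord_sd)

# boxplot.stats()$out lists the points beyond 1.5 * IQR from the box,
# i.e. the points a boxplot would draw as outliers.
outliers <- lapply(housing, function(x) boxplot.stats(x)$out)
print(outliers)

# One boxplot per coordinate.
boxplot(housing$longitude, main = "longitude")
boxplot(housing$latitude, main = "latitude")
```

An empty result in outliers for a column matches a boxplot with no points drawn beyond the whiskers.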

