w6

WEEK 6:

A) Write an R script to find basic descriptive statistics using the summary, str, and quantile (quartile) functions on the mtcars & cars datasets.

Sol:-

summary function

x <- c(1, 2, 3, 4, 5)
summary(x)
y <- c(2, 3, 4, 5, 6, 7, 8)
summary(y)

Output:-

summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      1       2       3       3       4       5

summary(y)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    2.0     3.5     5.0     5.0     6.5     8.0

str function

rv <- c(11, 18, 19, 21, 46)
rv
str(rv)

Output:-

str(rv)
 num [1:5] 11 18 19 21 46

Quartiles in R

Load the mtcars dataset and inspect it. summary() reports the quartiles of every column (1st Qu., Median, 3rd Qu.).

data("mtcars")   # load the dataset into the R session
head(mtcars)

Output:-

##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

nrow(mtcars)

Output:-

## [1] 32

ncol(mtcars)

Output:-

## [1] 11

tail(mtcars)

Output:-

##                 mpg cyl  disp  hp drat    wt qsec vs am gear carb
## Porsche 914-2  26.0   4 120.3  91 4.43 2.140 16.7  0  1    5    2
## Lotus Europa   30.4   4  95.1 113 3.77 1.513 16.9  1  1    5    2
## Ford Pantera L 15.8   8 351.0 264 4.22 3.170 14.5  0  1    5    4
## Ferrari Dino   19.7   6 145.0 175 3.62 2.770 15.5  0  1    5    6
## Maserati Bora  15.0   8 301.0 335 3.54 3.570 14.6  0  1    5    8
## Volvo 142E     21.4   4 121.0 109 4.11 2.780 18.6  1  1    4    2

summary(mtcars)

Output:-

##       mpg             cyl             disp             hp
##  Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0
##  1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5
##  Median :19.20   Median :6.000   Median :196.3   Median :123.0
##  Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7
##  3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0
##  Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0
##       drat             wt             qsec             vs
##  Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000
##  1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000
##  Median :3.695   Median :3.325   Median :17.71   Median :0.0000
##  Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375
##  3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000
##  Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000
##        am              gear            carb
##  Min.   :0.0000   Min.   :3.000   Min.   :1.000
##  1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000
##  Median :0.0000   Median :4.000   Median :2.000
##  Mean   :0.4062   Mean   :3.688   Mean   :2.812
##  3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000
##  Max.   :1.0000   Max.   :5.000   Max.   :8.000

B) Write an R script to find a subset of a dataset by using the subset() and aggregate() functions on the iris dataset.

subset function:

data("iris")                          # load the dataset (data() returns only its name, so do not assign it)
iris_df <- data.frame(iris)
df <- subset(iris_df, select = 2:3)   # keep columns 2 and 3 only
df

Output:-

> df
    Sepal.Width Petal.Length
1           3.5          1.4
2           3.0          1.4
3           3.2          1.3
4           3.1          1.5
5           3.6          1.4
6           3.9          1.7
7           3.4          1.4
8           3.4          1.5
9           2.9          1.4
10          3.1          1.5
11          3.7          1.5
12          3.4          1.6
13          3.0          1.4
14          3.0          1.1
15          4.0          1.2
16          4.4          1.5
17          3.9          1.3
18          3.5          1.4
19          3.8          1.7
20          3.8          1.5
21          3.4          1.7
22          3.7          1.5
23          3.6          1.0
24          3.3          1.7
25          3.4          1.9
26          3.0          1.6
27          3.4          1.6
28          3.5          1.5
29          3.4          1.4
30          3.2          1.6
31          3.1          1.6
32          3.4          1.5
33          4.1          1.5
34          4.2          1.4
35          3.1          1.5
36          3.2          1.2
37          3.5          1.3
38          3.6          1.4
39          3.0          1.3
40          3.4          1.5
41          3.5          1.3
42          2.3          1.3
43          3.2          1.3
44          3.5          1.6
45          3.8          1.9
46          3.0          1.4
47          3.8          1.6
48          3.2          1.4
49          3.7          1.5
50          3.3          1.4
51          3.2          4.7
52          3.2          4.5
53          3.1          4.9
54          2.3          4.0
... (continues through row 150)
150         3.0          5.1

aggregate function

The aggregate() function in R splits the data into subsets, computes a summary statistic for each subset, and returns the result in grouped form. It is similar to GROUP BY in SQL and is useful for aggregate operations such as sum, count, mean, minimum, and maximum.
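The split, compute, and combine steps described above can be sketched by hand and compared with a single aggregate() call. This is only an illustration: the names groups, means_by_group, manual_means, agg_means, and same are made up for this sketch, and aggregate() is not claimed to be implemented this way internally.

```r
# Sketch only: reproduce aggregate()'s group means step by step.
data("iris")

# 1. Split: one data frame per species, measurement columns only.
groups <- split(iris[, 1:4], iris$Species)

# 2. Compute: column means within each group.
means_by_group <- lapply(groups, colMeans)

# 3. Combine: stack the per-group results into one matrix.
manual_means <- do.call(rbind, means_by_group)
print(manual_means)

# aggregate() performs the same grouping and summarising in one call.
agg_means <- aggregate(iris[, 1:4], by = list(Species = iris$Species), FUN = mean)
print(agg_means)

# Compare the numbers only (row and column labels differ between the two).
same <- isTRUE(all.equal(unname(as.matrix(agg_means[, -1])), unname(manual_means)))
print(same)
```

Naming the grouping element, as in list(Species = iris$Species), gives the group column a readable name instead of the default Group.1.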
Let's see examples of aggregate() that compute the group sum, the group maximum and minimum, the group mean, and the group counts.

Syntax for the aggregate() function in R:

aggregate(x, by, FUN, ..., simplify = TRUE, drop = TRUE)

x         an R object, usually a data frame
by        a list of grouping elements by which the subsets are grouped
FUN       a function to compute the summary statistics
simplify  a logical indicating whether results should be simplified to a vector or matrix if possible
drop      a logical indicating whether to drop unused combinations of grouping values

Examples of the aggregate() function:

# Aggregate function in R with MEAN summary statistics
agg_mean <- aggregate(iris[, 1:4], by = list(iris$Species), FUN = mean, na.rm = TRUE)
agg_mean

OUTPUT:

     Group.1 Sepal.Length Sepal.Width Petal.Length Petal.Width
1     setosa        5.006       3.428        1.462       0.246
2 versicolor        5.936       2.770        4.260       1.326
3  virginica        6.588       2.974        5.552       2.026

# Aggregate function in R with SUM summary statistics
agg_sum <- aggregate(iris[, 1:4], by = list(iris$Species), FUN = sum, na.rm = TRUE)
agg_sum

OUTPUT:

     Group.1 Sepal.Length Sepal.Width Petal.Length Petal.Width
1     setosa        250.3       171.4         73.1        12.3
2 versicolor        296.8       138.5        213.0        66.3
3  virginica        329.4       148.7        277.6       101.3

# Aggregate function in R with COUNT
agg_count <- aggregate(iris[, 1:4], by = list(iris$Species), FUN = length)
agg_count

OUTPUT:

     Group.1 Sepal.Length Sepal.Width Petal.Length Petal.Width
1     setosa           50          50           50          50
2 versicolor           50          50           50          50
3  virginica           50          50           50          50

# Aggregate function in R with MAXIMUM
agg_max <- aggregate(iris[, 1:4], by = list(iris$Species), FUN = max, na.rm = TRUE)
agg_max

OUTPUT:

     Group.1 Sepal.Length Sepal.Width Petal.Length Petal.Width
1     setosa          5.8         4.4          1.9         0.6
2 versicolor          7.0         3.4          5.1         1.8
3  virginica          7.9         3.8          6.9         2.5

# Aggregate function in R with MINIMUM
agg_min <- aggregate(iris[, 1:4], by = list(iris$Species), FUN = min, na.rm = TRUE)
agg_min

OUTPUT:

     Group.1 Sepal.Length Sepal.Width Petal.Length Petal.Width
1     setosa          4.3         2.3          1.0         0.1
2 versicolor          4.9         2.0          3.0         1.0
3  virginica          4.9         2.2          4.5         1.4
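Part A also asks for the quartile function and the cars dataset. In base R the function is quantile(), so a minimal sketch could look like the following. The columns used (mpg from mtcars, speed from cars) and the variable names mpg_quartiles, speed_quartiles, and speed_iqr are picked here only for illustration.

```r
# Sketch only: quartiles with quantile() on mtcars and cars.
data("mtcars")
data("cars")

# With no probs argument, quantile() returns the 0%, 25%, 50%, 75% and 100% points.
mpg_quartiles <- quantile(mtcars$mpg)
print(mpg_quartiles)

# Ask for specific quantiles with probs: here the lower and upper quartiles.
speed_quartiles <- quantile(cars$speed, probs = c(0.25, 0.75))
print(speed_quartiles)

# The interquartile range is the upper quartile minus the lower quartile.
speed_iqr <- IQR(cars$speed)
print(speed_iqr)

# str() and summary() work on cars exactly as they do on mtcars.
str(cars)
print(summary(cars))
```

The 50% entry of mpg_quartiles is the same median that summary(mtcars) reports for mpg.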