
Kernel: R (R-Project)
options(jupyter.plot_mimetypes ='image/png')

Exercise 1

names <- c('Bob','Claire','Luisa','Matt','Marta','Mike')
score <- c(34,82,59,72,50,100)
game_cards <- data.frame(names, score, stringsAsFactors=FALSE)
game_cards
game_cards$names
names <- c('Bob','Claire','Luisa','Matt','Marta','Mike')
score1 <- c(34,82,59,72,50,100)
score2 <- c(64,82,36,48,29,85)
game_cards <- data.frame(names, score1, score2, stringsAsFactors=FALSE)
game_cards
game_cards$score1
game_cards$score2

The additional field for the second match score is created with score2<- c( , , , , , ), a vector of the same length as score1. score2 is then passed to data.frame() alongside the other columns. Each field is accessed separately with game_cards$fieldname, where fieldname is score1 or score2.
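
A field can also be added after the data frame already exists. The sketch below is an alternative to rebuilding the data frame, not the method used above; the name cards is hypothetical and chosen so that it does not overwrite game_cards.

```r
# Build the data frame with one score column first.
names  <- c('Bob','Claire','Luisa','Matt','Marta','Mike')
score1 <- c(34,82,59,72,50,100)
cards  <- data.frame(names, score1, stringsAsFactors=FALSE)

# Assigning to a new column name with $ appends that field.
# The vector must have one value per row.
cards$score2 <- c(64,82,36,48,29,85)

# Fields can be read back with $ or with [[ ]] and the field name.
cards$score2
cards[["score1"]]
ncol(cards)
```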

Exercise 2

names(game_cards) <- c("names","match1","match2")
game_cards
dim(game_cards)
min(game_cards$match1)
min(game_cards$match2)

The minimum scores from match 1 and match 2 were 34 and 29 respectively.

max(game_cards$match1)
max(game_cards$match2)

The maximum scores from match 1 and match 2 were 100 and 85 respectively.

min(game_cards[,2:3])

The minimum score from both matches was 29.

max(game_cards[,2:3])

The maximum score from both matches was 100.

which.min(game_cards$match1)
which.min(game_cards$match2)
which.max(game_cards$match1)
which.max(game_cards$match2)
names[which.min(game_cards$match1)]
names[which.min(game_cards$match2)]
names[which.max(game_cards$match1)]
names[which.max(game_cards$match2)]
# Lowest score in match 1 and the player who scored it
x <- c(min(game_cards$match1))
y <- c(names[which.min(game_cards$match1)])
z <- c(y, as.character(x))
print(z)
# Lowest score in match 2 and the player who scored it
a <- c(min(game_cards$match2))
b <- c(names[which.min(game_cards$match2)])
c <- c(b, as.character(a))
print(c)
[1] "Bob" "34"
[1] "Marta" "29"

The minimum score from match 1 was 34, which was scored by Bob. The minimum score from match 2 was 29, which was scored by Marta.

# Highest score in match 1 and the player who scored it
d <- c(max(game_cards$match1))
e <- c(names[which.max(game_cards$match1)])
f <- c(e, as.character(d))
print(f)
# Highest score in match 2 and the player who scored it
g <- c(max(game_cards$match2))
h <- c(names[which.max(game_cards$match2)])
i <- c(h, as.character(g))
print(i)
[1] "Mike" "100"
[1] "Mike" "85"

The maximum score from match 1 was 100, which was scored by Mike. The maximum score from match 2 was 85, which was also scored by Mike.
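
The name and score can also be pulled out together by using which.min() or which.max() as a row index on the data frame. This is a minimal sketch that rebuilds game_cards from the values in Exercise 1 so it runs on its own; the names lowest1 and highest2 are hypothetical.

```r
# Rebuild the data frame with the column names used in Exercise 2.
game_cards <- data.frame(
  names  = c('Bob','Claire','Luisa','Matt','Marta','Mike'),
  match1 = c(34,82,59,72,50,100),
  match2 = c(64,82,36,48,29,85),
  stringsAsFactors = FALSE)

# which.min()/which.max() give a row position; indexing with it
# returns that player's name and score in one row.
lowest1  <- game_cards[which.min(game_cards$match1), c("names","match1")]
highest2 <- game_cards[which.max(game_cards$match2), c("names","match2")]
lowest1
highest2
```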

game_cards[order(game_cards$match1),]
game_cards[order(game_cards$match2),]

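The function order() returns the positions of a vector's elements arranged from the smallest value to the largest. order(game_cards$match1) therefore gives the row numbers of the match 1 scores in ascending order. Using that result as the row index, as in game_cards[order(game_cards$match1),], returns the whole data frame with its rows sorted by that score.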
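
To see what order() itself returns, as opposed to the sorted data frame, it can be run on the match 1 scores alone. A small sketch; idx is a hypothetical name.

```r
match1 <- c(34,82,59,72,50,100)

# order() returns positions, not values: the 1st element is smallest,
# then the 5th, then the 3rd, and so on.
idx <- order(match1)
idx

# Indexing with those positions gives the sorted values, like sort().
match1[idx]

# decreasing=TRUE reverses the direction.
order(match1, decreasing=TRUE)
```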

?plot

The command plot() is the generic function for X-Y plotting of R objects, called as plot(x,y,...). x = the x co-ordinates of the points in the plot. y = the y co-ordinates of the points in the plot. ... = further arguments to be passed to the methods, such as graphical parameters. "type" = the type of plot that should be drawn, e.g. "p" for points, "l" for lines, "b" for both. To add an overall title to the plot, use "main"; to add a subtitle, use "sub"; to add titles for the x and y axes, use "xlab" and "ylab" respectively.
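
The arguments described above can be combined in one call. This is an illustrative sketch only: the scores are the match 1 values from Exercise 1, and the titles and the player index on the x axis are made up for the example.

```r
x <- 1:6                        # player index (illustrative)
y <- c(34,82,59,72,50,100)      # match 1 scores from Exercise 1

# type="b" draws both points and connecting lines;
# main, sub, xlab and ylab set the titles.
plot(x, y, type="b",
     main="Match 1 scores", sub="six players",
     xlab="Player index", ylab="Score")
```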

par(mfrow=c(1,2))
barplot(game_cards$match1, names=game_cards$names)
barplot(game_cards$match2, names=game_cards$names)
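Error in barplot(game_cards$match1, names = game_cards$names): object 'game_cards' not found
Traceback:
1. barplot(game_cards$match1, names = game_cards$names)

The error means that game_cards did not exist in the R session when this cell was run. The cells from Exercises 1 and 2 that create and rename game_cards need to be run first.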

Exercise 4

?par

The par() command is used to set or query graphical parameters. Parameters can be set by specifying them as arguments to par in tag = value form, or by passing them as a list of tagged values.
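
Both forms can be seen in a short sketch: par() called with tag = value arguments returns the previous settings, and that returned list can be passed back to par() to restore them. The names old, during and after are hypothetical, and the sketch assumes it starts from the default single-panel layout.

```r
# Set two parameters in tag = value form; the old values are returned.
old <- par(mfrow=c(1,2), las=1)

# Query a parameter by name.
during <- par("mfrow")
during

# Pass the saved list back to restore the earlier settings.
par(old)
after <- par("mfrow")
after
```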

Exercise 5

match1 <- c(34,82,59,72,50,100)
match2 <- c(64,29,36,48,82,85)
plot(match1, match2)
abline(0,1)
Image in a Jupyter notebook

Scatter plots place data points on a horizontal and a vertical axis to show how one variable relates to another. They use Cartesian co-ordinates to display values for, typically, two variables of a set of data. A scatter plot can be used either when one continuous variable is under the control of the experimenter and the other depends on it, or when both continuous variables are independent. If one parameter is systematically incremented and/or decremented by the experimenter, it is called the control parameter or independent variable and is customarily plotted along the horizontal axis. The measured or dependent variable is customarily plotted along the vertical axis. If no dependent variable exists, either variable can be plotted on either axis, and the scatter plot illustrates only the degree of correlation (not causation) between the two variables.

match1 <- c(34,82,59,72,50,100)
plot(match1, match1)
abline(0,1)
Image in a Jupyter notebook

When a variable is plotted against itself, every point has the same x and y value, so all points fall exactly on the straight line y = x drawn by abline(0,1). A line of best fit through these points would coincide with that line.
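
The degree of correlation that the two scatter plots show can also be put into a number with cor(). A minimal sketch using the match1 and match2 values from this exercise; r_12 and r_11 are hypothetical names, and Pearson correlation is the default method of cor().

```r
match1 <- c(34,82,59,72,50,100)
match2 <- c(64,29,36,48,82,85)

# Pearson correlation between the two matches (between -1 and 1).
r_12 <- cor(match1, match2)
r_12

# A variable against itself is perfectly correlated, which matches
# all points lying on the line y = x.
r_11 <- cor(match1, match1)
r_11
```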

Exercise 6

data(iris)
?iris
iris
summary(iris)
  Sepal.Length    Sepal.Width     Petal.Length    Petal.Width          Species
 Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100   setosa    :50
 1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300   versicolor:50
 Median :5.800   Median :3.000   Median :4.350   Median :1.300   virginica :50
 Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199
 3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800
 Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500
plot(iris)
Image in a Jupyter notebook
plot(iris$Sepal.Length~iris$Petal.Length)
Image in a Jupyter notebook
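par(mfrow=c(1,2))
# Left panel: sepal length against petal length with its regression line
plot(iris$Sepal.Length~iris$Petal.Length, xlab="Petal Length", ylab="Sepal Length", main="Sepal Length vs Petal Length", col=iris$Species, las=1)
reg1 <- lm(iris$Sepal.Length~iris$Petal.Length)
abline(reg1)
# Right panel: sepal width against petal width with its regression line
plot(iris$Sepal.Width~iris$Petal.Width, xlab="Petal Width", ylab="Sepal Width", main="Sepal Width vs Petal Width", col=iris$Species, las=1)
reg2 <- lm(iris$Sepal.Width~iris$Petal.Width)
# abline() takes one fitted model at a time and draws on the current panel
abline(reg2)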
Image in a Jupyter notebook

Exercise 7

plot(iris$Sepal.Length~iris$Petal.Length, xlab="Petal Length", ylab="Sepal Length", main="Sepal Length vs Petal Length", col=iris$Species, las=1)
reg1 <- lm(iris$Sepal.Length~iris$Petal.Length)
abline(reg1)
plot(iris$Sepal.Width~iris$Petal.Width, xlab="Petal Width", ylab="Sepal Width", main="Sepal Width vs Petal Width", col=iris$Species, las=1)
reg2 <- lm(iris$Sepal.Width~iris$Petal.Width)
abline(reg2)
Image in a Jupyter notebook
Image in a Jupyter notebook
iris_table <- table(iris$Species)
lbls <- paste(names(iris_table), "\n", iris_table, sep="")
pie(iris_table, labels = lbls, main="Pie Chart of Species of Iris\n (sample sizes)")
Image in a Jupyter notebook

Exercise 8

data(morley)
?morley
morley
morley_table <- table(morley$Run)
lbls <- paste(names(morley_table), "\n", morley_table, sep="")
pie(morley_table, labels = lbls, main="Pie Chart of Run number within each experiment\n (sample sizes)")
Image in a Jupyter notebook
morley_table <- table(morley$Expt)
lbls <- paste(names(morley_table), "\n", morley_table, sep="")
pie(morley_table, labels = lbls, main="Pie Chart of the number of experiments\n (sample sizes)")
Image in a Jupyter notebook
morley_table <- table(morley$Speed)
lbls <- paste(names(morley_table), "\n", morley_table, sep="")
pie(morley_table, labels = lbls, main="Pie Chart of Speed of Light in the experiments\n (sample sizes)")
Image in a Jupyter notebook
boxplot(morley$Speed ~ morley$Expt, col='light grey', xlab='Experiment #', ylab="speed (km/s - 299,000)", main="Michelson–Morley experiment")
mtext("speed of light data")
# Accepted speed of light on the same scale (km/s minus 299,000)
sol <- 299792.458-299000
abline(h=sol, col='red')
Image in a Jupyter notebook

Exercise 9

quantile(morley$Speed,prob=0.75)[["75%"]] + 1.5*IQR(morley$Speed)
quantile(morley$Speed,prob=0.25)[["25%"]] - 1.5*IQR(morley$Speed)
quantile(morley$Speed,prob=0.25)
quantile(morley$Speed,prob=0.50)
quantile(morley$Speed,prob=0.75)
IQR(morley$Speed)
mean(morley$Speed)
sd(morley$Speed)
# Experiment 1
Expt1 <- (morley$Speed[morley$Expt==1])
Expt1
quantile(Expt1,prob=0.75)[["75%"]] + 1.5*IQR(Expt1)
quantile(Expt1,prob=0.25)[["25%"]] - 1.5*IQR(Expt1)
quantile(Expt1,prob=0.25)
quantile(Expt1,prob=0.50)
quantile(Expt1,prob=0.75)
IQR(Expt1)
mean(Expt1)
sd(Expt1)
# Experiment 2
Expt2 <- (morley$Speed[morley$Expt==2])
Expt2
quantile(Expt2,prob=0.75)[["75%"]] + 1.5*IQR(Expt2)
quantile(Expt2,prob=0.25)[["25%"]] - 1.5*IQR(Expt2)
quantile(Expt2,prob=0.25)
quantile(Expt2,prob=0.50)
quantile(Expt2,prob=0.75)
IQR(Expt2)
mean(Expt2)
sd(Expt2)
# Experiment 4
Expt4 <- (morley$Speed[morley$Expt==4])
Expt4
quantile(Expt4,prob=0.75)[["75%"]] + 1.5*IQR(Expt4)
quantile(Expt4,prob=0.25)[["25%"]] - 1.5*IQR(Expt4)
quantile(Expt4,prob=0.25)
quantile(Expt4,prob=0.50)
quantile(Expt4,prob=0.75)
IQR(Expt4)
mean(Expt4)
sd(Expt4)
# Experiment 5
Expt5 <- (morley$Speed[morley$Expt==5])
Expt5
quantile(Expt5,prob=0.75)[["75%"]] + 1.5*IQR(Expt5)
quantile(Expt5,prob=0.25)[["25%"]] - 1.5*IQR(Expt5)
quantile(Expt5,prob=0.25)
quantile(Expt5,prob=0.50)
quantile(Expt5,prob=0.75)
IQR(Expt5)
mean(Expt5)
sd(Expt5)
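
The same set of statistics is repeated for each experiment above, so it can be wrapped in a function and applied to every experiment at once. This is a sketch under assumptions, not the method used in the exercise: fence_summary and by_expt are hypothetical names, and the result covers all five experiments, including experiment 3.

```r
data(morley)

# Quartiles, 1.5*IQR fences, mean and sd for one vector of speeds.
fence_summary <- function(x) {
  q   <- quantile(x, probs=c(0.25, 0.50, 0.75), names=FALSE)
  iqr <- IQR(x)
  c(lower_fence = q[1] - 1.5*iqr,
    q25         = q[1],
    median      = q[2],
    q75         = q[3],
    upper_fence = q[3] + 1.5*iqr,
    iqr         = iqr,
    mean        = mean(x),
    sd          = sd(x))
}

# split() groups the speeds by experiment; sapply() applies the
# function to each group and returns one column per experiment.
by_expt <- sapply(split(morley$Speed, morley$Expt), fence_summary)
round(by_expt, 1)
```

Each column of by_expt is one experiment, so by_expt["mean","1"] is the mean of experiment 1.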

Exercise 10

hist(morley$Speed)
Image in a Jupyter notebook
par(fg=rgb(0.5,0.4,0.2))
hist(morley$Speed, prob=F, col=rgb(0.3,0.4,0.9), main='Michelson-Morley Experiment', ylab="Frequency", xlab='Difference from Speed of Light')
par(fg='black')
Image in a Jupyter notebook
par(fg=rgb(0.9,0.4,0.4))
hist(morley$Speed, prob=F, col=rgb(0.9,0.2,0.3), main='Michelson-Morley Experiment', ylab="Frequency", xlab='Difference from Speed of Light')
par(fg='black')
lines(density(morley$Speed))
# Mean (solid), median (dotted) and mean +/- one sd (dashed)
abline(v=mean(morley$Speed), col=rgb(0.5,0.5,0.5))
abline(v=median(morley$Speed), lty=3, col=rgb(0.5,0.5,0.5))
abline(v=mean(morley$Speed)+sd(morley$Speed), lty=2, col=rgb(0.7,0.7,0.7))
abline(v=mean(morley$Speed)-sd(morley$Speed), lty=2, col=rgb(0.7,0.7,0.7))
rug(morley$Speed)
Image in a Jupyter notebook
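
One point worth noting about the cell above: density() returns values on a density scale, while hist() with prob=F draws counts, so the density curve sits very close to the x axis on a frequency histogram. A possible variant, sketched here as an assumption rather than as the exercise's intended answer, draws the histogram on the density scale with freq=FALSE so that both share the same y axis. The titles are illustrative and h is a hypothetical name.

```r
data(morley)

# freq=FALSE scales the bars so their total area is 1,
# which matches the scale of the kernel density estimate.
h <- hist(morley$Speed, freq=FALSE,
          main="Speed of light measurements",
          xlab="Speed (km/s minus 299,000)", ylab="Density")
lines(density(morley$Speed))
abline(v=mean(morley$Speed), col=rgb(0.5,0.5,0.5))
rug(morley$Speed)
```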
# Speeds recorded in experiment 1
speed1 <- morley$Speed[morley$Expt==1]
par(fg=rgb(0.6,0.2,0.3))
hist(speed1, prob=F, col=rgb(0.7,0.1,0.3), main='Michelson-Morley Experiment 1', ylab="Frequency", xlab='Difference from Speed of Light')
par(fg='black')
lines(density(speed1))
abline(v=mean(speed1), col=rgb(0.5,0.5,0.5))
abline(v=median(speed1), lty=3, col=rgb(0.5,0.5,0.5))
abline(v=mean(speed1)+sd(speed1), lty=2, col=rgb(0.7,0.7,0.7))
abline(v=mean(speed1)-sd(speed1), lty=2, col=rgb(0.7,0.7,0.7))
rug(speed1)
Image in a Jupyter notebook
# Speeds recorded in experiment 2
speed2 <- morley$Speed[morley$Expt==2]
par(fg=rgb(0.6,0.5,0.7))
hist(speed2, prob=F, col=rgb(0.3,0.7,0.9), main='Michelson-Morley Experiment 2', ylab="Frequency", xlab='Difference from Speed of Light')
par(fg='black')
lines(density(speed2))
abline(v=mean(speed2), col=rgb(0.5,0.5,0.5))
abline(v=median(speed2), lty=3, col=rgb(0.5,0.5,0.5))
abline(v=mean(speed2)+sd(speed2), lty=2, col=rgb(0.7,0.7,0.7))
abline(v=mean(speed2)-sd(speed2), lty=2, col=rgb(0.7,0.7,0.7))
rug(speed2)
Image in a Jupyter notebook
# Speeds recorded in experiment 4
speed4 <- morley$Speed[morley$Expt==4]
par(fg=rgb(0.6,0.6,0.6))
hist(speed4, prob=F, col=rgb(0.4,0.9,0.2), main='Michelson-Morley Experiment 4', ylab="Frequency", xlab='Difference from Speed of Light')
par(fg='black')
lines(density(speed4))
abline(v=mean(speed4), col=rgb(0.5,0.5,0.5))
abline(v=median(speed4), lty=3, col=rgb(0.5,0.5,0.5))
abline(v=mean(speed4)+sd(speed4), lty=2, col=rgb(0.7,0.7,0.7))
abline(v=mean(speed4)-sd(speed4), lty=2, col=rgb(0.7,0.7,0.7))
rug(speed4)
Image in a Jupyter notebook
# Speeds recorded in experiment 5
speed5 <- morley$Speed[morley$Expt==5]
par(fg=rgb(0.6,0.6,0.6))
hist(speed5, prob=F, col=rgb(0.7,0.2,0.6), main='Michelson-Morley Experiment 5', ylab="Frequency", xlab='Difference from Speed of Light')
par(fg='black')
lines(density(speed5))
abline(v=mean(speed5), col=rgb(0.5,0.5,0.5))
abline(v=median(speed5), lty=3, col=rgb(0.5,0.5,0.5))
abline(v=mean(speed5)+sd(speed5), lty=2, col=rgb(0.7,0.7,0.7))
abline(v=mean(speed5)-sd(speed5), lty=2, col=rgb(0.7,0.7,0.7))
rug(speed5)
Image in a Jupyter notebook

Exercise 11

rnorm(morley$Speed)
?rnorm()
runif(morley$Speed)
rbinom(morley$Speed)
Error in rbinom(morley$Speed): argument "size" is missing, with no default
Traceback:
1. rbinom(morley$Speed)
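
In the calls above morley$Speed is the first argument, n. When n is a vector longer than one, these functions use its length as the number of values to generate, so rnorm(morley$Speed) returns 100 standard normal values that do not depend on the speeds themselves. rbinom() fails because it has no defaults for size and prob. The sketch below passes the parameters explicitly; the seed, the size of 10 and the prob of 0.5 are arbitrary values chosen for illustration, and the sim_ names are hypothetical.

```r
data(morley)
n <- length(morley$Speed)     # number of values to generate

set.seed(1)                   # arbitrary seed so the draws repeat

# Normal values with the same mean and sd as the measured speeds.
sim_norm <- rnorm(n, mean=mean(morley$Speed), sd=sd(morley$Speed))

# Uniform values across the observed range of speeds.
sim_unif <- runif(n, min=min(morley$Speed), max=max(morley$Speed))

# Binomial counts: size (trials) and prob must be given.
# size=10 and prob=0.5 are illustrative, not taken from the data.
sim_binom <- rbinom(n, size=10, prob=0.5)

length(sim_norm)
range(sim_unif)
table(sim_binom)
```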

Exercise 12

t.test(morley$Speed[morley$Expt==1], morley$Speed[morley$Expt==2])
        Welch Two Sample t-test
data:  morley$Speed[morley$Expt == 1] and morley$Speed[morley$Expt == 2]
t = 1.9516, df = 30.576, p-value = 0.0602
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  -2.419111 108.419111
sample estimates:
mean of x mean of y
      909       856

There is not a significant difference between the results of experiment 1 and 2, as p>0.05.

t.test(morley$Speed[morley$Expt==1], morley$Speed[morley$Expt==4])
        Welch Two Sample t-test
data:  morley$Speed[morley$Expt == 1] and morley$Speed[morley$Expt == 4]
t = 3.2739, df = 30.238, p-value = 0.002659
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  33.31171 143.68829
sample estimates:
mean of x mean of y
    909.0     820.5

There is a significant difference between the results of experiment 1 and 4, as p<0.05.

t.test(morley$Speed[morley$Expt==1], morley$Speed[morley$Expt==5])
        Welch Two Sample t-test
data:  morley$Speed[morley$Expt == 1] and morley$Speed[morley$Expt == 5]
t = 2.9346, df = 28.471, p-value = 0.006538
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  23.44296 131.55704
sample estimates:
mean of x mean of y
    909.0     831.5

There is a significant difference between the results of experiment 1 and 5, as p<0.05.

t.test(morley$Speed[morley$Expt==2], morley$Speed[morley$Expt==4])
        Welch Two Sample t-test
data:  morley$Speed[morley$Expt == 2] and morley$Speed[morley$Expt == 4]
t = 1.8523, df = 37.987, p-value = 0.07176
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -3.298237 74.298237
sample estimates:
mean of x mean of y
    856.0     820.5

There is not a significant difference between the results of experiment 2 and 4 because p>0.05.

t.test(morley$Speed[morley$Expt==2], morley$Speed[morley$Expt==5])
        Welch Two Sample t-test
data:  morley$Speed[morley$Expt == 2] and morley$Speed[morley$Expt == 5]
t = 1.3405, df = 37.461, p-value = 0.1882
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -12.51683  61.51683
sample estimates:
mean of x mean of y
    856.0     831.5

There is not a significant difference between the results of experiment 2 and 5, as p>0.05.

t.test(morley$Speed[morley$Expt==4], morley$Speed[morley$Expt==5])
        Welch Two Sample t-test
data:  morley$Speed[morley$Expt == 4] and morley$Speed[morley$Expt == 5]
t = -0.60808, df = 37.611, p-value = 0.5468
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -47.63309  25.63309
sample estimates:
mean of x mean of y
    820.5     831.5

There is not a significant difference between the results of experiment 4 and 5 because p>0.05.
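
The separate t.test() calls above can also be collected into one table with pairwise.t.test() from the stats package. This is a sketch of an alternative, not the approach used in the exercise. With pool.sd=FALSE each pair is compared with its own Welch t-test, and p.adjust.method="none" leaves the p-values unadjusted so they should match the individual tests. The table covers all five experiments, including experiment 3; pw is a hypothetical name.

```r
data(morley)

# One Welch t-test per pair of experiments, p-values unadjusted.
pw <- pairwise.t.test(morley$Speed, morley$Expt,
                      pool.sd=FALSE, p.adjust.method="none")

# Lower-triangular matrix of p-values: rows and columns are
# experiment numbers, e.g. row "2", column "1" is experiment 1 vs 2.
pw$p.value
```

Leaving p.adjust.method at its default ("holm") would instead adjust the p-values for the number of comparisons being made.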