Class Exercises

Statistical Laboratory

Alessandro Ortis - University of Catania

Exercises:

N.B.: before starting any exercise that involves random numbers, set the seed to 1303 (i.e., execute set.seed(1303))

  1. Given a number n, generate a list containing the first n integer numbers starting from 5.
  2. Given two numbers a and b, compute the sum of the numbers between a and b.
  3. Given a number n, compute the factorial of n. The factorial of a number n is the product of all positive integers less than or equal to n.
  4. Consider a vector x <- c(4,5,8,10,3,5,13), what is the value of x<7?
  5. Create a list with all odd numbers in the even positions and vice-versa.
  6. Create a list A with all random numbers. Then, create a list B containing only the odd elements of A and a list C containing only the even elements of A.
  7. Create a list A with all random numbers. Then, create a list B containing all the elements of A multiplied by 4.
  8. Create a 10x10 matrix with all random numbers from a normal distribution.
  9. Create a 10x10 matrix whose generic element M[i,j] has value equal to (i+1)*(j+1).
  10. Create a 10x10 matrix with all random numbers from a normal distribution. Then, sum all the elements of the principal diagonal.
  11. Generate 1000 random numbers from a Gaussian distribution with mean = 50 and std. dev. = 3. Then draw a plot to show the distribution of the data.
  12. Write a while loop that prints out 10 random numbers drawn from the standard normal distribution.
  13. Write a for loop that iterates over the numbers 1 to 7 and prints the cube of each number using print().
  14. Create a vector 'age' with four integer example values of age, create a vector 'name' of four strings with four name examples, create a vector 'gender' with four elements with values equal to 'M' or 'F'. Then, create a dataframe whose rows are the four records and whose columns are the previously defined vectors 'age', 'name' and 'gender'.
In [3]:
# Ex 1:
#Given a number n, generate a list containing the first 
# n integer numbers starting from 5.

#Class solution:
n = 30
lista = seq(5,n+5-1)
lista
  1. 5
  2. 6
  3. 7
  4. 8
  5. 9
  6. 10
  7. 11
  8. 12
  9. 13
  10. 14
  11. 15
  12. 16
  13. 17
  14. 18
  15. 19
  16. 20
  17. 21
  18. 22
  19. 23
  20. 24
  21. 25
  22. 26
  23. 27
  24. 28
  25. 29
  26. 30
  27. 31
  28. 32
  29. 33
  30. 34
In [1]:
# Class solution:
n = 4
# the last value is 5+n-1, so that exactly n numbers are generated
v<-seq(5,5+n-1)
v
  1. 5
  2. 6
  3. 7
  4. 8
In [6]:
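# Solution
n <-6
# the last value is 5+n-1, so that exactly n numbers are generated
seq(5,5+n-1)

# Alternative
n <-6
# note: list(5:(n+4)) is a list with a single element (the whole vector)
ls2 <- list(5:(n+4))
print(ls2)

ls3 <- c(5:(n+4))
print(ls3)
  1. 5
  2. 6
  3. 7
  4. 8
  5. 9
  6. 10
[[1]]
[1]  5  6  7  8  9 10

[1]  5  6  7  8  9 10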
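The upper bound does not have to be computed by hand. The sketch below relies on the length.out argument of seq(); the helper name first_n_from is hypothetical and is not part of the class solutions.

```r
# Hypothetical helper: the first n integers starting from `start`
first_n_from <- function(n, start = 5) {
    # length.out fixes the number of elements, so no off-by-one arithmetic
    seq(from = start, length.out = n)
}

v <- first_n_from(4)
print(v)           # 5 6 7 8
print(length(v))   # 4
```

Fixing the length instead of the end point makes the "first n numbers" requirement explicit in the code.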
In [4]:
# Ex 2: Given two numbers a and b, 
# compute the sum of the numbers between a and b.

a = 3
b = 7

# 'total' is used as accumulator name to avoid masking the built-in sum()
total = 0
for(e in a:b){
    total = total + e
}
print(total)
[1] 25
In [6]:
# Class solution:
a = 2
b = 5
#(a+1):(b-1)
sum(seq(a,b))
14
In [9]:
# Solution
a <- 5
b <- 10
sum(a:b)
45
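As a cross-check, the sum of consecutive integers also has a closed form (number of terms times the average of the first and last term). The sketch below assumes a <= b and that both are integers; the variable names count and closed_form are illustrative.

```r
# Sum of the consecutive integers a, a+1, ..., b (assumes a <= b)
a <- 5
b <- 10
count <- b - a + 1                 # how many numbers are in the range
closed_form <- (a + b) * count / 2
print(closed_form)                 # 45
print(closed_form == sum(a:b))     # TRUE
```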
In [5]:
#Ex 3: 
#Given a number n, compute the factorial of n. 
#The factorial of a number n can be defined as product 
# of all positive numbers less than or equal to n.

n = 3
fac = 1

while(n>0){
    fac = fac * n
    n = n -1
}
print(fac)
[1] 6
In [8]:
# Class solution:
n <- 5
a <- 1
f <- seq(1,n)
for(i in f){
    a <- a*i
}
print(a)
[1] 120
In [9]:
# Solution
n <- 5
f <- 1
while(n>1){
    f <- f * n
    n <- n - 1
}
f

# Alternative
n = 5
f = n
i = 0
while(i < n-1){
    i = i+1
    f = f*i
}
f

# Alternative
f = 1
for(i in (1:n)){
    f = f * i
}
f
120
120
120
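For comparison, base R already provides factorial() and prod(). The sketch below checks the loop results against them and shows the n = 0 case, which the loop versions above do not treat explicitly.

```r
n <- 5
print(factorial(n))        # 120
print(prod(seq_len(n)))    # 120

# seq_len(0) is empty and prod() of an empty vector is 1, so 0! = 1;
# a loop over 1:n would instead iterate over c(1, 0) when n is 0
print(prod(seq_len(0)))    # 1
```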
In [6]:
#Ex 4:
# Consider a vector x <- c(4,5,8,10,3,5,13), what is the value of x < 7 ?
x <- c(4,5,8,10,3,5,13)
x < 7
  1. TRUE
  2. TRUE
  3. FALSE
  4. FALSE
  5. TRUE
  6. TRUE
  7. FALSE
In [9]:
#Ex 4 (bis):
# print the values of x that are lower than 7
x <- c(4,5,8,10,3,5,13)
for(el in x){
    if(el<7){
        print(el)
    }
}

#Alternative solution
print(x[x<7])
[1] 4
[1] 5
[1] 3
[1] 5
[1] 4 5 3 5
In [19]:
# Solution 1
for(i in x){
    if(i<7)
        print(i)
}




# Solution 2
#test <- x<7
#x[test]

# ... or directly...
x[x<7]
[1] 4
[1] 5
[1] 3
[1] 5
  1. 4
  2. 5
  3. 3
  4. 5
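The logical vector produced by x < 7 can be reused in other ways besides indexing. A minimal sketch (the name mask is illustrative):

```r
x <- c(4,5,8,10,3,5,13)
mask <- x < 7          # logical vector, one TRUE/FALSE per element
print(which(mask))     # positions of the selected elements: 1 2 5 6
print(sum(mask))       # TRUE counts as 1, so this is how many are < 7: 4
print(x[mask])         # 4 5 3 5
```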
In [1]:
?seq
In [15]:
#Ex 5: 
#(define a code that is able to...) Create a list with all odd numbers in the even positions
# and vice-versa.

input <- x
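e <- input[input %% 2 == 0] # even values of input
o <- input[input %% 2 != 0] # odd values of input

l <- c()
i <- 1
j <- 1
# interleave the two vectors: even values go to odd positions,
# odd values go to even positions
while(i<=length(input)){
    l = append(l,e[j])
    l = append(l,o[j])
    i = i + 2 
    j = j + 1
}
# when e and o have different lengths, NA fills the missing values
print(l)
[1]  4  5  8  3 10  5 NA 13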
In [16]:
a <- seq(1,10)
ls <-seq(2,10,2)
a[ls] = ls-1
a[ls-1] = ls
print(a)
 [1]  2  1  4  3  6  5  8  7 10  9
In [9]:
# and the winner is...
ls <-seq(2,10)
ls
  1. 2
  2. 3
  3. 4
  4. 5
  5. 6
  6. 7
  7. 8
  8. 9
  9. 10
In [23]:
# Solution

ls <- list(2,2,2,2,2,2,2)
n <- length(ls)

cnt <- 2
repeat{
    ls[cnt] <- 1
    cnt = cnt + 2
    if(cnt>n)  # '>' (not '>=') so that the last position is also handled when n is even
        break
}
ls
  1. 2
  2. 1
  3. 2
  4. 1
  5. 2
  6. 1
  7. 2
In [2]:
# More challenging way...

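# adding 1 to each of the numbers 1..9 flips its parity, so even numbers
# end up in odd positions and vice-versa
x <- seq(1,9)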

i = 1
n = length(x)
while(i < n+1){
    x[i] = x[i] +1
    i = i +1
}

x
  1. 2
  2. 3
  3. 4
  4. 5
  5. 6
  6. 7
  7. 8
  8. 9
  9. 10
In [ ]:
# Ex5 (bis): exercise 5 using an array instead of a list
# and while instead of repeat
# Example using arrays
In [13]:
# Solution
# Step 1: create an array with all even numbers (equal to 2)
v <- array(2,dim=c(1,10))
n <- length(v)
# Step 2: change all elements with even index to an odd number (equal to 1)
cnt = 1   # R indices start at 1
while (cnt <= n){
    if(cnt %% 2 == 0)
        v[cnt] = 1
    cnt = cnt+1
}
v
2 1 2 1 2 1 2 1 2 1
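The same pattern can be obtained without an explicit loop. This is a sketch that assumes the placeholder values 1 (odd) and 2 (even) used in the solutions; pos and v are illustrative names.

```r
n <- 10
pos <- seq_len(n)                      # positions 1, 2, ..., n
# even positions receive an odd number (1), odd positions an even number (2)
v <- ifelse(pos %% 2 == 0, 1, 2)
print(v)   # 2 1 2 1 2 1 2 1 2 1
```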
In [3]:
# Ex 6:
# Create a list A with all random numbers. Then, create a list B containing 
#only the odd elements of A and a list C containing only the even elements of A.
set.seed(1303)
A = sample(1:10, 5, replace=TRUE)
B = c()
C = c()
for (element in A){
    if (element%%2 != 0)
        B = c(B,element)  # odd elements go to B
    else
        C = c(C,element)  # even elements go to C
}
print(A)
print(B)
print(C)
[1] 7 5 5 4 4
[1] 7 5 5
[1] 4 4
In [2]:
help(sample)
In [4]:
set.seed(1303)
In [5]:
el_vector = sample(1:100, 10)
odd = c()
even = c()

for (el in el_vector){
  #  if (el ==0)
  #      next
    if(el%%2==0)
        even = c(even,el)
    else
        odd = c(odd,el)
}
print(el_vector)
print(odd)
print(even)
 [1] 87 92  5 79 68 52 80 56 59 51
[1] 87  5 79 59 51
[1] 92 68 52 80 56
In [16]:
#Solution


set.seed(1303)

# take 8 samples of numbers between 1 and 100
A <- sample(1:100, size= 8)
A 

# Create two empty vectors B (odd elements) and C (even elements)
B <-c()
C <-c()
for (el in A)
    {
    if(el %% 2 != 0)
        B <- c(B,el)  # append an odd element to B
    else
        C <- c(C,el)  # append an even element to C
}
B
C
  1. 87
  2. 92
  3. 5
  4. 79
  5. 68
  6. 52
  7. 80
  8. 56
  1. 87
  2. 5
  3. 79
  1. 92
  2. 68
  3. 52
  4. 80
  5. 56
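The split can also be written with logical indexing instead of a loop. In this sketch the values of A are copied from the output above rather than sampled again, so the result does not depend on the random number generator.

```r
A <- c(87, 92, 5, 79, 68, 52, 80, 56)   # values copied from the output above
B <- A[A %% 2 != 0]   # odd elements
C <- A[A %% 2 == 0]   # even elements
print(B)   # 87 5 79
print(C)   # 92 68 52 80 56
```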
In [20]:
# Ex 7:
#Create a list A with all random numbers. 
#Then, create a list B containing all the elements of A multiplied by 4.
A <- sample(1:100, 5)
A
B <- A * 4
B
# the proof...
B / 4
  1. 89
  2. 39
  3. 86
  4. 68
  5. 15
  1. 356
  2. 156
  3. 344
  4. 272
  5. 60
  1. 89
  2. 39
  3. 86
  4. 68
  5. 15
In [8]:
# Alternative
A <- sample(15,5)
B <- c()
for(n in A){
    B <- c(B, n*4)
}
A
B
  1. 4
  2. 7
  3. 1
  4. 9
  5. 6
  1. 16
  2. 28
  3. 4
  4. 36
  5. 24
In [6]:
# Ex.8: Create a 10x10 matrix with all random numbers from 
# a normal distribution.

set.seed(1303)
A = matrix(rnorm(100), nrow=10)
A
-1.1439763145 -0.04868712 -0.9314953 1.57327374 0.26400733 1.0018519 0.4204779 -0.56054669 -1.93202340 0.02782077
1.3421293656 -0.69565622 0.8238676 0.01274651 0.63218681 0.2630014 0.3412769 -0.63876770 -0.96938200 1.58725296
2.1853904757 0.82891748 0.5233707 0.87264705 -1.33065099 -0.0283591 -1.1114696 -0.06500831 1.00148882 0.23574669
0.5363925179 0.20665286 0.7069214 0.42206619 0.02688882 -0.5562590 0.8437745 0.37530956 0.15220012 -0.21068373
0.0631929665 -0.23567451 0.4202043 -0.01881579 1.04063632 -0.1195611 -0.8552578 1.30692614 -0.04515586 -0.16983068
0.5022344825 -0.55631049 -0.2690522 2.61574897 1.31202380 -1.0362959 2.2478812 -0.61058086 -0.50296757 0.76280099
-0.0004167247 -0.36475436 -1.5103173 -0.69314017 -0.03000208 -0.6566380 -1.3721147 0.32282993 -0.25911284 0.43017948
0.5658198405 0.86235503 -0.6902125 -0.26632178 -0.25002571 0.5307149 0.9359950 1.75126495 1.01738122 1.37181976
-0.5725226890 -0.63077154 -0.1434720 -0.72063644 0.02341449 0.1123965 0.5497376 1.55928971 -1.72582568 1.57143594
-1.1102250073 0.31360213 -1.0135274 1.36773421 1.65987066 -2.0775613 0.5175874 0.64713105 0.93284077 0.13737399
In [22]:
A <- matrix(rnorm(100), 10,10)
A
-1.5481272 1.0167111 -1.30585268 -0.2645023 -1.0651632 -1.360266766 0.9375855 -1.8669580 0.35848822 -0.66946153
-0.7777504 -1.0112955 1.53001816 0.5916511 0.2780238 -0.001592186 -0.1939181 0.9794920 -1.33175829 0.63988977
-0.4118236 0.9098854 -1.02960329 1.6477247 -2.4520639 0.142920636 1.2563452 1.2245535 1.48044688 -0.07661812
2.0007570 -0.8695874 -0.12970788 -0.2423971 -0.4550836 0.994938293 -1.2786605 0.6858608 1.50476649 0.60181820
0.1153831 -0.3315129 0.07692826 0.5940438 -0.3731116 -0.502133958 -0.4975880 -1.2937657 1.12574236 -0.44782458
0.9690973 0.7614617 1.57270030 1.8851417 -1.9406222 0.067819890 0.8385246 0.8905632 0.46527189 -0.03616904
-0.8949222 -0.6389193 -0.90820565 -0.1219255 -0.4168890 1.533740879 -1.8350037 -1.5186605 -0.15489512 -0.51991673
-1.0363684 3.2529803 -0.42444579 0.9745292 -0.2027438 -1.953065549 0.7292949 -0.1709481 1.34509787 0.66239455
-0.6800149 -0.2836471 0.49320710 -0.2211568 1.8140893 1.135325218 -0.1518792 -0.1301062 0.09458603 1.18638097
0.7981890 -0.6164184 0.47851340 -0.5701903 -0.4021007 -0.164771888 -1.8895356 0.2688732 -0.87355060 -0.59446834
In [23]:
matr <- matrix(rnorm(100), nrow=10, ncol=10)
#print(matr)
# the proof..
hist(matr)
mean(matr)
sd(matr)
0.00552959257452103
0.922849296776224
In [8]:
# Ex 9: Create a 10x10 matrix whose generic 
# element M[i,j] has value equal to (i+1)*(j+1).
# Note: the initial data (here 1:100) is only a placeholder that the loop
# overwrites; passing NA as data would create a matrix of missing values

M = matrix(1:100, nrow=10)
for(i in 1:10)
    for(j in 1:10)
        M[i,j] = (i+1)*(j+1)

M
4 6 8 10 12 14 16 18 20 22
6 9 12 15 18 21 24 27 30 33
8 12 16 20 24 28 32 36 40 44
10 15 20 25 30 35 40 45 50 55
12 18 24 30 36 42 48 54 60 66
14 21 28 35 42 49 56 63 70 77
16 24 32 40 48 56 64 72 80 88
18 27 36 45 54 63 72 81 90 99
20 30 40 50 60 70 80 90 100 110
22 33 44 55 66 77 88 99 110 121
In [25]:
# NA without quotes: 'NA' would create a matrix of character strings
A = matrix(NA, 10, 10)
for(i in 1:nrow(A)){
    for(j in 1:ncol(A)){
        A[i,j] = (i+1)*(j+1)
    }
}
A
4 6 8 10 12 14 16 18 20 22
6 9 12 15 18 21 24 27 30 33
8 12 16 20 24 28 32 36 40 44
10 15 20 25 30 35 40 45 50 55
12 18 24 30 36 42 48 54 60 66
14 21 28 35 42 49 56 63 70 77
16 24 32 40 48 56 64 72 80 88
18 27 36 45 54 63 72 81 90 99
20 30 40 50 60 70 80 90 100 110
22 33 44 55 66 77 88 99 110 121
In [19]:
matr <- matrix(NA, ncol=10, nrow=10)
for(i in 1:10){
    for(j in 1:10){
        matr[i, j] <- (i+1)*(j+1)
    }
}

print(matr)
# M[i][j] has value equal to (i+1)*(j+1)
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    4    6    8   10   12   14   16   18   20    22
 [2,]    6    9   12   15   18   21   24   27   30    33
 [3,]    8   12   16   20   24   28   32   36   40    44
 [4,]   10   15   20   25   30   35   40   45   50    55
 [5,]   12   18   24   30   36   42   48   54   60    66
 [6,]   14   21   28   35   42   49   56   63   70    77
 [7,]   16   24   32   40   48   56   64   72   80    88
 [8,]   18   27   36   45   54   63   72   81   90    99
 [9,]   20   30   40   50   60   70   80   90  100   110
[10,]   22   33   44   55   66   77   88   99  110   121
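The nested loops can be replaced by outer(), which evaluates a function on every pair of row and column indices. A minimal sketch (idx is an illustrative name):

```r
idx <- 1:10
# outer() applies the function to every pair (i, j) of row and column indices
M <- outer(idx, idx, function(i, j) (i + 1) * (j + 1))
print(dim(M))      # 10 10
print(M[1, 1])     # 4
print(M[10, 10])   # 121
```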
In [11]:
# Ex 10: Create a 10x10 matrix with all random numbers from 
# a normal distribution. Then, sum all the elements of
# the principal diagonal.
set.seed(1303)
M = matrix(rnorm(100), nrow=10)
sum = 0
j=1
for(i in 1:nrow(M)){
    sum = sum + M[i,j]
    print(M[i,j])
    j = j+1
    }
M
print(sum)
[1] -1.143976
[1] -0.6956562
[1] 0.5233707
[1] 0.4220662
[1] 1.040636
[1] -1.036296
[1] -1.372115
[1] 1.751265
[1] -1.725826
[1] 0.137374
-1.1439763145 -0.04868712 -0.9314953 1.57327374 0.26400733 1.0018519 0.4204779 -0.56054669 -1.93202340 0.02782077
1.3421293656 -0.69565622 0.8238676 0.01274651 0.63218681 0.2630014 0.3412769 -0.63876770 -0.96938200 1.58725296
2.1853904757 0.82891748 0.5233707 0.87264705 -1.33065099 -0.0283591 -1.1114696 -0.06500831 1.00148882 0.23574669
0.5363925179 0.20665286 0.7069214 0.42206619 0.02688882 -0.5562590 0.8437745 0.37530956 0.15220012 -0.21068373
0.0631929665 -0.23567451 0.4202043 -0.01881579 1.04063632 -0.1195611 -0.8552578 1.30692614 -0.04515586 -0.16983068
0.5022344825 -0.55631049 -0.2690522 2.61574897 1.31202380 -1.0362959 2.2478812 -0.61058086 -0.50296757 0.76280099
-0.0004167247 -0.36475436 -1.5103173 -0.69314017 -0.03000208 -0.6566380 -1.3721147 0.32282993 -0.25911284 0.43017948
0.5658198405 0.86235503 -0.6902125 -0.26632178 -0.25002571 0.5307149 0.9359950 1.75126495 1.01738122 1.37181976
-0.5725226890 -0.63077154 -0.1434720 -0.72063644 0.02341449 0.1123965 0.5497376 1.55928971 -1.72582568 1.57143594
-1.1102250073 0.31360213 -1.0135274 1.36773421 1.65987066 -2.0775613 0.5175874 0.64713105 0.93284077 0.13737399
[1] -2.099157
In [31]:
s = 0
A = matrix(rnorm(100),10,10)
for(i in 1:nrow(A))
    s = s + A[i,i]
A
print(s)
-0.258617267 -1.7172822 -0.8630013 -0.2971652 1.10541592 -0.95506823 0.4832349 -2.1882831 1.0143844 -0.5065315
-2.108335365 2.1819453 0.9332658 -1.1895940 1.04885183 -0.03145739 0.3756295 -0.9728664 -2.2157859 -0.2721019
0.007759626 0.1321014 2.4351001 -0.2163290 2.58735203 0.88900138 0.1414757 -0.5041430 -0.3626504 -0.9510538
-0.306029120 -0.7361638 0.5276475 -0.8649131 1.53363716 -0.52836099 0.3773690 0.5302590 -0.7401288 0.4891696
-1.425938561 -0.4780344 2.4923428 0.9184542 -1.17132262 0.54600987 0.1043323 0.1411353 0.8761509 0.2238972
1.067482166 -1.3507296 0.3523072 -0.3896859 -0.50867268 1.03246802 -0.1616681 -0.4666324 2.2852532 -0.3817865
0.256045574 1.8018036 -0.5125448 -0.6485184 -1.15360673 -0.63886185 0.4921768 0.2236000 1.9702421 -1.4280158
1.991293466 0.4847396 -1.0016927 1.0974645 1.40871456 -1.34421959 -2.0205251 -1.2511900 -1.3175528 2.0398603
-0.046838113 0.8015971 2.0526951 -0.6407438 0.01140617 -0.84329450 -0.8478645 1.5580570 -0.2215265 -0.8788152
0.296724868 0.5614541 -0.7402828 -0.3252585 -1.82110712 -0.96097717 -1.3051328 -1.1677556 0.6513053 1.1648105
[1] 3.538931
In [22]:
n <- 3   # 3x3 here to keep the printed output short; use n <- 10 for the exercise
matr <- matrix(rnorm(n*n), nrow=n, ncol=n)
i <- 1
s <- 0
while(i<n+1){
    s <- s + matr[i,i]
    i <- i + 1
}
print(matr)
print(s)
           [,1]       [,2]       [,3]
[1,] -1.6451468 -1.0814768 -0.5716848
[2,] -0.2538116  0.7144439 -0.5474896
[3,] -0.3721546 -1.1830488  2.7866231
[1] 1.85592
In [23]:
# Alternative
matr <- matrix(rnorm(9), nrow=3, ncol=3)
s <- 0
for(i in 1:3){
    s <- s + matr[i,i]
}
print(matr)
print(s)
            [,1]       [,2]       [,3]
[1,] -1.59285574 0.02041516  1.1436415
[2,] -0.09731951 0.73023128 -1.7798489
[3,] -0.47395378 0.35777387  0.1252223
[1] -0.7374021
In [12]:
set.seed(1303)
M = matrix(rnorm(100),nrow=10)
S = sum(diag(M))
print(S)
[1] -2.099157
In [13]:
?diag
In [17]:
M = matrix(c(1,2,3,4,5,6,7,8,9), nrow=3)
M
diag(M)
1 4 7
2 5 8
3 6 9
  1. 1
  2. 5
  3. 9
In [18]:
# Ex 11: Generate 1000 random numbers of a Gaussian 
# with mean = 50 and std dev. = 3. 
# Then draw a plot to show the distribution of the data.


set.seed(1303)
data = rnorm(1000,mean=50,sd=3)
hist(data)
In [33]:
hist(rnorm(1000, 50, 3))
In [25]:
#?rnorm
data <- rnorm(1000, mean = 50, sd = 3)
hist(data)
In [28]:
set.seed(1303)
S = rnorm(1000,mean=50,sd=3)
h = hist(S,col="red")
xfit = seq(min(S),max(S),length=40)
yfit = dnorm(xfit,mean=mean(S),sd=sd(S))
# rescale the density to the count scale: bin width * number of samples
yfit = yfit*diff(h$mids[1:2])*length(S)
lines(xfit,yfit,col="blue",lwd=2)
In [34]:
set.seed(1303)
S = rnorm(1000,mean=50,sd=3)
hist(S,prob=TRUE)
lines(density(S))
In [45]:
# ... and the winner is... ???
X = rnorm(1000,mean=50,sd=3)
#hist(X,prob=TRUE)
x2 = seq(min(X),max(X),length=40)
fun= dnorm(x2,mean=mean(X),sd = sd(X))
hist(X,prob=TRUE,
     col="white",
    ylim = c(0,max(fun)),
    main = "Histogram")
lines(x2,fun,col=2,lwd=2)
In [46]:
?hist
In [27]:
# Ex 12: Write a while loop that 
# prints out 10 standard random normal numbers.

# Note: each call to rnorm(1) draws a single number; the draws are
# independent and all come from the same standard normal distribution
cnt <- 10  #(cnt from 10 to 0)
while(cnt > 0){
   n <- rnorm(1)
   print(n)
   cnt <- cnt - 1
}
[1] -0.9117047
[1] -0.5266903
[1] 0.01639592
[1] 1.176198
[1] -0.384313
[1] -0.2170401
[1] -0.01910471
[1] 0.2543211
[1] 0.3012063
[1] -1.005857
In [28]:
# Alternative (cnt from 0 to 10)
cnt <- 0
while(cnt<10){
    n <- rnorm(1)
    print(n)
    cnt <- cnt +1
}
[1] -0.6123804
[1] -0.732695
[1] 0.7235052
[1] -0.6096134
[1] 1.613027
[1] 1.192898
[1] 0.0176625
[1] -0.6392215
[1] -1.818033
[1] 1.077054
In [1]:
# Note: here the 10 numbers are drawn with a single call to rnorm()
data <- rnorm(10)
for (n in data){
    print(n)
}
[1] 0.4223153
[1] 0.003199343
[1] -1.827317
[1] 2.1697
[1] 0.6014317
[1] 0.4292321
[1] 0.09905764
[1] 0.4311144
[1] 1.427346
[1] -1.093919
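Drawing the numbers one at a time or with a single call consumes the same random stream. The sketch below illustrates this by resetting the seed before each approach; it assumes R's default random number generator settings, and the names one_call and one_by_one are illustrative.

```r
set.seed(1303)
one_call <- rnorm(10)          # 10 draws with a single call

set.seed(1303)                 # restart the generator from the same state
one_by_one <- numeric(10)
cnt <- 1
while (cnt <= 10) {
    one_by_one[cnt] <- rnorm(1)   # one draw per iteration
    cnt <- cnt + 1
}

# TRUE with the default generator settings (assumption stated above)
print(identical(one_call, one_by_one))
```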
In [2]:
# Ex 13: Write a for loop that iterates over the 
# numbers 1 to 7 and prints the cube of each number 
# using print().

for(n in 1:7){
    n <- n^3
    print(n)
}
[1] 1
[1] 8
[1] 27
[1] 64
[1] 125
[1] 216
[1] 343
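Since arithmetic in R is vectorized, the cubes can also be computed in one step and then printed. A small sketch (cubes and value are illustrative names):

```r
cubes <- (1:7)^3     # parentheses matter: 1:7^3 would be 1:343
print(cubes)         # 1 8 27 64 125 216 343
for (value in cubes) {
    print(value)
}
```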
In [3]:
# Ex 14: Create a vector 'age' with four integer example values of age, 
# create a vector 'name' of four strings with four name 
# examples, create a vector 'gender' with four elements with
# values equal to 'M' or 'F'. 
#Then, create a dataframe which rows are four records and 
# columns are 'age', 'name' and  'gender' contained 
# in the previous defined vectors.

Age <- c(22, 25, 18, 20)
Name <- c("Marco", "Fabio", "Sara", "Francesca")
Gender <- c("M", "M", "F", "F")

data.frame(Age, Name, Gender)
Age Name      Gender
22  Marco     M
25  Fabio     M
18  Sara      F
20  Francesca F
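Once the dataframe exists, its columns can be accessed by name and its rows filtered with logical conditions. The sketch below uses lowercase column names as in the exercise text; the variable name people is hypothetical.

```r
age <- c(22, 25, 18, 20)
name <- c("Marco", "Fabio", "Sara", "Francesca")
gender <- c("M", "M", "F", "F")

# named arguments set the column names explicitly;
# stringsAsFactors = FALSE keeps the strings as character vectors
# (this is already the default from R 4.0.0 onwards)
people <- data.frame(age = age, name = name, gender = gender,
                     stringsAsFactors = FALSE)

print(dim(people))                          # 4 3
print(people$name[people$gender == "F"])    # "Sara" "Francesca"
print(mean(people$age))                     # 21.25
```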