finding the z-score for each value, make a so-called QQ-plot. A QQ-plot is a graph with the
normal score (or z-score, or standardized score; call it whichever you prefer) on the y-axis and
the observation corresponding to each z-score on the x-axis. We have talked a bit about it. If you
are unsure, please use Google. The lecture notes also contain such graphs. Does this graph suggest
that your dataset is normally distributed or not? Find the probabilities corresponding to all 100
z-scores you have calculated from your_data. Use any software you want. There will certainly be
many repeated values.
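As a rough illustration of the steps above, here is a minimal Python sketch that uses only the standard library. The ten values in your_data are made up (the real dataset has 100 values), and the sample standard deviation is assumed; swap in pstdev if your course uses the population formula. For the QQ-plot, the sketch reads "normal score" as the standard normal quantile assigned to each observation by its rank, with plotting position (i - 0.5) / n. That reading and that plotting position are assumptions, not something the assignment specifies.

```python
from statistics import NormalDist, mean, stdev

# Made-up stand-in for your_data; the real dataset has 100 values.
your_data = [2, 4, 4, 6, 4, 5, 7, 3, 5, 6]

mu = mean(your_data)
sigma = stdev(your_data)  # sample standard deviation (divides by n - 1)

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# z-score of each observation and the probability Prob(Z < z) that goes with it.
z_scores = [(x - mu) / sigma for x in your_data]
probabilities = [std_normal.cdf(z) for z in z_scores]

# Normal scores for the QQ-plot: the i-th smallest observation is paired with
# the standard normal quantile at (i - 0.5) / n (an assumed plotting position).
n = len(your_data)
sorted_data = sorted(your_data)
normal_scores = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

# Points to plot: observation on the x-axis, normal score on the y-axis.
qq_points = list(zip(sorted_data, normal_scores))

for x, z, p in zip(your_data, z_scores, probabilities):
    print(f"value={x}  z={z:+.3f}  Prob(Z<z)={p:.4f}")
```

If the points in qq_points fall close to a straight line, that is usually taken as a sign that the data are roughly normal. Any plotting tool can draw them.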
Question 2. What percentage of the observations lie i) within 1 standard deviation on either
side of the mean; ii) within 1.5 standard deviations; iii) within 2.5 standard deviations;
iv) within 3 standard deviations of the mean? Do the results comply with Chebyshev's rule? Why
do z-score tables not list probabilities for z-scores greater than 3 (sometimes 3.4)?
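One possible way to check Question 2 numerically is sketched below in Python (standard library only). The values in your_data are again made up, the helper name percent_within is hypothetical, and the sample standard deviation is assumed. Chebyshev's rule is taken in its usual form: at least 1 - 1/k^2 of the observations lie within k standard deviations of the mean.

```python
from statistics import NormalDist, mean, stdev

# Made-up stand-in for your_data; the real dataset has 100 values.
your_data = [2, 4, 4, 6, 4, 5, 7, 3, 5, 6]

mu = mean(your_data)
sigma = stdev(your_data)  # sample standard deviation (an assumption)


def percent_within(data, k):
    """Percentage of observations within k standard deviations of the mean."""
    inside = sum(1 for x in data if abs(x - mu) <= k * sigma)
    return 100 * inside / len(data)


for k in (1, 1.5, 2.5, 3):
    observed = percent_within(your_data, k)
    # Chebyshev lower bound, in percent; it carries no information for k <= 1.
    bound = max(0.0, 1 - 1 / k**2) * 100
    print(f"k={k}: observed {observed:.1f}%, Chebyshev bound {bound:.1f}%")

# Probability left in the upper tail of the standard normal beyond z = 3 and z = 3.4.
tail_3 = 1 - NormalDist().cdf(3)
tail_34 = 1 - NormalDist().cdf(3.4)
print(f"Prob(Z > 3) = {tail_3:.5f}, Prob(Z > 3.4) = {tail_34:.5f}")
```

The last two lines are only a hint for the final part of the question: they show how little probability is left beyond those z-scores.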
Question 3. Create a new dataset from your_data by dividing each value in the sequence by 10
and call it new_dataset. For example, the values in my dataset are
2, 4, 4, 6, 4, ... and so on. My corresponding new_dataset is therefore as follows: 0.2,
0.4, 0.4, 0.6, 0.4, ... and so on. As you can see, all values in new_dataset will be
between 0 and 1. Let us assume that they are probabilities. Now, find the z-scores corresponding
to all 100 probabilities specified in new_dataset (in other words, find the value such that
Prob(z < value_you_have_to_find) = probability in new_dataset). Plot the probabilities on the
y-axis and the z-scores on the x-axis. You can ignore the repeated values of z-scores. You should
have 10 unique z-scores; the rest will be repeated.
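Going from a probability back to a z-score is the inverse of a table lookup. A minimal Python sketch is given below; it uses the standard library's NormalDist.inv_cdf, the five values are the example ones quoted above, and the name z_for_p is hypothetical. Note that inv_cdf only accepts probabilities strictly between 0 and 1, so a value of exactly 0 or 1 in new_dataset is skipped here (an assumption about how to treat such values).

```python
from statistics import NormalDist

# First few example values quoted in the question; the real dataset has 100 values.
your_data = [2, 4, 4, 6, 4]
new_dataset = [x / 10 for x in your_data]

std_normal = NormalDist()

# Map each unique probability p to the z with Prob(Z < z) = p.
# inv_cdf raises an error for p <= 0 or p >= 1, so those are left out.
z_for_p = {
    p: std_normal.inv_cdf(p)
    for p in sorted(set(new_dataset))
    if 0 < p < 1
}

for p, z in z_for_p.items():
    print(f"probability={p:.1f}  z={z:+.4f}")
```

Plotting the keys of z_for_p on the y-axis against its values on the x-axis gives the graph the question asks for.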
Question 4. Assume that your_data is in fact binomially distributed with n=10 and with
probability of success equal to 0.5. Using this binomial distribution, find the probability that
the number of successes in 10 trials is equal to i) the first value in your_data; ii) the first
value in your_data plus 1; iii) the first value in your_data minus 1. Now, approximate those
probabilities using the normal distribution and show your work explicitly, i.e. how you got the
results. Approximate the same probabilities with the Poisson distribution. What are the
conditions for applying those approximations?
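A minimal Python sketch of the three calculations follows, using only the standard library. The value first_value = 2 is made up, and the function names are hypothetical. The normal approximation is assumed to use mean np, standard deviation sqrt(np(1-p)), and a continuity correction of 0.5 on each side; the Poisson approximation is assumed to use rate np. These are the usual textbook choices, but check them against the lecture notes.

```python
from math import comb, exp, factorial, sqrt
from statistics import NormalDist

n, p = 10, 0.5
first_value = 2  # made-up first value of your_data


def binom_pmf(k, n, p):
    """Exact binomial probability of k successes in n trials."""
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_approx(k, n, p):
    """Normal approximation of Prob(X = k) with a continuity correction."""
    approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return approx.cdf(k + 0.5) - approx.cdf(k - 0.5)


def poisson_approx(k, n, p):
    """Poisson approximation of Prob(X = k) with rate n * p."""
    if k < 0:
        return 0.0
    lam = n * p
    return exp(-lam) * lam**k / factorial(k)


for k in (first_value, first_value + 1, first_value - 1):
    print(
        f"k={k}: binomial={binom_pmf(k, n, p):.4f}  "
        f"normal={normal_approx(k, n, p):.4f}  "
        f"poisson={poisson_approx(k, n, p):.4f}"
    )
```

Comparing the three columns for each k is one way to see how well each approximation does for n=10 and p=0.5, which feeds directly into the question about the conditions.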