TD2: Gaussian Populations

Exercise 1

A bread manufacturing factory wants to establish control procedures with the primary goal of reducing overproduction issues, which result in losses for the factory. Here, we focus on the weights of baguettes produced by the factory, with a target weight of \(250\) grams. For a sample of \(n = 30\) baguettes, the empirical mean is \(\bar{X}_n = 256.3\) and the empirical variance is \(S^2_n = 82.1\). A priori, the factory reaches the target weight of \(250\) grams. We aim to test at the level of significance \(\alpha = 0.01\) whether there is an overproduction issue.

Is this a one-sided (unilatéral) or two-sided (bilatéral) testing problem?
Formulate the hypothesis testing problem (Define the parameters of the model, the corresponding distributions, what is known or unknown, and write \(H_0\) and \(H_1\)). Write the corresponding sets of parameters \(\Theta_0, \Theta_1\).
What is the test statistic to use, and what is its distribution under \(H_0\)?
Determine the rejection region.
Does the factory have an overproduction issue?

Exercise 2

We want to test the precision of a method for measuring blood alcohol concentration on a blood sample. Precision is defined as twice the standard deviation of the method (assumed to follow a Gaussian distribution). The reference sample is divided into \(6\) test tubes, which are subjected to laboratory analysis. The following blood alcohol concentrations were obtained in g/L: \[ 1.35, \; 1.26, \; 1.48, \; 1.32, \; 1.50, \; 1.44. \]

We aim to test the hypothesis that the precision is less than or equal to \(0.1 \, \text{g/L}\).

Formulate the hypothesis testing problem (Define the parameters of the model, the corresponding distributions, what is known or unknown, and write \(H_0\) and \(H_1\)). Write the corresponding sets of parameters \(\Theta_0, \Theta_1\).
Write the test statistic and give its distribution under \(H_0\).
Perform the test at a significance level of \(\alpha = 0.05\).
Show that the p-value of this test lies between \(0.001\) and \(0.01\).

Exercise 3

A candidate for the European elections wants to know if their popularity differs between men and women. A survey was conducted with \(250\) men, of whom \(42\%\) expressed support for the candidate, and \(250\) women, of whom \(51\%\) expressed support.

Formulate the hypothesis testing problem.
At a significance level of \(\alpha = 0.05\), can we say that these values indicate a statistically significant difference in popularity?
Give an approximation of the p-value in terms of \(F\) ,the CDF of \(\mathcal N(0,1)\) and read it on the graph below.

Exercise 4

We aim to compare the average daily durations (in hours) of home-to-work commutes in two departments, labeled \(A\) and \(B\). We randomly surveyed 26 people in \(A\) and 22 in \(B\). Let \(X_i\) of pepople \(i\) be the random variable representing the commute duration in department \(A\), and \(Y_j\) that in department \(B\) of people \(j\). We assume the samples obtained are i.i.d following a Gaussian distribution: \[ X_i \sim \mathcal{N}(\mu_A, \sigma_A) \quad \text{and} \quad Y_j \sim \mathcal{N}(\mu_B, \sigma_B). \]

Here is a summary of the data:

Department A	Department B
\(n_A= 26\)	\(n_B=22\)
\(\sum x_i = 535\)	\(\sum y_j = 395\)
\(\sum x_i^2 = 11400\)	\(\sum y_j^2 = 7900\)

Formulate the hypothesis testing problem
Test the equality of variances at a significance level of \(\alpha = 0.1\)
Test the equality of mean commute times between the two departments at a significance level of \(\alpha = 0.05\), and conclude
Give a Gaussian approximation of the test statistic using the CLT and the LLN, and approximate the p-value using the graph of the cdf of \(\mathcal N(0,1)\) given in the previous exercise