ST236 Statistical Inference Assignment Sample NUI Galway Ireland

ST236 Statistical Inference is a graduate-level course that covers the theory and methods of statistical inference. The course will focus on the concepts of estimation and hypothesis testing, as well as their applications to real-world data sets. Students will also learn about the latest developments in statistical inference, including Bayesian methods.

This course is a requirement for the Masters of Science in Data Analytics program at the National University of Ireland, Galway. The course is offered every semester, and students must complete it within two years of starting the program.

Assignment Task 1: Construct a full sampling distribution for a simple, small sample probability model and calculate the properties of standard estimators such as the sample mean and variance.

When we talk about a sampling distribution, we’re referring to the distribution of a statistic that we would expect to see if we took multiple samples from a population. For example, if we wanted to know the mean height of all adults in the United States, we couldn’t measure every single person. Instead, we would take a sample of people and calculate the mean height from that sample.

If we took multiple samples and calculated the mean height from each sample, we would expect to see a distribution of values. This distribution is known as the sampling distribution.

The sampling distribution will tell us things like what the average value is likely to be, how much variation there is likely to be, and so on. This information is important because it allows us to make inferences about the population from our sample.
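For a small enough model, the full sampling distribution can be written out by listing every possible sample. The sketch below assumes a hypothetical population in which a single draw takes the values 1, 2 or 3 with equal probability, and a sample of size 2. Neither choice comes from the assignment. It builds the exact distributions of the sample mean and of the sample variance (divisor n - 1) and then computes their properties.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Hypothetical model (an assumption): X is 1, 2 or 3, each with probability 1/3.
values = [1, 2, 3]
prob = {v: Fraction(1, 3) for v in values}
n = 2  # sample size, kept small so every sample can be listed

pop_mean = sum(v * p for v, p in prob.items())
pop_var = sum((v - pop_mean) ** 2 * p for v, p in prob.items())

mean_dist = defaultdict(Fraction)  # sampling distribution of the sample mean
var_dist = defaultdict(Fraction)   # sampling distribution of the sample variance

for sample in product(values, repeat=n):
    # Probability of this ordered sample under independent draws.
    p = Fraction(1)
    for x in sample:
        p *= prob[x]
    xbar = Fraction(sum(sample), n)
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)
    mean_dist[xbar] += p
    var_dist[s2] += p

# Properties of the estimators, computed from their exact distributions.
e_mean = sum(m * p for m, p in mean_dist.items())
var_mean = sum((m - e_mean) ** 2 * p for m, p in mean_dist.items())
e_s2 = sum(s * p for s, p in var_dist.items())

for m in sorted(mean_dist):
    print(f"P(sample mean = {m}) = {mean_dist[m]}")
print("E[sample mean] =", e_mean, " Var[sample mean] =", var_mean)
print("E[sample variance] =", e_s2, " population variance =", pop_var)
```

Exact fractions are used instead of floating point so that unbiasedness can be checked as an equality rather than an approximation.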

Assignment Task 2: Derive a likelihood function for random samples from a probability model under more complex sampling schemes, e.g. mixed populations and censoring.

The likelihood function for a random sample can be quite complex, depending on the nature of the population and the sampling scheme. For independent observations, the likelihood is a product of contributions, one for each observation, and the form of each contribution depends on how that observation arose. For example, if the population is a mixture of two different types of objects, then each observation contributes a weighted sum of the two component densities, with the mixing proportions as weights. And if some observations are censored (i.e. only known to lie above or below a limit rather than observed exactly), then each censored observation contributes the probability of falling in that range, for example the survival function evaluated at the censoring point, instead of a density.
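As a sketch, the two cases can be written out as follows. The notation is an assumption made for illustration: f_1 and f_2 are the component densities, π is the mixing proportion, c is a fixed right-censoring point, δ_i equals 1 when x_i is observed exactly and 0 when it is censored, and F is the cumulative distribution function of the model.

```latex
% Two-component mixture: each observation contributes a weighted sum of densities
L(\pi, \theta_1, \theta_2)
  = \prod_{i=1}^{n} \bigl[ \pi f_1(x_i; \theta_1) + (1 - \pi) f_2(x_i; \theta_2) \bigr]

% Right censoring at c: density for exact observations, survival function for censored ones
L(\theta)
  = \prod_{i=1}^{n} f(x_i; \theta)^{\delta_i} \, \bigl[ 1 - F(c; \theta) \bigr]^{1 - \delta_i}
```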

Assignment Task 3: Calculate simple unbiased estimators and calculate optimal combinations of estimators.

A simple unbiased estimator is an estimator that is both unbiased and easy to compute. The most common examples of simple unbiased estimators are the sample mean and the sample variance (with divisor n - 1). However, many other estimators fit this description.

To find a simple unbiased estimator, start by choosing a function of the sample that you want to use to estimate some population parameter. This function can be anything, but it should be easy to compute and must not depend on any unknown parameters. Once you have selected your function, calculate its expected value under the model. The bias is the difference between this expected value and the true value of the population parameter. If the bias is exactly zero for every possible value of the parameter, then your estimator is unbiased. When several unbiased estimators of the same parameter are available, any weighted average whose weights sum to one is also unbiased, and the optimal combination is the one whose weights minimize the variance.
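The optimal combination named in the task can be illustrated with a minimal sketch. It assumes two independent unbiased estimators of the same parameter with known variances v1 and v2. The combination w*T1 + (1 - w)*T2 is unbiased for any w, and its variance w^2*v1 + (1 - w)^2*v2 is smallest at w = v2 / (v1 + v2). The function names and the variances 4 and 1 are hypothetical.

```python
from fractions import Fraction


def optimal_weight(v1, v2):
    """Weight on the first estimator that minimizes the combined variance.

    Assumes the two estimators are independent and both unbiased.
    """
    return v2 / (v1 + v2)


def combined_variance(w, v1, v2):
    """Variance of w*T1 + (1 - w)*T2 for independent T1 and T2."""
    return w ** 2 * v1 + (1 - w) ** 2 * v2


# Hypothetical variances: the second estimator is four times as precise.
v1, v2 = Fraction(4), Fraction(1)
w = optimal_weight(v1, v2)
best = combined_variance(w, v1, v2)

print("optimal weight on T1:", w)
print("variance of the optimal combination:", best)
print("variance of the equal-weight average:", combined_variance(Fraction(1, 2), v1, v2))
```

The weights are inversely proportional to the variances, so the more precise estimator receives more weight, and the combined variance is smaller than either individual variance.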

Assignment Task 4: Find maximum likelihood estimators by solving the score equation and obtain an estimate of precision based on observed and expected information.

Maximum likelihood estimators can be found by solving the score equation. In particular, the maximum likelihood estimator is the value that solves the following equation:

score(θ) = ∂ log L(θ)/∂θ = 0

where L(θ) is the likelihood function and θ is the unknown population parameter. 

Once you have found the maximum likelihood estimator, you can then obtain an estimate of its precision from the information. The observed information is the negative second derivative of the log-likelihood, evaluated at the maximum likelihood estimate. The expected (Fisher) information is the expected value of that quantity under the model. Both measure the amount of information that the data contain about the unknown population parameter, and for independent observations the Fisher information grows in proportion to the sample size. The inverse of the information gives an approximate variance for your estimator in large samples, and the square root of that inverse gives an estimate of its standard error.
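A minimal sketch of these steps, assuming an exponential model with rate λ and a small made-up data set (both are assumptions, not part of the assignment). Here log L(λ) = n log λ - λ Σx, the score is n/λ - Σx, and the score equation has the closed-form solution n/Σx. For this model the observed and expected information have the same form, n/λ^2.

```python
import math

# Hypothetical waiting times, assumed to be i.i.d. exponential with rate lam.
data = [0.8, 1.9, 0.4, 2.6, 1.3]


def log_lik(lam, xs):
    """Log-likelihood of an exponential sample with rate lam."""
    return len(xs) * math.log(lam) - lam * sum(xs)


def score(lam, xs):
    """First derivative of the log-likelihood with respect to lam."""
    return len(xs) / lam - sum(xs)


def observed_info(lam, xs):
    """Negative second derivative of the log-likelihood."""
    return len(xs) / lam ** 2


# Solving score(lam) = 0 gives lam_hat = n / sum(x).
lam_hat = len(data) / sum(data)

# Standard error: square root of the inverse information at the estimate.
se = 1 / math.sqrt(observed_info(lam_hat, data))

print("MLE of the rate:", lam_hat)
print("score at the MLE:", score(lam_hat, data))
print("approximate standard error:", se)
```

In models where the score equation has no closed-form solution, the same functions could be passed to a numerical root finder instead.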

Assignment Task 5: Find confidence intervals for simple problems using pivotal quantities.

A pivotal quantity is a function of the data and the unknown parameter whose distribution does not depend on any unknown parameters. The most common example is the z-score for a sample mean from a normal population with known standard deviation. To find the z-score, take the sample mean and subtract the population mean from it. Then, divide this difference by the standard error, which is the standard deviation of the population divided by the square root of the sample size. The resulting number is the z-score, and it has a standard normal distribution whatever the value of the population mean.

Once you have the pivotal quantity, you can then use a table of critical values to determine the confidence interval. For example, if you wanted to find a 95% confidence interval, you would look up the 97.5th percentile of the standard normal distribution on a z-table, which is 1.96. This tells you that the z-score lies between -1.96 and 1.96 with probability 0.95. Rearranging this statement for the population mean gives the confidence interval (sample mean-1.96*standard error, sample mean+1.96*standard error).
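As a sketch of the calculation, assuming a normal population with known standard deviation, the interval can be computed with Python's standard library. The function name z_interval and the four data values are hypothetical.

```python
import math
from statistics import NormalDist, mean


def z_interval(xs, sigma, level=0.95):
    """Confidence interval for a normal mean when sigma is known.

    Uses the pivot Z = (xbar - mu) / (sigma / sqrt(n)), which is standard normal.
    """
    n = len(xs)
    xbar = mean(xs)
    # Critical value: upper (1 - level) / 2 point of the standard normal.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical sample with an assumed known standard deviation of 10.
lo, hi = z_interval([98, 102, 101, 99], sigma=10)
print(f"95% confidence interval: ({lo:.2f}, {hi:.2f})")
```

When the standard deviation has to be estimated from the sample, the corresponding pivot follows a t distribution and the critical value changes accordingly.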

Assignment Task 6: Calculate the size and power function for a given test procedure.

A test procedure is a rule that uses the sample to decide whether to reject the null hypothesis. The power function gives the probability that the test rejects the null hypothesis, as a function of the true parameter value. The size of the test is the largest value of the power function over the parameter values allowed by the null hypothesis, that is, the maximum probability of a Type I error. To calculate the size and power function for a z-test about a population mean, you will need to know the following information:

  • The standard deviation of the population
  • The mean of the population under the null hypothesis
  • The sample size

Once you have this information, you can use the following formulas for a two-sided z-test that rejects the null hypothesis when the absolute value of the z-score exceeds the critical value z:

Power Function(μ) = 1 – Φ(z – δ) + Φ(–z – δ), with δ = (μ – Null Mean) * sqrt(Sample Size) / Standard Deviation

Size = Power Function(Null Mean) = 2 * (1 – Φ(z))

where z is the critical value from a z-table, Φ is the standard normal cumulative distribution function and μ is the true mean of the population.

For example, let’s say that the null hypothesis is a population mean of 100 and the standard deviation is 10. You take a sample of 50 people from this population. The critical value for a two-sided test at the 5% level is 1.96. Using the formulas above, we can calculate the size, and the power at a true mean of 103, as follows:

Size = 2 * (1 – Φ(1.96)) = 0.05

Power Function(103) = 1 – Φ(1.96 – 2.12) + Φ(–1.96 – 2.12) ≈ 0.56, since δ = (103 – 100) * sqrt(50) / 10 ≈ 2.12

This means that the size of the test is 0.05 and its power to detect a true mean of 103 is about 0.56.
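A minimal sketch of a power function, assuming a two-sided z-test of a null mean of 100 with known standard deviation 10 and sample size 50 (the numbers from the example above). The function name power_two_sided and the alternative means 101 to 105 are illustrative assumptions.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_two_sided(mu, mu0, sigma, n, alpha=0.05):
    """P(reject H0: mean = mu0) when the true mean is mu.

    The test rejects when |xbar - mu0| / (sigma / sqrt(n)) exceeds the
    upper alpha/2 critical value of the standard normal distribution.
    """
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    delta = (mu - mu0) * math.sqrt(n) / sigma  # standardized shift of the mean
    return (1 - STD_NORMAL.cdf(z - delta)) + STD_NORMAL.cdf(-z - delta)


# Size: the power function evaluated at the null value.
size = power_two_sided(100, mu0=100, sigma=10, n=50)
print("size:", round(size, 4))

# Power at a few hypothetical true means.
for mu in (101, 103, 105):
    print("power at", mu, "=", round(power_two_sided(mu, 100, 10, 50), 3))
```

Evaluating the function on a grid of true means traces out the whole power curve, which rises from the size at the null value towards 1 as the true mean moves away from it.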

Assignment Task 7: Obtain a most powerful test of two simple hypotheses using the Neyman Pearson lemma and extend this to a uniformly most powerful test of one-sided alternatives.

The Neyman-Pearson lemma is a powerful tool for testing hypotheses. It states that, given two simple hypotheses, the most powerful test of a given size is the one that rejects the null hypothesis when the likelihood ratio exceeds a critical value. This test is more powerful than any other test with the same Type I error rate (false positive rate). The lemma can be extended to a uniformly most powerful test of a one-sided alternative hypothesis when the same rejection region is obtained for every parameter value in the alternative, as happens when the likelihood ratio is a monotone function of a single statistic. This means that, if we are testing whether a population mean is greater than some value, we can find one test that has the greatest power against every mean above that value.

To find the most powerful test of two simple hypotheses, we first need to calculate the likelihood ratio. To do this, we divide the likelihood of the data under the alternative hypothesis by the likelihood under the null hypothesis. A large ratio means that the data are better explained by the alternative than by the null.

Once we have calculated the likelihood ratio, we can then use it to construct the test. It is often convenient to take the log of the likelihood ratio and compare it to a critical value, chosen so that the probability of rejecting a true null hypothesis equals the required size. If the log of the likelihood ratio is greater than the critical value, then we reject the null hypothesis. Otherwise, we fail to reject the null hypothesis.
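As a sketch, consider independent normal observations with known standard deviation and the simple hypotheses mean = mu0 against mean = mu1, with mu1 greater than mu0. The log likelihood ratio is then an increasing function of the sample mean, so comparing it to a critical value is the same as rejecting when the sample mean is large. The function names and the data are hypothetical.

```python
import math
from statistics import NormalDist, mean


def log_likelihood_ratio(xs, mu0, mu1, sigma):
    """log[L(mu1) / L(mu0)] for i.i.d. normal data with known sigma."""
    return sum((x - mu0) ** 2 - (x - mu1) ** 2 for x in xs) / (2 * sigma ** 2)


def np_reject(xs, mu0, sigma, alpha=0.05):
    """Most powerful size-alpha test of mean = mu0 against any mean mu1 > mu0.

    The rejection rule depends only on the sample mean, not on mu1.
    """
    z = NormalDist().inv_cdf(1 - alpha)
    critical_mean = mu0 + z * sigma / math.sqrt(len(xs))
    return mean(xs) > critical_mean


# Hypothetical data and hypotheses.
sample = [103, 105, 99, 104]
print("log likelihood ratio:", log_likelihood_ratio(sample, mu0=100, mu1=102, sigma=2))
print("reject H0 at the 5% level:", np_reject(sample, mu0=100, sigma=2))
```

Because the critical value for the sample mean does not involve mu1, the same test is most powerful against every alternative mean above mu0, which is what makes it uniformly most powerful for the one-sided alternative.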

Assignment Task 8: Use the likelihood ratio procedure to derive a test of nested hypotheses for some simple statistical models.

The likelihood ratio procedure is a powerful tool for testing nested hypotheses. For example, consider the following two models:

Model 1: y = β0 + β1*x1 + ε

Model 2: y = β0 + β1*x1 + β2*x2 + ε

The likelihood ratio test can be used to test the hypothesis that β2 = 0. The test statistic is given by

χ2 = –2 log(L1 / L2) = 2 * (log L2 – log L1)

where L1 and L2 are the maximized likelihoods of Model 1 and Model 2, χ2 is the chi-squared statistic, and the degrees of freedom (df) are equal to the difference in the number of parameters estimated in the two models (here 3 – 2 = 1 regression coefficient). If the null hypothesis is true, then in large samples this statistic has approximately a chi-squared distribution with df degrees of freedom.
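A minimal sketch of this test for the two regression models, assuming normal errors with the variance estimated by maximum likelihood, in which case the statistic reduces to n * log(RSS1 / RSS2), where RSS is the residual sum of squares of each fitted model. The residual sum of squares of Model 2 is obtained here by regressing the Model 1 residuals on the part of x2 not explained by x1 (the Frisch-Waugh result), which avoids a matrix solve. The function names and the six data points are hypothetical, and with so few points the chi-squared approximation is only illustrative.

```python
import math


def simple_ols_residuals(x, y):
    """Residuals from regressing y on an intercept and a single predictor x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


def lrt_extra_predictor(x1, x2, y):
    """Likelihood ratio test of beta2 = 0 (Model 1 nested in Model 2)."""
    n = len(y)
    e_y = simple_ols_residuals(x1, y)   # Model 1 residuals
    e_2 = simple_ols_residuals(x1, x2)  # part of x2 not explained by x1
    rss1 = sum(e * e for e in e_y)
    # Adding x2 reduces the RSS by the squared projection of e_y onto e_2.
    rss2 = rss1 - sum(a * b for a, b in zip(e_y, e_2)) ** 2 / sum(b * b for b in e_2)
    stat = n * math.log(rss1 / rss2)
    # Upper tail of chi-squared with 1 df: P(|Z| > sqrt(stat)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical data in which y depends on both predictors.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
noise = [0.1, -0.2, 0.05, 0.1, -0.1, 0.05]
y = [1 + 2 * a + 3 * b + e for a, b, e in zip(x1, x2, noise)]

stat, p_value = lrt_extra_predictor(x1, x2, y)
print("test statistic:", stat)
print("approximate p-value:", p_value)
```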

