Analysis of Variance and Chi-Squared Analysis

The limitations that may exist if this database is used as a sample of all North Carolina births are as follows:

The data is not an exhaustive set of all North Carolina births.

Many variables in the database have missing values.

Hence, these factors could affect our statistical analysis if this data is used as a sample of all North Carolina births.

Analysis of Variance

We are going to perform an analysis of variance at the 0.05 significance level for the following question:

Question: Do the mother's smoking habits and marital status during pregnancy have an effect on the child’s birthweight?

We have the following populations:

Let X be the population of birthweight of children whose mother was a non-smoker during pregnancy.

Let Y be the population of birthweight of children whose mother was a smoker during pregnancy.

Let Z be the population of birthweight of children whose mother was married during pregnancy.

Let P be the population of birthweight of children whose mother was unmarried during pregnancy.

The Null and Alternate hypotheses are as follows:

Null hypothesis: The means of the populations of birthweights of children are equal.

Alternate hypothesis: At least one of the population means is different.

After doing the calculations, we arrive at the following results:

Populations        Count   Sum        Average    Variance
X (non-smoker)     878     6262.666   7.132877   2.356056
Y (smoker)         131     884.9056   6.755005   2.259016
Z (married)        618     4497.282   7.277155   2.053943
P (not married)    391     2650.641   6.779134   2.679846

Source of Variation   SS         df     MS         F          P-value    F crit
Between Groups        75.67428   3      25.22476   10.87303   4.38E-07   2.609321
Within Groups         4672.356   2014   2.319938
Total                 4748.03    2017

The p-value is 4.38E-07, which is far below our alpha value of 0.05. Hence we reject the null hypothesis at the 0.05 significance level and conclude that at least one of the population means is different. The F statistic (10.87303) also exceeds the critical value (2.609321), which agrees with this conclusion.
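The sums of squares and the F statistic in the ANOVA table can be rebuilt from the per-group summary figures alone. The sketch below is one way to do that in Python. It assumes the reported variances are sample variances (n - 1 denominator), and the names `groups` and `one_way_anova` are illustrative rather than taken from the original analysis. The critical value is copied from the table, not recomputed.

```python
# Summary statistics copied from the table above: (count, average, variance).
# Assumption: "Variance" is the sample variance (n - 1 denominator).
groups = {
    "X (non-smoker)": (878, 7.132877, 2.356056),
    "Y (smoker)": (131, 6.755005, 2.259016),
    "Z (married)": (618, 7.277155, 2.053943),
    "P (not married)": (391, 6.779134, 2.679846),
}


def one_way_anova(summary):
    """Return (ss_between, ss_within, df_between, df_within, f_stat)."""
    n_total = sum(n for n, _, _ in summary.values())
    # Grand mean is the count-weighted average of the group means.
    grand_mean = sum(n * mean for n, mean, _ in summary.values()) / n_total
    # Between-group variation: spread of group means around the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in summary.values())
    # Within-group variation: recovered from each group's sample variance.
    ss_within = sum((n - 1) * var for n, _, var in summary.values())
    df_between = len(summary) - 1
    df_within = n_total - len(summary)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat


ss_b, ss_w, df_b, df_w, f_stat = one_way_anova(groups)

F_CRIT = 2.609321  # critical value as reported in the ANOVA table
reject_null = f_stat > F_CRIT

print(f"SS between = {ss_b:.4f}, SS within = {ss_w:.3f}")
print(f"df = ({df_b}, {df_w}), F = {f_stat:.4f}, reject H0: {reject_null}")
```

Working from summary statistics keeps the check independent of the raw database, so it only confirms that the table is internally consistent.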

Chi-square test

Using the chi-square test at the 0.05 significance level, we are going to analyze the following question:

Question: Is the smoking habit of a woman during pregnancy associated with the ethnicity of the woman?

We have the following data:

Smoking habits      Hypothesised proportion   Observed   Expected
Non-white mothers   0.5                       30         63
White mothers       0.5                       96         63

The Null and Alternate hypotheses are as follows:

Null Hypothesis: The smoking habits aren’t associated with the ethnicity of the woman.

Alternate Hypothesis: The smoking habits are associated with the ethnicity of the woman.

After performing the calculations with the information we have, we arrive at the following results:

p-value is 4.10893E-09

The chi squared test statistic is 34.57142858

The p-value is far below 0.05, so we reject the null hypothesis at the 0.05 significance level and conclude that the smoking habits are associated with the ethnicity of the woman. The chi-squared test statistic, 34.57142858, is also clearly greater than the critical value for the test, 3.841458821, which agrees with the conclusion of rejecting the null hypothesis.
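The test statistic and p-value can be reproduced from the observed counts and the hypothesised proportions. The sketch below is a minimal Python version that treats the table as a comparison of observed counts against the hypothesised 0.5/0.5 split, which is how the expected counts of 63 arise. The variable names are illustrative. The closed-form p-value used here holds only for one degree of freedom, and the critical value is copied from the text rather than recomputed.

```python
import math

# Observed counts and hypothesised proportions copied from the table above.
observed = {"Non-white mothers": 30, "White mothers": 96}
hypothesised = {"Non-white mothers": 0.5, "White mothers": 0.5}

# Expected count for each group = total observed * hypothesised proportion.
total = sum(observed.values())
expected = {group: total * p for group, p in hypothesised.items()}

# Chi-squared statistic: sum of (observed - expected)^2 / expected.
chi2 = sum(
    (observed[group] - expected[group]) ** 2 / expected[group]
    for group in observed
)
df = len(observed) - 1

# For df = 1 only, the upper-tail probability of the chi-squared
# distribution equals erfc(sqrt(x / 2)).
assert df == 1, "closed-form p-value below is valid for df = 1 only"
p_value = math.erfc(math.sqrt(chi2 / 2))

CRITICAL_VALUE = 3.841458821  # critical value quoted in the text (alpha = 0.05)
reject_null = chi2 > CRITICAL_VALUE and p_value < 0.05

print(f"expected = {expected}")
print(f"chi2 = {chi2:.8f}, p = {p_value:.5e}, reject H0: {reject_null}")
```

Using `math.erfc` keeps the example within the standard library. For tables with more than two categories, a statistics package would be needed for the p-value.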