**Assignment Task**

**Biostatistics – Assignment**

**Instructions:**

• To complete the assignment, please type your answers into this document.

• You do not need to show your working, unless asked for in the question.

**Question 1**

Kaplan et al. (2013) studied Female Genital Mutilation / Cutting (FGM/C) in The Gambia, where the overall prevalence of FGM/C is 76.3%. The aim of the study was to investigate the relationship between FGM/C in pregnant women and infant birthweight. Data were collected from 588 women receiving antenatal care or delivery in hospitals and health centres of the Western Health Region. The information, collected through a questionnaire and medical examination, included sociodemographic factors, the presence or absence of FGM/C, the type of FGM/C, the long-term health consequences of FGM/C, and maternal and fetal complications during delivery. The investigators, knowing that you have just completed a course in biostatistics, ask for your assistance in analysing the data and drawing appropriate conclusions.

**(a)** What is the first step that you should take in analysing the data? (Assume that the data are already in a computer file and you have statistical software which can be used to analyse the data.) What will you learn from this step?

**(b)** What statistical procedure will you use to test the null hypothesis that, in the population from which the sample was selected, the average birthweight is the same for infants born to women who have undergone FGM/C and infants born to women who have not undergone FGM/C?

**(c)** You run the appropriate statistical test to test the null hypothesis that in the population the average infant birthweight is the same for pregnant women who have undergone FGM/C and those who have not. Below is part of the output from your computer software.

**(d)** If the observed significance level for the test of the null hypothesis that the two population mean birthweights are equal is 0.03, what can you conclude?

**(e)** If you were to perform a similar study, how could you amend the design to increase the power?

**(f)** What statistical test could you use to test the null hypothesis that, in the population, the average age at cutting is the same for three different ethnic groups?

**(g)** Give an example, not from the e-book, of a study for which you would use a paired t-test to analyse the data. Give an advantage of using a paired t-test as compared to an independent samples t-test for your example.

**(h)** You want to see if there is a relationship between ethnic group and FGM/C.

**(i)** Which statistical test would you use to test the null hypothesis that there is no association between ethnic group and FGM/C?

**(j)** Use openepi.com, graphpad.com, or any other software, to test the null hypothesis that there is no association between Mandinka / Wolof ethnic groups and the presence of FGM/C. State the p value and how you would interpret the results.
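For orientation, one common way to test for association in a 2×2 table is Pearson's chi-square test, and the sketch below shows how its statistic, p value, and expected counts can be computed by hand. It is only an illustration: the counts are hypothetical (they are not the study data), the function name `chi_square_2x2` is made up for this example, and no continuity correction is applied, so online calculators that apply one may report a slightly different p value.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    # Expected count under independence: row total * column total / n.
    expected = [[r * col / n for col in col_totals] for r in row_totals]
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, expected


# HYPOTHETICAL counts, not the study data.
# Rows: ethnic group 1, ethnic group 2; columns: FGM/C present, FGM/C absent.
table = [[30, 20], [15, 35]]
chi2, p_value, expected = chi_square_2x2(table)
smallest_expected = min(min(row) for row in expected)

print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
print(f"smallest expected count = {smallest_expected}")
```

The smallest expected count is printed because the usual rule of thumb for this test concerns expected (not observed) cell counts.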

**(k)** Name one assumption that is needed for the test you used in part (j), above. Do your data meet this assumption?

**(l)** Based on the following table, the study investigators want to look at whether FGM/C is related to complications after delivery:

Table: Delivery Complication

Calculate the relative risk for complications during delivery for women with FGM/C compared to those without FGM/C. Show your working.
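As a reminder of the arithmetic involved, a relative risk is the risk in the exposed group divided by the risk in the unexposed group. The sketch below uses hypothetical counts, not the figures from the study table, and the helper name `relative_risk` is chosen only for this illustration.

```python
def relative_risk(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = exposed_events / exposed_total
    risk_unexposed = unexposed_events / unexposed_total
    return risk_exposed / risk_unexposed


# HYPOTHETICAL counts, not the study data:
# 40 of 200 exposed women and 10 of 100 unexposed women had a complication.
rr = relative_risk(40, 200, 10, 100)
print(f"relative risk = {rr:.2f}")
```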

**(m)** The 95% confidence interval for the relative risk calculated in part (l), above, is (2.7 to 5.5). What can you conclude about the relationship between FGM/C and delivery complications?

**(n)** Using the data in the table given for part (l), calculate the odds ratio for the odds of FGM/C in a woman with delivery complications compared to one without delivery complications. Show your working.
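The sketch below illustrates how an odds ratio can be worked out from a 2×2 table, again with hypothetical counts rather than the study data. It computes the ratio two ways: as the odds of exposure among women with the outcome divided by the odds of exposure among women without it, and as the cross-product ad/bc, to show that the two agree.

```python
# HYPOTHETICAL 2x2 counts, not the study data.
a = 40   # exposed, with complication
b = 160  # exposed, without complication
c = 10   # unexposed, with complication
d = 90   # unexposed, without complication

# Odds of exposure among women with a complication, and among those without.
odds_exposure_with = a / c
odds_exposure_without = b / d
or_from_odds = odds_exposure_with / odds_exposure_without

# The cross-product form gives the same number.
or_cross_product = (a * d) / (b * c)

print(f"odds ratio (ratio of odds)  = {or_from_odds:.2f}")
print(f"odds ratio (cross-product) = {or_cross_product:.2f}")
```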

**(o)** Do you think that the association between delivery complications and FGM/C estimated from this study is a good estimate for all pregnant women in The Gambia? Why / why not?

**Question 2**

The figure below shows the results from a study looking at the relationship between perceived quality of care measured on a scale from 1 to 132 (y axis) and duration of pregnancy in weeks at first clinic visit (POA [“point of arrival”], x axis) for women enrolling in a clinic in Sri Lanka.

**(a)** Give two possible problems with the data shown on the scatterplot.

**(b)** How would you describe the relationship between quality score and duration of pregnancy? Make a guess as to what the correlation coefficient would be between these two variables.

**(c)** True or false: a correlation coefficient of –0.8 indicates a weaker linear relationship between two variables than a correlation coefficient of +0.7. Explain your answer.

**(d)** In the study above, perceived quality of care was compared to a random blood glucose measurement (mg/dL) taken at delivery. Statistical analysis found a Pearson correlation coefficient of -0.4, with a two-tailed p value of 0.042. What can you conclude? (You can assume that this is an appropriate statistical test for these data.)

**(e)** In the study above, perceived quality of care was also compared to the stated religion of each woman. Religion was coded in alphabetical order, from 1 to 6. Statistical analysis found a Pearson correlation coefficient of 0.7, with a two-tailed p value of 0.015. What can you conclude?

**Question 3**

The figure below, from the 2010 WHO World Malaria report, shows the relationship between the percentage of children under 5 years old who attend public health facilities and receive antimalarial medications (Y axis) and the percentage that are estimated to need them (X axis) based on surveys from 38 countries.

**(a)** What is the value of the correlation coefficient between the two variables? Show how you calculate this.

**(b)** What are the gradient (slope) and intercept of the fitted line?

**(c)** What is the intercept of the line labelled “% receiving = % need”?

**(d)** What is another name for the fitted line? How is it calculated, in principle?
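One standard way to fit a straight line to a scatterplot is ordinary least squares, which chooses the slope and intercept that minimise the sum of squared vertical distances between the points and the line. Whether the line in the figure was fitted this way is an assumption here. The sketch uses five hypothetical (need, received) pairs, not the survey data, and the function name `least_squares_line` is illustrative.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# HYPOTHETICAL data, not the survey results.
xs = [10, 20, 30, 40, 50]  # % estimated to need antimalarials
ys = [8, 12, 20, 22, 33]   # % receiving antimalarials
slope, intercept = least_squares_line(xs, ys)
print(f"received = {intercept:.2f} + {slope:.2f} * need")
```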

**(e)** Based on the fitted line, what is the predicted value for the percentage of children receiving antimalarial medications when the need for them is 50%? Show your working.

**(f)** Based on the fitted line, what is the residual for a country with an observed value of 50% for need and an observed value of 25% for received? Show your working.
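For reference, a predicted value comes from substituting x into the fitted equation, and a residual is the observed value minus the predicted value. The sketch below uses a made-up slope and intercept and made-up observations; the real coefficients have to be read from the figure, so none of these numbers answer the questions above.

```python
# HYPOTHETICAL fitted line, not the one in the WHO figure.
intercept = 2.0
slope = 0.5


def predict(need_percent):
    """Predicted % receiving for a given % need, from the fitted line."""
    return intercept + slope * need_percent


# HYPOTHETICAL country: need = 40%, observed received = 30%.
need = 40
observed_received = 30

predicted_received = predict(need)
# Residual = observed minus predicted; positive means above the line.
residual = observed_received - predicted_received

print(f"predicted = {predicted_received}, residual = {residual}")
```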

**(g)** What percent of the observed variability in the percentage of children receiving antimalarials is “explained” by differences in need?
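In simple linear regression with one predictor, the proportion of variability "explained" (R²) equals the square of the Pearson correlation coefficient r, so either can be recovered from the other (taking the sign of r from the slope). The sketch below checks this on hypothetical data, not the survey results.

```python
import math

# HYPOTHETICAL data, not the survey results.
xs = [10, 20, 30, 40, 50]
ys = [8, 12, 20, 22, 33]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)

# Pearson correlation coefficient.
r = sxy / math.sqrt(sxx * syy)

# R^2 from the regression: 1 - (residual sum of squares / total sum of squares).
slope = sxy / sxx
intercept = mean_y - slope * mean_x
ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
r_squared = 1 - ss_residual / syy

print(f"r = {r:.3f}, r^2 = {r * r:.3f}, R^2 = {r_squared:.3f}")
print(f"percent of variability explained = {100 * r_squared:.1f}%")
```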

