Assignment Task
Task
1. Mark each statement as TRUE or FALSE. Justify your answers.
a. You are given a task to analyze the salaries of employees in your company and find a representative value for them. Given that the CEO and partners have substantially higher salaries than other employees, you should take the average salary as the representative middle point.
b. The Dubai General Hospital generally expects 60 patients per day who require urgent care, on average. However, in the past 80 days, the data shows that there are on average 57 urgent cases per day, with a standard deviation of 5. The number of urgent cases has significantly deviated from the expectation of the hospital, at the 5% significance level.
c. Your retail firm is considering buying data from a company to improve its current demand forecasts, which are particularly inaccurate for large demand values. One new feature is highly correlated with independent variables in your current model for large demand values, yet not correlated for small demand values. Ideally, you should not use this new independent variable since it is highly correlated with the independent variables you already have (and thus, brings little new information).
d. You are considering a potential investment opportunity in a hedge fund. You are given access to the historical monthly returns since the inception of the fund. To be on the safe side, you should remove all the outliers.
e. You fit a simple linear regression model to explain consumption of milk (target variable) using the consumption of coffee (explanatory variable). The R-squared value of your linear regression is less than 20%. This means that there is no statistically significant relationship between milk consumption and coffee consumption at the 5% significance level.
f. You have access to sales data of 30,000 stores of a Fortune Global 500 company. You are interested in finding whether there was a significant change in overall sales in 2019 compared with 2018. Because you want to keep the analysis relatively simple, you pick 1,000 stores at random and conduct a two-sample hypothesis test comparing overall sales in 2019 with overall sales in 2018 for these stores. To make sure the result is reliable, you should repeat this process an arbitrarily large number of times, and take the smallest p-value as your final p-value.
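For statement (b), the arithmetic behind the test can be sketched as follows. This is a minimal check, assuming a two-sided one-sample test and a normal approximation (reasonable for 80 observations, although a t-distribution with 79 degrees of freedom would be the stricter choice). The figures come from the statement; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Figures taken from statement (b)
expected_mean = 60.0  # urgent cases per day the hospital expects
sample_mean = 57.0    # observed average over the sample
sample_sd = 5.0       # observed standard deviation
n_days = 80           # number of days in the sample
alpha = 0.05          # significance level

# Standard error of the sample mean and the resulting test statistic
standard_error = sample_sd / sqrt(n_days)
z_stat = (sample_mean - expected_mean) / standard_error

# Two-sided p-value under the normal approximation (an assumption of this sketch)
p_value = 2 * NormalDist().cdf(-abs(z_stat))
reject_null = p_value < alpha

print(f"z = {z_stat:.3f}, p = {p_value:.2e}, reject H0: {reject_null}")
```

The same three steps (standard error, test statistic, p-value against the significance level) apply to any one-sample test of a mean.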
2. Regression: Jamila is a factory manager at a manufacturer of branded chocolate and confectionery products. The trademark cookie of the company produced in the factory is made in a similar way to how cookies are made at home: all of the ingredients are fresh, and there are no preservatives added. Thus, Jamila is very confident of the quality of the product, yet she is concerned that cookies produced in the factory have shorter shelf lives in comparison to the competition. She is interested in finding a way to make the cookies last longer without introducing any preservatives to the recipe. To this end, she puts one of the production managers, Karim, to work on experimenting with and collecting data about the baking process.
a. Construct a linear regression model that predicts the shelf life (dependent variable) as a function of the oven temperature and baking time. Interpret the model. Assuming that this model is accurate, how does increasing the oven temperature by one degree change the shelf life? Are there any outliers?
b. Which variables are statistically significant for the model in part (a) at the significance level of 5%? What is the overall quality of the model? Justify your answer.
c. Construct a new linear regression model by dropping the independent variable that was significant at the 5% significance level for the model obtained in part (a). Is the remaining variable significant at the 5% significance level? How do you explain that the remaining variable is significant in the new model? (Hint: Examine the correlation between independent variables.)
d. Fit a regression model using only baking time as an independent variable, and shelf life as the dependent variable. Using this regression model, are you 95% confident that the shelf life exceeds 60 days if the baking time is 1.5 hours?
e. Karim is looking into how the linear regression of question (d) can be further improved. Examining the residual plots, he is worried that assuming a linear relationship between baking time and shelf life might be misleading. How can the independent variable be transformed to reflect this observation in the linear regression model? Construct a new linear regression model following this transformation.
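The idea in question (e) of transforming the independent variable before refitting can be illustrated with a short sketch. It assumes a logarithmic transformation purely for illustration (the residual plot should guide the actual choice), and the data below are synthetic values generated for this example, not the assignment's data set. The helper name `fit_simple_ols` is hypothetical.

```python
import math

def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return intercept, slope, 1 - ss_res / ss_tot

# Synthetic, hypothetical data: shelf life built to follow 40 + 15 * ln(time) exactly
baking_time_hours = [0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0]
shelf_life_days = [40 + 15 * math.log(t) for t in baking_time_hours]

# Fit once on the raw predictor and once on its logarithm
linear_fit = fit_simple_ols(baking_time_hours, shelf_life_days)
log_fit = fit_simple_ols([math.log(t) for t in baking_time_hours], shelf_life_days)

print("raw predictor  R^2:", round(linear_fit[2], 4))
print("log predictor  R^2:", round(log_fit[2], 4))
```

Comparing the two R-squared values, together with the residual plots of each fit, is one way to judge whether the transformation describes the relationship better than the straight line.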
3. Hypothesis Testing: Jennifer is the founder of a chain of restaurants in Paris and Dubai, specialized in high-quality plant-based burgers. The restaurant chain is renowned for an award-winning premium burger meal, known as the “Beyond Burger”. In 2019, the chain of restaurants rolled out a new plant-based recipe, labelled “GooD Burger”. Both meals are equivalent in terms of costs and selling price. Recently, Jennifer has reviewed the sales data in her restaurants in 2020. She notices a drop in the number of orders for the “Beyond Burger” meals. She is concerned that the recent data signals that customers have lost interest in her flagship meal, possibly due to a cannibalization of the demand by the “GooD Burger”. The goal of this exercise is to give a data-driven recommendation on whether the chain of restaurants should roll back the new recipe.
a. Jennifer would like to confirm whether the total number of monthly meals, combining both recipes, has decreased in 2020 compared to the previous years (2019, 2018 and 2017). Formulate and conduct the corresponding two-sample hypothesis test. Have the total monthly sales significantly changed in 2020 compared to previous years, at the 5% significance level? Hint: Compute the total monthly sales as a new column in your data set. This new quantity is defined as the sum of the number of orders of “Beyond Burger” and “GooD Burger” in each row (combination of month and city). Do not aggregate the quantities contained in different rows.
b. Jennifer would like to better understand whether this change of demand started in 2020 or in 2019 when the new recipe was introduced. To this end, you will repeat the same analysis as Q(a), but this time, you will compare the sales in 2019 to those in 2017-2018. Formulate and conduct the corresponding two-sample hypothesis test. What was the effect of the introduction of the “GooD Burger” on the total monthly sales?
c. Did the “GooD Burger” cannibalize the demand for “Beyond Burger” when it was introduced in 2019? Formulate and conduct the corresponding two-sample hypothesis test. Hint: You need to test whether the demand for “Beyond Burger” meals changed in 2019, after the introduction of the “GooD Burger”, in comparison with the previous years.
d. Jennifer now wishes to compare the total monthly sales in Paris and Dubai during the entire period 2017-2020. Formulate and conduct the corresponding two-sample hypothesis test. Is there any significant difference between Paris and Dubai?
e. Based on answers to questions (a)-(d), what is your recommendation to Jennifer?
f. Develop a linear regression model that synthetically captures all the findings of the previous questions.
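The two-sample tests in questions (a) to (d) share the same mechanics, which can be sketched as below. This is a minimal illustration, assuming Welch's unequal-variance form of the test; the helper name `welch_t` and the sample values are hypothetical and are not taken from the assignment's data set.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom) for two samples."""
    var_a_over_n = variance(sample_a) / len(sample_a)  # sample variance / n
    var_b_over_n = variance(sample_b) / len(sample_b)
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(var_a_over_n + var_b_over_n)
    dof = (var_a_over_n + var_b_over_n) ** 2 / (
        var_a_over_n ** 2 / (len(sample_a) - 1)
        + var_b_over_n ** 2 / (len(sample_b) - 1)
    )
    return t_stat, dof

# Hypothetical total monthly sales (one value per month-and-city row)
totals_2020 = [100, 200, 300, 400, 500]
totals_earlier = [200, 400, 600, 800, 1000]

t_stat, dof = welch_t(totals_2020, totals_earlier)
print(f"t = {t_stat:.4f}, degrees of freedom = {dof:.4f}")
```

The p-value then comes from a t-distribution with the computed degrees of freedom, for example from a statistical table or from `scipy.stats.ttest_ind` with `equal_var=False`, and is compared with the 5% significance level.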