**Assignment Task**

**Machine Learning**

**1 Least Squares and Double Descent Phenomenon (17P)**

The goal of this assignment is to learn about linear least-squares regression and the double descent phenomenon, shown in Figure 1. In the classical learning regime, a U-shaped risk curve can be observed: the test error is high while the training error is very low, i.e., the model does not generalize well to new data. However, a highly over-parameterized model with large capacity allows the test error to go down again in a second descent ("double descent"), which can sometimes be observed in over-parameterized deep learning settings.

**Tasks**

**1**. Rewrite eq. (2) in pure matrix/vector notation, such that there are no sums left in the final expression. Use Φ = φ(x) for the feature transform, which can be computed prior to the optimization. Additionally, state the matrix/vector dimensions of all occurring variables.

**2**. Analytically derive the optimal parameters w* from eq. (2).
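Assuming eq. (2) is the ridge-regularized least-squares objective L(w) = ‖Φw − y‖² + λ‖w‖² (a standard form; the actual eq. (2) is not reproduced here), the derivation proceeds by setting the gradient to zero:

```latex
\nabla_w L(w) = 2\Phi^\top(\Phi w - y) + 2\lambda w \stackrel{!}{=} 0
\;\Rightarrow\; (\Phi^\top\Phi + \lambda I)\,w^* = \Phi^\top y
\;\Rightarrow\; w^* = (\Phi^\top\Phi + \lambda I)^{-1}\Phi^\top y
```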

**3**. Give an analytic expression to compute predictions ŷ given w*.

This can also be interpreted as a small feedforward neural network with one hidden layer for an input x ∈ ℝ^d and output ŷ ∈ ℝ. Draw a simple schematic of this neural network and include exemplary labels of its neurons and connections.

**4**. Create a training dataset consisting of input data x = {x1, …, xN} and corresponding targets y = {y1, …, yN} with N = 200, d = 5 and σ = 2 according to eq. (1).

In the same manner, create a test dataset with Nt = 50 for both test input data and test targets.
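Since eq. (1) is not reproduced here, the sketch below assumes a generic noisy linear teacher as a placeholder data model; substitute the actual eq. (1):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_dataset(n, d, sigma, rng):
    """Draw inputs and noisy targets.

    The exact data model of eq. (1) is not reproduced here; as a
    placeholder we assume a noisy linear teacher y = x^T w_true + eps,
    eps ~ N(0, sigma^2). Substitute the actual eq. (1) as needed.
    """
    x = rng.standard_normal((n, d))      # inputs, shape (n, d)
    w_true = rng.standard_normal(d)      # hypothetical teacher weights
    y = x @ w_true + sigma * rng.standard_normal(n)
    return x, y

N, Nt, d, sigma = 200, 50, 5, 2.0
x_train, y_train = make_dataset(N, d, sigma, rng)
x_test, y_test = make_dataset(Nt, d, sigma, rng)
```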

**5**. Generate M = 50 d-dimensional random feature vectors v = {v1, …, vM} on the unit sphere.
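Sampling uniformly on the unit sphere can be done by normalizing isotropic Gaussian draws, since a normalized isotropic Gaussian vector is uniformly distributed on the sphere:

```python
import numpy as np

rng = np.random.default_rng(0)
M, d = 50, 5

# Draw Gaussian vectors and normalize each row to unit length.
v = rng.standard_normal((M, d))
v /= np.linalg.norm(v, axis=1, keepdims=True)
```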

**6**. Implement the computation of w* from the training data using a QR decomposition. Further, compute the mean squared error denoted in eq. (4) for both the training and test data based on the optimal parameters w*.

**7**. Use λ = 1 × 10⁻⁸ to reproduce the double descent behaviour. Run this experiment for a number of feature vectors M = 10k + 1, k ∈ {0, 1, 2, …, 60}, and save the training and test loss in each run. For each M, repeat the experiment r = 5 times to obtain averaged scores.
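The loop over M and the r repetitions can be sketched as follows. The data and the feature map `features` are placeholders (eq. (1) and the actual transform are not reproduced here), and a closed-form ridge solve stands in for the QR route of task 6:

```python
import numpy as np

rng = np.random.default_rng(0)
N, Nt, d, lam, r = 200, 50, 5, 1e-8, 5

# Placeholder data -- substitute the training/test sets from task 4 (eq. (1)).
x_train = rng.standard_normal((N, d)); y_train = rng.standard_normal(N)
x_test = rng.standard_normal((Nt, d)); y_test = rng.standard_normal(Nt)

def features(x, v):
    # Hypothetical random-feature map (ReLU of random projections);
    # replace with the transform actually defined in the assignment.
    return np.maximum(x @ v.T, 0.0)

Ms = [10 * k + 1 for k in range(61)]                   # M = 10k + 1, k = 0..60
train_err = np.zeros((len(Ms), r))
test_err = np.zeros((len(Ms), r))

for i, M in enumerate(Ms):
    for j in range(r):                                 # r repetitions per M
        v = rng.standard_normal((M, d))
        v /= np.linalg.norm(v, axis=1, keepdims=True)  # features on unit sphere
        Phi, Phi_t = features(x_train, v), features(x_test, v)
        # Closed-form ridge solution; task 6 asks for a QR solve instead.
        w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(M), Phi.T @ y_train)
        train_err[i, j] = np.mean((Phi @ w - y_train) ** 2)
        test_err[i, j] = np.mean((Phi_t @ w - y_test) ** 2)
```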

**8**. Plot both the averaged (over the r = 5 runs) train and test errors depending on the number of feature vectors M in the same plot. Include the standard deviation of each setting in addition to the averaged loss. Give an interpretation of your results.

**9**. Repeat the same experiment for λ ∈ {1 × 10⁻⁵, 1 × 10⁻³} and explain the influence of λ. Include the resulting curves containing train and test error for each λ in two additional subplots.

**Implementation details**

• To efficiently solve a system Ax = b, QR decomposition of the matrix A into an orthogonal matrix Q and an upper triangular matrix R can be used instead of direct matrix inversion (see numpy.linalg.qr). This can be computed as follows:

A = QR

z = Qᵀb

Rx = z ⇒ x = R⁻¹z (solve this with numpy.linalg.solve)
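A minimal sketch of this recipe applied to the ridge system (ΦᵀΦ + λI)w = Φᵀy, using the standard trick of stacking √λ·I under Φ so that a plain least-squares QR solve yields the ridge solution; the random Φ and y here are placeholders:

```python
import numpy as np

def ridge_qr(Phi, y, lam):
    """Solve (Phi^T Phi + lam*I) w = Phi^T y via QR.

    Stacking sqrt(lam)*I under Phi makes the least-squares solution of
    the augmented system equal to the ridge solution (a standard trick).
    """
    M = Phi.shape[1]
    A = np.vstack([Phi, np.sqrt(lam) * np.eye(M)])
    b = np.concatenate([y, np.zeros(M)])
    Q, R = np.linalg.qr(A)        # A = QR, Q orthonormal, R upper triangular
    z = Q.T @ b                   # z = Q^T b
    return np.linalg.solve(R, z)  # R w = z  =>  w = R^{-1} z

# Quick check against the closed-form normal-equations solution.
rng = np.random.default_rng(0)
Phi = rng.standard_normal((200, 61))
y = rng.standard_normal(200)
lam = 1e-8
w = ridge_qr(Phi, y, lam)
w_ref = np.linalg.solve(Phi.T @ Phi + lam * np.eye(61), Phi.T @ y)
```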

**•** If you want to plot the standard deviation ±σ in addition to an averaged curve, use matplotlib.pyplot.fill_between, where y1 and y2 denote µ − σ and µ + σ, respectively. You can set an alpha value for blending.

Implement the double descent experiment in task12 of lsq_regression.py. Do not use any other libraries than numpy and matplotlib.

**2 Dual Representation (8P)**

The linear least squares problem from Task 1 can be reformulated in its dual representation, where an equivalent solution can be obtained.

The dual problem is given in eq. (5).

**Tasks**

**1**. Analytically compute the optimal parameters a* from eq. (5). State the dimension of the matrix that has to be inverted in the process and compare it to the one required in Task 1. When is it favourable to use the primal solution, and when the dual?

**2**. Give an analytic expression to compute predictions ŷ given a* using eq. (7), such that you only rely on K and do not need to compute the features φ explicitly.

**3**. For the training data x, compute the kernel matrix as given in eq. (6). Repeat the same process for the test data, ensuring that the resulting kernel matrices are of dimensionality ℝ^{N×N} and ℝ^{Nt×N}, respectively.
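A sketch of the two kernel matrices, assuming eq. (6) defines K via inner products of features, K_ij = φ(x_i)ᵀφ(x_j); the data and feature map used here are placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
N, Nt, d, M = 200, 50, 5, 50

# Placeholder data and feature vectors -- substitute those from Task 1.
x_train = rng.standard_normal((N, d))
x_test = rng.standard_normal((Nt, d))
v = rng.standard_normal((M, d))
v /= np.linalg.norm(v, axis=1, keepdims=True)

def features(x, v):
    # Hypothetical feature map; eq. (6)'s kernel is not reproduced here.
    return np.maximum(x @ v.T, 0.0)

Phi, Phi_t = features(x_train, v), features(x_test, v)
K_train = Phi @ Phi.T    # (N, N):  K_ij = phi(x_i)^T phi(x_j)
K_test = Phi_t @ Phi.T   # (Nt, N): test rows against training columns
```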

**4**. Implement the computation of a* and report the mean squared error on the train and test data, using λ = 1 × 10⁻⁸.
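A sketch under the assumption that the dual solution derived in task 1 takes the standard kernel ridge form a* = (K + λI)⁻¹y; the kernel matrices and targets below are stand-ins for those computed in task 3:

```python
import numpy as np

rng = np.random.default_rng(0)
N, Nt, lam = 200, 50, 1e-8

# Stand-in kernel matrices and targets; plug in K_train/K_test from eq. (6).
Phi = rng.standard_normal((N, 20))
Phi_t = rng.standard_normal((Nt, 20))
y_train = rng.standard_normal(N)
y_test = rng.standard_normal(Nt)
K_train, K_test = Phi @ Phi.T, Phi_t @ Phi.T

# Assumed standard dual ridge solution: a* = (K + lam*I)^{-1} y.
# Only an N x N system is solved -- no features needed at this point.
a = np.linalg.solve(K_train + lam * np.eye(N), y_train)

# Predictions are kernel combinations of training points: y_hat = K a.
mse_train = np.mean((K_train @ a - y_train) ** 2)
mse_test = np.mean((K_test @ a - y_test) ** 2)
```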
