Update basic-ml-review.md #6

Open · wants to merge 1 commit into base: main
2 changes: 1 addition & 1 deletion basic-ml-review.md
@@ -116,7 +116,7 @@ While developing your model, if time permits, you should experiment with differe

### Learning Procedure

- Learning procedures the procedures that help your model find the set of parameters that minimize a given objective function for a given set of data, are diverse[^3]. In some cases, the parameters might be calculated exactly. For example, in the case of linear functions, the values of w and b can be calculated from the averages and variances of x and y. In most cases, however, the values of parameters can’t be calculated exactly and have to be approximated, usually via an iterative procedure. For example, K-means clustering uses an iterative procedure called expectation–maximization algorithm.
+ Learning procedures, the procedures that help your model find the set of parameters that minimize a given objective function for a given set of data, are diverse[^3]. In some cases, the parameters can be calculated exactly. For example, in the case of linear functions, the values of w and b can be calculated from the averages and variances of x and y. In most cases, however, the parameters can’t be calculated exactly and have to be approximated, usually via an iterative procedure. For example, K-means clustering uses an iterative procedure called the expectation–maximization algorithm.
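
To make the exact-solution case concrete, here is a minimal sketch (illustrative only, not part of this PR's diff) of computing w and b for a linear function directly from the sample statistics of x and y; the synthetic data and NumPy usage are assumptions for the example:

```python
import numpy as np

# Synthetic data for illustration: y ≈ 2x + 1 plus noise.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=100)

# For y = w*x + b minimizing squared error, w is the covariance of x and y
# divided by the variance of x, and b follows from the means.
w = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
b = y.mean() - w * x.mean()
print(w, b)  # should be close to 2.0 and 1.0
```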

The most popular family of iterative procedures today is undoubtedly **gradient descent**. The loss of a model at a given train step is given by the objective function. The gradient of the objective function with respect to a parameter tells us how much that parameter contributes to the loss; the negative of that gradient is the direction that lowers the loss from the current value the most. The idea is to subtract the gradient (scaled by a learning rate) from the parameter, hoping that this makes the parameter contribute less to the loss and eventually drives the loss toward 0.
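
As a minimal sketch of that update rule (illustrative only, not part of this PR's diff; the toy objective and learning rate are assumptions), one gradient descent loop on L(w) = (w − 3)² looks like:

```python
# Toy objective L(w) = (w - 3)^2, minimized at w = 3 where the loss is 0.
w = 0.0
learning_rate = 0.1

for step in range(100):
    grad = 2 * (w - 3.0)        # dL/dw: how much w contributes to the loss
    w -= learning_rate * grad   # subtract the scaled gradient from the parameter

print(w)  # converges toward 3.0, driving the loss toward 0
```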
