A faster and more accurate C++ implementation of Least-Squares Monte Carlo (Longstaff-Schwartz) for American options. This project optimizes the original Longstaff-Schwartz method to greatly increase accuracy and decrease computation time, given equivalent computational parameters (number of grid divisions), via mathematical and programmatic improvements.
The original Longstaff-Schwartz implementation was provided by R.S. of the UIUC Industrial Engineering Department. The idea to augment this method came from discussing option pricing using Monte Carlo methods after class.
fast-lsmc.cpp contains the optimized methods for Longstaff-Schwartz and serves as a calculator.
fast-lsmc.hpp is the header file for generate-data.cpp. It contains the "function" version of the fast-lsmc.cpp method, plus a modified version of the original method provided by R.S. that can price both put and call options.
generate-data.cpp generates the option prices over an array of parameters and saves them as a .csv file. The estimated prices and computation times are used to compare the efficiency of each method.
option_pricing.csv contains the data generated by the file listed above. It is used for the data analysis of accuracy and computation time, both to draw general conclusions and conclusions about the individual bases.
optimization_analysis.ipynb contains the data analysis comparing the accuracy of the original and optimized pricing methods.
These optimizations are meant to make the original Longstaff-Schwartz method faster and more accurate. The following contains a short explanation of each improvement.
The Eigen 3.4.0 C++ library was used for all linear algebra needs.
A review of the literature showed that many papers choose a basis other than the standard power basis used in the original Longstaff-Schwartz paper. The three tested bases were the Power, Hermitian, and Laguerre bases. They are defined as follows (k is the number of desired basis functions):
Power:

$$P_j(x) = x^j, \qquad j = 0, 1, \dots, k-1$$

Hermitian:

$$H_0(x) = 1, \qquad H_1(x) = 2x, \qquad H_{j+1}(x) = 2x\,H_j(x) - 2j\,H_{j-1}(x)$$

Laguerre:

$$L_0(x) = 1, \qquad L_1(x) = 1 - x, \qquad L_{j+1}(x) = \frac{(2j + 1 - x)\,L_j(x) - j\,L_{j-1}(x)}{j + 1}$$
I will omit further explanation of the bases for brevity; refer to the top of page 6 of this paper for more clarity regarding basis construction.
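As a minimal sketch of how these recurrences translate to code (the function name `basisRow`, the `Basis` enum, and the use of Eigen's `VectorXd` are illustrative assumptions, not the repo's actual interface):

```cpp
#include <Eigen/Dense>

// Build one row of the regression design matrix: the first k basis
// functions evaluated at x, via the three-term recurrences above.
enum class Basis { Power, Hermitian, Laguerre };

Eigen::VectorXd basisRow(double x, int k, Basis basis) {
    Eigen::VectorXd row(k);
    for (int j = 0; j < k; ++j) {
        switch (basis) {
            case Basis::Power:      // P_j = x^j
                row(j) = (j == 0) ? 1.0 : row(j - 1) * x;
                break;
            case Basis::Hermitian:  // H_j = 2x H_{j-1} - 2(j-1) H_{j-2}
                if (j == 0)      row(j) = 1.0;
                else if (j == 1) row(j) = 2.0 * x;
                else row(j) = 2.0 * x * row(j - 1) - 2.0 * (j - 1) * row(j - 2);
                break;
            case Basis::Laguerre:   // j L_j = (2(j-1)+1-x) L_{j-1} - (j-1) L_{j-2}
                if (j == 0)      row(j) = 1.0;
                else if (j == 1) row(j) = 1.0 - x;
                else row(j) = ((2.0 * (j - 1) + 1.0 - x) * row(j - 1)
                               - (j - 1) * row(j - 2)) / j;
                break;
        }
    }
    return row;
}
```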
I implemented a method that programmatically chooses the optimal basis: the one whose regression fit yields the greatest $R^2$.
In Least-Squares Monte Carlo methods for option pricing, only paths that are sufficiently in the money should enter the regression. Concretely, a path is kept at step $i$ only if it satisfies the condition

$$S_{i-1} > K e^{-r\,\Delta t}$$

for calls, and

$$S_{i-1} < K e^{-r\,\Delta t}$$

for puts, where $S_{i-1}$ is the path's price at the previous step, $K$ is the strike, $r$ is the risk-free rate, and $\Delta t$ is the step size. Put simply, a path is excluded if its price in the previous step does not reach a certain threshold based on the decay imposed by the risk-free rate.
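A sketch of how such a filter might look in code (the vector-of-vectors path layout and all names here are assumptions for illustration, not the repo's actual code):

```cpp
#include <cmath>
#include <vector>

// Collect the indices of paths that satisfy the path condition at step i,
// i.e. the paths worth including in the continuation-value regression.
std::vector<int> eligiblePaths(const std::vector<std::vector<double>>& S,
                               int i, double K, double r, double dt,
                               bool isCall) {
    const double threshold = K * std::exp(-r * dt);  // strike decayed by r
    std::vector<int> keep;
    for (int p = 0; p < static_cast<int>(S.size()); ++p) {
        const double prev = S[p][i - 1];  // price at the previous step
        if ((isCall && prev > threshold) || (!isCall && prev < threshold))
            keep.push_back(p);
    }
    return keep;
}
```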
The Andersen trigger method (otherwise known as LSA) was also added to improve the convergence rate by maximizing the average cutoff over simulated paths. This is essentially adding a constant to one side of the above inequality. More details can be found on page 12 of the paper linked above.
A Brownian Bridge was simulated instead of a Brownian Motion, as a Brownian Bridge requires storing only the most recent step of the walk, whereas Brownian Motion requires storing the whole walk. This saves memory and decreases computation time, since a greater portion of the working data stays in the CPU cache instead of DRAM. In a Brownian Bridge, the final value of the walk is drawn first, and a pseudo-random iteration then walks the process back to its starting value (essentially a Brownian Motion in reverse). The mathematical formulation of this concept follows.
Suppose

$$W_{t_n} = \sqrt{t_n}\,Z, \qquad Z \sim \mathcal{N}(0, 1),$$

where $0 = t_0 < t_1 < \dots < t_n = T$ is the time grid and $T$ is the expiry. Walk backward using the following equation:

$$W_{t_{i-1}} = \frac{t_{i-1}}{t_i}\,W_{t_i} + \sqrt{\frac{t_{i-1}\,(t_i - t_{i-1})}{t_i}}\;Z_i, \qquad Z_i \sim \mathcal{N}(0, 1) \text{ i.i.d.}$$

One may observe that this achieves the starting value at $t_0 = 0$: both the mean term and the variance vanish there, so $W_{t_0} = 0$.
The formulation was taken from this article.
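A minimal sketch of this backward walk in code (names are my own; it assumes a uniform grid $t_i = i\,\Delta t$ and keeps only the current value in memory, which is the source of the memory saving):

```cpp
#include <cmath>
#include <functional>
#include <random>

// Walk one Brownian path backward as a bridge from W_T down to W_0 = 0,
// storing only the current value. visit(i, W_i) consumes each value as it
// is produced, matching the backward direction of the LSMC regression.
void brownianBridgeBackward(int n, double T, std::mt19937_64& rng,
                            const std::function<void(int, double)>& visit) {
    std::normal_distribution<double> normal(0.0, 1.0);
    const double dt = T / n;
    double W = std::sqrt(T) * normal(rng);  // draw the endpoint W_T first
    visit(n, W);
    for (int i = n; i >= 1; --i) {
        const double ti = i * dt, tprev = (i - 1) * dt;
        // Conditional law of W_{t_{i-1}} given W_{t_i} (and W_0 = 0).
        const double mean  = (tprev / ti) * W;
        const double stdev = std::sqrt(tprev * (ti - tprev) / ti);
        W = mean + stdev * normal(rng);     // at i = 1, W becomes exactly 0
        visit(i - 1, W);
    }
}
```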
Another attempted optimization was stratification with a double-regression enhancement (from this paper). This achieves essentially the same outcome as the path conditions, in that it tightens the set of paths chosen for regression: it compares the regression values of all viable paths, and paths whose values are "far" in or out of the money are discarded. The regression uses the path segments between exercise dates for prediction. The issue for American options is that there is no "path" between exercise times, since every time step is an exercise time. Thus, this method would be too biased toward the most recent iterations to predict the next price accurately.
A potential workaround is to use the
Another issue is that this stratification requires access to all paths used for the calculation, while loop blocking was used to optimize runtime, so paths were generated in limited-size batches rather than held in memory all at once.
Loop blocking (or tiling) is implemented when iterating over the total number of desired simulations: the total is divided into smaller blocks, each processed to completion before moving on, so that the data a block touches fits in the CPU cache, as in the sketch below.
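A sketch of the idea (the block size, the names, and the simple European-call payoff are illustrative assumptions chosen to keep the example short; the repo applies the same pattern to the LSMC loop):

```cpp
#include <algorithm>
#include <cmath>
#include <random>
#include <vector>

// Monte Carlo estimate processed in cache-friendly blocks: each block of
// paths is generated and reduced before the next one starts, so its data
// stays hot in cache. BLOCK is a tunable constant.
constexpr long BLOCK = 4096;

double blockedMC(long nTotal, double S0, double r, double sigma, double T,
                 double K, std::mt19937_64& rng) {
    std::normal_distribution<double> normal(0.0, 1.0);
    std::vector<double> ST(BLOCK);  // one block of terminal prices, reused
    double sum = 0.0;
    for (long start = 0; start < nTotal; start += BLOCK) {
        const long m = std::min(BLOCK, nTotal - start);
        for (long p = 0; p < m; ++p)   // generate the block
            ST[p] = S0 * std::exp((r - 0.5 * sigma * sigma) * T
                                  + sigma * std::sqrt(T) * normal(rng));
        for (long p = 0; p < m; ++p)   // reduce the block while it is hot
            sum += std::max(ST[p] - K, 0.0);  // European call payoff
    }
    return std::exp(-r * T) * sum / nTotal;
}
```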
Loop interchanges were tested in the generate-data.cpp file, but the improvements were negligible. I assume this is because the lengths of the loops being reordered were in the single digits, so swapping them could not make a great difference. The concept of "loop interchange" boils down to making the outermost for loop the one with the greatest number of iterations/data and the innermost loop the one with the fewest, since a small inner working set can stay in the CPU cache instead of moving to DRAM. This achieves a similar outcome to loop blocking. An example of this is multiplying a tall-and-skinny matrix by a vector, where the short dimension belongs in the inner loop; see the sketch below.
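For instance, a matrix-vector product with a tall-and-skinny matrix (a hypothetical illustration, not code from the repo):

```cpp
#include <cstddef>
#include <vector>

// y = A * x for a tall-and-skinny A (nRows >> nCols), stored row-major.
// The long dimension runs in the outer loop; the short inner loop touches
// one contiguous row of A plus the small vector x, which stays cached.
std::vector<double> matVec(const std::vector<double>& A,  // nRows * nCols
                           const std::vector<double>& x,
                           std::size_t nRows, std::size_t nCols) {
    std::vector<double> y(nRows, 0.0);
    for (std::size_t i = 0; i < nRows; ++i)       // outer: many iterations
        for (std::size_t j = 0; j < nCols; ++j)   // inner: few, cache-hot
            y[i] += A[i * nCols + j] * x[j];
    return y;
}
```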
I also learned the term loop fusion, in which independent computations are placed in the same for loop to save passes over the data. This is something I already do, but it's good to know the code is more efficient this way.
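A trivial example of fusion (illustrative only):

```cpp
#include <vector>

// Loop fusion: compute the running sum and the running sum of squares of
// the payoffs in one pass instead of two separate loops over the data.
void fusedMoments(const std::vector<double>& payoffs,
                  double& sum, double& sumSq) {
    sum = 0.0;
    sumSq = 0.0;
    for (double v : payoffs) {  // one traversal serves both computations
        sum += v;
        sumSq += v * v;
    }
}
```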
Multithreading was implemented via thread pooling to decrease computation time. The linked article explains the concept well; in essence, I split the price computation for each basis onto a different thread, since the prices generated by each basis for each set of inputs are independent. At a high level, I created a lambda function that adds each iteration's results to a row vector, used a mutex lock to keep the threads' writes separate, and then joined the threads and appended the computed rows to a data frame for processing.
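A stripped-down sketch of that pattern (using `std::thread` directly rather than the repo's pool; `priceWithBasis` is a stand-in for the actual pricing routine):

```cpp
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Stand-in for the real pricing routine: returns a price for one basis.
double priceWithBasis(int basisId) { return 1.0 * basisId; }

// Price with each basis on its own thread; the computations are
// independent, so only the shared results container needs a mutex.
std::vector<std::pair<int, double>> priceAllBases(int numBases) {
    std::vector<std::pair<int, double>> rows;
    std::mutex rowsMutex;
    std::vector<std::thread> workers;
    for (int b = 0; b < numBases; ++b) {
        workers.emplace_back([b, &rows, &rowsMutex] {
            const double price = priceWithBasis(b);   // independent work
            std::lock_guard<std::mutex> lock(rowsMutex);
            rows.emplace_back(b, price);              // guarded append
        });
    }
    for (auto& t : workers) t.join();                 // wait for all rows
    return rows;
}
```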
The optimizations made the method more accurate and faster. The Power basis usually performs about the same as the Hermitian and Laguerre bases, which should (hypothetically) perform better due to their increased complexity. Thus, using the more complex bases is not especially valuable: they provide similar accuracy without decreasing computation time relative to the Power basis.
Typically, higher volatility, a strike price further out of the money, a lower risk-free rate, and a longer expiry all imply a greater difference between the original and optimized errors (the optimized method is almost always more accurate). I drew this conclusion by inspecting each of the plots and varying the standard input values to cross-confirm the trends (mentally constructing a 3D plot between three variables to make sure the results align). One can verify this via the optimization_analysis.ipynb file, or by generating the 3D plots directly.
There is a significantly greater error across all cases for put options than for calls. I am certain the error for the original method is computed correctly, meaning there must be a methodology error in how the optimized method computes put prices; this likely stems from how the path conditions are formulated. The Laguerre basis also performed relatively poorly for longer expiry times.
On average, the optimized method decreased computation time relative to the original.
In short, use the optimized method with the Power basis: it is about as fast as, and similarly accurate to, the more complex Laguerre and Hermitian bases, while being simpler to implement and interpret.
Fast. Really slow to debug. It would be better with a wrapper language with a bunch of implicit functions, simpler syntax, and pointer handling so the user can spend less time developing. If only a version of C++ like that existed...