beating_uniswap.Rmd

---
title: "Beating Uniswap"
author: "charliemarketplace // Flipside Crypto"
date: "`r Sys.Date()`"
output:
  html_document:
    css: "styles.css"
    includes:
      in_header: header.html
    code_folding: hide
    toc: true
    toc_float: true
editor_options: 
  chunk_output_type: console
---

```{r}
knitr::opts_chunk$set(warning = FALSE, message = FALSE)
# set TRUE to run calculate_profits, otherwise just read from saved runs in repo
run = FALSE
```

# Abstract

Uniswap v3 allows for concentrated liquidity in automatic market making. 
Given a budget constraint, e.g., 100 ETH, a period of time, e.g., 10,000 blocks, 
and the set of known trades within a specific pool, e.g., ETH-WBTC 0.3% pool on 
Ethereum mainnet, it is possible to calculate the profit maximizing range
for a set of trades using common optimization methods such as bounded L-BFGS-B
to enforce constraints such as the budget and the tick_lower.

For example (detailed in the vignette of the corresponding uniswap R package),
there were 169 trades in the ETH-BTC 0.3% pool on Ethereum Mainnet between 
blocks 16,000,000 and 16,010,000. Starting with 100 ETH of total value, depositing 
91 ETH and 0.655 BTC in the 13.529 - 13.757 ETH/BTC range generates 
32391087272403693 Liquidity. 

At block 16,010,000 the position becomes 80.688 ETH and 1.398 BTC. It accumulates
1.530 ETH and 0.0151 BTC in trading fees. The total value of fees and the position, using the price *at* block 16,010,000 (13.71331 ETH / BTC) results in 101.60 ETH, 
a 1.6% gain in 33 hours (400% APY).

The goal of this paper is to model these perfect ranges over both defined and randomized intervals to identify if there is a simple but profitable auto-regressive strategy available for highly correlated pools. More advanced models such as LSTM or RNNs are not analyzed here.

# Methodology 

A detailed guide/presentation is available in the repo as `Beating_Uniswap.pdf`, here is a shortened summary.

- A liquidity position is comprised of a balance of assets `[token0, token1]` 
added to a liquidity pool at a `current_price` with a `[tick_lower, tick_upper]` 
indicating the concentration of liquidity between those assets (i.e., when prices are below tick_lower or above tick_upper, the position is 100% in the lower valued asset).

- This means providing liquidity is the act of selling winners (rising assets) to buy losers (automatically market making for traders) and taking a fee for doing so.

- This creates a standard accounting/business dilemma. The liquidity provider (LP) seeks to earn revenue (fees from market making) above their costs (price divergence between the winning and losing asset).

- Going a level deeper, this is actually a constrained optimization problem. Liquidity of a position is an actual number calculable from the inputs `[token0, token1, current_price, tick_lower, tick_upper]` (see: ?get_liquidity in the package). Given the same assets at the same price (here, current_price means the price when position is created) the more narrow the range (i.e., the closer tick_lower and tick_upper are around current_price) the higher the marginal liquidity is and thus the higher percent of trading fees the position gets (this is called capital efficiency. Same assets, but more revenue).

- But when prices fall out of the range, the position is both getting *no* fees for those trades and is taking on all the price divergence risk.

- Thus, given a `budget` (e.g., 100 ETH) at a starting point (e.g., Block 16M) and the exact series of trades that occur between the starting point and a user defined end point (e.g., Block 16,010,000), it is possible to calculate what *would have been* the absolute best asset breakdown and range for a given target outcome (e.g., growing from 100 ETH to >100 ETH). In some cases the optimal may be to not participate in market making during the time period at all (price divergence was significantly more than fees). In other cases the optimal may be an extremely narrow (capital efficient) range because price divergence (cost) was minimal so fees (revenue) should be maximized.

# Framework

161,984 trades occur in the ETH-WBTC 0.3% pool from inception (Block 12,369,886)
to block 17,000,000.

Starting at 12,400,000 for simplicity (early pool history had low liquidity) 
there are 161,311 trades.

Breaking this into 460 chunks of 10,000 blocks each we can calculate the optimal 
range for each chunk given a 100 ETH starting value and taking the initial tick 
of the trade as the current price and tick of the last trade as the final price.

```{r}
library(dplyr)
library(plotly)
library(uniswap)

data("ethwbtc_trade_history")
trades_124_170 <- ethwbtc_trade_history[ethwbtc_trade_history$block_number >= 12400000, ]

breaks <- seq(12400000, 17000000, by = 10000)

trade_breaks <- list()
for(i in 1:(length(breaks)-1)){
  trade_breaks[[i]] <- trades_124_170[trades_124_170$block_number >= breaks[i] & 
                                    trades_124_170$block_number < (breaks[i]+10000),
                                      ]
}

```

For each chunk we do a brute force search for a maximum strategy value 
using an expanded grid of (0.1 - 0.9) * budget as the initial amount1 and 
(0.1 - 0.9) * current_price as the initial tick_lower. I.e., we test 81 potential 
values to identify an initial parameter to then apply optimization.

```{r}
budget = 100 # 100 ETH

if(run == FALSE){
  range_info <- readRDS("optimal_ranges_info_12M400K_17M.rds")
} else {
range_info <- lapply(X = 1:460, FUN = function(i){
print(i)
  # 1e10 decimal is b/c WBTC is 8 decimals and WETH is 18
  p1 <- tick_to_price(trade_breaks[[i]]$tick[1],decimal_adjustment =  1e10)
  p2 <- tick_to_price(tail(trade_breaks[[i]]$tick, 1),decimal_adjustment =  1e10)
  
 low_price <- ((1:9)/10)*p1
 amount_1 <- c(1, budget*(1:9)/10)

grid <- expand.grid(x = amount_1, y = low_price)

sv <- lapply(1:nrow(grid), function(j){
tryCatch({
    calculate_profit(
      params = c(grid[j,1], grid[j,2]),
      budget = budget, p1 = p1, p2 = p2, trades = trade_breaks[[i]],
      decimal_x = 1e8, decimal_y = 1e18, fee = 0.003,
      denominate = 1,
      in_optim = TRUE)
  }, error = function(e){return(0)})
})

sv <- unlist(sv)

# initialize using naive search min
init_params <- as.numeric(grid[which.min(sv), 1:2])

# lower_bounds(amount1 = 0.01 ETH, p1 = 1 ETH/BTC)
# upper_bounds(amount1 = 99.9 ETH, p1 = 0.99 * current price)
lower_bounds <- c(0.01, 1)
upper_bounds <- c(99.9, 0.99*p1)

# in_optim = TRUE provides *only* -1*strategy value for optimization
# (-1 b/c algorithm looks for minimums and we want maximum)

result <- optim(init_params,
                calculate_profit,
                method = "L-BFGS-B",
                lower = lower_bounds,
                upper = upper_bounds, budget = 100,
                p1 = p1, p2 = p2,
                trades = trade_breaks[[i]],
                decimal_x = 1e8, decimal_y = 1e18,
                fee = 0.003, denominate = 1, in_optim = TRUE)

 # in_optim = FALSE provides full audit of calculation 
  profit = calculate_profit(params = result$par,
                 budget = 100, p1 = p1, p2 = p2,
                 trades = trade_breaks[[i]],
                 decimal_x = 1e8, decimal_y = 1e18,
                 fee = 0.003, denominate = 1, in_optim = FALSE)
  
  return(
    list(
      p1 = p1, 
      p2 = p2,
      init_params = init_params,
      result_par = result$par,
      result_warn = result$message,
      position_details = profit$position,
      strategy_details = profit$strategy_value
    )
  )
  
})
}

```

# AR(1) Model

Given a `perfect position` for set of 10,000 blocks (33 hours), a simple Autoregressive-1 model would re-use the parameters of the perfect position for the next set, i.e., making the assumption
that today will be similar to yesterday.

But, because our parameters are `[amount1, tick_lower]` and the price can change 
(e.g., the price can fall below the previous perfect range's `tick_lower`) some parameters 
may no longer be acceptable. If yesterday the price went from 13 to 12 ETH/BTC. And the optimal
range was 12.5 - 13.5 ETH/BTC. You cannot re-use 12.5 as the `tick_lower` because it is above 
the current price (12 ETH/BTC).

So an augmentation is applied to the AR(1) model where the `tick_lower` goes through a test: 

- If tick_lower from the perfect position < current price; re-use tick_lower
- Otherwise make the tick lower the same percent of the current price as the tick_lower from the perfect position was from the previous beginning price.

In the example case, the range was 12.5 - 13.5 ETH/BTC. We cannot use 12.5 because the 
current price is 12. So instead we use the fraction 12.5/ 13 (tick_lower / yesterday's initial price) which is 0.961, and set our tick_lower to that ratio of th current price, 0.961 * 12 = 11.538

In some instances, even this adjustment results in unusable parameters. For example, the budget in one period may be 80% BTC and 20% ETH for a given tick_lower. Attempting to keep this ratio for a given tick_lower at a new price may result in the math returning a tick_upper that is below the tick_lower. 

In the smart contracts of Uniswap v3, e.g., when calculating liquidity, anytime the math would result in tick_lower > tick_upper they simply switch the ticks. tick_lower becomes tick_upper and vice versa to ensure tick_lower < tick_upper.

But this can't work in the AR(1) case as it results in entirely different position structure that doesn't align to how anyone would think of AR(1). In these cases, the budget ratio (80% BTC and 20% ETH) is preserved and the tokens are simply kept out of range entirely earning 0 fees but still be assessed a strategy value.


```{r}

if(run == FALSE){
  ar1_info <- readRDS("ar1_range_info_12M410K_17M.rds")
} else {
  
ar1_info <- lapply(2:460, function(i){
  
  params <- range_info[[(i-1)]]$result_par
  
  if(params[2] >= range_info[[i]]$p1){
  params[2] <- range_info[[i]]$p1 * (params[2] / range_info[[i-1]]$p1) 
  }
  
  ar1_profit = calculate_profit(params = params,
                              budget = 100,
                              p1 = range_info[[i]]$p1, 
                              p2 = range_info[[i]]$p2,
                 trades = trade_breaks[[i]],
                 decimal_x = 1e8, decimal_y = 1e18,
                 fee = 0.003, denominate = 1, in_optim = FALSE)

  return(
    list(
      p1 = range_info[[i]]$p1, 
      p2 = range_info[[i]]$p2,
      used_params = params,
      position_details = ar1_profit$position,
      strategy_details = ar1_profit$strategy_value
    )
  )
})

# shift breaks forward
ar1_info <- c(list(NULL), ar1_info)
}

```

Using this simple, augmented if required, model to identify if profitability can be forecasted.

Generally, AR(1) is not a profitable strategy. With a median loss of -0.2852 ETH and 
average loss of -0.536 ETH, and an interquartile range of -0.97 to +0.449.

```{r}

perfect_netprofit_eth <- unlist(lapply(range_info, FUN = function(x){
    x$strategy_details$value - 100
}))

ar1_netprofit_eth <-c(0, unlist(lapply(ar1_info, FUN = function(x){
    x$strategy_details$value - 100
})))

plotly::plot_ly(x = ar1_netprofit_eth, type = "histogram", nbinsx = 1000, color = 'AR1') %>% 
  layout(title = list(text = "Distribution of ETH Profit using AR(1) Parameters",
                      y = 0.95),
         xaxis = list(title = "ETH Profit above Budget"),
         yaxis = list(title = "#")) %>% 
  add_trace(
    x = perfect_netprofit_eth, type = "histogram", nbinsx = 1000, color = 'Perfect'
  )
  
```

# Visualizations 

```{r}
plot_ly(x = 1:460, y = perfect_netprofit_eth, type = 'scatter',
        name = 'Perfect',
        mode = 'lines') %>% 
  add_trace(x = 1:460, y = ar1_netprofit_eth, name = 'AR(1)') %>%
  layout(title = list(text = "Profits for Perfect and AR(1) over Time",
                      y = 0.95),
         xaxis = list(title = "1 Break = 10,000 Blocks"),
         yaxis = list(title = "# ETH Profit above Budget"))

```

The distribution of perfect parameters is expectedly positive given we are denominating 
in a single asset within the pool (ETH) with a few negatives due to how parameter boundaries may 
require participation in breaks where it would have been best to not participate. 

```{r}
plotly::plot_ly(x = perfect_netprofit_eth, type = "histogram", nbinsx = 1000) %>% 
  layout(title = list(text = "Distribution of ETH Profit using Perfect Parameters",
                      y = 0.95),
         xaxis = list(title = "ETH Profit above Budget"),
         yaxis = list(title = "#"))

```

Overall AR(1) is a pretty weak model structure for this. But post-merge the variance 
has fallen.

```{r}
plotly::plot_ly(x = perfect_netprofit_eth, 
                y = ar1_netprofit_eth, color = c(rep("Pre-Merge",314), rep("Post-Merge",146)),
                type = "scatter") %>%
  layout(title = list(text = "Distribution of ETH Profit Perfect vs AR(1)",
                      y = 0.95),
         xaxis = list(title = "Perfect Profit"),
         yaxis = list(title = "AR(1) Profit"),
         legend = list(title = list(text = "10K Block Era"))) 
```

Price divergence in ETH/BTC terms has fallen since the merge which likely explains why AR(1) is slightly 
better post merge as volume stays within a smaller price range.

```{r}
open_close <- do.call(what = rbind, args = lapply(range_info, FUN = function(x){
   data.frame(open = x$p1,
              close = x$p2
   )
}))

plot_ly(data = open_close, x = 1:460, y = ~(close - open), type = "scatter", name = "Close-Open") 

```

Because Profit is denominated in ETH, price divergence would make some profits trivial. 
If BTC becomes more valuable against ETH (ETH/BTC rises) then earning ETH is simple jsut 
sell the BTC for rising ETH.

It is more interesting to see where price divergence against ETH can still have positive 
profit, but in general, if ETH becomes more valuable you're losing in ETH terms as you sell 
it for WBTC.

The rough linearity visible aligns to this expected relationships. Deviations from the 
best fit line are a mix of (1) significant volume causing high trade fees when profit is positive and (2) the AR(1) parameters being especially far from the actual perfect range parameters (e.g., if AR(1) has a low ETH allocation but price divergence is highly positive causing high loss in ETH terms and low trading fees as the position falls out range).

```{r}
plot_ly() %>%
  add_trace(data = open_close, 
        x = ~(close-open),
        y = ar1_netprofit_eth,
        color = c(rep("Pre-Merge",314), rep("Post-Merge",146)), 
        hovertext = breaks[1:460]) %>%
  add_trace(x = c(-2,-1,0,1,2), y = c(-10, -5, 0, 5, 10), mode = "lines", name = "y=5x") %>% 
  layout(title = list(text = "AR(1) Profit over 10K Block Period",
                      y = 0.95),
         xaxis = list(title = "Price Deviation (Close ETH/BTC - Open ETH/WBTC)"),
         yaxis = list(title = "AR(1) Profit in ETH")) 

```