2nd April 2023

How To Create A Trillion Dollars On Uniswap

Problem:

Accurately determining USD prices through liquidity pools on decentralized exchange (DEX) protocols, such as Uniswap V3, presents a significant challenge in safeguarding against edge cases. These edge cases can have substantial implications for higher-order metrics like trading volume and total value locked (TVL).

Goal:

Our objective is to thoroughly examine the root causes of these edge cases and devise strategies to strengthen the robustness of our pricing mechanism. As our subgraphs on The Graph Network specifically rely on these liquidity pools for deriving token prices, enhancing their resilience is crucial. Doing so aims to improve our subgraphs' overall quality and reliability.

Summary:

This article will explore the challenges and potential solutions for deriving token prices from liquidity pools in decentralized exchange protocols. We begin by reviewing the fundamentals of AMMs and concentrated liquidity, offering a solid foundation for understanding the mechanics at play. Next, we delve into two specific scenarios where edge cases for pricing can arise, highlighting the complexities involved. Finally, we discuss potential solutions to address these edge cases, ultimately contributing to a more robust and reliable pricing mechanism.

Constant Function Market Maker (CFMM):

CFMMs are a model used for automated market-makers (AMMs) such as Uniswap to create a mathematically determined and controlled price that fluctuates automatically as users interact with the protocol.

One such CFMM model is a constant product market-maker (CPMM).

  • X*Y = k
  • A change in X or Y forces an offsetting change in the other so that their product stays equal to the constant k.
  • In a liquidity pool on a DEX like Uniswap, k remains fixed when token X and token Y are traded for one another, but the balances of X and Y, and therefore k, may change when liquidity (X & Y) is deposited or withdrawn.

Example:

  • If a Uniswap V2 pool contains 1 ETH and 1000 Link, then k = 1000
  • How much Link would you get back if you swap in 0.75 ETH (ignoring fees)?
    • New ETH balance = 1.75 ETH
    • New Link balance = Y
      • Y = k / 1.75 = 1000 / 1.75 ≈ 571 Link
  • You receive 1000 - 571 = 429 Link.
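The arithmetic in this example can be reproduced with a few lines of Python. This is a minimal sketch that ignores swap fees, which a real pool would deduct; the function name and parameters are illustrative and not taken from any Uniswap codebase.

```python
def constant_product_swap_out(reserve_in: float, reserve_out: float, amount_in: float) -> float:
    """Tokens received for amount_in under x * y = k (fees ignored)."""
    k = reserve_in * reserve_out          # invariant before the swap
    new_reserve_in = reserve_in + amount_in
    new_reserve_out = k / new_reserve_in  # balance that keeps x * y = k
    return reserve_out - new_reserve_out


# Pool from the example: 1 ETH and 1000 Link, swapping in 0.75 ETH.
link_out = constant_product_swap_out(reserve_in=1.0, reserve_out=1000.0, amount_in=0.75)
print(round(link_out))  # 429
```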

Concentrated Liquidity:

On Uniswap V3, users can create positions where the liquidity is only distributed within a specific range.

Liquidity is distributed across this range in discrete intervals at locations called ticks according to a truncated curve. When the price reaches a point on the curve that touches or surpasses an axis, all X is converted to token Y or vice versa.

The price moves depending on the liquidity available at the current price (tick). During a swap, when the liquidity of the outgoing token is exhausted at the current price, the price then moves to the next tick, and the rest of the swap will occur against the liquidity at this price, and so on, until the swap order is fulfilled. The price may move multiple or no ticks, depending on the swap size and the liquidity available at each tick. Discrete movements in price occur on a continuous curve represented by X*Y = k.

Single Position Balance Curve

Liquidity Pool Price Curve

How Do We Price Tokens In DEX Subgraphs?

All token prices are ultimately derived from a USD-denominated token such as USDC, DAI, and Tether.

When a token is paired with a USD-denominated token, we can derive the price of the other token by determining its relative value in terms of the USD-denominated token. The relative balance of tokens determines their relative value, as in our review of CFMMs, or their relative value is determined by the current tick, as explained in our review of concentrated liquidity.
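Both ways of reading a relative price can be sketched in Python. The helper names below are illustrative. The tick formula relies on the Uniswap V3 convention that tick i corresponds to a raw price of 1.0001^i (token1 per token0, in smallest units), which then has to be adjusted for each token's decimals.

```python
def price_from_reserves(usd_token_reserve: float, other_token_reserve: float) -> float:
    """USD price of the other token in a constant-product pool (balance ratio)."""
    return usd_token_reserve / other_token_reserve


def price_from_tick(tick: int, decimals0: int, decimals1: int) -> float:
    """Price of token0 in units of token1 at a given tick, adjusted for decimals."""
    raw_price = 1.0001 ** tick                       # token1 per token0 in smallest units
    return raw_price * 10 ** (decimals0 - decimals1)  # convert to whole-token units


# A pool holding 1,500 USDC and 1 wETH prices wETH at 1,500 USD.
print(price_from_reserves(1_500.0, 1.0))  # 1500.0
# At tick 0 with equal decimals, the two tokens trade one for one.
print(price_from_tick(0, 18, 18))         # 1.0
```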

Once a token has received a USD price this way, we can use it in turn to derive the prices of other tokens it is paired with.

Pricing Mechanism - Implementation Constraints

Allowing a price to be propagated from one token to another without discretion can lead to price instability or inaccuracy, as some liquidity pools are less reliable or completely unreliable. Many liquidity pools may be on offer to price a token, so when should we rule them out, and how do we choose?

We start with two constraints on the price derivation process:

  • A token must receive its price from a liquidity pool that pairs it with one of a small set of curated tokens.
    • Usually, the curated tokens are selected based on TVL in the protocol. We have this preference because, with higher TVL, they are more likely to propagate a stable and accurate price, and the prices of other tokens will be tethered to commonly traded assets.
  • The token that gives a price to another must have a TVL in that pool above a chosen threshold.
    • Similarly, with higher TVL, there is more likely to be high visibility and activity on these pools, which makes them less likely to represent prices far off the market value.

Among the liquidity pools that meet the above constraints, we select the pool with the highest TVL to give a price to the token.
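The selection rule above could look roughly like the following sketch. The `Pool` fields, the curated set, and the threshold are all hypothetical; the 250k USD figure is borrowed from the example in Scenario 2. The sketch also assumes that "highest TVL" means the pool's total TVL, which the text does not spell out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pool:
    address: str
    pricing_token: str            # token that would give the price
    pricing_token_tvl_usd: float  # USD value of that token held in the pool
    total_tvl_usd: float


CURATED_TOKENS = {"USDC", "DAI", "USDT", "WETH"}  # hypothetical curated set
MIN_PRICING_TVL_USD = 250_000                      # hypothetical threshold


def select_pricing_pool(pools: list[Pool]) -> Optional[Pool]:
    """Apply both constraints, then pick the eligible pool with the highest TVL."""
    eligible = [
        p for p in pools
        if p.pricing_token in CURATED_TOKENS
        and p.pricing_token_tvl_usd >= MIN_PRICING_TVL_USD
    ]
    return max(eligible, key=lambda p: p.total_tvl_usd, default=None)


candidates = [
    Pool("0xaaa", "WETH", 300_000, 700_000),
    Pool("0xbbb", "USDC", 400_000, 650_000),
    Pool("0xccc", "SCAM", 900_000_000, 2_000_000_000),  # not curated
    Pool("0xddd", "DAI", 100_000, 5_000_000),           # below threshold
]
print(select_pricing_pool(candidates).address)  # 0xaaa
```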

Edge Cases

While the methodology and constraints outlined above do well in assigning an accurate price, some edge cases still slip through and need to be corrected in the data. These outliers have a large effect on metrics like TVL and trading volume. Below, we'll illustrate two scenarios that can cause these outliers.

Scenario 1

Explanation

There is no liquidity between the upper and lower positions on the curve. As a result, a small swap can cause the price to slip very far down the curve and cause a massive imbalance in the relative price of tokens.

  • 1 wETH = 0.00000001 Link
  • 1 wETH = $1,500
  • 1 Link = 100,000,000 wETH
  • 1 Link = $150,000,000,000
  • 10 Link = 1.5 Trillion USD!!!

Note: There is still considerable liquidity in the upper and lower positions of the pool, so the minimum liquidity threshold can still easily be met.
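The chain of conversions in this scenario is short enough to check in code. The sketch below simply inverts the pool ratio and multiplies by the wETH price from the list above; the function name is illustrative.

```python
def implied_usd_price(base_usd_price: float, other_per_base: float) -> float:
    """USD price of the other token when 1 base token buys other_per_base of it."""
    return base_usd_price / other_per_base


# 1 wETH = 0.00000001 Link and 1 wETH = $1,500, as in the scenario.
link_usd = implied_usd_price(base_usd_price=1_500.0, other_per_base=0.00000001)
print(f"1 Link  = ${link_usd:,.0f}")       # 1 Link  = $150,000,000,000
print(f"10 Link = ${10 * link_usd:,.0f}")  # 10 Link = $1,500,000,000,000
```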

Scenario 2

A token's value may be artificially propped up in a pricing pool while a much larger amount of the same token sits in some other pool. This can also create the appearance of an arbitrage opportunity between the two pools that nobody actually takes, so a large amount of value appears to exist, more or less created out of thin air.

Imagine we have two pools:

  • Pool 1 = Token X & Token Y
  • Pool 2 = Token Y & Token Z

Pool 1 is a pool that propagates a USD price from token X to token Y. Imagine it has > 250k USD in token X to meet our criteria.

Pool 2 is a pool causing the TVL to spike in our data, as seen in the image below.

How could these pools work together to produce a price that would cause a spike in the data?

  • Calculating the USD value of token Y from token X in pool 1:
    • token X price * (token X/token Y relative price).
      • $1525 * 0.0003956.
        • $0.603 per token Y.
  • Calculating USD TVL of token Y in each pool from this price.
    • Pool 1:
      • 467,090 token Y balance * $0.603 per token Y.
      • $281,655 token Y TVL.
    • Pool 2:
      • 1,456,594,115 token Y balance * $0.603 per token Y.
      • $878,326,251 token Y TVL.

Pool 2 now has a token Y TVL of over $800 million, which is far too large for an average liquidity pool.
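The same calculation, written as a small Python sketch using the numbers from the list above, shows how one propagated price inflates the TVL of every pool holding token Y. Variable names are illustrative.

```python
token_x_usd = 1525.0   # USD price of token X
x_per_y = 0.0003956    # relative price of token Y in units of token X (pool 1)

# Price propagated from token X to token Y, rounded to the $0.603 used above.
token_y_usd = round(token_x_usd * x_per_y, 3)

token_y_balances = {"Pool 1": 467_090, "Pool 2": 1_456_594_115}
token_y_tvl = {name: balance * token_y_usd for name, balance in token_y_balances.items()}

for name, tvl in token_y_tvl.items():
    print(f"{name}: ${tvl:,.0f} token Y TVL")
# Pool 1: $281,655 token Y TVL
# Pool 2: $878,326,251 token Y TVL
```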

This example is a case we found in our Uniswap V3 Arbitrum subgraph. When we looked deeper, however, we found that pool 1 and pool 2 both appeared to pair wETH with Arcadeum.

GraphQL API request

This data request was made using The Graph on the Uniswap V3 subgraph. Play around with our Uniswap V3 Ethereum subgraph here using the subgraph Hosted Service. Here are some example queries you can use!

Pool 1

Pool 2

If both pools contained wETH and Arcadeum, why would pool 1 propagate the price and not pool 2, since pool 2 has a higher balance of wETH?

We discovered that the wETH in pool 2 and the Arcadeum in both pools are tokens that imitate the real versions. They have:

  • Same name and/or symbol
  • Different address

Between pool 1 and pool 2, we have three tokens (real wETH, imitation wETH, and imitation Arcadeum):

  • Pool 1: wETH/Arcadeum(imitation)
  • Pool 2: wETH(imitation)/Arcadeum(imitation)

Pool 2 Tokens

Arcadeum Token (Imitation)

Arcadeum Token (Real)

wETH Token (Imitation)

The essential point for this edge case is that these imitation tokens are either not tradeable on the public market, or their ERC20 contracts may have been modified to set up a honeypot in the pricing pool. This is evidenced by their low on-chain activity and by the fact that an obvious and profitable arbitrage opportunity between pool 1 and pool 2 goes unexploited.

If someone could acquire a small amount of this imitation wETH in pool 2, which does not have any value outside of this context, they would be able to trade for a large amount of real wETH in pool 1 by first trading for Arcadeum in pool 2.

Note: There is also an illusion of an arbitrage opportunity in the other direction if the user is unaware that the wETH in pool 2 is not the real wETH and has no value except for being exchanged back for Arcadeum in the same pool. This may be the real intent of the scheme. For example, a user swaps some wETH for Arcadeum in pool 1 and then tries to trade the Arcadeum for more wETH in pool 2, since its relative value in Arcadeum is lower there, only to realize it is not the real wETH. When they try to swap back for their wETH in pool 1, the LP removes the liquidity or the swap is blocked by the imitation token contract (honeypot), and they are stuck holding Arcadeum or the imitation wETH, neither of which has any value.

Addressing The Problem

Two Classes:

  1. Poor liquidity distribution can cause significant shifts in price from small trades.
    • Especially prone to causing issues on concentrated liquidity protocols.
  2. Token value is artificially propped up in a pricing pool while a much larger amount of this token exists in another pool. The value created cannot be captured through arbitrage because the tokens involved are unavailable on the public market.
    • This has only been seen to happen with imitation tokens, but this may not be the case in all circumstances.
    • It's hard to say why people create these imitation tokens and pools, but it looks like an attempt at scamming users looking for arbitrage.

Solution - Outlier arising from poor liquidity distribution

In each example we have seen thus far, these problematic liquidity pools either remain inactive or shift back to a normal price quickly after a swap moves the price drastically. Requiring price changes larger than some threshold to be re-confirmed has proven helpful in improving robustness against large spikes in our metrics.

  • Not a rock-solid solution, but it is empirically tested and appears to work in practice for this class of issue.
  • This constraint would not be robust against attacks or circumstances where trades continue to confirm the price.
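One way to picture price re-confirmation is the sketch below. The article does not specify how re-confirmation is implemented, so the class name, the ratio-based threshold, and the rule "accept a large jump only once a following observation lands near it" are all assumptions made for illustration.

```python
from typing import Optional


class ReconfirmedPrice:
    """Hold back large price jumps until a later observation confirms them."""

    def __init__(self, max_ratio: float = 5.0) -> None:
        self.max_ratio = max_ratio             # hypothetical threshold
        self.accepted: Optional[float] = None  # price used for metrics
        self.pending: Optional[float] = None   # large jump awaiting confirmation

    def _close(self, a: float, b: float) -> bool:
        return max(a / b, b / a) <= self.max_ratio

    def update(self, observed: float) -> float:
        if observed <= 0:
            raise ValueError("price must be positive")
        if self.accepted is None or self._close(observed, self.accepted):
            # Normal move: accept it and drop any unconfirmed spike.
            self.accepted, self.pending = observed, None
        elif self.pending is not None and self._close(observed, self.pending):
            # A second observation agrees with the earlier jump: confirm it.
            self.accepted, self.pending = observed, None
        else:
            # Large jump seen for the first time: remember it, keep the old price.
            self.pending = observed
        return self.accepted


tracker = ReconfirmedPrice(max_ratio=5.0)
print(tracker.update(1_500.0))  # 1500.0
print(tracker.update(1.5e11))   # 1500.0 (spike is held back)
print(tracker.update(1_510.0))  # 1510.0 (price returned to normal, spike discarded)
```

As the second bullet above notes, a rule like this does nothing if later trades keep confirming the bad price.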

A second approach is to price tokens using the ratios reflected in recent swaps instead of the relative price given by the liquidity pool's current state. Swaps would give a more accurate price because little or none of a swap's token value is exchanged at the price at the tail end of the curve.
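A minimal sketch of swap-based pricing follows. It assumes a volume-weighted average over a window of recent swaps, which is one possible reading of "ratios reflected in recent swaps". The `Swap` type and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Swap:
    amount_token: float  # amount of the token being priced
    amount_quote: float  # amount of the already-priced token exchanged for it


def price_from_swaps(swaps: list[Swap], quote_usd_price: float) -> Optional[float]:
    """Volume-weighted USD price implied by the amounts actually exchanged."""
    total_token = sum(s.amount_token for s in swaps)
    total_quote = sum(s.amount_quote for s in swaps)
    if total_token == 0:
        return None  # no swap activity, so no price can be derived
    return total_quote / total_token * quote_usd_price


# Two recent Link/wETH swaps, with wETH priced at 1,500 USD.
recent = [Swap(amount_token=100.0, amount_quote=0.4), Swap(amount_token=50.0, amount_quote=0.2)]
print(price_from_swaps(recent, quote_usd_price=1_500.0))  # roughly 6 USD per Link
```

Because the price is weighted by exchanged amounts, a swap that drags the pool's spot price far down the curve while moving almost no value there has little influence on the result.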

Solution - Outliers arising from artificially propped-up token prices

The solutions above will not address outliers of this type. We have looked at the activity in the pools causing these outliers, and in some cases they appear to be moderately active, so price re-confirmation may be circumvented.

The best path forward is to devise clever ways to detect and prune calculations involving tokens where we cannot derive a fair market value price like in the scheme outlined in scenario 2. A few options are:

  • Curate a list of tokens to get prices for and ignore others.
  • Since imitation tokens tend to reuse names and symbols of real tokens, we can ignore pricing pools when a token in the pool reuses a symbol of a known token.
  • Find attributes of ERC20 tokens that indicate we may not be able to get a fair market value price, such as:
    • Number of holders.
    • Number of transactions.
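The symbol-reuse check from the list above could be sketched as follows. The registry of known tokens and the addresses in it are placeholders, not real contract addresses, and the function names are illustrative.

```python
# Hypothetical registry mapping a known symbol to its real contract address.
KNOWN_TOKENS = {
    "WETH": "0xreal_weth_placeholder",
    "USDC": "0xreal_usdc_placeholder",
}


def is_imitation(symbol: str, address: str, known: dict[str, str] = KNOWN_TOKENS) -> bool:
    """True when a token reuses a known symbol but lives at a different address."""
    real_address = known.get(symbol.upper())
    return real_address is not None and real_address.lower() != address.lower()


def pool_is_priceable(tokens: list[tuple[str, str]]) -> bool:
    """Ignore a pricing pool if any of its (symbol, address) tokens is an imitation."""
    return not any(is_imitation(symbol, address) for symbol, address in tokens)


print(pool_is_priceable([("wETH", "0xfake_weth_placeholder"), ("ARC", "0xarc_placeholder")]))  # False
print(pool_is_priceable([("WETH", "0xreal_weth_placeholder"), ("ARC", "0xarc_placeholder")]))  # True
```

A check like this only catches imitations of tokens that are in the registry, so it works best alongside the other options, such as holder and transaction counts.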

Discussion

This article explored two scenarios in which token pricing can cause anomalies in higher-level subgraph metrics, along with solutions for each. Through this investigation, we identified three main ways to improve our pricing mechanism's resilience:

  1. Reconfirming token prices when the price, or its impact on protocol-level metrics, deviates from the previous value by more than some threshold.
  2. Utilizing swaps instead of the pool state to retrieve a price for a token.
  3. Improving detection and exclusion of tokens that cannot be priced at a fair market value using liquidity pools.

As previously discussed, swaps are a good solution for pricing outliers that arise from poor liquidity distribution in concentrated liquidity pools. This is because they estimate the price from actual token exchanges rather than from a liquidity pool state that may exist only because of the poor liquidity distribution, with very little or no value being exchanged at the current price. Time-weighted average price (TWAP) is a common way to build token prices from trading activity over time. You can learn more about Uniswap V3's implementation of TWAP here!
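As a rough illustration of the TWAP idea, the sketch below assumes two readings of a pool's cumulative tick value (the sum of the current tick over time, in tick-seconds), such as those exposed by the Uniswap V3 pool oracle. The function name and the sample numbers are illustrative, and decimal adjustment is left out.

```python
def twap_from_tick_cumulatives(cumulative_start: int, cumulative_end: int, elapsed_seconds: int) -> float:
    """Time-weighted average raw price (token1 per token0) over an interval."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    # The difference of the two accumulator readings divided by the elapsed
    # time gives the arithmetic mean tick over the interval.
    average_tick = (cumulative_end - cumulative_start) / elapsed_seconds
    # Converting the mean tick back to a price yields a geometric-mean price.
    return 1.0001 ** average_tick


# A pool that sat at tick 100 for 10 minutes accumulates 100 * 600 tick-seconds.
print(twap_from_tick_cumulatives(0, 60_000, 600))
```

Since the average is weighted by time, a price that sits at the tail of the curve for only a moment contributes little to the result.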


We will continue to learn about and improve our ability to provide reliable prices from on-chain data and hope to share more of our findings with you in the future. Thanks for reading!
