Accurately determining USD prices through liquidity pools on decentralized exchange (DEX) protocols, such as Uniswap V3, presents a significant challenge in safeguarding against edge cases. These edge cases can have substantial implications for higher-order metrics like trading volume and total value locked (TVL).
Our objective is to examine the root causes of these edge cases and devise strategies to make our pricing mechanism more robust. Our subgraphs on The Graph Network rely on these liquidity pools to derive token prices, so making that derivation more resilient directly improves the overall quality and reliability of our subgraphs.
This article will explore the challenges and potential solutions for deriving token prices from liquidity pools in decentralized exchange protocols. We begin by reviewing the fundamentals of AMMs and concentrated liquidity, offering a solid foundation for understanding the mechanics at play. Next, we delve into two specific scenarios where edge cases for pricing can arise, highlighting the complexities involved. Finally, we discuss potential solutions to address these edge cases, ultimately contributing to a more robust and reliable pricing mechanism.
Constant function market makers (CFMMs) are a model used by automated market makers (AMMs) such as Uniswap to create a mathematically determined price that adjusts automatically as users interact with the protocol.
One such CFMM model is the constant product market maker (CPMM), which keeps the product of the two token balances in a pool constant: X*Y = k.
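To make the constant product rule concrete, here is a minimal sketch of a swap against a CPMM pool. The function names and reserve values are hypothetical, and fees are ignored for simplicity, so this is an illustration of the invariant rather than Uniswap's actual implementation.

```python
def spot_price(reserve_x: float, reserve_y: float) -> float:
    """Price of one token X, denominated in token Y."""
    return reserve_y / reserve_x


def swap_x_for_y(reserve_x: float, reserve_y: float, amount_x_in: float):
    """Swap token X into the pool while keeping x * y = k (fees ignored)."""
    k = reserve_x * reserve_y
    new_reserve_x = reserve_x + amount_x_in
    new_reserve_y = k / new_reserve_x
    amount_y_out = reserve_y - new_reserve_y
    return new_reserve_x, new_reserve_y, amount_y_out


# Hypothetical pool holding 100 X and 200,000 Y.
x, y = 100.0, 200_000.0
price_before = spot_price(x, y)

# Selling 10 X into the pool moves the price of X down.
x, y, out = swap_x_for_y(x, y, 10.0)
price_after = spot_price(x, y)
```

The larger the swap is relative to the reserves, the further the price moves, which is why thin liquidity makes a pool's price easy to push around.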
On Uniswap V3, users can create positions where the liquidity is only distributed within a specific range.
Liquidity is distributed across this range in discrete intervals, at locations called ticks, according to a truncated curve. When the price reaches a point where the curve touches or crosses an axis, all of token X has been converted to token Y, or vice versa.
The price moves depending on the liquidity available at the current price (tick). During a swap, when the liquidity of the outgoing token is exhausted at the current price, the price moves to the next tick, and the rest of the swap executes against the liquidity at that price, and so on, until the swap order is fulfilled. Depending on the swap size and the liquidity available at each tick, the price may cross several ticks or none at all. These discrete movements in price occur along a continuous curve represented by X*Y = k.
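In Uniswap V3, each tick index corresponds to a price of 1.0001 raised to the power of the tick. The sketch below converts a tick into a human-readable price; the function name is hypothetical, and the decimals adjustment assumes the usual convention that the raw price is token1 per token0 in base units.

```python
TICK_BASE = 1.0001  # each tick is a 0.01% price step in Uniswap V3


def tick_to_price(tick: int, decimals0: int = 18, decimals1: int = 18) -> float:
    """Price of token0 in units of token1, adjusted for token decimals."""
    raw_price = TICK_BASE ** tick  # token1 per token0, in base units
    return raw_price * 10 ** (decimals0 - decimals1)


# Tick 0 means both tokens trade one-for-one (for equal decimals).
parity = tick_to_price(0)

# Hypothetical pool where token0 has 6 decimals and token1 has 18 decimals.
token1_per_token0 = tick_to_price(200_000, decimals0=6, decimals1=18)
token0_per_token1 = 1 / token1_per_token0
```

Because the price is read directly from the current tick, a pool whose tick has been pushed into an empty region of the curve will report a price that no meaningful amount of value actually traded at.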
All token prices are ultimately derived from a USD-denominated token such as USDC, DAI, and Tether.
When a token is paired with a USD-denominated token, we can derive its price by determining its value relative to the USD-denominated token. That relative value comes either from the relative balance of the two tokens, as in our review of CFMMs, or from the current tick, as in our review of concentrated liquidity.
Once a token has received a price from a USD-denominated token, it has a USD price of its own, so we can use it in turn to derive the price of other tokens it is paired with.
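The propagation described above amounts to multiplying a pool's relative price by the USD price of the already-priced token. A minimal sketch, with hypothetical names and numbers:

```python
def propagate_usd_price(quote_per_base: float, quote_usd: float) -> float:
    """USD price of the base token, given how many quote tokens one base
    token is worth and the USD price of the quote token."""
    return quote_per_base * quote_usd


# Step 1: a USD-denominated token is treated as worth 1 USD.
usdc_usd = 1.0

# Step 2: a hypothetical wETH/USDC pool says 1 wETH = 2,000 USDC.
weth_usd = propagate_usd_price(2000.0, usdc_usd)

# Step 3: a hypothetical TOKEN/wETH pool says 1 TOKEN = 0.005 wETH.
token_usd = propagate_usd_price(0.005, weth_usd)
```

Every hop in this chain inherits any error in the hops before it, which is why the choice of pricing pool matters so much.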
Allowing a price to be propagated from one token to another without discretion can lead to price instability or inaccuracy, as some liquidity pools are less reliable or completely unreliable. Many liquidity pools may be on offer to price a token, so when should we rule them out, and how do we choose?
We start with two constraints on the price derivation process:
Among the liquidity pools that meet the above constraints, we select the pool with the highest TVL to give a price to the token.
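The selection step can be sketched as a filter followed by a maximum over TVL. The `Pool` fields, the `is_eligible` predicate, and the 250k USD threshold used in the usage lines are all assumptions made for illustration; the predicate stands in for whatever constraints the pricing logic actually enforces.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Pool:
    address: str
    tvl_usd: float
    priced_side_usd: float  # USD value of the already-priced token in the pool


def select_pricing_pool(
    pools: Iterable[Pool], is_eligible: Callable[[Pool], bool]
) -> Optional[Pool]:
    """Return the eligible pool with the highest TVL, or None if none qualify."""
    candidates = [p for p in pools if is_eligible(p)]
    return max(candidates, key=lambda p: p.tvl_usd, default=None)


# Hypothetical pools and a hypothetical minimum-liquidity constraint.
pools = [
    Pool("0xaaa", tvl_usd=900_000.0, priced_side_usd=100_000.0),
    Pool("0xbbb", tvl_usd=600_000.0, priced_side_usd=300_000.0),
    Pool("0xccc", tvl_usd=700_000.0, priced_side_usd=350_000.0),
]
chosen = select_pricing_pool(pools, lambda p: p.priced_side_usd > 250_000.0)
none_chosen = select_pricing_pool(pools, lambda p: p.priced_side_usd > 1e9)
```

Note that the pool with the highest TVL overall (`0xaaa`) is not chosen here, because it fails the eligibility check before TVL is ever compared.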
While the methodology and constraints outlined above do well in assigning an accurate price, some edge cases still need to be corrected in the data. These outliers are impactful on metrics like TVL and trading volume. Below, we'll illustrate two scenarios that can cause these outliers.
There is no liquidity between the upper and lower positions on the curve. As a result, a small swap can cause the price to slip very far down the curve and cause a massive imbalance in the relative price of tokens.
Note: There is still considerable liquidity in the pool's upper and lower positions, so the minimum liquidity threshold can still easily be met.
A token's value may be artificially propped up in a pricing pool, and a much more significant amount of this token exists in some other pool. This circumstance may also lead to the appearance of an arbitrage opportunity between two pools, yet this does not occur, and a large amount of value appears to exist, more or less created out of thin air.
Imagine we have two pools:
Pool 1 is a pool that propagates a USD price from token X to token Y. Imagine it has > 250k USD in token X to meet our criteria.
Pool 2 is a pool causing the TVL to spike in our data, as seen in the image below.
How could these pools work together to produce a price that would cause a spike in the data?
Pool 2 now has a TVL of over 800 million, which is far too large for an average liquidity pool.
This example is a case we found in our Uniswap V3 Arbitrum subgraph. When we looked deeper, we found that pool 1 and pool 2 both contained the same token pair: wETH and Arcadeum.
GraphQL API request
This data request was made using The Graph on the Uniswap V3 subgraph. Play around with our Uniswap V3 Ethereum subgraph here using the subgraph Hosted Service. Here are some example queries you can use!
If both pools contained wETH and Arcadeum, why would pool 1 propagate the price and not pool 2, since pool 2 has a higher balance of wETH?
We discovered that the wETH in pool 2 and the Arcadeum in both pools are tokens that imitate the real versions. They have:
Between pool 1 and pool 2, we have three tokens:
What matters for this edge case is that these imitation tokens are either not tradeable on the public market, or their ERC20 contracts have been modified to set up a honeypot in the pricing pool. Two observations support this: the tokens show very little activity on chain, and an evident and profitable arbitrage opportunity between pool 1 and pool 2 is never taken.
If someone could acquire a small amount of this imitation wETH in pool 2, which does not have any value outside of this context, they would be able to trade for a large amount of real wETH in pool 1 by first trading for Arcadeum in pool 2.
Note: There is also an illusion of an arbitrage opportunity in the other direction if the user is unaware that the wETH in pool 2 is not the real wETH. The imitation wETH has no value except that it can be exchanged back for Arcadeum in this same pool. This may be the real intent of the scheme. For example, a user swaps some wETH for Arcadeum in pool 1 and then attempts to trade the Arcadeum for more wETH in pool 2, since its relative value in Arcadeum is lower there, only to realize it is not the real wETH. When they try to swap back for their wETH in pool 1, either the LP removes the liquidity or the swap is blocked by the imitation token contract (honeypot), and they are stuck holding Arcadeum or the imitation wETH, neither of which has any value.
In each example we have seen thus far, these problematic liquidity pools remain inactive or shift back to a normal price quickly after a swap causes a price to move drastically. Requiring price changes more significant than some threshold to be re-confirmed has proven helpful in improving robustness against large spikes in our metrics.
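One way to sketch price re-confirmation is to hold back any price that moves more than a threshold until a following observation agrees with it. The class name, the 50% threshold, and the exact acceptance rule below are assumptions for illustration, not a description of our production logic.

```python
from typing import Optional


class ConfirmedPrice:
    """Tracks a price and requires large jumps to be confirmed by a second
    observation before they are accepted."""

    def __init__(self, initial_price: float, max_change: float = 0.5):
        self.price = initial_price  # must be positive
        self.max_change = max_change  # hypothetical threshold: 50% move
        self.pending: Optional[float] = None

    def _close(self, a: float, b: float) -> bool:
        return abs(a - b) / b <= self.max_change

    def update(self, observed: float) -> float:
        if self._close(observed, self.price):
            # Ordinary move: accept it and drop any unconfirmed spike.
            self.price, self.pending = observed, None
        elif self.pending is not None and self._close(observed, self.pending):
            # Second observation agrees with the earlier large move: accept.
            self.price, self.pending = observed, None
        else:
            # Large move seen for the first time: hold it back.
            self.pending = observed
        return self.price


tracker = ConfirmedPrice(100.0)
after_normal = tracker.update(105.0)       # small move, accepted
after_spike = tracker.update(1_000_000.0)  # spike, held back
after_revert = tracker.update(104.0)       # pool reverted, spike discarded
tracker.update(300.0)                      # large move, held back
after_confirm = tracker.update(310.0)      # confirmed by a second observation
```

A spike that reverts on the next observation never reaches downstream metrics, while a genuine repricing is only delayed by one observation.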
Instead of pricing a token from the relative price of tokens in a liquidity pool, we can price it using the ratios reflected in recent swaps. Swaps give a more accurate price because little or none of the swapped value actually executes at the price left at the tail end of the curve.
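A swap-based price can be sketched as the value-weighted ratio of what actually changed hands over the most recent swaps. The `Swap` fields and the window size are hypothetical choices for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Swap:
    amount_token: float  # amount of the token being priced (absolute value)
    amount_usd: float    # USD value of the other side of the swap


def swap_based_price(swaps: List[Swap], window: int = 5) -> Optional[float]:
    """Average executed price over the last `window` swaps, weighted by size."""
    recent = swaps[-window:]
    total_token = sum(s.amount_token for s in recent)
    if total_token == 0:
        return None  # no value exchanged, so no price can be derived
    return sum(s.amount_usd for s in recent) / total_token


# Hypothetical history: two ordinary swaps, then a dust-sized swap that
# executed at an extreme ratio after slipping across a liquidity gap.
history = [
    Swap(amount_token=1.0, amount_usd=2000.0),
    Swap(amount_token=2.0, amount_usd=4100.0),
    Swap(amount_token=0.000001, amount_usd=0.0000001),
]
price = swap_based_price(history)
```

Because each swap is weighted by the amount exchanged, the dust-sized swap barely changes the result even though the pool's spot price after it may be wildly off.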
The solutions above will not address outliers of this type. We have looked at the activity in the pools causing these outliers, and in some cases they appear to be moderately active, so price re-confirmation may be circumvented.
The best path forward is to devise clever ways to detect and prune calculations involving tokens where we cannot derive a fair market value price like in the scheme outlined in scenario 2. A few options are:
This article explored two scenarios in which token pricing can cause anomalies in higher-level subgraph metrics, and a solution for each. Through this investigation, we have identified three main ways to improve our pricing mechanism's resilience. These were:
As previously discussed, swaps are a good solution for addressing pricing outliers arising from poor liquidity distributions in concentrated liquidity pools. This is because swap-based pricing estimates the price from actual token exchanges, rather than from a liquidity pool state that may exist only because of a poor liquidity distribution, with very little or no value being exchanged at the current price. Time-weighted average pricing (TWAP) is a common way to turn observed trading activity into a token price. You can learn more about Uniswap V3's implementation of TWAP here!
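For intuition, Uniswap V3 pools keep a running sum of the current tick over time (a tick accumulator), so an average tick over a window is the difference between two accumulator readings divided by the elapsed seconds. The sketch below assumes the two readings have already been fetched from the pool; the function names are hypothetical and rounding is done with floor division.

```python
TICK_BASE = 1.0001


def twap_tick(cumulative_start: int, cumulative_end: int, seconds: int) -> int:
    """Time-weighted average tick between two accumulator readings."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # Floor division rounds toward negative infinity for negative deltas.
    return (cumulative_end - cumulative_start) // seconds


def twap_price(cumulative_start: int, cumulative_end: int, seconds: int) -> float:
    """Raw token1-per-token0 price implied by the average tick."""
    return TICK_BASE ** twap_tick(cumulative_start, cumulative_end, seconds)


# Hypothetical readings taken 600 seconds apart.
avg_tick = twap_tick(0, 600_000, 600)
avg_price = twap_price(0, 600_000, 600)
```

Averaging ticks rather than prices yields a geometric mean of the price over the window, which dampens the effect of a brief excursion to an extreme tick.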
We will continue to learn about and improve our ability to provide reliable prices from on-chain data and hope to share more of our findings with you in the future. Thanks for reading!