Impact Study of EIP-3074
May 19, 2021
ABSTRACT
Dedaub was commissioned by the Ethereum Foundation to perform an audit/study of the impact of Ethereum Improvement Proposal (EIP) 3074 (AUTH and AUTHCALL) on existing contracts.
In order to appraise the impact of the proposed change, we performed extensive queries over the source code and bytecode of deployed contracts, inspected code manually, examined past transactions/balances/approvals, and informally interviewed developers.
EXECUTIVE SUMMARY
The extent of the impact is not straightforward to ascertain. There are many affected contracts, estimated at 1.85% of unique active deployed contracts (i.e., contracts with the same bytecode are counted once). Many of these contracts handle substantial sums and interact with an Automated Market Maker (AMM), such as Uniswap or Balancer. As a result, such contracts would be exposed to flash-loan or other pool-tilting attacks.
However, our considered opinion after this study is that a) the impact will be significantly limited with appropriate awareness of the upcoming change; b) the vulnerable code patterns are already exploitable by miners and flashbots, and such exploits will become more threatening in the near future. As a result, we believe that the impact of EIP-3074 is certainly manageable and perhaps a net positive in the overall security of the Ethereum blockchain ecosystem. We recommend reading the section titled “Opinion” at the end of this report for more detail and documentation of our (subjective) opinion. The objective, numeric findings of the study are listed in detail in the “Experimental Findings and Study” section.
SETTING AND BACKGROUND
The focus of the study is to examine the extent to which existing contracts are adversely impacted by the changes introduced by EIP-3074. The most significant impact of EIP-3074 (in its current “strong” form) on past contracts is the inability to reliably distinguish the msg.sender (in Solidity) of a transaction. In particular, EIP-3074 enables programmatically setting the transaction’s msg.sender, thus rendering obsolete common checking patterns such as “msg.sender == tx.origin”. This pattern is often used to ensure that a contract’s caller is an Externally Owned Account (EOA) and not a smart contract, and thus cannot have manipulated on-chain quantities atomically without the rest of the environment (i.e., real-world actors) getting a chance to correct or punish such manipulation. As the study confirms, protection against flash-loans in contracts that interact with AMMs is the primary use of such patterns.
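The effect on the guard can be illustrated with a toy model. The sketch below is not EVM code: it only models what a callee observes as tx.origin and msg.sender under three call paths. The class, the helper functions, and the addresses are all hypothetical names introduced for illustration, and the AUTHCALL path is modeled on the assumption that the callee sees the authorizing account as msg.sender.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallContext:
    tx_origin: str   # account that signed the enclosing transaction
    msg_sender: str  # immediate caller, as seen by the callee


def eoa_only_guard(ctx: CallContext) -> bool:
    """Toy stand-in for require(msg.sender == tx.origin)."""
    return ctx.msg_sender == ctx.tx_origin


def direct_call(eoa: str) -> CallContext:
    # The EOA calls the guarded contract directly.
    return CallContext(tx_origin=eoa, msg_sender=eoa)


def call_via_contract(eoa: str, contract: str) -> CallContext:
    # An ordinary CALL from a contract: the callee sees the contract as sender.
    return CallContext(tx_origin=eoa, msg_sender=contract)


def call_via_authcall(eoa: str, invoker: str, authorizer: str) -> CallContext:
    # Modeled AUTHCALL (assumption): the invoker contract can run arbitrary
    # code first, yet the callee sees the authorizing account as msg.sender.
    return CallContext(tx_origin=eoa, msg_sender=authorizer)


ALICE = "0xA11CE"      # hypothetical EOA
INVOKER = "0xC0FFEE"   # hypothetical invoker contract

print(eoa_only_guard(direct_call(ALICE)))                        # guard passes
print(eoa_only_guard(call_via_contract(ALICE, INVOKER)))         # guard rejects
print(eoa_only_guard(call_via_authcall(ALICE, INVOKER, ALICE)))  # guard passes
```

In the third case the guard is satisfied even though contract code ran between the EOA and the guarded function, which is exactly the situation the guard was meant to exclude.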
EXPERIMENTAL FINDINGS AND STUDY
Task 1: Source Queries
Retrieve all contracts with msg.sender == tx.origin in their published source.
As the first quick experiment, we queried our database (behind app.dedaub.com) for contracts with common patterns combining tx.origin and msg.sender in their source. There are two sources of incompleteness in this query: a contract may not have published source, or a contract may be checking tx.origin against msg.sender without using this exact source code pattern. On the other hand, this query is over a fairly complete set of contracts, virtually all contracts ever deployed (with very minor exceptions). Notably, the numbers concern accounts, i.e., contracts with the same code are counted as many times as deployed. (This aspect will be different in our next experiments.)
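The exact queries are not reproduced here. As a rough illustration of what a source-level pattern match looks like, and of why it is incomplete, the following sketch uses a hypothetical regular expression that accepts == or != in either operand order. The pattern and the function name are assumptions, not the queries actually run.

```python
import re

# Hypothetical pattern: msg.sender and tx.origin joined by == or !=,
# in either order, with optional whitespace around the operator.
GUARD_RE = re.compile(
    r"\b(?:msg\.sender\s*[!=]=\s*tx\.origin|tx\.origin\s*[!=]=\s*msg\.sender)\b"
)


def has_origin_sender_check(source: str) -> bool:
    """Return True if the source text contains a direct textual comparison."""
    return GUARD_RE.search(source) is not None


samples = [
    "require(msg.sender == tx.origin);",
    "if (tx.origin!=msg.sender) revert();",
    # Same check, but msg.sender is obtained through an internal function,
    # so a purely textual match misses it.
    "address s = _msgSender(); require(s == tx.origin);",
]
for text in samples:
    print(has_origin_sender_check(text), text)
```

The third sample shows the second source of incompleteness: the comparison is semantically the same, but no textual pattern of this kind will find it.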
Clearly, the number of contracts that employ the pattern is substantial.
Task 2: Bytecode Queries
Retrieve all contracts with msg.sender == tx.origin checks regardless of whether the contract has published source and of whether the contract has this exact code pattern or merely an equivalent one (e.g., getting msg.sender through an internal function, which is common).
In order to reduce the time overhead of our queries and not have our dataset be dominated by old and unused contracts, we limit our dataset to contracts that have transacted recently (within 200K blocks from block number 12374455). This is effectively all contracts that saw activity in the past month. We also disregard duplicates: contracts with the same bytecode are counted only once, regardless of how many times they are deployed. We end up with 34,962 unique contracts. For comparison, the total unique bytecodes in the DB of app.dedaub.com (up until the same block) are 398,275. By limiting our attention to contracts that transacted in the past month, we reduce our workload by over 10x, allowing more targeted exploration and quick experimentation with the most relevant contracts.
After decompiling the contracts using the Gigahorse decompiler, we ran a simple analysis to detect comparisons between msg.sender and tx.origin. Because of the way our query is written (to enable completeness of the results), it considers both “msg.sender == tx.origin” and “msg.sender != tx.origin”, as well as any other data-flow combination.
Through our pipeline we are able to successfully analyze 99.8% of all contracts in our dataset (with 41 decompilation timeouts), finding that the comparison between msg.sender and tx.origin is present in 1.85% of them (648 unique contracts).
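The percentages in this task follow directly from the counts given above. A short check, using only numbers stated in the text (the variable names are ours):

```python
ACTIVE_UNIQUE = 34_962   # unique bytecodes active in the last 200K blocks
ALL_UNIQUE = 398_275     # unique bytecodes in the database up to the same block
TIMEOUTS = 41            # decompilation timeouts
FLAGGED = 648            # unique contracts comparing msg.sender and tx.origin

analyzed_share = (ACTIVE_UNIQUE - TIMEOUTS) / ACTIVE_UNIQUE
flagged_share = FLAGGED / ACTIVE_UNIQUE
workload_reduction = ALL_UNIQUE / ACTIVE_UNIQUE

print(f"analyzed: {analyzed_share:.2%}")
print(f"flagged:  {flagged_share:.2%}")
print(f"workload reduction: {workload_reduction:.1f}x")
```

The flagged share is taken over all 34,962 active contracts, which is the denominator that reproduces the reported 1.85%.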
Sanity checking of static analysis completeness:
The analysis detecting the pattern of this task will be used as a building block for the queries of the next tasks. Because of this, it is crucial to evaluate its completeness. To do so we reran the 2nd database query of task 1 (the one that returned 2312 rows), this time limiting its results to contracts that transacted in the last 200K blocks. This query of our source database returned 387 unique contracts.
The sanity check asks how many of the 387 are not among the 648 flagged by the bytecode-level analysis. It returned 59 contracts, i.e., the analysis would seem to have 85% recall. We sampled the first few: in most cases the guard is not present in the deployed contract; rather, its deployers had uploaded several source files to Etherscan and some other file contained the guard. We had to sample 8 (of the missing 59) files before we found the first actual false negative for the static analysis. We conclude that the 387 number is greatly inflated and that the static analysis at the bytecode level misses very few actual combinations of tx.origin and msg.sender.
Conclusion: the static analysis of bytecode is solid, finding at least ~98% of real guards that combine msg.sender and tx.origin, and twice as many as a source-level query.
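One way to arrive at figures close to these is sketched below. It rests on an assumption the text does not state explicitly: that roughly 1 in 8 of the 59 missing contracts is a genuine false negative, extrapolated from the sample in which the first real miss appeared after 8 files. The derivation is a reconstruction, not necessarily the computation behind the reported numbers.

```python
SOURCE_HITS = 387          # source-level matches among active contracts
MISSING = 59               # source-level matches not flagged at bytecode level
BYTECODE_HITS = 648        # contracts flagged by the bytecode analysis
FALSE_NEGATIVE_RATE = 1 / 8  # assumption: one real miss per 8 sampled files

confirmed = SOURCE_HITS - MISSING                 # found by both approaches
naive_recall = confirmed / SOURCE_HITS            # treats all 59 as real misses

est_real_misses = MISSING * FALSE_NEGATIVE_RATE   # extrapolated real misses
est_real_source_guards = confirmed + est_real_misses
adjusted_recall = confirmed / est_real_source_guards

# Bytecode-level hits relative to the estimated real source-level guards.
bytecode_vs_source = BYTECODE_HITS / est_real_source_guards

print(f"naive recall:    {naive_recall:.0%}")
print(f"adjusted recall: {adjusted_recall:.0%}")
print(f"bytecode vs source: {bytecode_vs_source:.2f}x")
```

With a sample this small the adjusted figure is indicative only; a different false-negative rate in the unsampled files would shift it.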
We now have a solid base for continuing with deeper analysis combinations. For the tasks that follow, we report results over the 648 contracts (1.85% of all active contracts of the past 200K blocks) returned by the analysis of this task.
Task 3: Revert for EOA Caller
Retrieve all contracts that do (effectively) require(msg.sender == tx.origin), i.e., revert (directly or soon thereafter) when the above check pattern fails.
In the next experiment, we examine more closely (yet still with automated analysis) the results of the previous step. We detect instances of the comparisons between msg.sender and tx.origin produced for Task 2 where the result of the condition flows to a conditional jump that can control whether the program will reach a REVERT/THROW statement or not.
Our query is flexible enough to recognize simple guards such as “require(msg.sender == tx.origin)” but also more complex patterns such as “require(isApproved(msg.sender) || msg.sender == tx.origin)” while being agnostic to the exact shape of the checking code.
The results of our query show that such a pattern is present in 94.44% (612 unique contracts) of the contracts returned by Task 2.
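The idea of a comparison result "flowing to" a conditional jump that controls a revert can be sketched as a reachability question over data-flow edges. The representation below (a dictionary of data-flow edges plus a set of jump conditions whose branch can reach a REVERT) is a simplification invented for this example; it is not the decompiler's actual intermediate representation or the query used in the study.

```python
from collections import deque


def guard_controls_revert(cmp_var, flows_to, revert_conditions):
    """Return True if cmp_var reaches, through data-flow edges, a variable
    used as the condition of a jump that can lead to a REVERT."""
    seen, queue = {cmp_var}, deque([cmp_var])
    while queue:
        var = queue.popleft()
        if var in revert_conditions:
            return True
        for nxt in flows_to.get(var, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Toy encoding of require(isApproved(msg.sender) || msg.sender == tx.origin):
#   v_eq  = EQ(CALLER, ORIGIN)
#   v_or  = OR(v_approved, v_eq)
#   v_not = ISZERO(v_or)
#   JUMPI(revert_block, v_not)
complex_guard = {"v_eq": ["v_or"], "v_approved": ["v_or"], "v_or": ["v_not"]}
print(guard_controls_revert("v_eq", complex_guard, {"v_not"}))

# Toy contract where the comparison is only logged and never guards a revert.
logged_only = {"v_eq": ["v_log_arg"]}
print(guard_controls_revert("v_eq", logged_only, {"v_other"}))
```

Because the check follows data flow rather than syntax, the disjunction in the first example is handled the same way as a bare require(msg.sender == tx.origin).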
Task 4: Automatic Classification: AMM Interactions
Retrieve all contracts that have msg.sender == tx.origin that guards an interaction with an AMM (Uniswap, Balancer). This indicates that the pattern is used as flash loan protection.
We next use static analysis to determine the extent to which a tx.origin+msg.sender guard pattern clearly protects AMM interactions. This will necessarily be an underestimate, since the code semantics could be arbitrarily complex. We capture AMM interactions that are discernible in the same contracts and that can clearly be affected by price manipulation, e.g., swaps.
We provide two static analysis variants:
- The first, optimized for completeness, detects programs that have at least one instance of the condition that combines msg.sender and tx.origin and at least one external call with a function signature matching the Uniswap and Balancer APIs we model, even if our analysis cannot detect a way for them to be part of the same transaction execution.
- For the second variant, we optimize for precision, detecting instances of the conditions that combine msg.sender and tx.origin produced in Task 2, where the condition can be followed by an external call with a signature matching the Uniswap and Balancer APIs we model.
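The difference between the two variants can be sketched on a toy control-flow graph. In the sketch, a contract is a dictionary mapping block names to successor blocks, with one set of blocks holding a guard and another holding a modeled AMM call. These structures and names are illustrative assumptions; the real analysis runs over decompiled bytecode.

```python
def reachable(cfg, start):
    """Blocks reachable from start (inclusive) by following successor edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(cfg.get(node, ()))
    return seen


def completeness_variant(guards, amm_calls):
    # Both a guard and an AMM call exist somewhere in the contract.
    return bool(guards) and bool(amm_calls)


def precision_variant(cfg, guards, amm_calls):
    # Some AMM call can execute after some guard on the same path.
    return any(amm_calls & reachable(cfg, g) for g in guards)


# Contract A: the guard sits in front of the swap.
cfg_a = {"harvest": ["guard"], "guard": ["swap"], "swap": []}
# Contract B: the guard only protects the ETH-receiving path.
cfg_b = {"receive": ["guard"], "guard": ["accept_eth"], "harvest": ["swap"]}

for name, cfg in (("A", cfg_a), ("B", cfg_b)):
    print(name, completeness_variant({"guard"}, {"swap"}),
          precision_variant(cfg, {"guard"}, {"swap"}))
```

Contract B is flagged only by the completeness-optimized variant, since its guard and its swap never share an execution path.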
The API calls we consider are the following:
- Uniswap/Sushiswap:
swapExactTokensForTokens(uint256,uint256,address[],address,uint256)
swapExactTokensForTokensSupportingFeeOnTransferTokens(uint256,uint256,address[],address,uint256)
swapExactTokensForTokens(uint256,uint256,address[],address,uint256,bool)
swapTokensForExactTokens(uint256,uint256,address[],address,uint256)
swapTokensForExactETH(uint256,uint256,address[],address,uint256)
swapTokensForExactETH(uint256,uint256,address[],address,uint256,bool)
swapExactTokensForETH(uint256,uint256,address[],address,uint256)
swapExactTokensForETHSupportingFeeOnTransferTokens(uint256,uint256,address[],address,uint256)
swapExactETHForTokens(uint256,address[],address,uint256)
swapExactETHForTokensSupportingFeeOnTransferTokens(uint256,address[],address,uint256)
swapETHForExactTokens(uint256,address[],address,uint256)
swapETHForExactTokens(uint256,address[],address,uint256,bool)
getAmountsOut(uint256,address[])
getAmountsIn(uint256,address[])
swap(uint256,uint256,address,bytes)
- Balancer
swapExactAmountIn(address,uint256,address,uint256,uint256)
swapExactAmountOut(address,uint256,address,uint256,uint256)
joinswapExternAmountIn(address tokenIn, uint tokenAmountIn, uint minPoolAmountOut)
exitswapExternAmountOut(address tokenOut, uint tokenAmountOut, uint maxPoolAmountIn)
joinswapPoolAmountOut(address tokenIn, uint poolAmountOut, uint maxAmountIn)
exitswapPoolAmountIn(address tokenOut, uint poolAmountIn, uint minAmountOut)
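Matching calls by function signature requires the canonical form, which contains only parameter types and spells uint as uint256. Some entries in the list are written as declarations with parameter names. A minimal sketch of normalizing such a declaration follows; the helper is hypothetical and handles only the simple elementary and array types that appear in this list (no tuples or nested parentheses).

```python
# Solidity shorthand types and their canonical ABI names.
ALIASES = {"uint": "uint256", "int": "int256"}


def canonical_signature(declaration: str) -> str:
    """Drop parameter names and expand shorthand types in a declaration."""
    name, _, rest = declaration.partition("(")
    params = rest.rsplit(")", 1)[0]
    types = []
    for param in filter(None, (p.strip() for p in params.split(","))):
        type_name = param.split()[0]  # first token is the type
        base, bracket, suffix = type_name.partition("[")
        types.append(ALIASES.get(base, base) + bracket + suffix)
    return f"{name.strip()}({','.join(types)})"


print(canonical_signature(
    "joinswapExternAmountIn(address tokenIn, uint tokenAmountIn, uint minPoolAmountOut)"
))
print(canonical_signature("getAmountsOut(uint256,address[])"))
```

Signatures that are already canonical pass through unchanged, so the helper can be applied uniformly to the whole list.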
The results of our analysis variants are:
- 131 contracts (20.22% of the 648 contracts of Task 2) are detected by the completeness-optimized variant (merely containing both a Uniswap/Balancer API call and a tx.origin+msg.sender guard).
- 104 contracts (16.05% of the 648 contracts of Task 2) are detected by the precision-optimized variant (containing a Uniswap or Balancer swap that the analysis determines to be controlled by the msg.sender+tx.origin comparison).
The discrepancy suggests that the static analysis is incomplete. This is expected. One source of incompleteness is mere program complexity. In contrast to the previous, >98%-complete, analysis, this one is much harder: the statement “tx.origin == msg.sender” (or any variation) could be very far in the bytecode from the AMM function call. To estimate the impact of this source of incompleteness we inspected 5 of the contracts that are in the set of 131 but not in the set of 104. In one of them the condition is used to control the execution of an AMM call, while in the other four the condition is used to reject accidental ETH transfers from EOAs (while accepting transfers from any contract). Therefore the 104 number is more reliable than the 131.
In addition to the above, the analysis produces an underestimate (i.e., incomplete results) for two reasons:
- There may be AMM APIs that we do not take into account. We believe that we have covered the most frequent cases.
- Our static analysis cannot reason about more complex inter-contract interactions. We can detect direct interactions with AMMs but not indirect ones (i.e., the analyzed contract calls another contract which interacts with an AMM).
The impact of these sources of incompleteness will become clear with the manual inspection of task 6.
To summarize, we estimate 104-of-648 (16.05% of uses of the pattern, 0.30% of all contracts) to be a lower bound of contracts using the guard for flash-loan/price manipulation protection because of interactions with an AMM.
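The shares quoted in this task can be reproduced from the counts in the text. The last two lines extrapolate the 1-in-5 manual sample to the contracts flagged only by the completeness-optimized variant; that extrapolation is our own rough illustration from a very small sample and is not a figure reported by the study.

```python
GUARDED = 648            # contracts with the guard (Task 2)
ACTIVE_UNIQUE = 34_962   # all active unique contracts
COMPLETE_HITS = 131      # completeness-optimized variant
PRECISE_HITS = 104       # precision-optimized variant
SAMPLED, SAMPLED_REAL = 5, 1  # manual sample of the difference set

print(f"{COMPLETE_HITS / GUARDED:.2%} of guarded contracts (completeness)")
print(f"{PRECISE_HITS / GUARDED:.2%} of guarded contracts (precision)")
print(f"{PRECISE_HITS / ACTIVE_UNIQUE:.2%} of all active contracts")

# Assumption: the sample rate carries over to the whole difference set.
only_complete = COMPLETE_HITS - PRECISE_HITS
extrapolated_missed = only_complete * SAMPLED_REAL / SAMPLED
print(f"{only_complete} contracts in the difference set, "
      f"about {extrapolated_missed:.0f} plausibly real AMM guards")
```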
Task 5: Automatic Classification: Reentrancy Protection
Retrieve all contracts that have msg.sender == tx.origin as a guard over code that performs an external call. Consider which of these calls would correspond to a possible reentrancy pattern, under both strict and loose definitions.
In the next task, we consider whether a tx.origin+msg.sender guard may be used for reentrancy protection (or may inadvertently provide it, although this is a much harder question). Invalidating such guarding, via EIP-3074, can render unsuspecting contracts vulnerable.
Although our analysis will be necessarily incomplete, it is good to contrast it with the analysis of the previous task, for AMM interactions. This gives at least a relative measure of the scale of the impact of EIP-3074 on reentrancy, compared to other impacts.
Again in the universe of 648 contracts with a guard combining tx.origin and msg.sender, we have two analysis versions, one completeness-optimized and one precision-optimized. Their results show:
- 29 contracts in which the guard controls an external call that may be reentrant, i.e., a call:
  - whose receiver contract is a function of the caller (including a value derived from a mapping by looking up the caller), and
  - that passes gas without a limit (i.e., no 2300 gas limit, as in Solidity’s transfer).
- 3 contracts (a subset of the above 29) that in addition have a likely-reentrant call:
  - the call is performed after checking a storage address, and
  - the same storage address is written to after the external call.
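The two definitions can be written as predicates over a summary of each guarded external call. The record type and its field names below are hypothetical; they only encode the criteria listed above and say nothing about how the static analysis derives them from bytecode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExternalCall:
    target_depends_on_caller: bool  # receiver derived from the caller
    gas_limited: bool               # e.g., 2300 gas as in Solidity's transfer
    slots_checked_before: frozenset = frozenset()  # storage read before call
    slots_written_after: frozenset = frozenset()   # storage written after call


def may_be_reentrant(call: ExternalCall) -> bool:
    """Loose definition: caller-controlled target and no gas limit."""
    return call.target_depends_on_caller and not call.gas_limited


def likely_reentrant(call: ExternalCall) -> bool:
    """Strict definition: additionally, a storage location checked before
    the call is written after it."""
    shared = call.slots_checked_before & call.slots_written_after
    return may_be_reentrant(call) and bool(shared)


withdraw = ExternalCall(True, False, frozenset({"balances"}), frozenset({"balances"}))
payout = ExternalCall(True, False, frozenset({"balances"}), frozenset())
stipend = ExternalCall(True, True, frozenset({"balances"}), frozenset({"balances"}))

for call in (withdraw, payout, stipend):
    print(may_be_reentrant(call), likely_reentrant(call))
```

Every call satisfying the strict predicate also satisfies the loose one, which mirrors the 3 contracts being a subset of the 29.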
This analysis is complex and very likely incomplete: the static analysis may not be able to see the semantic connection between the msg.sender+tx.origin guard and the external call. However, it is fair to expect that the incompleteness will be analogous to that of the previous case, that of swap calls. (Note that in this case, of reentrancy, we don’t have a telltale sign as to which call may be reentrant, whereas Uniswap/Balancer swaps are readily identifiable, since they are standard API calls. Therefore, for reentrancy, we have to speculate as to how many calls are not found by referring to how many were not found in the case of swaps.)
Even with generous assumptions about incompleteness, the numbers suggest that it is highly unlikely that programmers employ the “tx.origin == msg.sender” guard as protection against reentrancy. Even considered as accidental protection, the numeric impact seems low, e.g., 3 of 648 contracts (0.46% of contracts with a tx.origin+msg.sender guard, or 0.009% of all contracts). Even a doubling of this number (to account for expected incompleteness) suggests a very low impact.
Of course, the real number may be 29 (i.e., the result of the completeness-oriented analysis) and not 3, and, further, this could be 1.5-2x higher if one accounts for incompleteness. However, manual sampling shows this to be unlikely and further confirms that the guard pattern combining tx.origin and msg.sender is very rarely (if at all) used as reentrancy protection in practice. Specifically, we sampled at random:
- md5 hash 84B1EDCCA6EC726C6E1D99C6F9B23925, deployed twice, at account addresses 0x0FAE0AF7BA4C3AB3527382931B110B0D0C901175 and 0xBA85C3AF2DEA9A5FB1541AC68B92711E19764537. There is no source code, therefore ascertaining the property is hard. However, it seems very unlikely that the guard protects against reentrancy: the two external calls under the guard are to a transferFrom function (which should be safe, unless the token is tainted) and to a _solver contract, initialized at construction (i.e., trusted).
- md5 hash 934758B4E5877AF9554D717B5B522074, deployed at account address 0x08C82F7513C7952A95029FE3B1587B1FA52DACED. The contract has source. It seems unlikely that the guard offers reentrancy protection, on purpose or inadvertently. There is no other reentrancy protection mechanism in the contract. The calls under the guard are withdrawals, transfers, and swaps. (These should be safe, unless the token is tainted.) The purpose of the guard is clear from a source comment:
  // Try to make flash-loan exploit harder to do by only allowing externally-owned addresses.
- md5 hash F1D155B8A6FEDD4FC7E30895A7520E86, deployed at account address 0xC3E2FAFF079BF3864060D61BB892FE21D0D27A93. The guard is certainly not used for reentrancy protection: both functions protected by the guard also have a nonReentrant modifier.
Finally, we inspected the 3 warnings of the precision-oriented analysis, for a likely-reentrant call. (This does not mean that the call is reentrant, but that it will be if it is not guarded.)
- md5 hash 1A2D206B826F47A5E2B34BE52842547C, deployed twice, at addresses 0x44BD4608AC3BBF8A8677F8B1EA00BD59595F5F9E and 0xECB456EA5365865EBAB8A2661B0C503410E9B347. The guard is not used for reentrancy protection: all its uses also have a separate nonReentrant annotation (this is a Vyper contract). In fact, a comment explains the use of the guard:
  @dev Only callable by an EOA to prevent flashloan exploits
- md5 hash 39C87BC9D4E18CDD25CB3EB399F7AE6B, deployed at address 0x0BD1D668D8E83D14252F2E01D5873DF77A6511F0. This is a Mushrooms Finance contract. There is no threat of reentrancy: the guard is part of a “Keepers” modifier and ensures that a keeper is an EOA. A keeper is trusted anyway and the call that would have been subject to reentrancy is to a trusted “strategy” contract.
- md5 hash 876A2FA9E21AFF9B23C129AA6751BC03, deployed once at address 0xB883F041C0FE3992197FF051644A507C6896C718. The guard is used to allow only EOAs and whitelisted contracts to call deposit() and depositFor(). The underlying _deposit() function employs a reentrancy guard, so the msg.sender == tx.origin guard is not used as such.
To summarize, by analysis and inspection of contracts that have had any transaction in the past month (200K blocks) we have found zero evidence of use of a tx.origin+msg.sender guard that protects against reentrancy, either by design or inadvertently. Given the analysis incompleteness and the limited extent of sampling, it is certainly conceivable that such cases exist (and later manual inspection tasks will uncover some likely instances), but it is virtually certain that they are very rare.