Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refine and Stabilize EOF Support #15310

Open
3 of 61 tasks
ekpyron opened this issue Jul 31, 2024 · 7 comments
Open
3 of 61 tasks

Refine and Stabilize EOF Support #15310

ekpyron opened this issue Jul 31, 2024 · 7 comments
Labels
EOF epic high effort A lot to implement but still doable by a single person. The task is large or difficult. high impact Changes are very prominent and affect users or the project in a major way. selected for development It's on our short-term development

Comments

@ekpyron
Copy link
Member

ekpyron commented Jul 31, 2024

I'm creating this issue to keep a list of tasks that will need to be done after #15294 is merged before we can consider EOF support "non-experimental".
The list is non-exhaustive so far and not ordered by importance, please just edit in additional tasks (or comment) for anything still missing.

General/Miscellaneous

  • Topological ordering of code blocks (see #15633)
  • Proper errors due to exceeding EOF limits (too many containers, too long functions, too many arguments, too many immutables, too long jumps, stack overflow/underflow, unreachable code, etc.).
    • With test coverage both via Yul and via high-level Solidity code.
  • loadimmutable() on EOF.
  • msize() on EOF.
  • Unoptimized compilation on EOF.
  • EOF opcode support in gas meter.
  • Proper workarounds for cases of "out of reach" rjumps
    (e.g. using intermittent jumps, outlining or splitting out control flow using JUMPF)
  • Refactor parts of Assembly::assemble to better split into EOF/non-EOF parts (see e.g. Implementation of EOF in Solidity [DO NOT MERGE] #15294 (comment) and Implementation of EOF in Solidity [DO NOT MERGE] #15294 (comment))

Language Level Changes

Solidity:

  • Properly reject non-salted creation during Analysis (when targetting EOF)
  • Ensure proper analysis treatment of various high-level constructs with EOF, including:
    • Disallow gas in call options (#15638).
    • Disallow type(C).creationCode, type(C).runtimeCode, <address>.code, <address>.codehash, gasleft() and selfdestruct().
    • low level calls (disallow; reintroduce for EOF-low-level-calls - consider either another unified version that works for both targets or implementing the EOF-style version for legacy as well, if at all possible) As implemented now (#15559) low-level calls work for both targets.
  • Builtin address calculation for salted creation (compatible with both EOF and legacy) (rejected/impossible)
  • In-language construct to check for EOF (like a global bool isEOF)? Option to define both legacy and EOF specific assembly blocks? (Ideally we're fine without this, but will need to check demand from library authors)
  • Allow deploying custom subcontainers (like https://gist.github.com/frangio/940f21778f411154dc58ff818c712929)
  • Related to the above: consider if we need to disallow verbatim for targetting EOF (since we cannot perform stack height analysis) and can/need to replace with full custom subcontainers. To retain verbatim, it'd need to declare it's max stack effects, if used with EOF.
  • Decision: disallow abicoder v1 pragma on EOF? => Yes

Yul:

  • Expose a more complete set of Yul builtins for accessing the data section (only auxdataload() available now).
  • Decision: should the names of non-EOF instructions and builtins remain reserved in Yul on EOF?

EVM assembly:

  • Support for remaining EOF opcodes: DATALOAD, DATASIZE, DATACOPY, RETURNDATALOAD, RJUMPV, DUPN, SWAPN, EXCHANGE.

Noteworthy Documentation Entries

  • Compile a list of breaking changes, similar to what we did for IR codegen.
    • Difference in address calculation in salted creation and
    • Constructor arguments no longer affecting salted-create address.
    • EXTCODESIZE check.
    • gas no longer allowed in call options.
    • Builtins no longer available in Solidity.
    • Opcodes/builtins no longer available in inline assembly and Yul.
    • Existence of implicit limits on some language constructs resulting from EOF size constraints.
  • Document the EOF-Yul builtins.
    • Including new restrictions in how nested objects and data can be referenced.
  • Document the EOF opcodes.
  • Check for required documentation changes on all changing output artifacts.
  • Check general introductions for legacy/EOF specific parts.

Compiler Output (and Input) Artifacts

  • Settle the details on CBOR metadata. (Where? -> Likely beginning of data; Should we tweak the format?) See List of metadata improvements sourcify#1523
    • May involve documenting that e.g. --no-cbor-metadata induces code changes (in particular dataload offsets) - similar prerelease vs release due to the length of the cbor-encoded version string.
  • Generate proper source mappings.
  • Properly define and output assembly text and assembly json for EOF.
  • After the above allow importing EOF assembly json.
  • (Minor) print JUMPDEST as NOP for EOF assembly/opcode.
  • Make sure the "immutable references" output remains accurate - and whether it can/should be reused for information within the data section or we should have a new artifact for that instead.
    • Also, restore validation against reading a non-existent immutable.

Optimizations

  • Use RJUMPV for dense Yul switches; investigate trying to turn sparse switches into dense switches (especially for the external dispatch)
  • CALLF RETF -> JUMPF peephole optimizer rule; tail-call optimization in codegen
  • Consider potentially new peephole optimizer rules (e.g. certain chains of swaps to exchange)
  • Adjust Yul inlining heuristics
  • Allow stack depth larger than 16 with SWAPN/DUPN during Yul->EVM code transform and remove all stack-to-memory logic
  • Exploit EXCHANGE for stack shuffling
  • Consider creating a version of the libevmasm low-level inliner that can inline EOF function calls according. (Very low-priority - most can be expected to be done on the Yul level - would be more relevant if we backported EOF to legacy codegen)
  • Relax RETURNDATACOPY restrictions in the optimizer.
  • Exploit RETURNDATALOAD
  • Sub-assembly deduplication on EOF (Do not duplicate subassemblies. #13804). Cannot be done during assembling like in legacy, but should be doable in the optimizer.

Testing

Note: Obviously any changes in all other points need proper testing. But beyond that generally:

  • Set up fuzzing pipelines for EOF vs legacy (needs robustness against minor differences in e.g. salted creation addresses)
  • Try to unify tests (e.g. if not already done, refactor tests that depend on address calculations (e.g. move into a testing contract)); try to reduce test copies between legacy and EOF in general, whenever possible.
  • Gas measurements in semantic tests on EOF. Consider also adding some gas tests (disabled in #15653).
  • Review correctness of properties of EOF opcodes in SemanticInformation.cpp and add more coverage via functionSideEffects tests (disabled in #15654).
  • Correct codehash calculation for EOFCREATE in EVMHost (see #15635)
  • Enable SMTChecker tests on EOF (disabled in #15659).
  • More coverage for EOF use in inline assembly.

Breaking Changes

Note: Non-backwards-compatible changes we should do even with legacy as target in the breaking release that switches to EOF per default. (EOF-specific breaking changes for EOF as target can generally be done outside of breaking releases if restricted to only affect compilation when EOF is enabled)

  • Consider renaming the previous special datacopy, etc., Yul builtins for legacy codegen.

New Language Features

Note: we don't need these to call EOF support stable, but notably a few language features become significantly easier to support with EOF, e.g.:

  • Reference-type Immutables (resp. a data location for the data section)
  • returndata as data location
  • calldata arguments in constructors
@ekpyron ekpyron moved this to To do in Solidity Focus Board Jul 31, 2024
@ekpyron ekpyron added selected for development It's on our short-term development high impact Changes are very prominent and affect users or the project in a major way. epic labels Jul 31, 2024
@cameel cameel added high effort A lot to implement but still doable by a single person. The task is large or difficult. EOF labels Oct 11, 2024
@hirwabr

This comment was marked as spam.

@rodiazet
Copy link
Contributor

There are a couple of remaining issues to address I would like to discuss

  1. msize usage in .sol or .yul tests disables _optimize flag in EVMObjectCompiler which results that this yulAssert fails. Not sure why we assumed that the flag must be true for EOF.
  2. There is no way to nicely rewrite tests like this because the contract address changes depending on optimization being on or off. I dissabled this test for now.
  3. For now we do not remove unused data from the bytecode which this test is supposed to test. I would leave it for now just remembering that we have to implement this later.
  4. We do not verify that immutable values where assigned before reading them. Not sure if this applies to EOF. It's being tested by this test. We should probalby add this to assembleEOF function.

@rodiazet
Copy link
Contributor

We should consider if legacy reserved identifiers for deprecated instructions by EOF should be released.

@cameel
Copy link
Member

cameel commented Dec 20, 2024

msize usage in .sol or .yul tests

Support for msize is not essential in the first phase of the implementation. You can mark the msize code path as unimplemented and restrict any tests touching it to legacy (unless msize is just incidental to the test - then remove it or add an EOF variant without it). I'll add to to the list above to deal with later.

There is no way to nicely rewrite tests (...) because the contract address changes depending on optimization being on or off.

Depends on how many of these we have. If not too many, my suggestion would be to make a copy of the test for EOF. The test is still relevant, it's just that isoltest does not let us generalize those expectations that easily and copying the test is less work than coming up with a custom solution for this particular case.

we do not remove unused data from the bytecode

I'll add this to the list. Disabling the test on EOF is fine, just add a TODO explaining that eventually we do want it to work, it's just not yet implemented.

We do not verify that immutable values where assigned before reading them.

I guess this is because we're not tracking immutable references. I'm not sure if reporting their positions on EOF is still necessary, but at least it's still doable so we may keep doing it to keep the interface the same. That depends on how we'll deal with loadimmutable() - the current solution with using auxdataloadn() does not let us track them.

So again, this is something to add to the list above. For now adding a TODO about EOF and disabling the test is fine.

@cameel
Copy link
Member

cameel commented Dec 21, 2024

I added the things I mentioned above. I also did a bigger update of the list. I had a bunch of TODOs noted down from reviewing the PRs and now that we're nearly done with the initial implementation (only some test tweaks left), it was a good moment to dump it all here.

Here's a rough summary of what I added:

  • General/Miscallaneous: leftover stuff that we should have but wasn't needed just to enable semantic tests
  • Language Level Changes: more analysis checks, disallow abicoder v1 pragma, Yul data builtins, reserved names, remaining EOF opcodes
  • Noteworthy Documentation Entries: more breaking changes, new restrictions on nested objects.
  • Compiler Output (and Input) Artifacts: assembly format details, validation for unset immutable
  • Optimizations: subassembly deduplication
  • Testing: gas tests, review semantic info, correct codehash, smetchecker tests, inline assembly
  • New Language Features: calldata args in constructor

It's not all equally important but I did not attempt to prioritize them at this point. Some of if may also need to be discussed first, especially those marked as "decision".

@cameel
Copy link
Member

cameel commented Jan 15, 2025

Forum thread to gather some feedback on the isEOF flag: [EOF] In-language construct to check for EOF.

@ekpyron
Copy link
Member Author

ekpyron commented Jan 22, 2025

As for some prioritization here, I'd start with (in this order - but open to discussion):

  • High-Level language: Rejecting unsalted creation and access to code in the high-level language (type(C).creationCode, etc.), if not already done
  • Allow stack depth larger than 16 with SWAPN/DUPN during Yul->EVM code transform
  • Full EVM assembly support (text/json/import/export)
  • Unoptimized compilation on EOF. (If we find it feasible to extract the necessary transformations as preprocessing)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
EOF epic high effort A lot to implement but still doable by a single person. The task is large or difficult. high impact Changes are very prominent and affect users or the project in a major way. selected for development It's on our short-term development
Projects
Status: 🌱 Q2 2025
Development

No branches or pull requests

4 participants