[dataflow] Supplement, refine, and organize designs of unified systolic array #282

Merged
merged 3 commits into from
Dec 25, 2024

AdrianLiu00
Contributor

Description

This PR supplements, refines, and organizes several unified systolic array designs:

  1. unified_gemm_simple: a simple unified array that implements GEMM, similar to the one in Gemmini;
  2. unified_gemm_tiling: adds GEMM tiling, so the array size is decoupled from the input dimensions;
  3. unified_gemm_daisy_chain: non-instruction inputs are also passed along in a daisy-chain manner, which reduces simultaneous accesses to the array.
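
To illustrate what decoupling the array size from the input dimensions means, here is a minimal pure-Python sketch of tiled GEMM. It is not the code in this PR: `tiled_gemm`, `gemm_reference`, and the array size `P` are hypothetical names, and a `P x P` array is assumed to handle one tile of work per invocation.

```python
def gemm_reference(A, B):
    """Plain triple-loop GEMM, used only as a correctness reference."""
    M, K, N = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(K)) for j in range(N)]
            for i in range(M)]


def tiled_gemm(A, B, P):
    """Compute A @ B one P x P tile at a time.

    P is the (hypothetical) systolic array size; M, K, N may be any size
    and do not need to be multiples of P.
    """
    M, K, N = len(A), len(B), len(B[0])
    C = [[0] * N for _ in range(M)]
    for i0 in range(0, M, P):              # tile rows of C
        for j0 in range(0, N, P):          # tile columns of C
            for k0 in range(0, K, P):      # tiles along the reduction axis
                # Everything below is the work of one array invocation.
                for i in range(i0, min(i0 + P, M)):
                    for j in range(j0, min(j0 + P, N)):
                        acc = C[i][j]
                        for k in range(k0, min(k0 + P, K)):
                            acc += A[i][k] * B[k][j]
                        C[i][j] = acc      # partial sums carry across k-tiles
    return C
```

The `min(...)` bounds handle edge tiles, so a 2 x 2 array in this sketch can process, say, a 5 x 7 by 7 x 3 product.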

This PR also takes the opportunity to compare the other dataflow array designs:

  1. systolic: a simple output-stationary array;
  2. systolic_daisy_chain: an output-stationary array in which inputs are transmitted in a daisy-chain manner;
  3. systolic_multi_cache: builds on the daisy-chain design; the IO takes packed streaming types as arguments instead of the default buffer-based IO wrapping, which enables large-scale GEMM operations, similar to AutoSA.
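
For readers unfamiliar with the term, the following is a rough cycle-level sketch of an output-stationary array in plain Python. It is not taken from the designs in this PR: `systolic_os_gemm` and the register names are hypothetical. The assumption is one PE per output element, with operands entering only at the array edges and being forwarded from PE to PE while each partial sum stays in place.

```python
def systolic_os_gemm(A, B):
    """Simulate an M x N output-stationary array computing A @ B.

    Rows of A enter from the left edge and move right; columns of B enter
    from the top edge and move down. Row i and column j are skewed by i and
    j cycles so that A[i][k] and B[k][j] meet in PE (i, j).
    """
    M, K, N = len(A), len(B), len(B[0])
    acc = [[0] * N for _ in range(M)]    # stationary partial sums
    a_reg = [[0] * N for _ in range(M)]  # operand register per PE (A side)
    b_reg = [[0] * N for _ in range(M)]  # operand register per PE (B side)

    for t in range(M + N + K - 2):       # cycles until the last PE finishes
        # Shift A rightwards; walk from the far end so each PE receives
        # its neighbour's value from the previous cycle.
        for i in range(M):
            for j in range(N - 1, 0, -1):
                a_reg[i][j] = a_reg[i][j - 1]
            k = t - i
            a_reg[i][0] = A[i][k] if 0 <= k < K else 0
        # Shift B downwards in the same way.
        for j in range(N):
            for i in range(M - 1, 0, -1):
                b_reg[i][j] = b_reg[i - 1][j]
            k = t - j
            b_reg[0][j] = B[k][j] if 0 <= k < K else 0
        # Every PE performs one multiply-accumulate per cycle.
        for i in range(M):
            for j in range(N):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc
```

Out-of-range operands are fed as zeros, so the skewed fill and drain cycles do not disturb the accumulated results.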

All of the designs above pass the csim functional tests and the csyn dataflow check. Most have also passed hw_emu and on-board tests; the exceptions are the more recent ones, which came after the server shutdown. I expect they can be tested easily once the server is restored.

Additionally, this PR introduces df_primitive_default as a pass container to integrate default schedule optimizations for dataflow designs, such as the previously implemented df_pipeline and potential future passes. Users can disable this default optimization if they prefer to apply their own custom optimizations.
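
The idea of a pass container can be sketched generically as below. This is only an illustration of the pattern, not the interface added in this PR: `DefaultPassContainer`, `pipeline_pass`, `enable_default`, and the list-of-strings schedule are all hypothetical stand-ins.

```python
from typing import Callable, List

Schedule = List[str]                     # hypothetical stand-in for a schedule
Pass = Callable[[Schedule], Schedule]


def pipeline_pass(schedule: Schedule) -> Schedule:
    """Hypothetical default pass; stands in for something like df_pipeline."""
    return schedule + ["pipeline"]


class DefaultPassContainer:
    """Holds the default optimization passes and applies them in order."""

    def __init__(self, passes: List[Pass]):
        self.passes = list(passes)

    def register(self, new_pass: Pass) -> None:
        # Future default passes would be appended here.
        self.passes.append(new_pass)

    def run(self, schedule: Schedule, enable_default: bool = True) -> Schedule:
        # With defaults disabled, the schedule is returned unchanged so the
        # user can apply custom optimizations instead.
        result = list(schedule)
        if enable_default:
            for p in self.passes:
                result = p(result)
        return result
```

Keeping the defaults in one container gives a single switch for opting out, and new passes can be added without touching call sites.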

Examples

Test cases are provided for each design.

Checklist

  • PR's title starts with a category (e.g. [Bugfix], [IR], [Builder], etc)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage (It would be better to provide ~2 different test cases to test the robustness of your code)
  • Code is well-documented

@AdrianLiu00 AdrianLiu00 requested a review from chhzh123 December 22, 2024 21:50
Member

@chhzh123 chhzh123 left a comment

LGTM. Thx!

@chhzh123 chhzh123 merged commit 63e83a7 into cornell-zhang:main Dec 25, 2024
1 check passed