[dataflow] Supplement, refine, and organize designs of unified systolic array #282
Description
This PR supplements, refines, and organizes several designs of the unified systolic array, including:
- `unified_gemm_simple`: a simple unified array that implements GEMM, similar to the one in Gemmini;
- `unified_gemm_tiling`: enables GEMM tiling, so the array size is decoupled from the input dimensions;
- `unified_gemm_daisy_chain`: non-instructional inputs are also transmitted in a daisy-chain manner, reducing simultaneous access to the array.

This PR also takes the opportunity to provide a comparative explanation of other dataflow array designs:
All designs mentioned above pass the csim functionality tests and the csyn dataflow check. Most of them have also passed hw_emu and on-board tests; the exceptions are the more recent ones, which postdate the server shutdown. I expect these can be tested easily once the server is restored.
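To illustrate what decoupling the array size from the input dimensions means for the tiling design, here is a minimal NumPy sketch. It is not the code in this PR: `array_gemm`, `tiled_gemm`, `Rt`, and `Ct` are hypothetical names, and the fixed-size array is modeled in software as an output-stationary accumulation over the reduction dimension.

```python
import numpy as np


def array_gemm(a_tile, b_tile):
    """Model one pass through a fixed-size array (hypothetical helper).

    Each "PE" (i, j) accumulates a_tile[i, k] * b_tile[k, j] over k,
    which is written here as a sum of outer products.
    """
    rows, K = a_tile.shape
    cols = b_tile.shape[1]
    acc = np.zeros((rows, cols), dtype=np.result_type(a_tile, b_tile))
    for k in range(K):
        acc += np.outer(a_tile[:, k], b_tile[k, :])
    return acc


def tiled_gemm(A, B, Rt, Ct):
    """Compute A @ B with an Rt x Ct array, one output tile at a time.

    Rt and Ct (assumed names) are the array dimensions; M and N may be
    larger than, and need not be multiples of, the array size.
    """
    M, K = A.shape
    K2, N = B.shape
    assert K == K2, "inner dimensions must match"
    C = np.zeros((M, N), dtype=np.result_type(A, B))
    for i in range(0, M, Rt):
        for j in range(0, N, Ct):
            # Slices clip at the matrix edge, so partial tiles are handled.
            C[i:i + Rt, j:j + Ct] = array_gemm(A[i:i + Rt, :], B[:, j:j + Ct])
    return C


A = np.arange(5 * 7).reshape(5, 7)
B = np.arange(7 * 6).reshape(7, 6)
C = tiled_gemm(A, B, Rt=2, Ct=4)
```

In this sketch a 2 x 4 "array" handles a 5 x 7 by 7 x 6 product, including the partial tiles at the edges. A hardware implementation would typically pad or mask partial tiles rather than shrink the array.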
Additionally, this PR introduces `df_primitive_default` as a pass container to integrate default schedule optimizations for dataflow designs, such as the previously implemented `df_pipeline` and potential future passes. Users can disable this default optimization if they prefer to apply their own custom optimizations.

Examples
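As a rough illustration of the pass-container idea behind `df_primitive_default`, the sketch below runs a list of default passes in order and lets the caller switch them off. It does not reproduce the project's actual API: `df_pipeline_stub`, `apply_default_passes`, the `enable_default` flag, and the dict-based schedule are all hypothetical stand-ins.

```python
from typing import Callable, Dict, List

# A "schedule" is modeled as a plain dict here (an assumption for the sketch).
Schedule = Dict[str, List[str]]
Pass = Callable[[Schedule], Schedule]


def df_pipeline_stub(schedule: Schedule) -> Schedule:
    """Stand-in for a real pass; it only records that it ran."""
    return {**schedule, "applied": schedule["applied"] + ["df_pipeline"]}


# The container: default passes, applied in order. Future passes would be
# appended to this list.
DEFAULT_PASSES: List[Pass] = [df_pipeline_stub]


def apply_default_passes(schedule: Schedule, enable_default: bool = True) -> Schedule:
    """Apply every default pass unless the user opts out."""
    if not enable_default:
        # The user wants to apply custom optimizations instead.
        return schedule
    for p in DEFAULT_PASSES:
        schedule = p(schedule)
    return schedule


with_defaults = apply_default_passes({"applied": []})
without_defaults = apply_default_passes({"applied": []}, enable_default=False)
```

Keeping the defaults in one container means a new optimization only has to be registered in a single place, while the opt-out flag leaves the schedule untouched for users who prefer their own passes.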
Test cases are provided for each design.
Checklist