Fixed build and test workflow for Intel self-hosted runner #17

Merged
merged 7 commits into main on Jun 9, 2024

Conversation

gshimansky
Collaborator

This workflow is now able to execute tests; some of them still fail, but many pass.

Signed-off-by: Gregory Shimansky <[email protected]>
@gshimansky requested a review from ptillet as a code owner June 8, 2024 02:04
@minjang merged commit a6e6362 into main Jun 9, 2024
3 of 6 checks passed
minjang pushed a commit to minjang/triton-cpu that referenced this pull request Jun 22, 2024
…ng#17)

* Fixed yaml syntax

Signed-off-by: Gregory Shimansky <[email protected]>

* Removed cpu label from run-on

Signed-off-by: Gregory Shimansky <[email protected]>

* Added missing zlib-dev

Signed-off-by: Gregory Shimansky <[email protected]>

* Added missing apt-get update

Signed-off-by: Gregory Shimansky <[email protected]>

* Remove pip cache because on self-hosted runner it slows things down

Signed-off-by: Gregory Shimansky <[email protected]>

* Corrected path to tests

Signed-off-by: Gregory Shimansky <[email protected]>

* Added installation of torch==2.1.2

Signed-off-by: Gregory Shimansky <[email protected]>

---------

Signed-off-by: Gregory Shimansky <[email protected]>
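
Taken together, the commits above amount to a handful of CI changes. As a rough sketch, assuming a typical Ubuntu-based GitHub Actions job (step names, the Debian zlib package name, and the test path are illustrative; only `torch==2.1.2` and the removed `cpu` label come from the commits), the fixed workflow steps might look like:

```yaml
# Hypothetical excerpt reflecting the fixes listed above -- not the
# actual workflow file from this PR.
jobs:
  build-and-test:
    runs-on: self-hosted        # plain label; the "cpu" label was removed
    steps:
      - uses: actions/checkout@v4
      - name: Install system packages
        run: |
          sudo apt-get update                 # added: refresh indexes first
          sudo apt-get install -y zlib1g-dev  # added missing zlib dev package
      - name: Install Python dependencies
        # no pip cache step: on a self-hosted runner caching slows things down
        run: pip install torch==2.1.2         # pinned torch from the commits
      - name: Run tests
        run: python -m pytest python/test/unit  # corrected test path (illustrative)
```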
minjang pushed a commit that referenced this pull request Jun 24, 2024
When running
[convert_blocked1d_to_slice0](https://github.com/triton-lang/triton/blob/0ba5f0c3cd029d5c3d1f01b9bf29dac32c27345e/test/Conversion/tritongpu_to_llvm.mlir#L924)
Triton ends up computing the rank of a matrix with 0 columns during linear
layout lowering, which trips up f2reduce and causes undefined behavior,
detectable through
[UBSAN](https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html).

Fix this by returning the rank (0) early in these cases, without calling
f2reduce.

<details><summary>Stack trace</summary>
<p>

```
third_party/triton/third_party/f2reduce/f2reduce.cpp:421:30: runtime error: shift exponent 18446744073709551615 is too large for 64-bit type 'unsigned long long'
    #0 0x556ee2fea3be in inplace_rref_small third_party/triton/third_party/f2reduce/f2reduce.cpp:421:30
    #1 0x556ee2fea3be in f2reduce::inplace_rref_strided(unsigned long*, unsigned long, unsigned long, unsigned long) third_party/triton/third_party/f2reduce/f2reduce.cpp:470:9
    #2 0x556ee2ea70da in getMatrixRank third_party/triton/lib/Tools/LinearLayout.cpp:125:3
    #3 0x556ee2ea70da in mlir::triton::LinearLayout::checkInvariants(bool) third_party/triton/lib/Tools/LinearLayout.cpp:299:7
    #4 0x556ee2ea656d in mlir::triton::LinearLayout::tryCreate(llvm::MapVector<mlir::StringAttr, std::__u::vector<std::__u::vector<int, std::__u::allocator<int>>, std::__u::allocator<std::__u::vector<int, std::__u::allocator<int>>>>, llvm::DenseMap<mlir::StringAttr, unsigned int, llvm::DenseMapInfo<mlir::StringAttr, void>, llvm::detail::DenseMapPair<mlir::StringAttr, unsigned int>>, llvm::SmallVector<std::__u::pair<mlir::StringAttr, std::__u::vector<std::__u::vector<int, std::__u::allocator<int>>, std::__u::allocator<std::__u::vector<int, std::__u::allocator<int>>>>>, 0u>>, llvm::ArrayRef<std::__u::pair<mlir::StringAttr, int>>, bool) third_party/triton/lib/Tools/LinearLayout.cpp:190:41
    #5 0x556ee2eb2150 in mlir::triton::LinearLayout::divideRight(mlir::triton::LinearLayout const&) third_party/triton/lib/Tools/LinearLayout.cpp:654:51
    #6 0x556ee2ee1c39 in mlir::cvtNeedsSharedMemory(mlir::RankedTensorType, mlir::RankedTensorType) third_party/triton/lib/Analysis/Utility.cpp:652:14
    #7 0x556ee2cf38fd in mlir::triton::getRepShapeForCvtLayout(mlir::triton::gpu::ConvertLayoutOp) third_party/triton/lib/Analysis/Allocation.cpp:66:8
    #8 0x556ee2cf3efa in mlir::triton::getScratchConfigForCvtLayout(mlir::triton::gpu::ConvertLayoutOp, unsigned int&, unsigned int&) third_party/triton/lib/Analysis/Allocation.cpp:95:19
    #9 0x556ee2cf6057 in mlir::triton::AllocationAnalysis::getScratchValueSize(mlir::Operation*) third_party/triton/lib/Analysis/Allocation.cpp:272:24
    #10 0x556ee2cf5499 in operator() third_party/triton/lib/Analysis/Allocation.cpp:343:7
    #11 0x556ee2cf5499 in void llvm::function_ref<void (mlir::Operation*)>::callback_fn<mlir::triton::AllocationAnalysis::getValuesAndSizes()::'lambda'(mlir::Operation*)>(long, mlir::Operation*) third_party/llvm/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12
    #12 0x556edeeee7a9 in operator() third_party/llvm/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:12
    #13 0x556edeeee7a9 in void mlir::detail::walk<mlir::ForwardIterator>(mlir::Operation*, llvm::function_ref<void (mlir::Operation*)>, mlir::WalkOrder) third_party/llvm/llvm-project/mlir/include/mlir/IR/Visitors.h:174:5
    #14 0x556edeeee87c in void mlir::detail::walk<mlir::ForwardIterator>(mlir::Operation*, llvm::function_ref<void (mlir::Operation*)>, mlir::WalkOrder) third_party/llvm/llvm-project/mlir/include/mlir/IR/Visitors.h:182:9
    #15 0x556ee2cf49e7 in walk<(mlir::WalkOrder)0, mlir::ForwardIterator, (lambda at third_party/triton/lib/Analysis/Allocation.cpp:341:42), mlir::Operation *, void> third_party/llvm/llvm-project/mlir/include/mlir/IR/Visitors.h:313:10
    #16 0x556ee2cf49e7 in walk<(mlir::WalkOrder)0, mlir::ForwardIterator, (lambda at third_party/triton/lib/Analysis/Allocation.cpp:341:42), void> third_party/llvm/llvm-project/mlir/include/mlir/IR/Operation.h:794:12
    #17 0x556ee2cf49e7 in mlir::triton::AllocationAnalysis::getValuesAndSizes() third_party/triton/lib/Analysis/Allocation.cpp:341:16
    #18 0x556ee2cf4852 in run third_party/triton/lib/Analysis/Allocation.cpp:182:5
    #19 0x556ee2cf4852 in AllocationAnalysis third_party/triton/lib/Analysis/Allocation.cpp:169:5
    #20 0x556ee2cf4852 in mlir::Allocation::run(llvm::DenseMap<mlir::FunctionOpInterface, mlir::Allocation, llvm::DenseMapInfo<mlir::FunctionOpInterface, void>, llvm::detail::DenseMapPair<mlir::FunctionOpInterface, mlir::Allocation>>&) third_party/triton/lib/Analysis/Allocation.cpp:627:3
    #21 0x556ee1677402 in operator() third_party/triton/include/triton/Analysis/Allocation.h:227:26
    #22 0x556ee1677402 in void mlir::CallGraph<mlir::Allocation>::doWalk<(mlir::WalkOrder)0, (mlir::WalkOrder)1, mlir::ModuleAllocation::ModuleAllocation(mlir::ModuleOp)::'lambda'(mlir::CallOpInterface, mlir::FunctionOpInterface), mlir::ModuleAllocation::ModuleAllocation(mlir::ModuleOp)::'lambda'(mlir::FunctionOpInterface)>(mlir::FunctionOpInterface, llvm::DenseSet<mlir::FunctionOpInterface, llvm::DenseMapInfo<mlir::FunctionOpInterface, void>>&, mlir::ModuleAllocation::ModuleAllocation(mlir::ModuleOp)::'lambda'(mlir::CallOpInterface, mlir::FunctionOpInterface), mlir::ModuleAllocation::ModuleAllocation(mlir::ModuleOp)::'lambda'(mlir::FunctionOpInterface)) third_party/triton/include/triton/Analysis/Utility.h:350:7
    #23 0x556ee16756b3 in walk<(mlir::WalkOrder)0, (mlir::WalkOrder)1, (lambda at third_party/triton/include/triton/Analysis/Allocation.h:222:9), (lambda at third_party/triton/include/triton/Analysis/Allocation.h:224:9)> third_party/triton/include/triton/Analysis/Utility.h:242:7
    #24 0x556ee16756b3 in mlir::ModuleAllocation::ModuleAllocation(mlir::ModuleOp) third_party/triton/include/triton/Analysis/Allocation.h:220:5
    #25 0x556ee2c2bf18 in (anonymous namespace)::AllocateSharedMemory::runOnOperation() third_party/triton/lib/Conversion/TritonGPUToLLVM/AllocateSharedMemory.cpp:26:22
...
UndefinedBehaviorSanitizer: invalid-shift-exponent third_party/triton/third_party/f2reduce/f2reduce.cpp:421:30 
```
</p>
</details>
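
For reference, here is a standalone sketch of the early-return idea: Gaussian elimination over GF(2) on bit-packed rows, returning rank 0 immediately for degenerate shapes instead of letting the elimination routine derive an undefined shift from a zero column count. This is an illustration only, not the actual patch to getMatrixRank in lib/Tools/LinearLayout.cpp.

```cpp
#include <cstdint>
#include <iostream>
#include <utility>
#include <vector>

// Rank of a bit-matrix over GF(2); each uint64_t holds one row
// (assumes numCols <= 64). The early return is the point: a matrix
// with 0 rows or 0 columns has rank 0 by definition, so elimination
// (and its shift by numCols - 1, which underflows when numCols == 0)
// is skipped entirely.
int rankGF2(std::vector<uint64_t> rows, int numCols) {
  if (rows.empty() || numCols == 0)
    return 0; // degenerate shape: rank 0, no elimination

  int rank = 0;
  for (int col = 0; col < numCols; ++col) {
    uint64_t mask = uint64_t{1} << col; // safe: 0 <= col < numCols <= 64
    // Find a pivot row with a 1 in this column.
    int pivot = -1;
    for (int r = rank; r < (int)rows.size(); ++r) {
      if (rows[r] & mask) { pivot = r; break; }
    }
    if (pivot < 0)
      continue;
    std::swap(rows[rank], rows[pivot]);
    // Clear this column in every other row (XOR is addition in GF(2)).
    for (int r = 0; r < (int)rows.size(); ++r) {
      if (r != rank && (rows[r] & mask))
        rows[r] ^= rows[rank];
    }
    ++rank;
  }
  return rank;
}

int main() {
  std::cout << rankGF2({}, 0) << "\n";             // 0: degenerate, no UB
  std::cout << rankGF2({0b101, 0b011}, 3) << "\n"; // 2
}
```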
minjang pushed a commit that referenced this pull request Jun 24, 2024
Devjiu pushed a commit to Devjiu/triton-cpu that referenced this pull request Aug 13, 2024
int3 pushed a commit that referenced this pull request Aug 29, 2024
minjang pushed a commit that referenced this pull request Sep 22, 2024
minjang pushed a commit that referenced this pull request Oct 22, 2024
minjang pushed a commit that referenced this pull request Oct 24, 2024
Devjiu pushed a commit to Devjiu/triton-cpu that referenced this pull request Nov 13, 2024
Adds a Python wrapper for a parallelized in-place copy function using libxsmm and OpenMP.
It is intended to be used for an efficient tensor padding implementation.

The libxsmm paths have to be specified through env variables:
  - XSMM_ROOT_DIR - path to the libxsmm root dir with headers
  - XSMM_LIB_DIR - path to the libxsmm.so location

The libxsmm .so also has to be available at runtime, e.g., exposed through LD_LIBRARY_PATH.
The XSMM Python module can be built and installed using the command:
  pip install -e ./third_party/cpu/python/
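
The parallelization pattern behind such a wrapper, as a hypothetical standalone sketch: plain memcpy/memset stand in for the libxsmm kernels, and paddedCopy is an invented name, not the wrapper's actual API.

```cpp
#include <cstring>
#include <iostream>
#include <vector>

// Row-parallel padded copy: each OpenMP thread copies one source row
// into a wider destination row and zero-fills the padding tail.
// Assumes dstCols >= srcCols. The real wrapper dispatches to libxsmm
// kernels instead of memcpy/memset; this only shows the pattern.
void paddedCopy(const float *src, float *dst,
                int rows, int srcCols, int dstCols) {
#pragma omp parallel for
  for (int r = 0; r < rows; ++r) {
    std::memcpy(dst + (size_t)r * dstCols,
                src + (size_t)r * srcCols,
                sizeof(float) * srcCols);
    std::memset(dst + (size_t)r * dstCols + srcCols, 0,
                sizeof(float) * (dstCols - srcCols));
  }
}

int main() {
  std::vector<float> src = {1, 2, 3, 4}; // 2x2 tensor
  std::vector<float> dst(2 * 4, -1.0f);  // pad each row to 4 columns
  paddedCopy(src.data(), dst.data(), 2, 2, 4);
  for (float v : dst) std::cout << v << " "; // 1 2 0 0 3 4 0 0
  std::cout << "\n";
}
```

Compile with -fopenmp to get the parallel loop (it runs serially without it); the XSMM_ROOT_DIR/XSMM_LIB_DIR variables above matter only for the real libxsmm-backed build.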
Devjiu pushed a commit to Devjiu/triton-cpu that referenced this pull request Nov 13, 2024
ienkovich pushed a commit to ienkovich/triton-cpu that referenced this pull request Nov 20, 2024
int3 pushed a commit that referenced this pull request Dec 6, 2024
ienkovich pushed a commit that referenced this pull request Dec 6, 2024
Devjiu pushed a commit to Devjiu/triton-cpu that referenced this pull request Jan 20, 2025