Deploying to gh-pages from @ 9162879 🚀
chhzh123 committed Jan 13, 2025
1 parent 3b78bfa commit 4045774
Showing 60 changed files with 5,243 additions and 231 deletions.
2 changes: 1 addition & 1 deletion .buildinfo
@@ -1,4 +1,4 @@
 # Sphinx build info version 1
 # This file records the configuration used when building these files. When it is not found, a full rebuild will be done.
-config: c9e032fa7f49dc809f164063dfb2e28e
+config: dabe4189cc532017a30e750cb5d38b1e
 tags: 645f666f9bcd5a90fca523b33c5a78b7
183 changes: 183 additions & 0 deletions _downloads/39c6904b3f007c07e3d59200d0bf98b4/dive_03_composition.ipynb
@@ -0,0 +1,183 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n# Kernel Composition\n\n**Author**: Hongzheng Chen ([email protected])\n\nThis document will discuss kernel composition.\nIn the previous tutorials, we have seen how to write a simple kernel.\nHowever, in real applications, we often need to compose multiple kernels together.\n\nIn the following example, we define a ``matrix_add`` and a ``gemm`` kernel, and wrap them into a ``top``-level function.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import allo\nfrom allo.ir.types import int32, float32\n\nM, K, N = 32, 32, 32\n\n\ndef matrix_add(A: int32[M, N]) -> int32[M, N]:\n B: int32[M, N] = 0\n for i, j in allo.grid(M, N):\n B[i, j] = A[i, j] + 1\n return B\n\n\ndef gemm(A: int32[M, K], B: int32[K, N]) -> int32[M, N]:\n C: int32[M, N] = 0\n for i, j in allo.grid(M, N):\n for k in allo.reduction(K):\n C[i, j] += A[i, k] * B[k, j]\n return C\n\n\ndef top(A: int32[M, K], B: int32[K, N]) -> int32[M, N]:\n C = gemm(A, B)\n D = matrix_add(C)\n return D"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Different teams or people can then work on different parts of the code and optimize each kernel.\nWe first create a schedule for the ``matrix_add`` kernel, and add several optimizations.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"s1 = allo.customize(matrix_add)\ns1.pipeline(\"j\")\nprint(s1.module)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then we create a schedule for the ``gemm`` kernel and optimize it.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"s2 = allo.customize(gemm)\ns2.reorder(\"k\", \"j\")\ns2.buffer_at(s2.C, axis=\"i\")\ns2.pipeline(\"j\")\nprint(s2.module)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Notice that now we only optimize the separate kernels but do not incorporate them into the top-level function, as shown in the following printed module.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"s = allo.customize(top)\nprint(s.module)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Therefore, after each part has been optimized, we need to explicitly *compose* them together.\nIn Allo, we can use the ``.compose()`` primitive to compose the schedules together into the parent function.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"s.compose([s1, s2])\nprint(s.module)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can see that the schedules for the ``matrix_add`` and ``gemm`` kernels are both correctly optimized in the top-level function.\n\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Template Composition\nSometimes we may define template kernels and invoke the kernel with different template arguments. Allo provides an *id* option to specify the exact kernel to be composed.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def kernel[T_in, T_out, S](A: \"T_in[S]\") -> \"T_out[S]\":\n B: T_out[S] = 0\n for i in range(S):\n with allo.meta_if(T_out == int32):\n B[i] = A[i] + 1\n with allo.meta_else():\n B[i] = A[i] * 2\n return B\n\n\ndef top2(A: int32[M]) -> float32[M]:\n C = kernel[int32, int32, M, \"K1\"](A)\n D = kernel[int32, float32, M, \"K2\"](C)\n return D"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Specifically, the last argument of the template kernel is the *id* of the kernel. Later on we can use this ID for distinguishing different kernels during composition.\nWe also customize the two template kernels with different optimizations first.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"s1 = allo.customize(kernel, instantiate=[int32, int32, M])\ns1.unroll(\"i\", factor=4)\nprint(s1.module)\n\ns2 = allo.customize(kernel, instantiate=[int32, float32, M])\ns2.pipeline(\"i\")\nprint(s2.module)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, we compose the two template kernels into the top-level function with the ID specified.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"s = allo.customize(top2)\ns.compose(s1, id=\"K1\")\ns.compose(s2, id=\"K2\")\nprint(s.module)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can see from the printed module that the loop in the first kernel is unrolled by a factor of 4, and the loop in the second kernel is pipelined.\n\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.5"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
114 changes: 114 additions & 0 deletions _downloads/4fba383e419c1fc1ea22179140eb2d12/dive_01_data_types.py
@@ -0,0 +1,114 @@
# Copyright Allo authors. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0

"""
Data Types and Type Casting
===========================
**Author**: Hongzheng Chen ([email protected])
This document will discuss the Allo-supported data types in detail.
All the data types are defined in the ``allo.ir.types`` module.
"""

import allo
from allo.ir.types import int16, int32, float32, Int, UInt, Float, Fixed

##############################################################################
# Currently, Allo supports three base data types for mathematical operations:
#
# - Integers: ``Int(bitwidth)``, ``UInt(bitwidth)``
# - Floating points: ``Float(bitwidth)`` (only 16, 32, and 64 bits are supported)
# - Fixed points: ``Fixed(bitwidth, frac)``, ``UFixed(bitwidth, frac)``
#
# For example, one can declare a 15-bit integer as ``Int(15)`` and an unsigned 8-bit fixed-point number with 3 fractional bits as ``UFixed(8, 3)``.
# For all the C/C++ supported data types, we provide shorthands like ``float32`` and ``int16`` to easily declare them.
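
# %%
# As a minimal sketch (assuming ``UFixed`` is also imported from
# ``allo.ir.types``), the custom-precision types can be bound to names once and
# then reused as ordinary type annotations:
#
# .. code-block:: python
#
#    int15 = Int(15)  # 15-bit signed integer
#    ufixed8_3 = UFixed(8, 3)  # unsigned 8-bit fixed point, 3 fractional bits
#
#    def increment(a: int15) -> int15:
#        return a + 1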

# %%
# Notice that, unlike native Python, Allo requires the program to be **strongly and statically typed**.
# The variable types are either declared explicitly or inferred from the context.
# For a variable that first appears in the program, we should declare it with an expected data type using Python's type hint notation:

a: int32

# %%
# Once the data types are defined, an important consideration is how to handle
# operations between variables of different types. Allo supports two kinds of casting:
# (1) implicit casting, which is performed automatically by the Allo compiler;
# and (2) explicit casting, which is performed manually by the user.

##############################################################################
# Implicit Casting
# ----------------
# Allo has a strong type system that follows the `MLIR convention <https://mlir.llvm.org/docs/Dialects/ArithOps/>`_ and enforces that the operands of an arithmetic operation have the same type.
# However, casting variables by hand every time is burdensome, and manually guarding against overflow is error-prone.
# Therefore, Allo is equipped with built-in casting rules that automatically cast operands to a common type before the operation, which is called *implicit casting*.
# An example is shown below:


def add(a: int32, b: int32) -> int32:
return a + b


s = allo.customize(add)
print(s.module)

# %%
# We can see that ``a`` and ``b`` are first cast to ``int33``, added
# together, and the result is cast back to ``int32``.
# The widening avoids overflow and is inferred automatically by the Allo compiler.


##############################################################################
# Explicit Casting
# ----------------
# One can also explicitly cast a variable to a specific type by creating an intermediate variable,
# or by using Python built-in functions like ``float()`` and ``int()`` to cast it to ``float32`` or ``int32``.
# Another example is shown below:


def cast(a: int32) -> int16:
b: float32 = a # explicit
c: float32 = b * 2
d: float32 = float(a) * 2
e: int16 = c + d
return e


s = allo.customize(cast)
print(s.module)

# %%
# By explicitly creating an intermediate variable ``b``, we can cast the ``int32`` variable ``a`` to the desired floating-point type.
# Similarly, calling ``float(a)`` can also cast ``a`` to a floating-point type.
#
# .. note::
#
#    The explicit casting between integers and floating points described above preserves the value, though some precision may be lost.
#    If you want to reinterpret the same bits as a different type (like a C union of integer and floating point), use the ``.bitcast()`` API instead. For example, ``a.bitcast()`` can convert an ``int32`` to a ``float32`` representation with the bit pattern preserved.
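
# %%
# As a minimal sketch of the difference (the exact call site of ``.bitcast()``
# is an assumption based on the note above):
#
# .. code-block:: python
#
#    def reinterpret(a: int32) -> float32:
#        b: float32 = a.bitcast()  # same 32-bit pattern, now read as float32
#        return b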

##############################################################################
# Bit Operations
# --------------
# As hardware accelerators can manipulate each bit of the data, Allo supports bit operations on
# its integer types. For example, we can access a specific bit of an integer ``a`` using the indexing operator:
#
# .. code-block:: python
#
#    a[15]

# %%
# We can also extract a chunk of bits from an integer using the slicing operator:
#
# .. code-block:: python
#
#    a[0:16]
#
# .. note::
#
#    Allo follows the Python convention that the upper bound is not included, so ``[0:16]`` means
#    extracting the first 16 bits, which is different from the Xilinx HLS convention that uses ``[0:15]``
#    to indicate the first 16 bits.

# %%
# Not only constant values but also variables can be used as the bit index or the slice range, as the sketch below illustrates.
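
# %%
# A minimal sketch (assuming a loop variable may serve as the bit index, per
# the statement above):
#
# .. code-block:: python
#
#    def bit_xor(a: int32, b: int32) -> int32:
#        c: int32 = 0
#        for i in range(16):
#            c[i] = a[i] ^ b[i]  # variable ``i`` selects the bit position
#        return c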
