fee based txs/chunks selector #1833

Open · wants to merge 20 commits into base: main
Conversation

@tsachiherman (Contributor) commented Dec 13, 2024

What?

This PR adds support for selecting the largest set of transactions/chunks based on their fees.

I haven't included any usage of that algorithm in this PR, just the algorithm itself (plus a unit test).

Description

LargestSet takes a slice of dimensions and a dimensions limit, and finds the
largest set of dimensions that fits within the provided limit. The returned
slice contains indices relative to the provided [dimensions] slice. Note that
the largest set is not unique: by the nature of the problem, there can be
multiple different "correct" results.

Algorithm (high level)

  1. Scaling: Each dimension is scaled relative to its respective limit
    to ensure fair comparison. This step normalizes the dimensions,
    preventing larger dimensions from dominating the selection process.
  2. Vector Sizing: Each dimensions entry is treated as a vector, and its
    magnitude is calculated. This metric is used to assess the relative
    significance of each entry.
  3. Greedy Selection: Entries are iteratively selected in order of their
    scaled magnitudes and added to the subset until the size limit is
    reached. The greedy approach packs the smallest entries first,
    maximizing the number of entries in the subset (see the sketch right
    after this list).
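
As a rough illustration (not the exact code in this PR), the selection loop could look like the following. It assumes this package's Dimensions type with the CanAdd and AddDimentions helpers that appear in the diff below; scaledMagnitudeSquared is a hypothetical helper, sketched under the implementation notes:

```go
import "sort"

// largestSetSketch returns the indices of the entries that were packed,
// relative to the input slice, mirroring the LargestSet contract above.
func largestSetSketch(dimensions []Dimensions, limit Dimensions) []int {
	// Order candidate indices by scaled squared magnitude, smallest first,
	// so many small entries are packed before a few large ones.
	indices := make([]int, len(dimensions))
	for i := range indices {
		indices[i] = i
	}
	sort.Slice(indices, func(a, b int) bool {
		return scaledMagnitudeSquared(dimensions[indices[a]], limit) <
			scaledMagnitudeSquared(dimensions[indices[b]], limit)
	})

	// Greedily accumulate entries while they still fit under the limit.
	var accumulator Dimensions
	out := make([]int, 0, len(indices))
	for _, idx := range indices {
		if !accumulator.CanAdd(dimensions[idx], limit) {
			continue
		}
		next, err := accumulator.AddDimentions(dimensions[idx])
		if err != nil {
			continue // overflow; skip this entry
		}
		accumulator = next
		out = append(out, idx)
	}
	return out
}
```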

Implementation notes:

  • Precision: To mitigate potential precision issues arising from
    small-scale dimensions, the function employs a scaling factor to
    amplify values before normalization. This ensures accurate calculations
    even when dealing with very small numbers.
  • Efficiency: The squared magnitude is used as a proxy for the Euclidean
    norm, optimizing performance without sacrificing accuracy for relative
    comparisons. This avoids the computationally expensive square root
    operation. Both ideas are sketched below.
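
A minimal sketch of both notes, assuming integer math and an illustrative amplification constant (not necessarily this PR's exact choices):

```go
const precisionScale = 1 << 20 // illustrative: amplify before normalizing

// scaledMagnitudeSquared scales each dimension against its limit and returns
// the squared magnitude, which preserves the ordering of the Euclidean norm
// while avoiding the square root.
func scaledMagnitudeSquared(d, limit Dimensions) uint64 {
	var total uint64
	for i := range d {
		if limit[i] == 0 {
			continue // skip unused dimensions to avoid division by zero
		}
		// Amplify before dividing so small values don't collapse to zero
		// under integer division.
		scaled := d[i] * precisionScale / limit[i]
		total += scaled * scaled
	}
	return total
}
```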

Motivation

The motivation behind this algorithm, in contrast to a FIFO ordering, is the following:

  1. When using a FIFO ordering (more precisely, an optimistic attempt to include the transactions in the order they were received), we are effectively using a random ordering. While this random ordering could happen to be the ideal one, it rarely would be.
  2. Any stable algorithm (such as this one) allows us to prioritize transaction throughput over chunk/block saturation.

Consider the following (single-dimension) challenge:

  1. You have 5 transactions with the following weights: [5,5,1,1,1]. The chunk capacity is 5.
  2. Using FIFO, you would generate the following chunks: [5][5][1,1,1]
  3. Using the above algorithm, you would generate the following: [1,1,1][5][5]

This doesn't seem very meaningful as is, but imagine that you keep getting a stream of 3 transactions of weight 1 once every chunk/block (a high-demand scenario). Now, you would be generating the following:

  1. In the case of FIFO: [5],[5],[1,1,1,1,1],[1,1,1,1,1]..
  2. In the case of the above algorithm: [1,1,1],[1,1,1],...,[5],[5]

Hence, the network would favor TPS over chunk/block saturation. Big transactions would get pushed down the road.
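
To make the single-dimension example concrete, here is a small self-contained sketch (plain uint64 weights rather than the Dimensions type) comparing the two packing policies:

```go
package main

import (
	"fmt"
	"sort"
)

// pack fills chunks of the given capacity, taking weights in the order
// provided and starting a new chunk when the next weight would not fit.
func pack(weights []uint64, capacity uint64) [][]uint64 {
	var chunks [][]uint64
	var current []uint64
	var used uint64
	for _, w := range weights {
		if used+w > capacity {
			chunks = append(chunks, current)
			current, used = nil, 0
		}
		current = append(current, w)
		used += w
	}
	return append(chunks, current)
}

func main() {
	weights := []uint64{5, 5, 1, 1, 1}
	fmt.Println(pack(weights, 5)) // FIFO order: [[5] [5] [1 1 1]]

	sorted := append([]uint64(nil), weights...)
	sort.Slice(sorted, func(a, b int) bool { return sorted[a] < sorted[b] })
	fmt.Println(pack(sorted, 5)) // smallest-first: [[1 1 1] [5] [5]]
}
```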

@tsachiherman tsachiherman self-assigned this Dec 13, 2024
@tsachiherman tsachiherman marked this pull request as ready for review December 13, 2024 18:30
@@ -73,6 +73,18 @@ func (d Dimensions) Add(i Dimension, v uint64) error {
return nil
}

func (d Dimensions) AddDimentions(a Dimensions) (Dimensions, error) {
Collaborator

Can we change the function signature to AddDimensions(a Dimensions, b Dimensions), since this returns a new Dimensions object rather than mutating the receiver, to match the other functions in this package (e.g. Add)?

Collaborator

Can we fix the typo AddDimensions as well?
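
A sketch of the suggested shape (the names and the overflow sentinel here are illustrative):

```go
import "errors"

var errDimensionOverflow = errors.New("dimensions sum overflows") // hypothetical sentinel

// AddDimensions takes both operands as arguments and returns a new
// Dimensions value, so nothing appears to mutate the receiver.
func AddDimensions(a, b Dimensions) (Dimensions, error) {
	var out Dimensions
	for i := range a {
		sum := a[i] + b[i]
		if sum < a[i] { // uint64 overflow check; a safe-math helper would also work
			return Dimensions{}, errDimensionOverflow
		}
		out[i] = sum
	}
	return out, nil
}
```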

fees/set.go Outdated
Comment on lines 70 to 77
var err error
for i := 0; i < len(outIndices); i++ {
dim := dimensions[outIndices[i]]
if !accumulator.CanAdd(dim, limit) {
outIndices = append(outIndices[:i], outIndices[i+1:]...)
i--
continue
}
Collaborator

Can we add a benchmark for this function? I'm curious about the performance impact of making this O(n^2) due to appending the remainder of the slice.

Contributor Author

You're correct that it would be O(n^2) in the worst-case scenario. I'll address that.
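
For reference, one standard way to avoid the quadratic splice is the in-place filter idiom (a sketch, not necessarily the fix that landed here):

```go
// Compact the accepted indices with a write position instead of deleting
// rejected entries via append, which shifts the slice tail on every removal.
kept := outIndices[:0]
for _, idx := range outIndices {
	dim := dimensions[idx]
	if !accumulator.CanAdd(dim, limit) {
		continue // rejected entries are simply not copied forward: O(1) each
	}
	// ... accumulate dim into accumulator as before ...
	kept = append(kept, idx)
}
outIndices = kept
```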

fees/set_test.go Outdated

for i := range dimensions {
d := uint64(i)
dimensions[i] = Dimensions{d % 1000, (d + 200) % 1000, (d + 400) % 1000, (d + 600) % 1000, (d + 800) % 1000}
Collaborator

Where is this coming from?

Contributor Author

It creates an arbitrary set of dimensions that are uniformly distributed and (mostly) non-identical.
I believe that for the purpose of evaluating the overall performance of the algorithm, it provides a reasonable test set. We could (naturally) pick a different test set, but I wouldn't expect any material performance difference.

Collaborator

Could we make this more obvious by using a random number modulo the desired upper bound rather than constructing it this way?

Could we also pre-generate the inputs if we're going to use the same seed, and make sure to reset the timer when starting the actual benchmark?
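
Something along these lines, perhaps (a sketch for a _test.go file; the function name, sizes, and limit values are illustrative):

```go
import (
	"math/rand"
	"testing"
)

func BenchmarkLargestSet(b *testing.B) {
	// Pre-generate the inputs with a fixed seed so runs are reproducible
	// and generation cost stays outside the measured loop.
	r := rand.New(rand.NewSource(0))
	dimensions := make([]Dimensions, 10_000)
	for i := range dimensions {
		for j := range dimensions[i] {
			dimensions[i][j] = r.Uint64() % 1000 // random value modulo the desired upper bound
		}
	}
	limit := Dimensions{100_000, 100_000, 100_000, 100_000, 100_000}

	b.ResetTimer() // measure only the selection itself
	for n := 0; n < b.N; n++ {
		LargestSet(dimensions, limit)
	}
}
```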
