
add: columns to Eth2Processor and BlockProcessor #6862

Draft · wants to merge 28 commits into base: unstable

Conversation

@agnxsh (Contributor) commented Jan 20, 2025

No description provided.

@agnxsh marked this pull request as draft January 20, 2025 05:56

github-actions bot commented Jan 20, 2025

Unit Test Results

15 files ±0   2 285 suites ±0   1h 6m 40s ⏱️ +1m 28s
5 368 tests ±0   5 021 ✔️ ±0   347 💤 ±0   0 ±0
37 069 runs ±0   36 579 ✔️ ±0   490 💤 ±0   0 ±0

Results for commit 3ea4f12. ± Comparison against base commit 5547d2a.

♻️ This comment has been updated with latest results.

@@ -199,9 +245,62 @@ proc get_data_column_sidecars*(signed_beacon_block: electra.TrustedSignedBeaconB

sidecars

# Additional overload to perform reconstruction at the time of gossip
Contributor

return err("DataColumnSidecar: Length should not be 0")

var
  columnCount = data_columns.len
Contributor

can/should be let


var
  columnCount = data_columns.len
  blobCount = data_columns[0].column.len
Contributor

also can/should be let
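
For reference, a sketch of the suggested change, using the names from the diff above:

let
  columnCount = data_columns.len
  blobCount = data_columns[0].column.len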

for column_index in 0..<NUMBER_OF_COLUMNS:
  var
    column_cells: seq[KzgCell]
    column_proofs: seq[KzgProof]
Contributor

Both of these could use newSeqOfCap with cellsAndProofs.len
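
A sketch of what that could look like, assuming cellsAndProofs is in scope here as in the surrounding code:

var
  column_cells = newSeqOfCap[KzgCell](cellsAndProofs.len)
  column_proofs = newSeqOfCap[KzgProof](cellsAndProofs.len)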


let kzgCommits =
  signedBlock.message.body.blob_kzg_commitments.asSeq
if dataColumns.len > 0 and kzgCommits.len > 0:
Contributor

This doesn't seem to actually use kzgCommits though, aside from this check?

(Also, if it's just checking len, it shouldn't need asSeq)
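
If only the lengths matter, a sketch of the check without asSeq (field path taken from the diff above):

if dataColumns.len > 0 and
    signedBlock.message.body.blob_kzg_commitments.len > 0:
  discard  # proceed as before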

for dc in data_columns:
  if dc.index in custody_columns:
    final_columns.add dc
dataColumnRefs = Opt.some(final_columns.mapIt(newClone(it)))
Contributor

Why even create the non-ref final_columns to begin with, only to mapIt immediately?

Once you know you're going to set dataColumnRefs = Opt.some ..., you can build it up in place.
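
One possible shape (a sketch; columnRefs is a hypothetical local name, and it assumes DataColumnSidecars is a seq of refs as suggested by the openArray[ref DataColumnSidecar] usage elsewhere in this PR):

var columnRefs: DataColumnSidecars
for dc in data_columns:
  if dc.index in custody_columns:
    columnRefs.add newClone(dc)
dataColumnRefs = Opt.some(columnRefs)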

@@ -1269,6 +1303,12 @@ func getSyncCommitteeSubnets(node: BeaconNode, epoch: Epoch): SyncnetBits =

subnets + node.getNextSyncCommitteeSubnets(epoch)

func readCustodyGroupSubnets*(node: BeaconNode): uint64=
var res = CUSTODY_REQUIREMENT.uint64
@tersec (Contributor) Jan 22, 2025

This whole function can be written as

func readCustodyGroupSubnets(node: BeaconNode): uint64 =
  if node.config.peerdasSupernode:
    NUMBER_OF_CUSTODY_GROUPS.uint64
  else:
    CUSTODY_REQUIREMENT.uint64

without first setting a possibly incorrect value.

Contributor

Also, not sure why it's exported. It's only used in the same module in which it's defined:

beacon_chain/nimbus_beacon_node.nim
1306:func readCustodyGroupSubnets*(node: BeaconNode): uint64=
1349:    targetSubnets = node.readCustodyGroupSubnets()

@@ -79,6 +79,8 @@ const
    int(ConsensusFork.Phase0) .. int(high(ConsensusFork))
  BlobForkCodeRange =
    MaxForksCount .. (MaxForksCount + int(high(ConsensusFork)) - int(ConsensusFork.Deneb))
  DataColumnForkCodeRange =
    MaxForksCount .. (MaxForksCount + int(high(ConsensusFork)) - int(ConsensusFork.Fulu))
Contributor

This is an incorrect range because it overlaps with the blobs range. The idea is
[blocks range][blobs range][columns range] ... high(int)
The allocated range for blocks is [0 .. 16383].
The allocated range for blobs is [16384 .. 32767].
So for columns you should allocate [32768 .. 49151].

Contributor (Author)

  DataColumnForkCodeRange =
    (BlobForkCodeRange.high + 1) .. (BlobForkCodeRange.high + 1 + int(high(ConsensusFork)) - int(ConsensusFork.Fulu))

Something around this value?

func getDataColumnForkCode(fork: ConsensusFork): uint64 =
  case fork
  of ConsensusFork.Fulu:
    uint64(MaxForksCount)
Contributor

Because of the invalid code range, this provides invalid codes which overlap with the blobs range.
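
A sketch of a non-overlapping code, mirroring the rebased DataColumnForkCodeRange suggested above, so the Fulu code starts right after the blob range instead of colliding with its first code:

func getDataColumnForkCode(fork: ConsensusFork): uint64 =
  doAssert fork >= ConsensusFork.Fulu
  uint64(BlobForkCodeRange.high + 1 + ord(fork) - ord(ConsensusFork.Fulu))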

@@ -132,6 +135,27 @@ proc getShortMap*[T](req: SyncRequest[T],
res.add('|')
res

proc getShortMap*[T](req: SyncRequest[T],
                     data: openArray[ref DataColumnSidecar]): string =
Contributor

Looks like this code is based on the old and ugly version of the blobs map, but that was recently changed to a more suitable format. The idea is to show how blobs are distributed over the range request, so now it could look like
12.............................., which means that for the requested range 1 blob was returned for the first block, 2 blobs for the second one, and all other blocks are without blobs. It also looks like you are still using the MAX_BLOBS_PER_BLOCK constant (I'm not sure whether it is appropriate here), but if it is, please change this procedure to the new version.

Contributor

I'm also confused about the usage of MAX_BLOBS_PER_BLOCK. It seems like NUMBER_OF_COLUMNS (https://github.com/ethereum/consensus-specs/blob/v1.5.0-beta.1/specs/fulu/p2p-interface.md#configuration for example) is more relevant.
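
A rough sketch of the described per-block count format (this is not the actual helper; it assumes each sidecar exposes its slot via signed_block_header.message.slot, that SyncRequest has slot/count fields, and it clamps counts to a single digit):

proc getShortColumnMapSketch[T](req: SyncRequest[T],
                                data: openArray[ref DataColumnSidecar]): string =
  # one character per block in the requested range: '.' for no sidecars,
  # otherwise the (clamped) number of sidecars returned for that block
  var counts = newSeq[int](req.count.int)
  for sidecar in data:
    let offset = sidecar[].signed_block_header.message.slot - req.slot
    if offset < req.count:
      inc counts[offset.int]
  for c in counts:
    result.add(if c == 0: '.' else: char(ord('0') + min(c, 9)))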

data: openArray[Slot]):
Result[void, cstring] =
if data.len == 0:
# Impossible to verify empty response
@tersec (Contributor) Jan 25, 2025

Maybe a nit, I guess? But it's clearly not impossible, since this function does it (affirmatively).

I'd call this something akin to https://en.wikipedia.org/wiki/Vacuous_truth -- maybe vacuously valid.


static: doAssert MAX_BLOBS_PER_BLOCK_ELECTRA >= MAX_BLOBS_PER_BLOCK

if lenu64(data) > (req.count * MAX_BLOBS_PER_BLOCK_ELECTRA):
Contributor

According to https://github.com/ethereum/consensus-specs/blob/v1.5.0-beta.1/specs/fulu/p2p-interface.md#datacolumnsidecarsbyrange-v1 the correct check is:

The response MUST contain no more than count * NUMBER_OF_COLUMNS data column sidecars.

None of the MAX_BLOBS_PER_BLOCK_ELECTRA/MAX_BLOBS_PER_BLOCK checks apply here.
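
A sketch of the bound from the quoted spec text (assuming NUMBER_OF_COLUMNS needs the uint64 conversion, and with an illustrative error message):

if lenu64(data) > (req.count * NUMBER_OF_COLUMNS.uint64):
  return err("response exceeds count * NUMBER_OF_COLUMNS")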

    return err("incorrect order")
  if slot == pSlot:
    inc counter
    if counter > MAX_BLOBS_PER_BLOCK_ELECTRA:
Contributor

See also NUMBER_OF_COLUMNS

@@ -429,6 +429,17 @@ proc initFullNode(
        node.network.nodeId.resolve_column_sets_from_custody_groups(
          max(SAMPLES_PER_SLOT.uint64,
            localCustodyGroups))
    custody_columns_list =
      node.network.nodeId.resolve_column_list_from_custody_groups(
        max(SAMPLES_PER_SLOT.uint64,
Contributor

This max(SAMPLES_PER_SLOT.uint64, localCustodyGroups) computation occurs 3 times in 11 lines of code. Maybe better to extract it once.
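
For example (targetGroups and custody_columns_set are names assumed for illustration; custody_columns_list and the calls are from the diff above):

let targetGroups = max(SAMPLES_PER_SLOT.uint64, localCustodyGroups)  # computed once
custody_columns_set =
  node.network.nodeId.resolve_column_sets_from_custody_groups(targetGroups)
custody_columns_list =
  node.network.nodeId.resolve_column_list_from_custody_groups(targetGroups)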

custody = node.network.nodeId.get_custody_groups(max(SAMPLES_PER_SLOT.uint64,
targetSubnets.uint64))

for i in 0'u64..<NUMBER_OF_CUSTODY_GROUPS:
Contributor

Why not just iterate directly through custody? This retains the ordering but that's not important for node.network.subscribe.

@@ -642,6 +726,24 @@ proc storeBlock(
msg = r.error()
return err((VerifierError.Invalid, ProcessingStatus.completed))

elif typeof(signedBlock).kind >= ConsensusFork.Fulu:
Contributor

when typeof(signedBlock).kind >= ConsensusFork.Deneb:
  if blobsOpt.isSome:
    let blobs = blobsOpt.get()
    let kzgCommits = signedBlock.message.body.blob_kzg_commitments.asSeq
    if blobs.len > 0 or kzgCommits.len > 0:
      let r = validate_blobs(kzgCommits, blobs.mapIt(KzgBlob(bytes: it.blob)),
                             blobs.mapIt(it.kzg_proof))
      if r.isErr():
        debug "blob validation failed",
          blockRoot = shortLog(signedBlock.root),
          blobs = shortLog(blobs),
          blck = shortLog(signedBlock.message),
          kzgCommits = mapIt(kzgCommits, shortLog(it)),
          signature = shortLog(signedBlock.signature),
          msg = r.error()
        return err((VerifierError.Invalid, ProcessingStatus.completed))
elif typeof(signedBlock).kind >= ConsensusFork.Fulu:
  if dataColumnsOpt.isSome:
    let columns = dataColumnsOpt.get()
    let kzgCommits = signedBlock.message.body.blob_kzg_commitments.asSeq
    if columns.len > 0 and kzgCommits.len > 0:
      for i in 0..<columns.len:
        let r =
          verify_data_column_sidecar_kzg_proofs(columns[i][])
        if r.isErr:
          malformed_cols.add(i)
          debug "data column validation failed",
            blockRoot = shortLog(signedBlock.root),
            column_sidecar = shortLog(columns[i][]),
            blck = shortLog(signedBlock.message),
            signature = shortLog(signedBlock.signature),
            msg = r.error()
          return err((VerifierError.Invalid, ProcessingStatus.completed))

This has the same ConsensusFork.Deneb/ConsensusFork.Fulu check order issue as the other case, where the Fulu one will never run.
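
A sketch of the reordering that makes the Fulu branch reachable, i.e. testing the more specific fork first (bodies elided; they stay as in the diff above):

when typeof(signedBlock).kind >= ConsensusFork.Fulu:
  if dataColumnsOpt.isSome:
    discard  # data column KZG proof verification, as above
elif typeof(signedBlock).kind >= ConsensusFork.Deneb:
  if blobsOpt.isSome:
    discard  # blob KZG verification, as above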

MsgSource.gossip, parentBlck.unsafeGet().asSigned(), blobs,
Opt.none(DataColumnSidecars))

var columnsOk = true
Contributor

let
  parent_root = signedBlock.message.parent_root
  parentBlck = dag.getForkedBlock(parent_root)
if parentBlck.isSome():
  var blobsOk = true
  let blobs =
    withBlck(parentBlck.get()):
      when consensusFork >= ConsensusFork.Deneb:
        var blob_sidecars: BlobSidecars
        for i in 0 ..< forkyBlck.message.body.blob_kzg_commitments.len:
          let blob = BlobSidecar.new()
          if not dag.db.getBlobSidecar(parent_root, i.BlobIndex, blob[]):
            blobsOk = false # Pruned, or inconsistent DB
            break
          blob_sidecars.add blob
        Opt.some blob_sidecars
      else:
        Opt.none BlobSidecars
  if blobsOk:
    debug "Loaded parent block from storage", parent_root
    self[].enqueueBlock(
      MsgSource.gossip, parentBlck.unsafeGet().asSigned(), blobs,
      Opt.none(DataColumnSidecars))
  var columnsOk = true
  let columns =
    withBlck(parentBlck.get()):
      when consensusFork >= ConsensusFork.Fulu:
        var data_column_sidecars: DataColumnSidecars
        for i in self.dataColumnQuarantine[].custody_columns:
          let data_column = DataColumnSidecar.new()
          if not dag.db.getDataColumnSidecar(parent_root, i.ColumnIndex, data_column[]):
            columnsOk = false
            break
          data_column_sidecars.add data_column
        Opt.some data_column_sidecars
      else:
        Opt.none DataColumnSidecars
  if columnsOk:
    debug "Loaded parent block from storage", parent_root
    self[].enqueueBlock(
      MsgSource.gossip, parentBlck.unsafeGet().asSigned(), Opt.none(BlobSidecars),
      columns)

It looks like:

  • For a pre-Deneb block (still potentially relevant while genesis syncing, for example) or a Deneb/Electra block with no blobs (less common than before, but it happens), the blobsOk loop will be vacuously valid (no blobs means no invalid blobs, and blobsOk is initialized to true), so

      self[].enqueueBlock(
        MsgSource.gossip, parentBlck.unsafeGet().asSigned(), blobs,
        Opt.none(DataColumnSidecars))

    will run. But because the columnsOk-based check separately/independently enqueues blocks and has the same vacuously-valid logic,

      if columnsOk:
        debug "Loaded parent block from storage", parent_root
        self[].enqueueBlock(...)

    will enqueueBlock(...) the same block again.

In general, I think (but haven't enumerated all cases) that for the usual valid-blobs-or-columns case the vacuous-truth aspect means this will typically double-enqueueBlock things (blobs but not columns: the columns check is vacuously true, and the blobs check might be too, if the blobs are valid).

Actually, it's even worse: because blocks will have one or the other, all blocks will be vacuously valid here for either blobs or columns and get enqueued at least once, so this opens a security bypass in block_processor.
