fix: track rate limited errors #8116
new rate limited error messages with this branch:
}
}

// other errors like RequestErrorCode.REQUEST_RATE_LIMITED could come from ourself, not the peer so we should not penalize them
I'm confused about this. Wouldn't an outgoing request that we "rate limit" mean that we are not making the request? How would the error "come from ourself"?
Per spec, we should not send more than 2 requests per protocol to the same peer.
In the future we should enforce that on the client side, i.e. not send more requests to the same peer/protocol.
Then we can avoid that error.
ahh... as in we throw RequestErrorCode.REQUEST_RATE_LIMITED from the client instead of just not making the request?
Right now it'll throw RESP_RATE_LIMITED, i.e. we make a request and the peer returns that error. We don't have a self rate limiter yet; the strategy for now is to control this on the client side, i.e. in SyncChain (I have a local branch for it and will create a PR soon, but I want to do more testing first).
I'll also implement a self rate limiter soon, in which case it'll throw REQUEST_RATE_LIMITED. Then we'll track this error and see which modules need to control active requests the way SyncChain does.
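For illustration, the self rate limiter described above could look roughly like the sketch below. It is only a minimal sketch under assumptions: `SelfRateLimiter`, `RequestRateLimitedError`, and `MAX_CONCURRENT_REQUESTS` are hypothetical names that do not come from this PR, and the limit of 2 is taken from the "2 requests per protocol to the same peer" remark in this thread.

```typescript
// Hypothetical sketch: none of these names exist in the codebase.
// Limit taken from the thread: at most 2 requests per protocol to the same peer.
const MAX_CONCURRENT_REQUESTS = 2;

class RequestRateLimitedError extends Error {
  // Mirrors the REQUEST_RATE_LIMITED code discussed in the thread (assumed shape).
  readonly code = "REQUEST_RATE_LIMITED";
  constructor(peerId: string, protocol: string) {
    super(`too many active requests to ${peerId} for ${protocol}`);
  }
}

class SelfRateLimiter {
  // Number of in-flight requests, keyed by "peerId/protocol".
  private readonly active = new Map<string, number>();

  /** Reserve a slot or throw. Returns a function that releases the slot. */
  acquire(peerId: string, protocol: string): () => void {
    const key = `${peerId}/${protocol}`;
    const count = this.active.get(key) ?? 0;
    if (count >= MAX_CONCURRENT_REQUESTS) {
      // The error originates locally: the request is never sent to the peer.
      throw new RequestRateLimitedError(peerId, protocol);
    }
    this.active.set(key, count + 1);

    let released = false;
    return () => {
      if (released) return; // releasing twice must not free an extra slot
      released = true;
      const current = this.active.get(key) ?? 1;
      if (current <= 1) this.active.delete(key);
      else this.active.set(key, current - 1);
    };
  }
}
```

A caller would wrap each outgoing request in `acquire()` and call the returned function in a `finally` block. Because the error is raised before anything is sent, it says nothing about the peer's behavior, which is why the peer should not be penalized for it.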
return {code: RequestErrorCode.RESP_TIMEOUT};
}

switch (status) {
Can we also add a RespStatus.RATE_LIMITED (which == 139) and check in this switch?
This will help in case error messages in clients change.
A rate limited error is not specified among the response codes: https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/p2p-interface.md#responding-side
Lighthouse returned 139 but other clients returned 1.
I don't think we can improve this unless it's specified in the spec.
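Since clients disagree on the status code, detection has to fall back to the error message. The sketch below shows one way to do that. It is an assumption-laden illustration, not the code in this PR: `classifyResponseError`, both regular expressions, and the string error codes are hypothetical, and the "wait" pattern only guesses at the shape of the "wait #.###" messages mentioned in this review. Status codes 1 to 3 are the ones listed in the linked spec section.

```typescript
// Hypothetical sketch: names and patterns are illustrative assumptions.
type ErrorCode =
  | "RESP_RATE_LIMITED"
  | "INVALID_REQUEST"
  | "SERVER_ERROR"
  | "RESOURCE_UNAVAILABLE"
  | "UNKNOWN_ERROR_STATUS";

// Matches "rate limit", "rate limited", "Rate-limited", "rate_limit", ...
const RATE_LIMIT_PATTERN = /rate[ _-]?limit/i;
// Assumed shape of a "wait #.###" style message, e.g. "wait 1.234".
const WAIT_PATTERN = /\bwait \d+(\.\d+)?/i;

function classifyResponseError(status: number, errorMessage: string): ErrorCode {
  // Check the message first: the same condition arrives as 139 from one
  // client and as 1 from others, so the status alone is not reliable.
  if (RATE_LIMIT_PATTERN.test(errorMessage) || WAIT_PATTERN.test(errorMessage)) {
    return "RESP_RATE_LIMITED";
  }
  switch (status) {
    case 1:
      return "INVALID_REQUEST";
    case 2:
      return "SERVER_ERROR";
    case 3:
      return "RESOURCE_UNAVAILABLE";
    default:
      return "UNKNOWN_ERROR_STATUS";
  }
}
```

The trade-off raised in the review still applies: message matching breaks if a client rewords its error, while a status check on 139 would only cover the clients that happen to use it.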
{
  name: "NA - rate limited",
  errorMessage: "rate limited",
},
Do you think we should add a test case for the "wait #.###" condition?
outgoingErrorReasons: register.gauge<{reason: RequestErrorCode}>({
  name: "beacon_reqresp_outgoing_requests_error_reason_total",
  help: "Count total outgoing request errors by reason",
  labelNames: ["reason"],
}),
This metric is defined as a gauge but is being used with the inc() method in ReqResp.ts, which is counter behavior. For accurate metrics reporting, this should be defined as a counter instead of a gauge. Gauges represent values that can go up and down, while counters are for values that only increase over time, which appears to be the intended usage here.
Suggested change:
- outgoingErrorReasons: register.gauge<{reason: RequestErrorCode}>({
-   name: "beacon_reqresp_outgoing_requests_error_reason_total",
-   help: "Count total outgoing request errors by reason",
-   labelNames: ["reason"],
- }),
+ outgoingErrorReasons: register.counter<{reason: RequestErrorCode}>({
+   name: "beacon_reqresp_outgoing_requests_error_reason_total",
+   help: "Count total outgoing request errors by reason",
+   labelNames: ["reason"],
+ }),
Spotted by Diamond
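To make the counter semantics in this suggestion concrete, here is a small self-contained sketch. `LabeledCounter` is a hypothetical in-memory stand-in written only for illustration; it is not the metrics register used in this PR and it does not reproduce any real metrics library API.

```typescript
// Hypothetical in-memory stand-in, for illustration only.
class LabeledCounter<L extends string> {
  private readonly values = new Map<L, number>();

  /** Increase the series for `label`; a counter may never decrease. */
  inc(label: L, value = 1): void {
    if (value < 0) throw new Error("a counter can only increase");
    this.values.set(label, (this.values.get(label) ?? 0) + value);
  }

  get(label: L): number {
    return this.values.get(label) ?? 0;
  }
}

// Assumed reason labels, named after the error codes discussed in this PR.
type Reason = "RESP_RATE_LIMITED" | "RESP_TIMEOUT";

const outgoingErrorReasons = new LabeledCounter<Reason>();
outgoingErrorReasons.inc("RESP_RATE_LIMITED");
outgoingErrorReasons.inc("RESP_RATE_LIMITED");
outgoingErrorReasons.inc("RESP_TIMEOUT");
```

Each `reason` label gets its own monotonically increasing series, which matches both the `_total` suffix in the metric name and the `inc()` calls at the call site.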
Seems like our unit tests have been failing consistently since we merged this PR.
**Motivation**

- fix unit tests and e2e tests of `peerDAS` branch

**Description**

- when sending status, based on its body we set correct version, otherwise peers cannot deserialize the request body
- at fulu fork transition, update local status cache so that it sends the correct version of status message
- fix failed unit test as introduced by [PR-8116](#8116 (comment))

---------

Co-authored-by: Tuyen Nguyen <twoeths@users.noreply.github.com>
Co-authored-by: Nico Flaig <nflaig@protonmail.com>
Co-authored-by: Cayman <caymannava@gmail.com>
**Motivation**

- track req/resp outgoing request error by reason

**Description**

- the metric was added in #8116

<img width="835" height="609" alt="Screenshot 2025-08-28 at 15 25 59" src="https://github.com/user-attachments/assets/92b97adc-9ae1-4ce0-a1d3-ef32378d5ee0" />

---------

Co-authored-by: Tuyen Nguyen <twoeths@users.noreply.github.com>
Co-authored-by: Cayman <caymannava@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
🎉 This PR is included in v1.34.0 🎉
Motivation
REQUEST_RATE_LIMITED does not happen in that branch
Description
REQUEST_RATE_LIMITED if message contains "rate limit", see rate limited error messages of all clients here
Closes #8065
Closes #8110
Test on dev nodes