ARROW-11434: [Rust][DataFusion] Rename length kernel to octet_length #9366
Closed
seddonm1 wants to merge 1 commit into apache:master
Member
I agree that it may be misleading, but from Rust's perspective, it is not "incorrect" to use This also collides with #9353, where One idea is to keep the name as is on the arrow crate, but name it
Contributor
Author
No problem. I will close this PR and raise one with the function comments updated to clarify its intended behavior.
This PR renames the `length` kernel to `octet_length` to clearly indicate what it returns and to allow differentiation from `character_length`. The term `octet` could be replaced with `bytes`, but it was chosen because there is an ANSI SQL function `octet_length`.

I have created the correct `character_length` function as part of #9243.

Issue

The Rust `length` kernel currently counts the number of `bytes`/`octets`, which may or may not be the same as the number of characters, given that Arrow uses UTF-8 encoding. This means that the result of the `length` kernel on a string like `josé` will be 5 bytes rather than 4 characters.
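
To make the byte/character distinction concrete, here is a minimal sketch using only the Rust standard library. The two helper functions are hypothetical stand-ins named after the SQL functions; they operate on a plain `&str` and are not the Arrow kernels themselves.

```rust
/// Hypothetical helper: number of bytes (octets) in the UTF-8 encoding of `s`.
/// `str::len` reports the byte length, not the number of characters.
fn octet_length(s: &str) -> usize {
    s.len()
}

/// Hypothetical helper: number of Unicode scalar values (`char`s) in `s`.
/// This walks the string, so it is O(n) rather than O(1).
fn character_length(s: &str) -> usize {
    s.chars().count()
}
```

With the precomposed `é` (U+00E9, two bytes in UTF-8), `octet_length("jos\u{e9}")` evaluates to 5 while `character_length("jos\u{e9}")` evaluates to 4. Note that `chars()` counts Unicode scalar values, so a decomposed `e` followed by a combining accent (U+0301) would count as two characters.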