ARROW-10386 [R]: List column class attributes not preserved in roundtrip #8549
romainfrancois wants to merge 2 commits into apache:master
This again increases the weight of the metadata, which now has to include attributes for each element of a list column, aka a struct array, but it seems to work on this example. Can't use
test_that("metadata of list elements (ARROW-10386)", {
  df <- data.frame(x = I(list(structure(1, foo = "bar"), structure(2, foo = "bar"))))
  tab <- Table$create(df)
  expect_identical(attr(as.data.frame(tab)$x[[1]], "foo"), "bar")
  expect_identical(attr(as.data.frame(tab)$x[[2]], "foo"), "bar")
})
Thanks - can confirm the sf object now roundtrips correctly on my machine as well using the current HEAD version.
…st itself. ARROW-10386.
d716a74 to 4d1e73d
jonkeane left a comment
This looks good to me, with a few minor suggestions.
      apply_arrow_r_metadata(.x, .y)
    })
    x
  }
Could we move !is.null(columns_metadata) up to line 290 instead of having it on both 291 and 296?
  df <- data.frame(x = I(list(structure(1, foo = "bar"), structure(2, foo = "bar"))))
  tab <- Table$create(df)
  expect_identical(attr(as.data.frame(tab)$x[[1]], "foo"), "bar")
  expect_identical(attr(as.data.frame(tab)$x[[2]], "foo"), "bar")
I wonder if it would be clearer to have different attributes on each item/row to make it super obvious that we're not picking the attributes of the first item/row and copying them for the whole column.
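To illustrate the concern, here is a small base R sketch. `buggy_roundtrip()` is a hypothetical stand-in, not arrow code: it deliberately copies the first element's attributes onto every element. With identical attributes on each element the bug goes unnoticed; with distinct attributes (`baz = "qux"` is an invented second attribute) it is caught.

```r
# Toy stand-in for a faulty roundtrip (NOT arrow code): it copies the
# attributes of the first element onto every element of the list.
buggy_roundtrip <- function(x) {
  first_attrs <- attributes(x[[1]])
  lapply(x, function(el) {
    attributes(el) <- first_attrs
    el
  })
}

# Same attribute on every element, as in the test under review.
same <- list(structure(1, foo = "bar"), structure(2, foo = "bar"))

# A different attribute on each element, as suggested in the review.
distinct <- list(structure(1, foo = "bar"), structure(2, baz = "qux"))

# Compare each input with its "roundtripped" version.
same_ok <- identical(buggy_roundtrip(same), same)
distinct_ok <- identical(buggy_roundtrip(distinct), distinct)

print(c(same = same_ok, distinct = distinct_ok))
```

Only the `distinct` input tells the faulty roundtrip apart from a correct one, which is why per-element attributes make the test stronger.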
How much does it blow up the metadata? Is this going to scale to be able to handle normal/large shapefiles? Is there a more efficient representation for this metadata (considering that we have to serialize it to a string)? Should we special-case
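One rough way to get a feel for the scaling question is to measure how large the per-element attributes become once serialized to a string. The sketch below uses only base R and assumes the metadata is written with `serialize(..., ascii = TRUE)`; that is an assumption about the representation, and `serialized_size()`, `make_list_column()`, and `element_attributes()` are hypothetical helpers, not arrow functions.

```r
# Hypothetical helper: size in bytes of an object serialized to an ASCII string.
# Assumes the metadata is stored via base R serialize(); this is an assumption.
serialized_size <- function(x) {
  nchar(rawToChar(serialize(x, connection = NULL, ascii = TRUE)), type = "bytes")
}

# Build a list column of n elements, each carrying its own attribute.
make_list_column <- function(n) {
  lapply(seq_len(n), function(i) structure(i, foo = "bar"))
}

# Collect the attributes of every element, one entry per element.
element_attributes <- function(x) lapply(x, attributes)

# Serialized size of the per-element attributes for growing column lengths.
lengths_tried <- c(10, 100, 1000)
sizes <- vapply(
  lengths_tried,
  function(n) serialized_size(element_attributes(make_list_column(n))),
  integer(1)
)

print(data.frame(n = lengths_tried, bytes = sizes))
```

Because one attribute list is stored per element, the serialized size grows with the number of rows, even when every element carries the same attributes.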
Closing in favor of #9182; the relevant commits were cherry-picked there.
…Hub issue numbers (#34260)

Rewrite the Jira issue numbers to the GitHub issue numbers, so that the GitHub issue numbers are automatically linked to the issues by pkgdown's auto-linking feature.

Issue numbers have been rewritten based on the following correspondence. Also, the pkgdown settings have been changed and updated to link to GitHub.

I generated the Changelog page using the `pkgdown::build_news()` function and verified that the links work correctly.

---
ARROW-6338 #5198
ARROW-6364 #5201
ARROW-6323 #5169
ARROW-6278 #5141
ARROW-6360 #5329
ARROW-6533 #5450
ARROW-6348 #5223
ARROW-6337 #5399
ARROW-10850 #9128
ARROW-10624 #9092
ARROW-10386 #8549
ARROW-6994 #23308
ARROW-12774 #10320
ARROW-12670 #10287
ARROW-16828 #13484
ARROW-14989 #13482
ARROW-16977 #13514
ARROW-13404 #10999
ARROW-16887 #13601
ARROW-15906 #13206
ARROW-15280 #13171
ARROW-16144 #13183
ARROW-16511 #13105
ARROW-16085 #13088
ARROW-16715 #13555
ARROW-16268 #13550
ARROW-16700 #13518
ARROW-16807 #13583
ARROW-16871 #13517
ARROW-16415 #13190
ARROW-14821 #12154
ARROW-16439 #13174
ARROW-16394 #13118
ARROW-16516 #13163
ARROW-16395 #13627
ARROW-14848 #12589
ARROW-16407 #13196
ARROW-16653 #13506
ARROW-14575 #13160
ARROW-15271 #13170
ARROW-16703 #13650
ARROW-16444 #13397
ARROW-15016 #13541
ARROW-16776 #13563
ARROW-15622 #13090
ARROW-18131 #14484
ARROW-18305 #14581
ARROW-18285 #14615

* Closes: #33631

Authored-by: SHIMA Tatsuya <ts1s1andn@gmail.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
On the original example:
Created on 2020-10-29 by the reprex package (v0.3.0.9001)