ARROW-14989: [R] Update num_rows methods to output doubles not integers to prevent integer overflow #13482
thisisnic wants to merge 12 commits into apache:master
paleolimbot left a comment
These changes should get it to work. The `explicit` keyword disables C++'s automatic generation of implicit construction via assignment (i.e., `r_vec_size some_variable = (R_xlen_t) 123;`), which is what happens when you return something. (Maybe I'm getting the details slightly wrong here, but that's how I've internalized it. The documentation on this is not awesome: https://en.cppreference.com/w/cpp/language/explicit )
It would be a little more slick to define the proper converting constructor so that you can just `return some_int64;`, but I don't know the exact incantation to make that work.
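The difference between the two constructor styles discussed in this comment can be sketched as follows. This is a minimal illustration only: `explicit_size`, `implicit_size`, `rows_explicit`, and `rows_implicit` are hypothetical names made up for the example, and they are not the types or functions used in the arrow R package.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a size wrapper whose constructor is explicit.
struct explicit_size {
  explicit explicit_size(std::int64_t x) : value(x) {}
  std::int64_t value;
};

// Hypothetical variant with a non-explicit (converting) constructor.
struct implicit_size {
  implicit_size(std::int64_t x) : value(x) {}
  std::int64_t value;
};

// With an explicit constructor, the wrapper has to be named at the return site.
explicit_size rows_explicit(std::int64_t n) {
  // return n;  // would not compile: a return statement copy-initializes its
  //            // result, and copy-initialization skips explicit constructors
  return explicit_size(n);
}

// With a converting constructor, a plain int64 can be returned directly.
implicit_size rows_implicit(std::int64_t n) { return n; }

// Compile-time checks of the two behaviours described above.
static_assert(!std::is_convertible<std::int64_t, explicit_size>::value,
              "explicit constructor blocks implicit conversion");
static_assert(std::is_convertible<std::int64_t, implicit_size>::value,
              "converting constructor allows implicit conversion");
```

Whether the implicit form is desirable is a design choice: it shortens return statements, but it also lets any 64-bit integer silently become the wrapper type wherever one is expected.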
Co-authored-by: Dewey Dunnington <dewey@fishandwhistle.net>
Benchmark runs are scheduled for baseline = 41f8bdf and contender = 7124baf. 7124baf is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
I think there's at least one more row count that needs this attention: https://github.com/apache/arrow/blob/master/r/src/dataset.cpp#L514
…rs to prevent integer overflow

This PR enables `num_rows()` methods to be called on `Table` and `RecordBatch` objects without integer overflow when the value of `num_rows()` is higher than `.Machine$integer.max`. I originally wrote some tests but they take ages to run and crashed on CI anyway so I removed them, but they can be seen in https://github.com/apache/arrow/pull/13482/commits/e7cf8a66beab6d1b7d85304362086b6205a31279/.

Closes apache#13482 from thisisnic/ARROW-14989_num_rows_double

Authored-by: Nic Crane <thisisnic@gmail.com>
Signed-off-by: Nic Crane <thisisnic@gmail.com>
…Hub issue numbers (#34260)

Rewrite the Jira issue numbers to the GitHub issue numbers, so that the GitHub issue numbers are automatically linked to the issues by pkgdown's auto-linking feature.

Issue numbers have been rewritten based on the following correspondence. Also, the pkgdown settings have been changed and updated to link to GitHub.

I generated the Changelog page using the `pkgdown::build_news()` function and verified that the links work correctly.

---

ARROW-6338 #5198
ARROW-6364 #5201
ARROW-6323 #5169
ARROW-6278 #5141
ARROW-6360 #5329
ARROW-6533 #5450
ARROW-6348 #5223
ARROW-6337 #5399
ARROW-10850 #9128
ARROW-10624 #9092
ARROW-10386 #8549
ARROW-6994 #23308
ARROW-12774 #10320
ARROW-12670 #10287
ARROW-16828 #13484
ARROW-14989 #13482
ARROW-16977 #13514
ARROW-13404 #10999
ARROW-16887 #13601
ARROW-15906 #13206
ARROW-15280 #13171
ARROW-16144 #13183
ARROW-16511 #13105
ARROW-16085 #13088
ARROW-16715 #13555
ARROW-16268 #13550
ARROW-16700 #13518
ARROW-16807 #13583
ARROW-16871 #13517
ARROW-16415 #13190
ARROW-14821 #12154
ARROW-16439 #13174
ARROW-16394 #13118
ARROW-16516 #13163
ARROW-16395 #13627
ARROW-14848 #12589
ARROW-16407 #13196
ARROW-16653 #13506
ARROW-14575 #13160
ARROW-15271 #13170
ARROW-16703 #13650
ARROW-16444 #13397
ARROW-15016 #13541
ARROW-16776 #13563
ARROW-15622 #13090
ARROW-18131 #14484
ARROW-18305 #14581
ARROW-18285 #14615

* Closes: #33631

Authored-by: SHIMA Tatsuya <ts1s1andn@gmail.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
This PR enables `num_rows()` methods to be called on `Table` and `RecordBatch` objects without integer overflow when the value of `num_rows()` is higher than `.Machine$integer.max`. I originally wrote some tests but they take ages to run and crashed on CI anyway so I removed them, but they can be seen in https://github.com/apache/arrow/pull/13482/commits/e7cf8a66beab6d1b7d85304362086b6205a31279/.
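To illustrate why a double avoids the overflow described above: R's integer type is 32-bit, so `.Machine$integer.max` is 2147483647, while a double represents every integer exactly up to 2^53. The sketch below is not the code from this PR; `fits_in_r_integer` and `as_r_double` are hypothetical helper names used only for illustration.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: true when a row count fits in a 32-bit signed integer,
// whose maximum (2147483647) equals R's .Machine$integer.max.
bool fits_in_r_integer(std::int64_t n) {
  return n <= std::numeric_limits<std::int32_t>::max();
}

// Hypothetical helper: convert a 64-bit row count to a double.
// The conversion is exact for magnitudes up to 2^53 (9007199254740992).
double as_r_double(std::int64_t n) { return static_cast<double>(n); }
```

For example, a count of 3,000,000,000 rows does not fit in a 32-bit integer, but it converts to a double without loss.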