Skip to content

chore: copy output integer vector#841

Merged
krlmlr merged 3 commits intomainfrom
vector-type-conversion
Jun 15, 2023
Merged

chore: copy output integer vector#841
krlmlr merged 3 commits intomainfrom
vector-type-conversion

Conversation

@Antonov548
Copy link
Copy Markdown
Contributor

For 0.10 integer type will be int64 and on R side such vectors should be copied with conversion to double type

for: #840

@Antonov548 Antonov548 requested review from krlmlr and szhorvat June 14, 2023 14:02
@krlmlr
Copy link
Copy Markdown
Contributor

krlmlr commented Jun 15, 2023

Thanks. Do we need to free this memory? Can you please run with sanitizers to double-check?

@Antonov548
Copy link
Copy Markdown
Contributor Author

Thanks. Do we need to free this memory? Can you please run with sanitizers to double-check?

Ah, yes. Right, we need free. Mixing different languages during the day makes me forgetting about this.
I will fix it now.

@szhorvat
Copy link
Copy Markdown
Member

Yes, we'll need igraph_vector_int_destroy() for all of these. We should also take the opportunity to start cleaning up the error handling.

There are two kinds of functions, which are not currently distinguished by their naming, but they probably should:

  1. Functions that are called directly from R using .Call. Let's call these top-level functions.
  2. Functions which are called from C code, and typically return an igraph error code. Let's call these internal functions.

An example of a top-level function that this PR touches is R_igraph_count_isomorphisms_vf2().

An example of an internal function is R_SEXP_to_vector_int_copy().

It would be good to eventually start distinguishing them by using different naming, as they need to handle errors differently.

Inside of internal functions, always wrap any igraph function that returns an error code in IGRAPH_CHECK(). Also use the "finally" mechanism when there's more than one allocation happening.

int R_SEXP_to_vector_int_copy(SEXP sv, igraph_vector_int_t *v) {
  long int i, n=GET_LENGTH(sv);
  int *svv=INTEGER(sv);
  IGRAPH_CHECK(igraph_vector_int_init(v, n)); // this needed an  IGRAPH_CHECK
  for (i = 0; i<n; i++) {
    VECTOR(*v)[i] = svv[i];
  }
  return IGRAPH_SUCCESS; // use IGRAPH_SUCCESS instead of 0 for clarity
}

Question to @krlmlr and @Antonov548 regarding int *svv=INTEGER(sv), as I'm not familiar with the R API. What exactly does this line do? Does it expect sv to be of type integer on the R side, or does it convert to integer automatically? As I understand now, it expects an integer type, hence the as.integer() call on the R side.

At this point, the R code should be passing a double vector here, not an integer one, so we can make use of the full 64-bit range once transitioning to 0.10. Since we are copying anyway, instead of mapping memory, there is no reason to convert to integers in R. The conversion is now done in the C code instead.

@szhorvat
Copy link
Copy Markdown
Member

szhorvat commented Jun 15, 2023

Now regarding top-level functions, I'll have to think a little bit about the correct approach here: Is it safe to just wrap internal functions in IGRAPH_R_CHECK even if the "finally" mechanism is used, or do we need to convert complex top-level functions to internal ones first, and then wrap that internal function in a single IGRAPH_R_CHECK? Let's leave this for a bit later.

@Antonov548
Copy link
Copy Markdown
Contributor Author

Question to @krlmlr and @Antonov548 regarding int *svv=INTEGER(sv), as I'm not familiar with the R API. What exactly does this line do? Does it expect sv to be of type integer on the R side, or does it convert to integer automatically? As I understand now, it expects an integer type, hence the as.integer() call on the R side.

Yes, it expects as.integer as input. So, in 0.10 we will need to have conversion of types in this function.

I'm agree that we need to have separate top-level function and internal functions to have more proper error handling. Maybe even split them do different files.

I suggest to not make refactoring in this pull request if you (@szhorvat) don't mind. Because we could go really deep and stuck with refactoring of error handling. Maybe it make sense to do this after 0.10.

@Antonov548
Copy link
Copy Markdown
Contributor Author

Tested with sanitizer. No errors 👍

@szhorvat
Copy link
Copy Markdown
Member

I suggest to not make refactoring in this pull request

Agreed. My point is that when you write new code, make sure it already handles errors properly, so it wouldn't have to be refactored again.

@krlmlr krlmlr merged commit 20e2df0 into main Jun 15, 2023
@krlmlr krlmlr deleted the vector-type-conversion branch June 15, 2023 09:50
@krlmlr
Copy link
Copy Markdown
Contributor

krlmlr commented Jun 15, 2023

Thanks. Do we now need to do the same for numeric vectors?

@Antonov548
Copy link
Copy Markdown
Contributor Author

Thanks. Do we now need to do the same for numeric vectors?

I see that types-RC.yml was merged automatically and not correct. Please check also this one: #846

As I understand from @szhorvat message, that we can keep same behavior for numeric to save performance for numeric vectors.

@krlmlr
Copy link
Copy Markdown
Contributor

krlmlr commented Jun 15, 2023

We will need to copy "from" and "to" vectors, but we can memory-map, e.g., weights and other numeric vectors.

krlmlr added a commit that referenced this pull request Jun 15, 2023
This reverts commit 20e2df0, reversing
changes made to 4a0253d.
krlmlr added a commit that referenced this pull request Jun 15, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jun 15, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants