ARROW-4651: [Flight] Use URIs instead of host/port pair #4047
|
Does the change in format/ need to be voted on the mailing list? |
|
Hmm, I suppose we should, even if Flight is unstable for now. |
format/Flight.proto
Outdated
I think it might be better to keep the Location message and replace its contents with a single URI field. This makes adding fields easier in the future if needed. Also, documenting the supported protocols might be useful.
I think it should be. Let's define the URIs that we support and make sure we get consensus there. |
|
Ok, I will put up a proposal on the mailing list. Thanks for the comments! |
|
The protocol change proposal was formally accepted on the Arrow-dev mailing list. Now this PR needs to be rebased (and conflicts fixed) before it gets reviewed. @lihalite, do you have time for this? Otherwise, I can take it up. |
|
@pitrou sorry about that, I will rebase & clean up things later today (just got back from vacation). |
|
I've now rebased this. |
Codecov Report
@@ Coverage Diff @@
## master #4047 +/- ##
==========================================
- Coverage 88.19% 88.12% -0.08%
==========================================
Files 779 774 -5
Lines 97927 97265 -662
Branches 1251 1251
==========================================
- Hits 86370 85714 -656
+ Misses 11321 11315 -6
Partials 236 236
|
pitrou
left a comment
I see that the C++ server API isn't changed (FlightServerBase::Init takes a simple port number). Is it intentional?
cpp/src/arrow/flight/types.cc
Outdated
Perhaps it can be fixed later, but this won't work for IPv6 numeric addresses, e.g. you need grpc://[::1]:80 and not grpc://::1:80.
uriparser (understandably) doesn't deal with URI construction, so we'd need to recognize IPv6 addresses, or create a separate method for such addresses. Or perhaps just require that the user pass [::1]?
Yes, we can require that for now.
(uriparser seems able to deal with URI construction, but the API looks a bit terrible)
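To illustrate the bracketing rule discussed in this thread, here is a minimal Python sketch. It is not the PR's C++ code and does not use uriparser; `make_location_uri` is a hypothetical helper name chosen for this example. It wraps a bare IPv6 literal in brackets and leaves hostnames, IPv4 addresses, and already-bracketed input untouched.

```python
import ipaddress
from urllib.parse import urlsplit


def make_location_uri(scheme, host, port):
    """Hypothetical helper: build scheme://host:port, bracketing bare IPv6 literals."""
    try:
        is_ipv6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        # Not a bare IP literal: a hostname, or a host the caller already bracketed.
        is_ipv6 = False
    if is_ipv6:
        host = f"[{host}]"
    return f"{scheme}://{host}:{port}"


uri = make_location_uri("grpc", "::1", 80)
parts = urlsplit(uri)
# With brackets, the host and port can be separated unambiguously again.
print(uri, parts.hostname, parts.port)
```

Requiring the caller to pass `[::1]`, as suggested above, corresponds to the `ValueError` branch: the bracketed string is not a bare IP literal, so it is passed through unchanged.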
|
I intended to change the C++ server API. I think I rebased out the change by accident as I was also trying to implement the builder APIs at the same time, but now I'd rather do that in a follow-up PR. Thanks for the review, I'll get to fixing these issues! |
|
While I'm at it, might as well add the builder APIs here, and make TLS-enabled services possible. In C++/Python, I did not go all the way to a Java-style builder - the changes there are much more invasive, and I don't think it's worth it until we fully settle on supporting another transport. (And even then, I think it could be done within the current APIs, or with minimal changes to them.) |
|
Rebased, tests pass. |
pitrou
left a comment
Thanks for updating this. It looks mostly good to me, just some style issues and a couple other details.
python/pyarrow/tests/test_flight.py
Outdated
Ideally we would have a way to let gRPC bind the port and then return it (there is a race condition otherwise). But this needn't be in this PR.
|
Thanks for the feedback! I've updated things. gRPC supports binding to port 0 to get a free port; we just need a way to report the chosen port back to the API user. |
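The port-0 mechanism mentioned here can be sketched with a plain TCP socket in Python. This shows only the operating-system behaviour, not gRPC's or Flight's API; `bind_free_port` is a hypothetical name for this example. The key point is that the socket stays open after binding, so no other process can claim the port between choosing it and using it, which is the race described in the review comment.

```python
import socket


def bind_free_port(host="127.0.0.1"):
    """Hypothetical helper: bind to port 0 and report the port the OS picked."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))           # port 0 asks the OS for any free port
    port = sock.getsockname()[1]   # read back the port that was actually bound
    return sock, port


sock, port = bind_free_port()
print("bound to port", port)
sock.close()
```

A test that picks a port number first and starts the server afterwards cannot give the same guarantee, because the port may be taken in between.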
pitrou
left a comment
+1 from me on the C++ and Python changes.
|
Does this need reviewing on the Java side? |
|
Thanks @pitrou. For the Java side, perhaps @jacques-n could take a look again? |
|
Updated to fix an inadvertent API breakage. |
|
There are actually lots of little things I'm noticing here now that I'm trying to test internally, so please hold off while I fix things over the next couple days...apologies for the trouble. |
|
Ok, this should be ready now; fixed some inadvertent API breakages. |
Why the move away from streams in some of these changes?
Ah, here it's because I wanted the checked exception to be propagated - the stream hides this. And that applies to the change in FlightInfo as well.
java/flight/src/main/java/org/apache/arrow/flight/FlightServer.java
Outdated
java/flight/src/main/java/org/apache/arrow/flight/FlightServer.java
Outdated
Can't we have a default for certChain? Also, does it make sense to require it be a file?
I'll add an overload for InputStream.
We may be able to hard-code some platform-specific paths to try.
Oh wait, I mixed this up. This is the server, so you must provide the certificate chain and key (it's the cert/key the server presents). The client defaults to a platform-specific value already, though we should let the client specify a certificate store to check against.
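The client/server asymmetry described here can be shown by analogy with Python's standard `ssl` module. This is not the Flight or gRPC API, and `make_server_context` and its parameter names are hypothetical. A client context can fall back to the platform's trusted CA store, whereas a server context is only usable once it has been given the certificate chain and private key it will present.

```python
import ssl

# Client side: verifies the server against the platform's default CA store.
client_ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)


def make_server_context(cert_chain_path, private_key_path):
    """Hypothetical helper: a server has no sensible default identity,
    so the certificate chain and key must be supplied explicitly."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain(certfile=cert_chain_path, keyfile=private_key_path)
    return ctx


print(client_ctx.verify_mode == ssl.CERT_REQUIRED)
```

Letting the client specify its own certificate store, as suggested above, would correspond to passing a `cafile` argument to `ssl.create_default_context` in this analogy.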
java/flight/src/main/java/org/apache/arrow/flight/Location.java
Outdated
java/flight/src/main/java/org/apache/arrow/flight/FlightServer.java
Outdated
java/flight/src/main/java/org/apache/arrow/flight/LocationSchemes.java
Outdated
Dumb question: what is grpc versus grpc+tcp?
Not a dumb question :) So far, they're aliases for each other (so the default "grpc" protocol is insecure gRPC over TCP).
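One way to picture the aliasing is a small scheme-normalization table. The sketch below is a hedged Python illustration, not the contents of LocationSchemes.java; `SCHEME_ALIASES` and `canonical_scheme` are names invented for this example, and only the two schemes named in this thread are covered.

```python
from urllib.parse import urlsplit

# Assumption for illustration: "grpc" is treated as shorthand for "grpc+tcp".
SCHEME_ALIASES = {"grpc": "grpc+tcp"}


def canonical_scheme(uri):
    """Hypothetical helper: return the URI's scheme with aliases resolved."""
    scheme = urlsplit(uri).scheme
    return SCHEME_ALIASES.get(scheme, scheme)


# Both spellings resolve to the same transport.
print(canonical_scheme("grpc://localhost:5005"))
print(canonical_scheme("grpc+tcp://localhost:5005"))
```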
pitrou
left a comment
Looks mostly good to me on the C++/Python side. Just a few nits.
|
It looks like this is near merge-readiness. @jacques-n can you review/sign off on the Java changes? |
|
Rebased with master as well. |
|
I think Jacques is traveling right now so it may be a little time before we can get a go-ahead from the Java side. Would my review on the C++/Python side be helpful? |
|
I'd appreciate any feedback! On the Java side, I'm not in a particular rush to get this merged, and I can keep up with master, I'd just like to make sure I get everything in for 0.14. |
|
Cool. BTW, I'm guessing at a somewhat longer release timeline than usual for 0.14, to give various in-flight efforts time to sort themselves out, e.g. toward the end of June / beginning of July (EDIT: pun was not intended but...) |
|
Please make sure to rebase any Java CLs and re-run CI to make sure Javadocs are in place. |
|
@jacques-n I think you are the last approver on this PR |
|
Looks good to me. Thanks for pulling this together @lihalite! |
|
Thanks @lihalite! It might have already been discussed here, but what is the testing strategy for TLS-enabled Flight going to be? Unless I missed it, it doesn't seem that this is tested now. Can we open a JIRA? |
|
It's not currently tested. JIRA: https://jira.apache.org/jira/browse/ARROW-5397. We'll need some way to generate certs/keys. |
|
The easiest thing to do is to store self-signed certs and the corresponding private key in the repo. |
I haven't changed the client/server construction interfaces to really take advantage of this. I would rather follow up with creating builders instead, as proposed, as that would be cleaner than special-casing the logic in the current constructors.
Author: David Li <li.davidm96@gmail.com>
Closes apache#4047 from lihalite/flight-uris and squashes the following commits:
870f6eb <David Li> Add more builder options for Java Flight servers
5c12763 <David Li> Make Python Flight bindings more complete
675acc9 <David Li> Introduce builder for C++ Flight servers
4744196 <David Li> Use builder for Flight server in Java
460442c <David Li> Use URIs for Flight locations