Improve performance of include? by 5-10x #47
Merged
knu merged 1 commit into ruby:master on Sep 23, 2023
Contributor Author
I opened #48 to address the failing truffleruby test.
Rails uses IPAddr#include? to decide which address to treat as the
client's remote IP, filtering candidate IPs against a trusted list
of internal IPs. In a _very_ minimal app, #include? was showing up in
a profile as ~1% of request time.
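For context, the kind of call pattern described above can be sketched as follows. This is not Rails's actual code; the `TRUSTED` list and the `first_untrusted` helper are hypothetical and exist only to show how often #include? runs per request.

```ruby
require "ipaddr"

# Hypothetical trusted list of internal networks (an assumption for this sketch).
TRUSTED = %w[127.0.0.0/8 10.0.0.0/8 192.168.0.0/16].map { |cidr| IPAddr.new(cidr) }.freeze

# Returns the first candidate that is not covered by any trusted network.
# Every candidate is checked with IPAddr#include? against every trusted
# entry, so the cost of #include? is multiplied on each request.
def first_untrusted(candidates)
  candidates.find do |ip|
    addr = IPAddr.new(ip)
    TRUSTED.none? { |net| net.include?(addr) }
  end
end
```

Calling `first_untrusted(["10.0.0.5", "203.0.113.9"])` would skip the internal address and return the public one.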
The issue is that #include? converted both itself and the value passed
in to ranges of IPAddr. In the worst case (where other is a non-IPAddr,
such as a String), this created 5 IPAddr instances: one to convert other
to an IPAddr, and two for each of the two range conversions. However,
wrapping the begin and end values as IPAddr is not needed because they
are necessarily fixed addresses already.
This patch extracts the logic for computing begin_addr and end_addr
from the #to_range method so that #include? can use them without
instantiating so many IPAddr objects.
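To illustrate the idea, here is a minimal sketch that uses only IPAddr's public API (`to_i`, `prefix`, `ipv4?`, `family`). The `RangeSketch` module and its method names are hypothetical and are not the code in this patch. The sketch only shows that containment can be decided by comparing integer bounds, allocating at most one IPAddr (when `other` is a String).

```ruby
require "ipaddr"

# Hypothetical helper module, written for illustration only.
module RangeSketch
  module_function

  # Bit mask covering the host part of the address (32 or 128 bits wide).
  def host_mask(ip)
    bits = ip.ipv4? ? 32 : 128
    (1 << (bits - ip.prefix)) - 1
  end

  # Lowest address in the network, as an Integer.
  def begin_addr(ip)
    ip.to_i & ~host_mask(ip)
  end

  # Highest address in the network, as an Integer.
  def end_addr(ip)
    ip.to_i | host_mask(ip)
  end

  # True when every address of `other` lies inside `net`.
  # Only Strings and IPAddr instances are handled in this sketch.
  def fast_include?(net, other)
    other = IPAddr.new(other) unless other.is_a?(IPAddr)
    return false unless net.family == other.family

    begin_addr(net) <= begin_addr(other) && end_addr(other) <= end_addr(net)
  end
end
```

Because the bounds are plain Integers, the comparison itself creates no intermediate IPAddr or Range objects.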
Benchmark:
```ruby
require "ipaddr"
require "benchmark/ips" # from the benchmark-ips gem

net1 = IPAddr.new("192.168.2.0/24")
net2 = IPAddr.new("192.168.2.100")
net3 = IPAddr.new("192.168.3.0")
net4 = IPAddr.new("192.168.2.0/16")
Benchmark.ips do |x|
  x.report("/24 includes address") { net1.include? net2 }
  x.report("/24 not includes address") { net1.include? net3 }
  x.report("/16 includes /24") { net4.include? net1 }
  x.report("/24 not includes /16") { net1.include? net4 }
  x.compare!
end
```
Before:
```
Comparison:
/24 not includes /16: 175041.3 i/s
/24 not includes address: 164933.2 i/s - 1.06x (± 0.00) slower
/16 includes /24: 163881.9 i/s - 1.07x (± 0.00) slower
/24 includes address: 163558.4 i/s - 1.07x (± 0.00) slower
```
After:
```
Comparison:
/24 not includes /16: 2588364.9 i/s
/24 not includes address: 1474650.7 i/s - 1.76x (± 0.00) slower
/16 includes /24: 1461351.0 i/s - 1.77x (± 0.00) slower
/24 includes address: 1425463.5 i/s - 1.82x (± 0.00) slower
```
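As a quick cross-check, the per-case speedup can be computed from the two runs above. The constant names below are ad hoc, and the figures are copied from the Before and After output.

```ruby
# Iterations per second from the "Before" run above.
BEFORE = {
  "/24 not includes /16"     => 175_041.3,
  "/24 not includes address" => 164_933.2,
  "/16 includes /24"         => 163_881.9,
  "/24 includes address"     => 163_558.4
}.freeze

# Iterations per second from the "After" run above.
AFTER = {
  "/24 not includes /16"     => 2_588_364.9,
  "/24 not includes address" => 1_474_650.7,
  "/16 includes /24"         => 1_461_351.0,
  "/24 includes address"     => 1_425_463.5
}.freeze

# Ratio of after to before for each benchmark label, rounded to one decimal.
SPEEDUPS = BEFORE.map { |label, ips| [label, (AFTER.fetch(label) / ips).round(1)] }.to_h

SPEEDUPS.each { |label, factor| puts format("%-26s %.1fx", label, factor) }
```

On these numbers, three of the cases come out at roughly 9x and the "/24 not includes /16" case at roughly 15x.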
6669b86 to b8d0323
I've seen similar in my own profiling, it would be great to see something like this merged!
If there's anything I can do to further this change, please let me know. It'd be awesome to have it land 🙂
Contributor Author
hey @knu, do you have any time for a review? 🙏
knu (Member) approved these changes on Sep 23, 2023 and left a comment:
This looks optimal to me, factoring out and reusing what already exists. 👍
Contributor Author
Thank you!