HDDS-8773. [S3G] Improve list performance in FSO bucket #4868
ChenSammi merged 1 commit into apache:master
@tanvipenumudy please take a look
@adoroszlai I added a test; please help trigger CI. Thanks! (CI passed in my branch: https://github.com/whbing/ozone/actions/runs/5233785220)
Hi @kerneltime, could you please help review this patch?
@duongkame please review
zhtttylz left a comment
Great catch and patch here. Just leaving some nit comments inline, JFYI.
Hi @whbing, it looks like the build has some issues; could you take a look?
Thanks @captainzmc for merging #5003, on which this PR depends. This PR is now ready, and I have successfully tested some scenarios in my environment. @captainzmc @ChenSammi @adoroszlai If you have time, I would appreciate your help reviewing this PR.
@captainzmc @duongkame @kerneltime @tanvipenumudy please review
Thanks, and I look forward to the review. The PR is working well in our cluster. I just fine-tuned it and added comments in the new commit above.
Thank you @whbing for this very useful contribution. I will get this review done this week. cc @tanvipenumudy @duongkame
tanvipenumudy left a comment
Thank you @whbing for this important change; please find a comment.
What changes were proposed in this pull request?
Option --delimiter '/' in ListObjectsV2 is a commonly used option. Common scenarios are as follows:
aws s3 --endpoint http://<ip>:9878 ls buk/dir/
aws s3api --endpoint http://<ip>:9878 list-objects-v2 --bucket buk1-ln --prefix '' --delimiter '/'
Only the immediate children of the prefix need to be listed in the above scenarios.
In the current FSO bucket implementation, objects are listed with a depth-first search over the whole subtree and only then filtered by the delimiter, which greatly reduces performance.
After the optimization, the listing time dropped from tens of seconds to 3 seconds in my test environment.
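To illustrate why listing only the immediate children helps, here is a minimal, self-contained sketch. It does not use the Ozone code base: `Node`, `listDeepThenFilter`, `listShallow`, `sampleTree`, and the `visited` counter are all hypothetical names made up for this example, and the in-memory tree only stands in for an FSO bucket's directory structure. It contrasts "walk everything depth-first, then collapse by delimiter" with "read one directory level".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

public class DelimiterListingSketch {

  /** Hypothetical in-memory node (not an Ozone class): a directory or a file. */
  static final class Node {
    final boolean isDir;
    final TreeMap<String, Node> children = new TreeMap<>();

    Node(boolean isDir) {
      this.isDir = isDir;
    }

    Node dir(String name) {
      return children.computeIfAbsent(name, n -> new Node(true));
    }

    Node file(String name) {
      children.put(name, new Node(false));
      return this;
    }
  }

  /** Number of nodes touched by the most recent listing call. */
  static int visited;

  /** Deep approach: walk the whole subtree, then collapse keys at the first '/'. */
  static List<String> listDeepThenFilter(Node dir, String prefix) {
    visited = 0;
    List<String> allKeys = new ArrayList<>();
    collect(dir, prefix, allKeys);
    TreeSet<String> result = new TreeSet<>();
    for (String key : allKeys) {
      String rest = key.substring(prefix.length());
      int idx = rest.indexOf('/');
      // Keys below a subdirectory collapse into one "common prefix" entry.
      result.add(idx < 0 ? key : prefix + rest.substring(0, idx + 1));
    }
    return new ArrayList<>(result);
  }

  private static void collect(Node dir, String path, List<String> out) {
    for (Map.Entry<String, Node> e : dir.children.entrySet()) {
      visited++;
      if (e.getValue().isDir) {
        String dirKey = path + e.getKey() + "/";
        out.add(dirKey);
        collect(e.getValue(), dirKey, out); // descends into every subdirectory
      } else {
        out.add(path + e.getKey());
      }
    }
  }

  /** Shallow approach: read only the immediate children of the directory. */
  static List<String> listShallow(Node dir, String prefix) {
    visited = 0;
    TreeSet<String> result = new TreeSet<>();
    for (Map.Entry<String, Node> e : dir.children.entrySet()) {
      visited++;
      result.add(prefix + e.getKey() + (e.getValue().isDir ? "/" : ""));
    }
    return new ArrayList<>(result);
  }

  /** Small made-up tree: dir1/{f1,f2,sub/f3}, dir2/ (empty), key0. */
  static Node sampleTree() {
    Node root = new Node(true);
    Node dir1 = root.dir("dir1");
    dir1.file("f1").file("f2");
    dir1.dir("sub").file("f3");
    root.dir("dir2");
    root.file("key0");
    return root;
  }

  public static void main(String[] args) {
    Node root = sampleTree();
    System.out.println(listDeepThenFilter(root, "") + " visited=" + visited);
    System.out.println(listShallow(root, "") + " visited=" + visited);
  }
}
```

Both methods return the same entries for the sample tree, but the deep variant touches every node under the prefix while the shallow one touches only the direct children. With directories holding hundreds of thousands of files, as in the test data below, that gap is where the time goes.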
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-8773
How was this patch tested?
Info: s3v/buk1-ln is a linked FSO bucket. Multiple calls to the iterator were simulated by reducing the parameter ozone.client.list.cache.
$ hadoop fs -count -v ofs://om/s3v/buk1-ln/*
DIR_COUNT  FILE_COUNT  CONTENT_SIZE  PATHNAME
       14        1703          2177  ofs://om/s3v/buk1-ln/test
        7           1           238  ofs://om/s3v/buk1-ln/test0
       93      373113        373113  ofs://om/s3v/buk1-ln/test1
       31      227379        227379  ofs://om/s3v/buk1-ln/test2
before optimization:
after optimization:
Detailed data are shown in the following table: