HDDS-6960. EC: Implement the Over-replication Handler#3572
HDDS-6960. EC: Implement the Over-replication Handler#3572JacksonYao287 merged 9 commits intoapache:masterfrom
Conversation
umamaheswararao
left a comment
There was a problem hiding this comment.
Thank you @JacksonYao287 for working on this patch. I have few comments to check. Thanks
...src/main/java/org/apache/hadoop/hdds/scm/container/replication/ECOverReplicationHandler.java
Show resolved
Hide resolved
...src/main/java/org/apache/hadoop/hdds/scm/container/replication/ECOverReplicationHandler.java
Show resolved
Hide resolved
...src/main/java/org/apache/hadoop/hdds/scm/container/replication/ECOverReplicationHandler.java
Show resolved
Hide resolved
| try { | ||
| //the command target node should be in-service and healthy | ||
| return nodeManager.getNodeStatus(dd) | ||
| .equals(NodeStatus.inServiceHealthy()); |
There was a problem hiding this comment.
I think we discussed this. This is not covering healthy_readonly?
...src/main/java/org/apache/hadoop/hdds/scm/container/replication/ECOverReplicationHandler.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/apache/hadoop/hdds/scm/container/replication/ECOverReplicationHandler.java
Outdated
Show resolved
Hide resolved
...test/java/org/apache/hadoop/hdds/scm/container/replication/TestECOverReplicationHandler.java
Show resolved
Hide resolved
...test/java/org/apache/hadoop/hdds/scm/container/replication/TestECOverReplicationHandler.java
Show resolved
Hide resolved
| * commands. | ||
| */ | ||
| public class ECUnderReplicationHandler implements UnderReplicationHandler { | ||
| public class ECUnderReplicationHandler implements UnhealthyReplicationHandler { |
There was a problem hiding this comment.
Not sure we will call over replication as unhealthy. Don't have better name in my mind but how about simply ReplicationHandler ?
Since it is common class now, javadoc should represent the generic message. Currently it assumed for underReplication handler.
There was a problem hiding this comment.
ReplicationHandler seems a little confusing, it might be considered as the handler to handle replication command.
so i suggest we use unhealthy just for now and change it to an appropriate name after we find a proper one. what about "nonOptimalReplicationHandler"
|
@umamaheswararao thanks for the review, will fix the comments later |
umamaheswararao
left a comment
There was a problem hiding this comment.
Thanks @JacksonYao287 for updating the patch. Patch almost looks good to me, except a small comment left.
...src/main/java/org/apache/hadoop/hdds/scm/container/replication/ECOverReplicationHandler.java
Outdated
Show resolved
Hide resolved
|
Thanks @umamaheswararao for the review , i have update this patch according to your comments, pleas take a look |
| } | ||
| /** | ||
| * Identify a new set of datanode(s) to replicate/reconstruct the container | ||
| * and form the SCM commands to send it to DN. |
There was a problem hiding this comment.
Missed to update doc?
There was a problem hiding this comment.
sorry for missing , fixed now!
| import java.util.stream.Collectors; | ||
|
|
||
| /** | ||
| * this class holds some common methods that will be shared among |
There was a problem hiding this comment.
Typos: this --> This
Implementation -> implementation
|
thanks @umamaheswararao for the review! i have committed this patch to master branch |
* master: (46 commits) HDDS-6901. Configure HDDS volume reserved as percentage of the volume space. (apache#3532) HDDS-6978. EC: Cleanup RECOVERING container on DN restarts (apache#3585) HDDS-6982. EC: Attempt to cleanup the RECOVERING container when reconstruction failed at coordinator. (apache#3583) HDDS-6968. Addendum: [Multi-Tenant] Fix USER_MISMATCH error even on correct user. (apache#3578) HDDS-6794. EC: Analyze and add putBlock even on non writing node in the case of partial single stripe. (apache#3514) HDDS-6900. Propagate TimeoutException for all SCM HA Ratis calls. (apache#3564) HDDS-6938. handle NPE when removing prefixAcl (apache#3568) HDDS-6960. EC: Implement the Over-replication Handler (apache#3572) HDDS-6979. Remove unused plexus dependency declaration (apache#3579) HDDS-6957. EC: ReplicationManager - priortise under replicated containers (apache#3574) HDDS-6723. Close Rocks objects properly in OzoneManager (apache#3400) HDDS-6942. Ozone Buckets/Objects created via S3 should not allow group access (apache#3553) HDDS-6965. Increase timeout for basic check (apache#3563) HDDS-6969. Add link to compose directory in smoketest README (apache#3567) HDDS-6970. EC: Ensure DatanodeAdminMonitor can handle EC containers during decommission (apache#3573) HDDS-6977. EC: Remove references to ContainerReplicaPendingOps in TestECContainerReplicaCount (apache#3575) HDDS-6217. Cleanup XceiverClientGrpc TODOs, and document how the client works and should be used. (apache#3012) HDDS-6773. Cleanup TestRDBTableStore (apache#3434) - fix checkstyle HDDS-6773. Cleanup TestRDBTableStore (apache#3434) HDDS-6676. KeyValueContainerData#getProtoBufMessage() should set block count (apache#3371) ... Conflicts: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/upgrade/SCMUpgradeFinalizer.java
HDDS-6960. EC: Implement the Over-replication Handler (apache#3572) (cherry picked from commit fac23c9) Change-Id: Ic273820549544731b781b143a810ee45ef0085df
What changes were proposed in this pull request?
Implement the Over-replication Handler
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-6960
How was this patch tested?
unit test