WAF source file timestamp changes without any real content change cause other issues such as #4425, but they also make the dataset lose its harvest object on the UI, and they are potentially the biggest contributor to the db-solr-sync workload.
How to reproduce
Modify the timestamp of an XML file on a WAF source, then reharvest.
Expected behavior
No change to the dataset. The UI stays the same, and no additional workload is added to db-solr-sync.
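To illustrate the expected behavior, here is a minimal sketch of how a timestamp-only change could be told apart from a real content change by comparing a content hash. This is not the harvester's actual code: `content_hash` and `has_real_change` are hypothetical helper names, and a local temporary file stands in for a document on the WAF.

```python
import hashlib
import os
import tempfile


def content_hash(path):
    """Return the SHA-256 hex digest of the file's bytes (timestamps are ignored)."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def has_real_change(path, previous_hash):
    """Hypothetical check: True only if the file's bytes differ from the last harvest."""
    return content_hash(path) != previous_hash


# Stand-in for an XML document on a WAF source (assumption: a local temp file).
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as tmp:
    tmp.write("<metadata><title>demo</title></metadata>")
    demo_path = tmp.name

before_hash = content_hash(demo_path)
before_mtime = os.stat(demo_path).st_mtime

# Change only the timestamp, as in the reproduction step.
os.utime(demo_path, (before_mtime + 3600, before_mtime + 3600))

timestamp_changed = os.stat(demo_path).st_mtime != before_mtime
content_changed = has_real_change(demo_path, before_hash)

os.remove(demo_path)
```

With a check like this, a reharvest that sees only a new timestamp would leave the dataset, its harvest object, and the Solr index untouched.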
Actual behavior
See the error in the fetch log:
Document with GUID ### unchanged, skipping...
On the UI, the dataset lost its harvest source metadata info.
Sketch
[Notes or a checklist reflecting our understanding of the selected approach]