metadata on upload to dataset block#2064

Open
digaobarbosa wants to merge 13 commits into main from dataman-154-metadata-upload

digaobarbosa (Contributor) commented Mar 2, 2026

What does this PR do?

Creates a metadata field for the dataset upload block.

[Screenshot]

And with variables in metadata

[Screenshot]

Type of Change

  • New feature (non-breaking change that adds functionality)

Testing

  • I have tested this change locally
  • I have added/updated tests for this change

Test details:
Uploaded images with metadata using the UI.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code where necessary, particularly in hard-to-understand areas
  • My changes generate no new warnings or errors
  • I have updated the documentation accordingly (if applicable)

Additional Context

digaobarbosa self-assigned this Mar 2, 2026
digaobarbosa changed the title from "first draft for metadata on upload images" to "metadata on upload to dataset block" Mar 2, 2026
codeflash-ai bot (Contributor) commented Mar 2, 2026

⚡️ Codeflash found optimizations for this PR

📄 16% (0.16x) speedup for register_datapoint in inference/core/workflows/core_steps/sinks/roboflow/dataset_upload/v1.py

⏱️ Runtime: 6.24 milliseconds → 5.39 milliseconds (best of 15 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch dataman-154-metadata-upload).

digaobarbosa marked this pull request as draft March 3, 2026 10:56
digaobarbosa marked this pull request as ready for review March 3, 2026 11:08
digaobarbosa and others added 3 commits March 3, 2026 08:22
The optimization replaced `all(k not in prediction for k in ["top", "predicted_classes"])` with a direct boolean expression `("top" not in prediction and "predicted_classes" not in prediction)` in `is_prediction_registration_forbidden`, eliminating generator overhead and reducing that function's profiled time from 70.7% to 24% of its own cost. This hot helper is called on every datapoint registration (1125 calls in the profiler), so removing the generator iteration yields a cumulative ~4.7 ms speedup across all invocations, driving the overall 16% speedup from 6.24 ms to 5.39 ms.

Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
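
For readers following the commit message above, here is a minimal sketch of the two forms it compares. Only the two expressions come from the commit message. The function names `neither_key_present_generator` and `neither_key_present_direct` and the sample inputs are hypothetical and are not taken from `is_prediction_registration_forbidden` itself. The sketch only checks that the two expressions agree; it does not reproduce the reported timings.

```python
from typing import Any, Dict, List, Tuple


def neither_key_present_generator(prediction: Dict[str, Any]) -> bool:
    # Form quoted as the "before" expression: builds a list and a generator
    # on every call, then iterates through all().
    return all(k not in prediction for k in ["top", "predicted_classes"])


def neither_key_present_direct(prediction: Dict[str, Any]) -> bool:
    # Form quoted as the "after" expression: two membership tests joined
    # with a short-circuiting `and`, no intermediate objects.
    return "top" not in prediction and "predicted_classes" not in prediction


# Hypothetical sample predictions covering every key combination.
samples: List[Dict[str, Any]] = [
    {},
    {"other": 1},
    {"top": "cat"},
    {"predicted_classes": ["cat"]},
    {"top": "cat", "predicted_classes": ["cat"]},
]

results: List[Tuple[bool, bool]] = [
    (neither_key_present_generator(s), neither_key_present_direct(s))
    for s in samples
]
```

Both forms return `True` only when neither key is present, so the rewrite should be behavior-preserving for dict-like predictions.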
codeflash-ai bot (Contributor) commented Mar 3, 2026

This PR is now faster! 🚀 @digaobarbosa accepted my optimizations from:

codeflash-ai bot (Contributor) commented Mar 3, 2026

This PR is now faster! 🚀 codeflash-ai[bot] accepted my code suggestion above.

dkosowski87 previously approved these changes Mar 5, 2026
digaobarbosa requested a review from dkosowski87 March 5, 2026 17:51
      @classmethod
      def get_parameters_accepting_batches(cls) -> List[str]:
-         return ["images", "predictions", "image_name"]
+         return ["images", "predictions", "image_name", "metadata"]
Collaborator


From a brief look, I am not sure this would work: here you mark `metadata` as accepting batches, but the `metadata` type annotation in the `run` method describes a scalar dict.

Contributor Author


Will look into this. It's working, but I will dig deeper.
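
To make the batch-versus-scalar question in this thread concrete, here is a minimal sketch of how a block could accept `metadata` either as one dict shared by all images or as one dict per image. This is not the code in this PR and not the Workflows API. The helper `metadata_per_image`, its signature, and the handling of `None` are all assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional, Sequence, Union

Metadata = Dict[str, Any]


def metadata_per_image(
    metadata: Union[None, Metadata, Sequence[Optional[Metadata]]],
    batch_size: int,
) -> List[Metadata]:
    """Hypothetical helper: return exactly one metadata dict per image."""
    if metadata is None:
        # No metadata supplied: every image gets an empty dict.
        return [{} for _ in range(batch_size)]
    if isinstance(metadata, dict):
        # Scalar case: the same metadata applies to every image.
        # Copy so that later mutation of one entry does not affect the others.
        return [dict(metadata) for _ in range(batch_size)]
    # Batch case: one entry per image, where an entry may be None.
    items = list(metadata)
    if len(items) != batch_size:
        raise ValueError(
            f"Expected {batch_size} metadata entries, got {len(items)}"
        )
    return [dict(m) if m is not None else {} for m in items]


# Usage: a scalar dict is broadcast, a batch is passed through element-wise.
broadcast = metadata_per_image({"camera": "a"}, batch_size=2)
elementwise = metadata_per_image([{"camera": "a"}, None], batch_size=2)
```

Whether the block actually needs such normalization depends on how the execution engine delivers parameters listed in `get_parameters_accepting_batches`, which is the open question here.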

3 participants