Skip to content

Return deterministic actions #5597

Merged
cmard merged 16 commits into
deterministic-actions-python-trainingfrom
develop-staging-determinstic-action
Nov 15, 2021
Merged

Return deterministic actions #5597
cmard merged 16 commits into
deterministic-actions-python-trainingfrom
develop-staging-determinstic-action

Conversation

@cmard

@cmard cmard commented Oct 27, 2021

Copy link
Copy Markdown
Contributor

Proposed change(s)

This PR will add an ability to retrieve actions deterministically based on the input to the model. A new run-options configuration has been added as well as a new CLI flag --deterministic.

Types of change(s)

  • Bug fix
  • New feature
  • Code refactor
  • Breaking change
  • Documentation update
  • Other (please describe)

Checklist

  • Added tests that prove my fix is effective or that my feature works
  • Updated the changelog (if applicable)
  • Updated the documentation (if applicable)
  • Updated the migration guide (if applicable)

Other comments

@cmard cmard changed the base branch from main to deterministic-actions-python-training October 28, 2021 15:58

@miguelalonsojr miguelalonsojr left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see CR feedback.

Comment thread com.unity.ml-agents/CHANGELOG.md Outdated
Comment thread com.unity.ml-agents/CHANGELOG.md Outdated
Comment thread docs/Training-Configuration-File.md Outdated
# During training, clipping is done in TorchPolicy, but we need to clip before ONNX
# export as well.
self._clip_action_on_export = not tanh_squash
self.deterministic = deterministic

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless it's going to be used outside of the ActionModel class, refactor to make it
self._deterministic.

cmard and others added 4 commits October 29, 2021 13:10
Co-authored-by: Miguel Alonso Jr. <76960110+miguelalonsojr@users.noreply.github.com>
Co-authored-by: Miguel Alonso Jr. <76960110+miguelalonsojr@users.noreply.github.com>
Co-authored-by: Miguel Alonso Jr. <76960110+miguelalonsojr@users.noreply.github.com>

@maryamhonari maryamhonari left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great! Some nit changes on docs and I believe the setting test for yaml is not set right and I wonder how it's passing CI.

Comment thread com.unity.ml-agents/CHANGELOG.md Outdated
Comment thread ml-agents/mlagents/trainers/settings.py Outdated
Comment thread ml-agents/mlagents/trainers/cli_utils.py Outdated
action_spec: ActionSpec,
conditional_sigma: bool = False,
tanh_squash: bool = False,
deterministic: bool = False,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please update the docstring

test1_settings = run_options.behaviors["test1"]
assert test1_settings.max_steps == 2
assert test1_settings.network_settings.hidden_units == 2000
assert not test1_settings.network_settings.deterministic

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we use == True just for readability

agent_action1 = action_model._sample_action(dists)
agent_action2 = action_model._sample_action(dists)
agent_action3 = action_model._sample_action(dists)
assert torch.equal(agent_action1.continuous_tensor, agent_action2.continuous_tensor)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some tests on discrete actions would be great!

test1_settings = run_options.behaviors["test1"]
assert test1_settings.max_steps == 2
assert test1_settings.network_settings.hidden_units == 2000
assert not test1_settings.network_settings.deterministic

@maryamhonari maryamhonari Oct 29, 2021

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC this test is wrong. Above we set deterministic: true in yaml, so it should be assert test1_settings.network_settings.deterministic == True, right
?

cmard and others added 5 commits November 1, 2021 09:33
Co-authored-by: Maryam Honari <honari.m94@gmail.com>
Co-authored-by: Maryam Honari <honari.m94@gmail.com>
Co-authored-by: Maryam Honari <honari.m94@gmail.com>
…y-Technologies/ml-agents into develop-staging-determinstic-action
Comment thread ml-agents/mlagents/trainers/torch/distributions.py Outdated
cmard and others added 2 commits November 3, 2021 13:53
@cmard cmard merged commit 98da4b1 into deterministic-actions-python-training Nov 15, 2021
@delete-merged-branch delete-merged-branch Bot deleted the develop-staging-determinstic-action branch November 15, 2021 22:43
@maryamhonari maryamhonari restored the develop-staging-determinstic-action branch November 15, 2021 22:55
@github-actions github-actions Bot locked as resolved and limited conversation to collaborators Nov 16, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants