Skip to content

[BYOC] Exclude external params from Graph Runtime#7564

Closed
d-smirnov wants to merge 1 commit into
apache:mainfrom
d-smirnov:byoc-graph-mem-duplication
Closed

[BYOC] Exclude external params from Graph Runtime#7564
d-smirnov wants to merge 1 commit into
apache:mainfrom
d-smirnov:byoc-graph-mem-duplication

Conversation

@d-smirnov

Copy link
Copy Markdown
Contributor

This patch exculdes external params from created Graph Runtime module
to prevent duplication as the external params will be also serialized
from Metadata module.

This patch exculdes external params from created Graph Runtime module
to prevent duplication as the external params will be also serialized
from Metadata module.

Change-Id: I47eda7e53b85f0b840445b60d33c2939b1cf99a0
@comaniac

comaniac commented Mar 2, 2021

Copy link
Copy Markdown
Contributor

It seems to me that the parameter name prefix doesn't indicate if it is in the external module or metadata module? Could you show some use cases?

cc @zhiics

@d-smirnov

Copy link
Copy Markdown
Contributor Author

I understood that the agreement is p[0-9]+ graph_runtime_codegen.cc:319 for internal params and <codegen_name><something_with_sequence_value> (e.g. utils.h, default implemtation uses _const_ suffix though) for param names generated from annotated functions.

@comaniac

comaniac commented Mar 2, 2021

Copy link
Copy Markdown
Contributor

ConstantUpdater can be customized as introduced in this #6697, so the parameter naming convention is not guaranteed (cc @zhiics, does codegen_name and symbol already the same?) Even it's the agreement, I still have the following two concerns:

  1. The naming convention can be changed anytime, and seems like we have no way to know that we need to change this part accordingly. Specifically, we need a test case if we really need to use the parameter name to decide whether to put the constants to metadata module.

  2. If a customized ConstantUpdater doesn't assign "null", then the intention is still maintaining the constants in the metadata module.

In summary, a safer solution is to have a way to check whether the constant is alaready stored in the extenral module when building a module, but it's not easy and that's part of the reasons that we haven't fixed this issue until now.

@d-smirnov

Copy link
Copy Markdown
Contributor Author

Agree, However it still looks to me as the agreement to put codegen_name as prefix for const naming. Bth, would it help if for filtering out external consts regexp ^p[0-9]+$ were used? I was cautious of the usage of encoded literal here

@d-smirnov

Copy link
Copy Markdown
Contributor Author

@comaniac Any suggestions then? My first attempt was to explicitly split params from internal (p prefixed) and external modules in different sets. However the subsequent modifications were quite invasive. Basically my understanding of the issue is graph runtime module should not need external constatns and metadata module should not contain internal constants.

@comaniac

comaniac commented Mar 2, 2021

Copy link
Copy Markdown
Contributor

Agree, However it still looks to me as the agreement to put codegen_name as prefix for const naming. Bth, would it help if for filtering out external consts regexp ^p[0-9]+$ were used? I was cautious of the usage of encoded literal here

I personally don't think it's a good idea to make decisions depending on variable names. The reason was mentioned in my previous comment. The variable names are supposed to be referred by users for debugging or development only.

@comaniac Any suggestions then? My first attempt was to explicitly split params from internal (p prefixed) and external modules in different sets. However the subsequent modifications were quite invasive. Basically my understanding of the issue is graph runtime module should not need external constatns and metadata module should not contain internal constants.

I don't have a good idea at this moment. It might be a way to extend the solution from @mbaret that assigns "null" pointers to the external module handled parameters to be a more general solution, but we need to be careful. Maybe we should start with an RFC for discussion.

@d-smirnov

d-smirnov commented Mar 2, 2021

Copy link
Copy Markdown
Contributor Author

I noticed that there is a check implemented for demanding codegen_name to be placed first (utils.h) in case of external codegen uses its own implementation of ConstUpdater. Looks to me now that I am on a safe side with current patch

@comaniac

comaniac commented Mar 2, 2021

Copy link
Copy Markdown
Contributor

I noticed that there is a check implemented for demanding codegen_name to be placed first (utils.h) in case of external codegen uses its own implementation of ConstUpdater. Looks to me now that I am on a safe side with current patch

Yeha that's why I asked the question about if codegen_name and symbol always the same, but it doesn't solve the problem that even a parameter is named with the external codegen prefix, it's still possible not maintained by the external module.

btw, if the purpose of this PR is going to solve this issue (https://discuss.tvm.apache.org/t/model-graph-size-doubling-after-partitioning-for-arm-compute-library/9131), I think it would be much easier to add an ACL constant updater like what Ethos-N does.

@d-smirnov

d-smirnov commented Mar 2, 2021

Copy link
Copy Markdown
Contributor Author

I doubt that what was implemented in Ethos-N is applicable to ACL (ACL relies on JSON codegen and Ethos-N uses C codegen). Could you elaborate your proposal?

@comaniac

comaniac commented Mar 2, 2021

Copy link
Copy Markdown
Contributor

I doubt that what was implemented in Ethos-N is applicable to ACL (ACL relies on JSON codegen and Ethos-N uses C). Could you elaborate you proposal?

I think this is not related to what codegen you are using? Based on the functionality of this PR, what you need is getting rid of some constatnts in the metadata module. Then it should be fine to put this logic to the ACL specific constant updater. @mbaret could you comment whether this proposal would work?

@d-smirnov

d-smirnov commented Mar 2, 2021

Copy link
Copy Markdown
Contributor Author

The constants removed by this PR are actually from graph runtime module. The metadata constants are kept intact.

The usage of custom ConstUpdater (in its current implementation) seems unrealistic to me due to the updater is called before the metadata and graph runtime modules are created. Metadata module gets all the local consts and the consts fetched via ConstUpdater as one map. Then this map of all constants is copied to graph runtime module

@trevor-m

trevor-m commented Mar 2, 2021

Copy link
Copy Markdown
Contributor

Thanks @d-smirnov for this fix! I was also trying to find ways to fix the duplicate weights bug but couldn't find any as clean as yours.

To address @comaniac's concern about using the names, could we use some sort of pass like free_vars to prune params and remove any params that aren't used in the module that is being compiled?

Also, this problem also affects relay vm. Can we do the same thing for VM?

@zhiics

zhiics commented Mar 3, 2021

Copy link
Copy Markdown
Member

I think this is just a work around. One way I can think of is that we can make graph runtime similar to some other runtimes. Therefore, it is able to query the needed parameters. I had a local draft commit to do this some time ago but I could not find it now.

@mbaret

mbaret commented Mar 9, 2021

Copy link
Copy Markdown
Contributor

I doubt that what was implemented in Ethos-N is applicable to ACL (ACL relies on JSON codegen and Ethos-N uses C). Could you elaborate you proposal?

I think this is not related to what codegen you are using? Based on the functionality of this PR, what you need is getting rid of some constatnts in the metadata module. Then it should be fine to put this logic to the ACL specific constant updater. @mbaret could you comment whether this proposal would work?

Apologies @comaniac that I missed this earlier. We did explore using the same fix as for Ethos-N (giving a 'null' ConstantUpdater), but this doesn't work for ACL as ACL does load the constants from MetadataModule. Ethos-N just directly serializes the constants into the binary module so it doesn't care if they're missing in the MetadataModule.

This problem seems quite fundamental to me and shouldn't just affect ACL, but any BYOC target that tries to make use of MetadataModule.

I think this is just a work around. One way I can think of is that we can make graph runtime similar to some other runtimes. Therefore, it is able to query the needed parameters. I had a local draft commit to do this some time ago but I could not find it now.

Given this is a user-facing issue (https://discuss.tvm.apache.org/t/model-graph-size-doubling-after-partitioning-for-arm-compute-library/9131) and we're very interested in getting a fix in soon, could you elaborate on your alternative proposal @zhiics?

@mbaret

mbaret commented Mar 22, 2021

Copy link
Copy Markdown
Contributor

Ping @comaniac @zhiics, what can we do to make progress here?

@zhiics

zhiics commented Mar 23, 2021

Copy link
Copy Markdown
Member

The approach I was thinking is that we can probably use the MetadataModule as the only place to save the weights. Other modules including GraphRuntimeModule can directly query the needed parameters from it (external modules have done that).

The way we filter the redundant parameters in the this PR can be used by users after the build (i.e. we can document it somewhere) but should not be part of the build codebase in my opinion.

@comaniac

Copy link
Copy Markdown
Contributor

We did explore using the same fix as for Ethos-N (giving a 'null' ConstantUpdater), but this doesn't work for ACL as ACL does load the constants from MetadataModule. Ethos-N just directly serializes the constants into the binary module so it doesn't care if they're missing in the MetadataModule.

Could we improve ACL's customized runtime so that it could know if the constant should load from MetadataModule or itself? To me, Ethos-N is just an extreme case that completely gets rid of constants in MetadataModule, but constant updater should be capable of supporting all three cases: all in MetadataModule, some in MetadataModule, and none in MetadataModule.

@d-smirnov

Copy link
Copy Markdown
Contributor Author

Closed as superseded with #7977

@d-smirnov d-smirnov closed this May 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants