Skip to content

[Relay][VM] Add support for references.#6798

Closed
jroesch wants to merge 8 commits into
apache:mainfrom
jroesch:vm-reference
Closed

[Relay][VM] Add support for references.#6798
jroesch wants to merge 8 commits into
apache:mainfrom
jroesch:vm-reference

Conversation

@jroesch

@jroesch jroesch commented Oct 30, 2020

Copy link
Copy Markdown
Member

Add initial support for references in the VM. cc @altanh @zhiics @icemelon9 @antinucleon @tmoreau89 @mbrookhart.

Comment thread python/tvm/autotvm/record.py

@junrushao junrushao left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise LGTM

Comment thread tests/python/relay/test_vm.py Outdated
scope.ret(relay.RefRead(ref_create))

f = relay.Function([], scope.get())
print(f)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we remove the print in the test?

@wweic wweic left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall lgtm

Comment thread src/runtime/vm/vm.cc Outdated
}
case Opcode::RefCreate: {
auto value = ReadRegister(instr.ref_create.initial_value);
auto ref = ADT(111, {value});

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How we do ensure this tag does not collide with other ADT's tag number?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't actually matter, I just used ADT container since typing guarantees we already know ahead of time the value is a reference we don't need to check or inspect the tag.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe in this case Tuple({values}) is okay which defaults tag to 0?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, probably just use Tuple.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me speculate a case that in a relay program we defined many ADTs (150 ADT types), and VM compiler generates tags for each ADT type, is it possible to collide with 111? I recall it includes hashes in the tag number, so probably unlikely. And Tuple seems perfect fix.

Expr VisitExpr_(const LetNode* op) final {
Var v = op->var;
if (HasLet(v)) {
if (HasLet(v) || op->value.as<RefWriteNode>()) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe this fully fixes DCE due to nesting ref_write deep inside lets that get DCEd

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should stop it from removing RefWrites anywhere and by extension keep all ref operations alive as the reference is now live too.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will give you a test case for this since I made this exact change when debugging and it wasn't enough to fix the problem. (In fact I remember segfaulting...)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, out of the scope of this PR, do we need to do some alias analysis to handle reference for DCE?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe so, for now we should probably require DCE to error when the incoming module contains references. @jroesch and I are planning on writing an alias analysis pass soon.

@mbrookhart mbrookhart left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see some of the debug stuff more formalized.

Looks like we're failing a bunch of unit tests?

I'm not sure I understand the goal of this PR, is there an RFC for references in TVM more generally?

Comment thread src/runtime/vm/vm.cc
Comment on lines +265 to +270
std::string DebugPrint(NDArray array) {
const PackedFunc* fprint = Registry::Get("relay._ndarray_repr");
ICHECK(fprint) << "unable to find printing function for constants";
return (*fprint)(array);
}

@mbrookhart mbrookhart Oct 30, 2020

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we make this part of the ReprPrinter in ndarray.cc? I've been using something very similar in the VM, but I feel like we should make it more general.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree on this

Comment thread tests/python/relay/test_vm.py Outdated

if __name__ == "__main__":
pytest.main([__file__])
import sys

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this line?

Comment thread src/runtime/vm/vm.cc Outdated
}
case Opcode::RefCreate: {
auto value = ReadRegister(instr.ref_create.initial_value);
auto ref = ADT(111, {value});

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, probably just use Tuple.

@altanh

altanh commented Dec 7, 2020

Copy link
Copy Markdown
Contributor

I've addressed the ADT tag issue as suggested (using Tuple), and left a TODO comment in DCE.

For the record, I tried adding Feature set checking in DCE to error on detecting RefWrite. However, the VM compilation process requires DCE in other places (I think most critically in InlinePrimitives), so I couldn't get the test to pass without removing feature checking. I think this raises the criticality of fixing DCE slightly, and we are working on it.

re @mbrookhart , I'm not totally sure what you meant by the debug stuff, but references have been in Relay since higher-order AD was introduced. I'm not sure if there is/was an RFC for adding references, although I agree we should probably make one especially as we are working towards supporting stateful stuff going forward.

cc @jroesch who might have more thoughts

@altanh

altanh commented Dec 8, 2020

Copy link
Copy Markdown
Contributor

I think the failing unit tests are actually unsound and rely on DCE for refs, or perhaps they are sound but correctness is definitely not guaranteed by what we currently have. cc @MarisaKirisame, how should we proceed? We might have to block this PR until DCE gets fixed once and for all, or disable all the offending tests.

@MarisaKirisame

Copy link
Copy Markdown
Contributor

Maybe we should disable the offending test rn and push on the DCE fixes?

case Opcode::DeviceCopy:
case Opcode::RefCreate:
case Opcode::RefRead:
case Opcode::RefWrite:

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RefWrite doesn't have last register, right? Maybe set the last_register_ to -1?

@icemelon

Copy link
Copy Markdown
Member

What's the status on this PR now? @jroesch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants