[REFACTOR][IR] attrs.h follow-up cleanup: drop legacy vtable / rename / phase out AttrFieldInfo#19615
Conversation
There was a problem hiding this comment.
Code Review
This pull request refactors the attributes system in TVM, renaming BaseAttrsNode to AttrsNode, simplifying DictAttrs and Attrs definitions, and removing the add_argument method and argument descriptions from operator registrations. It also updates several passes to use PassConfigWithDefaults. The reviewer feedback strongly cautions against the widespread removal of defined() checks on attributes across various files (including attrs.h, expression constructors, printers, and transform passes). Since DictAttrs can still be null or undefined at runtime, removing these checks introduces a high risk of segmentation faults when accessing attrs->dict, and the reviewer recommends restoring these safety checks.
| static_assert(TNode::_type_final, "Can only operate on the leaf nodes"); | ||
| TNode* node = input.CopyOnWrite(); | ||
| node->attrs = WithAttr(std::move(node->attrs), attr_key, attr_value); | ||
| node->attrs.CopyOnWrite()->dict.Set(attr_key, std::move(attr_value)); |
There was a problem hiding this comment.
If node->attrs is null (which can happen if the function was deserialized from a format where attributes were omitted), calling CopyOnWrite() on it will cause a segmentation fault. We should defensively initialize node->attrs if it is not defined.
if (!node->attrs.defined()) {
node->attrs = DictAttrs();
}
node->attrs.CopyOnWrite()->dict.Set(attr_key, std::move(attr_value));|
|
||
| node->attrs = WithAttrs(std::move(node->attrs), attrs); | ||
|
|
||
| auto* dict_node = node->attrs.CopyOnWrite(); |
There was a problem hiding this comment.
If node->attrs is null, calling CopyOnWrite() on it will cause a segmentation fault. We should defensively initialize node->attrs if it is not defined.
| auto* dict_node = node->attrs.CopyOnWrite(); | |
| if (!node->attrs.defined()) { | |
| node->attrs = DictAttrs(); | |
| } | |
| auto* dict_node = node->attrs.CopyOnWrite(); |
| TNode* node = input.CopyOnWrite(); | ||
| node->attrs = WithoutAttr(std::move(node->attrs), attr_key); | ||
|
|
||
| node->attrs.CopyOnWrite()->dict.erase(attr_key); |
| matches = true; | ||
| for (auto kv : attributes) { | ||
| if (matches && op->attrs.defined() && op->attrs->dict.count(kv.first)) { | ||
| if (matches && op->attrs->dict.count(kv.first)) { |
There was a problem hiding this comment.
If op->attrs is null (which can happen if the function was deserialized without attributes or created via legacy APIs), dereferencing it via op->attrs->dict will cause a segmentation fault. It is safer to keep the defined() check.
| if (matches && op->attrs->dict.count(kv.first)) { | |
| if (matches && op->attrs.defined() && op->attrs->dict.count(kv.first)) { |
| ffi::Array<ExprDoc, void> kwargs_values; | ||
| // mark private if there is no global symbol | ||
| if (!func->attrs.defined() || !func->attrs->dict.count(tvm::attr::kGlobalSymbol)) { | ||
| if (!func->attrs->dict.count(tvm::attr::kGlobalSymbol)) { |
There was a problem hiding this comment.
If func->attrs is null, calling func->attrs->dict will cause a segmentation fault. We should check if func->attrs is defined before accessing its dictionary.
| if (!func->attrs->dict.count(tvm::attr::kGlobalSymbol)) { | |
| if (!func->attrs.defined() || !func->attrs->dict.count(tvm::attr::kGlobalSymbol)) { |
| {tirx::attr::kNoAlias, true}, | ||
| {tirx::attr::kIsGlobalFunc, true}}); | ||
| if (cur_func_->attrs.defined() && cur_func_->attrs->dict.count(tvm::attr::kSTir)) { | ||
| if (cur_func_->attrs->dict.count(tvm::attr::kSTir)) { |
There was a problem hiding this comment.
If cur_func_->attrs is null, calling cur_func_->attrs->dict will cause a segmentation fault. We should check if cur_func_->attrs is defined before accessing its dictionary.
| if (cur_func_->attrs->dict.count(tvm::attr::kSTir)) { | |
| if (cur_func_->attrs.defined() && cur_func_->attrs->dict.count(tvm::attr::kSTir)) { |
| return DictAttrs(); | ||
| } | ||
|
|
||
| ffi::Map<ffi::String, ffi::Any> dict; |
| attrs = DictAttrs(); | ||
| } | ||
|
|
||
| // Set the function type. |
There was a problem hiding this comment.
If attrs is passed as an undefined DictAttrs (which can happen if a caller passes a default-constructed or null DictAttrs), leaving it uninitialized will result in a null attrs field in the Function node, leading to downstream segmentation faults. We should keep the fallback initialization.
| // Set the function type. | |
| if (!attrs.defined()) { | |
| attrs = DictAttrs(); | |
| } | |
| // Set the function type. |
| attrs = DictAttrs(); | ||
| } | ||
|
|
||
| if (!ret_type.defined()) { |
There was a problem hiding this comment.
PR apache#19615 Commit A deleted AttrFieldInfoNode and OpNode::arguments on the rationale that the only consumer (GetArgStructInfo) could be satisfied with "input[i]" indices instead of stored names. On second thought we want the named-argument metadata back -- it's worth the ~340 lines of .add_argument(...) registrations for the clearer error messages and downstream tooling that wants to introspect Op argument schemas. Restore with two adjustments: relocate from include/tvm/ir/attrs.h to include/tvm/ir/op.h (closer to OpNode), and rename to FieldInfoNode (drop the "Attr" prefix that referenced the old Attrs hierarchy this class never actually used). The FFI type-key becomes "ir.FieldInfo" to match the new C++ name.
PR apache#19615 Commit A deleted AttrFieldInfoNode and OpNode::arguments on the rationale that the only consumer (GetArgStructInfo) could be satisfied with "input[i]" indices instead of stored names. On second thought we want the named-argument metadata back -- it's worth the ~340 lines of .add_argument(...) registrations for the clearer error messages and downstream tooling that wants to introspect Op argument schemas. Restore with two adjustments: relocate from include/tvm/ir/attrs.h to include/tvm/ir/op.h (closer to OpNode), and rename to FieldInfoNode (drop the "Attr" prefix that referenced the old Attrs hierarchy this class never actually used). The FFI type-key becomes "ir.FieldInfo" to match the new C++ name.
071e49e to
38af336
Compare
Followup to apache#19615 commit D, addressing Gemini review concerns that the removed defensive `attrs.defined()` checks at 15 call sites could segfault if a DictAttrs ever became null. Rather than re-add per-site checks (which defeats NOTNULLABLE), this change closes the upstream source paths so the type invariant holds end-to-end: - Move ctor and move assignment now leave the moved-from DictAttrs in a defined-but-empty state (allocating a fresh empty backing) rather than the null state default move would yield. This removes the one realistic C++ path that could leak a null DictAttrs. - WithAttr / WithAttrs / WithoutAttr templates gain a cheap belt+suspenders guard that re-initializes node->attrs to DictAttrs() if a caller somehow produced a null one. These three templates are central to the codebase and called from third-party code, so the extra check is worth the cost. - The class-level doxygen now documents the invariant and how it is enforced (default ctor allocates, move members preserve definedness, FFI type traits reject None at deserialization, the ir.IRModule FFI lambda normalizes None to DictAttrs() explicitly). Notes: - The default constructor (`DictAttrs dict;`) already produced an empty-backed instance via `explicit DictAttrs(Map = {})`, so no change was needed there. - The FFI type traits already reject None for non-nullable types (`_type_is_nullable == false` makes CheckAnyStrict return false), so reflection-driven deserializers cannot inject a null DictAttrs. - The IRModule FFI lambda explicitly normalizes a missing/None attrs parameter before forwarding to the C++ constructor; the Function and PrimFunc Python wrappers do the same on the Python side. - The 15 Gemini-flagged access sites are safe under the closed invariant without per-site `defined()` checks.
OpNode::arguments only stored metadata for self-documentation; no
Python tooling, no test, and no C++ caller other than internal
sanity checks read it. Removing it deletes AttrFieldInfo (which
existed solely to type that array) and the ~335 add_argument()
chain calls that populated dead metadata.
The 12 internal consumers that read op->arguments.size() now read
op->num_inputs (always set by the same TVM_REGISTER_OP() chain via
set_num_inputs). Error messages that read op->arguments[i]->name
now report the index ("input[i]") instead of the per-arg name.
…s ctor BaseAttrsNode no longer overrides any virtual method after apache#19607; ffi::Object destroys objects through a type-erased deleter captured at make_object time (see tvm-ffi/include/tvm/ffi/memory.h Deleter_::tptr->T::~T()) so the explicit virtual dtor adds nothing. The DictAttrs(Map) constructor is three lines and warrants header placement now that the file is otherwise unchanged.
AttrsNodeReflAdapter was the historical 'in between' layer that gave BaseAttrsNode its Base prefix. With the adapter removed in canonical name now used in every comment. The FFI registry key "ir.Attrs" is unchanged, so Python sees no difference.
The DictAttrs no-arg constructor already creates an always-defined empty backing (post-apache#19607 inlined ctor), so every existing call site that constructed a DictAttrs already produced a defined object. Switching to TVM_FFI_DEFINE_OBJECT_REF_METHODS_NOTNULLABLE makes that property part of the type contract. This removes the wall of explicit copy/move and operator-> declarations on DictAttrs and lets us drop ~15 defensive attrs.defined() checks that could never fire. Python wrappers that previously passed attrs=None for an absent attrs (Function and Function.create_empty) now construct an empty DictAttrs explicitly.
The DictAttrs-form WithAttr/WithAttrs/WithoutAttr free functions were not TVM_DLL-exported, not bound to Python, and had no external C++ callers - they exist only as one-hop delegations from the TFunc-template wrappers in the same header. Inlining the dict mutation into the templates removes a layer of indirection.
…s and move to transform.h After apache#19607 every consumer of AttrsWithDefaultValues is a pass-config class registered via TVM_REGISTER_PASS_CONFIG_OPTION; none are Attrs. Renaming to PassConfigWithDefaults and relocating next to PassContext makes the helper's domain explicit and shrinks attrs.h further.
PR apache#19615 Commit A deleted AttrFieldInfoNode and OpNode::arguments on the rationale that the only consumer (GetArgStructInfo) could be satisfied with "input[i]" indices instead of stored names. On second thought we want the named-argument metadata back -- it's worth the ~340 lines of .add_argument(...) registrations for the clearer error messages and downstream tooling that wants to introspect Op argument schemas. Restore with two adjustments: relocate from include/tvm/ir/attrs.h to include/tvm/ir/op.h (closer to OpNode), and rename to FieldInfoNode (drop the "Attr" prefix that referenced the old Attrs hierarchy this class never actually used). The FFI type-key becomes "ir.FieldInfo" to match the new C++ name.
Matches the host field name OpNode::arguments. The prior restoration commit used FieldInfo on first pass; this aligns the naming.
Followup to apache#19615 commit D, addressing Gemini review concerns that the removed defensive `attrs.defined()` checks at 15 call sites could segfault if a DictAttrs ever became null. Rather than re-add per-site checks (which defeats NOTNULLABLE), this change closes the upstream source paths so the type invariant holds end-to-end: - Move ctor and move assignment now leave the moved-from DictAttrs in a defined-but-empty state (allocating a fresh empty backing) rather than the null state default move would yield. This removes the one realistic C++ path that could leak a null DictAttrs. - WithAttr / WithAttrs / WithoutAttr templates gain a cheap belt+suspenders guard that re-initializes node->attrs to DictAttrs() if a caller somehow produced a null one. These three templates are central to the codebase and called from third-party code, so the extra check is worth the cost. - The class-level doxygen now documents the invariant and how it is enforced (default ctor allocates, move members preserve definedness, FFI type traits reject None at deserialization, the ir.IRModule FFI lambda normalizes None to DictAttrs() explicitly). Notes: - The default constructor (`DictAttrs dict;`) already produced an empty-backed instance via `explicit DictAttrs(Map = {})`, so no change was needed there. - The FFI type traits already reject None for non-nullable types (`_type_is_nullable == false` makes CheckAnyStrict return false), so reflection-driven deserializers cannot inject a null DictAttrs. - The IRModule FFI lambda explicitly normalizes a missing/None attrs parameter before forwarding to the C++ constructor; the Function and PrimFunc Python wrappers do the same on the Python side. - The 15 Gemini-flagged access sites are safe under the closed invariant without per-site `defined()` checks.
117e5c9 to
3369d30
Compare
Summary
Follow-up to #19607 that continues trimming
attrs.hand adjacentfiles. The six commits land independently and each builds clean.
OpNode::argumentsandAttrFieldInfo— the field storedmetadata that no Python tooling, test, or C++ caller (beyond internal
sanity checks) read; removing it deletes
AttrFieldInfoplus ~335chained
.add_argument(...)calls. The remaining 12 internal consumersnow read
op->num_inputsand report indexed inputs (input[i]).BaseAttrsNode(ffi::Objectuses a captured-typed deleter, no virtual dispatch needed) and inline
the trivial 3-line
DictAttrs(Map)constructor into the header.BaseAttrsNode→AttrsNode; theBaseprefix existed onlyto distinguish from the
AttrsNodeReflAdaptershim that [REFACTOR][IR] Cleanup attrs.h: drop NullValue, AttrsNodeReflAdapter, legacy BaseAttrsNode methods #19607removed. The
"ir.Attrs"FFI registry key is unchanged.DictAttrsto NOTNULLABLE(
TVM_FFI_DEFINE_OBJECT_REF_METHODS_NOTNULLABLE+ COW macro). Theno-arg
DictAttrs()constructor already created an empty backing,so every existing call site already produced a defined object;
~15 defensive
attrs.defined()checks (and a defensive PythonNonefallback in
Function) are now redundant.WithAttr(DictAttrs, ...)/WithAttrs(DictAttrs, ...)free-function overloads into the TFunc-template wrappers — those
overloads had no external callers (no TVM_DLL, no Python binding).
AttrsWithDefaultValues<T>→PassConfigWithDefaults<T>andmove from
attrs.htotransform.h; all 9 consumers are pass-configclasses registered via
TVM_REGISTER_PASS_CONFIG_OPTION.attrs.hshrinks from 363 → 262 lines.