Skip to content

API, Core: Add CatalogObjectIdentifier#16160

Open
stevenzwu wants to merge 4 commits into
apache:mainfrom
stevenzwu:rest-catalog-object-identifier-java
Open

API, Core: Add CatalogObjectIdentifier#16160
stevenzwu wants to merge 4 commits into
apache:mainfrom
stevenzwu:rest-catalog-object-identifier-java

Conversation

@stevenzwu
Copy link
Copy Markdown
Contributor

@stevenzwu stevenzwu commented Apr 29, 2026

Summary

Adds the Java reference implementation for the CatalogObjectIdentifier schema added to the REST OpenAPI spec in #16144.

  • CatalogObjectIdentifier (api/catalog) — hand-written POJO mirroring Namespace: static of(String...) factory, null/null-byte validation, levels() / level(i) / length() accessors, dotted toString.
  • CatalogObjectIdentifierParser (core/catalog) — JSON serde mirroring TableIdentifierParser. Serializes as a bare JSON array of strings.
  • RESTSerializers registers the serializer/deserializer pair.

Spec PR: #16144

Test plan

  • ./gradlew :iceberg-api:test --tests TestCatalogObjectIdentifier
  • ./gradlew :iceberg-core:test --tests TestCatalogObjectIdentifierParser
  • ./gradlew :iceberg-api:spotlessApply :iceberg-core:spotlessApply

🤖 Generated with Claude Code

@stevenzwu stevenzwu force-pushed the rest-catalog-object-identifier-java branch from c3d64e5 to 31eacf9 Compare April 29, 2026 22:33
* <p>The object kind is determined by context (e.g. the endpoint or a companion {@link
* CatalogObjectType} discriminator), not by the identifier structure alone.
*/
public class CatalogObjectIdentifier {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nti: what about just calling it Identifier as CatalogObjectIdentifier seems quite long

Copy link
Copy Markdown
Contributor Author

@stevenzwu stevenzwu Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CatalogObjectIdentifier is more accurate about identifying an object within a catalog.

E.g. a SQL may reference a table as my_catalog.my_db.my_tbl. catalog name should be excluded in the CatalogObjectIdentifier, which would have my_db.my_tbl.

There were discussions that some catalog servers (like Polaris) have listCatalogs endpoint. I don't want to confuse this with the catalog name/identifier.

public class TestCatalogObjectIdentifier {

@Test
public void testWithNullAndEmpty() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: we can omit test prefixes in newly added tests as this is a legacy JUnit thing that isn't adding any value

@Override
public CatalogObjectIdentifier deserialize(JsonParser p, DeserializationContext context)
throws IOException {
String[] levels = JsonUtil.getStringArray(p.getCodec().readTree(p));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it's essentially a one-liner and we do the same for Namespace, but I wonder if it's worth still creating a separate Parser class. That way we can independently test the from/to JSON parts

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine either way. I changed the code as you suggested

}

public static CatalogObjectType fromName(String type) {
Preconditions.checkArgument(type != null, "Catalog object type is null");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think we use a pattern similar to Invalid catalog object type: null in other places

Comment on lines +48 to +50
Preconditions.checkArgument(
!CONTAINS_NULL_CHARACTER.test(level),
"Cannot create a CatalogObjectIdentifier with the null-byte character");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please elaborate why this check is required ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mirrors the same null-byte check on Namespace levels (added in #3938). Null bytes are a foot-gun in identifier-level strings — they can corrupt any encoding that uses a control byte as a delimiter, mis-terminate strings in C-based downstream consumers, and slip past path/log normalization. Rejecting them at construction keeps the invariant explicit at the API boundary, the same way Namespace does for the same reason.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let me fix the style though as we usually don't put class name in the error msg

@stevenzwu stevenzwu changed the title Core, API: Add CatalogObjectIdentifier and CatalogObjectType API, Core: Add CatalogObjectIdentifier May 12, 2026
@stevenzwu stevenzwu changed the title API, Core: Add CatalogObjectIdentifier API: Add CatalogObjectIdentifier May 12, 2026
stevenzwu and others added 4 commits May 12, 2026 10:43
Adds Java reference implementations for the REST schemas introduced
in apache#16144.

- api/org.apache.iceberg.catalog.CatalogObjectIdentifier: hand-written
  POJO mirroring Namespace — static of(String...) factory,
  null/null-byte validation, levels()/level(i)/length() accessors,
  dotted toString.
- api/org.apache.iceberg.catalog.CatalogObjectType: enum of TABLE,
  VIEW, NAMESPACE with lowercase wire strings and a fromName factory,
  mirroring PlanStatus.
- Registers a bare-array serializer and deserializer for
  CatalogObjectIdentifier in RESTSerializers, matching the way
  Namespace is wired.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…alogObjectType

- Re-add CatalogObjectIdentifierParser with independent from/to JSON
  unit tests. RESTSerializers delegates to it, matching how
  TableIdentifierParser is wired.
- Standardize the null-check error in CatalogObjectType.fromName on
  "Invalid catalog object type: null", matching the pattern used in
  DistributionMode, IsolationLevel, FileFormat, etc.
- Drop the test prefix from new @test method names.
- Link CatalogObjectIdentifier from the CatalogObjectType Javadoc now
  that both classes live in the same module and package.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces "CatalogObjectIdentifier" with the lowercase noun form
"catalog object identifier" in the four checkArgument/checkNotNull
messages and updates the matching test assertions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirrors the spec change in apache#16144: the shared
CatalogObjectType schema is being removed since the resolve and events
endpoints will each define their own narrower closed enums in their
respective PRs. Also rewords the CatalogObjectIdentifier Javadoc to
no longer reference CatalogObjectType by name.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@stevenzwu stevenzwu changed the title API: Add CatalogObjectIdentifier API, Core: Add CatalogObjectIdentifier May 12, 2026
@stevenzwu stevenzwu force-pushed the rest-catalog-object-identifier-java branch from f364328 to 3180afd Compare May 12, 2026 17:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants