[feat](maxcompute) Support INSERT INTO for MaxCompute external catalog tables #60769

Draft
morningman wants to merge 7 commits into apache:master from morningman:support_mc_write

@morningman (Contributor) commented Feb 14, 2026

What problem does this PR solve?

Related #60768

Add end-to-end write support for MaxCompute external tables, enabling
users to export data from Doris to MaxCompute via standard INSERT INTO
syntax. This builds on the JNI writer framework introduced in #60756.

Key changes:

BE:

  • Add MCTableSinkOperatorX pipeline sink operator and MCTableSinkLocalState
  • Add VMCTableWriter (async) and VMCPartitionWriter for partition-aware writes
  • Extend VJniFormatTransformer with get_statistics() for retrieving write metrics from the Java-side writer
  • Track TMCCommitData in RuntimeState and report it back to the coordinator via FragmentMgr

FE:

  • Add MaxComputeJniWriter using MC Tunnel SDK for data upload
  • Add MCTransaction for upload session lifecycle management and commit
  • Add MCTransactionManager and MCInsertExecutor/MCInsertCommandContext
  • Add Nereids planner support: UnboundMaxComputeTableSink,
    LogicalMaxComputeTableSink, PhysicalMaxComputeTableSink with
    corresponding bind and implementation rules
  • Add MaxComputeTableSink planner node

Thrift:

  • Define TMCCommitData, TMaxComputeTableSink, and MAXCOMPUTE_TABLE_SINK
    data sink type
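
The write path described above (partition-aware writers, commit data reported back to the coordinator, and a transaction that commits once) can be illustrated with a small model. This is only a sketch of the general pattern, not the code in this PR: `CommitData`, `PartitionWriter`, `Transaction`, and `insert_into` are hypothetical names, and the real TMCCommitData fields and MC Tunnel SDK calls are not shown here.

```python
from dataclasses import dataclass


@dataclass
class CommitData:
    """Hypothetical stand-in for TMCCommitData; the real fields may differ."""
    partition_spec: str
    row_count: int


class PartitionWriter:
    """Buffers rows for a single partition (illustrative only)."""

    def __init__(self, partition_spec):
        self.partition_spec = partition_spec
        self.rows = []

    def write(self, row):
        self.rows.append(row)

    def close(self):
        # A real writer would flush its buffered data to the remote table here.
        return CommitData(self.partition_spec, len(self.rows))


class Transaction:
    """Collects commit data from every writer, then commits exactly once."""

    def __init__(self):
        self.commit_data = []
        self.state = "OPEN"

    def report(self, data):
        self.commit_data.append(data)

    def commit(self):
        self.state = "COMMITTED"
        return sum(d.row_count for d in self.commit_data)

    def rollback(self):
        self.commit_data.clear()
        self.state = "ABORTED"


def insert_into(rows, partition_column, txn):
    """Route rows to per-partition writers, report commit data, then commit."""
    writers = {}
    try:
        for row in rows:
            spec = f"{partition_column}={row[partition_column]}"
            if spec not in writers:
                writers[spec] = PartitionWriter(spec)
            writers[spec].write(row)
        for writer in writers.values():
            txn.report(writer.close())
        return txn.commit()
    except Exception:
        # Any failure before commit aborts the whole transaction.
        txn.rollback()
        raise


rows = [
    {"ds": "2026-02-13", "v": 1},
    {"ds": "2026-02-14", "v": 2},
    {"ds": "2026-02-14", "v": 3},
]
txn = Transaction()
written = insert_into(rows, "ds", txn)
```

In this model nothing becomes visible until `commit()` runs, which is why each writer only hands back a small commit record instead of finalizing its own output.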

The title keeps the existing feat tag convention and makes it clearer that this is about INSERT INTO on external catalog tables (not just "write data back"). The commit message follows the style of the related PR #60756,
with structured sections for BE/FE/Thrift changes.

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@morningman changed the title from "[feat](maxcompute) support write data back to mc table" to "[feat](maxcompute) Support INSERT INTO for MaxCompute external catalog tables" on Feb 14, 2026

@morningman
Contributor Author

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 79.33% (1796/2264)
Line Coverage 64.79% (31989/49372)
Region Coverage 65.47% (15960/24378)
Branch Coverage 55.97% (8487/15164)

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 4.92% (16/325) 🎉
Increment coverage report
Complete coverage report

@doris-robot

TPC-H: Total hot run time: 29038 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit d0d766c929aea20bad55d820f0bf9bb7fdb8ccf5, data reload: false

------ Round 1 ----------------------------------
============================================
q1	17443	4479	4363	4363
q2	
q3	10650	818	532	532
q4	4683	379	257	257
q5	7550	1212	1029	1029
q6	172	179	147	147
q7	790	863	662	662
q8	9316	1461	1344	1344
q9	5259	4673	4737	4673
q10	6850	1865	1639	1639
q11	457	249	239	239
q12	748	578	459	459
q13	17788	4235	3448	3448
q14	237	233	215	215
q15	942	799	785	785
q16	774	726	678	678
q17	753	883	413	413
q18	6079	5454	5309	5309
q19	1487	994	640	640
q20	512	494	395	395
q21	4688	1963	1531	1531
q22	381	305	280	280
Total cold run time: 97559 ms
Total hot run time: 29038 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
q1	4764	4573	4526	4526
q2	
q3	1838	2262	1751	1751
q4	876	1163	762	762
q5	4040	4453	4414	4414
q6	178	173	141	141
q7	1776	1668	1549	1549
q8	2555	2789	2549	2549
q9	7848	7362	7371	7362
q10	2651	2842	2477	2477
q11	515	460	417	417
q12	510	584	429	429
q13	3909	4444	3613	3613
q14	284	288	273	273
q15	872	820	806	806
q16	706	747	707	707
q17	1194	1537	1301	1301
q18	7072	6817	6646	6646
q19	980	985	894	894
q20	2089	2150	1983	1983
q21	4022	3482	3405	3405
q22	458	483	429	429
Total cold run time: 49137 ms
Total hot run time: 46434 ms
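
The result rows carry no column headers. One reading that fits the reported totals is that each row lists the cold run, two hot runs, and the faster of the two hot runs. The sketch below checks that reading against the Round 1 rows; the column interpretation is an assumption inferred from the numbers, and `summarize` is a hypothetical helper, not part of the benchmark scripts.

```python
ROUND_1 = """\
q1 17443 4479 4363 4363
q2 q3 10650 818 532 532
q4 4683 379 257 257
q5 7550 1212 1029 1029
q6 172 179 147 147
q7 790 863 662 662
q8 9316 1461 1344 1344
q9 5259 4673 4737 4673
q10 6850 1865 1639 1639
q11 457 249 239 239
q12 748 578 459 459
q13 17788 4235 3448 3448
q14 237 233 215 215
q15 942 799 785 785
q16 774 726 678 678
q17 753 883 413 413
q18 6079 5454 5309 5309
q19 1487 994 640 640
q20 512 494 395 395
q21 4688 1963 1531 1531
q22 381 305 280 280
"""


def summarize(rows_text):
    """Return (cold_total_ms, hot_total_ms) for a block of result rows."""
    cold_total = 0
    hot_total = 0
    for line in rows_text.splitlines():
        # Keep only the numeric fields; query names such as "q2" are skipped.
        numbers = [int(field) for field in line.split() if field.isdigit()]
        if len(numbers) != 4:
            continue  # a query with no recorded timings
        cold, hot_1, hot_2, best_hot = numbers
        # Assumed layout: the last column is the faster of the two hot runs.
        assert best_hot == min(hot_1, hot_2)
        cold_total += cold
        hot_total += best_hot
    return cold_total, hot_total


cold_ms, hot_ms = summarize(ROUND_1)
print(f"Total cold run time: {cold_ms} ms")
print(f"Total hot run time: {hot_ms} ms")
```

Under this reading the sums reproduce the reported Round 1 totals of 97559 ms (cold) and 29038 ms (hot).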

@doris-robot

TPC-DS: Total hot run time: 184250 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit d0d766c929aea20bad55d820f0bf9bb7fdb8ccf5, data reload: false

query5	4834	653	530	530
query6	325	217	220	217
query7	4209	486	274	274
query8	331	242	239	239
query9	8809	2732	2741	2732
query10	537	355	337	337
query11	17017	16892	16563	16563
query12	180	124	120	120
query13	1255	436	341	341
query14	6342	3220	2972	2972
query14_1	2798	2791	2749	2749
query15	206	192	177	177
query16	988	479	452	452
query17	1075	726	628	628
query18	2584	445	353	353
query19	212	212	190	190
query20	138	128	132	128
query21	222	151	122	122
query22	4800	6169	5792	5792
query23	17575	17247	17055	17055
query23_1	17040	17070	16620	16620
query24	7194	1623	1213	1213
query24_1	1238	1221	1232	1221
query25	576	444	405	405
query26	1240	251	146	146
query27	2798	473	279	279
query28	4543	1859	1829	1829
query29	771	563	463	463
query30	312	242	208	208
query31	876	756	646	646
query32	84	70	74	70
query33	505	332	279	279
query34	913	913	565	565
query35	635	668	619	619
query36	1071	1133	979	979
query37	134	97	84	84
query38	2958	2883	2905	2883
query39	886	915	861	861
query39_1	826	820	838	820
query40	227	146	137	137
query41	63	98	62	62
query42	102	100	100	100
query43	383	385	355	355
query44	
query45	203	189	181	181
query46	886	976	605	605
query47	2101	2158	2049	2049
query48	301	312	232	232
query49	625	449	376	376
query50	694	286	210	210
query51	4105	4070	4065	4065
query52	103	110	95	95
query53	288	337	280	280
query54	293	262	267	262
query55	86	85	80	80
query56	322	313	305	305
query57	1376	1347	1276	1276
query58	289	281	278	278
query59	2567	2603	2507	2507
query60	334	335	318	318
query61	139	157	173	157
query62	617	596	523	523
query63	301	279	276	276
query64	4859	1259	1002	1002
query65	
query66	1411	452	342	342
query67	16574	16352	16310	16310
query68	
query69	405	326	286	286
query70	969	934	954	934
query71	337	299	300	299
query72	2711	2796	2622	2622
query73	533	556	326	326
query74	9984	9932	9757	9757
query75	2863	2762	2476	2476
query76	2290	1042	671	671
query77	379	396	350	350
query78	11235	11528	10737	10737
query79	1132	812	592	592
query80	1431	614	542	542
query81	563	273	253	253
query82	1000	153	117	117
query83	346	256	247	247
query84	250	114	103	103
query85	1015	473	424	424
query86	402	349	298	298
query87	3097	3073	2996	2996
query88	3582	2700	2655	2655
query89	425	359	344	344
query90	1923	178	173	173
query91	166	155	136	136
query92	78	78	74	74
query93	947	811	499	499
query94	641	330	301	301
query95	592	397	329	329
query96	659	527	229	229
query97	2448	2502	2405	2405
query98	228	219	216	216
query99	1002	999	931	931
Total cold run time: 253347 ms
Total hot run time: 184250 ms

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/333) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.66% (19546/37115)
Line Coverage 36.22% (182187/502992)
Region Coverage 32.52% (141190/434117)
Branch Coverage 33.59% (61277/182414)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 3.00% (10/333) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.60% (26041/36368)
Line Coverage 54.32% (272550/501743)
Region Coverage 51.62% (226361/438494)
Branch Coverage 53.17% (97362/183118)

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 4.92% (16/325) 🎉
Increment coverage report
Complete coverage report
