branch-4.0: [feature](search) introduce lucene bool mode for search function #59394#59745
yiguolei merged 1 commit into branch-4.0
### What problem does this PR solve?

Issue Number: close #xxx

Related PR: #58545

Problem Summary:

This PR introduces two new features for the SEARCH function:

#### 1. Lucene Boolean Mode

Adds a `mode` option to enable Lucene/Elasticsearch-style query parsing:

```sql
-- Enable Lucene mode via JSON options
SELECT * FROM docs WHERE search('apple AND banana', '{"default_field":"title","mode":"lucene"}');

-- With minimum_should_match
SELECT * FROM docs WHERE search('apple AND banana OR cherry', '{"default_field":"title","mode":"lucene","minimum_should_match":1}');
```

**Key differences from standard mode:**

- AND/OR/NOT work as left-to-right modifiers (not traditional boolean algebra)
- Uses MUST/SHOULD/MUST_NOT internally (like Lucene's Occur enum)
- Pure NOT queries return empty results (need positive clause)

**Behavior comparison:**

| Query | Standard Mode | Lucene Mode |
|-------|--------------|-------------|
| `a AND b` | a ∩ b | +a +b (both MUST) |
| `a OR b` | a ∪ b | a b (both SHOULD, min=1) |
| `NOT a` | ¬a | Empty (no positive clause) |
| `a AND NOT b` | a ∩ ¬b | +a -b (MUST a, MUST_NOT b) |
| `a AND b OR c` | (a ∩ b) ∪ c | +a b c (only a is MUST) |

#### 2. Escape Characters in DSL

Support for escaping special characters using backslash:

| Escape | Description | Example |
|--------|-------------|---------|
| `\ ` | Literal space | `title:First\ Value` matches "First Value" |
| `\(` `\)` | Literal parentheses | `title:hello\(world\)` matches "hello(world)" |
| `\:` | Literal colon | `title:key\:value` matches "key:value" |
| `\\` | Literal backslash | `title:path\\to\\file` matches "path\to\file" |
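The escape rules in the table above can be illustrated with a small Python sketch. This is not the PR's parser code: `unescape_term` is a hypothetical helper, and the handling of a trailing lone backslash (kept as a literal character) is an assumption, since the description does not cover that case.

```python
def unescape_term(raw: str) -> str:
    """Resolve backslash escapes in a DSL term value (illustrative only).

    A backslash followed by any character yields that character literally,
    which covers the four escapes listed in the table: space, parentheses,
    colon, and backslash.
    """
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\" and i + 1 < len(raw):
            # Escaped character: drop the backslash, keep the next character.
            out.append(raw[i + 1])
            i += 2
        else:
            # Ordinary character, or a trailing lone backslash (assumption:
            # it is kept as-is rather than rejected).
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    # Term values taken from the Example column of the table.
    for raw in (r"First\ Value", r"hello\(world\)", r"key\:value", r"path\\to\\file"):
        print(raw, "->", unescape_term(raw))
```

The sketch only covers the value side of a `field:value` pair; how the real tokenizer decides where an escaped term ends is not described here and is left out.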
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
Cherry-picked from #59394
Note: This PR depends on #59766 (cherry-pick of #58545) being merged first.
Summary
Introduce lucene bool mode for search function.
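The left-to-right behavior listed in the PR description's comparison table can be sketched in Python as follows. This is one reading of that table, not the code merged in this PR: the names `Occur`, `assign_occurs`, `render`, and `has_positive_clause` are hypothetical, and the sketch assumes whitespace-separated tokens with no parentheses, field prefixes, or `minimum_should_match` handling.

```python
from enum import Enum


class Occur(Enum):
    MUST = "+"
    SHOULD = ""
    MUST_NOT = "-"


def assign_occurs(tokens):
    """Assign MUST/SHOULD/MUST_NOT to terms, scanning left to right."""
    clauses = []      # list of [term, Occur]
    conj = None       # pending AND/OR seen before the next term
    negate = False    # pending NOT seen before the next term
    for tok in tokens:
        if tok in ("AND", "OR"):
            conj = tok
        elif tok == "NOT":
            negate = True
        else:
            # The operator also rewrites the previous clause, unless that
            # clause is already prohibited.
            if clauses and conj is not None and clauses[-1][1] is not Occur.MUST_NOT:
                clauses[-1][1] = Occur.MUST if conj == "AND" else Occur.SHOULD
            if negate:
                occur = Occur.MUST_NOT
            elif conj == "AND":
                occur = Occur.MUST
            else:
                occur = Occur.SHOULD
            clauses.append([tok, occur])
            conj, negate = None, False
    return [(term, occur) for term, occur in clauses]


def render(clauses):
    """Format clauses in the +a / a / -a notation used in the table."""
    return " ".join(occur.value + term for term, occur in clauses)


def has_positive_clause(clauses):
    """A query with only MUST_NOT clauses matches nothing in Lucene mode."""
    return any(occur is not Occur.MUST_NOT for _, occur in clauses)


if __name__ == "__main__":
    for query in ("a AND b", "a OR b", "NOT a", "a AND NOT b", "a AND b OR c"):
        clauses = assign_occurs(query.split())
        print(query, "=>", render(clauses), "| positive:", has_positive_clause(clauses))
```

Under this reading, `a AND b OR c` first marks both `a` and `b` as MUST, then the `OR` demotes `b` to SHOULD, which reproduces the `+a b c` row of the table.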
Test plan
Related PRs: #59394
Depends on: #59766