Hey,
I have picked up earlier work with Lyo again, to run an LDP endpoint with OSLC query support.
While doing this, I noticed an issue with the parser (or rather the lexer) for oslc.where:
https://github.com/eclipse/lyo.core/blob/2b5a849b0f0d0b9fe3a373b814da30b94f69c39b/oslc-query/src/main/antlr3/org/eclipse/lyo/core/query/OslcWhere.g#L162
The issue is that the IRI_REF lexer rule claims any '<' unless it is followed by one of a few excluded chars. That makes it impossible to compare using '<' with anything but strings and IRIs, and to compare using '<=' at all. The lexer matches '<', sees a following char that is allowed inside an IRI, commits to IRI_REF, and then runs up to EOF while expecting the closing '>':
org.eclipse.lyo.core.query.ParseException: line 1:12 no viable alternative at input '<EOF>'
at org.eclipse.lyo.core.query.QueryUtils.checkErrors(QueryUtils.java:676)
at org.eclipse.lyo.core.query.QueryUtils.parseWhere(QueryUtils.java:132)
[...]
Unfortunately, the existing test cases do not cover this. Adding the following expressions to BasicWhereTest shows the issue:
"qm:answer<42", // fails
"qm:answer<=42", // fails
"qm:question<\"The ultimate question of Life, the Universe, and Everything\"", // works
"qm:question<=\"The ultimate question of Life, the Universe, and Everything\"", // fails
"qm:question<<urn:qm:ultimate>", // works
"qm:question<=<urn:qm:ultimate>", // fails
I experimented with the grammar for a bit (I am not used to ANTLR3), and a clean fix appears quite tricky due to the broad range of chars allowed in IRIs.
My current workaround requires ALPHA_CHARS after LESS at line 162, which makes the lexer skip IRI_REF for all '<=' comparisons and for '<' comparisons against decimal literals, thereby making those cases work.
RFC 3986 and RFC 3987 (URI, IRI) say that an absolute IRI starts with its scheme, and a scheme must begin with a letter, so forcing IRI_REF to start with a letter should not introduce a limitation for absolute IRIs.
--- a/OslcWhere.g
+++ b/OslcWhere.g
@@ -159,7 +160,7 @@ PNAME_LN
IRI_REF
- : LESS ( options {greedy=false;} : ~(LESS | GREATER | '"' | OPEN_CURLY_BRACE | CLOSE_CURLY_BRACE | '|' | '^' | '\\' | '`' | ('\u0000'..'\u0020')) )* GREATER
+ : LESS ALPHA_CHARS ( options {greedy=false;} : ~(LESS | GREATER | '"' | OPEN_CURLY_BRACE | CLOSE_CURLY_BRACE | '|' | '^' | '\\' | '`' | ('\u0000'..'\u0020')) )* GREATER
;
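To make the lookahead behaviour easier to see without ANTLR, here is a rough stand-alone Java sketch. It is only an approximation of the lexer decision, not the generated lexer: the class IriRefSketch and its method lexes are made up for illustration, ALPHA_CHARS is assumed to mean ASCII letters, and string literals are not treated specially.

```java
import java.util.List;

// Hypothetical helper, not part of Lyo: approximates how the IRI_REF rule
// decides whether a '<' starts an IRI, before and after the workaround.
final class IriRefSketch {

    // Chars the IRI_REF rule excludes from the body of an IRI.
    private static boolean isExcluded(char c) {
        return c <= '\u0020' || "<>\"{}|^\\`".indexOf(c) >= 0;
    }

    // Assumption: ALPHA_CHARS matches ASCII letters only.
    private static boolean isAlpha(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // Returns false if some '<' is taken as the start of an IRI_REF that
    // never reaches its closing '>'; returns true otherwise.
    static boolean lexes(String expr, boolean patched) {
        int i = 0;
        while (i < expr.length()) {
            if (expr.charAt(i) != '<' || i + 1 >= expr.length()) {
                i++;
                continue;
            }
            char next = expr.charAt(i + 1);
            boolean commitsToIri = patched
                    ? isAlpha(next)
                    : (next == '>' || !isExcluded(next));
            if (!commitsToIri) {
                i++; // this '<' is left to the LESS token
                continue;
            }
            // Committed to IRI_REF: consume allowed chars, then require '>'.
            int j = i + 1;
            while (j < expr.length() && !isExcluded(expr.charAt(j))) {
                j++;
            }
            if (j >= expr.length() || expr.charAt(j) != '>') {
                return false; // ran into EOF or an excluded char instead of '>'
            }
            i = j + 1;
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> expressions = List.of(
                "qm:answer<42",
                "qm:answer<=42",
                "qm:question<\"The ultimate question\"",
                "qm:question<=\"The ultimate question\"",
                "qm:question<<urn:qm:ultimate>",
                "qm:question<=<urn:qm:ultimate>");
        for (String e : expressions) {
            System.out.println(e + "  original=" + lexes(e, false)
                    + "  patched=" + lexes(e, true));
        }
    }
}
```

Under these assumptions the sketch reproduces the works/fails pattern from the test expressions above for the original rule, and accepts all six with the ALPHA_CHARS workaround.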
Best regards,
M.