Description
The rule_identifier value reported for a match does not align with the rule identifier available in license_rule_references when the match comes from the 1-spdx-id matcher. ScanCode correctly creates a matching rule, but the hashes at the end of the two identifiers differ (see "How To Reproduce").
This prevents automation from reliably retrieving rule information for this kind of match.
How To Reproduce
- Create a file with an SPDX license identifier comment:
echo "/* SPDX-License-Identifier: MIT */" > Application.java
- Run ScanCode with the --license-references option:
scancode --json-pp scancode_output.json --license --license-references --license-text Application.java
- In scancode_output.json (file attached) there is now one match with a rule_identifier that doesn't exist in license_rule_references, and one license_rule_reference with an identifier that is never used:
$ cat scancode_output.json | jq .files[].license_detections[].matches[].rule_identifier
"spdx-license-identifier-mit-e0e2f62999b9522e22ba5602a715c2acd64e958b"
$ cat scancode_output.json | jq .license_rule_references[].identifier
"spdx-license-identifier-mit-2410ec7d8cecfb84d911cb1c29ba44ab907b8b8f"
Note how the unique hashes at the end of the two identifiers differ. Both refer to mit, and license_rule_references[].text agrees with .files[].license_detections[].matches[].matched_text, so it's certain they refer to the same thing.
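The same comparison can be automated for larger scans. The following is a minimal Python sketch, not part of ScanCode: the function name find_identifier_mismatches and the sample dictionary are hypothetical, and it assumes only the JSON fields queried with jq above (files[].license_detections[].matches[].rule_identifier and license_rule_references[].identifier).

```python
def find_identifier_mismatches(scan):
    """Compare match rule identifiers against license_rule_references.

    `scan` is a parsed ScanCode JSON document (a dict). Returns a tuple
    (unresolved, unused): identifiers used by matches but missing from the
    references, and identifiers in the references that no match uses.
    """
    # Identifiers declared in the top-level license_rule_references list.
    referenced = {
        ref["identifier"] for ref in scan.get("license_rule_references", [])
    }
    # Identifiers reported on the individual matches of every file.
    matched = {
        match["rule_identifier"]
        for file_entry in scan.get("files", [])
        for detection in file_entry.get("license_detections", [])
        for match in detection.get("matches", [])
    }
    return sorted(matched - referenced), sorted(referenced - matched)


# Hypothetical, trimmed-down stand-in for scancode_output.json that keeps
# only the two identifiers shown in this report.
sample = {
    "license_rule_references": [
        {"identifier": "spdx-license-identifier-mit-2410ec7d8cecfb84d911cb1c29ba44ab907b8b8f"}
    ],
    "files": [
        {
            "license_detections": [
                {
                    "matches": [
                        {"rule_identifier": "spdx-license-identifier-mit-e0e2f62999b9522e22ba5602a715c2acd64e958b"}
                    ]
                }
            ]
        }
    ],
}

unresolved, unused = find_identifier_mismatches(sample)
print("matches without a reference:", unresolved)
print("references never matched:", unused)
```

To check a real scan, load the file with json.load and pass the resulting dict to the function. If both lists are empty, every match resolves to a reference.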
System configuration
- What OS are you running on? (Windows/MacOS/Linux)
Linux (Ubuntu 22.04.3 LTS), 64bit
- What version of scancode-toolkit was used to generate the scan file?
ScanCode version: 32.0.8
ScanCode Output Format version: 3.0.0
SPDX License list version: 3.21
- What installation method was used to install/run scancode? (pip/source download/other)
With pip on Python 3.10.12.