Improve copyrights detection #3752

Merged
AyanSinhaMahapatra merged 29 commits into develop from misc-copyrights on Jun 26, 2024

Member

@pombredanne pombredanne commented Apr 26, 2024

This PR improves copyright detection

Tasks

  • Reviewed contribution guidelines
  • PR is descriptively titled 📑 and links the original issue above 🔗
  • Tests pass -- look for a green checkbox ✔️ a few minutes after opening your PR
    Run tests locally to check for errors.
  • Commits are in a uniquely-named feature branch and have no merge conflicts 📁
  • Updated documentation pages (if applicable)
  • Updated CHANGELOG.rst (if applicable)

Reported-by: Anton Augsburg @vw-anton
Reference: #3655
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Reported-by:  Dimitris Iliou @dimitris-iliou
Reference: #3735
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Spotted in some common Python libraries such as NumPy and SciPy

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Use an input file where each line is either:
- a URL to fetch
- a text to test

Then generate a pair of test data files accordingly
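
The workflow described in this commit message could look roughly like the sketch below. It is only an illustration under stated assumptions: the function names (`is_url`, `fetch_text`, `generate_test_pairs`), the file naming scheme, and the JSON layout of the expected file are all hypothetical and are not taken from the actual script in this PR.

```python
import json
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def is_url(line):
    """Return True if the line looks like an http(s) URL."""
    parsed = urlparse(line.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def fetch_text(url):
    """Fetch a URL and return its body decoded as text (hypothetical helper)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def generate_test_pairs(input_file, output_dir, fetch=fetch_text):
    """
    For each non-empty line of input_file, write a test file holding the text
    to scan and a sibling expected-results file. Return the list of pairs.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    pairs = []
    lines = Path(input_file).read_text(encoding="utf-8").splitlines()
    for index, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        # A URL line is fetched; any other line is used as the test text itself.
        text = fetch(line) if is_url(line) else line
        test_file = output_dir / f"case_{index:03d}.txt"
        expected_file = output_dir / f"case_{index:03d}.txt.expected.json"
        test_file.write_text(text, encoding="utf-8")
        # Assumption: expected detections start empty and are filled in after
        # running the detector and reviewing its output by hand.
        expected = {"source": line, "copyrights": []}
        expected_file.write_text(json.dumps(expected, indent=2), encoding="utf-8")
        pairs.append((test_file, expected_file))
    return pairs
```

Passing `fetch` as a parameter keeps the sketch usable offline, since a stub can stand in for the network call.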

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
- Start detecting "is held by"
- Do not include some trailing junk
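
To make the two points above concrete, here is a minimal standalone sketch. It uses a plain regular expression and makes no claim about how the project's detector is implemented; the pattern `HELD_BY`, the `TRAILING_JUNK` character set, and the `find_holder` function are hypothetical names chosen for illustration.

```python
import re

# Hypothetical pattern for statements of the form
# "copyright ... is held by <holder>" on a single line.
HELD_BY = re.compile(
    r"copyright\b[^.\n]*?\bis\s+held\s+by\s+(?P<holder>[^\n]+)",
    re.IGNORECASE,
)

# Assumption: characters treated as trailing junk, such as punctuation and
# leftover comment markers at the end of a line.
TRAILING_JUNK = " \t.,;:*/#-"


def find_holder(text):
    """Return the holder named after "is held by", or None if absent."""
    match = HELD_BY.search(text)
    if not match:
        return None
    # Drop trailing junk so only the holder name remains.
    return match.group("holder").rstrip(TRAILING_JUNK) or None


print(find_holder("The copyright of this file is held by Example Corp. */"))
```

With these assumptions, the call above prints `Example Corp`, with the trailing period and comment marker removed.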

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Reference: #3764
Reported-by: Anton Augsburg @vw-anton
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Make detection of copyright with a single lowercase name more specific

Reference: #3764
Reported-by: Anton Augsburg @vw-anton
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
This makes copyright detection more specific

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Also improve NOTICEs, and other misc. variants
Do not detect "The Initial Developer"

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Reference: #3797
Reported-by: Jörg Arndt @Joerki
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Handle corner cases with markup
Detect new copyright forms.

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
* Handle various parens, markup and quotes better

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Member

@AyanSinhaMahapatra AyanSinhaMahapatra left a comment

@pombredanne we need to fix the test failures here. After regenerating, some of these look like potential regressions to me, so we need more review of these failures.

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
@pombredanne pombredanne changed the title from "Apply small copyrights detection improvements" to "Improve copyrights detection" on Jun 22, 2024
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
@pombredanne
Member Author

@AyanSinhaMahapatra ready for your review, all green

Member

@AyanSinhaMahapatra AyanSinhaMahapatra left a comment

LGTM! I have a couple small questions and fixes here.

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

Co-authored-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Member

@AyanSinhaMahapatra AyanSinhaMahapatra left a comment

LGTM! Thanks++ @pombredanne This improves copyright detection a lot!
Merging!

@AyanSinhaMahapatra AyanSinhaMahapatra merged commit 1242518 into develop Jun 26, 2024
@AyanSinhaMahapatra AyanSinhaMahapatra deleted the misc-copyrights branch June 26, 2024 10:57
2 participants