Skip to content

fix(gitextractor): subtask Clone Git Repo ended unexpectedly#8136

Merged
klesh merged 5 commits into
apache:mainfrom
caioq:fix/gitextractor-auth
Oct 22, 2024
Merged

fix(gitextractor): subtask Clone Git Repo ended unexpectedly#8136
klesh merged 5 commits into
apache:mainfrom
caioq:fix/gitextractor-auth

Conversation

@caioq

@caioq caioq commented Oct 9, 2024

Copy link
Copy Markdown
Contributor

Summary

  • According to github documentation the installation access token expire after 1 hour. This commit generates a new installation access token at the start of each gitextractor task avoiding the auth error that happens for pipelines that last more than 1 hour.

Does this close any open issues?

Closes #7958

Screenshots

Before the fix:
image

After the fix: The pipeline lasts more than 1 hour and the gitextractor tasks keep working
image

Other Information

I understand that this solution is not the most efficient, since for each gitextractor task it will generate new tokens even when the current token is still valid. However, I believe it can be used as a temporary solution to enable the use of Github App without problems.

For a more efficient solution, we could generate a new token only when it has reached its expiration time.
Analyzing the code, I believe that the expiration time of this token needs to be persisted in the database in order to be accessed when preparing the task. What do you think? I may be evolving this solution in another new PR.

@dosubot dosubot Bot added size:M This PR changes 30-99 lines, ignoring generated files. component/plugins This issue or PR relates to plugins pr-type/bug-fix This PR fixes a bug severity/p1 This bug affects functionality or significantly affect ux labels Oct 9, 2024
Signed-off-by: Caio Queiroz <caiogqueiroz@gmail.com>

@klesh klesh left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current implementation assumes all repositories belong to GitHub, which is incorrect. We need to decouple GitExtractor plugin from specific data source platforms like GitHub and GitLab.

Here's a suggested approach:

Define an interface: Create an interface named DynamicGitUrl (or a more descriptive name) within the gitextractor plugin. This interface should define a method to retrieve the latest Git URL based on a given connection ID and scope ID.

Implement PrepareTaskData: In the gitextractor.PrepareTaskData function, if a plugin, connection ID, and scope ID are provided, use core.GetPlugin to fetch the plugin instance and dynamically cast it to the DynamicGitUrl interface. Then, call the interface's method with the connection ID and scope ID to retrieve the latest Git URL.

Implement DynamicGitUrl in Data Source plugins: Each data source plugin (like GitHub) should implement the DynamicGitUrl interface, providing its own logic for determining the Git URL based on connection and scope information.

This approach allows for a more flexible and extensible design. The GitExtractor plugin remains agnostic to specific data sources, while each data source plugin is responsible for providing the appropriate Git URL retrieval logic.

@dosubot dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Oct 17, 2024
@caioq

caioq commented Oct 17, 2024

Copy link
Copy Markdown
Contributor Author

The current implementation assumes all repositories belong to GitHub, which is incorrect. We need to decouple GitExtractor plugin from specific data source platforms like GitHub and GitLab.

Here's a suggested approach:

Define an interface: Create an interface named DynamicGitUrl (or a more descriptive name) within the gitextractor plugin. This interface should define a method to retrieve the latest Git URL based on a given connection ID and scope ID.

Implement PrepareTaskData: In the gitextractor.PrepareTaskData function, if a plugin, connection ID, and scope ID are provided, use core.GetPlugin to fetch the plugin instance and dynamically cast it to the DynamicGitUrl interface. Then, call the interface's method with the connection ID and scope ID to retrieve the latest Git URL.

Implement DynamicGitUrl in Data Source plugins: Each data source plugin (like GitHub) should implement the DynamicGitUrl interface, providing its own logic for determining the Git URL based on connection and scope information.

This approach allows for a more flexible and extensible design. The GitExtractor plugin remains agnostic to specific data sources, while each data source plugin is responsible for providing the appropriate Git URL retrieval logic.

Hi @klesh, thanks for the review!
I agree with you and liked your suggestion. I implemented it here 26b716f for the Github plugin, let me know if this is what you had in mind.

@dosubot dosubot Bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Oct 17, 2024
@caioq caioq requested a review from klesh October 17, 2024 20:08
klesh
klesh previously approved these changes Oct 21, 2024

@klesh klesh left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

YES!!! Exactly. Fantastic.
Have you tested it on your local machine? Is it work as expected?

@dosubot dosubot Bot added the lgtm This PR has been approved by a maintainer label Oct 21, 2024
@dosubot dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Oct 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

component/plugins This issue or PR relates to plugins lgtm This PR has been approved by a maintainer pr-type/bug-fix This PR fixes a bug severity/p1 This bug affects functionality or significantly affect ux size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug][gitextractor] subtask Clone Git Repo ended unexpectedly

2 participants