fix(gitextractor): subtask Clone Git Repo ended unexpectedly#8136
Signed-off-by: Caio Queiroz <caiogqueiroz@gmail.com>
The current implementation assumes all repositories belong to GitHub, which is incorrect. We need to decouple the GitExtractor plugin from specific data source platforms like GitHub and GitLab.
Here's a suggested approach:
Define an interface: Create an interface named DynamicGitUrl (or a more descriptive name) within the gitextractor plugin. This interface should define a method to retrieve the latest Git URL based on a given connection ID and scope ID.
Implement PrepareTaskData: In the gitextractor.PrepareTaskData function, if a plugin, connection ID, and scope ID are provided, use core.GetPlugin to fetch the plugin instance and dynamically cast it to the DynamicGitUrl interface. Then, call the interface's method with the connection ID and scope ID to retrieve the latest Git URL.
Implement DynamicGitUrl in Data Source plugins: Each data source plugin (like GitHub) should implement the DynamicGitUrl interface, providing its own logic for determining the Git URL based on connection and scope information.
This approach allows for a more flexible and extensible design. The GitExtractor plugin remains agnostic to specific data sources, while each data source plugin is responsible for providing the appropriate Git URL retrieval logic.
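A minimal sketch of the approach described above, in Go. Everything named here is an assumption for illustration only: the method name `GetDynamicGitUrl`, the `options` struct, and the `registry`/`getPlugin` pair, which stands in for the real plugin lookup (`core.GetPlugin` in the comment above). The fake GitHub plugin and its URL format are made up.

```go
package main

import "fmt"

// DynamicGitUrl is the interface proposed in the review. The method name and
// signature are assumptions; the review only says it takes a connection ID
// and a scope ID and returns the latest Git URL.
type DynamicGitUrl interface {
	GetDynamicGitUrl(connectionId uint64, scopeId string) (string, error)
}

// registry and getPlugin are hypothetical stand-ins for the plugin registry.
var registry = map[string]any{}

func getPlugin(name string) (any, error) {
	p, ok := registry[name]
	if !ok {
		return nil, fmt.Errorf("plugin %q not found", name)
	}
	return p, nil
}

// fakeGithub is an illustrative data source plugin that builds a URL with a
// freshly generated credential embedded in it.
type fakeGithub struct{}

func (fakeGithub) GetDynamicGitUrl(connectionId uint64, scopeId string) (string, error) {
	return fmt.Sprintf("https://git:token-%d@example.com/%s.git", connectionId, scopeId), nil
}

// options mirrors the task options relevant here (hypothetical field names).
type options struct {
	Url          string
	PluginName   string
	ConnectionId uint64
	ScopeId      string
}

// resolveGitUrl is the part that would live in PrepareTaskData: if a plugin,
// connection ID, and scope ID are all given and the plugin implements
// DynamicGitUrl, ask it for the URL; otherwise keep the static URL.
func resolveGitUrl(op options) (string, error) {
	if op.PluginName == "" || op.ConnectionId == 0 || op.ScopeId == "" {
		return op.Url, nil
	}
	p, err := getPlugin(op.PluginName)
	if err != nil {
		return "", err
	}
	dynamic, ok := p.(DynamicGitUrl)
	if !ok {
		// The plugin does not provide dynamic URLs; fall back to the static one.
		return op.Url, nil
	}
	return dynamic.GetDynamicGitUrl(op.ConnectionId, op.ScopeId)
}

func main() {
	registry["github"] = fakeGithub{}
	url, err := resolveGitUrl(options{
		Url:          "https://example.com/static.git",
		PluginName:   "github",
		ConnectionId: 1,
		ScopeId:      "org/repo",
	})
	fmt.Println(url, err)
}
```

The type assertion keeps gitextractor free of imports from any data source plugin. A plugin opts in simply by implementing the method.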
Hi @klesh, thanks for the review!
klesh
left a comment
YES!!! Exactly. Fantastic.
Have you tested it on your local machine? Does it work as expected?
Summary
Does this close any open issues?
Closes #7958
Screenshots
Before the fix:

After the fix: The pipeline runs for more than 1 hour and the gitextractor tasks keep working.

Other Information
I understand that this solution is not the most efficient, since it generates a new token for every gitextractor task even when the current token is still valid. However, I believe it can serve as a temporary solution that lets GitHub App connections work without problems.
For a more efficient solution, we could generate a new token only when it has reached its expiration time.
Analyzing the code, I believe the expiration time of this token needs to be persisted in the database so that it can be read when preparing the task. What do you think? I may evolve this solution in a follow-up PR.
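To make the more efficient variant concrete, here is a rough Go sketch that refreshes the token only when it is close to expiring. All names (`token`, `tokenStore`, `fakeClock`, `newDemoStore`) are hypothetical. The in-memory `current` field stands in for the expiration time that would be persisted in the database, and the one-hour lifetime and one-minute safety margin are assumptions for the demo.

```go
package main

import (
	"fmt"
	"time"
)

// token pairs a credential with its expiration time. In the real plugin the
// expiration would be persisted with the connection; here it is in memory.
type token struct {
	Value     string
	ExpiresAt time.Time
}

// tokenStore hands out the cached token and calls fetch only when the cached
// one is missing or about to expire.
type tokenStore struct {
	current token
	fetch   func() (token, error) // requests a new token from the provider
	now     func() time.Time      // injectable clock, so the logic is testable
	margin  time.Duration         // refresh this long before the real expiry
}

// Get returns a token that is valid for at least `margin` from now.
func (s *tokenStore) Get() (string, error) {
	if s.current.Value != "" && s.now().Add(s.margin).Before(s.current.ExpiresAt) {
		return s.current.Value, nil
	}
	t, err := s.fetch()
	if err != nil {
		return "", err
	}
	s.current = t
	return t.Value, nil
}

// fakeClock is a manually advanced clock used by the demo.
type fakeClock struct{ t time.Time }

func (c *fakeClock) Now() time.Time { return c.t }

func (c *fakeClock) AdvanceMinutes(m int) { c.t = c.t.Add(time.Duration(m) * time.Minute) }

// newDemoStore builds a store whose fake provider issues one-hour tokens and
// counts how many times it was asked for one.
func newDemoStore(clock *fakeClock, calls *int) *tokenStore {
	return &tokenStore{
		fetch: func() (token, error) {
			*calls++
			return token{
				Value:     fmt.Sprintf("token-%d", *calls),
				ExpiresAt: clock.Now().Add(time.Hour),
			}, nil
		},
		now:    clock.Now,
		margin: time.Minute,
	}
}

func main() {
	clock := &fakeClock{}
	calls := 0
	store := newDemoStore(clock, &calls)

	for _, step := range []int{0, 30, 45} {
		clock.AdvanceMinutes(step)
		value, err := store.Get()
		fmt.Println(value, err, "fetches so far:", calls)
	}
}
```

With this shape, each gitextractor task would call `Get` while preparing its data, and a new token would be requested only once the stored expiration is near.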