Full Report Details
Function Inventory
Package Distribution
| Package | File Count | Primary Purpose |
| --- | --- | --- |
| pkg/workflow/ | 108 | Core workflow compilation, engines, safe outputs |
| pkg/cli/ | 60 | CLI commands, MCP management, tooling |
| pkg/parser/ | 6 | YAML/JSON parsing, GitHub API, frontmatter |
| pkg/console/ | 3 | Terminal UI rendering |
| pkg/constants/ | 1 | Shared constants |
| pkg/logger/ | 1 | Logging utilities |
| pkg/timeutil/ | 1 | Time formatting |
Key File Organization
The repository follows Go best practices with feature-based file organization:
- Engine files: claude_engine.go, copilot_engine.go, codex_engine.go, etc.
- Create operations: create_issue.go, create_pr.go, create_discussion.go
- Specialized functionality per file
Identified Issues
1. Validation Functions Scattered Across Multiple Files
Issue: Validation functions exist in 9+ files outside the dedicated validation.go file, violating the single responsibility principle for validation logic.
Files with Misplaced Validation Functions:
- pkg/workflow/compiler.go - Contains validation logic that should be in validation.go
- pkg/workflow/docker.go:86 - validateDockerImage() function
- pkg/workflow/engine.go:261,315 - validateEngine(), validateSingleEngineSpecification()
- pkg/workflow/expression_safety.go:66,141 - validateExpressionSafety(), validateSingleExpression()
- pkg/workflow/mcp-config.go:1034,1046 - validateStringProperty(), validateMCPRequirements()
- pkg/workflow/npm.go:45 - validateNpxPackages()
- pkg/workflow/pip.go:49,84,113,174 - Multiple Python package validation functions
- pkg/workflow/strict_mode.go:43,72,94,115,155 - Five strict mode validation functions
- pkg/workflow/template.go:54 - validateNoIncludesInTemplateRegions()
Current State:
- validation.go has 30+ validation functions (primary validation file)
- 9 other files contain 20+ additional validation functions
Recommendation:
- Move domain-specific validation to appropriate domain files (e.g., Docker validation can stay in docker.go)
- Move general validation functions to validation.go
- Consider creating validation sub-files if validation.go becomes too large (e.g., validation_packages.go, validation_strict_mode.go)
Estimated Impact: Medium - Improved code organization and easier testing of validation logic
2. Exact Duplicate JavaScript Function: sanitizeLabelContent
Issue: The sanitizeLabelContent function appears identically in two JavaScript files.
Duplicate Occurrences:
Occurrence 1: pkg/workflow/js/create_issue.cjs:4-17
    function sanitizeLabelContent(content) {
      if (!content || typeof content !== "string") {
        return "";
      }
      let sanitized = content.trim();
      sanitized = sanitized.replace(/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/g, "");
      sanitized = sanitized.replace(/\x1b\[[0-9;]*[mGKH]/g, "");
      sanitized = sanitized.replace(
        /(^|[^\w`])@([A-Za-z0-9](?:[A-Za-z0-9-]{0,37}[A-Za-z0-9])?(?:\/[A-Za-z0-9._-]+)?)/g,
        (_m, p1, p2) => `${p1}\`@${p2}\``
      );
      sanitized = sanitized.replace(/[<>&'"]/g, "");
      return sanitized.trim();
    }
Occurrence 2: pkg/workflow/js/add_labels.cjs:4-17 (identical implementation)
Code Similarity: 100% identical (14 lines)
Recommendation:
- Extract to shared utility module (e.g., pkg/workflow/js/shared_utils.cjs)
- Import in both files instead of duplicating
- Benefits: Single source of truth, easier maintenance, reduced code size
Estimated Impact: Low effort, high maintainability benefit
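The "100% identical" figure above can be checked mechanically by extracting the function text from each file and comparing it after whitespace normalization. The following is a minimal sketch under stated assumptions, not the method used for this report: `extractFunction`, `normalize`, and `isExactDuplicate` are hypothetical helpers, the brace matching is naive (braces inside strings, regex literals, or comments would confuse it), and the two inline sources are shortened stand-ins rather than the real contents of create_issue.cjs and add_labels.cjs.

```javascript
// Hypothetical helper: return the source text of `function <name>(...) {...}`
// by counting braces from the first "{" after the declaration.
// Naive: braces inside strings, regex literals, or comments would confuse it.
function extractFunction(source, name) {
  const start = source.indexOf(`function ${name}(`);
  if (start === -1) return null;
  const open = source.indexOf("{", start);
  if (open === -1) return null;
  let depth = 0;
  for (let i = open; i < source.length; i++) {
    if (source[i] === "{") depth++;
    else if (source[i] === "}") {
      depth--;
      if (depth === 0) return source.slice(start, i + 1);
    }
  }
  return null; // unbalanced braces
}

// Collapse whitespace runs so indentation and line breaks do not matter.
function normalize(code) {
  return code.replace(/\s+/g, " ").trim();
}

// True only when both sources define `name` with the same normalized text.
function isExactDuplicate(sourceA, sourceB, name) {
  const a = extractFunction(sourceA, name);
  const b = extractFunction(sourceB, name);
  return a !== null && b !== null && normalize(a) === normalize(b);
}

// Shortened stand-ins for the two files (not the real file contents).
const fileA = `
function sanitizeLabelContent(content) {
  if (!content || typeof content !== "string") {
    return "";
  }
  return content.trim();
}
`;
const fileB = `
const core = {};
function sanitizeLabelContent(content) {
    if (!content || typeof content !== "string") { return ""; }
    return content.trim();
}
`;

console.log(isExactDuplicate(fileA, fileB, "sanitizeLabelContent")); // true
```

Whitespace normalization is deliberately the only tolerance here: renamed variables or reordered statements count as a difference, which matches the "exact duplicate" sense used in this issue.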
3. Duplicate Sanitization Functions (Go and JavaScript)
Issue: Multiple sanitization functions exist across different languages and files with similar purposes.
Sanitization Functions Found:
Go Functions:
- pkg/workflow/strings.go:75 - SanitizeName()
- pkg/workflow/strings.go:157 - SanitizeWorkflowName()
- pkg/workflow/workflow_name.go:12 - SanitizeIdentifier()
JavaScript Functions:
- pkg/workflow/js/sanitize.cjs:14 - sanitizeContent()
- pkg/workflow/js/parse_firewall_logs.cjs:242 - sanitizeWorkflowName()
- pkg/workflow/js/create_issue.cjs:4 - sanitizeLabelContent()
- pkg/workflow/js/add_labels.cjs:4 - sanitizeLabelContent() (duplicate)
Analysis:
- Go sanitization is reasonably consolidated in strings.go and workflow_name.go
- JavaScript sanitization is scattered and duplicated
- Some naming inconsistency (SanitizeIdentifier vs SanitizeWorkflowName)
Recommendation:
- ✅ Go side is acceptable - Keep as is
- ❌ JavaScript side needs consolidation - Create shared sanitization utilities module
Estimated Impact: Medium - Reduces JavaScript code duplication
4. Parsing Functions Distributed Across 15+ Files
Issue: Parsing functions are distributed across many files instead of being centralized in parser-related files or following a clear pattern.
Files with Parse Functions:
- pkg/workflow/time_delta.go - Multiple date/time parsing functions
- pkg/workflow/comment.go - ParseCommandEvents()
- pkg/workflow/dependabot.go - parseNpmPackage(), parsePipPackage(), parseGoPackage()
- pkg/workflow/create_discussion.go - parseDiscussionsConfig()
- pkg/workflow/create_pr_review_comment.go - parsePullRequestReviewCommentsConfig()
- pkg/workflow/threat_detection.go - parseThreatDetectionConfig()
- pkg/workflow/expressions.go - Expression parsing
- pkg/workflow/frontmatter_extraction.go - Frontmatter parsing
- And 7+ more files...
Analysis:
- Domain-specific parsing (e.g., time parsing in time_delta.go) ✅ Good organization
- Config parsing (e.g., parseDiscussionsConfig()) ✅ Acceptable in feature files
- Generic parsing utilities scattered across multiple files ⚠️ Could be improved
Recommendation:
- ✅ Keep domain-specific parsers in their respective files (time, expressions, etc.)
- ✅ Keep config parsers in feature-specific files (create_discussion.go, etc.)
- ⚠️ Consider extracting common parsing patterns if code duplication is found
Estimated Impact: Low priority - Current organization is mostly acceptable
5. Helper/Utility File Organization
Issue: Helper and utility files exist in both pkg/cli/ and pkg/workflow/ with varying naming conventions.
Current Helper Files:
CLI Package:
- frontmatter_utils.go - Frontmatter manipulation utilities
- repeat_utils.go - Retry logic utilities
- shared_utils.go - General shared utilities
Workflow Package:
- engine_helpers.go - Engine-specific helper functions
- prompt_step_helper.go - Prompt step utilities
- strings.go - String manipulation utilities
- safe_outputs_env_test_helpers.go - Test helper (appropriately named)
Analysis:
- ✅ Good: Naming convention with _utils and _helpers suffixes
- ✅ Good: Test helpers clearly identified
- ⚠️ Mixed: Some utilities specific to domain (good), others generic (could be consolidated)
Recommendation:
- ✅ Keep current organization - It's reasonable and follows Go conventions
- Consider documenting the distinction between "utils" and "helpers" in contribution guidelines
- Monitor for utility function sprawl in future
Estimated Impact: Very low - Current state is acceptable
Semantic Function Clustering Results
Cluster 1: Validation Functions ⚠️ (Scattered)
Pattern: validate* functions
Total Found: 50+ validation functions
Primary File: validation.go (30+ functions)
Scattered Across: 9 additional files
Analysis: While having a primary validation file is good, too many validation functions are scattered. This creates maintenance challenges and makes it harder to understand validation logic.
Cluster 2: Sanitization Functions ⚠️ (Partially Consolidated)
Pattern: sanitize* or Sanitize* functions
Total Found: 10+ functions (Go + JavaScript)
Files:
- Go: strings.go, workflow_name.go (good consolidation)
- JavaScript: 4+ files with duplicates (needs improvement)
Analysis: Go side is well-organized, JavaScript side has duplicates.
Cluster 3: Parsing Functions ✅ (Acceptable)
Pattern: parse* or Parse* functions
Total Found: 40+ parsing functions
Distribution: Spread across 15+ files based on domain
Analysis: Most parsing functions are appropriately placed in domain-specific files. This is good organization.
Cluster 4: Rendering Functions ✅ (Well Organized)
Pattern: render* or Render* functions
Total Found: 30+ rendering functions
Organization: Test files + engine_helpers.go + specific engine files
Analysis: Rendering logic is appropriately distributed. No consolidation needed.
Cluster 5: Formatting Functions ✅ (Well Organized)
Pattern: format* or Format* functions
Total Found: 20+ formatting functions
Key Files: engine_helpers.go, js.go, permissions_validator.go
Analysis: Formatting functions are reasonably organized by purpose.
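The naming-pattern side of the clustering above can be approximated with a simple prefix grouping. This is only an illustrative sketch: it covers the `validate*`, `sanitize*`, `parse*`, `render*`, and `format*` patterns listed in the five clusters and does not reproduce the Serena semantic analysis. `clusterByPrefix` is a hypothetical helper; the sample names are taken from this report.

```javascript
// Verb prefixes taken from the five clusters in this report.
const CLUSTER_PREFIXES = ["validate", "sanitize", "parse", "render", "format"];

// Group function names by leading verb, ignoring case so that exported
// (ParseX) and unexported (parseX) Go names land in the same cluster.
function clusterByPrefix(names) {
  const clusters = new Map(CLUSTER_PREFIXES.map((p) => [p, []]));
  const unclustered = [];
  for (const name of names) {
    const lower = name.toLowerCase();
    const prefix = CLUSTER_PREFIXES.find((p) => lower.startsWith(p));
    if (prefix) clusters.get(prefix).push(name);
    else unclustered.push(name);
  }
  return { clusters, unclustered };
}

const { clusters, unclustered } = clusterByPrefix([
  "validateDockerImage",
  "SanitizeName",
  "sanitizeLabelContent",
  "ParseCommandEvents",
  "parseNpmPackage",
  "validateEngine",
]);
console.log(clusters.get("validate")); // [ 'validateDockerImage', 'validateEngine' ]
console.log(unclustered.length); // 0
```

Prefix matching alone will miss functions that validate or parse without saying so in their name, which is presumably why the report combines it with semantic analysis and manual review.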
Refactoring Recommendations
Priority 1: High Impact, Low Effort
1.1 Consolidate Duplicate JavaScript Function
Task: Extract sanitizeLabelContent to shared utility module
Steps:
- Create pkg/workflow/js/label_utils.cjs with the sanitization function
- Update create_issue.cjs to import from shared module
- Update add_labels.cjs to import from shared module
- Add tests for the shared function
Estimated Effort: 1-2 hours
Benefits:
- Eliminates 14 lines of duplicate code
- Single source of truth for label sanitization
- Easier to test and maintain
1.2 Review and Document Validation Function Organization
Task: Create guidelines for where validation functions should live
Steps:
- Document validation function placement rules in CONTRIBUTING.md:
  - Domain-specific validations → domain files (e.g., Docker validation in docker.go)
  - General workflow validations → validation.go
  - Complex validation logic → consider sub-files
- Review the 9 files with scattered validation functions
- Move or document exceptions
Estimated Effort: 2-3 hours
Benefits:
- Clear guidelines for contributors
- Prevents future validation sprawl
- Improves code discoverability
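Once placement rules are written down, a small script could flag `validate*` functions that live outside the agreed files. The sketch below is illustrative only: `findMisplacedValidators` and the allow-list are hypothetical, the regex is a grep-style approximation that ignores generic functions and commented-out code, and the Go snippets (including their signatures) are invented stand-ins rather than the real sources.

```javascript
// Matches Go function and method declarations whose name starts with
// "validate" or "Validate", e.g. `func validateEngine(` or
// `func (c *Compiler) validateFoo(`.
const VALIDATE_FUNC = /^func\s+(?:\([^)]*\)\s*)?([vV]alidate\w*)\s*\(/gm;

// Hypothetical allow-list: files where validation functions are expected.
// The real list would follow whatever rules end up in CONTRIBUTING.md.
const ALLOWED_FILES = new Set(["validation.go", "docker.go"]);

// `files` maps a file name to its Go source text.
function findMisplacedValidators(files) {
  const findings = [];
  for (const [fileName, source] of Object.entries(files)) {
    if (ALLOWED_FILES.has(fileName) || fileName.endsWith("_test.go")) continue;
    for (const match of source.matchAll(VALIDATE_FUNC)) {
      findings.push({ file: fileName, func: match[1] });
    }
  }
  return findings;
}

// Invented sample sources; signatures are placeholders.
const findings = findMisplacedValidators({
  "validation.go": "package workflow\n\nfunc validateWorkflow() error { return nil }\n",
  "npm.go": "package workflow\n\nfunc (c *Compiler) validateNpxPackages() error { return nil }\n",
  "strings.go": "package workflow\n\nfunc SanitizeName(s string) string { return s }\n",
});
console.log(findings); // [ { file: 'npm.go', func: 'validateNpxPackages' } ]
```

Documented exceptions could be handled by extending the allow-list, so the check reports only new sprawl.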
Priority 2: Medium Impact, Medium Effort
2.1 Consolidate JavaScript Sanitization Utilities
Task: Create shared JavaScript sanitization module
Steps:
- Create pkg/workflow/js/sanitize_shared.cjs
- Move sanitizeLabelContent (from Priority 1)
- Consider consolidating other JS sanitization functions
- Update imports in dependent files
- Add comprehensive tests
Estimated Effort: 3-4 hours
Benefits:
- Centralized JavaScript sanitization logic
- Reduced duplication
- Easier to apply consistent sanitization rules
2.2 Consider Validation Sub-Files
Task: Split validation.go if it becomes too large
Approach: Only if validation.go exceeds 1000 lines or has distinct validation domains
Suggested Split (if needed):
- validation.go - Core workflow validations
- validation_packages.go - Package validation (npm, pip, etc.)
- validation_strict_mode.go - Strict mode validations
- validation_features.go - Repository feature validations
Estimated Effort: 4-6 hours (only if needed)
Benefits:
- Easier navigation of validation logic
- Logical grouping of related validations
Priority 3: Long-term Improvements
3.1 Monitor for Utility Function Sprawl
Task: Establish guidelines for when to create new utility files
Guidelines:
- Functions used in 3+ files → move to utility file
- Domain-specific utilities → keep in domain file
- Test helpers → suffix with _test_helpers.go
Estimated Effort: Ongoing code review discipline
Benefits: Prevents future utility sprawl
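The "used in 3+ files" guideline can be approximated by counting the files that mention a function name. This is a rough sketch under stated assumptions: `filesUsing` and `shouldMoveToUtility` are hypothetical helpers, the match is textual (it also counts the defining file, comments, and strings), `funcName` is assumed to be a plain identifier with no regex metacharacters, and the sample sources are invented.

```javascript
// Return the names of files that reference `funcName` as a whole identifier.
// `files` maps a file name to its source text.
function filesUsing(funcName, files) {
  const pattern = new RegExp(`\\b${funcName}\\b`);
  return Object.entries(files)
    .filter(([, source]) => pattern.test(source))
    .map(([fileName]) => fileName);
}

// Guideline from the list above: used in 3+ files → candidate for a utility file.
function shouldMoveToUtility(funcName, files, threshold = 3) {
  return filesUsing(funcName, files).length >= threshold;
}

// Invented sample sources.
const files = {
  "a.go": "func normalizeKey(s string) string { return s }",
  "b.go": "x := normalizeKey(name)",
  "c.go": "y := normalizeKey(other)",
  "d.go": "z := normalizeKeys(list)",
};
console.log(filesUsing("normalizeKey", files)); // [ 'a.go', 'b.go', 'c.go' ]
console.log(shouldMoveToUtility("normalizeKey", files)); // true
```

The word-boundary match keeps `normalizeKeys` in d.go from being counted as a use of `normalizeKey`.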
File Organization Assessment
Well-Organized Areas ✅
- Engine Architecture: Each engine has its own file (claude, copilot, codex)
- Create Operations: Separate files for each creation type (issue, PR, discussion)
- String Utilities: Consolidated in strings.go
- Test Organization: Clear _test.go suffix convention
Areas for Improvement ⚠️
- Validation Functions: Too scattered (9+ files)
- JavaScript Duplicates: Exact duplicates exist
- Sanitization (JS): Could be more consolidated
Implementation Checklist
Analysis Metadata
- Total Go Files Analyzed: 180 (excluding test files)
- Total Functions Cataloged: 500+ functions across all files
- Function Clusters Identified: 5 major clusters (validation, sanitization, parsing, rendering, formatting)
- Outliers Found: 20+ validation functions in wrong files
- Exact Duplicates Detected: 2+ JavaScript functions (100% match)
- Near-Duplicates Detected: Multiple sanitization functions with similar purpose
- Detection Method: Serena semantic code analysis + grep pattern analysis + manual review
- Analysis Date: 2025-11-04
- Packages Analyzed: cli (60 files), workflow (108 files), parser (6 files), console (3 files), others (3 files)
Conclusion
The gh-aw codebase demonstrates generally good organization with clear package boundaries and feature-based file structure. The primary opportunities for improvement are:
- ⚠️ High Priority: Eliminate JavaScript code duplication (quick win)
- ⚠️ Medium Priority: Consolidate scattered validation functions
- ✅ Low Priority: Current helper organization is acceptable
The refactoring recommendations focus on high-impact, low-effort improvements that will enhance maintainability without requiring extensive restructuring. Most of the codebase follows Go best practices effectively.
🔧 Semantic Function Clustering Analysis
Analysis of repository: githubnext/gh-aw
Executive Summary
Analysis of 180 non-test Go files across the pkg/ directory revealed several refactoring opportunities through semantic function clustering and duplicate detection. The codebase is generally well-organized with clear package boundaries, but there are opportunities to improve code organization by consolidating validation functions, eliminating duplicate code, and centralizing scattered utilities.
Note: This analysis focused on non-test Go files (.go excluding *_test.go) and associated JavaScript files in the pkg/ directory. The findings represent refactoring opportunities discovered through semantic function clustering, naming pattern analysis, and duplicate detection using Serena's code analysis tools.