GitOps Remediation

Overview

GitOps Remediation is Kubegrade's workflow for converting operational findings (drift, incidents, upgrade blockers, optimization recommendations) into reviewable pull requests and applying changes through your existing GitOps system.

Supported repo layouts

Common layouts
- Monorepo (all environments/services)
- Environment-per-folder
- Service-per-folder
- Repo-per-service
- Platform repo + app repos
- Split Helm values repos
- Terraform module + environment composition repos

Best practice
Require users to define:
- Repo path mappings per cluster/environment
- File ownership / CODEOWNERS expectations
- Branch strategy

PR generation rules (naming, commit strategy, reviewers)

Branch naming (recommended)
kubegrade/<module>/<cluster>/<issue-or-workflow-id>

Examples
- kubegrade/drift/prod-eu1/restore-nginx-limits
- kubegrade/upgrade/staging/1.29-preflight-fixes

Commit strategy
- One commit per logical remediation (preferred)
- Squash option for noisy generated changes
- Signed commits if required by org policy

Reviewers
- Auto-assign based on CODEOWNERS
- Add platform approvers for prod scopes
- Tag relevant service owners for workload-level changes

Policy checks before PR

Before opening a PR, Kubegrade validates:
- Scope and permissions
- Policy compliance
- Environment restrictions
- Change type restrictions
- Required approvals
- Optional dry-run validation/tests (if integrated)

If checks fail:
- Open recommendation without PR
- Show blocking reasons and remediation steps

Merge-to-apply mechanics

Typical flow
1. PR merged in Git provider
2. GitOps tool (Argo CD / Flux) detects commit
3. GitOps sync applies changes to cluster
4. Kubegrade watches rollout status / post-change signals
5. Audit trail links PR → sync → outcome

Recommended controls
- Sync windows
- Manual sync for production
- Health checks before marking complete

Rollback PRs

Rollback approaches
- Revert generated PR commit(s)
- Generate rollback PR from known-good state
- Partial rollback for specific resources (advanced / policy-controlled)

Rollback triggers (examples)
- Failed post-change health checks
- Elevated error rates
- Dependency graph degradation
- SLO breach detected via integrations

Auditability

Change provenance
Every GitOps remediation should be traceable.

Provenance fields (recommended)
- Workflow source (drift/troubleshoot/upgrade/etc.)
- Cluster/environment/namespace
- Agent/workflow version
- Trigger type (manual/scheduled/alert-driven)
- Approvers and timestamps
- PR link and commit SHA
- GitOps sync result
- Verification result
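
As a rough illustration of the provenance fields listed above, the following Python sketch shows one way a record holding them could be modelled and checked for completeness. The names used here (`ProvenanceRecord`, `Approval`, `missing_fields`, `is_traceable`) and the sample values are assumptions made for illustration only; they are not Kubegrade's actual schema or API.

```python
from dataclasses import dataclass, field, fields
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Approval:
    """One approver and the time of approval (hypothetical shape)."""
    approver: str
    approved_at: datetime


@dataclass
class ProvenanceRecord:
    """Hypothetical container for the recommended provenance fields."""
    workflow_source: str   # drift / troubleshoot / upgrade / etc.
    cluster: str
    environment: str
    namespace: str
    workflow_version: str  # agent/workflow version
    trigger_type: str      # manual / scheduled / alert-driven
    approvals: List[Approval] = field(default_factory=list)
    pr_link: Optional[str] = None
    commit_sha: Optional[str] = None
    sync_result: Optional[str] = None          # GitOps sync result
    verification_result: Optional[str] = None  # post-change verification

    def missing_fields(self) -> List[str]:
        """Return the names of fields that are still empty, in definition order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def is_traceable(self) -> bool:
        """A record counts as traceable here only when every field is filled in."""
        return not self.missing_fields()


if __name__ == "__main__":
    # Placeholder values; a record created before the PR exists is incomplete.
    record = ProvenanceRecord(
        workflow_source="drift",
        cluster="prod-eu1",
        environment="prod",
        namespace="example-namespace",
        workflow_version="example-version",
        trigger_type="manual",
    )
    record.approvals.append(Approval("example-approver", datetime.now(timezone.utc)))
    print(record.missing_fields())
    print(record.is_traceable())
```

In this sketch a record starts out incomplete and is filled in as the remediation moves from PR to sync to verification, so `missing_fields()` could serve as a simple gate before a remediation is marked complete. Whether every field should be mandatory is a policy decision that the sketch does not settle.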