---
description: Safely convert private repositories to public open source. Covers legal review, secret scanning, history sanitization, documentation, vulnerability scanning, and post-publication hardening.
category: general
---
flowchart TD
    _HEADER_["
Open-Sourcing Workflow
Safely convert private repositories to public open source. Covers legal review, secret scanning, history sanitization, documentation, vulnerability scanning, and post-publication hardening.
"]:::headerStyle
    classDef headerStyle fill:none,stroke:none
    subgraph _MAIN_[" "]
        %% Phase 1: Pre-Flight
        subgraph PreFlight["Phase 1: Pre-Flight"]
            LEGAL[Legal Review] --> TOOLS[Install Tools]
            TOOLS --> CLONE[Fresh Mirror Clone]
        end

        %% Phase 2: Secret Scanning & Audit
        subgraph Scanning["Phase 2: Secret Scanning"]
            CLONE --> GITLEAKS[Gitleaks Scan]
            CLONE --> TRUFFLEHOG[TruffleHog Scan]
            GITLEAKS & TRUFFLEHOG --> PATTERN[Pattern Audit]
            PATTERN --> PERSONAL[Personal Detail Audit]
            PERSONAL --> SCAN_RESULT{Secrets Found?}
            SCAN_RESULT -->|Yes| SANITIZE
            SCAN_RESULT -->|No| DEPS
        end

        %% Phase 3: History Sanitization
        subgraph Sanitize["Phase 3: History Sanitization"]
            SANITIZE[Build Replacements File] --> FILTER_TEXT[Replace Text Patterns]
            FILTER_TEXT --> FILTER_FILES[Remove Sensitive Files]
            FILTER_FILES --> FILTER_AUTHORS[Fix Author Info]
            FILTER_AUTHORS --> STRIP_MSGS[Scrub Commit Messages]
            STRIP_MSGS --> STRIP_EXIF[Strip Image EXIF]
            STRIP_EXIF --> RESCAN[Re-scan with Gitleaks]
            RESCAN --> CLEAN{History Clean?}
            CLEAN -->|No| SANITIZE
            CLEAN -->|Yes| DEPS
        end

        %% Phase 4: Code Cleanup & Dependencies
        subgraph Cleanup["Phase 4: Code Cleanup"]
            DEPS[Audit Dependencies] --> PRIVATE_DEPS{Private Deps?}
            PRIVATE_DEPS -->|Yes| REPLACE_DEPS[Replace or Inline]
            PRIVATE_DEPS -->|No| HYGIENE
            REPLACE_DEPS --> HYGIENE[Technical Hygiene]
            HYGIENE --> VULN_SCAN[Vulnerability Scan]
            VULN_SCAN --> BUILD_CHECK[Verify Clean Build]
        end

        %% Phase 5: Documentation & License
        subgraph Docs["Phase 5: Documentation"]
            BUILD_CHECK --> LICENSE[Choose License]
            LICENSE --> README[Write README]
            README --> CONTRIB[Create CONTRIBUTING.md]
            CONTRIB --> SECURITY_MD[Create SECURITY.md]
            SECURITY_MD --> TEMPLATES[Issue and PR Templates]
        end

        %% Phase 6: Final Verification & Release
        subgraph Release["Phase 6: Release"]
            TEMPLATES --> FINAL_SCAN[Final Gitleaks Scan]
            FINAL_SCAN --> FINAL_OK{All Clear?}
            FINAL_OK -->|No| SANITIZE
            FINAL_OK -->|Yes| TAG[Tag v0.1.0]
            TAG --> PUBLISH[Push to Public Remote]
            PUBLISH --> CI_SETUP[Set Up CI and Dependabot]
        end

        %% Phase 7: Post-Publication
        subgraph PostPub["Phase 7: Post-Publication"]
            CI_SETUP --> ROTATE[Rotate All Secrets]
            ROTATE --> MONITOR[Enable Secret Scanning Alerts]
            MONITOR --> ARCHIVE[Archive Private Repo]
            ARCHIVE --> PROTECT[Set Up Branch Protection]
        end

        click LEGAL "#" "**Legal Review**\nComplete before any code work.\n- Get written employer IP clearance\n- Evaluate patent exposure (use Apache 2.0 or GPL v3 for patented techniques)\n- Audit dependency license compatibility\n- Check export control for non-standard crypto\n- Choose contributor model: DCO (lightweight) or CLA (relicensing flexibility)\n\nTools: `pip-licenses`, `npx license-checker`, `go-licenses`"
        click TOOLS "#" "**Install Required Tools**\n`brew install gitleaks git-filter-repo exiftool`\n`brew install trufflehog` (optional)\n`brew install semgrep trivy` (vuln scanning)\n\nAll tools are available via Homebrew."
        click CLONE "#" "**Fresh Mirror Clone**\nAlways work on a fresh clone to avoid corrupting the original.\n\n`git clone --mirror repo.git repo-clean.git`\n`cd repo-clean.git`\n\nNever sanitize the original repository directly."
        click GITLEAKS "#" "**Gitleaks Secret Scan**\nScan both current code and full git history.\n\n`gitleaks detect --source . --verbose`\n`gitleaks detect --source . --verbose --log-opts='--all'`\n\nOutput as JSON for parsing:\n`gitleaks detect --source . --report-format json --report-path leaks.json`\n\nDetects: AWS keys, GitHub tokens, private keys, database URLs, OAuth tokens, high-entropy strings."
        click TRUFFLEHOG "#" "**TruffleHog Secret Scan**\nMore comprehensive scanner, supports verified detection.\n\n`trufflehog git file://. --only-verified`\n`trufflehog git file://. --include-detectors all`\n\nCross-reference with Gitleaks results for maximum coverage."
        click PATTERN "#" "**Sensitive Pattern Audit**\nSearch for patterns that secret scanners miss.\n\n- Internal domains: `.internal`, `.local`, `.corp`\n- Internal IPs: `10.*`, `172.16.*` to `172.31.*`, `192.168.*`\n- Hardcoded credentials: `password=`, `api_key=`, `token=`\n- Private registry URLs in configs\n- CI/CD configs with internal references\n\nCheck: `.env*`, `*.yaml`, `config.*`, `docker-compose.*`, `Dockerfile*`, `pyproject.toml`, `package.json`, `*.tf`"
        click PERSONAL "#" "**Personal Detail Audit**\nThe most dangerous leaks - they do not look like secrets.\n\n- Git authors: real names, work emails in commit history\n- Commit messages: ticket IDs (TRCKR-55, JIRA-123), internal codenames\n- Personal domains: custom hosting URLs\n- Machine hostnames in configs or paths\n- Home directory paths: `/home/username/`, `/Users/name/`\n- Package metadata: author, email, homepage fields\n- Agent configs: CLAUDE.md, .claude/ directories\n- Image EXIF: GPS coordinates, device owner names"
        click SCAN_RESULT "#" "**Decision: Secrets Found?**\nIf any scanner or audit found issues, proceed to history sanitization.\nIf clean, skip ahead to dependency audit."
        click SANITIZE "#" "**Build Replacements File**\nCreate a text file mapping old values to new:\n\n`cat > replacements.txt`\n- `internal.company.com==>example.com`\n- `api.company.internal==>api.example.com`\n- `user@company.com==>user@example.com`\n- `AKIAIOSFODNN7EXAMPLE==>YOUR_AWS_ACCESS_KEY`\n\nOr use the automated script:\n`scripts/generate-replacements.sh`"
        click FILTER_TEXT "#" "**Replace Text Patterns**\nRewrite all matching text across entire git history.\n\n`git filter-repo --replace-text replacements.txt --force`\n\nThis rewrites every commit that contains any matched pattern."
        click FILTER_FILES "#" "**Remove Sensitive Files from History**\nRemove files that should never have been committed.\n\n`git filter-repo --path secrets.json --invert-paths`\n`git filter-repo --path .env.production --invert-paths`\n`git filter-repo --path-glob '*.pem' --invert-paths`\n`git filter-repo --strip-blobs-bigger-than 10M`"
        click FILTER_AUTHORS "#" "**Fix Author Information**\nCommit history embeds name and email in every commit.\n\n`git log --all --format='%aN <%aE>' | sort -u`\n\nCreate `.mailmap` (new identity first, then the identity to replace; addresses are placeholders):\n`Contributor <contributor@example.com> Real Name <real.name@company.com>`\n\nApply: `git filter-repo --mailmap .mailmap --force`"
        click STRIP_MSGS "#" "**Scrub Commit Messages**\nRemove ticket references, internal URLs, and codenames.\n\n`git log --all --oneline | grep -oE '[A-Z]{2,10}-[0-9]{1,6}' | sort -u`\n\nStrip ticket refs with `git filter-repo --message-callback` using a regex to remove JIRA-style IDs.\n\nReplace URLs:\n`git filter-repo --replace-message replacements.txt --force`"
        click STRIP_EXIF "#" "**Strip Image EXIF Data**\nRemove GPS coordinates, device names, and camera owner info.\n\n`exiftool -all= -overwrite_original *.jpg *.png *.jpeg`\n\nRecursive strip:\n`find . -type f \\( -iname '*.jpg' -o -iname '*.png' \\) -exec exiftool -all= -overwrite_original {} +`"
        click RESCAN "#" "**Re-scan After Sanitization**\nRun Gitleaks again on the rewritten history to verify all secrets were removed.\n\n`gitleaks detect --source . --verbose --log-opts='--all' --exit-code 1`\n\nMust exit clean before proceeding."
        click CLEAN "#" "**Decision: History Clean?**\nIf re-scan finds remaining issues, loop back and add more replacements.\nIf clean, proceed to dependency audit."
        click DEPS "#" "**Audit Dependencies**\nCheck for private or incompatible dependencies.\n\nPython: `grep -E 'git\\+ssh://|@private' requirements.txt pyproject.toml`\nNode: `grep -E '@company|private-registry' package.json .npmrc`\nGo: `grep -E 'module|// .*@' go.mod`\n\nAlso check license compatibility:\n`pip-licenses`, `npx license-checker`, `go-licenses`"
        click PRIVATE_DEPS "#" "**Decision: Private Dependencies?**\nIf any dependencies reference private registries, internal repos, or git+ssh URLs, they must be replaced before release."
        click REPLACE_DEPS "#" "**Replace Private Dependencies**\nOptions for each private dependency:\n- Publish the dep publicly (PyPI, npm, Go module)\n- Inline the code if small\n- Remove the feature if not core\n- Document as optional (user must provide)"
        click HYGIENE "#" "**Technical Hygiene**\nItems commonly missed:\n- Remove dead code (`vulture` for Python, `deadcode` for Go)\n- Scrub TODO/FIXME/HACK comments referencing people or teams\n- Remove private Docker registries from CI configs\n- Check `.gitmodules` for private repo URLs\n- Remove or opt-in telemetry/phone-home endpoints\n- Replace real PII in test fixtures with fake data\n- Remove binary artifacts and compiled files\n- Scrub internal codenames from variables\n- Ensure `.gitignore` covers `.DS_Store`, `.env`, IDE configs"
        click VULN_SCAN "#" "**Vulnerability Scan**\nScan for known vulnerabilities before public release.\n\n`semgrep scan --config=p/owasp-top-ten --config=p/security-audit`\n`trivy fs . --severity HIGH,CRITICAL`\n`govulncheck ./...` (Go projects)\n\nFix critical and high severity issues before release."
        click BUILD_CHECK "#" "**Verify Clean Build**\nClone to a fresh directory and verify outsiders can build with only public dependencies.\n\n`git clone repo-clean.git /tmp/test-build`\n`cd /tmp/test-build`\nRun the full build and test suite.\n\nIf it fails, fix dependency or build issues."
        click LICENSE "#" "**Choose and Add License**\nQuick guide:\n- **MIT** - Maximum adoption, minimal restrictions\n- **Apache 2.0** - Corporate use, patent protection\n- **GPL-3.0** - Require derivatives stay open source\n- **BSD-3** - Similar to MIT with attribution\n\nAdd SPDX identifiers to source files:\n`// SPDX-License-Identifier: MIT`"
        click README "#" "**Write README.md**\nStructure:\n- Project name and description\n- Features list\n- Installation instructions\n- Quick start example\n- Documentation links\n- Contributing link\n- License\n\nAdd badges: CI status, Go Report Card, docs, license."
        click CONTRIB "#" "**Create CONTRIBUTING.md**\nHow to contribute:\n- Fork, branch, change, test, PR\n- Code style expectations\n- Testing requirements\n- Review process"
        click SECURITY_MD "#" "**Create SECURITY.md**\nVulnerability reporting instructions.\n\nPoint to GitHub private vulnerability reporting:\n`https://github.com/OWNER/REPO/security/advisories/new`"
        click TEMPLATES "#" "**Issue and PR Templates**\nCreate `.github/ISSUE_TEMPLATE/bug_report.md` with OS, version, and reproduction steps.\n\nCreate `.github/PULL_REQUEST_TEMPLATE.md` with what changed and how to test.\n\nAlso add `CODE_OF_CONDUCT.md` (Contributor Covenant)."
        click FINAL_SCAN "#" "**Final Gitleaks Scan**\nOne last scan of the fully prepared repository.\n\n`gitleaks detect --source . --verbose --log-opts='--all' --exit-code 1`\n\nThis is the gate - no release if secrets are found."
        click FINAL_OK "#" "**Decision: All Clear?**\nIf the final scan finds anything, loop back to sanitization.\nIf clean, proceed to release."
        click TAG "#" "**Tag First Release**\n`git tag v0.1.0`\n`git push origin v0.1.0`\n\nWithout a tag, package managers have no released version to pin, and Go modules fall back to pseudo-versions."
        click PUBLISH "#" "**Push to Public Remote**\nAdd the public remote and push.\n\n`git remote add public https://github.com/OWNER/REPO.git`\n`git push public --all`\n`git push public --tags`"
        click CI_SETUP "#" "**Set Up CI and Dependabot**\nAdd `.github/workflows/ci.yml` for tests and linting.\nAdd `.github/workflows/release.yml` triggered on tag push (Goreleaser).\nAdd `.github/dependabot.yml` for weekly dependency updates.\n\nEnsure unit tests run without API keys (mock external clients)."
        click ROTATE "#" "**Rotate ALL Secrets**\nEvery secret that was ever in the repo must be rotated, even if removed from history.\n\nAssume any secret that touched git is compromised.\n\n- AWS access keys\n- API tokens\n- Database passwords\n- OAuth credentials"
        click MONITOR "#" "**Enable Secret Scanning Alerts**\nEnable GitHub secret scanning on the public repo.\n\nSet up notifications for any new secret detections."
        click ARCHIVE "#" "**Archive Private Repo**\nArchive the original private repository to prevent accidental pushes.\n\nUpdate internal docs to note the public repo location."
        click PROTECT "#" "**Set Up Branch Protection**\nRequire PR reviews before merging on the public repo.\n\nEnable status checks for CI to pass before merge."

        classDef preflight fill:#e8daef,stroke:#b07cc6
        classDef scanning fill:#d1ecf1,stroke:#7ec8d8
        classDef sanitize fill:#ffeaa7,stroke:#e0c040
        classDef cleanup fill:#fff3cd,stroke:#f0c040
        classDef docs fill:#d4edda,stroke:#5cb85c
        classDef release fill:#d1ecf1,stroke:#7ec8d8
        classDef postpub fill:#f8d7da,stroke:#e06070
        classDef decision fill:#fff3cd,stroke:#f0c040

        class LEGAL,TOOLS,CLONE preflight
        class GITLEAKS,TRUFFLEHOG,PATTERN,PERSONAL scanning
        class SANITIZE,FILTER_TEXT,FILTER_FILES,FILTER_AUTHORS,STRIP_MSGS,STRIP_EXIF,RESCAN sanitize
        class DEPS,REPLACE_DEPS,HYGIENE,VULN_SCAN,BUILD_CHECK cleanup
        class LICENSE,README,CONTRIB,SECURITY_MD,TEMPLATES docs
        class FINAL_SCAN,TAG,PUBLISH,CI_SETUP release
        class ROTATE,MONITOR,ARCHIVE,PROTECT postpub
        class SCAN_RESULT,CLEAN,PRIVATE_DEPS,FINAL_OK decision
    end
    style _MAIN_ fill:none,stroke:none,padding:0
    _HEADER_ ~~~ _MAIN_