Methodology
How we compute health scores for 8,517 repositories, and why.
Current dimension averages
Across 8,517 repositories. Security and Maintenance are consistently the weakest areas.
1. Philosophy
We measure repository health practices, not code quality.
A repo with excellent code but no CI, no security policy, and no tests scores poorly. A repo with boilerplate code but proper CI, documentation, and security practices scores well. This is intentional. Practices are observable, automatable, and actionable. Code quality is subjective, requires deep analysis, and varies by domain.
Important disclaimer: Presence of a file does not guarantee quality. An empty SECURITY.md passes the security policy check. A CI workflow that never runs green still counts as "CI detected." We measure whether the practice exists, not how well it is implemented. This is a known limitation we accept in exchange for being able to score thousands of repos without cloning them.
2. The six dimensions
Security
Average: 30/100. Why it matters: Software supply chain attacks increased 742% between 2019 and 2022 (Sonatype State of the Software Supply Chain). A security policy and pinned dependencies are baseline defenses that cost minutes to implement but signal that maintainers take vulnerability disclosure seriously.
| Check | Weight |
|---|---|
| SECURITY.md with contact info and disclosure process | 25 |
| GitHub Actions pinned to full commit SHAs | 15 |
| Dependency update automation (Dependabot/Renovate) | 15 |
| Explicit token permissions in workflows | 10 |
| No committed .env files | 10 |
| .gitignore present | 10 |
| CI workflows detected (branch protection proxy) | 10 |
| CODEOWNERS / OWNERS / MAINTAINERS file | 5 |
Testing
Average: 39/100. Why it matters: Google's DORA research shows teams with robust testing deploy 46x more frequently with 7x lower change failure rates (Accelerate: State of DevOps). Testing infrastructure is the foundation of deployment confidence.
| Check | Weight |
|---|---|
| CI workflows detected | 25 |
| Test files present (__tests__, .test., .spec., tests/) | 25 |
| Coverage configuration (nyc, jest, vitest, codecov) | 20 |
| Test runner configured (package.json scripts, pytest, cargo) | 15 |
| Pre-commit hooks (Husky, pre-commit-config) | 5 |
IaC projects check for Terratest, kitchen-terraform, and InSpec instead. Documentation projects check for markdown linting and link checking.
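The test-file detection described above can be sketched as a simple pattern scan over a repository file tree. This is an illustrative reconstruction, not the tool's exact implementation; the function name and regexes are ours, derived from the conventions listed in the table (`__tests__`, `.test.`, `.spec.`, `tests/`).

```python
import re

# Illustrative patterns for the test-file conventions named above.
TEST_PATTERNS = [
    re.compile(r"(^|/)__tests__/"),      # Jest-style test directories
    re.compile(r"\.test\.[a-z]+$"),      # foo.test.ts, foo.test.js
    re.compile(r"\.spec\.[a-z]+$"),      # foo.spec.ts
    re.compile(r"(^|/)tests?/"),         # tests/ or test/ directories
]

def has_test_files(paths):
    """Return True if any path in the file tree matches a test convention."""
    return any(p.search(path) for path in paths for p in TEST_PATTERNS)

print(has_test_files(["src/app.py", "tests/test_app.py"]))  # True
print(has_test_files(["src/app.ts", "src/app.spec.ts"]))    # True
print(has_test_files(["README.md", "src/main.go"]))         # False
```

Because only paths are inspected (no cloning), this check shares the presence-vs-quality limitation noted in the philosophy section: a trivial or failing test file still counts.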
Documentation
Average: 53/100. Why it matters: A project without documentation is a project without users. GitHub's 2023 Open Source Survey found that incomplete or confusing documentation is the single largest barrier to open source participation. README quality directly correlates with adoption and contribution rates.
| Check | Weight |
|---|---|
| README exists and is >500 characters | 30 |
| LICENSE file present | 20 |
| CONTRIBUTING guide | 15 |
| CHANGELOG / CHANGES / HISTORY | 15 |
| Documentation directory (docs/) or API docs | 10 |
| Repository description set | 10 |
Architecture
Average: 40/100. Why it matters: Consistent tooling (linters, formatters, type checkers) reduces defect introduction rates. A 2022 study in IEEE TSE found that projects using static analysis tools had 20-30% fewer post-release defects. We check for language-appropriate tooling across 9 language profiles.
| Check | Weight |
|---|---|
| Type safety (tsconfig, mypy, built-in type system) | 20 |
| Linter (ESLint, Biome, ruff, golangci-lint, Clippy) | 20 |
| Organized source structure (src/, lib/, cmd/) | 20 |
| Code formatter (Prettier, black, gofmt, rustfmt) | 15 |
| Build configuration (tsconfig, Cargo.toml, Makefile) | 10 |
Language profiles: TypeScript, JavaScript, Python, Go, Rust, Java, Shell, C/C++, Ruby, plus a generic fallback. IaC projects check tflint, terraform fmt, and module conventions instead.
DevOps
Average: 35/100. Why it matters: CI/CD automation is the foundation of reliable software delivery. DORA metrics consistently show that elite-performing teams have fully automated deployment pipelines. We detect 12 CI/CD systems to avoid penalizing projects that use non-GitHub CI.
| Check | Weight |
|---|---|
| CI/CD pipeline (GitHub Actions, GitLab CI, CircleCI, Jenkins, etc.) | 30 |
| Container support (Dockerfile, docker-compose) | 20 |
| Release automation (semantic-release, changesets) | 20 |
| Issue/PR templates | 15 |
| Deployment/infrastructure config (Terraform, k8s, Helm) | 5 |
Maintenance
Average: 33/100. Why it matters: An unmaintained dependency is a ticking time bomb. The Heartbleed vulnerability persisted for 2 years in OpenSSL partly because of insufficient maintenance resources. Bus factor and funding signal whether a project can sustain itself beyond a single contributor.
| Check | Weight |
|---|---|
| Last commit within 30 days (full) or 90 days (partial) | 30 |
| Latest release/tag within 180 days | 20 |
| Bus factor (4+ contributors with >5% of commits) | 20 |
| Median open issue age under 90 days | 15 |
| FUNDING.yml present (not penalized if absent) | 10* |
| 10+ stars (community interest signal) | 5 |
*Funding weight is 0 if absent -- repos are not penalized for missing FUNDING.yml. The "Recent releases" check also detects release hygiene gaps (old releases + recent commits) and repos that have never published a release.
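The bus factor check above can be sketched as follows. This is a minimal illustration of the stated rule (4+ contributors each holding more than 5% of commits); the function name and parameters are ours, not the tool's exact implementation.

```python
from collections import Counter

def bus_factor_ok(commit_authors, min_contributors=4, min_share=0.05):
    """True if at least `min_contributors` authors each account for
    more than `min_share` of all commits."""
    counts = Counter(commit_authors)
    total = sum(counts.values())
    significant = [a for a, n in counts.items() if n / total > min_share]
    return len(significant) >= min_contributors

# Four authors at 40/30/20/10 commits: all above 5%, check passes.
print(bus_factor_ok(["a"] * 40 + ["b"] * 30 + ["c"] * 20 + ["d"] * 10))  # True

# One dominant author plus tiny contributors: only one significant author.
print(bus_factor_ok(["a"] * 96 + ["b", "c", "d", "e"]))  # False
```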
3. How scoring works
Each dimension score is the percentage of check weights earned. The overall score is a weighted average across dimensions, with weights determined by the detected project type.
Worked example -- a hypothetical application-type repo with equal weights:
| Dimension | Score | Weight | Contribution |
|---|---|---|---|
| Security | 70 | 1.0 | 70 |
| Testing | 100 | 1.0 | 100 |
| Documentation | 85 | 1.0 | 85 |
| Architecture | 100 | 1.0 | 100 |
| DevOps | 50 | 1.0 | 50 |
| Maintenance | 83 | 1.0 | 83 |
| Total | | 6.0 | 488 / 6.0 = 81 (B) |
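The arithmetic in the worked example can be sketched in a few lines. This is a minimal sketch of the stated rules (dimension score = percentage of check weights earned; overall = weighted average); the function names are ours.

```python
def dimension_score(earned_weight, check_weights):
    """Percentage of check weights earned, 0-100."""
    return round(100 * earned_weight / sum(check_weights))

def overall_score(dim_scores, dim_weights):
    """Weighted average of dimension scores."""
    total = sum(s * w for s, w in zip(dim_scores, dim_weights))
    return round(total / sum(dim_weights))

# Dimension scores from the worked example, Security..Maintenance,
# with the application profile's equal weights.
scores = [70, 100, 85, 100, 50, 83]
weights = [1.0] * 6
print(overall_score(scores, weights))  # 81 -> grade B
```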
Grade scale: letter grades are derived from the overall numeric score (the 81 in the example above earns a B).
4. Project type weights
Different project types have different priorities. An IaC repo needs strong security practices (1.5x weight) because misconfigurations deploy directly to production. A library needs excellent documentation (1.5x) because adoption depends on it. A mirror repo has minimal expectations across the board.
| Dimension | App | Library | IaC | Hybrid | Docs | Runtime | Mirror |
|---|---|---|---|---|---|---|---|
| Security | 1.0 | 0.8 | 1.5 | 1.2 | 0.3 | 0.5 | 0.3 |
| Testing | 1.0 | 1.2 | 0.8 | 1.0 | 0.5 | 0.8 | 0.5 |
| Documentation | 1.0 | 1.5 | 1.0 | 1.0 | 2.0 | 1.5 | 1.5 |
| Architecture | 1.0 | 1.0 | 1.5 | 1.2 | 1.5 | 0.5 | 0.5 |
| DevOps | 1.0 | 0.5 | 0.8 | 1.0 | 0.5 | 0.5 | 0.2 |
| Maintenance | 1.0 | 1.0 | 0.8 | 0.8 | 0.5 | 1.5 | 1.5 |
The largest value in each column marks the most influential dimension for that project type. Documentation and mirror repos receive numeric scores but are not assigned letter grades (shown as N/A).
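Applying a profile from the table changes the overall score even when the dimension scores stay fixed. The sketch below reuses the worked-example scores under the IaC column; the dictionary names are ours, and the weights come straight from the table.

```python
# IaC weight profile, copied from the table above.
IAC = {"security": 1.5, "testing": 0.8, "documentation": 1.0,
       "architecture": 1.5, "devops": 0.8, "maintenance": 0.8}

# Dimension scores from the worked example in section 3.
scores = {"security": 70, "testing": 100, "documentation": 85,
          "architecture": 100, "devops": 50, "maintenance": 83}

def weighted_overall(scores, profile):
    """Weighted average of dimension scores under a project-type profile."""
    total = sum(scores[d] * w for d, w in profile.items())
    return round(total / sum(profile.values()))

print(weighted_overall(scores, IAC))  # 82
```

The IaC profile rewards this repo's strong Architecture score (1.5x) slightly more than the equal-weight application profile does, nudging the overall from 81 to 82.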
5. What we don't measure
Transparency about limitations is more useful than pretending they don't exist. Here is what this report cannot tell you.
Code quality
Cyclomatic complexity, code duplication, and tech debt require cloning the repository. We only use the GitHub API and tree endpoints.
CI pass rate
We detect whether CI exists, not whether it consistently passes. A repo with a perpetually red CI pipeline scores the same as one with all-green checks.
Dependency vulnerability depth
We check deps.dev for the main package but do not perform transitive dependency vulnerability analysis.
Community health beyond metrics
We measure issue age and PR responsiveness, not response quality, maintainer tone, or contributor sentiment.
GitHub-centric bias
Repos using Concourse, Buildkite, or other non-standard CI systems may score lower on some DevOps checks. We detect 12 CI systems, but the long tail is vast.
File quality vs. file presence
An empty SECURITY.md passes the security policy check. A one-line CONTRIBUTING.md passes the contribution guide check. We verify existence, not substance.
Popularity
Stars carry only 5 points in the maintenance dimension. A project with 100k stars and no CI scores lower than a 10-star project with proper practices. This is intentional.
6. Comparison to other frameworks
Several frameworks measure open source project health. We overlap with each but differ in scope and approach.
| Framework | Focus | Our overlap |
|---|---|---|
| OpenSSF Scorecard | Security (20 checks) | We implement 5 Scorecard-inspired checks: pinned deps, token permissions, branch protection (proxy), dependency updates, and security policy. |
| CHAOSS | Community (89 metrics) | We implement bus factor, issue freshness, and PR responsiveness. CHAOSS covers contributor diversity, event attendance, and organizational participation that we do not attempt. |
| CII Best Practices | Maturity (67 criteria) | We check roughly 30 of the Passing-level criteria. CII requires self-attestation; we infer from repo metadata. |
| Libraries.io SourceRank | Adoption (15 factors) | We check dependent count via deps.dev and stars. SourceRank also factors in download counts, reverse dependencies, and package manager presence. |
7. How to improve your score
Quick wins per dimension, ranked roughly by effort (lowest first).
Security
- Add a SECURITY.md with a disclosure email
- Enable Dependabot or Renovate
- Pin GitHub Actions to full SHA hashes
- Add `permissions: read-all` to workflows
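The SHA-pinning check from the Security table can be sketched as a scan of `uses:` references in workflow files: a reference counts as pinned only if it ends in a full 40-character commit SHA. The regexes and function name below are illustrative assumptions, not the tool's exact implementation.

```python
import re

# A full commit SHA is 40 hex characters after the "@".
PINNED = re.compile(r"@[0-9a-f]{40}\b")
USES = re.compile(r"^\s*-?\s*uses:\s*(\S+)", re.MULTILINE)

def unpinned_actions(workflow_yaml):
    """Return action refs that use a tag or branch instead of a full SHA."""
    refs = USES.findall(workflow_yaml)
    return [r for r in refs if "@" in r and not PINNED.search(r)]

wf = """
jobs:
  build:
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@8f152de45cc393bb48ce5d89d36b731f54556e65
"""
print(unpinned_actions(wf))  # ['actions/checkout@v4']
```

Tag references like `@v4` are mutable, which is why the check demands immutable SHAs.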
Testing
- Add a CI workflow that runs on push/PR
- Add at least one test file in a `tests/` directory
- Configure a coverage tool (codecov, coveralls)
- Add a test script to package.json / Makefile
Documentation
- Write a README longer than 500 characters
- Add a LICENSE file
- Add a CONTRIBUTING.md with setup instructions
- Set a repository description on GitHub
Architecture
- Add a linter config for your language
- Add a code formatter (Prettier, black, rustfmt)
- Organize source code into `src/` or `lib/`
- Enable type checking (tsconfig strict, mypy)
DevOps
- Add a GitHub Actions CI workflow
- Add issue and PR templates
- Set up release automation (semantic-release, changesets)
- Add a Dockerfile if the project is deployable
Maintenance
- Commit at least once every 90 days
- Triage open issues (close stale ones)
- Tag a release at least every 6 months
- Diversify commit access (reduce bus factor risk)
8. FAQ
"Why does my well-maintained repo score poorly?"
Probably missing security practices. Most developers focus on code quality and testing but skip SECURITY.md, dependency pinning, and token permissions. These three checks alone account for 50 points in the Security dimension. If your project uses non-GitHub CI (Jenkins, Buildkite, Concourse), it may also miss DevOps checks even though your pipeline is solid.
"Why do awesome-lists and docs repos get N/A?"
Documentation-only repositories are not graded on code-centric metrics because the comparison would be meaningless. They still receive dimension scores for the checks that apply, but no overall letter grade. Mirror repos are treated the same way.
"Can I customize the weights?"
Not yet on the website. The CLI supports --explain to show the full scoring breakdown for any repository, including which weight profile was applied and how each check contributed. Custom weight profiles are on the roadmap.
"How often is the data updated?"
The site is regenerated periodically from fresh GitHub API data. Individual repo scores can shift between runs based on recent commits, issue activity, and release cadence. The "last commit" and "issue freshness" checks are time-sensitive by design.
"A check is wrong for my repo. What do I do?"
Open an issue with the repo name and the check you believe is incorrect. We regularly refine detection heuristics based on false positive and false negative reports.
Source and reproduction
The scoring logic is open source at github.com/williamzujkowski/repo-health-report. Run `repo-health-report owner/repo --explain` to reproduce any score shown on this site. The SCORING.md file in the repo root is the canonical reference and is kept in sync with the scoring implementation.