Methodology

How we compute health scores for 8,517 repositories, and why.

Current dimension averages

Across 8,517 repositories. Security and Maintenance are consistently the weakest areas.

Security        30
Maintenance     33
DevOps          35
Testing         39
Architecture    40
Documentation   53

1. Philosophy

We measure repository health practices, not code quality.

A repo with excellent code but no CI, no security policy, and no tests scores poorly. A repo with boilerplate code but proper CI, documentation, and security practices scores well. This is intentional. Practices are observable, automatable, and actionable. Code quality is subjective, requires deep analysis, and varies by domain.

Important disclaimer: Presence of a file does not guarantee quality. An empty SECURITY.md passes the security policy check. A CI workflow that never runs green still counts as "CI detected." We measure whether the practice exists, not how well it is implemented. This is a known limitation we accept in exchange for being able to score thousands of repos without cloning them.

2. The six dimensions

Security

avg 30/100

Why it matters: Software supply chain attacks increased 742% between 2019 and 2022 (Sonatype State of the Software Supply Chain). A security policy and pinned dependencies are baseline defenses that cost minutes to implement but signal that maintainers take vulnerability disclosure seriously.

Check                                                   Weight
SECURITY.md with contact info and disclosure process    25
GitHub Actions pinned to full commit SHAs               15
Dependency update automation (Dependabot/Renovate)      15
Explicit token permissions in workflows                 10
No committed .env files                                 10
.gitignore present                                      10
CI workflows detected (branch protection proxy)         10
CODEOWNERS / OWNERS / MAINTAINERS file                   5
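The pinned-actions check can be approximated with a regular expression over workflow files: a `uses:` reference counts as pinned only when the part after `@` is a full 40-character commit SHA rather than a tag. A minimal sketch of that idea (function names and the exact heuristics are illustrative, not the production detection logic):

```python
import re

# A ref is "pinned" only when the version after "@" is a full 40-char hex SHA.
PINNED_RE = re.compile(r"@[0-9a-f]{40}$")

def is_pinned(uses_ref: str) -> bool:
    """Return True if an Actions reference is pinned to a full commit SHA."""
    return bool(PINNED_RE.search(uses_ref))

def all_actions_pinned(workflow_text: str) -> bool:
    """Require every external `uses:` reference in a workflow to be SHA-pinned."""
    refs = re.findall(r"uses:\s*(\S+)", workflow_text)
    # Local composite actions (./path) have no version to pin, so skip them.
    external = [r for r in refs if not r.startswith("./")]
    return all(is_pinned(r) for r in external)

workflow = """
jobs:
  build:
    steps:
      - uses: actions/checkout@8f4b7f84864484a7bf31766abe9204da3cbe65b3
      - uses: actions/setup-node@v4
"""
print(all_actions_pinned(workflow))  # False: setup-node is tag-pinned, not SHA-pinned
```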

Testing

avg 39/100

Why it matters: Google's DORA research shows teams with robust testing deploy 46x more frequently with 7x lower change failure rates (Accelerate: State of DevOps). Testing infrastructure is the foundation of deployment confidence.

Check                                                          Weight
CI workflows detected                                          25
Test files present (__tests__, .test., .spec., tests/)         25
Coverage configuration (nyc, jest, vitest, codecov)            20
Test runner configured (package.json scripts, pytest, cargo)   15
Pre-commit hooks (Husky, pre-commit-config)                     5

IaC projects check for Terratest, kitchen-terraform, and InSpec instead. Documentation projects check for markdown linting and link checking.
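The test-file presence check amounts to matching tree paths against the patterns in the table above. A sketch of that matching, using exactly those patterns (the function shape is an assumption):

```python
# Substring patterns from the Testing table: a repo passes the
# "test files present" check if any file path matches one of them.
TEST_PATTERNS = ("__tests__/", ".test.", ".spec.", "tests/")

def has_test_files(tree_paths: list[str]) -> bool:
    """True if any path in the repo tree looks like a test file."""
    return any(p in path for path in tree_paths for p in TEST_PATTERNS)

paths = ["src/index.ts", "src/index.test.ts", "package.json"]
print(has_test_files(paths))  # True: src/index.test.ts matches ".test."
```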

Documentation

avg 53/100

Why it matters: A project without documentation is a project without users. GitHub's 2023 Open Source Survey found that incomplete or confusing documentation is the single largest barrier to open source participation. README quality directly correlates with adoption and contribution rates.

Check                                          Weight
README exists and is >500 characters           30
LICENSE file present                           20
CONTRIBUTING guide                             15
CHANGELOG / CHANGES / HISTORY                  15
Documentation directory (docs/) or API docs    10
Repository description set                     10

Architecture

avg 40/100

Why it matters: Consistent tooling (linters, formatters, type checkers) reduces defect introduction rates. A 2022 study in IEEE TSE found that projects using static analysis tools had 20-30% fewer post-release defects. We check for language-appropriate tooling across 9 language profiles.

Check                                                 Weight
Type safety (tsconfig, mypy, built-in type system)    20
Linter (ESLint, Biome, ruff, golangci-lint, Clippy)   20
Organized source structure (src/, lib/, cmd/)         20
Code formatter (Prettier, black, gofmt, rustfmt)      15
Build configuration (tsconfig, Cargo.toml, Makefile)  10

Language profiles: TypeScript, JavaScript, Python, Go, Rust, Java, Shell, C/C++, Ruby, plus a generic fallback. IaC projects check tflint, terraform fmt, and module conventions instead.

DevOps

avg 35/100

Why it matters: CI/CD automation is the foundation of reliable software delivery. DORA metrics consistently show that elite-performing teams have fully automated deployment pipelines. We detect 12 CI/CD systems to avoid penalizing projects that use non-GitHub CI.

Check                                                                  Weight
CI/CD pipeline (GitHub Actions, GitLab CI, CircleCI, Jenkins, etc.)    30
Container support (Dockerfile, docker-compose)                         20
Release automation (semantic-release, changesets)                      20
Issue/PR templates                                                     15
Deployment/infrastructure config (Terraform, k8s, Helm)                 5

Maintenance

avg 33/100

Why it matters: An unmaintained dependency is a ticking time bomb. The Heartbleed vulnerability persisted for 2 years in OpenSSL partly because of insufficient maintenance resources. Bus factor and funding signal whether a project can sustain itself beyond a single contributor.

Check                                                     Weight
Last commit within 30 days (full) or 90 days (partial)    30
Latest release/tag within 180 days                        20
Bus factor (4+ contributors with >5% of commits)          20
Median open issue age under 90 days                       15
FUNDING.yml present (not penalized if absent)             10*
10+ stars (community interest signal)                      5

*Funding weight is 0 if absent -- repos are not penalized for missing FUNDING.yml. The "Recent releases" check also detects release hygiene gaps (old releases + recent commits) and repos that have never published a release.
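The bus-factor check (4+ contributors each owning more than 5% of commits) can be sketched from per-author commit counts. The thresholds come from the table; the function shape and input format are assumptions:

```python
def bus_factor_ok(commit_counts: dict[str, int],
                  min_contributors: int = 4,
                  min_share: float = 0.05) -> bool:
    """True if at least 4 authors each own more than 5% of commits."""
    total = sum(commit_counts.values())
    if total == 0:
        return False
    significant = [a for a, n in commit_counts.items() if n / total > min_share]
    return len(significant) >= min_contributors

# One dominant author: only alice clears the 5% bar, so the check fails.
print(bus_factor_ok({"alice": 950, "bob": 30, "carol": 15, "dave": 5}))  # False
# Four contributors each above 5%: the check passes.
print(bus_factor_ok({"alice": 40, "bob": 30, "carol": 20, "dave": 10}))  # True
```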

3. How scoring works

Each dimension score is the percentage of check weights earned. The overall score is a weighted average across dimensions, with weights determined by the detected project type.

dimension_score = earned_weight / total_weight * 100
overall = sum(dimension_score * type_weight) / sum(type_weight)

Worked example -- a hypothetical application-type repo with equal weights:

Dimension       Score   Weight   Contribution
Security          70     1.0          70
Testing          100     1.0         100
Documentation     85     1.0          85
Architecture     100     1.0         100
DevOps            50     1.0          50
Maintenance       83     1.0          83
Total                         488 / 6.0 = 81 (B)
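The worked example can be reproduced in a few lines. This is a sketch of the published formulas, not the production implementation:

```python
def dimension_score(earned_weight: float, total_weight: float) -> float:
    """Percentage of check weights earned within one dimension."""
    return earned_weight / total_weight * 100

def overall(scores: dict[str, float], weights: dict[str, float]) -> int:
    """Weighted average of dimension scores under a project-type profile."""
    weighted = sum(scores[d] * weights[d] for d in scores)
    return round(weighted / sum(weights.values()))

scores = {"Security": 70, "Testing": 100, "Documentation": 85,
          "Architecture": 100, "DevOps": 50, "Maintenance": 83}
weights = {d: 1.0 for d in scores}  # application profile: all weights equal
print(overall(scores, weights))  # 488 / 6.0 = 81.33... -> 81
```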

Grade scale:

A: 90+ B: 80-89 C: 70-79 D: 60-69 F: <60 N/A: docs & mirrors
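The grade scale maps directly to a small lookup; docs and mirror repos short-circuit to N/A before any threshold is applied (the function shape is illustrative):

```python
def letter_grade(score: float, project_type: str = "app") -> str:
    """Map an overall score to a letter grade per the published scale."""
    if project_type in ("docs", "mirror"):
        return "N/A"  # numeric score is computed, but no letter grade
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"

print(letter_grade(81))          # B
print(letter_grade(95, "docs"))  # N/A
```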

4. Project type weights

Different project types have different priorities. An IaC repo needs strong security practices (1.5x weight) because misconfigurations deploy directly to production. A library needs excellent documentation (1.5x) because adoption depends on it. A mirror repo has minimal expectations across the board.

Dimension       App   Library   IaC   Hybrid   Docs   Runtime   Mirror
Security        1.0   0.8       1.5   1.2      0.3    0.5       0.3
Testing         1.0   1.2       0.8   1.0      0.5    0.8       0.5
Documentation   1.0   1.5       1.0   1.0      2.0    1.5       1.5
Architecture    1.0   1.0       1.5   1.2      1.5    0.5       0.5
DevOps          1.0   0.5       0.8   1.0      0.5    0.5       0.2
Maintenance     1.0   1.0       0.8   0.8      0.5    1.5       1.5
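Under these profiles, identical dimension scores yield different overall scores. For example, re-scoring the section 3 worked example as an IaC repo (weights taken from the table; the helper is a sketch of the published formula):

```python
IAC_WEIGHTS = {"Security": 1.5, "Testing": 0.8, "Documentation": 1.0,
               "Architecture": 1.5, "DevOps": 0.8, "Maintenance": 0.8}

def overall(scores: dict[str, float], weights: dict[str, float]) -> int:
    """Weighted average of dimension scores under a project-type profile."""
    weighted = sum(scores[d] * weights[d] for d in scores)
    return round(weighted / sum(weights.values()))

scores = {"Security": 70, "Testing": 100, "Documentation": 85,
          "Architecture": 100, "DevOps": 50, "Maintenance": 83}
# Security and Architecture count 1.5x, so this repo's strong Architecture
# lifts the result: 526.4 / 6.4 = 82.25 -> 82 (vs 81 with equal weights).
print(overall(scores, IAC_WEIGHTS))
```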

For each project type, the largest weight in its column marks the dimension that most influences its overall score. Documentation and mirror repos receive numeric scores but are not assigned letter grades (shown as N/A).

5. What we don't measure

Transparency about limitations is more useful than pretending they don't exist. Here is what this report cannot tell you.

Code quality

Cyclomatic complexity, code duplication, and tech debt analysis all require cloning the repository. We use only the GitHub API and its tree endpoints.

CI pass rate

We detect whether CI exists, not whether it consistently passes. A repo with a perpetually red CI pipeline scores the same as one with all-green checks.

Dependency vulnerability depth

We check deps.dev for the main package but do not perform transitive dependency vulnerability analysis.

Community health beyond metrics

We measure issue age and PR responsiveness, not response quality, maintainer tone, or contributor sentiment.

GitHub-centric bias

Repos using Concourse, Buildkite, or other non-standard CI systems may score lower on some DevOps checks. We detect 12 CI systems, but the long tail is vast.

File quality vs. file presence

An empty SECURITY.md passes the security policy check. A one-line CONTRIBUTING.md passes the contribution guide check. We verify existence, not substance.

Popularity

Stars carry only 5 points in the maintenance dimension. A project with 100k stars and no CI scores lower than a 10-star project with proper practices. This is intentional.

6. Comparison to other frameworks

Several frameworks measure open source project health. We overlap with each but differ in scope and approach.

OpenSSF Scorecard -- security focus (20 checks). We implement 5 Scorecard-inspired checks: pinned deps, token permissions, branch protection (proxy), dependency updates, and security policy.

CHAOSS -- community focus (89 metrics). We implement bus factor, issue freshness, and PR responsiveness. CHAOSS covers contributor diversity, event attendance, and organizational participation that we do not attempt.

CII Best Practices -- maturity focus (67 criteria). We check roughly 30 of the Passing-level criteria. CII requires self-attestation; we infer from repo metadata.

Libraries.io SourceRank -- adoption focus (15 factors). We check dependent count via deps.dev and stars. SourceRank also factors in download counts, reverse dependencies, and package manager presence.

7. How to improve your score

Quick wins per dimension, ranked roughly by effort (lowest first).

Security

  • Add a SECURITY.md with a disclosure email
  • Enable Dependabot or Renovate
  • Pin GitHub Actions to full SHA hashes
  • Add permissions: read-all to workflows

Testing

  • Add a CI workflow that runs on push/PR
  • Add at least one test file in a tests/ directory
  • Configure a coverage tool (codecov, coveralls)
  • Add a test script to package.json / Makefile

Documentation

  • Write a README longer than 500 characters
  • Add a LICENSE file
  • Add a CONTRIBUTING.md with setup instructions
  • Set a repository description on GitHub

Architecture

  • Add a linter config for your language
  • Add a code formatter (Prettier, black, rustfmt)
  • Organize source code into src/ or lib/
  • Enable type checking (tsconfig strict, mypy)

DevOps

  • Add a GitHub Actions CI workflow
  • Add issue and PR templates
  • Set up release automation (semantic-release, changesets)
  • Add a Dockerfile if the project is deployable

Maintenance

  • Commit at least once every 90 days
  • Triage open issues (close stale ones)
  • Tag a release at least every 6 months
  • Diversify commit access (reduce bus factor risk)

8. FAQ

"Why does my well-maintained repo score poorly?"

Probably missing security practices. Most developers focus on code quality and testing but skip SECURITY.md, dependency pinning, and token permissions. These three checks alone account for 50 points in the Security dimension. If your project uses non-GitHub CI (Jenkins, Buildkite, Concourse), it may also miss DevOps checks even though your pipeline is solid.

"Why do awesome-lists and docs repos get N/A?"

Documentation-only repositories are not graded on code-centric metrics because the comparison would be meaningless. They still receive dimension scores for the checks that apply, but no overall letter grade. Mirror repos are treated the same way.

"Can I customize the weights?"

Not yet on the website. The CLI supports --explain to show the full scoring breakdown for any repository, including which weight profile was applied and how each check contributed. Custom weight profiles are on the roadmap.

"How often is the data updated?"

The site is regenerated periodically from fresh GitHub API data. Individual repo scores can shift between runs based on recent commits, issue activity, and release cadence. The "last commit" and "issue freshness" checks are time-sensitive by design.

"A check is wrong for my repo. What do I do?"

Open an issue with the repo name and the check you believe is incorrect. We regularly refine detection heuristics based on false positive and false negative reports.

Source and reproduction

The scoring logic is open source at github.com/williamzujkowski/repo-health-report. Run repo-health-report owner/repo --explain to reproduce any score shown on this site. The SCORING.md file in the repo root is the canonical reference and is kept in sync with the scoring implementation.