Methodology

How we compute health scores for 8,517 repositories, and why.

Current dimension averages

Across 8,517 repositories. Security and Maintenance are consistently the weakest areas.

Security        30
Maintenance     33
DevOps          35
Testing         39
Architecture    40
Documentation   53

1. Philosophy

We measure repository health practices, not code quality.

A repo with excellent code but no CI, no security policy, and no tests scores poorly. A repo with boilerplate code but proper CI, documentation, and security practices scores well. This is intentional. Practices are observable, automatable, and actionable. Code quality is subjective, requires deep analysis, and varies by domain.

Important disclaimer: Presence of a file does not guarantee quality. An empty SECURITY.md passes the security policy check. A CI workflow that never runs green still counts as "CI detected." We measure whether the practice exists, not how well it is implemented. This is a known limitation we accept in exchange for being able to score thousands of repos without cloning them.

2. The six dimensions

Security

avg 30/100

Why it matters: Software supply chain attacks increased 742% between 2019 and 2022 (Sonatype State of the Software Supply Chain). A security policy and pinned dependencies are baseline defenses that cost minutes to implement but signal that maintainers take vulnerability disclosure seriously.

Check                                                   Weight
SECURITY.md with contact info and disclosure process    25
GitHub Actions pinned to full commit SHAs               15
Dependency update automation (Dependabot/Renovate)      15
Explicit token permissions in workflows                 10
No committed .env files                                 10
.gitignore present                                      10
CI workflows detected (branch protection proxy)         10
CODEOWNERS / OWNERS / MAINTAINERS file                   5
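The pinned-actions check can be approximated with a regular expression over workflow files: a `uses:` reference counts as pinned only when the part after `@` is a full 40-character commit SHA rather than a tag. A minimal sketch of that idea (function names and the exact heuristics are illustrative, not the production detection logic):

```python
import re

# A ref is "pinned" only when the version after "@" is a full 40-char hex SHA.
PINNED_RE = re.compile(r"@[0-9a-f]{40}$")

def is_pinned(uses_ref: str) -> bool:
    """Return True if an Actions reference is pinned to a full commit SHA."""
    return bool(PINNED_RE.search(uses_ref))

def all_actions_pinned(workflow_text: str) -> bool:
    """Require every external `uses:` reference in a workflow to be SHA-pinned."""
    refs = re.findall(r"uses:\s*(\S+)", workflow_text)
    # Local composite actions (./path) have no version to pin, so skip them.
    external = [r for r in refs if not r.startswith("./")]
    return all(is_pinned(r) for r in external)

workflow = """
jobs:
  build:
    steps:
      - uses: actions/checkout@8f4b7f84864484a7bf31766abe9204da3cbe65b3
      - uses: actions/setup-node@v4
"""
print(all_actions_pinned(workflow))  # False: setup-node is tag-pinned, not SHA-pinned
```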

Testing

avg 39/100

Why it matters: Google's DORA research shows teams with robust testing deploy 46x more frequently with 7x lower change failure rates (Accelerate: State of DevOps). Testing infrastructure is the foundation of deployment confidence.

Check                                                          Weight
CI workflows detected                                          25
Test files present (__tests__, .test., .spec., tests/)         25
Coverage configuration (nyc, jest, vitest, codecov)            20
Test runner configured (package.json scripts, pytest, cargo)   15
Pre-commit hooks (Husky, pre-commit-config)                     5

IaC projects check for Terratest, kitchen-terraform, and InSpec instead. Documentation projects check for markdown linting and link checking.
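The test-file presence check amounts to matching tree paths against the patterns in the table above. A sketch of that matching, using exactly those patterns (the function shape is an assumption):

```python
# Substring patterns from the Testing table: a repo passes the
# "test files present" check if any file path matches one of them.
TEST_PATTERNS = ("__tests__/", ".test.", ".spec.", "tests/")

def has_test_files(tree_paths: list[str]) -> bool:
    """True if any path in the repo tree looks like a test file."""
    return any(p in path for path in tree_paths for p in TEST_PATTERNS)

paths = ["src/index.ts", "src/index.test.ts", "package.json"]
print(has_test_files(paths))  # True: src/index.test.ts matches ".test."
```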

Documentation

avg 53/100

Why it matters: A project without documentation is a project without users. GitHub's 2023 Open Source Survey found that incomplete or confusing documentation is the single largest barrier to open source participation. README quality directly correlates with adoption and contribution rates.

Check                                          Weight
README exists and is >500 characters           30
LICENSE file present                           20
CONTRIBUTING guide                             15
CHANGELOG / CHANGES / HISTORY                  15
Documentation directory (docs/) or API docs    10
Repository description set                     10

Architecture

avg 40/100

Why it matters: Consistent tooling (linters, formatters, type checkers) reduces defect introduction rates. A 2022 study in IEEE TSE found that projects using static analysis tools had 20-30% fewer post-release defects. We check for language-appropriate tooling across 9 language profiles.

Check                                                 Weight
Type safety (tsconfig, mypy, built-in type system)    20
Linter (ESLint, Biome, ruff, golangci-lint, Clippy)   20
Organized source structure (src/, lib/, cmd/)         20
Code formatter (Prettier, black, gofmt, rustfmt)      15
Build configuration (tsconfig, Cargo.toml, Makefile)  10

Language profiles: TypeScript, JavaScript, Python, Go, Rust, Java, Shell, C/C++, Ruby, plus a generic fallback. IaC projects check tflint, terraform fmt, and module conventions instead.

DevOps

avg 35/100

Why it matters: CI/CD automation is the foundation of reliable software delivery. DORA metrics consistently show that elite-performing teams have fully automated deployment pipelines. We detect 12 CI/CD systems to avoid penalizing projects that use non-GitHub CI.

Check                                                                  Weight
CI/CD pipeline (GitHub Actions, GitLab CI, CircleCI, Jenkins, etc.)    30
Container support (Dockerfile, docker-compose)                         20
Release automation (semantic-release, changesets)                      20
Issue/PR templates                                                     15
Deployment/infrastructure config (Terraform, k8s, Helm)                 5

Maintenance

avg 33/100

Why it matters: An unmaintained dependency is a ticking time bomb. The Heartbleed vulnerability persisted for 2 years in OpenSSL partly because of insufficient maintenance resources. Bus factor and funding signal whether a project can sustain itself beyond a single contributor.

Check                                                     Weight
Last commit within 30 days (full) or 90 days (partial)    30
Latest release/tag within 180 days                        20
Bus factor (4+ contributors with >5% of commits)          20
Median open issue age under 90 days                       15
FUNDING.yml present (not penalized if absent)             10*
10+ stars (community interest signal)                      5

*Funding weight is 0 if absent -- repos are not penalized for missing FUNDING.yml. The "Recent releases" check also detects release hygiene gaps (old releases + recent commits) and repos that have never published a release.
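The bus-factor check (4+ contributors each owning more than 5% of commits) can be sketched from per-author commit counts. The thresholds come from the table; the function shape and input format are assumptions:

```python
def bus_factor_ok(commit_counts: dict[str, int],
                  min_contributors: int = 4,
                  min_share: float = 0.05) -> bool:
    """True if at least 4 authors each own more than 5% of commits."""
    total = sum(commit_counts.values())
    if total == 0:
        return False
    significant = [a for a, n in commit_counts.items() if n / total > min_share]
    return len(significant) >= min_contributors

# One dominant author: only alice clears the 5% bar, so the check fails.
print(bus_factor_ok({"alice": 950, "bob": 30, "carol": 15, "dave": 5}))  # False
# Four contributors each above 5%: the check passes.
print(bus_factor_ok({"alice": 40, "bob": 30, "carol": 20, "dave": 10}))  # True
```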

3. How scoring works

Each dimension score is the percentage of check weights earned. The overall score is a weighted average across dimensions, with weights determined by the detected project type.

dimension_score = earned_weight / total_weight * 100
overall = sum(dimension_score * type_weight) / sum(type_weight)

Worked example -- a hypothetical application-type repo with equal weights:

Dimension       Score   Weight   Contribution
Security          70     1.0          70
Testing          100     1.0         100
Documentation     85     1.0          85
Architecture     100     1.0         100
DevOps            50     1.0          50
Maintenance       83     1.0          83
Total                         488 / 6.0 = 81 (B)
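The worked example can be reproduced in a few lines. This is a sketch of the published formulas, not the production implementation:

```python
def dimension_score(earned_weight: float, total_weight: float) -> float:
    """Percentage of check weights earned within one dimension."""
    return earned_weight / total_weight * 100

def overall(scores: dict[str, float], weights: dict[str, float]) -> int:
    """Weighted average of dimension scores under a project-type profile."""
    weighted = sum(scores[d] * weights[d] for d in scores)
    return round(weighted / sum(weights.values()))

scores = {"Security": 70, "Testing": 100, "Documentation": 85,
          "Architecture": 100, "DevOps": 50, "Maintenance": 83}
weights = {d: 1.0 for d in scores}  # application profile: all weights equal
print(overall(scores, weights))  # 488 / 6.0 = 81.33... -> 81
```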

Grade scale:

A: 90+ B: 80-89 C: 70-79 D: 60-69 F: <60 N/A: docs & mirrors
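The grade scale maps directly to a small lookup; docs and mirror repos short-circuit to N/A before any threshold is applied (the function shape is illustrative):

```python
def letter_grade(score: float, project_type: str = "app") -> str:
    """Map an overall score to a letter grade per the published scale."""
    if project_type in ("docs", "mirror"):
        return "N/A"  # numeric score is computed, but no letter grade
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"

print(letter_grade(81))          # B
print(letter_grade(95, "docs"))  # N/A
```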

4. Project type weights

Different project types have different priorities. An IaC repo needs strong security practices (1.5x weight) because misconfigurations deploy directly to production. A library needs excellent documentation (1.5x) because adoption depends on it. A mirror repo has minimal expectations across the board.

Dimension       App   Library   IaC   Hybrid   Docs   Runtime   Mirror
Security        1.0   0.8       1.5   1.2      0.3    0.5       0.3
Testing         1.0   1.2       0.8   1.0      0.5    0.8       0.5
Documentation   1.0   1.5       1.0   1.0      2.0    1.5       1.5
Architecture    1.0   1.0       1.5   1.2      1.5    0.5       0.5
DevOps          1.0   0.5       0.8   1.0      0.5    0.5       0.2
Maintenance     1.0   1.0       0.8   0.8      0.5    1.5       1.5
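Under these profiles, identical dimension scores yield different overall scores. For example, re-scoring the section 3 worked example as an IaC repo (weights taken from the table; the helper is a sketch of the published formula):

```python
IAC_WEIGHTS = {"Security": 1.5, "Testing": 0.8, "Documentation": 1.0,
               "Architecture": 1.5, "DevOps": 0.8, "Maintenance": 0.8}

def overall(scores: dict[str, float], weights: dict[str, float]) -> int:
    """Weighted average of dimension scores under a project-type profile."""
    weighted = sum(scores[d] * weights[d] for d in scores)
    return round(weighted / sum(weights.values()))

scores = {"Security": 70, "Testing": 100, "Documentation": 85,
          "Architecture": 100, "DevOps": 50, "Maintenance": 83}
# Security and Architecture count 1.5x, so this repo's strong Architecture
# lifts the result: 526.4 / 6.4 = 82.25 -> 82 (vs 81 with equal weights).
print(overall(scores, IAC_WEIGHTS))
```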

For each project type, the largest weight in its column marks the dimension that most influences its overall score. Documentation and mirror repos receive numeric scores but are not assigned letter grades (shown as N/A).

5. What we don't measure

Transparency about limitations is more useful than pretending they don't exist. Here is what this report cannot tell you.

Code quality

Cyclomatic complexity, code duplication, and tech debt analysis all require cloning the repository. We use only the GitHub API and its tree endpoints.

CI pass rate

We detect whether CI exists, not whether it consistently passes. A repo with a perpetually red CI pipeline scores the same as one with all-green checks.

Dependency vulnerability depth

We check deps.dev for the main package but do not perform transitive dependency vulnerability analysis.

Community health beyond metrics

We measure issue age and PR responsiveness, not response quality, maintainer tone, or contributor sentiment.

GitHub-centric bias

Repos using Concourse, Buildkite, or other non-standard CI systems may score lower on some DevOps checks. We detect 12 CI systems, but the long tail is vast.

File quality vs. file presence

An empty SECURITY.md passes the security policy check. A one-line CONTRIBUTING.md passes the contribution guide check. We verify existence, not substance.

Popularity

Stars carry only 5 points in the maintenance dimension. A project with 100k stars and no CI scores lower than a 10-star project with proper practices. This is intentional.

6. Comparison to other frameworks

Several frameworks measure open source project health. We overlap with each but differ in scope and approach.

OpenSSF Scorecard -- security focus (20 checks). We implement 5 Scorecard-inspired checks: pinned deps, token permissions, branch protection (proxy), dependency updates, and security policy.

CHAOSS -- community focus (89 metrics). We implement bus factor, issue freshness, and PR responsiveness. CHAOSS covers contributor diversity, event attendance, and organizational participation that we do not attempt.

CII Best Practices -- maturity focus (67 criteria). We check roughly 30 of the Passing-level criteria. CII requires self-attestation; we infer from repo metadata.

Libraries.io SourceRank -- adoption focus (15 factors). We check dependent count via deps.dev and stars. SourceRank also factors in download counts, reverse dependencies, and package manager presence.

7. How to improve your score

Quick wins per dimension, ranked roughly by effort (lowest first).

Security

  • Add a SECURITY.md with a disclosure email
  • Enable Dependabot or Renovate
  • Pin GitHub Actions to full SHA hashes
  • Add permissions: read-all to workflows

Testing

  • Add a CI workflow that runs on push/PR
  • Add at least one test file in a tests/ directory
  • Configure a coverage tool (codecov, coveralls)
  • Add a test script to package.json / Makefile

Documentation

  • Write a README longer than 500 characters
  • Add a LICENSE file
  • Add a CONTRIBUTING.md with setup instructions
  • Set a repository description on GitHub

Architecture

  • Add a linter config for your language
  • Add a code formatter (Prettier, black, rustfmt)
  • Organize source code into src/ or lib/
  • Enable type checking (tsconfig strict, mypy)

DevOps

  • Add a GitHub Actions CI workflow
  • Add issue and PR templates
  • Set up release automation (semantic-release, changesets)
  • Add a Dockerfile if the project is deployable

Maintenance

  • Commit at least once every 90 days
  • Triage open issues (close stale ones)
  • Tag a release at least every 6 months
  • Diversify commit access (reduce bus factor risk)

8. FAQ

"Why does my well-maintained repo score poorly?"

Probably missing security practices. Most developers focus on code quality and testing but skip SECURITY.md, dependency pinning, and token permissions. These three checks alone account for 50 points in the Security dimension. If your project uses non-GitHub CI (Jenkins, Buildkite, Concourse), it may also miss DevOps checks even though your pipeline is solid.

"Why do awesome-lists and docs repos get N/A?"

Documentation-only repositories are not graded on code-centric metrics because the comparison would be meaningless. They still receive dimension scores for the checks that apply, but no overall letter grade. Mirror repos are treated the same way.

"Can I customize the weights?"

Not yet on the website. The CLI supports --explain to show the full scoring breakdown for any repository, including which weight profile was applied and how each check contributed. Custom weight profiles are on the roadmap.

"How often is the data updated?"

The site is regenerated periodically from fresh GitHub API data. Individual repo scores can shift between runs based on recent commits, issue activity, and release cadence. The "last commit" and "issue freshness" checks are time-sensitive by design.

"A check is wrong for my repo. What do I do?"

Open an issue with the repo name and the check you believe is incorrect. We regularly refine detection heuristics based on false positive and false negative reports.

Source and reproduction

The scoring logic is open source at github.com/williamzujkowski/repo-health-report. Run repo-health-report owner/repo --explain to reproduce any score shown on this site. The SCORING.md file in the repo root is the canonical reference and is kept in sync with the scoring implementation.