Secret scanning is one of those features that sounds great until you're drowning in alerts about test passwords and placeholder UUIDs. At GitHub's scale, even small inefficiencies create real friction, and too many false positives make alerts harder to trust. When alerts feel noisy, developers spend more time triaging and less time fixing real issues. GitHub just published a detailed account of how they tackled this problem, and the results are hard to ignore.

The scale of the problem

More than 39 million secrets were leaked across GitHub in 2024 alone, and every minute GitHub blocks several secrets with push protection. But blocking is only half the battle. The other half is making sure the alerts that do fire are worth acting on. Alert fatigue from false positives kills adoption -- and a scanner that cries wolf too often is almost as dangerous as no scanner at all.

Generic secret detection can generate more false positives than the existing secret scanning feature, which detects partner patterns and has a very low false positive rate. To contain the extra noise, generic alerts are grouped in a separate list, and security managers and maintainers have to triage each one to verify its accuracy. That manual triage burden is exactly what this new work aims to shrink.

How the pipeline works today

GitHub's secret scanning pipeline combines pattern-based detection and AI-powered generic detection. The new verification layer was introduced to reduce noisy, low-value alerts while preserving coverage. Think of it as three stages:

  • Pattern-based detection -- catches known secret formats like partner API tokens using regex-style rules
  • AI-powered generic detection
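
To make the pattern-based stage concrete, here is a minimal Python sketch of regex-style detection followed by a crude placeholder filter. It is not GitHub's implementation: the `acme_` token format, the hint words, and every function name are hypothetical, chosen only to show how a rule-based match can be followed by a step that drops obvious test values.

```python
import re
from dataclasses import dataclass

# Hypothetical token format (assumption, not a real provider's):
# "acme_" followed by exactly 32 lowercase hex characters.
TOKEN_PATTERN = re.compile(r"\bacme_[0-9a-f]{32}\b")

# Hypothetical words that suggest a line holds a fixture or placeholder.
PLACEHOLDER_HINTS = ("example", "test", "dummy", "placeholder")


@dataclass
class Finding:
    line_number: int
    secret: str


def detect(text: str) -> list[Finding]:
    """Pattern stage: report every regex match with its 1-based line number."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in TOKEN_PATTERN.finditer(line):
            findings.append(Finding(number, match.group()))
    return findings


def looks_like_placeholder(finding: Finding, text: str) -> bool:
    """Noise filter: flag tokens made of one repeated character, or tokens
    on a line that contains one of the hint words."""
    line = text.splitlines()[finding.line_number - 1].lower()
    body = finding.secret.split("_", 1)[1]
    return len(set(body)) == 1 or any(hint in line for hint in PLACEHOLDER_HINTS)


def scan(text: str) -> list[Finding]:
    """Run detection, then keep only findings that survive the filter."""
    return [f for f in detect(text) if not looks_like_placeholder(f, text)]


if __name__ == "__main__":
    sample = "\n".join([
        'API_KEY = "acme_0123456789abcdef0123456789abcdef"',
        'TEST_KEY = "acme_00000000000000000000000000000000"',
    ])
    for finding in scan(sample):
        print(finding.line_number, finding.secret)
```

A keyword filter like this is far too blunt for production use, since a real key can sit on a line that happens to contain the word "test". It only illustrates the trade-off the article describes: each filtering rule removes noise at some risk to coverage.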