Stop AI Bot Spam on GitHub Using Git's --author Flag
Automation

Tags: git, github, automation, security, devops

GitHub repositories are facing a new scale of automated noise. Projects like DBeaver recently reported thousands of AI-generated discussions and comments appearing daily—a volume that overwhelms standard manual moderation and the platform's native reporting tools. When bots can generate context-aware spam faster than maintainers can click 'Block', you need a structural defense that shifts the burden of proof back to the contributor.

Maintaining an open-source project shouldn't mean managing a low-quality AI data stream. By combining GitHub's native interaction gates with Git’s --author metadata, you can create a high-friction environment for bots that remains navigable for legitimate humans.

Key Takeaways

  • Prior-contributor gates are the most effective native defense against scaled bot attacks.
  • Programmatic authorship via the --author flag allows you to verify users through external forms (with CAPTCHAs) before they touch your repo.
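  • Separating the Git committer from the Git author keeps the audit trail readable: the committer field records the automation that wrote the commit, and the author field records the screened user.
  • Purging leaks means rewriting history with git rebase, git filter-branch, or git-filter-repo to scrub unverified commits, then force-pushing the result.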

The Mechanism: Prior-Contributor Gating

The core problem for repos like DBeaver is that GitHub's default state is 'open to all'. Anyone with an email-verified account can post. To stop AI spam, you must flip this logic: restrict interactions to prior contributors, meaning users who already have a commit on the repository's default branch.
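A minimal sketch of switching this gate on from the command line, assuming the GitHub CLI (gh) is installed and authenticated with admin rights on the repository. It calls GitHub's REST endpoint for repository interaction limits. The function name, the OWNER/REPO argument, and the DRY_RUN switch are illustrative additions, not part of any workflow described here.

```shell
set -eu

# Hypothetical helper: restrict comments, issues, and PRs to prior contributors.
# Usage: limit_to_prior_contributors OWNER/REPO [expiry]
# expiry is one of: one_day, three_days, one_week, one_month, six_months
limit_to_prior_contributors() {
  repo="$1"
  expiry="${2:-six_months}"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of calling the API.
    echo "gh api --method PUT repos/${repo}/interaction-limits -f limit=contributors_only -f expiry=${expiry}"
  else
    gh api --method PUT "repos/${repo}/interaction-limits" \
      -f limit=contributors_only -f "expiry=${expiry}"
  fi
}
```

Running it with DRY_RUN=1 first lets you check the request before it touches a real repository. The limit expires on its own, so a scheduled job would need to renew it.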

However, this creates a 'cold start' problem: how does a legitimate first-time contributor get that first commit merged if they are blocked from opening a PR?

The Archestra Protocol

Archestra solved this by moving the screening process outside of GitHub’s standard UI. Their workflow follows a specific sequence to validate contributors before they are granted repository permissions:

  1. The Screening Portal: Newcomers are directed to an external form protected by a robust CAPTCHA.
  2. Identity Verification: The user provides their GitHub handle and email.
  3. Automated Commit: Once the CAPTCHA is cleared, an automated internal process creates a small, benign commit (like adding the user's name to a CONTRIBUTORS.md file) to a dedicated branch.
  4. Authorship Injection: The system uses the git commit --author="Name <email>" flag to attribute the commit to the new user, even though the system's bot is technically performing the write operation.
  5. The Gate Opens: Once this 'authored' commit is merged, GitHub recognizes the user as a 'prior contributor', allowing them to bypass interaction limits and open PRs or Issues normally.
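Steps 3 and 4 above can be sketched as a small shell helper. This is an illustration under stated assumptions, not Archestra's actual implementation: the function name, the verify/<handle> branch scheme, the bot identity, and the CONTRIBUTORS.md line format are all made up for the example, and it runs in a throwaway repository.

```shell
set -eu

# Hypothetical helper for steps 3 and 4: record a screened user on a
# dedicated branch and attribute the commit to them.
# Usage: add_verified_contributor NAME EMAIL HANDLE
add_verified_contributor() {
  name="$1"
  email="$2"
  handle="$3"
  branch="verify/${handle}"   # assumed branch naming scheme

  git checkout -q -b "$branch"
  printf '%s\n' "- ${name} (@${handle})" >> CONTRIBUTORS.md
  git add CONTRIBUTORS.md
  git commit --quiet -m "docs: verify contributor @${handle}" \
    --author="${name} <${email}>"
}

# Demo in a throwaway repository; the bot identity is a placeholder.
cd "$(mktemp -d)"
git init --quiet
git config user.name "screening-bot"
git config user.email "bot@example.com"
git commit --quiet --allow-empty -m "chore: initial commit"

add_verified_contributor "John Doe" "jdoe@example.com" "jdoe"
git log -1 --format='%an <%ae>: %s'
# John Doe <jdoe@example.com>: docs: verify contributor @jdoe
```

In a real deployment the helper would run only after the CAPTCHA check succeeds, and the branch would still need to be pushed and merged before the gate opens.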

Implementation: Using the --author Flag

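Git distinguishes between the committer (whoever ran the commit command) and the author (whoever wrote the change). In an automated screening flow, your automation server is the committer and you name the screened user as the author. GitHub links a commit to an account by matching the author email against the emails registered on that account, so the address passed to --author must be one the user has added to their GitHub profile.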


# Example: Creating a verification commit for a screened user

git commit -m "docs: verify contributor @jdoe" --author="John Doe <jdoe@example.com>"
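To confirm that the two identities landed in the right fields, you can print them separately. The sketch below uses a throwaway repository and placeholder identities; the bot name and email are assumptions for illustration.

```shell
set -eu

# Throwaway repository; both identities are placeholders.
cd "$(mktemp -d)"
git init --quiet
git config user.name "screening-bot"      # becomes the committer
git config user.email "bot@example.com"

git commit --quiet --allow-empty \
  -m "docs: verify contributor @jdoe" \
  --author="John Doe <jdoe@example.com>"

# %an/%ae are the author fields, %cn/%ce are the committer fields.
git log -1 --format='author:    %an <%ae>%ncommitter: %cn <%ce>'
# author:    John Doe <jdoe@example.com>
# committer: screening-bot <bot@example.com>
```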

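To audit your history, filter git log by author. Note that --author selects commits whose author name or email matches the pattern (a regular expression); it does not invert the match. To spot identities you do not recognize, pair it with a summary of every author in the history:


# List commits whose author name or email matches a pattern

git log --author="PatternToMatch"

# Summarize every distinct author and email so unknown identities stand out

git shortlog -sne HEAD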
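Because git log --author only selects commits that match its pattern, listing commits from authors outside your verified set takes one more step. A minimal sketch, assuming you keep an allowlist file with one verified email per line; the function name and the verified.txt file are hypothetical.

```shell
set -eu

# Hypothetical helper: print author emails found in the current history
# that are missing from an allowlist file (one email per line).
# Usage: unverified_authors ALLOWLIST_FILE
unverified_authors() {
  # grep exits non-zero when nothing is left over, hence the "|| true".
  git log --format='%ae' | sort -u | grep -vxFf "$1" || true
}

# Demo in a throwaway repository with placeholder identities.
cd "$(mktemp -d)"
git init --quiet
git config user.name "screening-bot"
git config user.email "bot@example.com"
git commit --quiet --allow-empty -m "verified" --author="John Doe <jdoe@example.com>"
git commit --quiet --allow-empty -m "unknown" --author="SpamBot123 <spam@example.net>"

printf '%s\n' "jdoe@example.com" > verified.txt
unverified_authors verified.txt
# spam@example.net
```

The comparison is an exact, case-sensitive match on the whole email, so the allowlist should store addresses exactly as they appear in commits.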

Cleaning Up Bot Leaks

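If spam commits do land in your history, you cannot simply delete them. You must rewrite the affected branch and force-push the result; until then the old commits remain on the remote, and they also linger in local reflogs until those entries expire. The source videos suggest git rebase for small-scale cleanups and git filter-branch for scrubbing specific authors from the entire project history. The Git documentation itself now recommends the separate git-filter-repo tool over filter-branch.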


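# Use filter-branch to drop commits from a specific spammy author.
# Caveat: skip_commit removes the commit itself but not its changes, which
# fold into the next kept commit. Use git rebase to drop the changes as well.

git filter-branch --commit-filter '
    if [ "$GIT_AUTHOR_NAME" = "SpamBot123" ];
    then
        skip_commit "$@";
    else
        git commit-tree "$@";
    fi' HEAD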
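For a small cleanup, git rebase --onto can drop a single commit together with its changes. The sketch below builds a throwaway three-commit history and removes the middle commit; all names and files are placeholders, and it assumes the later commits do not depend on the spam commit (otherwise the rebase stops on a conflict).

```shell
set -eu

# Build a throwaway history: README, spam commit, notes.
cd "$(mktemp -d)"
git init --quiet
git config user.name "Maintainer"
git config user.email "maintainer@example.com"

echo "project" > README.md
git add README.md
git commit --quiet -m "docs: add README"

echo "buy now" > spam.txt
git add spam.txt
git commit --quiet -m "spam" --author="SpamBot123 <spam@example.net>"

echo "notes" > notes.txt
git add notes.txt
git commit --quiet -m "docs: add notes"

# Locate the spam commit, then replay everything after it onto its parent.
bad="$(git log --author='SpamBot123' --format='%H' -1)"
git rebase --quiet --onto "${bad}^" "$bad"

git log --format='%an: %s'
# Maintainer: docs: add notes
# Maintainer: docs: add README
```

On a shared repository the rewritten branch would then have to be pushed with git push --force-with-lease, and collaborators would need to reset their local copies.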

Comparison: Defense Strategies

| Strategy | Pros | Cons | When to Use |
| --- | --- | --- | --- |
| Interaction Limits | Built into GitHub; zero maintenance. | Temporary (expires after at most 6 months); blunt instrument. | During an active, sudden attack. |
| External Screening | Blocks bots that cannot pass the CAPTCHA; verified history. | High friction for new contributors. | High-traffic repos with heavy AI spam. |
| Manual Moderation | No technical overhead. | Does not scale; leads to maintainer burnout. | Small, private, or niche repositories. |
| Automated Detection | Low friction for humans. | AI vs AI arms race; high false positives. | Supplement to other methods. |

Architectural Considerations

While this approach is effective against automated spam, it moves the operational burden onto the screening portal, which maintainers must keep highly available. If the form or the CAPTCHA service goes down, no newcomer can get a first commit merged, and your repository effectively becomes 'read-only' for everyone who is not already a contributor.

Furthermore, while GitHub organizations can now delete posts when blocking a user, that user can still operate elsewhere on the platform. The goal of the --author gateway isn't to fix GitHub's global spam problem—it's to create a 'walled garden' for your specific project so you can focus on code rather than moderation.

Frequently Asked Questions

Will this affect my project's contribution metrics?
Yes. By adding a screening step, you will likely see a drop in the number of first-time contributors. However, the quality of those contributions usually increases as low-effort participants are filtered out by the CAPTCHA.

Can bots bypass the --author check?
The check isn't the authorship itself, but the 'Prior Contributor' status. Since a bot cannot merge its own PR in a protected repo, it cannot gain this status without passing your external screening portal first.

Is it safe to automate commits with --author?
Yes, as long as the automation is triggered by a secure event (like a CAPTCHA solve). Ensure your automation token has 'contents: write' permissions but is restricted to specific verification branches.

How does this handle GitHub Discussions?
GitHub allows you to restrict Discussions to established contributors similarly to Issues and PRs. By forcing the first 'touch' through a gated commit, you protect the entire community ecosystem.

If you're managing a growing project and the manual overhead of bot moderation is stalling your roadmap, it's time to automate your gates. At AImatic, we build custom automation workflows that secure your development cycle without killing your velocity. Reach out at hello@aimatic.dev to discuss hardening your repo.
