Skip to content

Add AI-disclosure and quality requirements to the contribution guidelines#2143

Merged
Byron merged 2 commits intomainfrom
contrbuting
May 6, 2026
Merged

Add AI-disclosure and quality requirements to the contribution guidelines#2143
Byron merged 2 commits intomainfrom
contrbuting

Conversation

@Byron
Copy link
Copy Markdown
Member

@Byron Byron commented May 6, 2026

Tasks

  • Eliah's review

…ines.

Co-authored-by: GPT 5.5 <codex@openai.com>
@Byron Byron requested a review from EliahKagan May 6, 2026 07:01
Split the quality-expectations section into two paragraphs (the warning
about low-quality contributions being declined was visually merged with
the preceding paragraph). Replace "and the pull request closed without
warning" with a note that maintainers may not always be able to provide
detailed feedback, which conveys the same practical reality.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Member

@EliahKagan EliahKagan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed a commit (92ff6df). How do you feel about this wording? I worry this may soften the point you are making too much, which I would not want to happen. But this also explains the reasoning why feedback may be absent -- and I am hoping it may also avoid discouraging people from contributing, while at the same time hopefully making the situation at least as clear, and setting the exact same expectations, as in the original wording.

@EliahKagan EliahKagan requested a review from Copilot May 6, 2026 14:19
@EliahKagan EliahKagan changed the title Add AI-disclusure and quality requirements to the contribution guidelines. Add AI-disclosure and quality requirements to the contribution guidelines May 6, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the project’s contribution guidelines to set explicit quality expectations for contributions and to require disclosure/identification for AI-assisted work when it meaningfully affects a PR, commits, or GitHub interactions.

Changes:

  • Add a “Quality expectations” section defining baseline standards (readability, maintainability, tests where practical, documentation, consistency).
  • Add an “AI-assisted contributions” section describing disclosure requirements and agent identification expectations.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread CONTRIBUTING.md
Comment thread CONTRIBUTING.md
@Byron Byron marked this pull request as ready for review May 6, 2026 21:38
@Byron Byron merged commit c7648c0 into main May 6, 2026
64 of 65 checks passed
@Byron
Copy link
Copy Markdown
Member Author

Byron commented May 6, 2026

Thanks a lot! Let's use this.

@Byron Byron deleted the contrbuting branch May 6, 2026 21:39
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 7, 2026
Remove the Cygwin xfail decorations from test_submodules in
test_docs.py and test_repo.py, and from test_root_module in
test_submodule.py, so the tests surface the underlying failure
directly. Add 256 reproduce-safe-dir matrix jobs to cygwin-test.yml,
each running these three tests under the current safe.directory
configuration. All or most of these reproduce-safe-dir jobs must be
removed before this work is integrated.

The existing test job's env, defaults, and setup steps gain YAML
anchors so the new job can reference them without duplication.

Root cause hypothesis
---------------------

The CI's safe.directory list (set in the "Special configuration for
Cygwin git" step) covers the main repo and its .git directory but
not the gitdb submodule's working tree at git/ext/gitdb. Git matches
safe.directory exactly, not by prefix, so when GitPython spawns
"git cat-file --batch-check" in the gitdb submodule (via
Repo(submodule_path).odb.info(sha)), git rejects the repository for
dubious ownership and exits.

Three observed failure modes, all from the same root cause
----------------------------------------------------------

The git subprocess exits soon after starting. What Python sees
depends on a race between Python writing to the subprocess's stdin
and the subprocess exiting and closing its stdout pipe.

1. ValueError ("SHA is empty, possible dubious ownership..."):
   Python's write/flush completes before git has finished exiting.
   The buffered write succeeds, then stdout.readline() returns b""
   (EOF). _parse_object_header(b"") in git/cmd.py raises this
   ValueError. The error message names the rejected directory and
   even suggests the safe.directory fix. This propagates uncaught
   from Object.new_from_sha through name_to_object (line 229 of
   git/repo/fun.py is outside the try/except loop) through
   repo.commit("HEAD") to iter_items, where the
   "except (IOError, BadName)" clause does not catch ValueError.

2. IndexError ("list index out of range"):
   Git exits before Python's write/flush runs. cmd.stdin.write or
   cmd.stdin.flush raises BrokenPipeError, a subclass of OSError
   (IOError). This time iter_items's "except (IOError, BadName)"
   catches it, returning an empty iterator. children() therefore
   returns an empty list, and "[0]" in test_submodules raises
   IndexError.

3. AssertionError ("1 not greater than or equal to 2"):
   Same BrokenPipeError-caught-as-IOError mechanism as case 2, but
   manifesting in test_repo.py::test_submodules, which does
   "assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2)".
   The recursive traversal finds gitdb (via the main repo, which is
   in safe.directory) but cannot enumerate gitdb's children, so
   only one submodule is yielded.

Evidence
--------

Recent main-branch CI runs show all three xfailed tests consistently
matching their raises=ValueError xfail, e.g. job 74546730688 in run
25415738988. Verified across the five most recent main runs.

PR gitpython-developers#2143 attempt 1 (job 74630986491, run 25440735020 attempt 1) had
test_docs.py FAILED with IndexError while test_repo.py XFAILed with
ValueError in the same job. PR gitpython-developers#2143 attempt 2 (job 74633063805) had
both XFAIL with ValueError. Same commit, same workflow, same runner
image -- a flaky race.

The 3-job trial of these reproduce-safe-dir jobs (run 25454533092,
commit baf3526) produced 8 ValueError and 1 IndexError out of 9
test runs.

The 30-job run (run 25454836713) produced 90 ValueError and 0
IndexError in the reproduce jobs, but the same run's test (fast)
job exhibited the AssertionError variant in
test_repo.py::test_submodules. That AssertionError is the third
manifestation of the BrokenPipeError race.

Plan
----

1. (this commit) Reproduce the failure under the current
   safe.directory configuration, with xfails removed so failures
   surface directly. The 256-job matrix is intended to characterize
   the rate of each variant.

2. Apply the fix: extend the safe.directory configuration to cover
   the gitdb submodule's working tree (and smmap's, recursively).
   Verify that the tests now pass.

3. Remove the reproduce-safe-dir jobs and the YAML anchors that
   only the reproduce job needed. The existing test job retains the
   permanent fix.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 7, 2026
Remove the Cygwin xfail decorations from test_submodules in
test_docs.py and test_repo.py, and from test_root_module in
test_submodule.py, so the tests surface the underlying failure
directly. Add 256 reproduce-safe-dir matrix jobs to cygwin-test.yml,
each running these three tests under the current safe.directory
configuration. All or most of these reproduce-safe-dir jobs must be
removed before this work is integrated.

The existing test job's env, defaults, and setup steps gain YAML
anchors so the new job can reference them without duplication.

Root cause hypothesis
---------------------

The CI's safe.directory list (set in the "Special configuration for
Cygwin git" step) covers the main repo and its .git directory but
not the gitdb submodule's working tree at git/ext/gitdb. Git matches
safe.directory exactly, not by prefix, so when GitPython spawns
"git cat-file --batch-check" in the gitdb submodule (via
Repo(submodule_path).odb.info(sha)), git rejects the repository for
dubious ownership and exits.

Three observed failure modes, all from the same root cause
----------------------------------------------------------

The git subprocess exits soon after starting. What Python sees
depends on a race between Python writing to the subprocess's stdin
and the subprocess exiting and closing its stdout pipe.

1. ValueError ("SHA is empty, possible dubious ownership..."):
   Python's write/flush completes before git has finished exiting.
   The buffered write succeeds, then stdout.readline() returns b""
   (EOF). _parse_object_header(b"") in git/cmd.py raises this
   ValueError. The error message names the rejected directory and
   even suggests the safe.directory fix. This propagates uncaught
   from Object.new_from_sha through name_to_object (line 229 of
   git/repo/fun.py is outside the try/except loop) through
   repo.commit("HEAD") to iter_items, where the
   "except (IOError, BadName)" clause does not catch ValueError.

2. IndexError ("list index out of range"):
   Git exits before Python's write/flush runs. cmd.stdin.write or
   cmd.stdin.flush raises BrokenPipeError, a subclass of OSError
   (IOError). This time iter_items's "except (IOError, BadName)"
   catches it, returning an empty iterator. children() therefore
   returns an empty list, and "[0]" in test_submodules raises
   IndexError.

3. AssertionError ("1 not greater than or equal to 2"):
   Same BrokenPipeError-caught-as-IOError mechanism as case 2, but
   manifesting in test_repo.py::test_submodules, which does
   "assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2)".
   The recursive traversal finds gitdb (via the main repo, which is
   in safe.directory) but cannot enumerate gitdb's children, so
   only one submodule is yielded.

Evidence
--------

Recent main-branch CI runs show all three xfailed tests consistently
matching their raises=ValueError xfail, e.g. job 74546730688 in run
25415738988. Verified across the five most recent main runs.

PR gitpython-developers#2143 attempt 1 (job 74630986491, run 25440735020 attempt 1) had
test_docs.py FAILED with IndexError while test_repo.py XFAILed with
ValueError in the same job. PR gitpython-developers#2143 attempt 2 (job 74633063805) had
both XFAIL with ValueError. Same commit, same workflow, same runner
image -- a flaky race.

The 3-job trial of these reproduce-safe-dir jobs (run 25454533092,
commit baf3526) produced 8 ValueError and 1 IndexError out of 9
test runs.

The 30-job run (run 25454836713) produced 90 ValueError and 0
IndexError in the reproduce jobs, but the same run's test (fast)
job exhibited the AssertionError variant in
test_repo.py::test_submodules. That AssertionError is the third
manifestation of the BrokenPipeError race.

Plan
----

1. (this commit) Reproduce the failure under the current
   safe.directory configuration, with xfails removed so failures
   surface directly. The 256-job matrix is intended to characterize
   the rate of each variant.

2. Apply the fix: extend the safe.directory configuration to cover
   the gitdb submodule's working tree (and smmap's, recursively).
   Verify that the tests now pass.

3. Remove the reproduce-safe-dir jobs and the YAML anchors that
   only the reproduce job needed. The existing test job retains the
   permanent fix.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 7, 2026
Remove the Cygwin xfail decorations from test_submodules in
test_docs.py and test_repo.py, and from test_root_module in
test_submodule.py, so the tests surface the underlying failure
directly. Add 256 reproduce-safe-dir matrix jobs to cygwin-test.yml,
each running these three tests under the current safe.directory
configuration. All or most of these reproduce-safe-dir jobs must be
removed before this work is integrated.

The existing test job's env, defaults, and setup steps gain YAML
anchors so the new job can reference them without duplication.

Root cause
----------

The CI workflow adds the main repo and its .git directory to
safe.directory but not the gitdb or smmap submodule working trees.
Git matches safe.directory by exact path. When GitPython opens a
submodule as a Repo and runs "git cat-file --batch-check" against
it, git rejects the repository for dubious ownership and exits.

The race
--------

GitPython caches the cat-file process. When git has exited, the
next call to __get_object_header (git/cmd.py:1697) hits one of
two outcomes depending on whether Python's cmd.stdin.flush() runs
before or after the kernel marks the read end of the pipe as
closed. (Same mechanism as the long-standing gitpython-developers#427, in which SIGINT
kills the cat-file process and the next flush() raises
BrokenPipeError.)

If Python wins the race, flush() succeeds (data goes into the
kernel buffer); the subsequent stdout.readline() returns b"" (EOF);
_parse_object_header at git/cmd.py:1659 raises a ValueError whose
message names the rejected directory.

If git wins, flush() raises BrokenPipeError, which is OSError, which
is IOError.

In both paths, the exception travels from Object.new_from_sha at
git/repo/fun.py:229 (outside name_to_object's try/except for
ValueError-from-dereference_recursive) up through repo.commit("HEAD")
into iter_items at git/objects/submodule/base.py. That function's
"except (IOError, BadName)" at base.py:1597 catches BrokenPipeError
but not ValueError. So Path 1 propagates ValueError all the way to
the test; Path 2 ends with iter_items returning early and the test
seeing an empty submodule list.

Per-test outcomes
-----------------

What the test sees in Path 2 is determined by what the test does
with the empty list:

- test_docs::test_submodules does sm.children()[0].name, so the
  [0] on the empty list raises IndexError at git/util.py:1212.

- test_repo::test_submodules does
  assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2).
  The recursive traversal yields gitdb (its iter_items on the *main*
  repo succeeds, because the main repo IS in safe.directory) but not
  smmap, so length 1 fails the assertion at test_repo.py:882.

- test_submodule::test_root_module does "assert len(rsmsp) >= 2" on
  a similar traversal result, failing at test_submodule.py:513.

The race itself is non-deterministic, but the mapping from race
outcome to exception type is deterministic per test, so each test
has exactly two possible failure types -- one per side of the race:

  test_docs:       ValueError (Python wins) or IndexError (git wins).
  test_repo:       ValueError (Python wins) or AssertionError (git wins).
  test_submodule:  ValueError (Python wins) or AssertionError (git wins).

In particular, test_docs never produces AssertionError, and
test_repo and test_submodule never produce IndexError.

Empirical confirmation
----------------------

Across 1057 reproduce-safe-dir jobs and 5 buggy-config test (fast)
jobs in the runs cited below, every one of 2403 test failures
traces to one of four source lines: git/cmd.py:1659 (ValueError,
~98.7%), git/util.py:1212 (IndexError, only test_docs),
test/test_repo.py:882 (AssertionError, only test_repo), or
test/test_submodule.py:513 (AssertionError, only test_submodule).
Zero violations of the per-test prediction.

Reproduce-safe-dir runs on the fork:

  https://github.com/EliahKagan/GitPython/actions/runs/25454533092
  https://github.com/EliahKagan/GitPython/actions/runs/25454836713
  https://github.com/EliahKagan/GitPython/actions/runs/25472029324
  https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/1
  https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/2

The first run with the fix applied (256 jobs, 768 test outcomes,
all PASSED):

  https://github.com/EliahKagan/GitPython/actions/runs/25473807645

The independently observed race in the field that prompted this
investigation, from a test (fast) job in a PR gitpython-developers#2143 CI run:

  https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1

Plan
----

1. (this commit) Reproduce the failure under the current
   safe.directory configuration, with xfails removed so failures
   surface directly.
2. Apply the fix: extend the safe.directory configuration to cover
   the gitdb submodule's working tree (and smmap's, recursively).
3. Remove the reproduce-safe-dir jobs and the YAML anchors that
   only the reproduce job needed. The existing test job retains the
   permanent fix.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 7, 2026
Remove the Cygwin xfail decorations from test_submodules in
test_docs.py and test_repo.py, and from test_root_module in
test_submodule.py, so the tests surface the underlying failure
directly. Add 256 reproduce-safe-dir matrix jobs to cygwin-test.yml,
each running these three tests under the current safe.directory
configuration. All or most of these reproduce-safe-dir jobs must be
removed before this work is integrated.

The existing test job's env, defaults, and setup steps gain YAML
anchors so the new job can reference them without duplication.

Root cause
----------

The CI workflow adds the main repo and its .git directory to
safe.directory but not the gitdb or smmap submodule working trees.
Git matches safe.directory by exact path. When GitPython opens a
submodule as a Repo and runs "git cat-file --batch-check" against
it, git rejects the repository for dubious ownership and exits.

The race
--------

GitPython caches the cat-file process. When git has exited, the
next call to __get_object_header (git/cmd.py:1697) hits one of
two outcomes depending on whether Python's cmd.stdin.flush() runs
before or after the kernel marks the read end of the pipe as
closed. (Same mechanism as the long-standing gitpython-developers#427, in which SIGINT
kills the cat-file process and the next flush() raises
BrokenPipeError.)

If Python wins the race, flush() succeeds (data goes into the
kernel buffer); the subsequent stdout.readline() returns b"" (EOF);
_parse_object_header at git/cmd.py:1659 raises a ValueError whose
message names the rejected directory.

If git wins, flush() raises BrokenPipeError, which is OSError, which
is IOError.

In both paths, the exception travels from Object.new_from_sha at
git/repo/fun.py:229 (outside name_to_object's try/except for
ValueError-from-dereference_recursive) up through repo.commit("HEAD")
into iter_items at git/objects/submodule/base.py. That function's
"except (IOError, BadName)" at base.py:1597 catches BrokenPipeError
but not ValueError. So Path 1 propagates ValueError all the way to
the test; Path 2 ends with iter_items returning early and the test
seeing an empty submodule list.

Per-test outcomes
-----------------

What the test sees in Path 2 is determined by what the test does
with the empty list:

- test_docs::test_submodules does sm.children()[0].name, so the
  [0] on the empty list raises IndexError at git/util.py:1212.

- test_repo::test_submodules does
  assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2).
  The recursive traversal yields gitdb (its iter_items on the *main*
  repo succeeds, because the main repo IS in safe.directory) but not
  smmap, so length 1 fails the assertion at test_repo.py:882.

- test_submodule::test_root_module does "assert len(rsmsp) >= 2" on
  a similar traversal result, failing at test_submodule.py:513.

The race itself is non-deterministic, but the mapping from race
outcome to exception type is deterministic per test, so each test
has exactly two possible failure types -- one per side of the race:

  test_docs:       ValueError (Python wins) or IndexError (git wins).
  test_repo:       ValueError (Python wins) or AssertionError (git wins).
  test_submodule:  ValueError (Python wins) or AssertionError (git wins).

In particular, test_docs never produces AssertionError, and
test_repo and test_submodule never produce IndexError.

Empirical confirmation
----------------------

Across 1057 reproduce-safe-dir jobs and 5 buggy-config test (fast)
jobs in the runs cited below, every one of 2403 test failures
traces to one of four source lines: git/cmd.py:1659 (ValueError,
~98.7%), git/util.py:1212 (IndexError, only test_docs),
test/test_repo.py:882 (AssertionError, only test_repo), or
test/test_submodule.py:513 (AssertionError, only test_submodule).
Zero violations of the per-test prediction.

Reproduce-safe-dir runs on the fork:

  https://github.com/EliahKagan/GitPython/actions/runs/25454533092
  https://github.com/EliahKagan/GitPython/actions/runs/25454836713
  https://github.com/EliahKagan/GitPython/actions/runs/25472029324
  https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/1
  https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/2

The first run with the fix applied (256 jobs, 768 test outcomes,
all PASSED):

  https://github.com/EliahKagan/GitPython/actions/runs/25473807645

The independently observed race in the field that prompted this
investigation, from a test (fast) job in a PR gitpython-developers#2143 CI run:

  https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1

Plan
----

1. (this commit) Reproduce the failure under the current
   safe.directory configuration, with xfails removed so failures
   surface directly.
2. Apply the fix: extend the safe.directory configuration to cover
   the gitdb submodule's working tree (and smmap's, recursively).
3. Remove the reproduce-safe-dir jobs and the YAML anchors that
   only the reproduce job needed. The existing test job retains the
   permanent fix.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants