Add AI-disclosure and quality requirements to the contribution guidelines #2143
…ines. Co-authored-by: GPT 5.5 <codex@openai.com>
Split the quality-expectations section into two paragraphs (the warning about low-quality contributions being declined was visually merged with the preceding paragraph). Replace "and the pull request closed without warning" with a note that maintainers may not always be able to provide detailed feedback, which conveys the same practical reality. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
I've pushed a commit (92ff6df). How do you feel about this wording? I worry it may soften your point too much, which I would not want. But it also explains why feedback may be absent, and I hope it avoids discouraging people from contributing while making the situation at least as clear, and setting exactly the same expectations, as the original wording.
Pull request overview
This PR updates the project’s contribution guidelines to set explicit quality expectations for contributions and to require disclosure/identification for AI-assisted work when it meaningfully affects a PR, commits, or GitHub interactions.
Changes:
- Add a “Quality expectations” section defining baseline standards (readability, maintainability, tests where practical, documentation, consistency).
- Add an “AI-assisted contributions” section describing disclosure requirements and agent identification expectations.
Thanks a lot! Let's use this.
Remove the Cygwin xfail decorations from test_submodules in
test_docs.py and test_repo.py, and from test_root_module in
test_submodule.py, so the tests surface the underlying failure
directly. Add 256 reproduce-safe-dir matrix jobs to cygwin-test.yml,
each running these three tests under the current safe.directory
configuration. All or most of these reproduce-safe-dir jobs must be
removed before this work is integrated.

The existing test job's env, defaults, and setup steps gain YAML
anchors so the new job can reference them without duplication.

Root cause hypothesis
---------------------

The CI's safe.directory list (set in the "Special configuration for
Cygwin git" step) covers the main repo and its .git directory but
not the gitdb submodule's working tree at git/ext/gitdb. Git matches
safe.directory exactly, not by prefix, so when GitPython spawns
"git cat-file --batch-check" in the gitdb submodule (via
Repo(submodule_path).odb.info(sha)), git rejects the repository for
dubious ownership and exits.

Three observed failure modes, all from the same root cause
----------------------------------------------------------

The git subprocess exits soon after starting. What Python sees
depends on a race between Python writing to the subprocess's stdin
and the subprocess exiting and closing its stdout pipe.

1. ValueError ("SHA is empty, possible dubious ownership..."):
   Python's write/flush completes before git has finished exiting.
   The buffered write succeeds, then stdout.readline() returns b""
   (EOF). _parse_object_header(b"") in git/cmd.py raises this
   ValueError. The error message names the rejected directory and
   even suggests the safe.directory fix. This propagates uncaught
   from Object.new_from_sha through name_to_object (line 229 of
   git/repo/fun.py is outside the try/except loop) through
   repo.commit("HEAD") to iter_items, where the
   "except (IOError, BadName)" clause does not catch ValueError.

2. IndexError ("list index out of range"):
   Git exits before Python's write/flush runs. cmd.stdin.write or
   cmd.stdin.flush raises BrokenPipeError, a subclass of OSError
   (IOError). This time iter_items's "except (IOError, BadName)"
   catches it, returning an empty iterator. children() therefore
   returns an empty list, and "[0]" in test_submodules raises
   IndexError.

3. AssertionError ("1 not greater than or equal to 2"):
   Same BrokenPipeError-caught-as-IOError mechanism as case 2, but
   manifesting in test_repo.py::test_submodules, which does
   "assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2)".
   The recursive traversal finds gitdb (via the main repo, which is
   in safe.directory) but cannot enumerate gitdb's children, so
   only one submodule is yielded.
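
The race described above can be sketched outside GitPython. This is a minimal illustration, not GitPython's code: it assumes a POSIX-like pipe, and it uses a Python child that exits immediately as a stand-in for a git process that rejects the repository. The function name probe_exited_child and the outcome labels are hypothetical.

```python
import subprocess
import sys


def probe_exited_child():
    """Write to a child that exits without reading stdin; report what Python sees."""
    # Stand-in for a git process that exits right after starting
    # (an assumption for illustration; no git is involved here).
    child = subprocess.Popen(
        [sys.executable, "-c", "import sys; sys.exit(1)"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    try:
        child.stdin.write(b"HEAD\n")
        child.stdin.flush()
    except BrokenPipeError:
        # The child closed its end of the pipe before the flush ran.
        outcome = "BrokenPipeError"
    else:
        # The write landed in the pipe buffer; the read then hits EOF.
        line = child.stdout.readline()
        outcome = "EOF" if line == b"" else "unexpected output"
    finally:
        child.stdout.close()
        try:
            # Closing flushes again, so it can fail the same way.
            child.stdin.close()
        except OSError:
            pass
        child.wait()
    return outcome


if __name__ == "__main__":
    print(probe_exited_child())
```

Which label comes back depends on timing, so repeated calls may disagree. That variability is the point of the sketch: the same dead child can surface either as an empty read or as a failed write.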

Evidence
--------

Recent main-branch CI runs show all three xfailed tests consistently
matching their raises=ValueError xfail, e.g. job 74546730688 in run
25415738988. Verified across the five most recent main runs.

PR gitpython-developers#2143 attempt 1 (job 74630986491, run 25440735020 attempt 1) had
test_docs.py FAILED with IndexError while test_repo.py XFAILed with
ValueError in the same job. PR gitpython-developers#2143 attempt 2 (job 74633063805) had
both XFAIL with ValueError. Same commit, same workflow, same runner
image -- a flaky race.

The 3-job trial of these reproduce-safe-dir jobs (run 25454533092,
commit baf3526) produced 8 ValueError and 1 IndexError out of 9
test runs.

The 30-job run (run 25454836713) produced 90 ValueError and 0
IndexError in the reproduce jobs, but the same run's test (fast)
job exhibited the AssertionError variant in
test_repo.py::test_submodules. That AssertionError is the third
manifestation of the BrokenPipeError race.

Plan
----

1. (this commit) Reproduce the failure under the current
   safe.directory configuration, with xfails removed so failures
   surface directly. The 256-job matrix is intended to characterize
   the rate of each variant.

2. Apply the fix: extend the safe.directory configuration to cover
   the gitdb submodule's working tree (and smmap's, recursively).
   Verify that the tests now pass.

3. Remove the reproduce-safe-dir jobs and the YAML anchors that
   only the reproduce job needed. The existing test job retains the
   permanent fix.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
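
Step 2 of the plan above amounts to registering more paths with safe.directory. The sketch below only builds the "git config --global --add safe.directory <path>" argument lists and does not run them. It is an illustration under assumptions: the helper name safe_directory_commands, the example root /work/GitPython, and the smmap path are hypothetical and not taken from the workflow; only git/ext/gitdb is named in the commit message.

```python
from pathlib import PurePosixPath


def safe_directory_commands(repo_root, submodule_paths):
    """Build one `git config` argv per directory that git should trust."""
    root = PurePosixPath(repo_root)
    # The main repo and its .git directory are already covered today;
    # the submodule working trees are the missing entries.
    targets = [root, root / ".git"]
    targets += [root / rel for rel in submodule_paths]
    return [
        ["git", "config", "--global", "--add", "safe.directory", str(target)]
        for target in targets
    ]


if __name__ == "__main__":
    commands = safe_directory_commands(
        "/work/GitPython",  # hypothetical checkout location
        [
            "git/ext/gitdb",
            "git/ext/gitdb/gitdb/ext/smmap",  # assumed location of smmap
        ],
    )
    for argv in commands:
        print(" ".join(argv))
```

Listing each working tree explicitly follows the commit message's observation that safe.directory entries are matched as exact paths.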
Remove the Cygwin xfail decorations from test_submodules in
test_docs.py and test_repo.py, and from test_root_module in
test_submodule.py, so the tests surface the underlying failure
directly. Add 256 reproduce-safe-dir matrix jobs to cygwin-test.yml,
each running these three tests under the current safe.directory
configuration. All or most of these reproduce-safe-dir jobs must be
removed before this work is integrated.

The existing test job's env, defaults, and setup steps gain YAML
anchors so the new job can reference them without duplication.

Root cause
----------

The CI workflow adds the main repo and its .git directory to
safe.directory but not the gitdb or smmap submodule working trees.
Git matches safe.directory by exact path. When GitPython opens a
submodule as a Repo and runs "git cat-file --batch-check" against it,
git rejects the repository for dubious ownership and exits.

The race
--------

GitPython caches the cat-file process. When git has exited, the next
call to __get_object_header (git/cmd.py:1697) hits one of two outcomes
depending on whether Python's cmd.stdin.flush() runs before or after
the kernel marks the read end of the pipe as closed. (Same mechanism
as the long-standing gitpython-developers#427, in which SIGINT kills the cat-file process
and the next flush() raises BrokenPipeError.)

If Python wins the race (Path 1), flush() succeeds (data goes into the
kernel buffer); the subsequent stdout.readline() returns b"" (EOF);
_parse_object_header at git/cmd.py:1659 raises a ValueError whose
message names the rejected directory.

If git wins (Path 2), flush() raises BrokenPipeError, a subclass of
OSError, of which IOError is an alias.

In both paths, the exception travels from Object.new_from_sha at
git/repo/fun.py:229 (outside name_to_object's try/except for
ValueError-from-dereference_recursive) up through repo.commit("HEAD")
into iter_items at git/objects/submodule/base.py. That function's
"except (IOError, BadName)" at base.py:1597 catches BrokenPipeError
but not ValueError.

So Path 1 propagates ValueError all the way to the test; Path 2 ends
with iter_items returning early and the test seeing an empty submodule
list.

Per-test outcomes
-----------------

What the test sees in Path 2 is determined by what the test does with
the empty list:

- test_docs::test_submodules does sm.children()[0].name, so the [0]
  on the empty list raises IndexError at git/util.py:1212.

- test_repo::test_submodules does
  assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2).
  The recursive traversal yields gitdb (its iter_items on the *main*
  repo succeeds, because the main repo IS in safe.directory) but not
  smmap, so length 1 fails the assertion at test_repo.py:882.

- test_submodule::test_root_module does "assert len(rsmsp) >= 2" on a
  similar traversal result, failing at test_submodule.py:513.

The race itself is non-deterministic, but the mapping from race
outcome to exception type is deterministic per test, so each test has
exactly two possible failure types -- one per side of the race:

  test_docs:      ValueError (Python wins) or IndexError (git wins).
  test_repo:      ValueError (Python wins) or AssertionError (git wins).
  test_submodule: ValueError (Python wins) or AssertionError (git wins).

In particular, test_docs never produces AssertionError, and test_repo
and test_submodule never produce IndexError.

Empirical confirmation
----------------------

Across 1057 reproduce-safe-dir jobs and 5 buggy-config test (fast)
jobs in the runs cited below, every one of 2403 test failures traces
to one of four source lines: git/cmd.py:1659 (ValueError, ~98.7%),
git/util.py:1212 (IndexError, only test_docs), test/test_repo.py:882
(AssertionError, only test_repo), or test/test_submodule.py:513
(AssertionError, only test_submodule). Zero violations of the per-test
prediction.

Reproduce-safe-dir runs on the fork:

  https://github.com/EliahKagan/GitPython/actions/runs/25454533092
  https://github.com/EliahKagan/GitPython/actions/runs/25454836713
  https://github.com/EliahKagan/GitPython/actions/runs/25472029324
  https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/1
  https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/2

The first run with the fix applied (256 jobs, 768 test outcomes, all
PASSED):

  https://github.com/EliahKagan/GitPython/actions/runs/25473807645

The independently observed race in the field that prompted this
investigation, from a test (fast) job in a PR gitpython-developers#2143 CI run:

  https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1

Plan
----

1. (this commit) Reproduce the failure under the current
   safe.directory configuration, with xfails removed so failures
   surface directly.

2. Apply the fix: extend the safe.directory configuration to cover
   the gitdb submodule's working tree (and smmap's, recursively).

3. Remove the reproduce-safe-dir jobs and the YAML anchors that
   only the reproduce job needed. The existing test job retains the
   permanent fix.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
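
The mapping from race outcome to exception type described in the commit message above can be modeled in a few lines. This is a simplified stand-in, not GitPython's code: BadName here is a local placeholder class, iter_items is reduced to its except clause, and the helper names (python_wins, git_wins, docs_style_check, repo_style_check, failure_type) are hypothetical.

```python
class BadName(Exception):
    """Local placeholder for the BadName named in the except clause."""


def iter_items(read_head):
    """Reduced model: yield a child submodule unless reading HEAD fails."""
    try:
        read_head()
    except (IOError, BadName):
        return  # early return, so the caller sees an empty iterator
    yield "smmap"


def python_wins():
    # flush() succeeded, readline() hit EOF, header parsing failed.
    raise ValueError("SHA is empty, possible dubious ownership")


def git_wins():
    # flush() failed because the pipe was already closed.
    raise BrokenPipeError("pipe closed before flush")


def docs_style_check(read_head):
    # Like sm.children()[0].name: index into the child list.
    return list(iter_items(read_head))[0]


def repo_style_check(read_head):
    # gitdb is found via the main repo; its children may be missing.
    found = ["gitdb"] + list(iter_items(read_head))
    assert len(found) >= 2


def failure_type(check, read_head):
    """Return the name of the exception the check raises, or 'passed'."""
    try:
        check(read_head)
    except Exception as exc:
        return type(exc).__name__
    return "passed"


if __name__ == "__main__":
    for check in (docs_style_check, repo_style_check):
        for side in (python_wins, git_wins):
            print(check.__name__, side.__name__, failure_type(check, side))
```

The except clause catches BrokenPipeError because it subclasses OSError, for which IOError is an alias in Python 3, while ValueError passes through. That is why each check in this model has exactly two failure types, one per side of the race.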