CI Add workflow to check Changelog entry. #19155

cmarmo · Jan 11, 2021

Reference Issues/PRs

What does this implement/fix? Explain your changes.

Add a github action checking if the pull request is referenced in the changelog together with its author.

Any other comments?

Need the creation of a label "no-changelog" for pull requests that do not need to be listed.
This is the same approach used by astropy.

See its test implementation here.

NicolasHug

Thanks @cmarmo

The original issue seemed to suggest to only trigger an error if tests were modified. Is this the case here? I think that most PRs don't require a changelog entry, so we probably don't want the error to be the default in all cases.

.github/workflows/check_changelog.yml

NicolasHug

Thanks @cmarmo, I agree we should keep the labeling thing as not all PRs modifying tests would need a changelog entry (but probably most will).

My CI expertise is weak: any chance we can make the CI check issue a "warning" rather than an error (or anything not as critical as an error)?

When this is merged, we should ping all core devs to inform them that this is in (I think we should make it a habit to ping people for such infra changes).

I haven't double checked whether the check is correct but this isn't critical so LGTM

.github/workflows/check_changelog.yml

Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>

cmarmo · Jan 12, 2021

My CI expertise is weak: any chance we can make the CI check issue a "warning" rather than an error (or anything not as critical as an error)?

I believe a warning could be issued but then the check board will show it as 'green'.
I've seen that Astropy adds "Allowed failure" in the name of the check to explicit that its failure is not critical, we can pick this option here, if you think it could be useful.

ogrisel

I like this check. I thing we could reduce the number of false "negative" by maintaining a blacklist and whitelist of labels to enable or disable it:

Whitelist:

Bug
New Feature
Performance

Blacklist:

CI / Build
Documentation

Decision rule

If the No changelog then skip the check.
Else if any of the whitelist labels are enabled then do the check.
Else if any of the blacklist labels are enabled, skip the check.
Else: do the check (this probably means that the PR labels should be updated).

What do you think about this suggestion?

ogrisel · Jan 13, 2021

.github/workflows/check_changelog.yml

+            then
+              echo "Changelog has been updated."
+            else
+              echo "Changelog entry is missing. If no changelog entry is required for this PR, label the PR with no-changelog to bypass this check."


Maybe we should be extra-explicit on case where we need a changelog entry:

bug fixes

new features (in particular with changes to the public API, e.g. with a new public class or function or a new parameter)

Changes that do not need a changelog entry:

documentation fixes or improvements

new or updated examples or benchmark

fix or improvements in the test suite

code style changes

updated continuous integration configuration

updated packaging configuration or tools (unless the previously released packages on pypi.org or conda-forge where invalid for some reason, in which case it's useful to mention the list of problematic packages in the changelog)

Note that it is ok to reference several PRs in the same changelog entry, for instance when a big task has been decomposed in subtasks or if a partial fix has been further improved in a follow-up PR.

* fix or improvements in the test suite

wrt the current implementation this case will be checked because test files are likely to be modified. Should I get rid of this check?

scikit-learn/.github/workflows/check_changelog.yml

Lines 22 to 23 in 62a5068

tests=$(git diff --name-only origin/master)

if [[ "$tests" =~ tests ]]

Note that it is ok to reference several PRs in the same changelog entry, for instance when a big task has been decomposed in subtasks or if a partial fix has been further improved in a follow-up PR.

The current implementation check if the submitted PR is cited in the changelog: this is always true when a changelog is needed, right?

If the PR is only a about fixing a bad test or adding a new one, we shall not document it in a changelog entry because it's probably useless for our end-users. I mean the test fix is useful to us and our users but since the tests are not part of the public API, it's not that interesting to the readers of the changelog.

The current implementation check if the submitted PR is cited in the changelog: this is always true when a changelog is needed, right?

Yes, this is correct.

.github/workflows/check_changelog.yml

cmarmo · Jan 13, 2021

What do you think about this suggestion?

I like it , but I'm under the impression that labels are still not systematically used in the scikit-learn issue tracker, despite the current effort of automation. I would rather to keep the workflow simple for now: it could always be improved in the future.
I think you are right about continuous integration ("Build-CI"), we can safely skip it, probably "Documentation" too.

NicolasHug · Jan 14, 2021

I thing we could reduce the number of false "negative" by maintaining a blacklist and whitelist of labels to enable or disable it:

I'm personally more concerned about false positive than false negative. Quite frankly, I don't think there are tons of occurrences where we forgot to add a changelog entry.
I also agree with @cmarmo that we don't use labels consistently enough in the PRs ATM, so having all the proposed logic above might be over-engineering for little gain

ogrisel

Ok let's keep it simple then.

I am fine with merging this as it is, unless @cmarmo would like to implement the version check thingy (https://github.com/scikit-learn/scikit-learn/pull/19155/files#r556562785).

cmarmo · Jan 15, 2021

I am fine with merging this as it is, unless @cmarmo would like to implement the version check thingy (https://github.com/scikit-learn/scikit-learn/pull/19155/files#r556562785).

Let me try to implement it : I think it is a useful test... at least this will force to move the milestones if they are obsoletes... and there are some cases in the issue tracker... :)
In the meanwhile may I ask to create the "no-changelog" label? I don't have rights for that. Thanks!

ogrisel · Jan 15, 2021

I created the new label as No Changelog Needed to be a bit more explicit and more consistent with the case style of the existing labels (even if we are not that consistent).

Can you please update the PR accordingly (both for the skip condition and the user displayed to the users in case of failure of the check).

…ion.

.github/workflows/check-changelog.yml

NicolasHug · Jan 15, 2021

.github/workflows/check-changelog.yml

+            then
+              echo "Changelog has been updated."
+              check_mlstn=$(cat ./doc/whats_new/v${MILESTONE_TITLE:0:4}.rst)
+              if [[ "$check_mlstn" =~ :pr:\`$PR_NUMBER\` ]]


I'm not sure what happens if there's no mileston label on the PR?

instead of checking all changelogs first and then the specific changelog, could we just check

all changelogs if there's no label

the expected changelog if there's a label?

I tend to prefer the current solution for the following reason: I have seen PRs mile-stoned for the r+2 release while r+1 was under development.
Assuming that the changelog entry has been written in r+1

current implementation: you will be notified that the entry has been written but there is a mis-versioning (edited: not really, because the changelog is not created yet)... but the check will be green anyway.

~~your suggestion: the check will fail. You should verify yourself if the entry is there.~~

I'm assuming that the furthest milestone has been assigned to notify low priority or to not put pressure ... I don't know... but as I've seen this quite often in the issue tracker I'm not comfortable in telling to people "don't do that" as long as an automatic check is affordable.

Not at all... sorry, the furthest changelog is not present.

but the check will be green anyway

I'm a bit confused because as far as I understand the current check, it will fail if the entry isn't written in the corresponding changelog. Which is why it seems unnecessary to check all changelogs first

I'm affected by a confirmation bias here ... :(
Another quite common case is an old milestone in a new PR:

current implementation: the entry is there, the check will tell you that the milestone is wrong

your suggestion: there is no new entry in the old changelog.

Again, I know that those cases should not happen, but they happen...

IIUC your point is that the current version will provide a more informative error message? Fair point. I still find the proposed structure below easier to follow (and more efficient, which doesn't really matter anyway). Maybe we can just amend the general error message saying

This PR is modifying tests and we thus expect a changelog entry. If no changelog entry is needed, label the PR with no changelog needed. If you see this error and there is already a changelog entry then make sure that a) the PR number is correct, b) the tagged milestone for the PR (if any) and the changelog name properly match.

WDYT?

.github/workflows/check-changelog.yml

NicolasHug · Jan 15, 2021

Basically here's what would seem natural to me:

if not tests are modified
	exit gracefully

If PR is milestoned: 
	if corresponding CL file doesn't exist:
		exit gracefully as there's probably nothing to do
	expected_CL = <that specific changelog>
else:
	expected_CL = <all changelogs>

<now ensure that the PR number is in expected_CL. If it fails, error with general informative error message>

I could be wrong but I believe that this is what the PR is currently doing, although it does not follow the above structure

cmarmo · Jan 15, 2021

This PR does the following:

skip the test if the label no-changelog is present
if tests files are not modified exits gracefully
look for the PR number in all changelogs: if no entry exit 1
check if a file correspondent to the PR milestone exists: if not exit gracefully
if yes check that if the PR is referenced there: if not exit 1 because of the mismatch

As I said before (our comments crossed) I prefer keep this way because it allows me to check mismatch between changelog and milestones as they are common in the issue tracker. But to be honest I'm unable to test all the use cases and I would rather to see it working to verify its behavior.

NicolasHug · Jan 15, 2021

Sorry if I'm missing something but I think this description is strictly equivalent to my proposal in terms of behavior: #19155 (comment)

The only difference I see is in terms of error message opportunity, for which I proposed a general solution in #19155 (comment)

NicolasHug · Jan 15, 2021

The only difference I see is in terms of error message opportunity, for which I proposed a general solution in #19155 (comment)

To clarify: I'm proposing to implement the new error message + #19155 (comment), not just the new error messages alone. The new error message proposal is so that the messages become as informative as they are in the current version

If the proposed structure leads to a different behavior then please let me know, but otherwise I think it will be simpler, cleaner, and easier to maintain.

cmarmo · Jan 15, 2021

If the proposed structure leads to a different behavior then please let me know, but otherwise I think it will be simpler, cleaner, and easier to maintain.

I'm quite convinced the current structure is different from your suggestion: that's why I needed to split the message in two parts.
If the PR is milestoned r+2 (for any reason) the file doesn't exist, if I understand correctly:

current implementation: the check will verify if there is an entry in all changelog (containing r+1), then tell you that there is a mismatch.
your suggestion: the check is not performed

Again, this is maybe not an expected use case but it happened already (in particular with the 1.0 version). Maybe this will not happen in the future but I can't be sure of that.

NicolasHug

Ah I see, thanks for re-clarifying.

It is indeed an edge case and the fix for that use-case would be to update the PR's tag, not the entry (which is why it got me confused)... but OK. I made some last comments, maybe @ogrisel can make a last pass and merge

Thanks @cmarmo

.github/workflows/check-changelog.yml

NicolasHug · Jan 15, 2021

.github/workflows/check-changelog.yml

+          set -xe
+          changed_files=$(git diff --name-only origin/master)
+          # Changelog should be updated only if tests have been modified
+          if [[ "$changed_files" =~ tests ]]


I'd still suggest to exit early here if no tests are changed: this saves one nested block, which helps readability (especially in bash ;) )

I've seen your previous comment and and replied there: I'm sorry but I can't understand what you are suggesting.
Should I add an else with exit 0? The end of this if is the end of the script, I guess if the check is False there is nothing else to be executed...

Sorry I missed your reply up there. Basically I'm suggesting to do

if [[ ! "$changed_files" =~ tests ]] # added the "!" here then exit 0 fi # <- move very last fi here # continue as before but everything indented left

ogrisel · Feb 23, 2021

I did another pass. Let's try this for real and we will see if we need to adjust the conditions or message based on contributors feedback in PRs.

ogrisel · Feb 23, 2021

Merged!

really just an innocuous commit to check scikit-learn#19155

jnothman · Feb 24, 2021

I pushed a couple of small fixes: f2943c6 and 86445ab

Where did we decide on the logic that we should check all the changelogs? I might have thought it good to error if a PR added an entry to already released changelog, without being tagged with a corresponding milestone.

jnothman · Feb 24, 2021

I also note that with set -x we're printing the entire changelog to the Github Check logs.

cmarmo · Feb 24, 2021

Where did we decide on the logic that we should check all the changelogs? I might have thought it good to error if a PR added an entry to already released changelog, without being tagged with a corresponding milestone.

This is indeed what happens: if a PR is referenced in a previous changelog without being tagged the check errors. As milestones are not mandatory for now I can't check only the milestone changelog but I need to look for the PR everywhere. To fix the error the contributor should add or move the entry, or the team should change the milestone (this can be useful for minor releases), if any.
I think that making the milestone mandatory would be great.... but I don't think this is doable for now.

jnothman · Feb 24, 2021

if a PR is referenced in a previous changelog without being tagged the check errors.

I don't see this logic in the code. If a PR is referenced in a previous changelog, and the PR is not tagged with a milestone, then the check passes, regardless of which changelog the PR edits. I am suggesting, rather, that the PR enforces that the changelog entry is in the latest version's logs.

I think that making the milestone mandatory would be great....

Perhaps with automation? Previously it's not been that helpful except for timely issues.

Add workflow to check Changelog entry.

90321f4

cmarmo changed the title ~~[BLD] Add workflow to check Changelog entry.~~ CI Add workflow to check Changelog entry. Jan 11, 2021

cmarmo added the Build / CI label Jan 11, 2021

NicolasHug reviewed Jan 12, 2021

View reviewed changes

.github/workflows/check_changelog.yml Outdated Show resolved Hide resolved

Check changelog only if tests have been modified. Remove grep command.

09fb7d9

NicolasHug approved these changes Jan 12, 2021

View reviewed changes

.github/workflows/check_changelog.yml Outdated Show resolved Hide resolved

.github/workflows/check_changelog.yml Outdated Show resolved Hide resolved

More explicit comment and message

62a5068

Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>

ogrisel reviewed Jan 13, 2021

View reviewed changes

Define variables.

0ff8125

ogrisel approved these changes Jan 14, 2021

View reviewed changes

ogrisel added the No Changelog Needed label Jan 15, 2021

cmarmo added 4 commits January 15, 2021 12:24

Merge branch 'master' into changelog-check

e88fbcc

Add check for the milestone. Change file name. Group variable definit…

5a7c871

…ion.

Revert variable definition.

be40e43

Fix syntax error.

4ae42a8

NicolasHug reviewed Jan 15, 2021

View reviewed changes

cmarmo added 2 commits January 15, 2021 15:29

Check the existence of the milestone. Add comments.

dd4c1aa

Rename variables.

4dcd9df

NicolasHug reviewed Jan 15, 2021

View reviewed changes

.github/workflows/check-changelog.yml Outdated Show resolved Hide resolved

Make the error message more informative.

145ed05

NicolasHug approved these changes Jan 15, 2021

View reviewed changes

cmarmo added 2 commits January 15, 2021 17:50

Clarify the error message.

7d013bb

Exit early if tests are not modified.

6526b88

Base automatically changed from master to main January 22, 2021 10:53

cmarmo added 2 commits February 8, 2021 23:39

Merge branch 'main' into changelog-check

94487ff

Merge branch 'main' into changelog-check

59c60f6

ogrisel merged commit fab739c into scikit-learn:main Feb 23, 2021

cmarmo deleted the changelog-check branch February 23, 2021 17:23

jnothman added a commit that referenced this pull request Feb 24, 2021

CI Fix shell syntax introduced in #19155

f2943c6

jnothman added a commit to thomasjpfan/scikit-learn that referenced this pull request Feb 24, 2021

Consistent use of f-strings

4aaafed

really just an innocuous commit to check scikit-learn#19155

cmarmo mentioned this pull request Feb 24, 2021

[CI] Add trigging events to check-changelog workflow. #19545

Merged

cmarmo mentioned this pull request Mar 2, 2021

Check for milestone and change log connection #19596

Closed

2 tasks

glemaitre mentioned this pull request Apr 22, 2021

Release 0.24.2 #19954

Merged

12 tasks

	tests=$(git diff --name-only origin/master)
	if [[ "$tests" =~ tests ]]

Search code, repositories, users, issues, pull requests...

Uh oh!

CI Add workflow to check Changelog entry. #19155

CI Add workflow to check Changelog entry. #19155

Uh oh!

Conversation

cmarmo commented Jan 11, 2021

Reference Issues/PRs

What does this implement/fix? Explain your changes.

Any other comments?

Uh oh!

NicolasHug left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

NicolasHug left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

cmarmo commented Jan 12, 2021

Uh oh!

ogrisel left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Whitelist:

Blacklist:

Decision rule

Uh oh!

ogrisel Jan 13, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

cmarmo commented Jan 13, 2021

Uh oh!

NicolasHug commented Jan 14, 2021

Uh oh!

ogrisel left a comment

Choose a reason for hiding this comment

Uh oh!

cmarmo commented Jan 15, 2021

Uh oh!

ogrisel commented Jan 15, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cmarmo Jan 15, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

NicolasHug Jan 15, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

NicolasHug commented Jan 15, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cmarmo commented Jan 15, 2021

Uh oh!

NicolasHug commented Jan 15, 2021

Uh oh!

NicolasHug commented Jan 15, 2021

Uh oh!

ogrisel left a comment •

edited

Loading

ogrisel Jan 13, 2021 •

edited

Loading

ogrisel commented Jan 15, 2021 •

edited

Loading

cmarmo Jan 15, 2021 •

edited

Loading

NicolasHug Jan 15, 2021 •

edited

Loading

NicolasHug commented Jan 15, 2021 •

edited

Loading

NicolasHug Jan 15, 2021 •

edited

Loading