bpo-21475: Support the Sitemap extension in robotparser #6883

Merged
ned-deily merged 13 commits into python:master from mcscope:robotparser_site_maps
May 16, 2018

Conversation

@mcscope

@mcscope mcscope commented May 15, 2018

Contributor

This ticket has been open for 3 years just because it was awaiting tests. I took the existing patch and added a test.

https://bugs.python.org/issue21475
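
For readers unfamiliar with the feature, here is a minimal sketch of how the `site_maps()` method added by this PR might be used once it is available (Python 3.8 and later). The robots.txt content, the `example.com` URLs, and the helper name `sitemaps_from_text` are invented for illustration; only `RobotFileParser`, `parse()`, and `site_maps()` come from `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, made up for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/sitemap-news.xml
"""


def sitemaps_from_text(text):
    """Parse robots.txt text and return its Sitemap entries.

    Returns a list of URLs, or None when the file has no Sitemap lines.
    """
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser.site_maps()


# A file with Sitemap lines yields a list of URLs.
print(sitemaps_from_text(ROBOTS_TXT))

# A file without Sitemap lines yields None rather than an empty list.
print(sitemaps_from_text("User-agent: *\nDisallow: /private/\n"))
```

Returning `None` instead of an empty list mirrors the `if not self.sitemaps:` branch discussed later in the review.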

@the-knights-who-say-ni


Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA (this might be simply due to a missing "GitHub Name" entry in your b.p.o account settings). This is necessary for legal reasons before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

When your account is ready, please add a comment in this pull request
and a Python core developer will remove the CLA not signed label
to make the bot check again.

Thanks again for your contribution and we look forward to looking at it!

@mcscope

mcscope commented May 15, 2018

Contributor Author

I signed the CLA

@Mariatta Mariatta left a comment

Member

Thanks for the PR. I have several comments (quite nitpicky) but the rest looks good.
In addition, I would suggest adding both your name and Peter's to the Misc/ACKS file.


.. versionadded:: 3.8


Member

Sorry for being so picky, but we only need two extra spaces between .. versionadded and The following example... So please remove the extra lines.

@@ -0,0 +1,2 @@
Added support for optional Site Map extension to urllib robotparser. Patch
by Lady Red

Member

Please end the sentence with a period. In addition, since it was based off another person's patch, it would be good to also mention "Based on patch by Peter Wirtz."

@bedevere-bot


A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@Mariatta

Member

If I don't end up merging this, I'd suggest the core dev merging to remember to add "Co-authored by: Peter Wirtz" in the commit message, since it seems like this was based off Peter's patch.

@mcscope

mcscope commented May 16, 2018

Contributor Author

I have made the requested changes; please review again.

@bedevere-bot


Thanks for making the requested changes!

@Mariatta: please review the changes made to this pull request.

@mcscope

mcscope commented May 16, 2018

Contributor Author

Yes, please give credit to Peter in the commit message. PS: this is my first contribution to CPython! \o/

Comment thread Lib/test/test_robotparser.py Outdated
"""
good = ['/', '/test.html']
bad = ['/cyberworld/map/index.html']
site_maps = ["http://www.gstatic.com/s2/sitemaps/profiles-sitemap.xml",

Member

Style nit: Please use single quotes.

Comment thread Lib/test/test_robotparser.py Outdated
  # Time between requests is short enough that we won't wake
  # up spuriously too many times.
- kwargs={'poll_interval':0.01})
+ kwargs={'poll_interval': 0.01})

Member

Please don't make unrelated cosmetic changes.

Contributor Author

Oh drat, my PEP 8 autoformatter is doing that automatically. Will remove.

Comment thread Lib/test/test_robotparser.py Outdated

- if __name__=='__main__':
+ if __name__ == '__main__':

Member

Please don't make unrelated cosmetic changes.

Comment thread Lib/urllib/robotparser.py
        return self.default_entry.req_rate

    def site_maps(self):
        if not self.sitemaps:

Member

Could you also add a test for this branch?

Contributor Author

I believe this branch is already covered by test_site_maps, which runs in all the other robotparser test classes: each of them checks that the result is None, except for my single class that tests the positive case.

Member

Oops, you're correct. I didn't click the expand button, so I didn't notice that the test_site_maps method is part of BaseRobotTest.
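
The point settled in this thread, that a check defined once on a shared base class runs in every subclass, can be sketched as follows. This is an illustration only and not the code from the PR: `BaseRobotTest` and `test_site_maps` are named in the discussion, but the class bodies, the `NoSitemapTest` and `SitemapTest` classes, the `run_tests` helper, and the `example.com` URL are assumptions made for this example.

```python
import unittest
from urllib.robotparser import RobotFileParser


class BaseRobotTest:
    """Mixin holding checks shared by every robots.txt test case."""

    robots_txt = ''
    site_maps = None  # expected result; None means "no Sitemap lines"

    def setUp(self):
        self.parser = RobotFileParser()
        self.parser.parse(self.robots_txt.splitlines())

    def test_site_maps(self):
        # Inherited by every subclass, so the "no sitemaps" branch is
        # exercised wherever site_maps keeps its default of None.
        self.assertEqual(self.parser.site_maps(), self.site_maps)


class NoSitemapTest(BaseRobotTest, unittest.TestCase):
    # Hypothetical negative case: no Sitemap lines at all.
    robots_txt = "User-agent: *\nDisallow: /tmp/\n"


class SitemapTest(BaseRobotTest, unittest.TestCase):
    # Hypothetical positive case: one Sitemap line.
    robots_txt = (
        "User-agent: *\n"
        "Disallow: /tmp/\n"
        "Sitemap: https://example.com/sitemap.xml\n"
    )
    site_maps = ['https://example.com/sitemap.xml']


def run_tests():
    """Run both example test classes and return the unittest result."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for case in (NoSitemapTest, SitemapTest):
        suite.addTests(loader.loadTestsFromTestCase(case))
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

With this layout, only the positive case needs to override `site_maps`; every other test class checks the `None` branch for free.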

@@ -0,0 +1,2 @@
Added support for optional Site Map extension to urllib robotparser. Patch by

Member

It would be nice to add some links to the new site_maps (:meth:`RobotFileParser.site_maps() <urllib.robotparser.RobotFileParser.site_maps>` -- untested, you'll need to try it locally :)) method or to the urllib.robotparser (:mod:`urllib.robotparser`) module.

@bedevere-bot


A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@mcscope

mcscope commented May 16, 2018

Contributor Author

@berkerpeksag I added the link to the news as you suggested but I can't find any documentation in the dev guide that explains how to do whatever build step I need to do to build the news to evaluate that link. Mind pointing me to the right place?

@mcscope

mcscope commented May 16, 2018

Contributor Author

I have made the requested changes; please review again.

@bedevere-bot


Thanks for making the requested changes!

@Mariatta, @berkerpeksag: please review the changes made to this pull request.

@mcscope

mcscope commented May 16, 2018

Contributor Author

Oh, I think I figured out how to build the news: I just have to build the documentation, right?

@mcscope

mcscope commented May 16, 2018

Contributor Author

Successfully tested! The news link works.

@berkerpeksag

Member

@mcscope I assume you've found https://devguide.python.org/committing/#what-s-new-and-news-entries but I will share it anyway in case someone else wonders how to build the docs :)

@berkerpeksag berkerpeksag left a comment

Member

LGTM, thank you for helping to finish bpo-21475. This was on my TODO list along with other urllib.robotparser issues but I couldn't find the time to work on them.

@ned-deily

Member

Congratulations on your first CPython PR, @mcscope!

@mcscope

mcscope commented May 17, 2018

Contributor Author

@berkerpeksag Any other todo-list items you have I could take care of? I'm looking at bpo but having a hard time finding ones that require code, instead of requiring developer consensus.
