-
Notifications
You must be signed in to change notification settings - Fork 514
Auto refresh online blocklist #1142
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
80b5b72
to
3e7971f
Compare
3e7971f
to
f91d7b2
Compare
try: | ||
for url in self.urls: | ||
blocklist.extend(_parse_block_list_from_url(url)) | ||
except Exception: | ||
logger.warning("Failed to refresh online blocklists") | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this logic, if any site is unreachable or returns an error, then none of the sites update the blocklist. Is this what you want to happen? If you put the try-except blocks inside the for-loop, then the block list can be partially updated, even when a URL results in an error.
def _parse_block_list_from_url(url: str) -> list[str]: | ||
block_list = requests.get(url).text.strip() | ||
return [username.strip() for username in block_list.split("\n")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We risk being a burden a site with a blocklist if we don't somehow moderate how often the blocklist is updated. Imagine some blocklist becomes popular among many bot owners and all of them are downloading from the server several times a minute. If the blocklist is long, that can be a substantial amount of data being downloaded. If the list is not changing often, that's a lot of useless bandwidth usage.
I can think of two options:
- Capture the
last-modified
oretag
header of the response. With these saved, send the next request with anif-modified-since
orif-none-match
header, respectively. This will probably require keeping the individual blocklists in a dictionary with the URL as the key. - Use a
Timer
to limit refreshes to some reasonable frequency, like a few times an hour. Create the timer withself.refresh_timer = Timer(minutes(30))
. Then, inrefresh()
,if not self.refresh_timer.is_expired(): return
. Finally, after refreshing,self.refresh_timer.reset()
.
for name in self.matchmaking_cfg.block_list: | ||
self.add_to_block_list(name) | ||
|
||
self.online_block_list = OnlineBlocklist(self.matchmaking_cfg.online_block_list) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be the same online_block_list that was created in lichess_bot_main()
. There's no reason to have two copies of the class pinging the servers.
Type of pull request:
Description:
The online blocklist introduced in #1104 requires manually refreshing the bot to refresh the blocklist, which can be tedious for long running bots. This PR refactors the online blocklist to a separate class and refresh every time before it is used.
Checklist: