Make soft durability writes use a flush interval. #6392
|
Haven't looked at the code, but from the description, this should be great for reducing write amplification! I've always been somewhat unhappy with how bad RethinkDB was at that in some scenarios. |
|
This would be very useful! Can it be part of the next release? |
|
It's released as 2.3.5-extra, link in the OP. And 2.4 won't have it, but I will release a 2.4-extra. Then 2.5 will (hopefully) have it. |
This adds a top-level "flush_interval" setting to the table config, specifying how much time should pass between flushes. This allows multiple writes to be combined into a single flush, reducing disk bandwidth usage and disk wear and, for many workloads, improving write throughput. Eviction now works such that if a table shard is using too much memory (and no bufs are evictable), the whole shard initiates a flush. (Previously, it _waited_ for enough active flushes to complete; now it must initiate the flush itself.) It can't yet incrementally evict bufs, because that is trickier to implement correctly than you'd think.
|
I see no complaints -- in next with commit 011bea8. |
Add credits for PR rethinkdb#6392
|
FYI -- we finally got around to taking this live. Our writes dropped from ~200 MEGABYTES per second to around 400 KILOBYTES per second. That's with only two tables configured to have delayed flush, so the gain is not entirely from configuring that. This patch improves I/O write load so enormously that the previous level of write load could almost be considered a bug. |
|
Hi, any idea when this will be officially released for RethinkDB? (Not sure if this is the right place to ask, but we would love this feature!) I assume it comes when what is in next becomes a release, but do you have any indication of when that may be? And many thanks for the great work on this. |
|
Looks like current |
|
Note that a pretty nasty memory leak has been reported, so you might want to hold off trying it until an update. |
|
@srh thanks for the heads up, will keep an eye out for updates. Appreciate your work. |
|
Does this help with single-write performance? And for those who are using it in production, did you run into the memory leak? |
|
Any news on this feature? It would be cool to have it in the next release.
Description
This changes the cache's flush strategy for soft durability writes: changes are accumulated and then flushed every `t` seconds. It has the effect of greatly reducing the number of write operations.

The feature is described in the notes for https://github.com/srh/rethinkdb/releases/tag/v2.3.5-srh-extra except that the configuration option is moved to a top-level `flush_interval` table config. Also, the default interval is 1.0 seconds.

This addresses some suggestions in #1771. It's easy to imagine people finding something to complain about here, so I'm going to leave this up for a while.
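To illustrate why accumulating changes and flushing on an interval cuts the number of write operations, here is a toy model in Python. It is only a sketch of the general idea, not the RethinkDB cache code: the class `IntervalFlushCache`, the helper `simulate`, and the workload numbers (1000 writes, 10 ms apart, spread over 5 hot blocks) are all hypothetical.

```python
class IntervalFlushCache:
    """Toy model (hypothetical, not RethinkDB code): dirty blocks accumulate
    in memory and are written out together once per flush interval."""

    def __init__(self, flush_interval_ms):
        self.flush_interval_ms = flush_interval_ms
        self.dirty = {}            # block id -> latest value not yet on disk
        self.last_flush_ms = 0
        self.flushes = 0           # number of flush operations issued
        self.blocks_written = 0    # total blocks written across all flushes

    def write(self, now_ms, block_id, value):
        # Repeated writes to the same block overwrite each other in memory.
        self.dirty[block_id] = value
        if now_ms - self.last_flush_ms >= self.flush_interval_ms:
            self.flush(now_ms)

    def flush(self, now_ms):
        # Write every dirty block once, then start a new interval.
        if self.dirty:
            self.flushes += 1
            self.blocks_written += len(self.dirty)
            self.dirty.clear()
        self.last_flush_ms = now_ms


def simulate(flush_interval_ms, writes=1000, spacing_ms=10, hot_blocks=5):
    """Replay an assumed workload and return (flushes, blocks_written)."""
    cache = IntervalFlushCache(flush_interval_ms)
    for i in range(writes):
        cache.write(now_ms=i * spacing_ms, block_id=i % hot_blocks, value=i)
    cache.flush(now_ms=writes * spacing_ms)  # final flush so nothing stays dirty
    return cache.flushes, cache.blocks_written


if __name__ == "__main__":
    print(simulate(0))      # flush on every write: (1000, 1000)
    print(simulate(1000))   # 1.0 second interval:  (10, 50)
```

In this model the 1.0 second interval turns 1000 single-block flushes into 10 flushes of 5 blocks each, because repeated writes to a hot block within one interval collapse into a single block write. Real savings depend on the workload and on how the cache is actually implemented.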