Replies: 3 comments
Do you envision this trimming mechanism also being used for "closing the books" scenarios? For example, I could imagine pairing it with a …
@TylerPachal that's it exactly. I'm envisioning the scenario you are describing, and another where you truncate the stream but leave a rollup and maybe a forwarding address in some cases:

```elixir
tombstone = %BooksClosed{new_book_id: "abc123", giga_watts_collected: 1.21, frobbers_frobulated: 15}
{:trim, tombstone}
```
The linked stream problem is a bit gnarly. AFAIK the eventstore does rely on consecutive numbers, but let's confirm that.
@drteeth proposed an idea for trimming event streams in January, but it was a ton of work without a real driver/sponsor, so it did not go anywhere. Partially to capture the thinking, and partially because there's an itch to be scratched in this area at my daytime job, here's a proposal.
## Problem
Event logs grow without bounds, and Commanded has no solution for that. We want to be able to expire events when we're sure they're not needed.
## Solution parameters
## Proposal
### API perspective
When an Aggregate decides that it does not need old events anymore, it can return a special `:trim` atom or tuple. If a `:trim` tuple is returned, the second element contains one or more events that are to be written to the event store after the trim point. In effect, this will make sure that new event handlers only see a stream that starts with these events. Existing event handlers will continue to operate; if they are lagging when the trim is executed, they will process the older events and then the "post-trim" events in order.
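To make the shape concrete, here is a minimal sketch of an aggregate using the proposed return value. `Book`, `CloseBooks`, and `BooksClosed` are invented names, and `{:trim, events}` is the proposed extension, not something Commanded supports today:

```elixir
defmodule CloseBooks, do: defstruct([:new_book_id])
defmodule BooksClosed, do: defstruct([:new_book_id, :giga_watts_collected])

defmodule Book do
  defstruct [:book_id, giga_watts_collected: 0.0]

  # A Commanded-style command handler that closes the books: instead of
  # plainly returning events, it asks for a trim with a tombstone event.
  def execute(%Book{} = book, %CloseBooks{new_book_id: new_id}) do
    tombstone = %BooksClosed{
      new_book_id: new_id,
      giga_watts_collected: book.giga_watts_collected
    }

    # Everything before the trim point becomes invisible to new event
    # handlers; the tombstone is the first event of the trimmed stream.
    {:trim, [tombstone]}
  end
end
```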
### Implementation details

- The `streams` table currently only has a `version`, which in effect is the end of the stream. It will also need a `start` to designate the lowest version (the "low water mark" of the stream).
- Readers use the `start` value to determine where they need to start reading (see the sketch after this list).
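A sketch of what the read path could look like under that schema change. The `start` column and this query are assumptions, `MyApp.EventStore` stands in for the application's event store module, and `read_stream_forward/2` mirrors the existing eventstore API:

```elixir
defmodule TrimAwareReader do
  # Look up the proposed low water mark and start reading there.
  def read_visible_events(conn, stream_uuid) do
    %Postgrex.Result{rows: [[start]]} =
      Postgrex.query!(
        conn,
        "SELECT start FROM streams WHERE stream_uuid = $1",
        [stream_uuid]
      )

    # Untrimmed streams have start = 0 and read from the beginning;
    # trimmed streams begin at their low water mark instead.
    MyApp.EventStore.read_stream_forward(stream_uuid, start + 1)
  end
end
```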
### Reasoning

The developer API is very simple, and the "in-process" implementation will be very fast. Also, the application that runs what usually is a Big Hairy Ball Of Code will still not be able to permanently delete events, because we can run GC with a separate (privileged) user. I think it is a feature of Commanded that the event store is immutable, and this design mostly keeps that intact.
Garbage collection can be run at user-specified intervals, with user-specified hooks to handle things like backups, double-checking whether events really are eligible for deletion, etc. Keeping the API simple allows aggregates to remain oblivious to the complex machinery that underlies them and keeps the decision a simple, purely business-logic one; at the same time, moving actual deletion to a GC-style process gives us flexibility in the implementation and timing of the expensive part of this work.
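The knobs could look something like this; the module, option, and hook names are all invented for illustration:

```elixir
# config/runtime.exs — hypothetical GC settings for the proposal above.
import Config

config :my_app, MyApp.EventStore.GC,
  interval: :timer.hours(24),
  before_delete: [
    &MyApp.ColdStorage.backup/1,     # copy doomed events to cold storage first
    &MyApp.Audit.confirm_eligible/1  # double-check events really are dead
  ]
```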
## Garbage collection
In essence, GC is simple: the core process will look at all primary streams that have a `start` value larger than zero, then use `stream_events` to see which event ranges are "live" according to the aggregates. From there, it can calculate low water marks for linked streams as well (and set them accordingly), then figure out which events to delete. Events that do not belong to an aggregate become eligible for GC, meaning that linked streams and `$all` effectively become akin to weak references.

There are a couple of complications I can think of. One is simple: custom actions before GC. We need to define hooks so that `events` and `stream_events` rows can be backed up, if required, to cold storage. The GC machinery will call these hooks before executing any destructive action, so that custom code can do whatever is needed just prior to deletion.
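Putting those two paragraphs together, a single GC pass might look roughly like this. Everything here is a sketch of the proposal: the helper queries are elided and the hook contract is invented:

```elixir
defmodule EventStore.GC.Pass do
  def run(conn, hooks) do
    for stream <- trimmed_primary_streams(conn) do
      # Events below the primary stream's low water mark are dead.
      dead = dead_event_ids(conn, stream)

      # Raise the low water marks of linked streams (and $all) that
      # reference dead events, making them behave like weak references.
      raise_linked_low_water_marks(conn, dead)

      # Let user code archive or double-check the doomed events before
      # anything destructive happens.
      Enum.each(hooks, fn hook -> hook.(stream, dead) end)

      delete_events(conn, dead)
    end

    :ok
  end

  # Placeholders for the real queries, e.g. "... WHERE start > 0".
  defp trimmed_primary_streams(_conn), do: []
  defp dead_event_ids(_conn, _stream), do: []
  defp raise_linked_low_water_marks(_conn, _dead), do: :ok
  defp delete_events(_conn, _dead), do: :ok
end
```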
A bit more complex is how to delete from linked streams. An example: we have "type" streams at work, containing all events of type "X". When an aggregate says that it trims its stream at event number N, those events should immediately disappear from (new) event handlers for such linked streams, and thus should immediately be eligible for deletion. However, this will result in streams with gaps. Say we have aggregates `A` and `B` whose events are linked into one such stream. When aggregate `A` is then trimmed, emitting event 9 (of a different type) and event 10 (of the same type), GC will delete `A`'s pre-trim events, leaving the linked stream with a `start` value of `3` and holes where `A`'s events used to be (see the sketch below). Technically, the event store can deal with this, but the question remains whether anything relies on consecutive event numbers.
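A small sketch of that scenario; the exact positions are invented, but chosen so that the surviving stream starts at 3:

```
# Linked "type X" stream; A1 = event 1 of aggregate A, etc.
before trim:  1:A1  2:A2  3:B1  4:A3  5:B2  6:A4  7:B3  8:A5
after trim:   1:A1  2:A2  3:B1  4:A3  5:B2  6:A4  7:B3  8:A5  9:A10
after GC:                 3:B1        5:B2        7:B3        9:A10
```

After GC, the linked stream's `start` is `3`, with gaps at positions 4, 6, and 8 where `A`'s deleted events used to be (event 9, being of a different type, was never linked here).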