
This relates to:
#66341
#66933

Tired of getting duplicate events on the same master, I did some diagnosing and finally found the root cause.
I'm not sure what the right solution is in terms of not causing problems with other possible usages or configurations, so I've opened this discussion thread for feedback.

The issue occurs when the minion is set up in a multimaster configuration and salt-call is used to send an event,
either via the event.send module (salt-call event.send "some/event" -l info) or via the fire_event option in a state.

I discovered the problem is in the function event.fire_master, which is called internally by event.send:

salt-call event.fire_master '{}' 'some/event'

When called via salt-call, event.fire_master builds a list of master URIs and loops over it, calling salt.channel.client.ReqChannel.factory(__opts__, master_uri=master) in an attempt to send the event to each master.
However, salt.channel.client.ReqChannel.factory(cls, opts, **kwargs) receives master_uri in kwargs but never uses it, because master_uri is already present in opts and the following line only copies it when it is absent:

        if "master_uri" not in opts and "master_uri" in kwargs:
            opts["master_uri"] = kwargs["master_uri"]
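The effect of that condition can be demonstrated with a minimal sketch. This is simplified stand-in code, not Salt's actual factory; it assumes the fix is simply to let an explicitly passed master_uri kwarg take precedence over the value already present in opts:

```python
# Stand-in sketch (not Salt's real code) of the precedence bug in
# salt.channel.client.ReqChannel.factory and one possible fix.

def resolve_master_uri_buggy(opts, **kwargs):
    # Current behavior: the kwarg is ignored whenever opts already
    # contains master_uri, so every loop iteration uses the same master.
    if "master_uri" not in opts and "master_uri" in kwargs:
        opts["master_uri"] = kwargs["master_uri"]
    return opts.get("master_uri")

def resolve_master_uri_fixed(opts, **kwargs):
    # Possible fix: an explicitly passed master_uri always wins.
    if "master_uri" in kwargs:
        opts["master_uri"] = kwargs["master_uri"]
    return opts.get("master_uri")

opts = {"master_uri": "tcp://172.21.0.10:4506"}
print(resolve_master_uri_buggy(dict(opts), master_uri="tcp://172.21.0.11:4506"))
# tcp://172.21.0.10:4506  <- the second master's URI is silently dropped
print(resolve_master_uri_fixed(dict(opts), master_uri="tcp://172.21.0.11:4506"))
# tcp://172.21.0.11:4506
```

This matches the trace below: the loop variable advances to 172.21.0.11, but the channel keeps sending to 172.21.0.10 taken from opts.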

Here's a trace (my two masters are 172.21.0.10 and 172.21.0.11), taken after changing the log level to INFO and adding some extra logging lines, that demonstrates the behavior.

PS C:\Users\adrian> salt-call event.send "some/event" -l info
[WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.10:4506
[WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.10:4506
[INFO    ] >>>> Firing event using fire_master()
[INFO    ] >>>> Looks like it's a salt-call execution or preload is specifiied
[INFO    ] >>>> master_uri: tcp://172.21.0.10:4506
[INFO    ] >>>> master_uri_list: ['tcp://172.21.0.10:4506', 'tcp://172.21.0.11:4506']
[INFO    ] >>>> masters: ['tcp://172.21.0.10:4506', 'tcp://172.21.0.11:4506']


[INFO    ] >>>> for loop, master: tcp://172.21.0.10:4506
        [WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.10:4506
        [INFO    ] ReqChannel send crypt load={'id': 'minion-win-1', 'tag': 'some/event', 'data': {'__pub_fun': 'event.send', '__pub_pid': 2332, '__pub_jid': '20250208001850109537', '__pub_tgt': 'salt-call'}, ..., 'cmd': '_minion_event'}

        # HERE IT CAN BE SEEN 172.21.0.11 was passed to  AsyncReqChannel.factory
        # but it's using 172.21.0.10 from opts
[INFO    ] >>>> for loop, master: tcp://172.21.0.11:4506 
        [WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.10:4506
        [INFO    ] ReqChannel send crypt load={'id': 'minion-win-1', 'tag': 'some/event', 'data': {'__pub_fun': 'event.send', '__pub_pid': 2332, '__pub_jid': '20250208001850109537', '__pub_tgt': 'salt-call'}, ..., 'cmd': '_minion_event', 'nonce': '15fdc1f01230479699a72d4eea70559f'}


[WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.10:4506
[INFO    ] ReqChannel send crypt load={'cmd': '_return', 'id': 'minion-win-1', 'jid': 'req', 'return': True, 'retcode': 0, 'fun': 'event.send', 'fun_args': ['some/event']}
local:
    True

Items for consideration:

  1. Should event.fire_master() try to send the event to each master, or should it be sent only to the active master when called via salt-call? My opinion is that it should send the event only once, to the active master: having the same event sent to both masters may be a problem, for instance by causing reactor-based tasks to be triggered on both masters. So I would recommend removing the loop through master_uri_list and just sending the event to the active/connected master_uri, making a single call to salt.channel.client.ReqChannel.factory.
    Note: I've verified master_uri is updated to reflect the second master's URI in case the first master is not reachable.

  2. Fix the line in salt.channel.client.ReqChannel.factory to properly take the master_uri passed in kwargs instead of the value from opts. I think the code should be fixed, but I don't know the possible spread of issues; maybe the current behavior works fine for other uses or configurations.

  3. Add an argument to event.send/event.fire_master to OPTIONALLY send the event to all masters.

In summary, I think event.fire_master should send only one event, to the active master_uri; the code in salt.channel.client.ReqChannel.factory should be fixed if that's appropriate; and an option should be added to send the event to all masters for the possible edge cases where this is needed, given the code to do so is already there.
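The recommendation above can be sketched as follows. All the names here (fire_master_sketch, factory, send_to_all_masters) are hypothetical stand-ins, not Salt's real signatures; the point is only the shape of the loop:

```python
# Hedged sketch of the proposed behavior for event.fire_master:
# send once to the connected master by default, to all masters on request.

def fire_master_sketch(opts, load, factory, send_to_all_masters=False):
    """Send an event load via the given channel factory.

    By default only the active/connected master_uri is used; the
    (proposed) send_to_all_masters flag restores the loop over the
    full master_uri_list for edge cases that need it.
    """
    if send_to_all_masters:
        masters = list(opts.get("master_uri_list", [opts["master_uri"]]))
    else:
        # Single call: only the master the minion is currently connected to.
        masters = [opts["master_uri"]]
    results = []
    for master in masters:
        channel = factory(opts, master_uri=master)
        results.append(channel.send(load))
    return results
```

Note this sketch only removes the duplicate; it still depends on the factory honoring the master_uri kwarg for the all-masters case to work.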

Here is a log showing only one event is sent, after tweaking the fire_master masters list to use only the active master_uri:

PS C:\Users\adrian> salt-call event.send "some/event" -l info
[WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.10:4506
[WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.10:4506
[INFO    ] >>>> Firing event using fire_master()
[INFO    ] >>>> Looks like it's a salt-call execution or preload is specifiied
[INFO    ] >>>> master_uri: tcp://172.21.0.10:4506
[INFO    ] >>>> master_uri_list: ['tcp://172.21.0.10:4506', 'tcp://172.21.0.11:4506']
[INFO    ] >>>> masters: ['tcp://172.21.0.10:4506', 'tcp://172.21.0.11:4506']
[INFO    ] >>>> masters list tweaked for a single master_uri: ['tcp://172.21.0.10:4506']

[INFO    ] >>>> for loop, master: tcp://172.21.0.10:4506
        [WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.10:4506
        [INFO    ] ReqChannel send crypt load={'id': 'minion-win-1', 'tag': 'some/event', 'data': {'__pub_fun': 'event.send', '__pub_pid': 6980, '__pub_jid': '20250208004447793682', '__pub_tgt': 'salt-call'}, ... 'cmd': '_minion_event'}

[WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.10:4506
[INFO    ] ReqChannel send crypt load={'cmd': '_return', 'id': 'minion-win-1', 'jid': 'req', 'return': True, 'retcode': 0, 'fun': 'event.send', 'fun_args': ['some/event']}
local:
    True

Testing the behavior when the active master is no longer available: the minion connected to the second master and continued fine.

PS C:\Users\adrian> salt-call event.send "some/event" -l info
[WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.10:4506

[INFO    ] Master 172.21.0.10 could not be reached, trying next master (if any)
[WARNING ] Master ip address changed from 172.21.0.10 to 172.21.0.11

[WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.11:4506
[WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.11:4506
[INFO    ] >>>> Firing event using fire_master()
[INFO    ] >>>> Looks like it's a salt-call execution or preload is specifiied
[INFO    ] >>>> master_uri: tcp://172.21.0.11:4506
[INFO    ] >>>> master_uri_list: ['tcp://172.21.0.10:4506', 'tcp://172.21.0.11:4506']
[INFO    ] >>>> masters: ['tcp://172.21.0.10:4506', 'tcp://172.21.0.11:4506']
[INFO    ] >>>> masters list tweaked for a single master_uri: ['tcp://172.21.0.11:4506']

[INFO    ] >>>> for loop, master: tcp://172.21.0.11:4506
        [WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.11:4506
        [INFO    ] ReqChannel send crypt load={'id': 'minion-win-1', 'tag': 'some/event', 'data': {'__pub_fun': 'event.send', '__pub_pid': 6356, '__pub_jid': '20250208004800268591', '__pub_tgt': 'salt-call'}, 'cmd': '_minion_event'}

[WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.11:4506
[INFO    ] ReqChannel send crypt load={'cmd': '_return', 'id': 'minion-win-1', 'jid': 'req', 'return': True, 'retcode': 0, 'fun': 'event.send', 'fun_args': ['some/event']}
local:
    True

Replies: 1 comment · 5 replies


this has been gone over again and again. the salt minion should send events to all CONNECTED masters. if there are two masters that the minion is connected to, then it should send events to both. there is no such thing as active and passive in a hot/hot configuration; any master can be used to communicate with all of the minions in a hot/hot config.

failover is where active and not-active come into play, but in failover the minion doesn't connect to all masters; it only connects to one and will fail over to the other.

5 replies
@amalaguti

Hi Thomas,
In this case, DUPLICATE means that the same event is seen twice on the same server, and that isn't right. There's no reason to get two identical events on the same server. I'm talking specifically about using salt-call; this is important. If you send the instruction from the master, all is good.

In this case, I'm pretty sure salt-call tries to connect to the first one in the list (not sure what term to use: active/passive, primary/secondary; I'm referring to the masters list in the minion configuration). When you run salt-call it tries to connect to the first one; if it can't, it tries the second one, and so on down the masters list. It does its task and sends the event back to the master it connected to.
At least that's the behavior with the default option for multimaster: a minion config with just a master list.

When you send a command from the master, the events back from the minion are seen only on the master that sent the command, not on both masters; to me this is right.

I've been observing this behavior for quite a long time; not sure if years ago it was different.
To me it's fine that the minion does not send events to both masters, and only to the one it received the command from (or the one it was able to connect to when using salt-call); sending to both would complicate reactor configuration a bit, I guess.

One particular event that is seen on both masters is the minion start event. Other than that, the second master's event bus remains pretty quiet while the other server is "active" and responding to minions.

@dwoz

dwoz Feb 9, 2025
Maintainer

If we're trying to send an event to two masters but two events get sent to a single master, it is a bug.

@whytewolf

Hi Thomas, In this case, DUPLICATE means that the same event is seen twice on the same server, and that isn't right. There's no reason to get two identical events on the same server. I'm talking specifically about using salt-call; this is important. If you send the instruction from the master, all is good.

Okay, this is a bug. As @dwoz said, if you get the same event twice on a single master, that is a bug.

But the reason it works from the master like you expect in hot/hot is that the minion sends to the requesting master. In things like salt-call or beacons there is no requesting master, so all connected masters should get one copy of the same event, because the minion has no notion of which master is which.

In this case, I'm pretty sure salt-call tries to connect to the first one in the list (not sure what term to use: active/passive, primary/secondary; I'm referring to the masters list in the minion configuration). When you run salt-call it tries to connect to the first one; if it can't, it tries the second one, and so on down the masters list. It does its task and sends the event back to the master it connected to. At least that's the behavior with the default option for multimaster: a minion config with just a master list.

That is incorrect; it depends on the master_type setting. If you have it set to failover it will act like you expect, but if you don't set it, it will treat the multiple masters as hot/hot, and with salt-call the event code will iterate through ALL of the masters and send the event to them all.
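For reference, the failover behavior described above is selected in the minion config roughly like this (a minimal sketch; the IP values are illustrative, reusing the ones from this thread):

```yaml
# Minion config sketch: failover multimaster (illustrative values).
# With master_type: failover the minion connects to ONE master at a time
# and only moves to the next entry if the current one becomes unreachable.
master:
  - 172.21.0.10
  - 172.21.0.11
master_type: failover

# Without master_type (default: str), the same master list is treated as
# hot/hot and the minion connects to all masters.
```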

When you send a command from the master, the events back from the minion are seen only on the master that sent the command, not on both masters; to me this is right.

This is because there is a requesting master that the minion can track. With salt-call, beacons, or any minion-generated event, there is no requesting master.

I've been observing this behavior for quite a long time; not sure if years ago it was different. To me it's fine that the minion does not send events to both masters, and only to the one it received the command from (or the one it was able to connect to when using salt-call); sending to both would complicate reactor configuration a bit, I guess.

The same event twice to a single master is a bug, but sending events to all masters from salt-call in a hot/hot config isn't a bug.

One particular event that is seen on both masters is the minion start event. Other than that, the second master's event bus remains pretty quiet while the other server is "active" and responding to minions.

That is because start events are minion-generated; most of the rest that fills up the bus is master-generated, and you most likely favor one master over another. Again: failover is the only multimaster type where the word "active" means anything. In hot/hot (the default for multimaster), ALL masters are active.

@amalaguti

Thanks Thomas, yes, I understand how it's supposed to work, for the most part :).
And the whole multimaster concept seems to be working fine here: I can send orders from both masters, and minions reply correctly.

Specifically, this conversation is about the minion using salt-call, and more specifically the event.send module (and fire_event in a state that you call using salt-call).

To summarize: the code is buggy, because two events get seen on the same master's event bus when using salt-call event.send in a multimaster (hot/hot) configuration.
As I've shown at the beginning of the conversation, the for loop that iterates over the masters list sends the event twice to the first master in the list (if it's reachable).

If for some reason the first master in the list is not reachable by the minion, after some attempts it swaps to the next master in the list (THIS IS RIGHT), and then this other master also receives two events (THIS IS WRONG :) ).

My minion config is standard multimaster hot/hot: just a master list, no failover; master and Windows minion on 3007.1.
As per the logging messages, it's clear that salt-call makes a connection only to the first master in the list, as long as this master is reachable; otherwise it will attempt the next one.

So almost all the events generated by salt-call are sent only to the first reachable/available master in the masters list, not to all masters. I think this is right, or at least my preference: you may say that the other master(s) should get the same event too, but I don't see it happening here, and I personally don't think that's useful anyhow, just a waste of events.
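The config snippet itself isn't shown above, so here is a hedged reconstruction of what a minion config like the one described would look like, with the values inferred from the config.get output that follows (not the author's verbatim file):

```yaml
# Reconstructed minion config (inferred, not verbatim).
# Two masters listed with no master_type set, i.e. the default
# hot/hot multimaster mode discussed in this thread.
master:
  - 172.21.0.10
  - 172.21.0.11
```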

PS C:\Users\adrian> salt-call config.get master_list
[WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.10:4506
[WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.10:4506
[WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.10:4506
local:
    - 172.21.0.10
    - 172.21.0.11
    
PS C:\Users\adrian> salt-call config.get master
[WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.10:4506
[WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.10:4506
[WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.10:4506
local:
    172.21.0.10


PS C:\Users\adrian> salt-call config.get master_type
[WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.10:4506
[WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.10:4506
[WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.10:4506
local:
    str


PS C:\Users\adrian> salt-call test.echo "hello"
[WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.10:4506
[WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.10:4506
[WARNING ] >>>> master_uri in AsyncReqChannel.factory set to tcp://172.21.0.10:4506
local:
    hello
some/event	{
    "_stamp": "2025-02-10T12:43:40.659678",
    "cmd": "_minion_event",
    "data": {
        "__pub_fun": "event.send",
        "__pub_jid": "20250210124340600231",
        "__pub_pid": 2528,
        "__pub_tgt": "salt-call"
    },
    "id": "minion-win-1",
    "tag": "some/event"
}
some/event	{
    "_stamp": "2025-02-10T12:43:40.723805",
    "cmd": "_minion_event",
    "data": {
        "__pub_fun": "event.send",
        "__pub_jid": "20250210124340600231",
        "__pub_pid": 2528,
        "__pub_tgt": "salt-call"
    },
    "id": "minion-win-1",
    "tag": "some/event"
}
@dwoz

dwoz Feb 26, 2025
Maintainer

It sounds like the crux of the issue here is that salt-call essentially spins up its own minion and is not aware of anything about a minion daemon process, if there is one running. So you have two distinct processes doing two different things. The salt-call command was never intended to rely on a running minion process, and salt-call may not have all the same multi-master logic as a minion. There could be an argument made that salt-call should handle multi-master the same as a regular minion. Even if this were the case, there is no guarantee salt-call would end up connecting to the same master as the minion daemon; it would only change the fact that you are getting the event on both masters, and you'd get it on only one master.

Category: 🚩 Bugs