res_pjsip_session: Handle multi-stream re-invites better
When both Asterisk and a UA send re-invites at the same time, both send 491 "Transaction in progress" responses to each other and back off a specified amount of time before retrying. When Asterisk prepares to send its re-invite, it sets up the session's pending media state with the new topology it wants, then sends the re-invite. Unfortunately, when it received the re-invite from the UA, it partially processed the media in the re-invite and reset the pending media state before sending the 491 losing the state it set in its own re-invite. Asterisk also was not tracking re-invites received while an existing re-invite was queued resulting in sending stale SDP with missing or duplicated streams, or no re-invite at all because we erroneously determined that a re-invite wasn't needed. There was also an issue in bridge_softmix where we were using a stream from the wrong topology to determine if a stream was added. This also caused us to erroneously determine that a re-invite wasn't needed. Regardless of how the delayed re-invite was triggered, we need to reconcile the topology that was active at the time the delayed request was queued, the pending topology of the queued request, and the topology currently active on the session. To do this we need a topology resolver AND we need to make stream named unique so we can accurately tell what a stream has been added or removed and if we can re-use a slot in the topology. Summary of changes: * bridge_softmix: * We no longer reset the stream name to "removed" in remove_all_original_streams(). That was causing multiple streams to have the same name and wrecked the checks for duplicate streams. * softmix_bridge_stream_sources_update() was checking the old_stream to see if it had the softmix prefix and not considering the stream as "new" if it did. If the stream in that slot has something in it because another re-invite happened, then that slot in old might have a softmix stream but the same stream in new might actually be a new one. Now we check the new_stream's name instead of the old_stream's. * stream: * Instead of using plain media type name ("audio", "video", etc) as the default stream name, we now append the stream position to it to make it unique. We need to do this so we can distinguish multiple streams of the same type from each other. * When we set a stream's state to REMOVED, we no longer reset its name to "removed" or destroy its metadata. Again, we need to do this so we can distinguish multiple streams of the same type from each other. * res_pjsip_session: * Added resolve_refresh_media_states() that takes in 3 media states and creates an up-to-date pending media state that includes the changes that might have happened while a delayed session refresh was in the delayed queue. * Added is_media_state_valid() that checks the consistency of a media state and returns a true/false value. A valid state has: * The same number of stream entries as media session entries. Some media session entries can be NULL however. * No duplicate streams. * A valid stream for each non-NULL media session. * A stream that matches each media session's stream_num and media type. * Updated handle_incoming_sdp() to set the stream name to include the stream position number in the name to make it unique. * Updated the ast_sip_session_delayed_request structure to include both the pending and active media states and updated the associated delay functions to process them. * Updated sip_session_refresh() to accept both the pending and active media states that were in effect when the request was originally queued and to pass them on should the request need to be delayed again. * Updated sip_session_refresh() to call resolve_refresh_media_states() and substitute its results for the pending state passed in. * Updated sip_session_refresh() with additional debugging. * Updated session_reinvite_on_rx_request() to simply return PJ_FALSE to pjproject if a transaction is in progress. This stops us from creating a partial pending media state that would be invalid later on. * Updated reschedule_reinvite() to clone both the current pending and active media states and pass them to delay_request() so the resolver can tell what the original intention of the re-invite was. * Added a large unit test for the resolver. ASTERISK-29014 Change-Id: Id3440972943c611a15f652c6c569fa0e4536bfcb
Loading
Please register or sign in to comment