Russell Bryant authored
This bug surfaced in 1.6.2 and does not affect code in any other released version of Asterisk. It manifested itself as SIP qualify not happening when it should, causing peers to go unreachable. This was debugged down to scheduler entries sometimes not getting executed when they were supposed to, which was in turn caused by an error in the heap code.

The problem only sometimes occurs, and it is due to the logic for removing an entry in the heap from an arbitrary location (not just popping off the top). The scheduler performs this operation frequently when entries are removed before they run (when ast_sched_del() is used).

In a normal pop off of the top of the heap, a node is taken off the bottom, placed at the top, and then bubbled down until the max heap property is restored (see max_heapify()). This same logic was used for removing an arbitrary node from the middle of the heap. Unfortunately, that logic is full of fail. This patch fixes that by fully restoring the max heap property when a node is thrown into the middle of the heap. Instead of just pushing it down as appropriate, it first pushes it up as high as it will go, and _then_ pushes it down.

Lastly, fix a minor problem in ast_heap_verify(), which is only used for debugging. If a parent and child node have the same value, that is not an error. The only error is if a parent's value is less than its children.

A huge thanks goes out to cappucinoking for debugging this down to the scheduler, and then producing an ast_heap test case that demonstrated the breakage. That made it very easy for me to focus on the heap logic and produce a fix. Open source projects are awesome.

(closes issue #16936)
Reported by: ib2
Tested by: cappucinoking, crjw

(closes issue #17277)
Reported by: cappucinoking
Patches:
      heap-fix.rev2.diff uploaded by russell (license 2)
Tested by: cappucinoking, russell

git-svn-id: https://origsvn.digium.com/svn/asterisk/trunk@261496 65c4cc65-6c06-0410-ace0-fbb531ad65f3
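
For illustration only, the removal logic described in this message can be sketched on a small array-backed max heap of ints. This is not the ast_heap code from the patch: the names int_heap, bubble_up(), bubble_down(), heap_remove_at() and heap_is_valid() are hypothetical, and the fixed capacity is an assumption made to keep the sketch short. It shows the replacement node being pushed up first and then pushed down, and a verify check that accepts a parent equal to its child.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-capacity max heap of ints; NOT Asterisk's ast_heap. */
#define HEAP_CAP 64

struct int_heap {
    int data[HEAP_CAP];
    size_t len;
};

void swap_nodes(struct int_heap *h, size_t a, size_t b)
{
    int tmp = h->data[a];
    h->data[a] = h->data[b];
    h->data[b] = tmp;
}

/* Move the node at index i up while it is greater than its parent.
 * Returns the index where the node ends up. */
size_t bubble_up(struct int_heap *h, size_t i)
{
    while (i > 0) {
        size_t parent = (i - 1) / 2;
        if (h->data[parent] >= h->data[i]) {
            break;
        }
        swap_nodes(h, parent, i);
        i = parent;
    }
    return i;
}

/* Move the node at index i down while it is smaller than a child. */
void bubble_down(struct int_heap *h, size_t i)
{
    for (;;) {
        size_t left = 2 * i + 1;
        size_t right = 2 * i + 2;
        size_t max = i;
        if (left < h->len && h->data[left] > h->data[max]) {
            max = left;
        }
        if (right < h->len && h->data[right] > h->data[max]) {
            max = right;
        }
        if (max == i) {
            break;
        }
        swap_nodes(h, i, max);
        i = max;
    }
}

/* Remove the node at an arbitrary index. The last node takes its place,
 * then is pushed up as far as it will go and only then pushed down.
 * Returns 0 on success, -1 if the index is out of range. */
int heap_remove_at(struct int_heap *h, size_t i, int *out)
{
    if (i >= h->len) {
        return -1;
    }
    *out = h->data[i];
    h->len--;
    if (i < h->len) {
        h->data[i] = h->data[h->len];
        i = bubble_up(h, i);
        bubble_down(h, i);
    }
    return 0;
}

/* Debug check: a parent equal to its child is fine; only a parent
 * that is strictly less than a child is an error. */
int heap_is_valid(const struct int_heap *h)
{
    for (size_t i = 1; i < h->len; i++) {
        if (h->data[(i - 1) / 2] < h->data[i]) {
            return 0;
        }
    }
    return 1;
}
```

As a usage example, take the valid heap {100, 10, 90, 5, 6, 80, 85} and remove index 3 (the value 5). The last node, 85, lands under a parent of 10. Pushing down alone would leave it there and break the heap property; pushing up first swaps it with 10 and the heap stays valid.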