Exceptionally Long Queue Length Queuing


Hi,

We have a box up and we are starting to see a lot of “Exceptionally long queue length queuing” messages in the logs. From all the research so far, for other people this seems to end with their systems crashing and becoming unreachable. In our case the box stays up and keeps taking calls. We are running Asterisk 16.6.1 and use MusicOnHold to play online music streams via ffmpeg. Any ideas on how to troubleshoot this further and find out why it is happening?

TIA.

Regards,

Dovid
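
One low-effort way to start troubleshooting is to count how often the message fires per channel, which shows whether a handful of channels account for most of it. The Python sketch below is only a starting point: it assumes the message text ends in "queuing to <channel name>", which should be checked against the actual log lines, and the function name and the log path in the comment are illustrative.

```python
import re
from collections import Counter

# Assumption: the message ends with "queuing to <channel name>".
# Adjust the pattern if the lines in your log look different.
WARNING_RE = re.compile(
    r"Exceptionally long (?:voice )?queue length queuing to (\S+)"
)


def count_queue_warnings(lines):
    """Return a Counter mapping channel name -> number of messages seen."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Example use (the path is a common default, adjust as needed):
# with open("/var/log/asterisk/messages", errors="replace") as log:
#     for channel, hits in count_queue_warnings(log).most_common(10):
#         print(hits, channel)
```

If most hits land on one kind of channel, that narrows down which part of the dialplan to look at first.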

4 thoughts on - Exceptionally Long Queue Length Queuing

  • I ended up rewriting my code to use fewer Local channels and most of the errors went away. I also noticed that my load average and CPU usage are way down. I will open a ticket, since it seems this may be a bug that others are experiencing too.

  • A backtrace would still be needed for the issue, otherwise there is no way to know exactly what is going on or where things are getting held up.

  • Hello list, Hope you are all doing well!

    Sorry for the long email, but I tried to explain everything I’ve seen regarding this issue…. I am going to open a ticket for it, but I thought it would be useful to explain it here first.

    I’ve also recently hit this “Exceptionally long queue length queuing” error on some servers running 16.8.0, and after A LOT of investigation I’ve found what is causing it in my case (I am not sure it is the same case as the one Dovid initially reported).

    What I found is that if the dialplan has a Wait() and many “deferrable frames” are received during the wait period, the final part of the Wait function (ast_safe_sleep_conditional) can throw the “Exceptionally long queue length queuing” error message. Receiving a lot of deferrable frames (see ast_is_deferrable_frame) is not really common, but it can be provoked simply by putting the call on hold and off hold repeatedly while the channel is in Wait().

    My production case is a little more complex and happens during Wait() with the AST_CONTROL_SRCCHANGE event. It occurs when an old Asterisk 13.13.0 receives a hold re-INVITE and starts a new RTP stream (with a different SSRC) carrying the Music on Hold towards some other victim Asterisk, while still sending the original caller RTP. At that point two RTP streams with different SSRCs are being sent to a destination that happens to be a channel currently in Wait(). I could not find any ticket for such a bug, but version 16.8.0 does not have this problem and instead seems to inject the Music on Hold into the existing RTP stream, so there are never two streams to the same destination at the same time.

    The problem with two streams is that each RTP packet of the new stream generates a source change frame (AST_CONTROL_SRCCHANGE). At 50 pps this builds up a lot of events fast, so if the Wait is long the issue pops up (I agree that keeping a channel in Wait for too long is not good practice, but no one is perfect…). If multiple channels are in this situation, the problem escalates badly.

    While the “Exceptionally long queue length queuing” message is being logged, all AMI commands seem to fail, BYE messages seem to be missed or delayed, internal timeouts expire and all sorts of weird things happen to every channel on the server. Apparently the whole Asterisk process gets stuck in this loop, which affects everything else. This is what makes the issue really bad: Asterisk can become completely unresponsive while the error message is being logged. If it could just throw the error and keep working fine, I guess it would be reasonable to leave it up to the dialplan logic to avoid a long Wait()….

    Anyway, a simple dialplan like this is enough to replicate the issue:
    exten => 8888,1,Answer()
    same => n,Wait(30)
    same => n,NoOp(After wait)
    same => n,Playback(goodbye)
    same => n,Hangup()
    After calling in, and while in the Wait(30), repeatedly press Hold/Unhold on the phone (with Linphone, press Hold once and then hold down the spacebar, which toggles hold and unhold over and over). Once the Wait finishes, the “Exceptionally long queue length queuing” message shows up, but it clears very quickly, I think simply because there were not enough frames queued to freeze Asterisk for long.

    Thank you, Kind regards, Patrick Wakano
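
To put rough numbers on the build-up described above: if every RTP packet of the extra stream yields one deferred frame, the backlog is roughly the packet rate times the time spent in Wait(), times the number of affected channels. The Python sketch below is back-of-the-envelope arithmetic only. The function names are illustrative, and `warn_threshold` is a hypothetical placeholder, not a value taken from the Asterisk source.

```python
def deferred_frames(packets_per_second, seconds_in_wait, channels=1):
    """Rough backlog: one deferred frame per packet, per affected channel."""
    return packets_per_second * seconds_in_wait * channels


def seconds_until(warn_threshold, packets_per_second):
    """Time for one channel to accumulate warn_threshold deferred frames."""
    return warn_threshold / packets_per_second


# 50 pps for the whole Wait(30) from the example dialplan:
print(deferred_frames(50, 30))   # 1500
# Hypothetical threshold of 100 frames, reached after 2 seconds at 50 pps:
print(seconds_until(100, 50))    # 2.0
```

Even with a generous threshold, a stream at 50 pps reaches it within a few seconds, which fits the observation that a long Wait() is what triggers the message.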