Troubleshooting Load Issues

Hi,

I have an Asterisk box running an IVR that plays random gsm files. The box has SSDs, two E5-2695 v2 CPUs and 64GB of RAM. The Asterisk CPU usage and the load average both jump around. With about 500 callers, CPU usage hovers between 250% and 400% (so 2.5 to 4 cores), which seems reasonable. Every so often the load average spikes, yet idle never drops below 85%. When the load average spikes I see a lot of kworker threads, and the CPU usage tends to (but not always) go up as well. How would I go about seeing what in Asterisk is causing the spike? The box is locked down and only takes calls from an OpenSIPS box. There is nothing else running on the box.

TIA.

Dovid

7 thoughts on - Troubleshooting Load Issues

  • Could some calls be arriving with a different codec (that is, could transcoding be causing the spikes)? Are you limiting codecs to match your audio files?

    From: asterisk-users [mailto:asterisk-users-bounces@lists.digium.com] On Behalf Of Dovid Bender
    Sent: Wednesday, April 22, 2020 2:01 PM
    To: Asterisk Users Mailing List – Non-Commercial Discussion
    Subject: [asterisk-users] Troubleshooting load issues

  • All the calls are using ulaw. The files that I am playing are gsm. I suppose converting the files to .ulaw with sox may help, but the box should be able to handle 500 calls without an issue. Could it be a bug? If not, how do I profile which call(s) may be causing the spike?
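
To make the conversion idea concrete, here is a minimal sketch that uses Asterisk's own "file convert" CLI command instead of sox. The directory in the usage note is an assumption, the helper names are made up for illustration, and file names containing spaces are not handled. It keeps the .gsm originals and writes a .ulaw copy next to each one, so Asterisk can pick whichever format is cheaper for the channel.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def convert_command(src: Path) -> list[str]:
    """Build the Asterisk CLI call that converts one prompt to ulaw."""
    dst = src.with_suffix(".ulaw")
    # "file convert <in> <out>" is run through the remote console (-rx).
    return ["asterisk", "-rx", f"file convert {src} {dst}"]


def convert_all(sounds_dir: Path, dry_run: bool = True) -> list[list[str]]:
    """Return (and optionally run) one convert command per .gsm file."""
    commands = [convert_command(p) for p in sorted(sounds_dir.glob("*.gsm"))]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands
```

Calling convert_all(Path("/var/lib/asterisk/sounds/ivr")) only returns the commands for review; pass dry_run=False to actually run them on the Asterisk box.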

  • One thing that comes to mind is that the operating system is flushing your SSDs at the time of the spike. You could use iotop to watch what the file system is doing when the spike occurs.

    Doug
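
One way to check the I/O theory: on Linux the load average counts tasks in uninterruptible sleep (state D, usually waiting on disk) as well as runnable ones, which is how load can spike while idle stays above 85%. The sketch below is a rough illustration, not a polished tool. It logs the 1-minute load next to the number of running and uninterruptible tasks read from /proc, and the function names are made up for this example.

```python
from __future__ import annotations

import time
from collections import Counter
from pathlib import Path


def task_state(stat_text: str) -> str:
    """Return the one-letter state from the contents of a /proc stat file."""
    # The command name sits in parentheses and may contain spaces or ')',
    # so take everything after the last ')'.
    return stat_text.rsplit(")", 1)[1].split()[0]


def snapshot(proc: Path = Path("/proc")) -> tuple[str, Counter]:
    """Return (1-minute load average, count of threads per state)."""
    load1 = (proc / "loadavg").read_text().split()[0]
    states: Counter = Counter()
    for stat in proc.glob("[0-9]*/task/[0-9]*/stat"):
        try:
            states[task_state(stat.read_text())] += 1
        except (OSError, IndexError):
            continue  # thread exited between listing and reading
    return load1, states


def watch(samples: int = 60, interval: float = 1.0) -> None:
    """Print one line per sample so spikes can be lined up afterwards."""
    for _ in range(samples):
        load1, states = snapshot()
        print(time.strftime("%H:%M:%S"), f"load1={load1}",
              f"running={states['R']}", f"uninterruptible={states['D']}")
        time.sleep(interval)
```

If the uninterruptible count jumps together with the load, the spike is more likely I/O wait than Asterisk burning CPU, and iotop is the next place to look.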

  • I assumed the spikes were within the Asterisk process. If the spikes last long enough, use htop and iotop to see whether they come from outside your process.

    If they are outside the Asterisk process, there are lots of generic troubleshooting guides. If they are within the Asterisk process (and there is no transcoding), turn the verbose level way up and watch the CLI for clues when a spike occurs.
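
If the spike does turn out to be inside the Asterisk process, a rough way to narrow it down is to see which threads used the most CPU over a short window. This is only a sketch built on /proc/<pid>/task. The pid in the usage note is a placeholder and the function names are invented for illustration.

```python
from __future__ import annotations

import time
from pathlib import Path


def cpu_ticks(stat_text: str) -> tuple[str, int]:
    """Return (thread name, utime + stime in clock ticks) from a stat line."""
    head, tail = stat_text.rsplit(")", 1)
    name = head.split("(", 1)[1]
    fields = tail.split()
    # tail starts at field 3 (state), so utime (field 14) and stime
    # (field 15) are at index 11 and 12.
    return name, int(fields[11]) + int(fields[12])


def sample(pid: int, proc: Path = Path("/proc")) -> dict[str, tuple[str, int]]:
    """Map thread id -> (name, cumulative CPU ticks) for one process."""
    out = {}
    for stat in (proc / str(pid) / "task").glob("[0-9]*/stat"):
        try:
            out[stat.parent.name] = cpu_ticks(stat.read_text())
        except (OSError, IndexError, ValueError):
            continue  # thread exited or line was not parseable
    return out


def busiest_threads(pid: int, interval: float = 1.0, top: int = 10):
    """Return the threads that used the most CPU ticks during the interval."""
    before = sample(pid)
    time.sleep(interval)
    after = sample(pid)
    deltas = [(ticks - before[tid][1], tid, name)
              for tid, (name, ticks) in after.items() if tid in before]
    return sorted(deltas, reverse=True)[:top]
```

Running busiest_threads(1234) during a spike (with 1234 replaced by the real Asterisk pid) gives thread ids that can then be compared with what the Asterisk CLI shows, for example in "core show threads".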

  • Try setting transcode_via_sln=no in /etc/asterisk/asterisk.conf and restart Asterisk. A reload will NOT apply the new value. Setting it to no seems to smooth out CPU usage on one of my servers.
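
For reference, a sketch of where that line would go, assuming the stock layout of asterisk.conf in which general settings live under [options]. Check your own file before copying this.

```ini
[options]
transcode_via_sln = no    ; build transcode paths directly instead of via signed linear
```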


    http://help.nyigc.net/