CEL Entries Over ODBC Several Hours Late (Matthew Jordan)
Hi Matthew
Thank you very much for the reply.
I must have something seriously wrong somewhere else then – I retested now and the “apparent” effect is as I describe but your info definitely contradicts that. But you’re obviously correct.
One more question – I’ve noted that if I run a combination of queries in the CEL backing DB (MariaDB) and the CEL table is locked, this severely affects the Asterisk instance – thousands of occurrences of
chan_sip.c:4057 __sip_autodestruct: Autodestruct on dialog '6a9f5d3543b619655e07c81437373a32@172.17.12.3:5060' with owner SIP/3034-000207c8 in place (Method: BYE). Rescheduling destruction for 10000 ms
appear in the CLI and users complain that if they hang up then they cannot make another call on the same SIP handset for several minutes.
This is presumably because the dialplan gets delayed in the ‘h’ extension: it cannot write to the CEL table and has to wait for the MariaDB instance to clear the locks before it can write again. The messages above apparently come from a watchdog that monitors how long the ‘h’ extension takes and, if it takes “too long”, forces the channels closed.
Is this assumption correct?
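To gauge how widespread the stalls are, the warnings quoted above can be tallied per peer. A minimal sketch, assuming the CLI output has been captured as text and that each warning keeps the "with owner <channel> in place" wording shown above; the helper name and the second and third sample lines are made up for illustration:

```python
import re
from collections import Counter

# Captures the channel name in "... with owner SIP/3034-000207c8 in place ..."
OWNER_RE = re.compile(r"Autodestruct on dialog .* with owner (\S+) in place")

def count_autodestruct_by_peer(lines):
    """Count 'Autodestruct' reschedule warnings per peer (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = OWNER_RE.search(line)
        if match:
            # "SIP/3034-000207c8" -> "SIP/3034": drop the per-call suffix.
            peer = match.group(1).rsplit("-", 1)[0]
            counts[peer] += 1
    return counts

# Sample input: the first line is from this thread, the others are invented.
sample = [
    "chan_sip.c:4057 __sip_autodestruct: Autodestruct on dialog "
    "'6a9f5d3543b619655e07c81437373a32@172.17.12.3:5060' with owner "
    "SIP/3034-000207c8 in place (Method: BYE). Rescheduling destruction for 10000 ms",
    "chan_sip.c:4057 __sip_autodestruct: Autodestruct on dialog "
    "'aaaa@172.17.12.3:5060' with owner SIP/3034-000207c9 in place (Method: BYE). "
    "Rescheduling destruction for 10000 ms",
    "chan_sip.c:4057 __sip_autodestruct: Autodestruct on dialog "
    "'bbbb@172.17.12.3:5060' with owner SIP/3050-000207ca in place (Method: BYE). "
    "Rescheduling destruction for 10000 ms",
]

for peer, hits in count_autodestruct_by_peer(sample).most_common():
    print(peer, hits)
```

A peer that dominates the tally points at a single stuck handset; an even spread across peers is more consistent with a shared cause such as the locked table.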
Additionally, it seems that the writing of CEL in Asterisk is NOT async? I.e. it appears that the thread that was running the conversation ALSO does the CEL writing / pushing to the CEL core, as you describe, in a synchronous manner.
Is it correct that CEL writes are synchronous in 1.8, and do newer Asterisk versions do them asynchronously?
My point is that if there are major DB issues, it is a bit of kryptonite to have them “spill back” into Asterisk and start blocking users from calling out. Wouldn’t it be much better to have failed CEL writes simply die in a separate thread instead of in the main call thread for the channels running on that handset?
Or am I completely misunderstanding things?
Anyway, thanks for the reply. 🙂
Kind regards,
will CELs even ODBC?
Asterisk does not buffer CEL entries. If anything, it pushes the entries out to ODBC much more aggressively than what you would get with CDRs.
An event is generated in Asterisk that corresponds to the CEL entry. That event is pushed over a message bus (the ‘event’ message bus in 1.8 – 11; ‘stasis’ in 12+) and is picked up by the CEL core. The events are immediately sent to the registered backends, which in turn immediately write them out to the storage they support. In the case of ODBC, this immediately does an INSERT into the appropriate table.
In Asterisk 1.8, you can look for a verbose level 11 message that will show when this occurs:
ast_verb(11, "[%s]\n", ast_str_buffer(sql));
In later versions, this was turned into a debug level 3 message (as anything over a verbose 5/debug 5 was cleaned up).
If you see that message, then that will tell you when Asterisk *believes* it has written the CEL entry. If the entry doesn’t show up in the database, then the problem is in either the ODBC driver or the MariaDB database.
If you don’t see that message, then something is preventing those events from getting delivered inside of Asterisk, which would only occur if you had some other serious call related issues occurring.
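One way to act on this is to pull the logged INSERT statements and their timestamps out of the Asterisk log, then compare them with when the rows appear in the database. A rough sketch, assuming each log line starts with a bracketed timestamp and ends with the bracketed SQL; the actual layout depends on your logger.conf, and the helper name, table, columns and sample lines here are invented:

```python
import re

# Assumed layout: "[<timestamp>] ... [INSERT INTO ...]"; adjust to your logger.conf.
INSERT_RE = re.compile(r"^\[(?P<ts>[^\]]+)\].*\[(?P<sql>INSERT INTO .*)\]\s*$")

def find_cel_inserts(lines):
    """Return (timestamp, sql) pairs for logged INSERT statements (hypothetical helper)."""
    found = []
    for line in lines:
        match = INSERT_RE.match(line)
        if match:
            found.append((match.group("ts"), match.group("sql")))
    return found

# Invented sample lines; real output will differ in detail.
sample_log = [
    "[Dec 10 12:57:01] VERBOSE[1234] cel_odbc.c: "
    "[INSERT INTO cel (eventtype) VALUES ('HANGUP')]",
    "[Dec 10 12:57:02] NOTICE[1234] chan_sip.c: some unrelated message",
]

for ts, sql in find_cel_inserts(sample_log):
    print(ts, sql)
```

If the timestamps logged here are close to the call times but the rows only become visible hours later, the delay sits after Asterisk, in the ODBC driver or the database.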
Matt
2 thoughts on - CEL Entries Over ODBC Several Hours Late (Matthew Jordan)
Sorry for the probably obvious question, but it’s better to cover all bases.
Is the DBMS running on the same box as Asterisk? If that’s the case, then maybe the DBMS is using too much CPU and starving Asterisk.
2015-12-10 12:57 GMT-02:00 Stefan Viljoen:
That’s actually a bit surprising. Assuming you are using CELGenUserEvent in the ‘h’ extension, I would not expect that to block the channel when writing out to the database. The act of creating the CEL event will lock the channel briefly, but the actual CEL event is queued up onto a message bus and dispatched to another thread. I’d have to see the output of a gdb backtrace or ‘core show locks’ to know why that is impacting the channels.
It should be asynchronous everywhere – in 1.8+. While the implementation of the message bus changed in Asterisk 12, that doesn’t change the nature of how it is dispatched. As I said, a gdb backtrace or ‘core show locks’ would show who the culprit is.
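The dispatch model described above (the event is queued on a bus and a different thread does the write) can be illustrated with a toy example. This is only a sketch of the general pattern in Python, not Asterisk's implementation; `SlowBackend` and `raise_events` are hypothetical names, and the sleep stands in for an INSERT waiting on a table lock:

```python
import queue
import threading
import time

class SlowBackend:
    """Hypothetical backend whose writes stall, e.g. an INSERT waiting on a lock."""

    def __init__(self, delay):
        self.delay = delay
        self.rows = []

    def write(self, event):
        time.sleep(self.delay)  # simulate the blocked INSERT
        self.rows.append(event)

def raise_events(events, backend):
    """Queue events for a worker thread; return (seconds spent queuing, worker)."""
    bus = queue.Queue()

    def worker():
        # Only this thread waits on the backend, never the caller.
        while True:
            event = bus.get()
            if event is None:  # sentinel: no more events
                break
            backend.write(event)

    thread = threading.Thread(target=worker)
    thread.start()

    start = time.monotonic()
    for event in events:
        bus.put(event)  # returns immediately; the queue is unbounded
    elapsed = time.monotonic() - start

    bus.put(None)
    return elapsed, thread

backend = SlowBackend(delay=0.1)
elapsed, worker = raise_events(["CHAN_START", "HANGUP", "CHAN_END"], backend)
worker.join()  # wait only so the demo can inspect the result
print(f"queuing took {elapsed:.4f}s, rows written: {backend.rows}")
```

In this model the caller spends almost no time queuing even though each write takes 0.1 s, so a slow backend on its own would not hold up the thread raising the events. That is why a stall on the channel side calls for a backtrace rather than an assumption about the database.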