ZWay crashed with SEGV + core dump

Posted: 13 Jan 2024 12:01
by piet66
Hi,
my z-way-server crashed this morning, a core dump was written.

As far as I can see, it was killed with SEGV in thread zway/core, libv8.so at 0x0000000076bef49c.

Perhaps further insights can be gained from the attached compilation. If I can contribute further information, please ask.

Re: ZWay crashed with SEGV + core dump

Posted: 19 Jan 2024 19:38
by PoltoS
Please send the exact build. Or is it 3.2.3? If so, it is too old for us to hunt for issues in it... But we will have a look at it.

Re: ZWay crashed with SEGV + core dump

Posted: 19 Jan 2024 19:51
by piet66
It is 3.2.3.
Since this problem has existed for years across all Z-Way releases, I don't expect different behavior in the current release.

I will not upgrade to release 4 till the issue with growing CPU load is solved.

Re: ZWay crashed with SEGV + core dump

Posted: 20 Jan 2024 03:10
by PoltoS
Do you have evidence that this crash occurred on newer releases? Am I right that this is the first time it happened to you?

Re: ZWay crashed with SEGV + core dump

Posted: 20 Jan 2024 11:52
by piet66
This happens from time to time, about 10 - 15 times per year.
I just didn't know why until now. Only since the last two times have I known that it is a SEGV (after I moved to systemd).

Normally it is not a big problem for me, because my watchdog restarts the server automatically. However, at the end of last year, z-way-server refused to restart in the middle of the night.

This is a problem for me because the heating then stays cold. I have therefore decided to analyze the problem in detail now.

Re: ZWay crashed with SEGV + core dump

Posted: 28 Jan 2024 05:08
by PoltoS
Do any of you know how to reproduce it, or how to make it happen more often, so that we can catch it? We believe stability is very important in automation and would like to analyse the issue and track it down. It would be better to track it on the new 4.1.2 rather than on 3.2.3, but we can do it with 3.2.3 too.

Re: ZWay crashed with SEGV + core dump

Posted: 28 Jan 2024 11:22
by piet66
I don't see any possibility of reproduction. The occurrence is completely unpredictable. One month it was 5 times, another time nothing in 4 months.

It's always the same: SEGV in v8 at the same address.

Re: ZWay crashed with SEGV + core dump

Posted: 28 Jan 2024 14:38
by GokMasE
piet66 wrote:
28 Jan 2024 11:22
I don't see any possibility of reproduction. The occurrence is completely unpredictable. One month it was 5 times, another time nothing in 4 months.

It's always the same: SEGV in v8 at the same address.
Any chance that the segmentation fault you have been seeing could be the same problem that was discussed in this thread?

https://forum.z-wave.me/viewtopic.php?f=3419&t=35688

That particular issue was concerning incoming requests and was present in official releases until 4.0.3 IIRC.
While the best option would probably be to sort out problems in current builds, maybe the fix for the SIGSEGV from 4.0.3 (adding the missing JS lock) could be added to older builds too, if it turns out it is actually the same problem.

As for growing CPU load under recent builds, I can't say I have noticed that, but I did have 2 CF cards fail on me within 3 months this autumn. Whether that was due to bad luck, old CF cards, or something in Z-Way putting more stress on them, I really cannot say.

Re: ZWay crashed with SEGV + core dump

Posted: 28 Jan 2024 19:33
by piet66
GokMasE wrote:
28 Jan 2024 14:38
Any chance that the segmentation fault you have been seeing could be the same problem that was discussed in this thread?

https://forum.z-wave.me/viewtopic.php?f=3419&t=35688

That particular issue was concerning incoming requests and was present in official releases until 4.0.3 IIRC.
While the best option would probably be to sort out problems in current builds, maybe the fix for the SIGSEGV from 4.0.3 (adding the missing JS lock) could be added to older builds too, if it turns out it is actually the same problem.
Thank you for the hint. I had seen this thread but lost track of it a bit. It was not clear to me whether the problem there had been solved.

As I understand it, that thread was about a SEGV crash caused by HTTP calls in releases before 4.0.3? Is that correct?
And the increasing CPU load in 4.1.2 is possibly also due to the HTTP calls?
If so, then it doesn't really matter which ZWay version I use.

Re: ZWay crashed with SEGV + core dump

Posted: 28 Jan 2024 19:52
by piet66
I thought about whether the error pattern in that thread could also apply to me, but came to no conclusion:
PoltoS wrote:
13 Jan 2023 19:32
I'll try to share the solution with you. In short, the HTTP module (modhttp.so) is a wrapper of cURL for Google V8 JS engine Z-Way is using. The problem was that the thread releasing pointers to JS callback functions was doing it without a JS lock. All that was ok until HTTP answer was returning with a significant delay - this led to many objects being allocated and then freed. V8 has an internal garbage collector (GC) and when an object is released immediately, it is not involved. But when they are allocated in mass and then freed, it is called regularly. So it happened that GC was working while HTTP thread was releasing pointers to callbacks. A simple lock missing.

The clue to stably reproduce it was many concurrent requests (generated by HTTPGet, Tasmota or Z-Way-to-Z-Way binding) AND slow response of the remote side.
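The missing-lock pattern described in the quote can be illustrated with a small, generic sketch. This is not Z-Way, modhttp.so or V8 code: `CallbackTable`, `release`, `collect` and `run` are hypothetical names, and a plain `threading.Lock` only plays the role of the JS lock. One thread releases callback handles while another walks the same table, as a garbage collector pass would. Both take the lock, which is the step the quoted explanation says was missing.

```python
import threading


class CallbackTable:
    """Toy stand-in for engine-owned state shared between threads (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()  # plays the role of the "JS lock"
        self._callbacks = {}
        self._next_id = 0

    def register(self, fn):
        with self._lock:
            handle = self._next_id
            self._next_id += 1
            self._callbacks[handle] = fn
            return handle

    def release(self, handle):
        # The quoted bug corresponds to doing this step WITHOUT the lock.
        with self._lock:
            self._callbacks.pop(handle, None)

    def collect(self):
        # Stand-in for a GC pass that walks all live callbacks.
        with self._lock:
            return sum(1 for _ in self._callbacks.values())


def run(n_requests=2000, gc_passes=200):
    """Release many callbacks while a 'collector' walks the table concurrently."""
    table = CallbackTable()
    handles = [table.register(lambda: None) for _ in range(n_requests)]
    errors = []

    def releaser():
        for handle in handles:
            table.release(handle)

    def collector():
        try:
            for _ in range(gc_passes):
                table.collect()
        except RuntimeError as exc:  # e.g. "dictionary changed size during iteration"
            errors.append(exc)

    threads = [threading.Thread(target=releaser), threading.Thread(target=collector)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(errors), table.collect()
```

If the `with self._lock:` lines in `release` and `collect` are removed, the collector can occasionally hit a `RuntimeError` because the table changes while it is being walked. In native code the equivalent race tends to show up as a sporadic crash rather than a clean exception, which would fit a failure that only appears under particular timing.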
I do indeed use HTTP calls with many instances. But each instance usually sends only one POST every few minutes. In rare cases, the response may take a little longer.

PS: I looked at the last two crashes: There was no unusual accumulation of http calls.
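For anyone who wants to test whether the recipe from the quoted post (many concurrent requests plus a slow remote side) triggers the crash on their own installation, a deliberately slow HTTP endpoint can serve as the remote side. The following is only a sketch using the Python standard library. The names `SlowHandler` and `start_slow_server`, the 5 second default delay and the bind address are assumptions, not anything taken from Z-Way.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class SlowHandler(BaseHTTPRequestHandler):
    """Answers every GET/POST with 'ok' after an artificial delay."""

    delay_seconds = 5.0  # assumed value; tune to taste

    def _reply(self):
        # Drain any request body so POST clients are not left hanging.
        length = int(self.headers.get("Content-Length", 0))
        if length:
            self.rfile.read(length)
        time.sleep(self.delay_seconds)  # simulate the slow remote side
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    do_GET = _reply
    do_POST = _reply

    def log_message(self, fmt, *args):
        pass  # keep the console quiet


def start_slow_server(host="127.0.0.1", port=0, delay_seconds=5.0):
    """Start the server in a background thread; port=0 picks a free port."""
    handler = type("Handler", (SlowHandler,), {"delay_seconds": delay_seconds})
    server = ThreadingHTTPServer((host, port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `start_slow_server(host="0.0.0.0", port=8099)` on a machine in the same network would make the endpoint reachable from the Z-Way host (port 8099 is an arbitrary choice). The HTTP-calling instances would then be pointed at it. Because each request is handled in its own thread, many slow responses can be outstanding at the same time.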