job queue runs full after upgrade to v4.1.0

Official announcements from Z-Wave.Me team. Check this forum to get information about new releases and bug fixes.
johanneslarsson
Posts: 18
Joined: 29 Jun 2023 21:18

Re: job queue runs full after upgrade to v4.1.0

Post by johanneslarsson »

Finally its in a hanged state, again. I'll try to share the logs later today
johanneslarsson
Posts: 18
Joined: 29 Jun 2023 21:18

Re: job queue runs full after upgrade to v4.1.0

Post by johanneslarsson »

Email sent with logs and enabled remote support access.

I also see this when I run strace, unsure if its helpful. But it could indicate that process is stuck in a sleep?

Code: Select all

root@raspberrypi:~# strace -p 475
strace: Process 475 attached
[ Process PID=475 runs in 32 bit mode. ]
strace: WARNING: Proper structure decoding for this personality is not supported, please consider building strace with mpers support enabled.
restart_syscall(<... resuming interrupted io_setup ...>) = 0
clock_nanosleep_time64(CLOCK_REALTIME, 0, {tv_sec=1, tv_nsec=17844176517316214784}, 0xffd16628) = 0
clock_nanosleep_time64(CLOCK_REALTIME, 0, {tv_sec=1, tv_nsec=17844176517316214784}, 0xffd16628) = 0
clock_nanosleep_time64(CLOCK_REALTIME, 0, {tv_sec=1, tv_nsec=17844176517316214784}, 0xffd16628) = 0
clock_nanosleep_time64(CLOCK_REALTIME, 0, {tv_sec=1, tv_nsec=17844176517316214784}, 0xffd16628) = 0
clock_nanosleep_time64(CLOCK_REALTIME, 0, {tv_sec=1, tv_nsec=17844176517316214784}, 0xffd16628) = 0

queue screenshot
queue screenshot
Screenshot 2023-07-03 at 09.09.17.png (224.78 KiB) Viewed 6171 times
johanneslarsson
Posts: 18
Joined: 29 Jun 2023 21:18

Re: job queue runs full after upgrade to v4.1.0

Post by johanneslarsson »

The full queue as json

Code: Select all

[[0.20000000298023224, [0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 128, 0, 0], 9, "Battery Get", null, [9, 2, 128, 2, 5], 4030764512, []], [20.100000381469727, [3, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 128, 0, 0], 3, "Meter Get (v2/3)", "Not delivered to recipient\nNot delivered to recipient", [3, 3, 50, 1, 0, 37], 4048551728, []], [0.20000000298023224, [0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0], 10, "Time Parameters Set", null, [10, 13, 86, 1, 139, 1, 7, 231, 7, 3, 3, 0, 0, 164, 234, 5], 4048364848, []], [0.20000000298023224, [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 2, 0, 0], 13, "SwitchMultilevel Set", null, [13, 8, 108, 1, 139, 4, 38, 1, 255, 255, 37], 4048294280, []], [0.20000000298023224, [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 128, 0, 0], 4, "SwitchBinary Set", null, [4, 3, 37, 1, 255, 37], 4048354400, []], [0.20000000298023224, [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 128, 0, 0], 4, "SwitchBinary Get", null, [4, 2, 37, 2, 37], 4048421472, []], [0.20000000298023224, [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 2, 0, 0], 13, "SwitchMultilevel Set", null, [13, 8, 108, 1, 140, 4, 38, 1, 99, 255, 37], 4030761496, []], [0.20000000298023224, [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 128, 0, 0], 2, "SwitchBinary Set", null, [2, 3, 37, 1, 0, 37], 4048462416, []], [0.20000000298023224, [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 128, 0, 0], 2, "SwitchBinary Set", null, [2, 3, 37, 1, 255, 37], 4030783320, []], [0.20000000298023224, [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 128, 0, 0], 2, "SwitchBinary Get", null, [2, 2, 37, 2, 37], 4048401256, []], [0.20000000298023224, [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 2, 0, 0], 13, "SwitchMultilevel Set", null, [13, 8, 108, 1, 141, 4, 38, 1, 99, 255, 37], 4030779528, []], [0.20000000298023224, [0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0], 0, "Get statistics gathered by the Z-Wave protocol", null, [], 4030842608, []], [0.20000000298023224, [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 128, 0, 0], 2, "SensorMultilevel V5 Get", null, [2, 4, 49, 4, 4, 0, 37], 4048439232, []], [0.20000000298023224, [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 128, 0, 0], 2, "Meter Get (v2/3)", null, [2, 3, 50, 1, 0, 37], 4048456584, []], [0.20000000298023224, [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 128, 0, 0], 2, "Meter Get (v2/3)", null, [2, 3, 50, 1, 16, 37], 4048261632, []], [0.20000000298023224, [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 128, 0, 0], 2, "Alarm Get (v3)", null, [2, 5, 113, 4, 0, 8, 0, 37], 4048263672, []], [0.20000000298023224, [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 128, 0, 0], 3, "Meter Get (v2/3)", null, [3, 3, 50, 1, 0, 37], 4048265752, []], [0.20000000298023224, [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 128, 0, 0], 3, "Meter Get (v2/3)", null, [3, 3, 50, 1, 16, 37], 4048268168, []], [0.20000000298023224, [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 128, 0, 0], 3, "Alarm Get (v3)", null, [3, 5, 113, 4, 0, 8, 0, 37], 4048301880, []], [0.20000000298023224, [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 128, 0, 0], 4, "SensorMultilevel V5 Get", null, [4, 4, 49, 4, 4, 0, 37], 4048283208, []], [0.20000000298023224, [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 128, 0, 0], 4, "Meter Get (v2/3)", null, [4, 3, 50, 1, 0, 37], 4030778984, []], [0.20000000298023224, [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 128, 0, 0], 4, "Meter Get (v2/3)", null, [4, 3, 50, 1, 16, 37], 4048281056, []], [0.20000000298023224, [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 128, 0, 0], 4, "Alarm Get (v3)", null, [4, 5, 113, 4, 0, 8, 0, 37], 4030814984, []], [0.20000000298023224, [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 2, 0, 0], 13, "Meter Get (v2/3)", null, [13, 3, 50, 1, 0, 37], 4030842680, []], [0.20000000298023224, [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 2, 0, 0], 13, "Meter Get (v2/3)", null, [13, 3, 50, 1, 16, 37], 4048255416, []], [0.20000000298023224, [0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0], 0, "Get background noise level", null, [], 4048310600, []]]
thomas@unverzagt.org
Posts: 2
Joined: 17 Oct 2021 15:17

Re: job queue runs full after upgrade to v4.1.0

Post by thomas@unverzagt.org »

PoltoS wrote:
03 Jul 2023 01:24
@thomas please provide your log file and the queue inspector view (screenshot or better save the html page) (click on Jobs on the top of expert UI)
Image

https://1drv.ms/i/s!AtbxelMsNXH-sUdUKzY ... C?e=axQKZC



Device 76 seems to be the bad guy (probably out of range during "set" command)
It became evident when I tried to switch on device 56 (Fibaro Wall Plug).

It is curious because many inbound things still work, like window contacts.

BR
Thomas
User avatar
PoltoS
Posts: 7588
Joined: 26 Jan 2011 19:36

Re: job queue runs full after upgrade to v4.1.0

Post by PoltoS »

Good news! The issue was reproduced and located.

Why was it so hard to find and reproduce in our lab? And how it passed our release tests.

The issue happened when a Security S0 encrypted packet was not delivered three times in a row right after receiving a Nonce Report (which is very rare). The clean-up of the undelivered Security S0 packet was incorrect, making the queue stuck. The node became undelivered due to routing failures that happen from time to time in real life.

We are preparing the fix. Thank you all for your support.
User avatar
PoltoS
Posts: 7588
Joined: 26 Jan 2011 19:36

Re: job queue runs full after upgrade to v4.1.0

Post by PoltoS »

Please check this version: z-way-4.1.0-27-g589c9812-lws16_armhf.deb
johanneslarsson
Posts: 18
Joined: 29 Jun 2023 21:18

Re: job queue runs full after upgrade to v4.1.0

Post by johanneslarsson »

PoltoS wrote:
06 Jul 2023 03:45
Please check this version: z-way-4.1.0-27-g589c9812-lws16_armhf.deb
Installed it now
J.nissen
Posts: 111
Joined: 02 Sep 2016 22:12

Re: job queue runs full after upgrade to v4.1.0

Post by J.nissen »

Installed
PKG=z-way-4.1.0-27-g589c9812-lws16_armhf.deb && wget -O /tmp/$PKG https://storage.z-wave.me/z-way-server/$PKG && sudo dpkg -i /tmp/$PKG
Greetings from germany
Jörg

Raspi 4(Buster), Zwave Version 4.1.1 Raz 7 Pro, Serial API Version:07.38(SDK 7.20)
J.nissen
Posts: 111
Joined: 02 Sep 2016 22:12

Re: job queue runs full after upgrade to v4.1.0

Post by J.nissen »

Hello PoltoS,

it doesn´t work
2023-07-07 07_59_41-Z-Wave Expert UI – Mozilla Firefox.png
2023-07-07 07_59_41-Z-Wave Expert UI – Mozilla Firefox.png (50.51 KiB) Viewed 6000 times

Code: Select all

[2023-07-07 00:00:00.020] [I] [zway] Job 0x13 (Nonce Get to node 55): Not delivered to recipient
[2023-07-07 00:00:00.021] [D] [zway] SENDING (cb 0xa4): ( 01 09 00 13 37 02 98 40 25 A4 89 )
[2023-07-07 00:00:00.025] [D] [zway] RECEIVED ACK
[2023-07-07 00:00:00.026] [D] [zway] RECEIVED: ( 01 04 01 13 01 E8 )
[2023-07-07 00:00:00.026] [D] [zway] SENT ACK
[2023-07-07 00:00:00.026] [D] [zway] Delivered to Z-Wave stack
[2023-07-07 00:00:00.031] [D] [zway] RECEIVED: ( 01 1D 00 13 A4 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 57 )
[2023-07-07 00:00:00.031] [D] [zway] SENT ACK
[2023-07-07 00:00:00.031] [I] [zway] Job 0x13 (Nonce Get to node 55): Not delivered to recipient
[2023-07-07 00:00:00.032] [D] [zway] SENDING (cb 0xa5): ( 01 09 00 13 37 02 98 40 25 A5 88 )
[2023-07-07 00:00:00.035] [D] [zway] RECEIVED ACK
[2023-07-07 00:00:00.036] [D] [zway] RECEIVED: ( 01 04 01 13 01 E8 )
[2023-07-07 00:00:00.036] [D] [zway] SENT ACK
[2023-07-07 00:00:00.036] [D] [zway] Delivered to Z-Wave stack
[2023-07-07 00:00:00.041] [D] [zway] RECEIVED: ( 01 1D 00 13 A5 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 56 )
[2023-07-07 00:00:00.041] [D] [zway] SENT ACK
[2023-07-07 00:00:00.041] [I] [zway] Job 0x13 (Nonce Get to node 55): Not delivered to recipient
[2023-07-07 00:00:00.042] [I] [zway] Job 0x13 (Nonce Get to node 55): Removing job due to too much retransmitions
[2023-07-07 00:00:00.042] [I] [zway] Adding job: Check if node is failed to node 55
[2023-07-07 00:00:00.042] [D] [zway] Job 0x13 (Nonce Get to node 55): fail
[2023-07-07 00:00:00.042] [I] [zway] Node 55:0 CC Security: sending Nonce Get
[2023-07-07 00:00:00.042] [D] [zway] SETDATA devices.55.instances.0.commandClasses.152.data.rNonceAckWait = **********
[2023-07-07 00:00:00.042] [I] [zway] Adding job: Nonce Get to node 55
[2023-07-07 00:00:00.042] [I] [zway] Removing job: Nonce Get to node 55
[2023-07-07 00:00:00.042] [D] [zway] SENDING: ( 01 04 00 62 37 AE )
[2023-07-07 00:00:00.045] [D] [zway] RECEIVED ACK
[2023-07-07 00:00:00.046] [D] [zway] RECEIVED: ( 01 04 01 62 00 98 )
[2023-07-07 00:00:00.046] [D] [zway] SENT ACK
[2023-07-07 00:00:00.046] [D] [zway] SETDATA devices.55.data.isFailed = False
[2023-07-07 00:00:00.046] [I] [zway] Job 0x62 (Check if node is failed to node 55): Node 55 is operating
[2023-07-07 00:00:00.046] [D] [zway] Job 0x62 (Check if node is failed to node 55): success
[2023-07-07 00:00:00.046] [D] [zway] SETDATA devices.55.data.failureCount = 0 (0x00000000)
[2023-07-07 00:00:00.046] [D] [zway] SETDATA devices.55.data.lastSendInternal = **********
[2023-07-07 00:00:00.046] [I] [zway] Removing job: Check if node is failed to node 55
[2023-07-07 00:00:00.088] [D] [zway] SENDING (cb 0xa6): ( 01 09 00 13 37 02 98 40 25 A6 8B )
[2023-07-07 00:00:00.091] [D] [zway] RECEIVED ACK
[2023-07-07 00:00:00.092] [D] [zway] RECEIVED: ( 01 04 01 13 01 E8 )
[2023-07-07 00:00:00.092] [D] [zway] SENT ACK
[2023-07-07 00:00:00.092] [D] [zway] Delivered to Z-Wave stack
[2023-07-07 00:00:00.097] [D] [zway] RECEIVED: ( 01 1D 00 13 A6 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 )
[2023-07-07 00:00:00.097] [D] [zway] SENT ACK
[2023-07-07 00:00:00.097] [I] [zway] Job 0x13 (Nonce Get to node 55): Not delivered to recipient
[2023-07-07 00:00:00.098] [D] [zway] SENDING (cb 0xa7): ( 01 09 00 13 37 02 98 40 25 A7 8A )
[2023-07-07 00:00:00.102] [D] [zway] RECEIVED ACK
[2023-07-07 00:00:00.103] [D] [zway] RECEIVED: ( 01 04 01 13 01 E8 )
[2023-07-07 00:00:00.103] [D] [zway] SENT ACK
[2023-07-07 00:00:00.103] [D] [zway] Delivered to Z-Wave stack
[2023-07-07 00:00:00.103] [D] [zway] Job 0x13: deleted from queue
[2023-07-07 00:00:00.106] [D] [zway] Job 0x62: deleted from queue
[2023-07-07 00:00:00.107] [D] [zway] RECEIVED: ( 01 1D 00 13 A7 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 54 )
[2023-07-07 00:00:00.107] [D] [zway] SENT ACK
[2023-07-07 00:00:00.108] [I] [zway] Job 0x13 (Nonce Get to node 55): Not delivered to recipient
[2023-07-07 00:00:00.108] [D] [zway] SENDING (cb 0xa8): ( 01 09 00 13 37 02 98 40 25 A8 85 )
[2023-07-07 00:00:00.112] [D] [zway] RECEIVED ACK
[2023-07-07 00:00:00.113] [D] [zway] RECEIVED: ( 01 04 01 13 01 E8 )
[2023-07-07 00:00:00.113] [D] [zway] SENT ACK
[2023-07-07 00:00:00.113] [D] [zway] Delivered to Z-Wave stack
[2023-07-07 00:00:00.118] [D] [zway] RECEIVED: ( 01 1D 00 13 A8 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 5B )
[2023-07-07 00:00:00.118] [D] [zway] SENT ACK
[2023-07-07 00:00:00.118] [I] [zway] Job 0x13 (Nonce Get to node 55): Not delivered to recipient
[2023-07-07 00:00:00.119] [I] [zway] Job 0x13 (Nonce Get to node 55): Removing job due to too much retransmitions
[2023-07-07 00:00:00.119] [I] [zway] Adding job: Check if node is failed to node 55
[2023-07-07 00:00:00.119] [D] [zway] Job 0x13 (Nonce Get to node 55): fail
[2023-07-07 00:00:00.119] [I] [zway] Node 55:0 CC Security: sending Nonce Get
[2023-07-07 00:00:00.119] [D] [zway] SETDATA devices.55.instances.0.commandClasses.152.data.rNonceAckWait = **********
Greetings from germany
Jörg

Raspi 4(Buster), Zwave Version 4.1.1 Raz 7 Pro, Serial API Version:07.38(SDK 7.20)
AlesKO
Posts: 84
Joined: 24 Nov 2016 09:58

Re: job queue runs full after upgrade to v4.1.0

Post by AlesKO »

I can confirm that installing 'z-way-4.1.0-27-g589c9812-lws16_armhf.deb' solved my problem.
Installed 24 hours ago and Job queue is empty right now :-)

Z-way works just like before upgrading to v4.1.0

Thank you PoltoS !
Post Reply