
LuaMutex not released in Girder2Girder.lua



kjellrni
March 16th, 2007, 04:29 PM
Hi all.

I have two computers, 'tazenda' and 'terminus', both running WinXP SP2 and communicating with each other through Girder2Girder.

Once every few weeks, Girder hits an error in Girder2Girder.lua, and I believe it's causing the LuaMutex not to be released.

Below are the relevant lines from the Lua console (with G2G logging enabled) and the log. The Lua console on tazenda shows the error occurring in Girder2Girder.lua, then the log display shows the plugin KalganSerial (my own plugin) not being able to acquire the LuaMutex.

This time the error occurred right after Girder started, but usually it happens when Girder has been running for days. Also, it seems to happen mostly after terminus has come out of standby after sleeping through the night.

Computer 'tazenda', running Girder 4.0.14:

Interactive Lua Console:


21:44:29:542: G2G: Enqueue terminus
21:44:29:572: G2G: Error in callback: ...ramfiler\Promixis\Girder\luascript\Girder2Girder.lua:400: attempt to index local `c' (a number value)


Log Display:


Time Date Source Details Payloads
21:50:28:809 3/16/2007 KalganSerial Error obtaining LuaMutex, serial message lost
21:50:23:811 3/16/2007 KalganSerial Error obtaining LuaMutex, serial message lost
21:46:24:277 3/16/2007 KalganSerial Error obtaining LuaMutex, serial message lost
21:46:06:502 3/16/2007 KalganSerial Error obtaining LuaMutex, serial message lost
21:44:29:572 3/16/2007 Communication Server Outgoing Connection Authenticated
21:44:06:900 3/16/2007 LG LCD Communications OK
21:44:02:213 3/16/2007 Girder2Girder Found Client: terminus
21:44:01:422 3/16/2007 NetRemote Enabled
21:44:01:422 3/16/2007 Girder2Girder Started
21:44:01:422 3/16/2007 Communication Server: Servers Server (GIRDER): tazenda, @ 192.168.1.50:20005
21:44:01:422 3/16/2007 Communication Server: Servers Server (GIRDER): terminus, @ 192.168.1.7:20005
21:43:40:892 3/16/2007 KalganSerial Open
21:43:40:802 3/16/2007 WebServer Webserver Started


Computer 'terminus', running Girder 4.0.4.11:

Interactive Lua Console:


21:44:14:609: G2G: Client online: tazenda
21:44:44:187: G2G: Incoming client connection: tazenda


Log Display:


Time Date Source Details Payloads
21:44:44:187 3/16/2007 Communication Server Connection Authenticated 192.168.1.50 GIRDER
21:44:14:609 3/16/2007 Communication Server: Servers Server (GIRDER): tazenda, @ 192.168.1.50:20005
21:44:14:593 3/16/2007 Communication Server: Servers Server (GIRDER): terminus, @ 192.168.1.7:20005


Also, here is the function in Girder2Girder.lua that fails, with the failing line marked by arrows. My theory is that when the error occurs, signal(r) at the end of the function is never called, and that this is what leaves the LuaMutex unreleased.



local Pcallback = function(c, arg1, arg2, arg3, arg4)
    local wp; local r = 0
    lock()
    if arg1 == comserv.CMD.AUTH_SUCCESS then
        if c.Con and (c.Con:Status() == comserv.CMD.AUTH_SUCCESS) then
            c.Status = 2
            if cfg.diag then
                print("G2G: Connection authenticated", c.Hostname)
                print('G2G: Remote GUID:' ,c.Con:RemoteInstance())
            end
            r = CONNECTION + CLIENTCHANGE
        end
    elseif arg1 == comserv.CMD.AUTH_FAIL then
        c.Status = 4
        if c.Con then c.Con:Close(); c.Con = nil; end
        if cfg.diag then print("G2G: Authentication failure", c.Hostname) end
        r = CLIENTCHANGE
    elseif arg1 == comserv.CMD.CONNECT_FAILED then
        c.Status = 4
        if c.Con then c.Con:Close(); c.Con = nil; end
        if cfg.diag then print("G2G: Connection failure", c.Hostname) end
        r = CLIENTCHANGE
    elseif arg1 == comserv.CMD.CLOSE then
        c.Status = 4
        c.Con = nil
        gir.TriggerEventEx("ClientOffline", devno, 0, c.Hostname, c.Address)
        if cfg.diag then print("G2G: Connection closed", c.Hostname) end
        r = CLIENTCHANGE
    else
        wp = c.ToAck[arg2] -- <<<<<< ERROR HERE <<<<<<
        if wp and (wp.ackcid == arg1) then
            if cfg.diag then print("G2G: Acknowledge", arg2) end
            wp.ackp1 = arg3
            wp.ackp2 = arg4
            r = ACKNOWLEDGE
        end
    end
    signal(r)
end
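
The suspected failure mode can be reproduced in isolation. The sketch below is only an illustration: lock() and signal() are hypothetical stand-ins that flip a flag, not the real Girder2Girder functions, and the callback is reduced to the failing branch. When c arrives as a number, indexing it raises an error, pcall reports the failure, and the flag shows the lock was never released. The exact wording of the error message varies between Lua versions.

```lua
-- Hypothetical stand-ins for G2G's lock()/signal(); they only track a flag.
local held = false
local function lock() held = true end
local function signal(r) held = false end

-- Same shape as the failing branch: c is indexed without checking its type.
local function callback(c, arg1, arg2)
    local r = 0
    lock()
    local wp = c.ToAck[arg2]  -- raises an error when c is a number
    if wp then r = 1 end
    signal(r)                 -- skipped whenever the line above fails
end

-- Call the callback the way a protected dispatcher would, passing a number as c.
local ok, err = pcall(callback, 42, 0, 1)
print("callback succeeded:", ok)
print("error message:", err)
print("lock still held:", held)
```

Here pcall returns false and the flag stays set, which matches the symptom of other plugins failing to obtain the mutex afterwards.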


Other than this snag, I'm very happy with using Girder4 for home automation. I'm using a Java cell phone with bluetooth to remote control all of my appliances through Girder. It's very nice to be rid of all the regular remotes on the living room table, leaving just the cell phone.

Unfortunately, people get a bit annoyed when the remote stops working because of this error. I would be very happy if someone could take a look at it.

Best regards,
Kjell Ronny Nilsen

Ron
March 16th, 2007, 07:42 PM
Hmmmm, I'll have to have a close look at this.

JohnHind
March 19th, 2007, 02:46 AM
Ron asked me to have a look at this as the author of this code.

First, thank you for an excellent problem report - the precision and attention to detail was worthy of Hari Seldon himself!

You are correct that the error handler for this callback (and one other similar one) should release the mutex - the attached version adds these, which should stop G2G blocking other Lua applications.
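
The attached file is not reproduced in this thread, so the following is only a sketch of the general technique and not the actual change. It assumes hypothetical lock() and signal() stand-ins and a hypothetical guarded() wrapper: the callback body runs under pcall, and signal() is called on both the success path and the error path. The type check on c is an extra assumption added for illustration.

```lua
-- Hypothetical stand-ins for G2G's lock()/signal(); they only track a flag.
local held = false
local function lock() held = true end
local function signal(r) held = false end

-- Wrap a callback body so the lock is released whether or not the body errors.
local function guarded(body)
    return function(c, arg1, arg2, arg3, arg4)
        lock()
        local ok, result = pcall(body, c, arg1, arg2, arg3, arg4)
        if not ok then
            print("G2G: Error in callback: " .. tostring(result))
            result = 0          -- nothing to report to waiters
        end
        signal(result or 0)     -- always reached, even after an error
        return ok
    end
end

local Pcallback = guarded(function(c, arg1, arg2)
    -- Reject an unexpected c up front instead of indexing it blindly.
    if type(c) ~= "table" or type(c.ToAck) ~= "table" then
        error("unexpected callback argument of type " .. type(c))
    end
    local wp = c.ToAck[arg2]
    if wp and wp.ackcid == arg1 then return 1 end
    return 0
end)

local bad = Pcallback(42, 0, 1)                                   -- c is a number
local held_after_bad = held
local good = Pcallback({ ToAck = { [1] = { ackcid = 7 } } }, 7, 1) -- well-formed c
print("bad call ok:", bad, "lock held afterwards:", held_after_bad)
print("good call ok:", good, "lock held afterwards:", held)
```

Keeping the release in one wrapper means each branch of the callback does not have to remember to call signal() itself.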

I have a hunch about what might cause the root problem and have made some changes to try to avoid this.

Could you please test this version, preferably keeping "Show diagnostic messages" ticked so you can see if errors are still being trapped?

- John

kjellrni
March 19th, 2007, 07:46 AM
Thank you, Ron and John!

I feel deeply honored to be compared to Hari Seldon, the last great scientist of the First Empire! :-D Seriously though, as a fellow sw developer I know how important the accuracy and completeness of bug reports are.

I'll replace Girder2Girder.lua on both computers and report back - hopefully in a few weeks, after I'm convinced the problem is gone!

KR

kjellrni
April 25th, 2007, 07:08 AM
It has now been almost five weeks, and I am happy to say that the problem has not reappeared.

I have had Girder2Girder diagnostic messages enabled, and I have regularly checked the lua output. No error messages have shown up, regarding mutexes or anything else. Just regular debug info from G2G.

I'm pretty confident the problem is fixed, but to be absolutely sure I will keep checking the output for a few more weeks.

Thank you John and the rest of the Girder team for creating such a great product!