
Feedback from Girder to NR breaks on suspend



avid
May 25th, 2005, 09:45 PM
Does anyone else have the problem that Girder to NR communication stops working when the PPC suspends and resumes?

I actually use feedback from Girder for very little (the live TV programme name only) and so it has taken me a long while to decide that I really do have a problem and where it lies.

When my PPC sleeps and resumes, the feedback communication path stops working. All NR to Girder actions are fine. But any calls to NetRemote.SendLabel after the resume do nothing. I need to exit and restart NR to re-establish feedback communication.

Does anyone else see this? Is there anything I should be doing in my GML/CCF/LUA?

I am using the NetRemote Feedback LUA that is identical to the one Rob posted in http://www.promixis.com/phpBB2/viewtopic.php?t=11423

Flummoxed that no-one else is complaining ...

Brian

avid
May 26th, 2005, 10:54 PM
Is anyone using a PPC able to reproduce this? Or (even better) to refute it?

I suspect that most people don't use Girder feedback much - most feedback comes from Zoom or MediaBridge. So if it is broken, then this could have been broken for a very long time.

Brian

Rob H
May 27th, 2005, 01:24 AM
I use Girder feedback, but not on a PPC unfortunately.

Does Girder think the connection from NR has gone?

The Girder plugin does have a force reconnect to girder command, but I'm not sure of the IR code. I've seen it in the designer, but I've just had no luck trying to save a file with that code in it.

avid
May 27th, 2005, 01:46 AM
Thanks Rob,

AFAIK, Girder does not think that NR has gone as all Girder *actions* continue to work.

Thank you for the reminder about Ben's "test disconnection" action code. I shall try that myself over the weekend. It is code 0000 0002. But if you can try something as well I will be very grateful.

Ideally I am looking for someone who really uses Girder feedback to a PPC that can confirm or refute this bug.

Cheers

Brian

Rob H
May 27th, 2005, 01:51 AM
Presumably there's no error return from NetRemote.SendLabel()?

avid
May 27th, 2005, 01:53 AM
Presumably there's no error return from NetRemote.SendLabel()?
I'm not aware of one, but the documentation is basically "folklore".

Rob H
May 27th, 2005, 02:27 AM
I meant you're not getting an error code back from SendLabel().

It should return an error, assuming that sock:send() actually returns an error appropriately. Mind you, you should see a 'Disconnect' message in the logger under those circumstances.

OOI do you get the same problem with G4?

avid
May 27th, 2005, 02:52 AM
I meant you're not getting an error code back from SendLabel().

It should return an error, assuming that sock:send() actually returns an error appropriately. Mind you, you should see a 'Disconnect' message in the logger under those circumstances.
Ah - I understand. This is more for me to investigate and collect info. It'll have to be a bit later - I've got some real work to do today :(


OOI do you get the same problem with G4?
I must confess to not keeping up with G4. I have changed so much recently with NR and my move to MB2 and JRMC. G4 would have been one distraction too many.

And I am preparing for a major new version of DigiTV which I think will have *lots* of feedback info available through Girder. This is the driver behind my current need to get Girder feedback made reliable. So I am trying to "clear the decks" ready for a major push with that new version.

Brian

avid
May 27th, 2005, 09:10 AM
The Girder log was fairly interesting:

17:50:06 LUA EVENTS: 'Name' type event processing using: 'GirderRegisterNRClient'
17:50:06 PRINT: Registering 192.168.0.253:30000
17:50:06 PRINT: Connected
17:53:46 LUA EVENTS: 'Name' type event processing using: 'GirderRegisterNRClient'
17:53:46 PRINT: Registering 192.168.0.253:30000
17:59:39 PRINT: Disconnect: closed
18:02:28 LUA EVENTS: 'Name' type event processing using: 'GirderRegisterNRClient'
18:02:28 PRINT: Registering 192.168.0.253:30000
18:07:03 LUA EVENTS: 'Name' type event processing using: 'GirderRegisterNRClient'
18:07:03 PRINT: Registering 192.168.0.253:30000
18:07:33 LUA EVENTS: 'Name' type event processing using: 'GirderRegisterNRClient'
18:07:33 PRINT: Registering 192.168.0.253:30000
18:07:33 PRINT: Connected


I suspended at about 17:51 and resumed at 17:53:46. And feedback was still working. I suspended again and left it longer this time, resuming at 18:02:28. Thereafter feedback was broken. I resumed again at 18:07:03 with nothing improved. To cure it, I exited NR and restarted at 18:07:33.

It strikes me as a bug in the Girder plug-in. Ben??

Brian

Rob H
May 27th, 2005, 09:31 AM
Not necessarily - there may be a problem with NetRemote Feedback.lua when it attempts to reconnect to the socket.

I've certainly noticed an error in Private.Register :-


if (ipaddr and port and ipaddr ~= '' and port ~= '') then
    print("Registering "..ipaddr..":"..port)

    local sock, err = connect(ipaddr, port)
    if not sock then return nil, err end


It should only return 'err' rather than 'nil, err'
ie


if (ipaddr and port and ipaddr ~= '' and port ~= '') then
    print("Registering "..ipaddr..":"..port)

    local sock, err = connect(ipaddr, port)
    if not sock then return err end

avid
May 27th, 2005, 10:26 AM
Sorry Rob,

That's not it. All this change does is to cause a message to be logged. And it's even the wrong message in the case of a successful reconnection (when the previous feedback socket was still connected). AND it logs the string "err" and not the value!! :D

I still think that the Girder NR driver does not start listening again on a disconnect.

Brian

Rob H
May 27th, 2005, 10:41 AM
Worth a try.

avid
May 27th, 2005, 10:46 AM
Worth a try.
Certainly was - thanks for the suggestion.

But I think we need to wait for Ben now.

Brian

avid
May 27th, 2005, 11:49 PM
@Ben:

I've been looking at some old Girder code you sent me months ago for another problem. If the current code is similar, I am suspicious of the treatment of m_SocketServer in SocketConnected and SocketDisconnecting.

SocketConnected contains:

//bring up the server listener here
this->listenport = 30000;
this->m_SocketServer = new PluginSocketServer(this,listenport);

int num_errors = 0,max_errors = 1000;
while(++num_errors < max_errors && !m_SocketServer->StartListening(/*this->listenport*/)){
    this->listenport++;


but SocketDisconnecting simply contains:

s->Send("\nclose\n",strlen("\nclose\n"),0);
if (this->m_SocketServer){
    this->m_SocketServer->StopListening();
}

By my reading, this will leave us with multiple listeners - not a good thing!

What do you think?

Brian

avid
May 28th, 2005, 12:14 AM
@Ben:

If my suspicion is correct, then could the fix be as simple as to add a test to SocketConnected:

if ( this->m_SocketServer == NULL )
{
    this->m_SocketServer = new PluginSocketServer(this,listenport);
}
??
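
A minimal sketch of that NULL test, under loud assumptions: FakeSocketServer and Plugin below are hypothetical stand-ins, not the real plugin classes, and the `allocated` vector exists only so the sketch can count allocations without really leaking. It shows that the unguarded version allocates a new server on every connect, while the guarded version allocates one and reuses it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for PluginSocketServer: it only tracks a flag.
struct FakeSocketServer {
    bool listening = false;
    bool StartListening() { listening = true; return true; }
    void StopListening() { listening = false; }
};

// Hypothetical reduction of the plugin to the one member under discussion.
struct Plugin {
    // Bookkeeping only: owns every server ever allocated so the sketch can
    // count them. In the code quoted in the thread they would just be lost.
    std::vector<std::unique_ptr<FakeSocketServer>> allocated;
    FakeSocketServer* m_SocketServer = nullptr;
    bool guarded;  // true: apply the proposed NULL test

    explicit Plugin(bool use_guard) : guarded(use_guard) {}

    void SocketConnected() {
        if (!guarded || m_SocketServer == nullptr) {
            allocated.push_back(std::make_unique<FakeSocketServer>());
            m_SocketServer = allocated.back().get();  // previous pointer is overwritten
        }
        m_SocketServer->StartListening();
    }

    void SocketDisconnecting() {
        // Mirrors the quoted code: stops listening, never deletes.
        if (m_SocketServer) m_SocketServer->StopListening();
    }
};

// Runs `cycles` connect/disconnect rounds (one per suspend/resume) and
// reports how many server objects were allocated in total.
std::size_t servers_allocated(bool use_guard, int cycles) {
    assert(cycles >= 0);
    Plugin plugin(use_guard);
    for (int i = 0; i < cycles; ++i) {
        plugin.SocketConnected();
        plugin.SocketDisconnecting();
    }
    return plugin.allocated.size();
}
```

With three suspend/resume cycles, servers_allocated(false, 3) reports three objects and servers_allocated(true, 3) reports one. Whether reusing the old object is enough depends on what the real StopListening() and StartListening() do to the underlying socket, which this sketch does not model.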

Brian

Ben S
May 28th, 2005, 06:19 AM
Brian - Good find, that will absolutely cause a memory leak on suspend/resume.

But would multiple listeners cause the issue you are describing regarding suspend/resume?

avid
May 28th, 2005, 07:08 AM
But would multiple listeners cause the issue you are describing regarding suspend/resume?
Yes - I believe so. You still have the old listen socket bound to port 30000, even though it is not currently listening. You then lose the pointer to this old socket and attempt to listen on the new one, which is probably not really bound, as only one socket can listen on a port.

The net effect is that no-one is now listening on port 30000 and gives the symptoms I am seeing.
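
The "only one socket per port" part can be checked in isolation. The sketch below is an assumption-heavy illustration: it uses plain POSIX sockets on the loopback address, not the plugin's PluginSocketServer or the Pocket PC's Winsock stack, and the helper names are made up. It shows that a socket which is bound but not listening still blocks a second bind to the same port, and that the port is free again once the first socket is closed.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cerrno>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Binds a new TCP socket to 127.0.0.1:port (port 0 lets the OS choose).
// Returns the descriptor, or -1 with errno set if socket() or bind() fails.
int bind_tcp(unsigned short port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0) {
        int saved = errno;  // close() may overwrite errno
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

// Reports the local port a bound socket ended up on.
unsigned short bound_port(int fd) {
    sockaddr_in addr{};
    socklen_t len = sizeof(addr);
    int rc = getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len);
    assert(rc == 0);
    (void)rc;
    return ntohs(addr.sin_port);
}

// True if a second socket is refused the port while an older socket,
// bound but never listening, still holds it.
bool second_bind_is_refused() {
    int old_fd = bind_tcp(0);
    if (old_fd < 0) return false;
    unsigned short port = bound_port(old_fd);
    int new_fd = bind_tcp(port);
    bool refused = (new_fd < 0 && errno == EADDRINUSE);
    if (new_fd >= 0) close(new_fd);
    close(old_fd);
    return refused;
}

// True if the same port can be bound again once the old socket is closed.
bool bind_succeeds_after_close() {
    int old_fd = bind_tcp(0);
    if (old_fd < 0) return false;
    unsigned short port = bound_port(old_fd);
    close(old_fd);  // release the port
    int new_fd = bind_tcp(port);
    if (new_fd < 0) return false;
    close(new_fd);
    return true;
}
```

This only supports the general socket behaviour. Whether the plugin's old server object really keeps its socket bound after StopListening() is still a guess until the plugin code is checked.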

It sounds plausible, anyway! :) :)

Brian

Ben S
May 28th, 2005, 07:46 AM
Does sound plausible, although there is a provision to allow NetRemote to change the listen port if the previous port is already used (it's commented out in your code above).
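
For reference, the fallback idea in that loop can be sketched on its own. This is a hedged illustration, not the plugin's code: find_listen_port and refuse_ports are hypothetical names, and try_listen stands in for a StartListening call that takes the port as an argument.

```cpp
#include <cassert>
#include <functional>
#include <set>

// Tries first_port, first_port + 1, ... until try_listen accepts one or
// max_attempts ports have been tried. Returns the port that worked, or -1.
// try_listen is a hypothetical stand-in for StartListening(port).
int find_listen_port(int first_port, int max_attempts,
                     const std::function<bool(int)>& try_listen) {
    assert(max_attempts >= 0);
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        int port = first_port + attempt;
        if (try_listen(port)) return port;
    }
    return -1;
}

// Helper for experiments: a fake listener that refuses every port in `busy`.
std::function<bool(int)> refuse_ports(std::set<int> busy) {
    return [busy](int port) { return busy.count(port) == 0; };
}
```

If 30000 and 30001 are both taken, find_listen_port(30000, 1000, refuse_ports({30000, 30001})) settles on 30002. A fallback like this only helps if the port actually chosen is the one that gets registered with Girder.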

I'm working on a few issues today, and I'll have this fix out with everything else a little later.

avid
May 28th, 2005, 07:53 AM
I'm working on a few issues today, and I'll have this fix out with everything else a little later.
Great, thanks. I bet that when you fix that leak, my feedback problem will go away. If it doesn't, it should be easy to add enough extra logging to work out why.

Brian

Ben S
May 28th, 2005, 09:24 PM
I just uploaded it, Brian. Let me know if this fixes it or not.

avid
May 29th, 2005, 12:03 AM
Thanks Ben,

1.0.30 seems to fix the problem. I now have reliable feedback from Girder when I resume a suspended PPC.

On my short tests so far, I obviously haven't managed to try *all* the cases (e.g. does it matter if I resume after NR has detected the disconnection but Girder has not?), but it's certainly looking good.

Great fix!

Brian

mhund
May 29th, 2005, 01:30 AM
I have the same Problem on my PC. Will your release fix this as well?

avid
May 29th, 2005, 01:39 AM
1.0.30 seems to fix the problem. I now have reliable feedback from Girder when I resume a suspended PPC.
I was wrong :( :(

It does not seem to be reliable. On resuming (often soon after suspending) I have seen intermittent total disappearance of NR, occasional failure to connect, and intermittent failure to set Girder.LinkActive.

I'm out much of today, so will be unable to investigate much further.

Brian

Ben S
May 29th, 2005, 04:47 AM
Avid - when you return, can you send me some sample files you're working with? I don't believe I'm using the latest NetRemoteFeedback.lua, but all seems well here (in my 5 or so tests where I connect, use "Jump To Device" and "Jump to Home" back and forth, suspend, resume, repeat).

mhund - How are you duplicating this on PC? Going into suspend on PC?

mhund
May 29th, 2005, 08:26 AM
mhund - How are you duplicating this on PC? Going into suspend on PC?

I use NetRemote PC Client on a notebook. The notebook goes into suspend mode several times a day and wakes up again without problems. NetRemote also wakes up again and is able to control Girder. But feedback fails - often, not always.
I can add that after waking up, I have observed that the notebook needs some seconds (>10 sec) to re-establish the network connections, although the Windows desktop comes alive very fast (2 sec).

avid
May 30th, 2005, 01:38 AM
A recipe that fails close to 100% of the time (including with 1.0.30) is:
1) Have NR & Girder connected and talking fine.
2) Suspend the Pocket PC.
3) From Girder, send a label
4) Wait 3 minutes
5) Resume the Pocket PC

After this, Girder actions are still executed fine, but feedback fails. Sometimes a further Suspend/Resume might clear the state. Otherwise, exiting and restarting NR is always fine.

The reason why step (3) is important to me is if the TV moves onto a new programme or MBM detects a small temperature change. These changes raise Girder events which send text to NR asynchronously.

@Ben: Can you reproduce this?

Brian

Ben S
May 30th, 2005, 05:35 AM
4) Wait 3 minutes

How important is the time in this?

So one thing I'll tell you I had to work around with MB 2 is that when you close the PPC, it doesn't turn off the network, so the network connection thinks it's still good. Is it possible that the Girder lua code is doing something funky with this?



@Ben: Can you reproduce this?

Out of 2 tries, one worked, and one brought NetRemote down. I'm going to add more logging around this and retry.

Ben S
May 30th, 2005, 07:01 AM
Okay. I believe I've resolved it. Can you try the attached girder plugin for PPC?

This was a bit of a deeper issue, in that the SocketConnected call was not returning a boolean value (not sure why the old compiler didn't complain like normal), so depending on the value it might keep trying to connect after a connect already worked, in fact connecting multiple times and basically making a mess.
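
To illustrate the failure mode, here is a rough model only: the real caller of SocketConnected is not shown in this thread, so connect_until_acknowledged and both handlers below are hypothetical. In C++, flowing off the end of a non-void function is undefined behaviour, so the caller can read any value. The sketch models the unlucky case where that value reads as false.

```cpp
#include <cassert>
#include <functional>

// Hypothetical caller: keeps connecting until the handler reports success
// or max_attempts is reached. Returns how many connections were opened.
int connect_until_acknowledged(int max_attempts,
                               const std::function<bool()>& socket_connected) {
    assert(max_attempts >= 0);
    int connects = 0;
    while (connects < max_attempts) {
        ++connects;                     // stands for opening a real connection
        if (socket_connected()) break;  // handler says: connected, stop retrying
    }
    return connects;
}

// Handler with an explicit return value: one connection is enough.
bool handler_returns_true() { return true; }

// Models what the caller may observe when the handler has no return
// statement and the leftover value happens to read as false.
bool handler_reads_as_false() { return false; }
```

With a limit of five attempts, the first handler yields a single connection and the second yields five, which matches the "connecting multiple times" description. Building with warnings such as GCC's -Wreturn-type enabled is one way to catch the missing return statement at compile time.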

If this works for you, I'll pack it into 30rc2 and we'll be good.