PDA

View Full Version : girder coroutine handling



mhund
September 25th, 2016, 05:08 PM
Hello community,

I am currently having an issue with the realtime behaviour of Girder and I am wondering if coroutines are the answer. Maybe someone of you can help me with this.
The situation: I use Girder as my home automation system and (for more than 200 devices) as my TV control application at the same time. Therefore I am polling several local/online REST interfaces (e.g. Philips Hue, my broadband router to retrieve the local hosts, ...) several times a minute. This blocks all Girder resources for several seconds (up to 10 seconds depending on the complexity of the call). During that time Girder is not able to perform realtime functions like translating IR signals or switching lights immediately.
My question: Is using coroutines the solution for my problem? I imagine Girder should pack the long-running remote call into an extra thread to stay responsive in the main thread. Would this work?
Second question: The long-running calls are mostly initiated by a single function call which splits into multiple subcalls, and these result in asynchronous callbacks. Are all of these subcalls and callbacks executed in the coroutine, or are they handled by the main thread again (which would mean that my plan to relieve the main thread does not work)?

I hope my description is sufficient to describe my issue well enough so that someone can answer.

regards,

mhund

Ron
September 25th, 2016, 05:11 PM
You probably need to start using threads for those long-lived scripts. Coroutines by themselves do -not- start threads.
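To illustrate the difference: a coroutine only makes progress while something resumes it, all on the same thread, so a long-running coroutine still blocks everything else until it yields. A minimal plain-Lua sketch (not Girder-specific):

```lua
-- A coroutine runs on the same thread as its caller: it only
-- makes progress while resume() is executing, so a long loop
-- inside it still blocks everything else until it yields.
local co = coroutine.create( function( a, b )
  local sum = a + b
  coroutine.yield( sum )   -- hand control back to the caller
  return sum * 2
end )

local ok1, first  = coroutine.resume( co, 1, 2 )  -- runs until the yield
local ok2, second = coroutine.resume( co )        -- runs to the end
print( ok1, first, ok2, second )                  -- true 3 true 6
```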

mhund
September 25th, 2016, 05:22 PM
Hm. I thought coroutines were the way Lua splits work into threads. But if I understand you right, coroutines still run in the same thread and therefore block each other from running in parallel? Do you have more information about the kind of threads you are mentioning? I didn't find anything in the reference manual.

Ron
September 25th, 2016, 05:33 PM
Have a look in the manual under scripting/thread

mhund
September 25th, 2016, 05:36 PM
Ah. Thanks. Will give it a try :-)

mhund
September 25th, 2016, 06:24 PM
I think I have the idea of it but some more examples would be helpful :-/

Ron
September 25th, 2016, 06:38 PM
The manual shows:


thread.newthread( function( a, b )
    print( a, b )
end, { "hello", "there" } )


That's all there is to it. ( Well, almost :) )

mhund
September 26th, 2016, 04:31 AM
Thanks. Well - this was the part I got already :-)

I am not sure how far I have to deal with this conflict-management thing called "mutex". I understand the idea behind it, but I don't know if it is necessary or if it just adds complexity to my code that I could avoid.

mhund
September 26th, 2016, 04:48 AM
What about my second question: the calls I want to put into a separate thread are a single function call which splits into multiple subfunction calls, and these result in asynchronous callbacks. Are all the subcalls and callbacks executed in the separate thread, or are they handled by the main thread again? Maybe a dumb question but I am not sure.

Ron
September 26th, 2016, 09:54 AM
:-) Sorry!

Threads, as you see, can be started easily. A mutex can protect parts that may only be accessed by one thing at a time ( like counters ). Threads by their nature bring complexity I'm afraid; there is potential for deadlocks and race conditions with threads and mutexes, so be very careful.

The first thing to do before splitting things off is figuring out where your slowdowns are. Are you using any wait functions? Any long calculations? If so, that is what you need to get off the main thread. The transport functions -should- not block as they are all asynchronous.

When you call a transport function from a thread, the callback that you get when the async operation completes actually runs back on the main Girder thread. The thread you started has most likely stopped running by then.

mhund
September 26th, 2016, 01:41 PM
Thanks for the explanation. This means threads would probably not solve my problem.
To give you a little bit more background: I wrote several API polling plugins like one which uses a TR64 based API calls to my broadband router (which is xml/soap in the end). And I use requests to retrieve all so called "hosts" which are the devices connected to my home network. This is > 50 devices over ethernet cable and wifi in total. For each device I have to encode and post a soap call to get the online state from the router and I receive an asonchronious callback with a xml soap message which I decode using the built in xml parser. This means the heavy weight operation is not primarily the call but the callback and the parsing. At least this is what I assume. And to get this information in time, I have to poll frequently because there is no event subscription - only polling. I am using this information for example to recognize who is in the house or which device is in standby or not.
Ontop of this, I included a powerful degubbing function into my code because I use this to support my implementation. This means I permanentlty write debug lines into text files. This brings additional load. On the other side it's a windows 10 Intel Core i5 mechine which has enough power. The CPU and memory usage of girder is allways low.

Ron
September 26th, 2016, 02:50 PM
Which function are you using to do network communications?

mhund
September 26th, 2016, 02:55 PM
network.post() for the SOAP calls.

Ron
September 26th, 2016, 03:04 PM
OK.

1. Main POST to the router for all devices
2. Parse the XML answer, find all devices
3. Roughly 50 network POSTs happening in quick succession
4. During each POST callback you do some XML parsing
5. Repeat 1-4 often ( once a minute? )

So in total roughly 51 network.post calls and 51 XML parsing sessions per cycle. What do you use for XML parsing and generating?

mhund
September 26th, 2016, 04:23 PM
lxp.lom as the XML parser. For encoding I wrote a short script myself. Nothing special.
By the way, what I have observed now is that after a restart of Girder (or only the Lua engine) the program is more responsive. It feels like a leak which grows after some days/weeks of running nonstop. I mean, the memory consumption at system level is not suspicious to me (always somewhere around 50-100 MByte; CPU usage around zero, going up to 8% while an update of my interface is running).

In the next days I will deactivate some of my modules to find out which of them is causing the delay. As I said, I didn't only implement this one thing with my router; I am also polling additional REST APIs, e.g. Philips Hue or an online football database (yes, you read correctly; I treat my favorite soccer teams like a logical device which sends events when a game starts/ends or a goal happens; I like this stuff which mixes up physical and logical things).

Ron
September 26th, 2016, 05:18 PM
I'd be interested in seeing how much time is spent in Lua parsing your XML. Barring that, here is a trick to make network.post call the callback on a new thread without blocking the main thread. The basic idea is below:



network.post( ..., function( data )
    thread.newthread( function( postResultData )
        -- process the XML here, OFF the main thread
    end, { data } )
end )


You could rewrite the network.post to do this in one go for your whole system...



local netpost = network.post

network.post = function( url, postData, mimeType, callback, timeout, username, password, headers )
    local threadedCallback = function( success, status, body )
        if callback then
            thread.newthread( function()
                callback( success, status, body )
            end, { } )
        end
    end
    netpost( url, postData, mimeType, threadedCallback, timeout, username, password, headers )
end


Note this was not tested and things might go crazy just threading off all returns from network.post, but it's worth a try.

mhund
October 6th, 2016, 11:25 AM
The topic is still pending on my side. I just want to let you know that I didn't forget your proposal, even though I haven't responded since you spent time helping me with it.

Currently the issue is less relevant after I changed some settings in my own debug routine, and I am now writing less data into debug files.

mhund
January 24th, 2017, 09:25 AM
Hi Ron,

it took me some time to try this one. I did the integration in my code in the meantime but had no success. The code surrounding this call executes successfully, but the function expected to run in the separate thread seems to be "swallowed" completely. No sign of execution, no change of variable values, no output and no error message at the Lua console. Simply nothing. Not sure if you have any ideas.

You have already mentioned that threading in Girder/Lua is tricky. It looks to me like you are right :-)

Ron
January 24th, 2017, 09:01 PM
slight change:



local netpost = network.post

network.postAsync = function( url, postData, mimeType, callback, timeout, username, password, headers )
    local threadedCallback = function( success, status, body )
        if callback then
            thread.newthread( function()
                callback( success, status, body )
            end, { } )
        end
    end
    netpost( url, postData, mimeType, threadedCallback, timeout, username, password, headers )
end


Usage:



network.postAsync( "http://www.promixis.com/", "{}", "application/json", function( success, status, body )
    print( success, status, body )
end )


prints: true 302

mhund
January 27th, 2017, 05:14 PM
Hi Ron,

you are so patient with me. I did it the way you proposed and it didn't help.

But never mind: I found a different issue. I observed that my frequent requests always worked well right after a Girder / Lua engine restart and got worse and worse after some hours or days. This made me sceptical, because it felt like a memory leak (although the memory usage of the Girder process was not striking). After hours of analysis, I found a typical beginner's mistake: for every single request I constructed the HTTP header in a table by copying a static sample table to a header object. But instead of using table.copy(), I used something like headertable = sampletable and then added additional values to headertable (which added the same values to the sample table on each run). This led to a sample table object growing with every single request. You can imagine that after some hours or days this sample table had many thousands of entries and kept growing until Girder started to choke.
I fixed this and now I will observe over several days whether this is sufficient. Sorry for wasting your time.
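For other readers, the bug described above boils down to table aliasing. A minimal plain-Lua sketch (the shallowcopy helper here is my own illustration; Girder's table.copy serves the same purpose):

```lua
-- The bug in a nutshell: assigning a table copies the
-- reference, not the contents, so "header" and "sample"
-- are the same table.
local sample = { ["Content-Type"] = "text/xml" }
local header = sample              -- alias, NOT a copy
header["SOAPAction"] = "GetInfo"
print( sample["SOAPAction"] )      -- GetInfo: the "static" table changed too

-- A shallow copy keeps the sample table untouched
-- (plain Lua has no table.copy, so we define one here).
local function shallowcopy( t )
  local c = {}
  for k, v in pairs( t ) do c[k] = v end
  return c
end

local sample2 = { ["Content-Type"] = "text/xml" }
local header2 = shallowcopy( sample2 )
header2["SOAPAction"] = "GetInfo"
print( sample2["SOAPAction"] )     -- nil: sample2 is unchanged
```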

mhund
January 29th, 2017, 03:19 AM
Just for your information: after my fix from above went in place, Girder's performance lag decreased. To search for further improvements, I made some observations:
Up to now all of the > 50 requests were sent to the corresponding system (my broadband router) at the same time, and the callbacks came back like a bow wave. Working through that pile took a lot of computing power and blocked everything else from executing. This caused, for example, my TV application to freeze for some seconds, and simple use cases like channel switching or menu navigation with the remote became unbearable during that time.
My solution: I changed the way the requests are sent. All request data is now written into a queue table, and an independent queue handler function sends out the requests one after the other, waiting until the callback returns before sending the next request. The handler is called every 300 ms as long as there are requests in the queue and stops repeating when the queue is empty. The time to release all requests is longer than before, but the CPU load is spread out, and the 300 ms between requests gives Girder enough time to execute other realtime-relevant things (like my TV app).
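A minimal sketch of the queue idea, with the network call stubbed out so it runs standalone. In real Girder code the send would be network.post and handleQueue would be re-armed by a 300 ms timer; the helper names here are my own, not from the original post:

```lua
-- Sketch of the queue approach described above. The "post"
-- below is a synchronous stand-in for network.post so the
-- example is self-contained.
local queue = {}      -- pending request URLs (FIFO)
local busy  = false   -- true while a request is in flight
local sent  = {}      -- record of what was actually sent

-- stand-in for network.post: "replies" immediately
local function post( url, callback )
  table.insert( sent, url )
  callback( true )
end

local function enqueue( url )
  table.insert( queue, url )
end

-- in Girder this would be called every 300 ms by a timer
local function handleQueue()
  if busy or #queue == 0 then return end
  busy = true
  local url = table.remove( queue, 1 )   -- take the oldest request
  post( url, function( success )
    busy = false                         -- ready for the next request
  end )
end

enqueue( "device1" ); enqueue( "device2" ); enqueue( "device3" )
-- only possible because the stub is synchronous: drain in a loop
while #queue > 0 do handleQueue() end
print( #sent )   -- 3
```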

I hope this helps others who are dealing with similar problems.

Yoggi
February 9th, 2017, 09:54 AM
Hi mhund,

This sounds interesting, any chance you could upload a working example of this?

I have learned a lot of programming using Girder, but unfortunately it has often been a frustrating and time-consuming process. I think the steep learning curve a lot of Girder users face could be reduced if more users uploaded working examples and code. It would surely have helped me!

Regards,

yoggi