2010-2-11
| 07:28 | mulka | still ponding how to do subtasks... haven't figured out an ideal solution yet. anyone want to brainstorm? |
| 07:28 | mulka | pondering* |
| 07:56 | mulka | anyone here? |
| 09:11 | asksol | mulka: subtasks? |
| 09:11 | mulka | yea... one task depends on a set of other tasks |
| 09:18 | mulka | asksol? |
| 09:18 | asksol | just launching them, or depends on their result? |
| 09:19 | mulka | depends on the result |
| 09:20 | asksol | well, that's when it gets funky |
| 09:20 | asksol | the best way is to use callbacks |
| 09:20 | asksol | callbacks all the way down |
| 09:20 | asksol | and never wait for the result |
| 09:21 | mulka | how would I do a callback with celery? |
| 09:21 | asksol | e.g. @task def refresh_feed(feed_url, callback=None): result = refresh_the_feed(); if callback: callback(result) |
| 09:22 | asksol | that result could be another task for example |
| 09:22 | asksol | s/result/callback |
| 09:23 | asksol | refresh_feed(url, callback=update_subscription_counts.delay) |
| 09:25 | asksol | siwu was even working on a task chain wrapper |
| 09:25 | mulka | didn't realize you could pass functions as arguments to tasks |
| 09:26 | asksol | well, you can, as long as you use pickle |
| 09:26 | mulka | does the code get passed across the wire, or does it just use the code that's on the worker server? |
| 09:27 | asksol | to be compatible with json as well, you could just pass the name of a task, and then in the code: from celery.registry import tasks; callback = tasks[callback]; callback.delay(result) |
| 09:27 | asksol | it doesn't serialize the actual source code |
| 09:27 | asksol | if you think about it, that would be kinda crazy :) |
| 09:28 | asksol | >>> pickle.dumps(operator.add) |
| 09:28 | asksol | 'coperator\nadd\np0\n.' |
| 09:30 | mulka | serializing source code I think has been done before, but I don't think it's straightforward to implement |
| 09:30 | mulka | that's what picloud does, right? |
| 09:30 | asksol | no, it doesn't actually |
| 09:31 | asksol | or, it does that for lambda |
| 09:31 | asksol | lambdas |
| 09:31 | asksol | and for stuff defined in __main__ |
| 09:31 | asksol | but you still need to upload any modules to your vm |
| 09:32 | asksol | imagine disassembling most of django and assembling it on the other side again |
| 09:33 | mulka | that would be kind of silly |
| 09:34 | asksol | I'd like to steal the picloud pickler, but it's not open source |
| 09:35 | mulka | let's say I have a task that depends on the results of two other tasks. how would I get the results from the other two tasks together in the same function? |
| 09:39 | asksol | well, that's where it gets really tricky |
| 09:39 | asksol | using TaskSet and waiting for it is the easiest |
| 09:39 | asksol | but it's not optimal |
| 09:40 | asksol | read this thread: http://groups.google.com/group/celery-users/bro... |
| 09:43 | mulka | asksol: great work on Celery by the way! it's shaping up to be a really useful piece of software |
| 09:54 | asksol | thanks! |
| 10:12 | mulka | seems like a TaskSet itself should be able to have a callback passed to it |
| 10:14 | mulka | task chains could be interesting. you said siwu was working on something? anywhere I could get more info? |
| 10:26 | asksol | you'd have to ask him |
| 10:27 | asksol | it's not that easy to track when a taskset is done |
| 10:27 | asksol | but it was an idea yeah |
| 10:28 | asksol | to keep a counter, and the last task calls the callback when the counter hits the taskset size |
| 10:32 | asksol | padt: killall usually doesn't work for me |
| 10:32 | asksol | maybe if you write killall python manage.py celeryd, or something like that |
| 10:35 | padt | asksol: ok. I just assumed it would do more or less the same thing as what's in the docs. If it doesn't work just kill the fixme |
| 10:35 | asksol | at least, that's the universal solution :) |
| 10:35 | asksol | works on an old sparcstation, works now |
| 10:36 | padt | heh |
| 10:37 | mat__ | asksol: is there any way of setting a routing_key for httpdispatchtask? |
| 10:37 | mulka | Oh... another question... what result store do you recommend for large results? Seems like a lot of them can't handle things bigger than 1MB. |
| 10:37 | asksol | mat__: HttpDispatchTask.apply_async(..., routing_key=) |
| 10:38 | asksol | mulka: what is large? for really large files maybe you should use HDFS :) |
| 10:38 | asksol | large results |
| 10:39 | asksol | that's for gigabyte large. |
| 10:39 | mulka | I think my results are only going to be a few MBs |
| 10:40 | mat__ | asksol: HttpDispatchTask.apply_async(args=[TORNADO_URL, action_template, instance.follower_id], routing_key='activity.stream.follow') tried that, just having serious thinking issues about how to do kwargs |
| 10:40 | mat__ | can you pass in a dict to kwargs? |
| 10:40 | asksol | yes, of course |
| 10:40 | asksol | no it's the special CeleryKeywordArgumentDictWriter |
| 10:41 | mat__ | lol, too early for me |
| 10:41 | mat__ | was up re deploying machines till 3 am, and started again at 9am |
| 10:41 | mat__ | brain dead |
| 10:42 | asksol | hehe, i understand |
| 10:43 | mat__ | just broke all our stream system, doh! |
| 10:43 | asksol | ouch |
| 10:44 | asksol | mulka: the best is to not send large results |
| 10:44 | asksol | but I guess you know that |
| 10:45 | asksol | just trying to make you think of ways to avoid it |
| 10:47 | mat__ | asksol: thanks a lot again, back online now |
| 10:48 | mat__ | still can't find a way of testing real time streams |
| 10:49 | mulka | mat__: real time streams? |
| 11:04 | padt | I wrote some context thingy to test the twitter streaming api. might be what you need |
| 11:05 | padt | if the stream is http that is |
| 13:53 | padt | redditors have no sense of humour: http://www.reddit.com/r/Python/comments/b0g9u/c... |
| 13:55 | asksol | hehe |
| 14:02 | asksol | nice nice! |
| 21:21 | donspaulding | I'm trying to run celeryd as a Windows service. Am I in uncharted waters here? |
| 21:22 | donspaulding | scratch that, I'm *thinking* of running celeryd as a Windows service, and trying to gauge how much work will be involved. |
| 22:29 | mulka | donspaulding: http://botland.oebfare.com/logger/celery/2010/0... |
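
Editor's sketch of the ideas asksol describes above: passing a callback instead of waiting for a result, looking the callback up by name so the message stays JSON-friendly, and keeping a counter so the last subtask to finish triggers the combined callback. This is plain in-process Python, not Celery's API; `REGISTRY`, `register`, `TaskSetCounter`, and `combine` are hypothetical names invented for illustration, and the `refresh_feed` body is a stand-in. A real deployment would need the counter and results kept in a shared store, since the subtasks run on separate workers.

```python
import operator
import pickle
import threading

# Hypothetical stand-in for a task registry: maps task names to callables,
# so a callback can be referenced by name instead of being pickled.
REGISTRY = {}


def register(func):
    """Add a function to the registry under its own name."""
    REGISTRY[func.__name__] = func
    return func


class TaskSetCounter:
    """Collect subtask results and fire a named callback when all are in."""

    def __init__(self, size, callback_name):
        self.size = size
        self.callback_name = callback_name
        self.results = []
        self._lock = threading.Lock()

    def task_done(self, result):
        # Record the result; only the call that brings the count up to
        # the set size goes on to invoke the callback.
        with self._lock:
            self.results.append(result)
            finished = len(self.results) == self.size
            snapshot = list(self.results)
        if finished:
            return REGISTRY[self.callback_name](snapshot)
        return None


@register
def combine(results):
    """Example callback: receives every subtask result at once."""
    return sum(results)


def refresh_feed(value, on_done):
    """Stand-in subtask: compute something, then hand it to the callback."""
    result = value * 2
    return on_done(result)


counter = TaskSetCounter(size=2, callback_name="combine")
first = refresh_feed(1, counter.task_done)   # set not complete yet
second = refresh_feed(2, counter.task_done)  # last one in calls combine

# pickle stores a reference (module and name) to a function, not its source,
# so unpickling resolves to the very same object on the receiving side.
restored = pickle.loads(pickle.dumps(operator.add))
```

Here `first` is `None` because the set is still incomplete, while `second` carries the return value of `combine` over both results. The pickle round trip only works when the receiving process can import the same module, which matches the point made in the log that the code itself is not sent across the wire.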