Wednesday, April 15, 2009

DRb and RangeError

I almost decided this was so obscure that it wasn't worth writing up, but in a short foray into Googleland to find an answer, I found that other people (both of them) were having the same problem, and one of them was working with Rails, so maybe someone else might like to understand this issue.

Here's what happens:
You're using DRb. You've got a remote object, but when you try to reference it, Ruby blows up with:

RangeError: 0xcabbage is recycled object

Now, possibly what's happened is that you're trying to reference an object that's fallen out of scope in its native process, and it's been garbage collected. That's most likely, really. But you can check whether that object is still live in the native interpreter to rule that out.
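Here's a minimal sketch of that check. It assumes DRb's default id conversion, where the reference carried by a DRbObject (its __drbref) is just the object id in the owning process. The helper name still_live? is made up for illustration, and it has to run inside the process that owns the object.

```ruby
# Hypothetical debugging helper: run this in the process that created the object.
# `ref` is the integer reference the remote side holds, assumed to be a plain
# object id (DRb's classic default behaviour).
def still_live?(ref)
  # _id2ref raises RangeError when the id no longer maps to a live object,
  # which is the same error that surfaces through DRb.
  ObjectSpace._id2ref(ref)
  true
rescue RangeError
  false
end
```

ObjectSpace._id2ref is a fairly internal-flavoured API, so treat this as a debugging aid rather than something to ship. If it returns true while the remote side still sees RangeError, garbage collection probably isn't your problem.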

What I'd done was a little different: I'd started DRb, then forked. From the forked process, I was trying to start a child process. When that child (that's three processes deep now) tried to access an object in the forked parent, I was getting RangeErrors. Very confusing.

Especially since the DRb URIs matched up. And the DRb refs. And the parent's object that was supposedly "recycled" was surviving after the child had failed. And I'd stopped DRb and restarted it in the forked parent process.
But then I started to dig into what DRb does when you call stop_service. Have a gander:


    if Thread.current['DRb'] && Thread.current['DRb']['server'] == self
      Thread.current['DRb']['stop_service'] = true
    else
      @thread.kill
    end

Most pertinently, if we're in the DRb server's main loop thread, we wait until we're done with a loop before we quit the server. Which makes sense. You don't want to drop the current request on the floor. (It's a good use of thread-local variables, to boot.) But since the forked parent was launched in the context of a DRb request, when the fork occurred, its main thread was the loop thread for DRb. So my stop_service call took the first branch and only set a flag.
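To make the two branches concrete, here's a toy stand-in that mirrors only the branching in the excerpt above. ToyServer, the 'Toy' thread-local key, and mark_current_thread_as_handler are all invented for illustration; none of this is DRb's real implementation.

```ruby
# Hypothetical stand-in for DRbServer. Only the if/else shape of stop_service
# is modelled on the excerpt; everything else is made up.
class ToyServer
  attr_reader :killed

  def initialize
    @killed = false
  end

  # Pretend the calling thread is this server's request-handling thread.
  def mark_current_thread_as_handler
    Thread.current['Toy'] = { 'server' => self }
  end

  def stop_service
    if Thread.current['Toy'] && Thread.current['Toy']['server'] == self
      # Called from the handler thread: only request a stop for later.
      Thread.current['Toy']['stop_service'] = true
    else
      # Called from any other thread: stop right away.
      @killed = true
    end
  end
end
```

Calling stop_service after mark_current_thread_as_handler leaves killed as false, while calling it from inside Thread.new { ... } flips it to true, because a fresh thread starts with no thread-local entries.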

So, what happens is that the forked parent gets a DRb server with the same URI as its parent. So when its child tries to connect back to an object created in the forked parent, it's actually asking the forked parent's parent, which has no idea about this object.

Solution: create a new thread, and restart DRb in that thread. The DRb in the forked parent can die now, and a new DRbServer gets created, and now its children talk to it. Voilà.
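A rough sketch of that fix, under some assumptions: the helper name restart_drb_in_fresh_thread is hypothetical, front stands for whatever object you want to serve, and passing nil as the URI asks DRb to pick a free port. You'd call it in the forked process, right after the fork.

```ruby
require 'drb'

# Hypothetical helper for the forked process. Returns the URI of the new server.
def restart_drb_in_fresh_thread(front, uri = nil)
  Thread.new do
    # A fresh thread has no Thread.current['DRb'] entry, so stop_service takes
    # the `else` branch instead of just setting the 'stop_service' flag.
    DRb.stop_service
    # Start a new primary server. With uri = nil, DRb chooses its own port, so
    # the new server does not share a URI with the original parent.
    DRb.start_service(uri, front)
    DRb.uri
  end.value
end
```

Hand the returned URI to the child process so it connects to the forked parent's new server rather than to the grandparent's.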

Hm. Rereading that, I suspect a diagram might help. Maybe if anyone ever cares about this post, I'll knock one together.

1 comment:

  1. I had a similar problem; the GC collected my server-side object.
    I solved it by holding references to the objects created on the server in an array on the server.

    So the GC didn't collect them.
