I looked into the engine code that deals with input events. The short answer is that what I want is impossible. Yes, once an event is marked as handled, it won't be sent to anyone else. But the physics system counts as a single recipient for this purpose: if the event hasn't been handled by the time it reaches that stage, it gets dumped into a separate queue that has its own rules.
The physics system notifies every physics object (well, up to the first 64) that's under the mouse. It has no way of knowing whether a previous physics object handled the event. So if you want only the topmost physics object to react, you'll have to code that yourself.
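For what it's worth, here is a minimal sketch of what "coding it yourself" could look like: gather everything under the mouse, pick the one drawn on top, and forward the click only to that one. It is plain Python rather than engine code so it runs standalone. `Pickable`, `z_index`, `tree_order`, `topmost` and `dispatch_click` are hypothetical names for illustration, not engine API, and the ordering rule (higher z first, then later draw order) is an assumption you would adapt to your own scene.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Pickable:
    """Hypothetical stand-in for a physics object found under the mouse."""
    name: str
    z_index: int     # assumed: larger means drawn further in front
    tree_order: int  # assumed tie-breaker: larger means drawn later


def topmost(hits: Iterable[Pickable]) -> Optional[Pickable]:
    """Return the hit that is drawn on top, or None if nothing was hit."""
    return max(hits, key=lambda h: (h.z_index, h.tree_order), default=None)


def dispatch_click(hits: Iterable[Pickable],
                   on_click: Callable[[Pickable], None]) -> Optional[str]:
    """Forward the click to the topmost hit only; return its name."""
    target = topmost(hits)
    if target is None:
        return None
    on_click(target)
    return target.name
```

The point of sorting explicitly is that the order in which hits arrive is never trusted; only the z information you supply decides who gets the click.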
So to answer my questions:
What am I missing?
Are all applicable CollisionObject2Ds supposed to receive the event even after it's marked as handled?
Not quite. If the event is marked as handled before the physics stage, none of them will receive it. But marking it as handled during that stage has no effect: the remaining physics objects are still notified.
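To make the distinction concrete, here is a toy model of the behaviour as I understand it. This is not the engine's code, just a sketch in plain Python. `Event`, `dispatch`, `early_handlers` and `physics_objects` are made-up names, and the only number taken from the engine is the cap of 64.

```python
MAX_PICKS = 64  # the cap on notified physics objects mentioned above


class Event:
    """Hypothetical input event carrying only a 'handled' flag."""

    def __init__(self):
        self.handled = False


def dispatch(event, early_handlers, physics_objects):
    """Return the names of the physics objects that were notified.

    early_handlers: callables that run before the physics stage.
    physics_objects: list of (name, callback) pairs under the mouse.
    """
    for handler in early_handlers:
        handler(event)
        if event.handled:
            # Handled before the physics stage: no physics object hears of it.
            return []

    notified = []
    for name, callback in physics_objects[:MAX_PICKS]:
        # The callback may set event.handled, but this loop never checks it,
        # so every object (up to the cap) is notified regardless.
        callback(event)
        notified.append(name)
    return notified
```

In this model, setting `handled` from an early handler yields an empty list, while setting it from the first physics callback changes nothing for the objects after it.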
Also, I thought this was done by casting a ray - shouldn't the ray only hit one thing in the first place?
No, there are legitimate reasons for wanting a ray to hit multiple things. In this case the ray is set to hit up to 64 objects.
How can I limit the click to being handled by just the topmost node that it hits (using the engine rather than rolling my own)?
As far as I can tell, you can't. The engine notifies every physics object under the mouse, so picking just the topmost one means rolling your own.
Edit: Important addendum. The raycasting doesn't find objects in their actual z order anyway, so this appears to be entirely the wrong approach.
I do wonder if this is a bug since the page explaining how the engine does this says:
If no one wanted the event so far, and a Camera is assigned to the Viewport, a ray to the physics world (in the ray direction from the click) will be cast. If this ray hits an object, it will call the CollisionObject._input_event() function in the relevant physics object (bodies receive this callback by default, but areas do not. This can be configured through Area properties).