I think it was, because Tilemaps are generally used, at the very least, to display graphical tiles, which means Sprites.
Sprites can then have colliders, light occluders, etc., but a tile will almost always have at least a Sprite, so it was implemented on the assumption that tilemaps are used primarily for displaying sprites.
I agree it can be confusing, because in most interactive game objects the root handles physics and location, and the visuals are children. It shouldn't be difficult to modify the TileSet exporter to relax this constraint. However, you would still have to match exactly one Sprite per tile (although even that could be worked around).
I have never had a problem with this so far, but if you do, maybe you can suggest a change on GitHub.