There are other tree structures similar to the octree. Generally, octree is the easiest one to construct. Other tree structures give greater flexibility how the space is partitioned on each level, but to actually get the significant speed benefit, these trees must be also constructed in much smarter way.
kd-tree partitions space at each node by a plane parallel to one of the base planes. In other words, it uses one plane where octree uses three planes. This allows greater flexibility, for example it may be more optimal to divide the space more often by a X = const plane than Y = const. Octree is forced to divide space by all three planes at each node.
If you will use the simple “rotational” strategy (X, Y, Z, then again X, Y, Z and so on) to choose partitioning axes at each depth, then the kd-tree becomes similar to an octree.
The name kd-tree comes from “k-dimensional tree” term, since kd-tree may be used for any number of dimensions, not necessarily 3D.
BSP (Binary Space Partitioning) tree partitions space in each node by a plane. Any plane, not necessarily parallel to one of the base X, Y, Z planes.
This gives even more flexibility than kd-tree, but it makes constructing optimal BSP trees much harder (assuming that you want to actually produce a better tree than what can be achieved with kd-tree). It also means that at each node you have to check for collision between your reference object and an arbitrary plane (instead of a plane parallel to one of the base coordinate system planes), so computations get a little slower than for kd-tree.
Note that BSP tree is suitable for any number of dimensions, just like kd-tree. You just use different equations to represent hyperplanes in other dimensions.
Finally, note that the only thing that ties octree to 3 dimensions is actually it's name. The same approach could be used for any number of dimensions. For N dimensions, each internal node will have 2N children. For example for 2 dimensions each node has 4 children, and such tree is called a quadtree.
Note that this approach is inadequate when we have a really large number of dimensions, because then 2N will be so large that “organizational” data of all tree nodes may eat a lot of memory. But it is not a problem if we stay within reasonable number of dimensions, like 2 or 3.