I believe the 68000 could be used with an MMU, but the catch was that it couldn't do demand paging, only memory protection and virtual-to-physical translation. I can't find the specific explanation right now, but it's something along the lines of the bus error exception (needed to actually abort the memory cycle) being special: the 68000 doesn't save enough state to restart the faulting instruction, and the PC value pushed to the stack is not always the right one. So you could terminate a process on an MMU exception, but resuming it was not reliable.
There was at least one company (Apollo, I think) that implemented demand paging on the 68000 by using two 68000s. One, the leader, ran as the "real" CPU, while the other, the follower, executed the same code on the same data but delayed by one instruction.
If the leader got a bus error, an interrupt was raised on the follower to stop it before it executed the faulting instruction.
The leader and follower would then switch roles, and the new leader could deal with the situation that had caused the bus error on the former leader.
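To make the role swap concrete, here is a toy simulation of the scheme as I've described it. It is only a sketch and not a model of any real hardware: the function name `run_lockstep`, the CPU labels "A" and "B", and the one-memory-access-per-instruction program format are all made up for illustration, and it glosses over how the old leader gets resynchronized as the new follower.

```python
def run_lockstep(program, resident):
    """Toy model of the leader/follower scheme (hypothetical, illustrative only).

    program:  list of page numbers; instruction i touches page program[i].
    resident: iterable of pages that are mapped at the start.
    Returns a log of (event, cpu, value) tuples.
    """
    resident = set(resident)
    leader, follower = "A", "B"
    done = {"A": 0, "B": 0}  # number of instructions each CPU has completed
    log = []

    while done[leader] < len(program):
        pc = done[leader]          # instruction the leader is about to run
        page = program[pc]

        # The follower trails by one instruction: it has completed
        # everything before pc, but has not started pc itself.
        done[follower] = pc

        if page not in resident:
            # Leader takes a bus error; its saved state is not trusted.
            log.append(("fault", leader, pc))
            # The follower was stopped just before instruction pc,
            # so it has a clean state. Swap roles.
            leader, follower = follower, leader
            # The new leader services the fault (assumed: page-in always works).
            resident.add(page)
            log.append(("page_in", leader, page))
            continue               # new leader retries instruction pc

        done[leader] = pc + 1
        log.append(("exec", leader, pc))

    done[follower] = len(program)  # follower drains its last instruction
    return log


if __name__ == "__main__":
    for event in run_lockstep([0, 1, 2, 1], resident={0, 1}):
        print(event)
```

In this run, CPU A executes instructions 0 and 1 and faults on instruction 2, then B takes over as leader, pages in page 2, and executes instructions 2 and 3.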