I recently saw a post by James Gosling (via David Strommer) called the Eight Fallacies of Distributed Computing. These are eight assumptions about the network that almost anyone new to distributed computing makes, and that prove to be wrong in the long run (and thus cause big problems and headaches).
I thought I'd try to complement this list by adding a few realities of distributed systems and data:
- Expect a certain level of entropy in the system
- Sites are never fully synchronized (unless you stop new data from pouring in)
- You can only afford to cache immutable data
- It is very, very hard to scale indefinitely
- Observing global state is only possible via control messages
- It is hard to achieve distributed consensus (membership in a cluster, total order, commitment etc.)
- Expect to debug via log files
There are probably many others - but these are the first few that came to mind :)
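
To make the caching point a bit more concrete, here is a minimal Python sketch of one way to read it: if every cached value is stored under a key derived from its own content, an entry can never go stale, so there is nothing to invalidate across sites. The class name `ContentAddressedCache` and its `put`/`get` methods are made up for illustration, and the in-memory dict stands in for whatever store a real system would use.

```python
import hashlib


class ContentAddressedCache:
    """Hypothetical cache keyed by the SHA-256 of the value itself.

    Because the key is derived from the content, the value behind a key
    never changes, so a cached copy can never become stale.
    """

    def __init__(self):
        self._store = {}  # in-memory stand-in for a real cache backend

    def put(self, data: bytes) -> str:
        # The key is a pure function of the bytes being stored.
        key = hashlib.sha256(data).hexdigest()
        self._store[key] = data
        return key

    def get(self, key: str):
        # Returns None on a miss; a hit is always the original bytes.
        return self._store.get(key)


cache = ContentAddressedCache()
v1 = cache.put(b"price=10")
v2 = cache.put(b"price=12")  # an "update" is a new entry under a new key

assert v1 != v2
assert cache.get(v1) == b"price=10"  # the old version is still valid as-is
```

The hard part does not go away, of course: something still has to tell each site which key is the current one, and that pointer is exactly the kind of mutable data that is expensive to cache.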