Advantages of Immutability in distributed systems and in programming
Specific to Distributed systems:
1. Performance - an immutable RDD is simple and easy to share across multiple processing elements.
2. Fault tolerance - an RDD can be re-created on failure.
3. Caching and in-memory processing - references can be cached because they are not going to change.
4. Multiple threads can access the same partition.
5. Sharing - safe to share across processes.
6. Processing is easy - immutable data is easy to parallelize, as there are no conflicts.
7. Replication - internal state stays consistent even if an exception occurs.
8. Rules out the potential problems caused by updates from multiple threads.
9. Can live in memory as easily as on disk. This makes it reasonable to move operations that hit disk over to data held in memory, and adding memory is easier than adding I/O bandwidth.
For RDDs these are significant wins, at the cost of having to copy the data rather than mutate it in place.
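A few of the points above (sharing across threads, re-creation on failure, copying instead of mutating) can be illustrated without Spark. This is only a minimal sketch in plain Python: the `Dataset` class, its `map` and `compute` methods, and the sample data are hypothetical stand-ins invented for illustration, not Spark's RDD API.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Dataset:
    """Hypothetical stand-in for an RDD: either a source, or a parent plus a function."""
    source: Optional[Tuple[int, ...]] = None
    parent: Optional["Dataset"] = None
    fn: Optional[Callable[[int], int]] = None

    def map(self, fn: Callable[[int], int]) -> "Dataset":
        # Returns a new Dataset that remembers its parent; nothing is modified.
        return Dataset(parent=self, fn=fn)

    def compute(self) -> Tuple[int, ...]:
        # Walk the lineage back to the source and replay each transformation.
        if self.parent is None:
            return self.source
        return tuple(self.fn(x) for x in self.parent.compute())


base = Dataset(source=(1, 2, 3, 4))
doubled = base.map(lambda x: x * 2)

# Materialize once; the result is an immutable tuple.
cached = doubled.compute()

# Several threads read the same tuple with no locks, since nobody can change it.
with ThreadPoolExecutor(max_workers=4) as pool:
    sums = list(pool.map(sum, [cached] * 4))

# Simulate losing the cached data, then rebuild it from the lineage.
cached = None
rebuilt = doubled.compute()
```

The cost mentioned above is visible here too: `map` never edits the source tuple, so every `compute` call builds a fresh copy of the data.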
General list of reasons to favor immutability in programming:
- immutable objects are simpler to construct, test, and use
- truly immutable objects are always thread-safe
- they help to avoid temporal coupling
- their usage is side-effect free (no defensive copies)
- identity mutability problem is avoided
- they always have failure atomicity
- they are much easier to cache
- they prevent NULL references, which are bad
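Some of these reasons can be seen in a small Python sketch using a frozen dataclass. The `Money` class and the `format_money` function are hypothetical names chosen for illustration; the sketch assumes that validating in the constructor is an acceptable way to show failure atomicity.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from functools import lru_cache


@dataclass(frozen=True)
class Money:
    amount: int
    currency: str

    def __post_init__(self) -> None:
        # Validate once at construction, so an invalid Money never exists.
        if self.amount < 0:
            raise ValueError("amount must be non-negative")


price = Money(100, "USD")

# Attempts to mutate are rejected, so no defensive copies are needed.
try:
    price.amount = 5
    mutated = True
except FrozenInstanceError:
    mutated = False

# A "change" produces a new object and leaves the original alone.
discounted = replace(price, amount=80)

# Failure atomicity: a failed update cannot leave price half-modified.
try:
    replace(price, amount=-1)
except ValueError:
    pass


# Frozen dataclasses are hashable, so they work directly as cache keys.
@lru_cache(maxsize=None)
def format_money(m: Money) -> str:
    return f"{m.amount} {m.currency}"


first = format_money(price)
second = format_money(price)  # served from the cache
```

Because `price` can never change after construction, the cached string for it can never go stale, which is what makes caching by value safe here.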