RDD Interface
An RDD (Resilient Distributed Dataset) is the basic abstraction in Spark.
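Internally, each RDD is characterized by five properties:
- Dependencies: the parent RDDs it is derived from
- Partitions: the splits that make up the dataset
- Compute: a function that computes each partition
- Partitioner, for key-value RDDs
- List of preferred locations for computing each partition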
The last two, the partitioner and the list of preferred locations, are optional.
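To make the five properties concrete, here is a minimal single-process sketch in plain Python. It is not Spark's API: the names ToyRDD, ListRDD, MappedRDD and collect are hypothetical and only mirror the shape of the interface described above, with the three required pieces as abstract methods and the two optional ones given defaults.

```python
from abc import ABC, abstractmethod


class ToyRDD(ABC):
    """Hypothetical single-process stand-in for an RDD; not Spark's API."""

    # Optional property: only meaningful for key-value data.
    partitioner = None

    @abstractmethod
    def dependencies(self):
        """Parent ToyRDDs this dataset is derived from."""

    @abstractmethod
    def partitions(self):
        """Indices of the partitions that make up this dataset."""

    @abstractmethod
    def compute(self, split):
        """Return an iterator over the records of one partition."""

    def preferred_locations(self, split):
        """Optional property: hosts where `split` is cheap to compute."""
        return []

    def collect(self):
        # Compute every partition in order and concatenate the records.
        return [x for p in self.partitions() for x in self.compute(p)]


class ListRDD(ToyRDD):
    """Source dataset: no parents, data already sliced into partitions."""

    def __init__(self, slices):
        self._slices = slices

    def dependencies(self):
        return []

    def partitions(self):
        return list(range(len(self._slices)))

    def compute(self, split):
        return iter(self._slices[split])


class MappedRDD(ToyRDD):
    """Derived dataset: one parent, same partitions, records transformed."""

    def __init__(self, parent, fn):
        self._parent, self._fn = parent, fn

    def dependencies(self):
        return [self._parent]

    def partitions(self):
        return self._parent.partitions()

    def compute(self, split):
        return map(self._fn, self._parent.compute(split))


base = ListRDD([[1, 2], [3, 4]])
doubled = MappedRDD(base, lambda x: x * 2)
print(doubled.collect())
```

Note that MappedRDD stores no data of its own. It only records its parent and the function to apply, which is why the dependencies and compute properties are enough to describe it.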
An RDD achieves resilience through its dependencies: if a partition is lost, it can be recomputed from its parent RDDs.
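The recovery idea can be sketched in a few lines of plain Python. This is an illustrative toy, not Spark's implementation: the Source and Mapped classes and the in-memory cache dictionary are hypothetical, and losing a partition is simulated by deleting its cache entry.

```python
class Source:
    """Hypothetical root dataset backed by durable input."""

    def __init__(self, slices):
        self.slices = slices

    def compute(self, split):
        return list(self.slices[split])


class Mapped:
    """Hypothetical derived dataset that caches computed partitions."""

    def __init__(self, parent, fn):
        self.parent, self.fn = parent, fn
        self.cache = {}

    def compute(self, split):
        if split not in self.cache:
            # Cache miss (first use or lost data): rebuild this one
            # partition by re-running fn over the parent's partition.
            self.cache[split] = [self.fn(x) for x in self.parent.compute(split)]
        return self.cache[split]


source = Source([[1, 2], [3, 4]])
squares = Mapped(source, lambda x: x * x)

first = squares.compute(1)    # computed and cached
del squares.cache[1]          # simulate losing partition 1
rebuilt = squares.compute(1)  # recomputed from the parent
print(first, rebuilt)
```

Only the lost partition is rebuilt; partition 0 is never touched in this run, because the dependency on the parent is tracked per partition.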