When to use RDD?
1. You want to instruct Spark precisely how to execute a query, i.e., you need control over the low-level transformations and actions.
2. You can forgo the query optimization, efficient space utilization, and performance benefits available with DataFrames (DFs) and Datasets (DSs).
3. Your data is unstructured, such as media streams or streams of text.
4. You do not need to impose a schema while processing the data, or to access attributes by name or column.
5. You want to manipulate the data with functional programming constructs rather than domain-specific expressions.
6. A third-party package you depend on is written using RDDs.
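
To make the contrast between functional constructs and domain-specific expressions concrete, here is a minimal sketch that counts words in unstructured text twice: once with RDD operations and once with DataFrame expressions. The object name, the helper names `rddWordCount` and `dfWordCount`, and the sample lines are made up for illustration, and the local master setting is an assumption for running on a single machine.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode, split}

object RddVsDataFrameWordCount {
  // Hypothetical sample input: free-form text with no schema.
  val sampleLines: Seq[String] =
    Seq("spark makes rdds", "rdds are low level", "spark is fast")

  // RDD style: every step is an explicit functional transformation,
  // so we decide exactly how the data is split, paired, and reduced.
  def rddWordCount(spark: SparkSession, lines: Seq[String]): Map[String, Long] =
    spark.sparkContext
      .parallelize(lines)
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1L))
      .reduceByKey(_ + _)
      .collect()
      .toMap

  // DataFrame style: we describe the result with column expressions
  // and leave the execution plan to Spark's optimizer.
  def dfWordCount(spark: SparkSession, lines: Seq[String]): Map[String, Long] = {
    import spark.implicits._
    lines.toDF("line")
      .select(explode(split(col("line"), "\\s+")).as("word"))
      .groupBy("word")
      .count()
      .collect()
      .map(row => (row.getString(0), row.getLong(1)))
      .toMap
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("rdd-vs-dataframe")
      .master("local[*]") // assumption: local run, not a cluster
      .getOrCreate()

    val fromRdd = rddWordCount(spark, sampleLines)
    val fromDf = dfWordCount(spark, sampleLines)

    // Both approaches should agree on the counts.
    println(s"Same result: ${fromRdd == fromDf}")

    spark.stop()
  }
}
```

Both versions compute the same counts. The RDD version spells out each step by hand, which is what points 1 and 5 describe, while the DataFrame version gives up that control in exchange for the optimizations mentioned in point 2.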