Spark Persistence (Caching)


    1. Cached RDDs should be modest in size so that they fit entirely in memory; estimating that size ahead of time, however, is challenging.
    2. A caching strategy that keeps blocks in both memory and disk is generally preferred: cached blocks that are evicted from memory are written to disk, and reading them back from disk is considerably faster than re-evaluating the RDD.
    3. Internals:
      1. Internally, caching is performed at the block level: each RDD consists of multiple blocks, and each block is cached independently of the others.
      2. Caching is performed on the node that generated the particular RDD block.
      3. Each executor in Spark has an associated Block Manager that is used to cache RDD blocks.
      4. The memory available to the Block Manager for caching is given by the storage memory fraction (spark.memory.storageFraction), the share of Spark's unified memory pool that is set aside for storage.
      5. The Block Manager manages the cached partitions as well as intermediate shuffle data (the sketch after this list shows how to inspect its cache state from the driver).
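
    A minimal Scala sketch of inspecting this per-block state, assuming an existing SparkContext named sc (getRDDStorageInfo is a developer API):

        import org.apache.spark.storage.StorageLevel

        // Each of the 8 partitions becomes one block, cached by the Block Manager
        // of whichever executor computed it.
        val rdd = sc.parallelize(1 to 1000000, 8).persist(StorageLevel.MEMORY_AND_DISK)
        rdd.count()  // an action materializes, and therefore caches, the blocks

        // Report how many partitions are cached and how much memory/disk they use.
        sc.getRDDStorageInfo.foreach { info =>
          println(s"RDD ${info.id}: ${info.numCachedPartitions}/${info.numPartitions} partitions cached, " +
            s"${info.memSize} bytes in memory, ${info.diskSize} bytes on disk")
        }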
    4. persist() and cache() are the APIs used to cache an RDD.
    5. The default storage level of cache() is memory only; in other words, cache() is an alias for persist(StorageLevel.MEMORY_ONLY).
    6. persist() accepts the storage level as an argument.
    7. Replication is requested by appending _2 to the storage level name. Replication helps when a node goes down; in other words, it provides fault tolerance.
    8. Serialization increases processing cost but reduces the memory footprint of large datasets.
    9. Storing data in serialized form also reduces GC pressure, since fewer Java objects are created. The sketch below illustrates points 4 through 9.
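
    A minimal Scala sketch of points 4 through 9, assuming an existing SparkContext named sc (the input path is illustrative):

        import org.apache.spark.storage.StorageLevel

        // cache() is shorthand for persist(StorageLevel.MEMORY_ONLY):
        // deserialized objects, in memory only.
        val words = sc.textFile("hdfs:///data/words.txt").cache()

        // persist() takes an explicit storage level; the _SER variant keeps the
        // blocks serialized, trading CPU for a smaller footprint and less GC.
        val upper = words.map(_.toUpperCase).persist(StorageLevel.MEMORY_ONLY_SER)

        // Appending _2 replicates each cached block to a second node,
        // which provides fault tolerance if a node goes down.
        val replicated = words.filter(_.nonEmpty).persist(StorageLevel.MEMORY_AND_DISK_2)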
    10. During its life cycle, an RDD's partitions may exist in memory and/or on disk across the cluster, depending on the memory available.
    11. Caching strategies / storage levels / persistence levels:
      1. RDD blocks can be stored in multiple stores: MEMORY, DISK, and OFF_HEAP, in serialized and deserialized formats.
      2. Once a storage level has been assigned to an RDD, it cannot be changed (see the sketch after this list).
      3. MEMORY_ONLY: data is cached in memory only, in deserialized format.
      4. MEMORY_AND_DISK: data is cached in memory; when memory runs short, evicted blocks are serialized to disk. This strategy helps when re-evaluation is expensive and memory is scarce.
      5. DISK_ONLY: data is cached on disk, in serialized format.
      6. OFF_HEAP: blocks are cached off-heap (e.g. in Alluxio [2]), in serialized format.
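
    A small sketch of point 2 above (the level is fixed once assigned), assuming an existing SparkContext named sc:

        import org.apache.spark.storage.StorageLevel

        val rdd = sc.parallelize(1 to 100).persist(StorageLevel.MEMORY_ONLY)
        println(rdd.getStorageLevel)  // prints the level assigned above

        // Re-persisting an already-persisted RDD with a *different* level throws
        // java.lang.UnsupportedOperationException:
        // rdd.persist(StorageLevel.DISK_ONLY)

        // To change the level, unpersist first, then persist again.
        rdd.unpersist()
        rdd.persist(StorageLevel.DISK_ONLY)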
    12. Use cases:
      1. Reuse in an iterative loop (e.g. ML algorithms)
      2. Reuse within a single application, notebook, or job
      3. When regeneration is a very expensive operation; caching then also speeds up recovery from node failure
      4. In general, cache only those RDDs that are expensive to re-evaluate (see the sketch after this list)
    13. The default eviction strategy is LRU (least recently used).
      1. LRU eviction happens independently on each worker and depends on the memory available there.
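
    A sketch of the iterative-reuse case, assuming an existing SparkContext named sc (the input path is illustrative): the parsed dataset is cached once and then scanned repeatedly.

        import org.apache.spark.storage.StorageLevel

        // Read and parse once, then cache; without the persist, every job below
        // would re-read and re-parse the file from scratch.
        val points = sc.textFile("hdfs:///data/points.csv")
          .map(_.split(",").map(_.toDouble))
          .persist(StorageLevel.MEMORY_AND_DISK)

        for (threshold <- Seq(0.1, 0.5, 0.9)) {
          // Each count() is a separate job, served from the cached blocks.
          val n = points.filter(p => p(0) > threshold).count()
          println(s"points above $threshold: $n")
        }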

    14. Storage memory is freed with unpersist().
    15. Recommended level/best practice: MEMORY_AND_DISK, which spills RDD partitions to the worker's local disk when they are evicted from memory. Rebuilding such a partition then only requires reading it back from the worker's local disk, which is relatively fast (an end-to-end sketch follows the table below).
    16. Where to see it in the UI:
      1. The Storage tab shows where each partition resides across the cluster (i.e. memory or disk) at any given point in time.
    17. Persist vs Cache

                          cache()                                persist()
        Storage level     MEMORY_ONLY                            Can be passed as an argument
        When to use       When the default level is sufficient   When a specific storage level is needed
        Freeing memory    unpersist()                            unpersist()
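
    Putting items 14 through 17 together, a minimal end-to-end sketch, assuming an existing SparkContext named sc (the log path is illustrative):

        import org.apache.spark.storage.StorageLevel

        val errors = sc.textFile("hdfs:///logs/app.log")
          .filter(_.contains("ERROR"))
          .persist(StorageLevel.MEMORY_AND_DISK)  // the recommended level

        val total  = errors.count()  // the first action materializes the cache
        val sample = errors.take(5)  // subsequent actions are served from it

        // Free the storage memory once the RDD is no longer needed;
        // blocking = true waits until all blocks have been removed.
        errors.unpersist(blocking = true)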






    References:

