Spark Main Concepts
Apache Spark is an open-source, distributed, general-purpose cluster-computing framework.
Why Spark is fast
1. In-memory processing
2. Data parallelism
3. Lazy evaluation
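
To make the data-parallelism point concrete, here is a minimal sketch in plain Python. It does not use Spark. The helper names (split_into_partitions, sum_of_squares) are made up for illustration, and threads stand in for the executors a real cluster would use. The data stays in memory, is split into partitions, and each partition is processed independently before the partial results are combined.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(data, num_partitions):
    """Split a list into num_partitions chunks (hypothetical helper)."""
    return [data[i::num_partitions] for i in range(num_partitions)]


def sum_of_squares(partition):
    """Work that can run on one partition without seeing the others."""
    return sum(x * x for x in partition)


data = list(range(1, 11))                 # kept in memory the whole time
partitions = split_into_partitions(data, 4)

# Each partition is handled independently; threads stand in for executors.
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_sums = list(pool.map(sum_of_squares, partitions))

# Combine the per-partition results into the final answer.
total = sum(partial_sums)
print(total)  # 385
```

The same computation on a single list gives the same answer; the point is only that the per-partition work needs no coordination until the final combine step.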
Concepts you must know
1. RDD (Resilient Distributed Dataset)
2. Lineage graph
3. DAG Scheduler
4. Action
5. Transformation
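
The concepts above fit together roughly as in the toy model below. This is plain Python, not Spark's API: ToyRDD and its methods are hypothetical names chosen to mirror the ideas. A transformation only records a step in the lineage, and an action replays the recorded steps to produce a result.

```python
class ToyRDD:
    """Toy stand-in for an RDD: source data plus recorded steps (hypothetical)."""

    def __init__(self, source, steps=()):
        self._source = source        # the original in-memory data
        self._steps = tuple(steps)   # lineage: (kind, function) pairs, oldest first

    # Transformations: return a new ToyRDD and compute nothing.
    def map(self, fn):
        return ToyRDD(self._source, self._steps + (("map", fn),))

    def filter(self, fn):
        return ToyRDD(self._source, self._steps + (("filter", fn),))

    def lineage(self):
        """Names of the recorded steps, starting from the source."""
        return ["source"] + [kind for kind, _ in self._steps]

    # Actions: replay the lineage and return a concrete result.
    def collect(self):
        items = iter(self._source)
        for kind, fn in self._steps:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return list(items)

    def count(self):
        return len(self.collect())


calls = []  # records every element the mapped function actually sees


def double(x):
    calls.append(x)
    return x * 2


numbers = ToyRDD([1, 2, 3, 4, 5])
evens_doubled = numbers.filter(lambda x: x % 2 == 0).map(double)

print(evens_doubled.lineage())  # ['source', 'filter', 'map']
print(len(calls))               # 0, because no action has run yet
print(evens_doubled.collect())  # [4, 8]
print(len(calls))               # 2, because the action triggered the work
```

The toy replays its steps in a single pass. In Spark itself, the DAG scheduler is the component that turns the lineage into stages of tasks when an action is called, and the lineage is also what lets a lost partition be recomputed; neither is modelled here.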