Showing posts from July, 2019

Python vs Scala in Spark

| | PySpark | Scala |
|---|---|---|
| Nativity | Native only with respect to the Spark APIs provided; not all features are covered | Spark is developed in Scala, so things are more natural; it is the de facto interface |
| Learning and use | Comparatively easy to learn and use | Complex to learn, easy to use |
| Complexity | Less than Scala | More complex than PySpark; a lot of internal manipulations and conversions happen |
| Conciseness | More imperative in style | More concise; fewer lines of code allow faster development, testing and deployment |
| Performance | Lower than Scala, because internal conversions are required | 10 times faster than PySpark |
| Effective for? | Smaller ad-hoc experiments | Production and engineering applications |
| Scalability | Not much | Scalable |
| Refactoring | Not good, more bugs get