Apache PySpark by Example

LinkedIn Learning
via LinkedIn

Want to get up and running with Apache Spark as soon as possible? If you’re well versed in Python, the Spark Python API (PySpark) is your ticket to accessing the power of this hugely popular big data platform. This practical, hands-on course helps you get comfortable with PySpark, explaining what it has to offer and how it can enhance your data science work. To begin, instructor Jonathan Fernandes digs into the Spark ecosystem, detailing its advantages over other data science platforms, APIs, and tool sets. Next, he looks at the DataFrame API and how it’s the platform’s answer to many big data challenges. Finally, he goes over Resilient Distributed Datasets (RDDs), the building blocks of Spark.

Instructor(s)

Jonathan Fernandes
Paid, free trial available
English
Certificate of Completion
1h 58m
Self paced
Intermediate