learningjournal / spark-programming-in-python

Apache Spark 3 - Spark Programming in Python for Beginners

License: MIT License

Python 99.39% Shell 0.61%

spark-programming-in-python's Introduction

Apache Spark 3 - Spark Programming in Python for Beginners

This is the central repository for all the materials related to the Apache Spark 3 - Spark Programming in Python for Beginners course by Prashant Pandey.
You can get the full course at Apache Spark Course @ Udemy.

Description

I am creating the Apache Spark 3 - Spark Programming in Python for Beginners course to help you understand Spark programming and apply that knowledge to build data engineering solutions. The course is example-driven and follows a working-session approach: we code live and explain the concepts we need along the way.

Who should take this Course?

I designed this course for software engineers who want to develop data engineering pipelines and applications using Apache Spark. It is also for data architects and data engineers who are responsible for designing and building their organization's data-centric infrastructure. A third group is managers and architects who do not work directly on Spark implementation but do work with the people who implement Apache Spark at the ground level.

Spark and source code version

This course uses Apache Spark 3.x. I have tested all the source code and examples used in this course on the Apache Spark 3.0.0 open-source distribution.

spark-programming-in-python's Issues

executing .py files

Hi Sir, @LearningJournal
I have cloned the project and want to execute the .py files using the spark-submit command, but I am getting an error. Can you guide me on how to execute these scripts in a Spark environment?

Command: ./bin/spark-submit /home/ubuntu/Spark-Programming-In-Python-master/01-HelloSpark/HelloSpark.py

The error logs are pasted below for reference:

21/07/16 14:01:56 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/lib/python3.8/configparser.py", line 846, in items
    d.update(self._sections[section])
KeyError: 'SPARK_APP_CONFIGS'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/Spark-Programming-In-Python-master/01-HelloSpark/HelloSpark.py", line 7, in <module>
    conf = get_spark_app_config()
  File "/home/ubuntu/Spark-Programming-In-Python-master/01-HelloSpark/lib/utils.py", line 25, in get_spark_app_config
    for (key, val) in config.items("SPARK_APP_CONFIGS"):
  File "/home/ubuntu/anaconda3/lib/python3.8/configparser.py", line 849, in items
    raise NoSectionError(section)
configparser.NoSectionError: No section: 'SPARK_APP_CONFIGS'
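
A possible explanation, offered as a guess rather than a confirmed diagnosis: Python's configparser silently skips any file it cannot open, so a config file that is not found leaves the parser empty and the later items("SPARK_APP_CONFIGS") call raises NoSectionError. Since the command above is run from the Spark home directory, a config file looked up by relative path would be missed. The sketch below only illustrates that behaviour. The function name load_spark_app_configs, the file name spark.conf, and the sample setting are assumptions for illustration, not the repository's actual code.

```python
import configparser
import os
import tempfile


def load_spark_app_configs(conf_path):
    """Return the [SPARK_APP_CONFIGS] entries from conf_path as a dict.

    Hypothetical helper: it fails loudly when the file is missing instead
    of letting a later items() call raise NoSectionError.
    """
    config = configparser.ConfigParser()
    config.optionxform = str  # keep option names exactly as written
    # read() returns the list of files it actually parsed and silently
    # skips any file it cannot open, so check the result explicitly.
    if not config.read(conf_path):
        raise FileNotFoundError(f"Config file not found: {conf_path}")
    return dict(config.items("SPARK_APP_CONFIGS"))


# Demonstration with a throwaway config file (contents are made up).
with tempfile.TemporaryDirectory() as tmp_dir:
    conf_path = os.path.join(tmp_dir, "spark.conf")
    with open(conf_path, "w") as f:
        f.write("[SPARK_APP_CONFIGS]\nspark.app.name = Hello Spark\n")
    settings = load_spark_app_configs(conf_path)
    print(settings)

    # A missing file is not an error for read(); the parser just stays empty.
    empty_parser = configparser.ConfigParser()
    parsed_files = empty_parser.read(os.path.join(tmp_dir, "no-such-file.conf"))
    print(parsed_files)
```

If this is indeed the cause, running spark-submit from inside the 01-HelloSpark directory, or building the config path from os.path.dirname(os.path.abspath(__file__)) in the script, may be worth trying.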

Doubt at Setting Logger in Pyspark

Sir, I have a doubt about the video SP20, Creating Spark Session, from the playlist Spark Programming for Beginners using PySpark 3.
What does the organization name at 5:22 refer to, and how can I find it out?
