Installing PySpark with Jupyter: A Checklist

Python is a wonderful programming language for data analytics. I normally prefer to write Python code inside Jupyter Notebook (previously known as IPython), because it allows us to create and share documents that contain live code, equations, visualizations, and explanatory text. Apache Spark is a fast, general-purpose engine for large-scale data processing, and PySpark is the Python API for Spark. So if you are interested in data science, launching PySpark inside Jupyter is a good starting point:

    IPYTHON_OPTS="notebook" pyspark --master spark://localhost:7077 --executor-memory 7g

(This command assumes a standalone Spark master is already running at spark://localhost:7077; see the notes at the end of this section.)

Install Jupyter

If you are a Python user, I highly recommend installing Anaconda. Anaconda conveniently installs Python, the Jupyter Notebook, and other commonly used packages for scientific computing and data science. Go to https://www.continuum.io/downloads, find the installer for your platform, and run it.
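Once Anaconda is installed, a quick sanity check is to confirm that Jupyter is on your PATH and that the notebook server starts. A minimal sketch, assuming the installer added Anaconda's bin directory to your PATH:

    # Confirm the Anaconda Python and Jupyter are the ones being picked up
    python --version
    jupyter --version

    # Start the notebook server; by default it serves at http://localhost:8888
    jupyter notebook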
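A note on the launch command at the top: pointing --master at spark://localhost:7077 assumes a local standalone Spark master (and at least one worker) is already up. A minimal sketch for bringing one up, assuming SPARK_HOME points at your unpacked Spark distribution and you are on a Spark 1.4+ release where start-slave.sh takes the master URL directly:

    # Start a standalone master; it listens on spark://localhost:7077 by default
    # and serves a web UI at http://localhost:8080
    $SPARK_HOME/sbin/start-master.sh

    # Start a worker process and register it with the master
    $SPARK_HOME/sbin/start-slave.sh spark://localhost:7077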
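One caveat: IPYTHON_OPTS was removed in Spark 2.0. If you are on a newer release, the equivalent is to point the PySpark driver at Jupyter through environment variables instead, along these lines:

    # Spark 2.0+ replacement for IPYTHON_OPTS="notebook"
    PYSPARK_DRIVER_PYTHON=jupyter \
    PYSPARK_DRIVER_PYTHON_OPTS="notebook" \
    pyspark --master spark://localhost:7077 --executor-memory 7g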