Run Jupyter for PySpark on Docker

Install WSL 2

Install Ubuntu 20.04 LTS

Install Docker

Make sure Ubuntu 20.04 is installed, then turn on Ubuntu-20.04 in Docker Desktop under Settings -> Resources -> WSL Integration.

Bash
sudo apt-get update
sudo apt install docker.io
Start Docker and enable it to start automatically after a reboot:
Bash
sudo systemctl enable --now docker
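
Before moving on, it can help to confirm that the Docker CLI is installed and that the daemon answers. The following is a minimal sketch, not part of the original project: `check_docker` is a hypothetical helper name, and it relies only on `docker info` returning a non-zero status when the daemon cannot be reached.

```bash
#!/usr/bin/env bash
# check_docker is an illustrative helper, not part of the project.
check_docker() {
  # Is the docker CLI on PATH at all?
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker: not installed"
    return 1
  fi
  # `docker info` fails when the daemon is stopped or not reachable.
  if ! docker info >/dev/null 2>&1; then
    echo "docker: installed, but the daemon is not reachable"
    return 2
  fi
  echo "docker: ok ($(docker --version))"
}

# Report the status without aborting the shell on failure.
check_docker || true
```

If the second message appears inside WSL, the WSL Integration setting mentioned above is a likely place to look.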

Fork Project

https://github.com/padamshrestha/pyspark-practice-notebook

Run the Docker command from the WSL command line

Bash
docker run -p 8888:8888 -p 4040:4040 -v /mnt/c/Projects/pyspark-practice-notebook/:/home/jovyan/work --name spark jupyter/pyspark-notebook
Port 8888 is for the Jupyter editor and port 4040 is for the Spark jobs UI. The `-v` option mounts the project folder at /home/jovyan/work inside the container.
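
If the project lives in a different folder, or one of the host ports is already taken, the same command can be assembled from variables. This is a sketch under assumptions: `PROJECT_DIR`, `JUPYTER_PORT`, `SPARK_UI_PORT` and `RUN` are illustrative names introduced here, and the defaults simply repeat the values from the command above.

```bash
#!/usr/bin/env bash
# Illustrative variable names; the defaults mirror the command in this guide.
PROJECT_DIR="${PROJECT_DIR:-/mnt/c/Projects/pyspark-practice-notebook}"
JUPYTER_PORT="${JUPYTER_PORT:-8888}"
SPARK_UI_PORT="${SPARK_UI_PORT:-4040}"

# Build the command as an array so that paths containing spaces stay intact.
# Only the host side of each port mapping changes; the container ports stay fixed.
run_cmd=(docker run
  -p "${JUPYTER_PORT}:8888"
  -p "${SPARK_UI_PORT}:4040"
  -v "${PROJECT_DIR}:/home/jovyan/work"
  --name spark
  jupyter/pyspark-notebook)

# Print the command for review; set RUN=1 to execute it as well.
printf '%s ' "${run_cmd[@]}"; printf '\n'
if [ "${RUN:-0}" = "1" ]; then
  "${run_cmd[@]}"
fi
```

Because the container is named `spark`, running `docker run` a second time with the same name fails with a name conflict. `docker start -a spark` restarts the existing container instead.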

or

Bash
docker-compose up

Open VS Code in container

Open VS Code

For the web version, open Jupyter web in a browser.
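
The Jupyter server normally prints a login URL containing a token when the container starts. The sketch below pulls that URL out of the container logs. It assumes the log line has the form `http://127.0.0.1:<port>/...token=<hex>`, which may differ between image versions, and `extract_jupyter_url` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the first tokenized Jupyter URL found on stdin.
# Assumes the URL format described above; adjust the pattern if the logs differ.
extract_jupyter_url() {
  grep -Eo 'https?://127\.0\.0\.1:[0-9]+/[^[:space:]]*token=[0-9a-f]+' | head -n 1
}

# Read the logs of the container named "spark" (the name used in the run command).
if command -v docker >/dev/null 2>&1; then
  docker logs spark 2>&1 | extract_jupyter_url
fi
```

If nothing is printed, running `docker logs spark` on its own shows the full startup output to inspect by hand.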