Run and Debug Dotnet Spark application on Docker

Warning

Please go to "Running Dotnet Spark applications on Ubuntu Container" for running a Dotnet Spark application on Docker.

Install Docker Desktop

Batchfile
choco install docker-desktop

Tip

Make sure to restart the machine after installation. If there is an access issue, add the current user to the docker group.

Use Docker container

Precompiled Container. If you want to custom compile, visit Dotnet Spark Container Script and build.sh.

Batchfile
docker run -d --name dotnet-spark -p 8080:8080 -p 8081:8081 -e SPARK_DEBUG_DISABLED=true 3rdman/dotnet-spark:latest

The Docker container should now be running. Verify it with:

Batchfile
docker ps
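
To check this from a script rather than by eye, a minimal sketch can filter the `docker ps` output by name. It assumes the container name dotnet-spark used above; the function name `container_running` is hypothetical.

```shell
# Hypothetical helper: succeeds only if a running container has exactly this name.
container_running() {
  # docker ps lists running containers only. --filter name= matches substrings,
  # so grep -Fx keeps only an exact, whole-line name match.
  docker ps --filter "name=$1" --format '{{.Names}}' | grep -Fxq -- "$1"
}

# Usage:
#   container_running dotnet-spark && echo "dotnet-spark is running"
```

The exact-match step matters if several containers share a name prefix, for example dotnet-spark and dotnet-spark-2.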

Building and executing the example application

Open a bash shell in the container

Batchfile
docker exec -it dotnet-spark /bin/bash

Navigate to the project

Batchfile
cd /dotnet/HelloSpark
ls -la
The project directory contains a JSON file as sample data.

Now let's build

Batchfile
dotnet build
Once the build is completed, there will be bin and obj folders.

Now let's navigate to the bin folder

Batchfile
cd /dotnet/HelloSpark/bin/Debug/netcoreapp3.1

The sample application expects a JSON file, so let's copy it from the application folder

Batchfile
cp /dotnet/HelloSpark/people.json .

Finally, let's execute the application with spark-submit

Batchfile
spark-submit --class org.apache.spark.deploy.dotnet.DotnetRunner --master spark://$HOSTNAME:$SPARK_MASTER_PORT microsoft-spark-2.4.x-0.12.1.jar dotnet HelloSpark.dll

Tip

Use the jar version that is actually present in the bin folder, e.g. microsoft-spark-2.4.x-0.12.1.jar
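
Because the jar name changes with the installed package version, one way to avoid hard-coding it is to look it up by pattern. The following is a minimal sketch, assuming the jar follows the microsoft-spark-*.jar naming seen above; the function name `find_spark_jar` is hypothetical.

```shell
# Hypothetical helper: print the name of the first microsoft-spark jar in a directory.
find_spark_jar() {
  jar_dir="${1:-.}"
  for jar in "$jar_dir"/microsoft-spark-*.jar; do
    # If nothing matches, the pattern stays literal and this -f test fails.
    if [ -f "$jar" ]; then
      basename "$jar"
      return 0
    fi
  done
  echo "no microsoft-spark jar found in $jar_dir" >&2
  return 1
}

# Usage (run from the bin folder):
#   JAR=$(find_spark_jar .) && spark-submit \
#     --class org.apache.spark.deploy.dotnet.DotnetRunner \
#     --master spark://$HOSTNAME:$SPARK_MASTER_PORT "$JAR" dotnet HelloSpark.dll
```

If more than one matching jar is present, this picks the first in glob order, so it is worth checking the folder contents with ls first.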


We can verify the application and worker in the Spark Web UI at http://localhost:8080/
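
A quick reachability check from the host command line could look like the sketch below. It assumes curl is installed and that port 8080 is published as in the docker run command above; the function name `spark_ui_up` is hypothetical. It only confirms that the UI answers, not that the application or worker is listed there.

```shell
# Hypothetical helper: succeeds if the URL answers with a successful HTTP status.
spark_ui_up() {
  # -f: fail on HTTP error codes, -s: no progress output,
  # -S: still print errors, -o /dev/null: discard the page body
  curl -fsS -o /dev/null "${1:-http://localhost:8080/}"
}

# Usage:
#   spark_ui_up && echo "Spark Web UI is reachable"
```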

Debug Spark application using VS Code

Start the container with the project folder mounted as a volume

Batchfile
docker run -d --name dotnet-spark -p 8080:8080 -p 8081:8081 -v "C:\Projects\ML\mySparkApp\bin\Debug:/dotnet/Debug" -e SPARK_DEBUG_DISABLED=true 3rdman/dotnet-spark:latest
Windows will show a notification asking to share the Debug folder; you should accept it.

Unfortunately, this fails with an exception, because many additional dynamic ports are used that have no defined port mapping.

This can be resolved on Linux by using the host network:

Batchfile
docker run -d --name dotnet-spark --network host -v "$HOME/Projects/ML/mySparkApp/bin/Debug:/dotnet/Debug" -e SPARK_DEBUG_DISABLED=true 3rdman/dotnet-spark:latest

Warning

We might get a fix for Windows from Spark or Docker in the near future; until then, stick with Linux.