Spark links
- Working with Parquet
- Databricks tech talks
- Delta Lake - GitHub
- Delta Spark API
- Delta Lake intro
- Various install manuals - in Chinese
- How does Delta Lake work?
- Vivek's learning notes
- Basic PySpark ETL for function naming conventions
- Read JSON using PySpark
- Download Intraday Stock Data with IEX and Parquet
- Config Documentation from Qubole
- Tempo - financial services quick start
- Very nice GitBook from https://github.com/jaceklaskowski - works only in private browsing mode
- How Delta Lake Supercharges Data Lakes from tech-talk
- Making Apache Spark better with Delta Lake
Delta Lake sample projects
- Old PySpark practice notebooks
- Spark worker number performance
- Ifood Data
- Spark Definitive Guide
- Spark Definitive Guide - data
- Stock Data Processing in Delta Lake
- Spark Kafka
- Spark Delta
- Nice Docker Compose setup for Spark
- Traffic accident classification
- Nice custom DataHub Git repo with containers for many services
- Docker with Airflow + Postgres + Spark cluster + JDK (spark-submit support) + Jupyter Notebooks
- Databricks Tempo - value at risk