Hydronitrogen Tech Blog

Hamel Ajay Kothari writes about computers and stuff.

Apache Spark Shuffles Explained In Depth

I originally intended this to be a much longer post about memory in Spark, but I figured it would be more useful to cover shuffles on their own, so that I can brush over them in the memory discussion and keep things a bit more digestible. Shuffles are among the most memory- and network-intensive parts of most Spark jobs, so it's important to understand when they occur and what's going on when you're trying …

Posted in Spark on
