Hydronitrogen Tech Blog

Hamel Ajay Kothari writes about computers and stuff.



Articles in the free tag

Shuffle Free Joins in Spark SQL

As I mentioned in my previous post on shuffles, shuffles in Spark can be a source of huge slowdowns, but for a lot of operations, such as joins, they're necessary to do the computation. Or are they?

...

Yes, they are. But you can exercise more control over your queries and ensure that the shuffle occurs only once if you know you're going to perform the same shuffle/join over and over again. We'll briefly explore …



Posted in Spark on

