Cirrus Minor

Software architecture workshop (slides)

Published by Arnon Rotem-Gal-Oz on November 29, 2023

The title says it all – these are slides from a session I was working on to explain the basics of software architecture based on…

pandas on spark apply_batch/transform_batch broken? (tl;dr; No – but it isn’t well documented)

Published by Arnon Rotem-Gal-Oz on October 16, 2022

Using PySpark’s pandas integration via apply_batch and transform_batch is very powerful, but the lack of documentation can cause hard-to-trace bugs – hopefully my experience (below)…

Replacing Docker Desktop with hyperkit + minikube

Published by Arnon Rotem-Gal-Oz on September 2, 2021

Edit June 2023: Added a section on Colima. macOS is a Unix, but it isn’t a Linux, so, unfortunately, if/when we need to use linux-y…

Intro to Apache Spark (slides)

Published by Arnon Rotem-Gal-Oz on December 16, 2020

I gave a general overview of Apache Spark to our R&D teams. You can find the slides below.

Where is Apache Spark heading?

Published by Arnon Rotem-Gal-Oz on December 4, 2020

I watched (the COVID-19-era version of “attended”) the latest Spark Summit, and in one of the keynotes Reynold Xin from Databricks presented the following two images…
