Dataframe Performance Comparison - Pandas on Spark vs Pandas
I found this post about the new Pandas API on Spark intriguing, particularly the performance improvements, so I wrote a few simple tests to highlight them.
First Impressions of the M1 MacBook Pro
I will walk you through my thoughts on owning the new M1 MacBook Pro and what it took for me to get my development environment up and running on it.
BigQuery Integer Partitioning is in Beta
This post explains what integer range partitioning is and how to use it, then walks through a few scenarios that demonstrate its benefits.
Querying External Data with BigQuery
In this post I will walk through how to use BigQuery’s new ability to query Hive-partitioned Parquet files in GCS. It is a really cool feature.
The Power of BigQuery with Public Data
If you are looking for an easy way to query a public dataset, you should definitely check out BigQuery’s publicly available datasets.