In the mail
One Tech idea: Zero-ETL
One Small Action: 2023 Resolutions? Don't give up just yet
DevRetro2022: Delivering hope via my 2022 reflection
Read time: 2 minutes
Zero ETL
Data warehousing combines data from multiple sources and stores it in one centralized store.
ETL
Conventionally, this is achieved with ETL (extract, transform, load) systems: a combination of tools and code (Python/Spark/SQL) that retrieves and transforms the data. For large systems with terabytes of data, this might not be a great solution. The data is duplicated from its source and pushed into the warehouse, so the two copies must be kept in constant sync, and data quality must be ensured at all times.
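To make the sync problem concrete, here is a minimal sketch of a copy-based pipeline. It uses two in-memory SQLite databases as stand-ins for a source system and a warehouse; the table names (orders, orders_fact) and the run_etl helper are hypothetical, chosen only for illustration.

```python
import sqlite3

# Hypothetical source system: an operational database with an orders table.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER, amount_cents INTEGER)")
source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 1250), (2, 800)])
source.commit()

# Hypothetical warehouse: a separate database that holds a transformed copy.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE orders_fact (id INTEGER, amount_usd REAL)")


def run_etl():
    """Extract from the source, transform, and fully reload the warehouse."""
    rows = source.execute("SELECT id, amount_cents FROM orders").fetchall()
    transformed = [(order_id, cents / 100) for order_id, cents in rows]
    warehouse.execute("DELETE FROM orders_fact")
    warehouse.executemany("INSERT INTO orders_fact VALUES (?, ?)", transformed)
    warehouse.commit()


def warehouse_row_count():
    return warehouse.execute("SELECT COUNT(*) FROM orders_fact").fetchone()[0]


run_etl()

# A new order lands in the source after the last pipeline run.
source.execute("INSERT INTO orders VALUES (3, 4000)")
source.commit()

stale_count = warehouse_row_count()  # the copy does not know about order 3
run_etl()
fresh_count = warehouse_row_count()  # in sync again only after a re-run
```

Until the pipeline runs again, the warehouse copy lags behind the source. That gap is the sync and freshness cost of duplicating the data.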
Zero-ETL
Zero-ETL enables you to integrate data from different sources and run federated queries on top of them, without explicit ETL pipelines. This means the data can remain in its source but can still be accessed through a centralized warehouse, removing the storage and duplication problems. The data is also available in near real time, which sidesteps data freshness and sync issues.
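The federated-query idea can be sketched on a small scale with SQLite's ATTACH DATABASE, which lets one connection query several database files in place. This is only an analogy, not how Redshift, BigQuery, or Databricks implement it; the file names, table names, and the revenue_by_customer helper are assumptions made for the example.

```python
import os
import shutil
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
orders_path = os.path.join(workdir, "orders.db")  # hypothetical source 1
crm_path = os.path.join(workdir, "crm.db")        # hypothetical source 2

# Each source owns its own data; nothing is copied into a warehouse.
src = sqlite3.connect(orders_path)
src.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
src.executemany("INSERT INTO orders VALUES (?, ?)",
                [(1, 12.5), (1, 7.5), (2, 30.0)])
src.commit()
src.close()

src = sqlite3.connect(crm_path)
src.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
src.executemany("INSERT INTO customers VALUES (?, ?)",
                [(1, "Ada"), (2, "Grace")])
src.commit()
src.close()

# The "central" connection stores no rows itself; it only attaches the sources.
hub = sqlite3.connect(":memory:")
hub.execute("ATTACH DATABASE ? AS orders_src", (orders_path,))
hub.execute("ATTACH DATABASE ? AS crm_src", (crm_path,))


def revenue_by_customer():
    """Join across both sources in one query, reading the data where it lives."""
    return hub.execute(
        """
        SELECT c.name, SUM(o.amount)
        FROM crm_src.customers AS c
        JOIN orders_src.orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()


before = revenue_by_customer()

# A write at the source shows up in the next query, with no pipeline re-run.
writer = sqlite3.connect(orders_path)
writer.execute("INSERT INTO orders VALUES (?, ?)", (2, 5.0))
writer.commit()
writer.close()

after = revenue_by_customer()

hub.close()
shutil.rmtree(workdir)
```

Because the query reads the sources directly, there is no second copy to keep in sync. The trade-off is that any transformation has to happen at query time.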
Amazon recently announced ETL-free integration between Aurora and Redshift, and between Amazon Redshift and Apache Spark
Google's Zero-ETL approach with BigQuery
Databricks' external DB support using JDBC
One downside of Zero-ETL is that current implementations offer little support for transformation and compliance. But this is just the beginning of the Zero-ETL era. We will keep watch for more updates.
One Small Action
It's common to fall off our New Year goals after the first week of January. After all, we are all human, and our willpower is limited. To keep you going, here is a slight nudge. Fill in the following on a sticky note and put it where you can see it.
I will [action] every [frequency] for the next ____ weeks
My 2022 Reflection
2022 was a mixture of good and bad: uncertain times met with specific actions.