What is Reverse ETL?
If you already know what is ETL? but wondering what is Reverse ETL? then let me try to explain it here.
In the Reverse ETL process, a Data Warehouse is used as the source and the data is moved into other OLTP systems like CRMs or any Saas applications, etc.
Why is this called Reverse ETL?
Traditionally, an ETL process refers to extracting data from multiple OLTP Systems and transforming them, and loading them into a Data Warehouse. Once the data is in the Datawarehouse it was used only by the BI Tools to create various reports and dashboards. But the Reverse ETL is doing the opposite of it and is using Datawarehouse as the source and other OLTP systems as the target. Hence the name!
Why do we need Reverse ETL?
One common scenario that I can think of this — Assuming that your organization has already spent millions to build a decent stable data warehouse (DWH) that caters to the data needs of different departments. Now if you want to introduce a new application in the workflow like Salesforce CRM then it would be much easier to sync the application with the data which is already present in the DWH instead of pulling data from the actual OLTP source systems that may not have the point in time information.
What problems we may face when we use the Reverse ETL approach?
One of the common problems is the data refresh rate of the endpoint applications (target OLTP system in reverse ETL). You must always consider that not all Data warehouse tables are refreshed on a daily basis. There could be facts and dimensions that run weekly, monthly, quarterly, half-yearly and yearly, etc. There may be some ETL processes that run only once. Apart from all of these, there could be some bad design and a few tables might have been maintained manually. All of these dependencies could affect the data availability in the end systems.
It is very critical to analyze each dimension and fact table before using it in the Reverse ETL process.
Clearly analyze and visualize how the data would look like in the end system over the period of time.
Sometimes Reverse ETL approach may require changing the loading strategies of the Data warehouse tables and these changes must be done with caution and one must ensure that the change won’t affect existing BI Reports.
Final Thought
This approach and the hybrids of this has been implemented by many engineers in the past but only recently this has got a formal name to it. There are many tools that are being built with a focus on the Reverse ETL approach.
This approach when implemented correctly could potentially save time, effort, and money for the organizations.
I hope this post helped you a bit.
Let me know in the comment section below if you feel that this approach is something you are considering in your project and what are the pros and cons you think of?
Have a great day. Meet you in the next one. Bye!