That Blue Cloud

Lakehouse Design for Automotive with Fabric: 3 - Trusted Data Products

In this design series, we use Microsoft Fabric to design a corporate-wide Lakehouse and demonstrate its capabilities.


An automotive company's ecosystem is much larger than our defined universe, but the principle is the same on a bigger scale. Our goal is to make the data available on our platform, and we have a few ways to access and retrieve it:

  • Scheduled: Data transfers that run on a schedule (overnight, daily, weekly, etc.), either as a pull or as a push.
  • Event-Based: Data ingestion that runs when the data becomes available and an event is received.
  • Streaming: Near-real-time or real-time scenarios in which large amounts of data are constantly received from multiple sources.
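To make the three methods concrete, here is a minimal sketch of how a feed could be tagged with its ingestion method. The names `IngestionMode`, `SourceFeed`, and `describe_trigger`, as well as the example feed, are hypothetical and only for illustration; they are not part of Fabric.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class IngestionMode(Enum):
    SCHEDULED = "scheduled"      # overnight, daily, weekly; pull or push
    EVENT_BASED = "event_based"  # runs when an availability event arrives
    STREAMING = "streaming"      # near-real-time or real-time feeds


@dataclass(frozen=True)
class SourceFeed:
    name: str                       # hypothetical feed name
    mode: IngestionMode
    schedule: Optional[str] = None  # only meaningful for scheduled feeds


def describe_trigger(feed: SourceFeed) -> str:
    """Return a readable description of what starts ingestion for a feed."""
    if feed.mode is IngestionMode.SCHEDULED:
        return f"{feed.name}: runs on schedule '{feed.schedule}'"
    if feed.mode is IngestionMode.EVENT_BASED:
        return f"{feed.name}: runs when an availability event is received"
    return f"{feed.name}: consumes a continuous stream"
```

For example, `describe_trigger(SourceFeed("dealer_sales", IngestionMode.SCHEDULED, "daily"))` describes a daily scheduled feed.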

In our Trusted Data Products, we will package the data coming from each source system into its own Fabric Workspace, which includes the Bronze and Silver layers. This will allow us to reuse both layers in multiple Gold product scenarios and ensure the data is protected with proper access controls. Fabric's item-level security currently doesn't cover every scenario, so whilst creating these sources as Lakehouses in a single Workspace is possible, it would get crowded very quickly with Pipelines and Dataflows, and the access management would be a living hell. Until there's a better approach, splitting into multiple Workspaces is the better option.

When ingesting data, we will focus on getting the data into the Landing area and then loading it into the Bronze tables for safekeeping. The Landing layer is transient, and the data isn't kept there long. Bronze tables have the same structure as the data files received, but the data is stored in Delta format behind the scenes.
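The Landing-to-Bronze step can be sketched in plain Python. This is only an illustration under a few assumptions: the received files are CSV, a JSON-lines file stands in for the Bronze Delta table, and the function name `load_landing_to_bronze` is hypothetical. In Fabric, this step would more likely be a Pipeline or a Notebook writing to a Delta table.

```python
import csv
import json
from pathlib import Path


def load_landing_to_bronze(landing_dir: Path, bronze_file: Path) -> int:
    """Append every CSV row found in landing_dir to bronze_file, then clear Landing.

    bronze_file is a JSON-lines file standing in for a Bronze Delta table.
    Rows keep exactly the columns of the received file.
    Returns the number of rows appended.
    """
    appended = 0
    for landing_file in sorted(landing_dir.glob("*.csv")):
        with landing_file.open(newline="", encoding="utf-8") as src, \
                bronze_file.open("a", encoding="utf-8") as dst:
            for row in csv.DictReader(src):
                # No renaming or reshaping: Bronze mirrors the source structure.
                dst.write(json.dumps(row) + "\n")
                appended += 1
        # Landing is transient, so the file is removed once it is safely stored.
        landing_file.unlink()
    return appended
```

The file is deleted only after all of its rows have been written, so a failure part-way through leaves the Landing file in place for a retry.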

After the data is stored in the Bronze layer, we'll validate it, transform it, and store it in the Silver layer. The purpose is to keep the data modelled and normalised, allowing proper relational storage with easy discovery.
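As a rough sketch of what "validate, transform, and store" could look like, the function below splits flat Bronze rows into two related Silver tables. The column names (`vin`, `dealer_id`, `dealer_name`, `price`) and the function name `bronze_to_silver` are made up for illustration; the real rules depend on each source system.

```python
def bronze_to_silver(bronze_rows):
    """Validate and normalise flat Bronze rows into two related Silver tables."""
    dealers = {}   # dealer_id -> dealer record (one row per dealer)
    sales = []     # one row per sale, referencing the dealer by id
    rejected = []  # rows that failed validation, kept for inspection

    for row in bronze_rows:
        # Validate: required fields must exist and price must be numeric.
        try:
            vin = row["vin"].strip()
            dealer_id = row["dealer_id"].strip()
            price = float(row["price"])
        except (KeyError, ValueError, TypeError, AttributeError):
            rejected.append(row)
            continue
        if not vin or not dealer_id:
            rejected.append(row)
            continue

        # Normalise: store dealer attributes once, keyed by dealer_id.
        dealer_name = (row.get("dealer_name") or "").strip()
        dealers.setdefault(
            dealer_id, {"dealer_id": dealer_id, "dealer_name": dealer_name}
        )
        # Transform: typed price, and only a reference to the dealer.
        sales.append({"vin": vin, "dealer_id": dealer_id, "price": price})

    return list(dealers.values()), sales, rejected
```

Keeping the rejected rows instead of dropping them silently makes it easier to trace validation problems back to the Bronze data.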

Let's begin by going through each of these data products.

Harun Legoz


I’m a cloud solutions architect with a coffee obsession. I've been building apps and data platforms for over 18 years, and I also blog on Azure & Microsoft Fabric. Feel free to say hi on Twitter/X!
