what is staging, dimension and facts in data warehouse?

+1 vote
I am getting into ETL testing. I had a discussion with my team and they are talking about staging data, dimension and facts. What are those? Can you give me brief about the same? Also, it would be great if you can include some example of it.

Your answer

Your name to display (optional):
Privacy: Your email address will only be used for sending these notifications.
Anti-spam verification:
To avoid this verification in future, please log in or register.

2 Answers

answered by (151 points)
0 votes

Staging Data: 

A staging area, or landing zone, is an intermediate storage area used for data processing during the extract, transform and load (ETL) process. The data staging area sits between the data source(s) and the data target(s), which are often data warehouses, data marts, or other data repositories.

Above is the standard defination of Staging Data in Wikipedia. So, staging data is a set of DataBases and tables where raw data will be stored.

 Example, assume that you are working for a digital marketing company. They will tie up with various vendors to do digital marketing like, Facebook, Google and many more. Different vendors will provide marketing campaign data into different format, like one vendor will give data in form of CSV via FTP, another may expose API, other may give raw XMLs. So, you gather all data and first store it into one database. That is called staging data.


Dimension is the way you want the data to be fetched. Example, in marketing campaing, you may need report country wise, weekly, montly. So, your input to get reprot is your dimention.


Facts are actual data. Example in marketing campaing last week you had 2k views. So, last week is your dimension and 2k views is your fact.

commented by (151 points)
Thank you Rajeev. You can even write it as an answer.
answered by (408 points)
0 votes

The fact table mainly consists of business facts, measurements of a business process and foreign keys that refer to primary keys in the dimension tables. It holds the data we want to analyze.
For example, revenue generated in a software testing company is a fact and this data is stored in fact table.

A dimension table stores the attributes to narrate the objects in a fact table. It holds the necessary information to query the data from the facts table.
For example, user may want to fetch the revenue generated of the company for a particular time period so "Date" is a common dimension here.

Staging Area:
DWH Staging area is the location where the data from source system is temporarily copied. Staging area is required since the source systems are available for a specific time period to extract data.