Data Warehousing Interview Questions Part – 2
1.What is a level of Granularity of a fact table?
Level of granularity means level of detail that you put into the fact table in a data warehouse. For example: Based on design you can decide to put the sales data in each transaction. Now,level of granularity would mean what detail are you willing to put for each transactional fact. Product sales with respect to each minute or you want to aggregate it upto minute and put that data.
2.What are Data Marts?
Data Marts are designed to help manager make strategic decisions about their business. Data Marts are subset of the corporate-wide data that is of value to a specific group of users.
There are two types of Data Marts:
1.Independent data marts – sources from data captured form OLTP system,external providers or from data generated locally within a particular department or geographic area.
2.Dependent data mart – sources directly form enterprise data warehouses.
3.What is conformed fact?
Conformed dimensions are the dimensions which can be used across multiple Data Marts in combination with multiple facts tables accordingly.
4.What are the Different methods of loading Dimension tables?
Conventional Load: Before loading the data,all the Table constraints will be checked against the d ata. Direct load:(Faster Loading) All the Constraints will be disabled. Data will be loaded directly.Later the dat a will be checked against the table constraints and the bad data won’t be indexed.
5.What are lookup tables?
A lookup table is the table placed on the target table based upon the primary key of the target,it just updates the table by allowing only modified (new or updated) records based on the lookup condition.
6.What are conformed dimensions?
Conformed dimensions mean the exact same thing with every possible fact table to which they are joined. They are common to the cubes.
7.What are non-additive facts?
Non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table. However they are not considered as useless. If there are changes in dimensions the same facts can be useful.
8.What is factless facts table?
A fact table which does not contain numeric fact columns it is called factless facts table.
9.What is active data warehousing?
An active data warehouse provides information that enables decision-makers within an organization to manage customer relationships nimbly,efficiently and proactively.
10.What is a Cube and Linked Cube with reference to data warehouse?
Cubes are logical representation of multidimensional data. The edge of the cube contains dimension members and the body of the cube contains data values. The linking in cube ensures that the data in the cubes remain consistent.
11.What is junk dimension?
A number of very small dimensions might be lumped together to form a single dimension,a junk dimension – the attributes are not closely related. Grouping of Random flags and text attributes in a dimension and moving them to a separate sub dimension is known as junk dimension.
12.What is surrogate key?
Surrogate key is a substitution for the natural primary key. It is just a unique identifier or number for each row that can be used for the primary key to the table. The only requirement for a surrogate primary key is that it is unique for each row in the table. It is useful because the natural primary key can change and this makes updates more difficult. Surrogated keys are always integer or numeric.
13.What is degenerate dimension table?
If a table contains the values,which is neither dimension nor measures is called degenerate dimensions.
14.What is Difference between ER Modelling and Dimensional Modelling?
ER Modelling is used for normalizing the OLTP database design. Dimensional Modelling is used for de-normalizing the ROLAP/MOLAP design.
15.What is Business Intelligence (BI)?
Business Intelligence (BI) refers to technologies,applications and practices for the collection,integration,analysis,and presentation of business information and sometimes to the information itself. The purpose of business intelligence is to support better business decision making.
Thus,BI is also described as a decision support system (DSS). BI systems provide historical,current,and predictive views of business operations,most often using data that has been gathered into a data warehouse or a data mart and occasionally working from operational data.
16.What is hybrid slowly changing dimension?
Hybrid SCDs are combination of both SCD 1 and SCD
2. It may happen that in a table,some columns are important and we need to track changes for them i.e. capture the historical data for them whereas in some columns even if the data changes,we don’t care.
17.What is snapshot?
You can disconnect the report from the catalogue to which it is attached by saving the report with a snapshot of the data. However,you must reconnect to the catalogue if you want to refresh the data.
18.What are the methodologies of Data Warehousing?
Every company has methodology of their own. But to name a few SDLC Methodology,AIM methodology are stardadly used. Other methodologies are AMM,World class methodology and many more.
19.What is a general purpose scheduling tool?
The basic purpose of the scheduling tool in a DW Application is to stream line the flow of data from Source To Target at specific time or based on some condition.