ETL Testing Multiple Choice Questions
1 All data in a flat file is in this format.
A Sort
B ETL
C Format
D String
Ans: D
2 It is used to push data into a relational database table. This control will be the destination for most fact table data flows.
A Web Scraping
B Data inspection
C OLE DB Source
D OLE DB Destination
Ans: D
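For question 2: the OLE DB Destination is configured in the SSIS designer rather than in code, but the idea of pushing fact rows into a relational table can be sketched in plain Python. The table name, columns, and in-memory SQLite connection below are hypothetical stand-ins, not the actual control.

    import sqlite3

    # Hypothetical fact rows produced earlier in the data flow.
    fact_rows = [
        (1, 101, "2024-01-01", 250.00),
        (2, 102, "2024-01-02", 125.50),
    ]

    conn = sqlite3.connect(":memory:")  # stand-in for the destination database
    conn.execute(
        "CREATE TABLE fact_sales (sale_id INTEGER, product_id INTEGER, sale_date TEXT, amount REAL)"
    )

    # The destination step: push every row in the stream into the relational table.
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", fact_rows)
    conn.commit()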
3 Logical Data Maps
A These are used to identify which fields from which sources are going to which destinations. They allow the ETL developer to identify whether a data type change or aggregation is needed before coding of the ETL process begins.
B These can be used to flag an entire file set that is ready for processing by the ETL process. The file contains no meaningful data, but the fact that it exists is the key to the process.
C Data is pulled from multiple sources to be merged into one or more destinations.
D It is used to massage data in transit between the source and destination.
Ans: A
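For question 3, a logical data map can be kept as a simple source-to-destination field listing with notes on required type changes or aggregation. A minimal sketch, with hypothetical field names:

    # Each entry: source field -> (destination field, transformation note).
    logical_data_map = {
        "cust_nm":    ("customer_name", "trim whitespace"),
        "order_amt":  ("order_amount",  "cast VARCHAR to DECIMAL(10,2)"),
        "order_dt":   ("order_date",    "convert MM/DD/YYYY to DATE"),
        "line_items": ("item_count",    "aggregate: count per order"),
    }

    for src, (dest, note) in logical_data_map.items():
        print(f"{src} -> {dest}: {note}")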
4 Data access methods.
A Pull Method
B Push and Pull
C Load in Parallel
D Union all
Ans: B
5 OLTP
A Process to move data from a source to destination.
B Transactional database that is typically attached to an application. This source provides the benefit of known data types and standardized access methods. This system enforces data integrity.
C All data in a flat file is in this format.
D This control can be used to add columns to the stream or make modifications to data within the stream. Should be used for simple modifications.
Ans: B
6 COBOL
A Process to move data from a source to destination.
B The easiest to consume from the ETL standpoint.
C Two methods to ensure data integrity.
D Many routines of the Mainframe system are written in this.
Ans: D
7 What ETL Stands for
A Data inspection
B Transformation
C Extract, Transform, Load
D Data Flow
Ans: C
8 The source system initiates the data transfer for the ETL process. This method is uncommon in practice, as each system would have to move the data to the ETL process individually.
A Custom
B Automation
C Pull Method
D Push Method
Ans: D
9 Sentinel Files
A These are used to identify which fields from which sources are going to which destinations. They allow the ETL developer to identify whether a data type change or aggregation is needed before coding of the ETL process begins.
B These can be used to flag an entire file set that is ready for processing by the ETL process. The file contains no meaningful data, but the fact that it exists is the key to the process.
C ETL can be used to automate the movement of data between two locations. This standardizes the process so that the load is done the same way every run.
D This is used to create multiple streams within a data flow from a single stream. All records in the stream are sent down all paths. Typically uses a merge-join to recombine the streams later in the data flow.
Ans: B
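For question 9, a sentinel (flag) file carries no data of its own; its existence signals that the file set is complete. A minimal sketch, assuming hypothetical paths:

    from pathlib import Path

    landing_dir = Path("/data/landing")            # hypothetical landing folder
    sentinel = landing_dir / "batch_20240101.done" # hypothetical sentinel file

    # Only start the ETL run once the sentinel file exists.
    if sentinel.exists():
        print("File set is complete; start processing.")
    else:
        print("Sentinel not found; wait for the source system.")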
10 Checkpoints
A Similar to “break up processes”, checkpoints provide markers for what data has been processed in case an error occurs during the ETL process.
B Similar to XML’s structured text file.
C Many routines of the Mainframe system are written in this.
D It is used to import text files for ETL processing.
Ans: A
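For question 10, SSIS provides checkpoint files for restartability; the sketch below shows the same idea in plain Python with a hypothetical checkpoint file and step names:

    import json
    from pathlib import Path

    checkpoint_file = Path("etl_checkpoint.json")   # hypothetical path
    steps = ["extract", "transform", "load_dim", "load_fact"]

    # Load the list of steps completed by an earlier run, if any.
    done = json.loads(checkpoint_file.read_text()) if checkpoint_file.exists() else []

    for step in steps:
        if step in done:
            continue                                  # already processed before the failure
        print(f"running {step}")
        done.append(step)
        checkpoint_file.write_text(json.dumps(done))  # marker for restart after an error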
11 Mainframe systems use this. This requires a conversion to the more common ASCII format.
A ETL
B XML
C Sort
D EBCDIC
Ans: D
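For question 11, converting EBCDIC to ASCII is a code-page translation. A minimal Python sketch using the standard cp037 codec (one common EBCDIC code page; the actual code page depends on the source mainframe):

    # Bytes as they might arrive from a mainframe extract (EBCDIC, code page 037).
    ebcdic_bytes = "HELLO 123".encode("cp037")

    # Convert to the more common ASCII representation for downstream ETL steps.
    ascii_text = ebcdic_bytes.decode("cp037").encode("ascii").decode("ascii")
    print(ascii_text)   # -> HELLO 123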
12 Ultimate flexibility, unit testing is available, usually poor documentation.
A ETL
B Custom
C OLTP
D Sort
Ans: B
13 Conditional Split
A Many routines of the Mainframe system are written in this.
B Data is pulled from multiple sources to be merged into one or more destinations.
C It allows multiple streams to be created from a single stream. Only rows that match the criteria for a given path are sent down that path.
D This is used to create multiple streams within a data flow from a single stream. All records in the stream are sent down all paths. Typically uses a merge-join to recombine the streams later in the data flow.
Ans: C
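For question 13, the Conditional Split sends each row down only the path whose criteria it matches. A rough analogy in plain Python, with hypothetical criteria:

    rows = [
        {"order_id": 1, "country": "US", "amount": 90},
        {"order_id": 2, "country": "CA", "amount": 40},
        {"order_id": 3, "country": "US", "amount": 15},
    ]

    domestic, international = [], []

    # Each row goes down only the path whose criteria it matches.
    for row in rows:
        if row["country"] == "US":
            domestic.append(row)
        else:
            international.append(row)

    print(len(domestic), "domestic rows;", len(international), "international rows")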
14 Flat files
A The easiest to consume from the ETL standpoint.
B Three components of data flow.
C Three common usages of ETL.
D Two methods to ensure data integrity.
Ans: A
15 This is used to create multiple streams within a data flow from a single stream. All records in the stream are sent down all paths. Typically uses a merge-join to recombine the streams later in the data flow.
A OLTP
B Mainframe
C EBCDIC
D Multicast
Ans: D
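For question 15, a Multicast differs from a Conditional Split in that every row is sent down every path. A rough sketch with hypothetical downstream uses:

    import copy

    rows = [{"customer_id": 1, "amount": 90}, {"customer_id": 2, "amount": 40}]

    # Multicast: every record in the stream is sent down all paths.
    audit_stream   = copy.deepcopy(rows)   # e.g., written to an audit table
    archive_stream = copy.deepcopy(rows)   # e.g., written to an archive file
    load_stream    = copy.deepcopy(rows)   # e.g., loaded into the fact table

    print(len(audit_stream), len(archive_stream), len(load_stream))   # all equal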
16 There are few, if any, benefits to the ETL developer when accessing these types of systems, and many detriments. The ability to access these systems is very limited, and typically FTP of text files is used to facilitate access.
A Mainframe
B Union all
C File Name
D Multicast
Ans: A
17 Shows the path to the file to be imported.
A File Name
B Mainframe
C Format
D Union all
Ans: A
18 The wheel is already invented, documented, and well supported.
A Format
B COBOL
C Tool Suite
D Flat files
Ans: C
19 Similar to XML’s structured text file.
A Data Scrubbing
B EBCDIC
C String
D Web Scraping
Ans: D
20 Flat file control
A Three components of data flow.
B It is used to import text files for ETL processing.
C The easiest to consume from the ETL standpoint.
D Shows the path to the file to be imported.
Ans: B
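Questions 17, 20, 26, and 27 all describe flat file source settings: the file name (path), the Delimited format, and the column-names-in-first-row checkbox. A plain Python equivalent using the csv module, with a hypothetical file path and sample data:

    import csv

    # File Name: path to the text file to be imported (a hypothetical sample is written here).
    file_name = "customers.txt"
    with open(file_name, "w", newline="") as f:
        f.write("customer_id,customer_name\n1,Ann\n2,Bob\n")

    with open(file_name, newline="") as f:
        # Format = Delimited; column names are taken from the first data row.
        reader = csv.DictReader(f, delimiter=",")
        for row in reader:
            print(row)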
21 Two methods to ensure data integrity.
A Sources, Transformation, Destination
B Data inspection
C Row Count Inspection, Data Inspection
D Row Count Inspection
Ans: C
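For question 21, row count inspection compares source and destination counts, while data inspection spot-checks the values themselves. A minimal sketch against hypothetical SQLite tables:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE src (id INTEGER)")
    conn.execute("CREATE TABLE dst (id INTEGER)")
    conn.executemany("INSERT INTO src VALUES (?)", [(i,) for i in range(100)])
    conn.executemany("INSERT INTO dst VALUES (?)", [(i,) for i in range(100)])

    # Row count inspection: the destination should hold the same number of rows as the source.
    src_count = conn.execute("SELECT COUNT(*) FROM src").fetchone()[0]
    dst_count = conn.execute("SELECT COUNT(*) FROM dst").fetchone()[0]
    assert src_count == dst_count, f"row count mismatch: {src_count} vs {dst_count}"

    # Data inspection: spot-check that individual values survived the load intact.
    sample = conn.execute("SELECT id FROM dst ORDER BY id LIMIT 5").fetchall()
    print(sample)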
22 Transformation
A Data is pulled from multiple sources to be merged into one or more destinations.
B It is used to import text files for ETL processing.
C Process to move data from a source to destination.
D It is used to massage data in transit between the source and destination.
Ans: D
23 Three common usages of ETL.
A Data Scrubbing
B Sources, Transformation, Destination
C Merging Data
D Merging Data, Data Scrubbing, Automation
Ans: D
24 Load in Parallel
A A value of Delimited should be selected for delimited files.
B Data is pulled from multiple sources to be merged into one or more destinations.
C This will reduce the run time of the ETL process and reduce the window for hardware failure to affect the process.
D This should be checked if column names have been included in the first row of the file.
Ans: C
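For question 24, loading independent targets at the same time shortens the ETL window. A rough sketch with hypothetical load steps, using Python's standard thread pool:

    from concurrent.futures import ThreadPoolExecutor
    import time

    def load_table(name):
        # Placeholder for a real load step (bulk insert, file copy, etc.).
        time.sleep(1)
        return f"{name} loaded"

    tables = ["dim_customer", "dim_product", "fact_sales"]   # hypothetical targets

    # Independent loads run at the same time, shrinking the overall run window.
    with ThreadPoolExecutor(max_workers=3) as pool:
        for result in pool.map(load_table, tables):
            print(result)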
25 This can be computationally expensive, excluding SSDs.
A Hard Drive I/O
B Mainframe
C Tool Suite
D Data Scrubbing
Ans: A
26 A value of Delimited should be selected for delimited files.
A Sort
B Format
C String
D OLTP
Ans: B
27 This should be checked if column names have been included in the first row of the file.
A Row Count Inspection, Data Inspection
B Format of the Date
C Column names in the first data row checkbox
D Do most work in transformation phase
Ans: C
28. This control simulates the inner, left, and outer joins found in SQL and requires the Sort control prior to it. ? Sort
A. True
B. False
Ans: B
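The control described in question 28 is the Merge Join, not Sort: it simulates inner, left, and outer joins and requires sorted inputs. A rough analogy of a sorted inner merge join in plain Python, assuming unique hypothetical keys:

    left  = sorted([(1, "Ann"), (3, "Bob"), (5, "Cy")])           # (key, name)
    right = sorted([(1, "US"), (2, "CA"), (3, "UK"), (5, "FR")])  # (key, country)

    # Merge join: walk both sorted inputs once, emitting matches (inner join, unique keys).
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i][0] == right[j][0]:
            print(left[i][0], left[i][1], right[j][1])
            i += 1
            j += 1
        elif left[i][0] < right[j][0]:
            i += 1
        else:
            j += 1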
29. By running a sample test in production, a developer can ensure that the ETL process is correct and will not experience any unforeseen issues when migrated to production. ? Test in production
A. True
B. False
Ans: A
30. These operate in RAM, which is a limited resource. ? Excess Fields in data flow stream
A. True
B. False
Ans: A