Tuesday, 14 July 2015

ETL TESTING REAL TIME INTERVIEW QUESTIONS


ODS stands for Operational Data Store, which holds the data from when the business gets started. That is, it holds the history of data up to yesterday's data (depending on customer requirements). Some...
Yes, the ODS (Operational Data Store) contains near-real-time data (because we need to apply any changes on current data). The real-time data is first dumped into the ODS, also called the landing area; later the data moves into the staging area, which is where we do all the transformations.
It requires 2 steps:

1. Compare the record counts:
Select count(*) from source;
Select count(*) from target;

2. If the source and target tables have the same attributes and datatypes:

Select * from source
MINUS
Select * from target;

Else
We have to go for attribute-wise testing of each attribute according to the design doc.
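The two steps above can be sketched in a few lines. This is a minimal illustration using an in-memory SQLite database with made-up `source`/`target` tables; note that MINUS is Oracle syntax, and the ANSI/SQLite equivalent is EXCEPT.

```python
import sqlite3

# In-memory database standing in for the source and target systems
# (table names and rows are illustrative only).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE source (id INTEGER, name TEXT);
    CREATE TABLE target (id INTEGER, name TEXT);
    INSERT INTO source VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO target VALUES (1, 'Alice'), (2, 'Bob');
""")

# Step 1: count validation.
src_count = cur.execute("SELECT COUNT(*) FROM source").fetchone()[0]
tgt_count = cur.execute("SELECT COUNT(*) FROM target").fetchone()[0]
print(src_count, tgt_count)  # 3 2 -> counts differ, so the load is incomplete

# Step 2: row-level comparison (Oracle MINUS == ANSI/SQLite EXCEPT).
missing = cur.execute(
    "SELECT * FROM source EXCEPT SELECT * FROM target"
).fetchall()
print(missing)  # [(3, 'Carol')] -> rows present in source but missing in target
```

A symmetric `target EXCEPT source` query would catch extra or corrupted rows on the target side as well.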
As other posts have mentioned, I would do some of the following: Select COLUMN, count(*) from TABLE group by COLUMN order by COLUMN; Select min(COLUMN), max(COLUMN) from TABL...
Asked By: phani.nekkkalapudi | Asked On: Nov 28th, 2011
Answered by: Uttam22 on: Feb 20th, 2014
An ETL tester primarily tests source data extraction, business transformation logic, and target table loading. There are many tasks involved in doing the same, which are given below: 1. Stage tab...
Answered by: radhakrishna on: Nov 4th, 2013
ETL tester responsibilities are: writing SQL queries for various scenarios like count test, primary key test, duplicate test, attribute test, default check, technical data quality, business data qua...
Asked By: kishore.vakalapudi | Asked On: Aug 3rd, 2010
In ETL testing, if the source is a flat file, then how will you verify the data count validation?
Answered by: JeffManson on: Feb 13th, 2014
Not always (in my experience, quite rarely in fact). Most often the flat file is just that, a flat file. In UNIX, the wc command is great; in Windows, one could open the file in Notepad and CTRL+...
Answered by: Radhakrishna on: Nov 13th, 2013
To find the count in a flat file, you can import that file into an Excel sheet (use Data -> Import External Data -> Import Data) and find the count of records using COUNT() (go to Insert -> Function -> COUNT) in Excel...
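Scripting the count is usually more repeatable than Excel for large files. A minimal sketch, assuming a pipe-delimited file with a single header row (the file name and layout are hypothetical):

```python
from pathlib import Path

# Create a small sample flat file: a header row plus three data records.
sample = "id|name\n1|Alice\n2|Bob\n3|Carol\n"
path = Path("customers.dat")
path.write_text(sample)

# Count data records, skipping blank lines and excluding the header
# (equivalent to `wc -l file` minus 1 on UNIX).
with path.open() as f:
    record_count = sum(1 for line in f if line.strip()) - 1

print(record_count)  # 3 -> compare this against SELECT COUNT(*) on the target
path.unlink()  # clean up the sample file
```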
Asked By: Interview Candidate | Asked On: Jun 5th, 2005
Answered by: Mark Rutherford on: Jun 17th, 2013
9.5.1
Answered by: Kumar on: Nov 17th, 2012
9.5
Asked By: sandydil | Asked On: Mar 7th, 2013
1. To find faults in software. 2. To verify that software has no defects. 3. To find performance problems. 4. To give confidence in software. Select the best option.
Asked By: perlamohan | Asked On: May 28th, 2008
Can anyone let me know where I can find the list of ETL & reporting tools available in the market? Thanks in advance.
Answered by: Michael on: Feb 27th, 2013
Add InetSoft's Style Report to the list of BI reporting tools.
Answered by: sajid on: Feb 22nd, 2013
Hi, I would like to share my knowledge on this. ETL tools: Ab Initio, BusinessObjects Data Integrator, IBM InfoSphere DataStage, Informatica, Oracle Warehouse Builder, SQL Server In...
Asked By: phani.nekkkalapudi | Asked On: Nov 28th, 2011
As an ETL tester, what are the things to test in the Informatica tool? What types of testing have to be done in Informatica, and what are the things to test in each type? Is there any documentation on ETL testing?
Answered by: lakshmi on: Jan 31st, 2013
ETL testing in Informatica: 1. First check that the workflow exists in the specified folder. 2. Run the workflow. If the workflow succeeds, then check that the target table is loaded with proper data; else we nee...
Answered by: Lokesh M on: Feb 18th, 2012
- Test ETL software. - Test ETL data warehouse components. - Execute backend data-driven tests. - Create, design and execute test plans, test harnesses and test cases. - Identify, troubleshoot and pro...
Asked By: sindhugowda | Asked On: Jan 23rd, 2013
Answered by: jatin on: Feb 7th, 2013
Data-Centric Testing: Data-centric testing revolves around testing quality of the data. The objective of the data-centric testing is to ensure valid and correct data is in the system. Following are th...
If a lookup (LKP) on the target table is taken, can we update the rows without an Update Strategy transformation?
Yes, by using dynamic lookup
The Update Strategy transformation determines whether to insert, update, delete or reject a record for the target. We can bypass the Update Strategy transformation by creating a Router to divide rows based on insert, update, etc. and connecting them to one of multiple instances of the target. In the session, for that target instance, we can check the appropriate box to mark records for insert, update or delete.

Even if you use a dynamic lookup, you should make use of an Update Strategy to mark the records either to insert or update in the target using the lookup cache (the lookup will be cached on the target).

Your requirement is not so clear here: whether to use a dynamic lookup or session properties.

Note: when you create a mapping with a Lookup transformation that uses a dynamic lookup cache, you must use Update Strategy transformations to flag the rows for the target tables.
Asked By: riyazz.shaik | Asked On: Sep 17th, 2008
Why is it necessary to clean data before loading it into the warehouse
Answered by: Prashant Khare on: Oct 3rd, 2012
Data cleansing is the process of detecting and correcting corrupt and inaccurate data in a table or database.
The following steps are used:
1) Data auditing
2) Workflow Specification
3) Workflow Execution
4) Post-processing and controlling
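The auditing and correction steps above can be sketched as a minimal cleansing pass. The record layout, field names and rules below are illustrative assumptions, not from any specific ETL tool:

```python
# Hypothetical raw customer records straight from a source extract.
raw = [
    {"id": "1", "name": "  Alice ", "country": "usa"},
    {"id": "2", "name": "Bob", "country": "US"},
    {"id": "2", "name": "Bob", "country": "US"},   # duplicate row
    {"id": "x", "name": "", "country": "GB"},      # invalid id, missing name
]

COUNTRY_MAP = {"usa": "US", "us": "US", "gb": "GB"}  # standardisation rules

clean, rejected, seen = [], [], set()
for row in raw:
    # Audit: rows failing basic validity rules are set aside for review.
    if not row["id"].isdigit() or not row["name"].strip():
        rejected.append(row)
        continue
    # Correct: trim whitespace and standardise country codes.
    fixed = {
        "id": int(row["id"]),
        "name": row["name"].strip(),
        "country": COUNTRY_MAP.get(row["country"].lower(), row["country"]),
    }
    # De-duplicate on the business key.
    if fixed["id"] in seen:
        continue
    seen.add(fixed["id"])
    clean.append(fixed)

print(len(clean), len(rejected))  # 2 1
```

Post-processing would then report the rejected rows back to the data owners rather than silently dropping them.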
Asked By: Mahendra | Asked On: Nov 16th, 2011
If we are using a flat file in our loading, and the flat file name changes daily, how do we handle this without changing the file name manually every day? For example, the file name changes depending on the date, so what should I do? Please help…
Answered by: Charu on: Jan 24th, 2012
Use the indirect filelist option in the Informatica session. Say the filelist name is daily_file.txt; put a shell script daily_filename.sh in the pre-session command. The content of daily_filename.sh is ...
Answered by: SHIV on: Dec 28th, 2011
You can use the Informatica File List option in the target to get dynamic file names, along with the Transaction Control transformation, so that you can create dynamic filenames based on some transaction properties...
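The pre-session trick described above boils down to regenerating a fixed-name filelist each day so the session definition never changes. A minimal sketch; the filelist name and the date-stamped naming pattern are hypothetical examples:

```python
from datetime import date
from pathlib import Path

def write_filelist(filelist="daily_file.txt", pattern="sales_{d}.csv"):
    """Write today's date-stamped source file name into the indirect filelist.

    The session always reads the same filelist (daily_file.txt); only its
    contents change from day to day.
    """
    todays_file = pattern.format(d=date.today().strftime("%Y%m%d"))
    Path(filelist).write_text(todays_file + "\n")
    return todays_file

name = write_filelist()
print(name)  # e.g. sales_20240101.csv, depending on the current date
```

In practice this would run as the pre-session command (the same logic the shell-script answer describes), with the real file-naming convention substituted in.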
Asked By: Interview Candidate | Asked On: Sep 9th, 2005
Answered by: SHIV on: Dec 28th, 2011
The main difference between connected and unconnected lookup is that we can call an unconnected lookup based on some condition, but not a connected lookup.
Answered by: SreedharLokaray on: Nov 2nd, 2011
A lookup can be used as connected or unconnected. Apart from caching and receiving input values from the pipeline, there is one more difference: if you want to use the same lookup more than once in a mapping ...
Asked By: arjunsarathy | Asked On: Jan 2nd, 2006
Answered by: Doyel Ghosh Dutta on: Sep 16th, 2011
I think in MicroStrategy a view can be a reusable object.
Asked By: debashish29 | Asked On: Sep 9th, 2011
How do you check the existence of a lookup file in a graph? The requirement is: if the lookup file is present, then a search will be carried out in it; else a default value will be set. Please note we need to check the existence of the lookup file at graph level only.

Asked By: Interview Candidate | Asked On: Aug 29th, 2005
Answered by: koteshchowdary on: Aug 26th, 2011
Simply put:

worklet: a set of sessions

mapplet: a set of transformations that can be called within a mapping
Answered by: Naveen on: Aug 22nd, 2011
Mapping: a set of transformations, moving data from source to target along with those transformations.

Session: a set of instructions in which you specify the source and target connections.

Worklet: a set of sessions.
Asked By: YugandharS | Asked On: Jul 27th, 2010
What are the transformations (t/rs) which create cache files?
Answered by: deepthi on: Aug 20th, 2011
Aggregator
joiner
lookup
rank
sorter
Answered by: shiva on: Jul 22nd, 2011
Actually the answer is only partially related to the question.

Transformations like Joiner, Lookup, Aggregator and Rank use caches.
Cache files are mainly used in Lookup transformations. In Informatica there are 2 types: static cache and dynamic cache. Both are used by a connected lookup; an unconnected lookup uses only a static cache. If a lookup SQL override is used, it uses an index cache and a data cache. Index cache: stores the key columns (i.e., those on which an index is present). Data cache: stores the output values. Sufficient memory allocation for these 2 files is an important aspect of lookup optimization. If the memory allocation is Auto, make sure that Maximum Memory Allowed For Auto Memory Attributes and Maximum Percentage of Total Memory Allowed For Auto Memory Attributes are defined properly.
Asked By: Interview Candidate | Asked On: May 27th, 2005
Answer posted by staline on 2005-05-27: You can use a Command task to call the shell scripts, in the following ways: 1. Standalone Command task. You can use a Command task anywhere in the workflow or worklet to run shell commands. 2. Pre- and post-session shell command. You can call...
Answered by: praveenathota on: Nov 12th, 2006
Hi, there are two ways to do this; they are as follows: 1) We can use a Command task anywhere in the workflow or worklet to run the shell commands. 2) In the Session task, we can call a reusable command ...
1.
What is Data warehouse?

Bill Inmon, known as the father of data warehousing, defined it in 1980: "A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management's decision-making process."
§  Subject-oriented: the data addresses a specific subject, such as sales or inventory.
§  Integrated: the data is obtained from a variety of sources.
§  Time-variant: the data is stored in such a way that changes over time are preserved and can be analyzed.
§  Non-volatile: data is never removed, i.e., historical data is also kept.

2.
What is the difference between database and data warehouse?

A database is a collection of related data.
A data warehouse is also a collection of information as well as a supporting system.




A database is a collection of related data, whereas a data warehouse stores historical data; business users make their decisions based on that historical data.



In a database we maintain only current data (typically not more than 3 years), but in a data warehouse we maintain historical data, meaning from the starting day of the enterprise. DML commands (insert, update, delete) can be run in a database; once data is loaded into the data warehouse, we do not run such DML operations on it.



A database is used for insert, update and delete operations, whereas a data warehouse is used for select, to analyse the data.




Database | Data Warehouse
Tables and joins are complex since they are normalized | Tables and joins are simple since they are de-normalized
ER modeling techniques are used for design | Dimensional modeling techniques are used for design
Optimized for write operations | Optimized for read operations
Performance is slow for analytical queries | High performance for analytical queries



A database uses the OLTP concept; a data warehouse uses the OLAP concept, meaning the data warehouse stores historical data.



A database is a collection of related data. A data warehouse, on the other hand, is a collection of data integrated from different sources and stored in one container for making managerial decisions.



In a database we use CRUD operations, meaning create, read, update, delete; but in a data warehouse we mainly use the select operation.
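The write-versus-read contrast in the answers above can be shown in a few lines. This sketch uses an in-memory SQLite database with hypothetical table names: an OLTP-style `orders` table that is kept current with DML, and a warehouse-style `fact_sales` table that accumulates history and is only queried:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# OLTP-style table: holds the current state, maintained with DML.
cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
cur.execute("INSERT INTO orders VALUES (1, 100.0)")
cur.execute("UPDATE orders SET amount = 120.0 WHERE id = 1")  # typical OLTP write

# Warehouse-style fact table: history is appended once, then only read.
cur.execute("CREATE TABLE fact_sales (year INTEGER, amount REAL)")
cur.executemany("INSERT INTO fact_sales VALUES (?, ?)",
                [(2013, 500.0), (2014, 700.0), (2014, 300.0)])

# Analytical read: aggregate across history instead of updating rows in place.
totals = cur.execute(
    "SELECT year, SUM(amount) FROM fact_sales GROUP BY year ORDER BY year"
).fetchall()
print(totals)  # [(2013, 500.0), (2014, 1000.0)]
```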

3.
What are the benefits of data warehousing?

§  Historical information for comparative and competitive analysis.
§  Enhanced data quality and completeness.
§  Supplementing disaster recovery plans with another data back up source.
4.
What are the types of data warehouse?

There are mainly three types of data warehouse:
§  Enterprise Data Warehouse
§  Operational data store
§  Data Mart
5.
What is the difference between data mining and data warehousing?

In data mining, the operational data is analyzed using statistical and clustering techniques to find hidden patterns and trends. So, data mining does some kind of summarization of the data, which can be used by data warehouses for faster analytical processing for business intelligence.
A data warehouse may make use of data mining for faster analytical processing of the data.




Data mining is one of the key concepts for analyzing data in a data warehouse.
Generally, basic testing concepts remain the same across all domains, so the basic testing questions will also remain the same. The only addition would be some questions on the domain; e.g., in the case of ETL testing interview questions, it would be some concepts of ETL, how-tos on some specific types of checks/tests in SQL, and some set of best practices. Here is a list of some ETL testing interview questions:
  
Q. 1) What is ETL? 
Ans. ETL stands for extract, transform, and load: extracting data from outside source systems, transforming raw data to make it fit for use by different departments, and loading the transformed data into target systems like a data mart or data warehouse. 
  
Q. 2) Why is ETL testing required? 
Ans. 
• To verify the correctness of data transformation against the signed off business requirements and rules. 
• To verify that expected data is loaded into data mart or data warehouse without loss of any data. 
• To validate the accuracy of reconciliation reports (if any; e.g., comparing transactions made via a bank ATM: ATM report vs. bank account report). 
• To make sure the complete process meets performance and scalability requirements 
• Data security is also sometimes part of ETL testing 
• To evaluate the reporting efficiency 
  
Q 3) What is Data warehouse? 
Ans. Data warehouse is a database used for reporting and data analysis. 
  
Q4) What are the characteristics of a Data Warehouse? 
Ans. Subject Oriented, Integrated, Time-variant and Non-volatile 
  
Q5) What is the difference between Data Mining and Data Warehousing? 
Ans. Data mining: analyzing data from different perspectives and condensing it into useful decision-making information. It can be used to increase revenue, cut costs, increase productivity, or improve any business process. There are lots of tools available in the market for various industries to do data mining. Basically, it is all about finding correlations or patterns in large relational databases. 
Data warehousing comes before data mining. It is the process of compiling and organizing data into one database from various source systems, whereas data mining is the process of extracting meaningful data from that database (the data warehouse). 
  
Q6. What are the main stages of Business Intelligence? 
Ans. Data Sourcing –> Data Analysis –> Situation Awareness –> Risk Assessment –> Decision Support 
  
Q7. What tools you have used for ETL testing? 
Ans. 
1. Data access tools e.g., TOAD, WinSQL, AQT etc. (used to analyze content of tables)
2. ETL Tools e.g. Informatica, DataStage
3. Test management tool e.g. Test Director, Quality Center etc. ( to maintain requirements, test cases, defects and traceability matrix)
  
Below are a few more questions that can be asked: 
  
Q8. What is Data Mart? 
Q9. Data Warehouse Testing vs Database Testing 
Q10. Who are the participants of data warehouse testing 
Q11. How to prepare test cases for ETL / Data Warehousing testing? 
Q12. What is OLTP and OLAP 
Q13. What is look up table 
Q14. What is MDM (Master data management) 
Q15. Give some examples of real time data warehousing 
  
Benefits of ETL Testing
• Production Reconciliation
• IT Developer Productivity
• Data Integrity

Certifications in Data warehousing:
There are a few certifications that help professionals show their experience and skill set to employers. Several credentials are granted to candidates by institutions; these include CBIP (Certified Business Intelligence Professional), IBM Cognos 10 BI DWH Developer, etc. The data warehousing training to obtain the CBIP certification is provided through the TDWI institute.
Thus, the information stated above shows the importance of ETL and data warehousing training. Lists of institutes that offer such training programs can be found on the Internet.