Ab Initio Interview Questions and Answers

1. What is Ab Initio?

Ab Initio is an ETL (Extract, Transform, and Load) tool used for data analysis, data manipulation, and batch processing. The name is sometimes written “Abinitio”, but the correct form is “Ab Initio”, a Latin phrase meaning “from the beginning”. It is a parallel processing tool with a graphical user interface.

2. What are the essential GDE components in Ab Initio?

The essential components include input file, output file, input table, output table, join with table, reformat, dedup sorted, sort, rollup, log, error, filter by expression, reject, and so on, chosen according to the logic and requirements.

3. How do you run a graph infinitely?

A graph is generally saved with the .mp extension. To run the graph infinitely, its end script should call the graph's .ksh file. The .ksh file is the script form of the graph: when invoked, it sets up the project parameters and runs the graph, so each run ends by starting the next one.

4. What are the file extensions used in Ab Initio?

  • .mp: stores an Ab Initio graph
  • .ksh: project script files
  • .mdc: stores a dataset or a custom dataset component
  • .dml: stores Data Manipulation Language files, such as record type definitions
  • .xfr: stores transform functions. The “include” keyword lets you reuse a transform function in any other graph.
  • .dat: stores data files, either multifiles or serial files
  • .bin: stores executable files
  • .pset: stores parameter set files (sets of input values)

5. Where is the output from the log port and error port captured?

Output from the log port of a component is written to the path given by the parameter $AI_SERIAL_LOG or $AI_SERIAL_TEMP. When the scripts are deployed, the log written by the Ab Initio environment is captured in the path given by the parameter $AI_ADMIN_HOME.

Similarly, output from the error port of a component is written to the path given by the parameter $AI_SERIAL_ERROR. When the scripts are deployed, the error log written by the Ab Initio environment is captured in the path given by the parameter $AI_ADMIN_ERROR.

6. What is EME and how do you connect to it?

EME stands for Enterprise Meta>Environment. It is a central repository that analyzes a project and tracks how data is transferred and transformed from one component to another and from one field to another, within and between graphs. You can connect to the EME in the following ways:

  • Set AB_AIR_ROOT
  • Via GDE, you can connect to the EME data store.
  • Via air command
  • Login to EME web interface – http://serverhost:[serverport]/abinitio

7. What are the main project data directories in Ab Initio?

  • Admin directory: contains graph logs, graph errors, and other supporting files.
  • mfs directory: contains the MFS control files. These are generally the “data area” locations that hold the actual data partitions of the multifile system, so the space required for these directories is often large.
  • Serial directory: contains serial data files, including component error and log files.

8. Difference between phase and checkpoint

9. What is the roll-up component and how is it used?

The roll-up component summarizes groups of records using aggregate functions such as sum and count. The grouping is controlled by the key specifier: the field (or fields) on which the summarization is required must be provided there.

Interviewers might also ask you for an example of where you used roll-up in your project.
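To make the idea concrete, here is a minimal Python sketch of what a roll-up computes. It is not Ab Initio DML, and the field names `customer_id` and `amount` are made up for illustration. Records are grouped on the key, and each group is reduced to a single summary record:

```python
from collections import defaultdict

def rollup(records, key, value):
    """Summarize records per key: one output record per distinct key value."""
    totals = defaultdict(lambda: {"count": 0, "sum": 0})
    for rec in records:
        group = totals[rec[key]]
        group["count"] += 1
        group["sum"] += rec[value]
    # Emit one summary record per key, in first-seen key order.
    return [{key: k, "count": g["count"], "sum": g["sum"]} for k, g in totals.items()]

# Hypothetical input records.
orders = [
    {"customer_id": "A", "amount": 10},
    {"customer_id": "A", "amount": 5},
    {"customer_id": "B", "amount": 7},
]
summary = rollup(orders, key="customer_id", value="amount")
print(summary)
```

Three input records collapse into two summary records, one per `customer_id`.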

10. What is a sandbox in Ab Initio?

A sandbox is a private work area, generally called the workspace, for a project. It consists of subdirectories, graphs, and related files.

11. How do you run the graph in Ab Initio?

We can run a graph directly from the Ab Initio GDE, or use air sandbox run with “<graph name>.mp”, “<graph name>.ksh”, or the parameter set (pset). The status of the graph can be checked in the end script using the variable $mpjret.

12. What is the scan component in Ab Initio?

The scan component generates a cumulative summary of its input records. It produces one output record for every input record, carrying the running result for that record's key group so far.

13. What is the difference between rollup and scan?

Generally, rollup generates one summary record for each group of input records, whereas scan generates a cumulative summary record for every input record. Scan is therefore used for producing intermediate summary records, which rollup cannot do.
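The difference is easiest to see side by side. The following Python sketch is an illustration only, not Ab Initio code. It assumes the input is already sorted on the key, and the field names are hypothetical:

```python
from itertools import groupby
from operator import itemgetter

def rollup(records, key, value):
    """One output record per key group: the final total."""
    return [
        {key: k, "sum": sum(rec[value] for rec in group)}
        for k, group in groupby(records, key=itemgetter(key))
    ]

def scan(records, key, value):
    """One output record per input record: the running total within its key group."""
    out = []
    for k, group in groupby(records, key=itemgetter(key)):
        running = 0
        for rec in group:
            running += rec[value]
            out.append({key: k, value: rec[value], "running_sum": running})
    return out

# Hypothetical input, already sorted on customer_id.
orders = [
    {"customer_id": "A", "amount": 10},
    {"customer_id": "A", "amount": 5},
    {"customer_id": "B", "amount": 7},
]
rolled = rollup(orders, "customer_id", "amount")
scanned = scan(orders, "customer_id", "amount")
print(len(rolled), len(scanned))
```

Here `rolled` has one record per customer, while `scanned` has as many records as the input, and the last scanned record of each group carries the same total that rollup reports.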

14. What are the main components of Ab Initio?

The components are:

  • GDE: the Graphical Development Environment, where graphs are developed by dragging and dropping components.
  • EME: the Enterprise Meta>Environment, which is basically a repository used for metadata management.
  • Conduct>It: provides scheduling for graphs, covering job automation and monitoring for the Ab Initio application. It is mainly used for creating plans.
  • Co>Operating System: a program provided by Ab Initio that runs on top of the operating system and acts as the base for all Ab Initio processes.
  • Component Library: holds all the components.

Other Basic Ab Initio Questions

15. What are air commands in Ab Initio?

Air commands in Ab Initio are used to work with EME (Ab Initio repository) from the command line.

16. What are m commands in Ab Initio?

m commands are Co>Operating System shell-level utility commands. Some basic Ab Initio m commands include:

  • m_ls
  • m_dump
  • m_db test
  • m_rollback
  • m_mkdir
  • m_cat
  • m_mp
  • m_mkfs
  • m_touch
  • m_mv
  • m_cp
  • m_rm
  • m_compress
  • m_uncompress

17. What is pset?

A pset is a parameter set file in Ab Initio. It refers to the .mp file internally, so we can run the graph using air sandbox run “name of .pset file”, with the pset passing in all the input parameters.

18. What is a wrapper script?

A wrapper script, also called the main script or master script, is used to run graphs through an automated process. It is helpful when Unix is used as the operating system.
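In practice a wrapper is usually a shell script, but the control flow is the same in any language. The sketch below uses Python instead of ksh, purely for illustration: it runs a list of commands in order and stops at the first failure. The step names are hypothetical, and the commands are stand-ins so that the sketch runs anywhere; a real wrapper would invoke the deployed graph scripts.

```python
import subprocess
import sys

def run_steps(steps):
    """Run each command in order; stop at the first non-zero exit code.

    Returns a list of (name, return_code) for the steps that were run.
    """
    results = []
    for name, cmd in steps:
        rc = subprocess.run(cmd).returncode
        results.append((name, rc))
        if rc != 0:
            break  # do not run later steps after a failure
    return results

# Stand-in commands (assumption): each one just exits with a fixed code.
ok = [sys.executable, "-c", "raise SystemExit(0)"]
fail = [sys.executable, "-c", "raise SystemExit(3)"]

results = run_steps([("extract", ok), ("load", fail), ("report", ok)])
print(results)
```

Because the "load" step fails, the "report" step is never started, which is the usual behaviour wanted from a master script.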

19. What is the replicate component, and why and where is it used?

The replicate component combines the data records from its input into a single flow and writes a copy of that flow to every output flow. Use it when you need to perform more than one operation on the same flow of input records.
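As a rough Python analogy (not Ab Initio code; the record fields are hypothetical), replicating a flow means handing an identical copy of the records to each downstream branch so that the branches can process them independently:

```python
import copy

def replicate(records, n_outputs):
    """Return n_outputs independent copies of the input flow."""
    records = list(records)
    return [copy.deepcopy(records) for _ in range(n_outputs)]

flow = [{"id": 1, "amount": 10}, {"id": 2, "amount": 5}]
to_report, to_archive = replicate(flow, 2)

total = sum(r["amount"] for r in to_report)  # one branch aggregates
ids = [r["id"] for r in to_archive]          # another branch extracts ids
print(total, ids)
```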

20. What is normalize component?

The normalize component generates multiple output records from each single input record, typically one per element of an array (vector) field. The number of output records is given by the length function. For example, if length = 15, then normalize generates 15 output records for each input record. Length is not the only transform function; there are others as well.
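The behaviour can be sketched in Python as a pair of functions, one deciding how many outputs an input record produces and one building each output. This is an illustration under assumptions, not Ab Initio DML: the record layout (`customer_id`, `phones`) is made up, and the driver loop `run_normalize` is a hypothetical stand-in for the component itself.

```python
def length(rec):
    """How many output records to produce for this input record."""
    return len(rec["phones"])

def normalize(rec, index):
    """Build the output record for the given 0-based index."""
    return {"customer_id": rec["customer_id"], "phone": rec["phones"][index]}

def run_normalize(records):
    """Call normalize once per index, length(rec) times for each input record."""
    out = []
    for rec in records:
        for i in range(length(rec)):
            out.append(normalize(rec, i))
    return out

customers = [
    {"customer_id": "A", "phones": ["111", "222"]},
    {"customer_id": "B", "phones": ["333"]},
    {"customer_id": "C", "phones": []},
]
flat = run_normalize(customers)
print(flat)
```

Customer A yields two output records, B yields one, and C yields none because its length is zero.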

21. What is metadata?

Metadata is data that provides information about other data. The Metadata Hub in Ab Initio is a data management platform.

22. What is reformat?

The reformat component changes the record format of the data, for example by dropping fields or concatenating fields, and it is also used to transform the data in the records.
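A minimal Python sketch of a reformat-style transform follows. It is not Ab Initio DML, and the field names are hypothetical. It is applied to one record at a time: it drops one field, concatenates two, and transforms a third.

```python
def reformat(rec):
    """Map one input record to one output record with a different layout."""
    return {
        "id": rec["id"],
        # Concatenate two input fields into one output field.
        "full_name": f'{rec["first_name"]} {rec["last_name"]}',
        # Transform a value: amount in currency units to whole cents.
        "amount_cents": round(rec["amount"] * 100),
        # "internal_flag" is not copied, so it is dropped from the output.
    }

row = {"id": 7, "first_name": "Ada", "last_name": "Lovelace",
       "amount": 12.5, "internal_flag": "x"}
out = reformat(row)
print(out)
```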

23. What is a dbc file and how can you test it?

A .dbc file stores database-related information. It allows the Ab Initio GDE to connect to a database, given configuration details such as DB name, DB home, DB version, DB instance, DB host, user name, and password. A dbc file can be tested using the command “m_db test <name of dbc file>”, which checks the DB connection.

24. What is the dedup sorted component?

Dedup sorted is used to eliminate duplicate records. If the select parameter is given, the component applies that expression to the records first; otherwise it processes all the records. Dedup sorted has a dup port, which is optional: if it is not connected, the duplicate records are discarded.
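The following Python sketch imitates that behaviour on input that is already sorted on the key. It is an illustration, not Ab Initio code. It assumes that the first record of each key group is the one kept, and the field names are hypothetical. The function returns two lists, standing in for the out port and the dup port.

```python
from itertools import groupby
from operator import itemgetter

def dedup_sorted(records, key, select=None):
    """Return (out, dup) for input already sorted on key.

    out: the first record of each key group (assumption: keep first).
    dup: the remaining records of each group.
    """
    if select is not None:
        # Only records passing the select expression are considered at all.
        records = [r for r in records if select(r)]
    out, dup = [], []
    for _, group in groupby(records, key=itemgetter(key)):
        first, *rest = group
        out.append(first)
        dup.extend(rest)
    return out, dup

rows = [
    {"id": 1, "v": "a"},
    {"id": 1, "v": "b"},
    {"id": 2, "v": "c"},
]
out, dup = dedup_sorted(rows, "id")
filtered_out, filtered_dup = dedup_sorted(rows, "id", select=lambda r: r["v"] != "a")
print(out, dup)
```

Ignoring the second returned list corresponds to leaving the dup port unconnected.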

25. What does the sort component do?

It orders your data according to the mentioned key specifier.

26. What is a watcher in Ab initio?

Watcher is a feature in GDE that is used to identify the flow of data or view the intermediate results between the components. Watchers can be enabled from view –> toolbars –> watchers toolbar.

27. What is a lookup in Ab Initio?

A lookup file holds data whose records can be retrieved using a key.
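Conceptually this resembles a keyed, in-memory table. The Python sketch below is an analogy only, with made-up data and field names: it builds a lookup from a small reference dataset and uses it to enrich records by key.

```python
def build_lookup(records, key):
    """Index reference records by key for fast retrieval."""
    return {rec[key]: rec for rec in records}

# Hypothetical reference data.
countries = [
    {"code": "IN", "name": "India"},
    {"code": "US", "name": "United States"},
]
lookup = build_lookup(countries, "code")

def enrich(rec):
    """Add the country name found by key, or None when there is no match."""
    match = lookup.get(rec["country_code"])
    return {**rec, "country_name": match["name"] if match else None}

hit = enrich({"id": 1, "country_code": "IN"})
miss = enrich({"id": 2, "country_code": "ZZ"})
print(hit, miss)
```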

28. What is Layout in Ab Initio?

The layout in Ab Initio decides which component runs where. There are two types of layout, serial and parallel, and the default layout is serial.

Disclaimer: The information on this website is for general informational purposes only. I strongly recommend learning all these basic concepts from the Ab initio help, manuals and documentation.
