Recently Asked Apache HBase Interview Questions
Explain What Is HBase?
HBase is a column-oriented database management system that runs on top of HDFS (Hadoop Distributed File System). HBase is not a relational data store, and it does not support a structured query language like SQL.
In HBase, a master node manages the cluster, while region servers store portions of the tables and perform the work on the data.
What Are The Different Commands Used In HBase Operations?
HBase has five atomic commands that carry out different operations: Get, Put, Delete, Scan, and Increment.
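To make the five operations concrete, here is a minimal in-memory sketch. It is a toy model, not the HBase client API: the class name ToyTable, the method names, and the "family:qualifier" column strings are assumptions made for illustration only.

```python
# Toy in-memory model of the five operations; NOT the HBase client API.
class ToyTable:
    def __init__(self):
        self.rows = {}  # row key -> {column: value}

    def put(self, row, column, value):
        # Put: write (or overwrite) one cell.
        self.rows.setdefault(row, {})[column] = value

    def get(self, row):
        # Get: read all cells of a single row.
        return dict(self.rows.get(row, {}))

    def delete(self, row):
        # Delete: remove a row (real HBase writes a tombstone instead).
        self.rows.pop(row, None)

    def scan(self, start=None, stop=None):
        # Scan: iterate rows in sorted row-key order; stop is exclusive.
        for key in sorted(self.rows):
            if (start is None or key >= start) and (stop is None or key < stop):
                yield key, dict(self.rows[key])

    def increment(self, row, column, amount=1):
        # Increment: add to a counter cell and return the new value.
        cells = self.rows.setdefault(row, {})
        cells[column] = cells.get(column, 0) + amount
        return cells[column]


table = ToyTable()
table.put("user1", "info:name", "Ada")
table.put("user2", "info:name", "Grace")
table.increment("user1", "stats:visits")
table.increment("user1", "stats:visits", 2)
table.delete("user2")
print(table.get("user1"))
print(list(table.scan()))
```

In a real cluster each of these operations is atomic per row; the toy model only mirrors their shape.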
Explain Why To Use HBase?
High capacity storage system
Distributed design to cater to large tables
Column-Oriented Stores
Horizontally Scalable
High performance and availability
The base goal of HBase is millions of columns, thousands of versions and billions of rows
Unlike HDFS (Hadoop Distributed File System), it supports random, real-time CRUD operations
How To Connect To HBase?
A connection to HBase can be established interactively through the HBase Shell, which is a command-line tool, or programmatically through the Java client API.
Mention What Are The Key Components Of HBase?
ZooKeeper:
It does the coordination work between the client and the HBase Master
HBase Master:
HBase Master monitors the RegionServers
RegionServer:
RegionServer monitors the Regions it hosts
Region:
It contains the in-memory data store (MemStore) and HFiles.
Catalog Tables:
Catalog tables consist of ROOT and META (since HBase 0.96 the ROOT table has been removed and only hbase:meta remains)
What Is The Role Of Master Server In HBase?
The Master server assigns regions to region servers and handles load balancing in the cluster.
Explain What HBase Consists Of?
HBase consists of a set of tables
Each table contains rows and columns, like a traditional database
Each table must contain an element defined as a primary key (the row key)
An HBase column denotes an attribute of an object
What Is The Role Of ZooKeeper In HBase?
ZooKeeper maintains configuration information, provides distributed synchronization, and maintains the communication between clients and region servers.
Mention How Many Operational Commands In HBase?
HBase has five types of operational commands:
Get
Put
Delete
Scan
Increment
When Do We Need To Disable A Table In HBase?
In HBase, a table is disabled so that it can be modified or have its settings changed. While a table is disabled, it cannot be accessed through the scan command.
Explain What Is WAL And HLog In HBase?
WAL (Write Ahead Log) is similar to the MySQL binary log: it records all changes made to the data. It is a standard Hadoop sequence file that stores HLogKeys. These keys consist of a sequence number as well as the actual data, and they are used to replay data that had not yet been persisted when a server crashed. So, in case of a server failure, the WAL works as a lifeline and recovers the lost data.
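The replay idea can be sketched with a toy log. This is a simplified model under stated assumptions, not HBase code: the function names write and replay, the list-based log, and the last_flushed_seq variable are all hypothetical.

```python
# Toy write-ahead log: every edit is logged with a sequence number
# before it is applied in memory. NOT the real HBase WAL format.
def write(wal, memstore, row, value):
    seq = len(wal) + 1
    wal.append((seq, row, value))  # 1. append to the log first
    memstore[row] = value          # 2. then apply the edit in memory


def replay(wal, last_flushed_seq):
    # Rebuild the in-memory state from edits that were never persisted.
    memstore = {}
    for seq, row, value in wal:
        if seq > last_flushed_seq:
            memstore[row] = value
    return memstore


wal, memstore = [], {}
write(wal, memstore, "row1", "a")
last_flushed_seq = 1               # assume row1 was already flushed to disk
write(wal, memstore, "row2", "b")
write(wal, memstore, "row3", "c")

memstore = None                    # simulated crash: memory is lost
recovered = replay(wal, last_flushed_seq)
print(recovered)
```

Only edits with a sequence number above the last flushed one are replayed, which is why the sequence number is stored alongside the data.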
What Are The Different Types Of Filters Used In HBase?
Filters are used to get specific data from an HBase table rather than all the records.
They are of the following types:
Column value filters
Column value comparators
KeyValue metadata filters
RowKey filters
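As a rough illustration of how a row key filter and a column value filter narrow a scan, here is a toy sketch. The helper names row_prefix_filter, column_value_filter and scan, as well as the sample data, are hypothetical and only mimic the idea; they are not HBase filter classes.

```python
# Toy filters applied during a scan; NOT the HBase filter classes.
rows = {
    "order1": {"info:city": "Paris"},
    "user1": {"info:city": "Paris"},
    "user2": {"info:city": "Rome"},
}


def row_prefix_filter(prefix):
    # Row key filter: keep rows whose key starts with the prefix.
    return lambda key, cells: key.startswith(prefix)


def column_value_filter(column, expected):
    # Column value filter: keep rows where the column equals the value.
    return lambda key, cells: cells.get(column) == expected


def scan(table, *filters):
    # Return the keys of rows that pass every filter, in sorted key order.
    return [key for key in sorted(table)
            if all(f(key, table[key]) for f in filters)]


users = scan(rows, row_prefix_filter("user"))
paris_users = scan(rows, row_prefix_filter("user"),
                   column_value_filter("info:city", "Paris"))
print(users, paris_users)
```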
In HBase What Are Column Families?
Column families comprise the basic unit of physical storage in HBase, to which features like compression are applied.
Name Three Disadvantages HBase Has As Compared To RDBMS?
HBase does not have a built-in authentication/permission mechanism
Indexes can be created only on the key column, whereas in an RDBMS they can be created on any column.
With one HMaster node there is a single point of failure.
Explain What Is The Row Key?
The row key is defined by the application. As the combined key is prefixed by the row key, it enables the application to define the desired sort order. It also allows logical grouping of cells and makes sure that all cells with the same row key are co-located on the same server.
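Because row keys are compared as raw bytes, the way an application encodes the key decides the sort order. The sketch below uses made-up keys to show one common approach, zero-padding numeric parts; the key layout is an assumption for illustration.

```python
# Row keys sort lexicographically by bytes, so "user10" sorts before "user2".
plain_keys = [b"user2", b"user10", b"user1"]
plain_sorted = sorted(plain_keys)

# Zero-padding the numeric part makes byte order match numeric order.
padded_keys = [b"user%05d" % n for n in (2, 10, 1)]
padded_sorted = sorted(padded_keys)

print(plain_sorted)
print(padded_sorted)
```

With the padded layout, a scan over the key range returns users in numeric order, which is the kind of control over sort order the application gains by designing the row key.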
Is HBase A Scale Out Or Scale Up Process?
HBase runs on top of Hadoop, which is a distributed system. Hadoop scales out as and when required by adding more machines on the fly. So HBase is a scale-out process.
Explain Deletion In HBase? Mention What Are The Three Types Of Tombstone Markers In HBase?
When you delete a cell in HBase, the data is not actually deleted; instead a tombstone marker is set, making the deleted cells invisible. The deleted cells are actually removed during compactions.
There are three types of tombstone markers:
Version delete marker:
For deletion, it marks a single version of a column
Column delete marker:
For deletion, it marks all the versions of a column
Family delete marker:
For deletion, it marks all columns of a column family
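The scope of the three marker types can be sketched with a toy visibility check. This is an illustrative model, not HBase internals: the tuple layouts and the function visible_cells are hypothetical, and it assumes that column and family markers hide versions at or before the marker's timestamp.

```python
# Toy model of tombstone scope. Cells are (family, qualifier, ts, value);
# markers are (kind, family, qualifier, ts). NOT HBase internals.
def visible_cells(cells, markers):
    """Return the cells that no tombstone marker hides."""
    result = []
    for family, qualifier, ts, value in cells:
        hidden = False
        for kind, m_family, m_qualifier, m_ts in markers:
            if kind == "version":    # exactly one version of one column
                hidden = (family, qualifier, ts) == (m_family, m_qualifier, m_ts)
            elif kind == "column":   # all versions of one column up to m_ts
                hidden = (family, qualifier) == (m_family, m_qualifier) and ts <= m_ts
            elif kind == "family":   # all columns of the family up to m_ts
                hidden = family == m_family and ts <= m_ts
            if hidden:
                break
        if not hidden:
            result.append((family, qualifier, ts, value))
    return result


cells = [
    ("info", "name", 1, "a"),
    ("info", "name", 2, "b"),
    ("info", "email", 1, "x"),
    ("stats", "hits", 1, 5),
]
after_version = visible_cells(cells, [("version", "info", "name", 2)])
after_column = visible_cells(cells, [("column", "info", "name", 2)])
after_family = visible_cells(cells, [("family", "info", None, 2)])
print(after_version, after_column, after_family, sep="\n")
```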
What Are The Steps In Writing Something Into HBase By A Client?
In HBase the client does not write directly into the HFile. The client first writes to the WAL (Write Ahead Log), and the data is then placed in the MemStore. The MemStore flushes the data to permanent storage as HFiles from time to time.
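The write path described above can be sketched as a toy region. Everything here is hypothetical (the ToyRegion class, the flush_threshold counted in rows, tuples standing in for HFiles); real flushes are triggered by MemStore size in bytes, not row counts.

```python
# Toy write path: WAL first, then MemStore, with periodic flushes to
# immutable sorted files. NOT the real HBase implementation.
class ToyRegion:
    def __init__(self, flush_threshold=2):
        self.wal = []        # stand-in for the durable log
        self.memstore = {}   # in-memory buffer of recent writes
        self.hfiles = []     # each flush produces one immutable sorted file
        self.flush_threshold = flush_threshold  # assumed: number of rows

    def put(self, row, value):
        self.wal.append((row, value))  # step 1: log the edit
        self.memstore[row] = value     # step 2: buffer it in memory
        if len(self.memstore) >= self.flush_threshold:
            self.flush()               # step 3: occasionally flush

    def flush(self):
        # Write the buffered rows out in sorted order and clear the buffer.
        self.hfiles.append(tuple(sorted(self.memstore.items())))
        self.memstore = {}


region = ToyRegion(flush_threshold=2)
for row, value in [("b", "1"), ("a", "2"), ("c", "3")]:
    region.put(row, value)
print(region.hfiles, region.memstore)
```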
Explain How Does HBase Actually Delete A Row?
In HBase, whatever you write is first held in RAM and then stored to disk, and these disk writes are immutable barring compaction. During the deletion process in HBase, major compactions process delete markers while minor compactions don't. A normal delete results in a tombstone marker; the deleted data it represents is removed during compaction.
Also, if you delete data and add more data, but with an earlier timestamp than the tombstone timestamp, further Gets may be masked by the delete/tombstone marker and hence you will not receive the inserted value until after the major compaction.
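The masking effect can be shown with a toy model of one column. It is a sketch under assumptions: the helper names get and major_compact are hypothetical, a single tombstone timestamp stands in for the delete marker, and the compaction step is assumed to drop both the marker and every cell it masks.

```python
# Toy model of a tombstone masking a later put that carries an earlier
# timestamp. Cells are (timestamp, value). NOT HBase code.
def get(cells, tombstone_ts):
    # A tombstone hides every version at or before its timestamp.
    live = [(ts, v) for ts, v in cells if tombstone_ts is None or ts > tombstone_ts]
    return max(live)[1] if live else None


def major_compact(cells, tombstone_ts):
    # Assumed behavior: masked cells and the marker itself are dropped.
    kept = [(ts, v) for ts, v in cells if tombstone_ts is None or ts > tombstone_ts]
    return kept, None


cells = [(100, "old")]
tombstone_ts = 200              # delete issued at timestamp 200
cells.append((150, "new"))      # put written later, but stamped 150
before = get(cells, tombstone_ts)

cells, tombstone_ts = major_compact(cells, tombstone_ts)
cells.append((150, "new"))      # the same put, repeated after compaction
after = get(cells, tombstone_ts)
print(before, after)
```

In this model the first put at timestamp 150 is never returned, while the identical put issued after the major compaction is visible because the marker is gone.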
What Is Compaction In HBase?
As more and more data is written to HBase, many HFiles get created. Compaction is the process of merging these HFiles into one file; after the merged file is created successfully, the old files are discarded.
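A minimal sketch of the merge step, assuming each file maps a row key to a single (timestamp, value) pair and that only the newest version of each row is kept. The compact function and the dictionary-based files are hypothetical; real HFiles hold sorted KeyValues and may retain several versions.

```python
# Toy compaction: merge several small files into one sorted file,
# keeping the newest version of each row. NOT the HBase implementation.
def compact(hfiles):
    merged = {}
    for hfile in hfiles:
        for row, (ts, value) in hfile.items():
            if row not in merged or ts > merged[row][0]:
                merged[row] = (ts, value)
    # The single merged file replaces all of the old files.
    return [dict(sorted(merged.items()))]


hfiles = [
    {"a": (1, "x"), "b": (1, "y")},
    {"a": (2, "z")},
    {"c": (3, "w")},
]
hfiles = compact(hfiles)
print(hfiles)
```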
Explain What Happens If You Alter The Block Size Of A Column Family On An Already Occupied Database?
When you alter the block size of a column family, new data uses the new block size while existing data remains in the old block size. New files, as they are flushed, have the new block size, and existing data continues to be read correctly. During compaction, old data is rewritten with the new block size, so all data should be converted to the new block size after the next major compaction.