Latest Apache Cassandra Interview Questions Part – 2
How To Write A Query In Cassandra?
Using CQL (Cassandra Query Language). The cqlsh shell is used to interact with the database.
Mention When You Can Use Alter Keyspace?
ALTER KEYSPACE can be used to change keyspace properties such as the replication settings (including the number of replicas) and durable_writes.
Which Operating Systems Does Cassandra Support?
Linux, plus Windows in older releases (Windows support was dropped in Cassandra 4.0).
Explain What Is Cassandra-cqlsh?
Cassandra cqlsh is a command-line shell that lets users communicate with the database using CQL. With cqlsh you can do the following:
Define a schema
Insert data
Execute a query
What Is Cassandra Data Model?
The Cassandra data model consists of four main components:
Cluster:
Made up of multiple nodes and keyspaces
Keyspace:
A namespace that groups multiple column families, typically one per application
Column:
Consists of a column name, a value and a timestamp
ColumnFamily:
Multiple columns referenced by a row key
Mention What The Shell Commands “capture” And “consistency” Do?
Cassandra provides various cqlsh shell commands. The CAPTURE command captures the output of a command and appends it to a file, while the CONSISTENCY command displays the current consistency level or sets a new one.
What Is Cql?
CQL (Cassandra Query Language) is the language used to access and query the Apache Cassandra distributed database. Statements go through a CQL parser, which leaves all implementation details to the server. The syntax of CQL is similar to SQL, but it does not alter the Cassandra data model.
What Is Mandatory While Creating A Table In Cassandra?
A primary key is mandatory when creating a table. It is made up of one or more columns of the table.
Explain The Concept Of Compaction In Cassandra?
Compaction is a maintenance process in Cassandra in which SSTables are merged and reorganized to optimize the data structures on disk. It goes hand in hand with the memtable: every memtable flush creates a new SSTable, and compaction keeps the number of SSTables under control.
There are two types of compaction in Cassandra:
Minor compaction:
Started automatically when a new SSTable is created. Here, Cassandra merges similarly sized SSTables into one.
Major compaction:
Triggered manually using nodetool. Compacts all SSTables of a ColumnFamily into one.
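The merge at the heart of compaction can be pictured with a toy model. The sketch below is only an illustration and not Cassandra's implementation: it assumes each SSTable is a dict mapping a row key to a (timestamp, value) pair, and the function name compact is made up for this example. When a key appears in several SSTables, the entry with the newest timestamp wins.

```python
def compact(sstables):
    """Merge several toy SSTables into one, keeping the newest entry per key."""
    merged = {}
    for table in sstables:
        for key, (timestamp, value) in table.items():
            # Keep this entry only if it is newer than what we already hold.
            if key not in merged or timestamp > merged[key][0]:
                merged[key] = (timestamp, value)
    # SSTables hold sorted data, so return the merged result in key order.
    return dict(sorted(merged.items()))


# Two toy SSTables (hypothetical data): "alice" was overwritten later.
old = {"alice": (1, "v1"), "bob": (1, "v1")}
new = {"alice": (2, "v2"), "carol": (2, "v1")}

result = compact([old, new])
print(result)
```

After the merge, a read needs to consult one table instead of two, which is the practical benefit compaction aims for.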
Mention What Needs To Be Taken Care Of While Adding A Column?
While adding a column, make sure that:
The column name does not conflict with an existing column name
The table is not defined with the COMPACT STORAGE option
Does Cassandra Support Acid Transactions?
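Unlike relational databases, Cassandra does not support full ACID transactions with rollback or multi-table commits. It does provide atomic, isolated and durable writes at the row (partition) level, tunable consistency, and lightweight transactions for compare-and-set operations.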
Explain How Cassandra Writes Data?
Cassandra writes data in three stages:
Commitlog write
Memtable write
SSTable write
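The three stages can be sketched with a small in-memory model. This is a simplified illustration, not how Cassandra is implemented: the class ToyNode and its flush_threshold parameter are hypothetical names, and the real flush trigger depends on memory settings rather than a row count.

```python
class ToyNode:
    """Toy model of the write path: commit log -> memtable -> SSTable."""

    def __init__(self, flush_threshold=3):
        self.commitlog = []   # append-only log used for crash recovery
        self.memtable = {}    # in-memory writes that are not flushed yet
        self.sstables = []    # immutable tables flushed to "disk"
        self.flush_threshold = flush_threshold  # hypothetical tuning knob

    def write(self, key, value):
        self.commitlog.append((key, value))   # 1. commit log write
        self.memtable[key] = value            # 2. memtable write
        if len(self.memtable) >= self.flush_threshold:
            self.flush()                      # 3. SSTable write

    def flush(self):
        # Freeze the memtable contents, sorted by key, as a new SSTable.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commitlog = []   # flushed data no longer needs log replay


node = ToyNode(flush_threshold=2)
node.write("k2", "b")
node.write("k1", "a")   # second write reaches the threshold and flushes
node.write("k3", "c")   # stays in the memtable and commit log for now
print(node.sstables)
print(node.memtable)
```

The tuple used for each flushed table is a deliberate choice here: it cannot be modified afterwards, mirroring the immutability of SSTables.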
What Is A Super Column In Cassandra?
A Cassandra super column is a special element that groups similar collections of data. Super columns are key-value pairs whose values are columns. A super column is a sorted array of columns, and the hierarchy is: keyspace > column family > super column > column, a structure that can be pictured as nested JSON.
Like row keys, super column entries hold no values of their own; they are used only to group other columns. Note that super column keys appearing in different rows do not necessarily match, and may never match.
Explain What Is Memtable In Cassandra?
Cassandra writes data to an in-memory structure known as the Memtable
It is an in-memory cache with content stored as key/column
Memtable data is sorted by key
There is a separate Memtable for each ColumnFamily, and column data is retrieved from it by key
Define The Consistency Levels For Write Operations In Cassandra?
ALL:
Highly consistent. A write must be written to the commitlog and memtable on all replica nodes in the cluster
EACH_QUORUM:
A write must be written to the commitlog and memtable on a quorum of replica nodes in all data centers.
LOCAL_QUORUM:
A write must be written to the commitlog and memtable on a quorum of replica nodes in the same data center.
ONE:
A write must be written to the commitlog and memtable of at least one replica node.
TWO, THREE:
Same as ONE, but for at least two and three replica nodes, respectively
LOCAL_ONE:
A write must be written to at least one replica node in the local data center
ANY:
A write must be written to at least one node; a stored hint counts if no replica is available
SERIAL:
Linearizable consistency, used with lightweight transactions to prevent unconditional updates
LOCAL_SERIAL:
Same as SERIAL, but restricted to the local data center
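A quorum is a majority of replicas, computed as floor(RF / 2) + 1, where RF is the replication factor. The short sketch below shows how many replica acknowledgements some of these levels need. The function names are made up for this example, and the replication factor passed in is assumed to be that of a single data center.

```python
def quorum(replication_factor):
    """Number of replicas that form a quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def required_acks(level, replication_factor):
    """Replica acknowledgements needed for a level (single data center only)."""
    fixed = {"ONE": 1, "LOCAL_ONE": 1, "TWO": 2, "THREE": 3}
    if level in fixed:
        return fixed[level]
    if level == "LOCAL_QUORUM":
        return quorum(replication_factor)
    if level == "ALL":
        return replication_factor
    raise ValueError(f"level not covered by this sketch: {level}")


for level in ("ONE", "LOCAL_QUORUM", "ALL"):
    print(level, required_acks(level, 3))
```

With a replication factor of 3, a quorum is 2 replicas, so a quorum write still succeeds when one replica is down, while ALL does not.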
Explain How Cassandra Writes Changed Data Into Commitlog?
Cassandra appends changed data to the commitlog
The commitlog acts as a crash recovery log for data
A write operation is never considered successful until the changed data has been appended to the commitlog
Data will not be lost once the commitlog has been flushed to a file.
What Is The Difference Between Column And Super Column?
Both elements are tuples with a name and a value. However, a column's value is a string, while a super column's value is a map of columns with different data types.
Unlike columns, super columns do not contain the third component, the timestamp.
What Is A ColumnFamily?
As the name suggests, a ColumnFamily is a structure that can hold an unbounded number of rows. Each row is referenced by a key and holds key-value pairs, where the key is the name of the column and the value is the column data. It is much like a HashMap in Java or a dictionary in Python.
Remember, the rows are not limited to a predefined list of columns. The ColumnFamily is fully flexible: one row can have 100 columns while another has only 2.
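The dictionary analogy can be made concrete. In the sketch below, a column family is modelled as a Python dict of row keys, each mapping to its own dict of columns; the variable names and sample data are invented for illustration, and timestamps are left out for brevity.

```python
# A toy column family: row key -> {column name: column value}.
users = {
    "row1": {"name": "Alice", "email": "alice@example.com", "city": "Paris"},
    "row2": {"name": "Bob"},  # rows need not share the same columns
}

# Add a new column to one row only; other rows are unaffected.
users["row2"]["phone"] = "555-0100"

column_counts = {row_key: len(columns) for row_key, columns in users.items()}
print(column_counts)
```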
Explain How Cassandra Deletes Data?
SSTables are immutable, so a row cannot be removed from an SSTable in place. When a row needs to be deleted, Cassandra writes a special marker called a tombstone in place of the column value. When the data is read, anything marked with a tombstone is treated as deleted.
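A small model shows why a marker is enough to hide data without modifying older SSTables. This is a sketch under simplifying assumptions, not Cassandra's read path: each SSTable is a dict of key to (timestamp, value), and TOMBSTONE and read are hypothetical names introduced for the example.

```python
TOMBSTONE = object()  # sentinel standing in for Cassandra's deletion marker


def read(sstables, key):
    """Return the newest value for key, or None if it is missing or deleted."""
    newest = None
    for table in sstables:
        if key in table:
            entry = table[key]  # (timestamp, value)
            if newest is None or entry[0] > newest[0]:
                newest = entry
    if newest is None or newest[1] is TOMBSTONE:
        return None
    return newest[1]


sstable_1 = {"alice": (1, "v1"), "bob": (1, "v1")}
# Deleting "alice" writes a tombstone to a newer SSTable; sstable_1 is untouched.
sstable_2 = {"alice": (2, TOMBSTONE)}

print(read([sstable_1, sstable_2], "alice"))
print(read([sstable_1, sstable_2], "bob"))
```

The old value for "alice" is still physically present in sstable_1; it is only hidden from reads because the tombstone carries a newer timestamp.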
Define The Use Of Source Command In Cassandra?
The SOURCE command is used to execute a file containing CQL statements.
What Is Thrift?
Thrift is a legacy RPC protocol and API, combined with a code generation tool. Its purpose in Cassandra was to provide access to the database from many programming languages. It has since been superseded by CQL and the native protocol.
Explain Tombstone In Cassandra?
A tombstone is a marker in a row indicating that a column has been deleted. The marked columns are removed during compaction. Tombstones are important because Cassandra is eventually consistent: the deletion must reach every replica, otherwise a replica that missed the delete could bring the data back.
What Platforms Does Cassandra Run On?
Since Cassandra is a Java application, it can in principle run on any platform that provides a Java Runtime Environment (JRE) and Java Virtual Machine (JVM). Cassandra also runs on RedHat, CentOS, Debian and Ubuntu Linux platforms.
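Name The Ports Cassandra Uses?
By default, Cassandra uses port 7000 for cluster communication, 9042 for CQL native protocol clients, 9160 for legacy Thrift clients, and 7199 for JMX (8080 in very old releases). These are all TCP ports. The cluster and client ports are set in conf/cassandra.yaml and the JMX port in conf/cassandra-env.sh; very old releases configured the JMX port in bin/cassandra.in.sh.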
Can You Add Or Remove Column Families In A Working Cluster?
Yes, but keep the following steps in mind:
Do not forget to clear the commitlog with ‘nodetool drain’
Turn off Cassandra and check that no data is left in the commitlog
Delete the SSTable files for the removed CFs
What Is Replication Factor In Cassandra?
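Replication factor is the number of copies of each piece of data kept in the cluster. A sufficiently high replication factor keeps data available when nodes fail; for example, the system_auth keyspace needs enough replicas for logins to the cluster to keep working when a node is down.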
Can We Change Replication Factor On A Live Cluster?
Yes, but it will require running repair to alter the replica count of existing data.
How To Iterate Over All Rows In A ColumnFamily?
Using get_range_slices. You can start the iteration with the empty string, and after each iteration, the last key read serves as the start key for the next iteration.
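The paging pattern can be sketched without a cluster. In the example below, get_range is a hypothetical stand-in for a range query and not the real get_range_slices signature; it is assumed to return rows in sorted key order with an inclusive start key, which is why every page after the first drops its first row.

```python
import bisect

# Toy column family, kept in sorted key order so range scans are well defined.
ROWS = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
SORTED_KEYS = sorted(ROWS)


def get_range(start_key, count):
    """Hypothetical range query: up to `count` rows with key >= start_key."""
    start = bisect.bisect_left(SORTED_KEYS, start_key)
    return [(k, ROWS[k]) for k in SORTED_KEYS[start:start + count]]


def iterate_all(page_size=3):
    """Yield every row once, using the last key read as the next start key."""
    if page_size < 2:
        raise ValueError("page_size must be at least 2 to make progress")
    start_key = ""        # the empty string starts at the beginning
    first_page = True
    while True:
        page = get_range(start_key, page_size)
        # The start key is inclusive, so later pages repeat the last row seen.
        rows = page if first_page else page[1:]
        if not rows:
            return
        yield from rows
        start_key = page[-1][0]
        first_page = False


all_rows = list(iterate_all(page_size=3))
print(all_rows)
```

In a real cluster the row order depends on the partitioner rather than on plain key order, so the sorted keys here are a simplification.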