Hive is an Apache project software for data warehouse and as part of Hadoop ecosystem it provides data query & analysis interface using SQL like query language know as Hive Query Language (HQL). Hive data processing works on Map Reduce framework making it ideal for processing large data set.

Data files are stored directly on the HDFS under warehouse directory. Hive’s metadata is stored in Hive Metastore, it stores Hive table schema, location, partition and other related information.

Select Add Service from Ambari UI

Assign services to cluster nodes

Assign slave and client nodes

Configure Hive service

Prior to configuring Hive, a Hive database need to be created and appropriate permission granted to root user. In this example mySQL database is used as a repository, therefore login to mySQL, create Hive database and assign the below permissions.

  • In Hive tab, select “Advanced” tab
  • Select “Existing MySQL/MariaDB Database” option
  • Run the below command, make sure the jar file path is correct ambari-server setup –jdbc-db-mysql –jdbc-driver=/usr/share/java/mysql-connector-java.jar
  • Populate the database parameters as per screenshot

Review and deploy the Hive service

Install service components

Review deployment summary

Log into Ambari UI and review Hive service

Follow the above steps to install services such as Spark, Pig, HBase, Ranger, Knox, Tez, Oozie and others.