This exercise demonstrates a step-by-step deployment and configuration of a multi-node Hadoop cluster based on the Hortonworks distribution, running on Red Hat/CentOS Linux. The cluster consists of master and worker nodes.

Before the Hadoop software is deployed, the Linux environment has to meet certain configuration requirements. The prerequisite configuration is listed below.

A. Install httpd on master node

  • sudo yum install httpd
  • sudo systemctl enable httpd.service
  • sudo systemctl restart httpd.service

B. Install & Enable NTP

  • yum install ntp
  • systemctl start ntpd
  • systemctl status ntpd
  • systemctl enable ntpd

C. Set umask value to 0022

  • umask 0022
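
Note that a plain umask 0022 only applies to the current shell session. The sketch below shows what the setting does to newly created files and directories, and one possible way to persist it. The function names and the choice of /etc/profile as the default target are assumptions, not part of the original steps.

```shell
#!/usr/bin/env bash
# Sketch: show the effect of umask 0022 and persist it.
# Function names are illustrative.

# Create a file and a directory under umask 0022 and print their modes.
# Uses GNU stat (-c), as found on Red Hat/CentOS.
demo_umask() {
    local dir
    dir=$(mktemp -d)
    (
        umask 0022
        touch "$dir/file"
        mkdir "$dir/subdir"
        stat -c '%a' "$dir/file" "$dir/subdir"
    )
    rm -rf "$dir"
}

# Append "umask 0022" to a profile file unless that exact line is already there.
# Assumption: /etc/profile is the file you want; pass another path to override.
persist_umask() {
    local profile="${1:-/etc/profile}"
    grep -qx 'umask 0022' "$profile" 2>/dev/null || echo 'umask 0022' >> "$profile"
}
```

Calling demo_umask prints 644 for the file and 755 for the directory, which is the permission set expected for files created during the install.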

D. Disable firewall

  • systemctl mask firewalld
  • systemctl stop firewalld

E. Disable IPTABLES

  • systemctl stop iptables.service
  • systemctl stop ip6tables.service
  • chkconfig iptables off
  • chkconfig ip6tables off

F. Install Name Service Cache Daemon (nscd)

  • yum install nscd
  • systemctl start nscd
  • systemctl enable nscd

G. Disable Security Enhanced Linux (SELinux)

Edit /etc/sysconfig/selinux and change SELINUX=enforcing to SELINUX=disabled. The change takes effect after a reboot.
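
The edit can also be scripted. The following is a minimal sketch, assuming GNU sed; the function name disable_selinux_in is hypothetical. On CentOS 7 the file /etc/sysconfig/selinux is typically a symlink to /etc/selinux/config, which is why the sketch asks sed to follow symlinks instead of replacing the link with a regular file.

```shell
#!/usr/bin/env bash
# Sketch: set SELINUX=disabled in an SELinux config file.
# The function name is illustrative, not part of any tool.

disable_selinux_in() {
    # Default is the path named in the text; pass another file to override.
    local cfg="${1:-/etc/sysconfig/selinux}"
    # --follow-symlinks (GNU sed) edits the link target rather than replacing the symlink.
    # The pattern matches only the SELINUX= line, not SELINUXTYPE=.
    sed -i --follow-symlinks 's/^SELINUX=.*/SELINUX=disabled/' "$cfg"
    # Print the resulting setting for confirmation.
    grep '^SELINUX=' "$cfg"
}
```

Run as root, disable_selinux_in with no argument edits the file named above. Running setenforce 0 additionally switches the running system to permissive mode until the next boot.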

H. Enable SSH

Create an SSH key pair for the root account in the ~/.ssh folder

  • $ su - root
  • $ ssh-keygen -t rsa
  • $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
  • $ chmod 0600 ~/.ssh/authorized_keys

Copy the public key from the local host (HOST1) to the remote hosts (HOST2, HOST3, etc.)

  • ssh-copy-id -i ~/.ssh/id_rsa.pub HOST2
  • ssh-copy-id -i ~/.ssh/id_rsa.pub HOST3
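
With more than a couple of workers, the copy step can be looped. The following is a rough sketch; the function name distribute_key and the RUN variable are hypothetical. By default it only prints the commands it would run, so the list can be reviewed before anything is executed.

```shell
#!/usr/bin/env bash
# Sketch: push the root public key to each host and confirm passwordless login.
# RUN defaults to "echo", so commands are only printed (dry run).
# Set RUN to an empty string (RUN= distribute_key ...) to execute them.

distribute_key() {
    local run="${RUN-echo}"
    local host
    for host in "$@"; do
        # Copy the public key into the remote host's authorized_keys.
        $run ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "$host"
        # BatchMode makes ssh fail instead of prompting if key login does not work.
        $run ssh -o BatchMode=yes "$host" true
    done
}
```

For example, distribute_key HOST2 HOST3 prints four commands, and RUN= distribute_key HOST2 HOST3 runs them.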

I. Download Ambari, HDP and HDP-GPL repo files
wget -nv http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.6.1.5/ambari.repo -O /etc/yum.repos.d/ambari.repo
wget -nv http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.4.0/hdp.repo -O /etc/yum.repos.d/hdp.repo
wget -nv http://public-repo-1.hortonworks.com/HDP-GPL/centos7/2.x/updates/2.6.4.0/hdp.gpl.repo -O /etc/yum.repos.d/hdp.gpl.repo

J. Download the Ambari, HDP, HDP-UTILS and HDP-GPL repository tarballs locally; they are extracted under /var/www/html in the next step
wget http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.6.1.5/ambari-2.6.1.5-centos7.tar.gz
wget http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.4.0/HDP-2.6.4.0-centos7-rpm.tar.gz
wget http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.22/repos/centos7/HDP-UTILS-1.1.0.22-centos7.tar.gz
wget http://public-repo-1.hortonworks.com/HDP-GPL/centos7/2.x/updates/2.6.4.0/HDP-GPL-2.6.4.0-centos7-rpm.tar.gz

K. Extract the tar files
Ambari tar in /var/www/html/ambari > tar vxf ambari-2.6.1.5-centos7.tar.gz
HDP tar in /var/www/html/hdp > tar vxf HDP-2.6.4.0-centos7-rpm.tar.gz
HDP-UTILS tar in /var/www/html/hdp/hdp-utils > tar vxf HDP-UTILS-1.1.0.22-centos7.tar.gz
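
The target directories have to exist before the tarballs are unpacked. A small helper along these lines, assuming GNU tar, creates the directory and extracts into it in one step; the function name extract_into is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: create the target directory and extract a tarball into it.
# The function name is illustrative.

extract_into() {
    local dest="$1" tarball="$2"
    mkdir -p "$dest"
    # -C changes to the destination before extracting;
    # GNU tar detects gzip compression automatically when reading.
    tar -xf "$tarball" -C "$dest"
}

# Example calls matching the layout above (run as root from the download directory):
#   extract_into /var/www/html/ambari        ambari-2.6.1.5-centos7.tar.gz
#   extract_into /var/www/html/hdp           HDP-2.6.4.0-centos7-rpm.tar.gz
#   extract_into /var/www/html/hdp/hdp-utils HDP-UTILS-1.1.0.22-centos7.tar.gz
```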

L. Install MySQL 5.7 or above on the master node

M. Edit hosts

Add all cluster nodes to the host file in /etc/hosts
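
As an illustration, the entries can be appended with a small helper that skips hosts already present. This is only a sketch: the function name add_host_entry is hypothetical, and the IP addresses and host names in the example calls are placeholders to be replaced with the real values of your cluster.

```shell
#!/usr/bin/env bash
# Sketch: append "IP  FQDN  shortname" to a hosts file if the FQDN is not listed yet.
# The function name is illustrative.

add_host_entry() {
    local hosts_file="$1" ip="$2" fqdn="$3" short="$4"
    # -F: treat the name as a fixed string; -w: match whole words only.
    grep -qwF -- "$fqdn" "$hosts_file" 2>/dev/null && return 0
    printf '%s\t%s\t%s\n' "$ip" "$fqdn" "$short" >> "$hosts_file"
}

# Example calls (placeholder addresses and names, run as root on every node):
#   add_host_entry /etc/hosts 192.0.2.10 master.example.com  master
#   add_host_entry /etc/hosts 192.0.2.11 worker1.example.com worker1
#   add_host_entry /etc/hosts 192.0.2.12 worker2.example.com worker2
```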

N. Disable Transparent Huge Pages (THP)

  • echo never > /sys/kernel/mm/transparent_hugepage/enabled
  • echo never > /sys/kernel/mm/transparent_hugepage/defrag
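
Values written to /sys do not survive a reboot, so it is worth checking the setting after each restart. The sketch below wraps the two writes and adds a check of the active value; the function names thp_disable and thp_active are hypothetical, and the optional directory argument exists only so the functions can be tried against a scratch directory.

```shell
#!/usr/bin/env bash
# Sketch: disable THP and report the currently active setting.
# Function names are illustrative.

# Write "never" to both THP control files (requires root on the real sysfs path).
thp_disable() {
    local dir="${1:-/sys/kernel/mm/transparent_hugepage}"
    echo never > "$dir/enabled"
    echo never > "$dir/defrag"
}

# The kernel marks the active value with brackets, e.g. "always madvise [never]".
# Print only the bracketed value.
thp_active() {
    local dir="${1:-/sys/kernel/mm/transparent_hugepage}"
    sed 's/.*\[\(.*\)\].*/\1/' "$dir/enabled"
}
```

After thp_disable, thp_active should report never. If it reports always after a reboot, the writes need to be repeated from a boot-time script.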

O. Install OpenJDK or Oracle JDK 1.7 or above on all cluster nodes, including the JCE extension

wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie" "http://download.oracle.com/otn-pub/java/jdk/8u161-b12/2f38c3b165be4555a1fa6e98c45e0808/jdk-8u161-linux-x64.tar.gz"