This exercise demonstrates a step-by-step deployment and configuration of a multi-node Hadoop cluster based on the Hortonworks distribution, running on Red Hat/CentOS Linux. The cluster consists of master and worker nodes.
Before the Hadoop software is deployed, the Linux environment must meet certain configuration requirements. The prerequisite steps are listed below.
A. Install httpd on master node
- sudo yum install httpd
- sudo systemctl enable httpd.service
- sudo systemctl restart httpd.service
B. Install & Enable NTP
- yum install ntp
- systemctl start ntpd
- systemctl status ntpd
- chkconfig ntpd on
C. Set umask value to 0022
- umask 0022
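The umask command above only affects the current shell session. A minimal sketch of making the setting persistent, assuming a login profile such as /etc/profile is read by the shells that matter on your nodes (the function name persist_umask is hypothetical, not part of any tool):

```shell
# persist_umask is a hypothetical helper name.
# $1 is the profile file to update (defaults to /etc/profile).
persist_umask() {
    profile_file="${1:-/etc/profile}"
    # Append the line only if an identical line is not already there,
    # so running the helper twice does not duplicate it.
    if ! grep -qx 'umask 0022' "$profile_file" 2>/dev/null; then
        echo 'umask 0022' >> "$profile_file"
    fi
}
```

Run as root, for example: persist_umask /etc/profile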
D. Disable firewall
- systemctl mask firewalld
- systemctl stop firewalld
E. Disable IPTABLES
- systemctl stop iptables.service
- systemctl stop ip6tables.service
- chkconfig iptables off
- chkconfig ip6tables off
F. Install Name Service Cache Daemon (nscd)
- yum install nscd
- systemctl start nscd
- chkconfig nscd on
G. Disable Security-Enhanced Linux (SELinux)
Edit /etc/sysconfig/selinux and change SELINUX=enforcing to SELINUX=disabled. The change takes effect after a reboot.
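The edit can also be scripted. The following is a sketch only, assuming the file already contains a SELINUX= line; the function name set_selinux_disabled is hypothetical.

```shell
# set_selinux_disabled is a hypothetical helper name.
# $1 is the SELinux config file (defaults to /etc/sysconfig/selinux).
set_selinux_disabled() {
    config_file="${1:-/etc/sysconfig/selinux}"
    tmp_file=$(mktemp) || return 1
    # Rewrite only the SELINUX= line; SELINUXTYPE= is left untouched.
    # Writing back with cat keeps the original file (or symlink) in place.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$config_file" > "$tmp_file" &&
        cat "$tmp_file" > "$config_file"
    status=$?
    rm -f "$tmp_file"
    return "$status"
}
```

Run as root, for example: set_selinux_disabled /etc/sysconfig/selinux. Until the node is rebooted, setenforce 0 can be used to switch the running system to permissive mode.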
H. Enable SSH
Create an SSH key pair as root; the files are stored in the ~/.ssh folder
- $ su - root
- $ ssh-keygen -t rsa
- $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
- $ chmod 0600 ~/.ssh/authorized_keys
Copy the public key from the local host (HOST1) to the remote hosts (HOST2, HOST3, etc.)
- ssh-copy-id -i ~/.ssh/id_rsa.pub HOST2
- ssh-copy-id -i ~/.ssh/id_rsa.pub HOST3
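With more than a couple of worker nodes, the ssh-copy-id step can be wrapped in a loop. A minimal sketch, assuming the key pair from the previous step exists; copy_key_to_hosts and the DRY_RUN variable are hypothetical names introduced here for illustration:

```shell
# copy_key_to_hosts is a hypothetical helper name.
# Arguments are the remote host names. Set DRY_RUN=1 to print the
# commands instead of running them.
copy_key_to_hosts() {
    for host in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ssh-copy-id -i $HOME/.ssh/id_rsa.pub $host"
        else
            ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "$host"
        fi
    done
}
```

For example: DRY_RUN=1 copy_key_to_hosts HOST2 HOST3 shows what would be run, and copy_key_to_hosts HOST2 HOST3 runs it. Each host will still prompt for the root password once.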
I. Download Ambari, HDP and HDP-GPL repo files
wget -nv http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.6.1.5/ambari.repo -O /etc/yum.repos.d/ambari.repo
wget -nv http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.4.0/hdp.repo -O /etc/yum.repos.d/hdp.repo
wget -nv http://public-repo-1.hortonworks.com/HDP-GPL/centos7/2.x/updates/2.6.4.0/hdp.gpl.repo -O /etc/yum.repos.d/hdp.gpl.repo
J. Download the Ambari, HDP, HDP-UTILS and HDP-GPL repository tarballs to the master node, to be extracted under /var/www/html (see step K)
wget http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.6.1.5/ambari-2.6.1.5-centos7.tar.gz
wget http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.4.0/HDP-2.6.4.0-centos7-rpm.tar.gz
wget http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.22/repos/centos7/HDP-UTILS-1.1.0.22-centos7.tar.gz
wget http://public-repo-1.hortonworks.com/HDP-GPL/centos7/2.x/updates/2.6.4.0/HDP-GPL-2.6.4.0-centos7-rpm.tar.gz
K. Extract the tar files
- Ambari tar, in /var/www/html/ambari: tar xvf ambari-2.6.1.5-centos7.tar.gz
- HDP tar, in /var/www/html/hdp: tar xvf HDP-2.6.4.0-centos7-rpm.tar.gz
- HDP-UTILS tar, in /var/www/html/hdp/hdp-utils: tar xvf HDP-UTILS-1.1.0.22-centos7.tar.gz
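The target folders have to exist before the tarballs are extracted into them. A small sketch that creates the folder and extracts in one step; extract_repo is a hypothetical helper name, and the tarballs are assumed to be gzip-compressed as their .tar.gz names suggest:

```shell
# extract_repo is a hypothetical helper name.
# $1 is the tarball, $2 is the directory to extract it into.
extract_repo() {
    tarball="$1"
    target_dir="$2"
    # Create the target directory (and parents) first, then extract into it.
    mkdir -p "$target_dir" &&
        tar -xzf "$tarball" -C "$target_dir"
}
```

For example: extract_repo ambari-2.6.1.5-centos7.tar.gz /var/www/html/ambari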
L. Install MySQL 5.7 or later on the master node
M. Edit hosts
Add all cluster nodes to the host file in /etc/hosts
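Because the same entries are needed on every node, it can help to script them. A sketch under assumptions: add_host_entry is a hypothetical helper name, and the IP addresses and host names in the usage example are placeholders, not values from this setup.

```shell
# add_host_entry is a hypothetical helper name.
# $1 is the IP address, $2 the host name, $3 the hosts file
# (defaults to /etc/hosts).
add_host_entry() {
    ip="$1"
    name="$2"
    hosts_file="${3:-/etc/hosts}"
    # Skip the entry if the host name already appears as a whole word,
    # so the helper can be rerun without creating duplicates.
    if ! grep -qwF -- "$name" "$hosts_file" 2>/dev/null; then
        printf '%s\t%s\n' "$ip" "$name" >> "$hosts_file"
    fi
}
```

For example (placeholder values): add_host_entry 192.168.1.10 master.example.com, then add_host_entry 192.168.1.11 worker1.example.com, and so on for each node.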
N. Disable Transparent Huge Pages (THP)
- echo never > /sys/kernel/mm/transparent_hugepage/enabled
- echo never > /sys/kernel/mm/transparent_hugepage/defrag
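Values written to /sys this way are lost on reboot, so the two writes are often wrapped in a small function that can be rerun. A minimal sketch; disable_thp is a hypothetical helper name, and the optional directory argument exists only so the function can be tried against a scratch folder:

```shell
# disable_thp is a hypothetical helper name.
# $1 is the THP settings directory (defaults to the sysfs location).
disable_thp() {
    thp_dir="${1:-/sys/kernel/mm/transparent_hugepage}"
    # Write "never" to both the enabled and defrag settings.
    for setting in enabled defrag; do
        echo never > "$thp_dir/$setting"
    done
}
```

Run disable_thp as root. Reading the files back with cat should show the selected value in square brackets. One common way to reapply the setting at boot is to call the same commands from a startup script such as /etc/rc.d/rc.local, assuming that script is enabled and executable on your system.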
O. Install OpenJDK or Oracle JDK 1.7 or later, including the JCE extension, on all cluster nodes
wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie" "http://download.oracle.com/otn-pub/java/jdk/8u161-b12/2f38c3b165be4555a1fa6e98c45e0808/jdk-8u161-linux-x64.tar.gz"