Building a Fully Distributed Hadoop Cluster: A Hands-on Lab

Author: linux-study    Published: 2013-04-29 11:03:54
1. Lab environment:
Operating system: CentOS 6.3  x86_64  Desktop
Hostname    IP address       Role in this lab      Notes
master      192.168.1.85     NameNode (primary)
slave       192.168.1.81     SecondaryNameNode
node1       192.168.1.88     DataNode
node2       192.168.1.89     DataNode
node3       192.168.1.90     DataNode              reserved for the later node-addition (cluster maintenance) experiment
Note: the virtual machines were created in a CloudStack basic zone from the CentOS 6.3 x86_64 Desktop template; installing the operating system itself is not covered here.
2. Preliminaries:
Download the required software (the Hadoop release; hadoop-1.1.2.tar.gz is used below).
3. Initial setup (run the following steps on every machine as root)
Step 1: Disable the iptables firewall
[root@localhost ~]# chkconfig  iptables  off
[root@localhost ~]# service iptables  stop
Step 2: Change the hostname
[root@localhost ~]# vi  /etc/sysconfig/network
(set HOSTNAME on each machine as follows)
HOSTNAME=master              # host 192.168.1.85
HOSTNAME=slave               # host 192.168.1.81
HOSTNAME=node1               # host 192.168.1.88
HOSTNAME=node2               # host 192.168.1.89
:wq                          # save and quit
[root@localhost ~]# reboot
Step 3: Edit the name-resolution file /etc/hosts
Note: every host in the cluster gets the same entries shown below.
[root@master ~]# vi  /etc/hosts
(delete the existing localhost line and add the following)
192.168.1.85      master
192.168.1.81      slave
192.168.1.88      node1
192.168.1.89      node2
:wq                               # save and quit
Step 4: Test name resolution and connectivity
[root@slave ~]# ping  master         # from slave, test master
PING master (192.168.1.85) 56(84) bytes of data.
64 bytes from master (192.168.1.85): icmp_seq=1 ttl=64 time=1.88 ms
[root@master ~]# ping  slave         # from master, test slave
PING slave (192.168.1.81) 56(84) bytes of data.
64 bytes from slave (192.168.1.81): icmp_seq=1 ttl=64 time=0.753 ms
4. Verify that openssh and rsync are installed
Note: CentOS 6.3 x86_64 installs openssh and rsync by default; the commands below are only a verification and may be skipped.
[root@master ~]# rpm  -qa  | grep  openssh
openssh-server-5.3p1-84.1.el6.x86_64
openssh-clients-5.3p1-84.1.el6.x86_64
openssh-askpass-5.3p1-84.1.el6.x86_64
openssh-5.3p1-84.1.el6.x86_64
[root@master ~]# service  sshd  status
openssh-daemon (pid  1717) is running...
[root@master ~]# chkconfig   sshd  --list
sshd            0:off   1:off   2:on    3:on    4:on    5:on    6:off
[root@master ~]# rpm -qa  |  grep  rsync
rsync-3.0.6-9.el6.x86_64
5. Create the hadoop account (create it on every node; this walkthrough then raises its privileges to root)
[root@master ~]# useradd  hadoop
[root@master ~]# passwd  hadoop
Changing password for user hadoop.
New password:
BAD PASSWORD: it is based on a dictionary word
Retype new password:
passwd: all authentication tokens updated successfully.
[root@master ~]# vi  /etc/passwd
# change the hadoop user's uid and gid to 0 (this makes the account root-equivalent)
hadoop:x:0:0::/home/hadoop:/bin/bash
6. Configure passwordless SSH from master to all other nodes
Note: perform the following on the NameNode host (master).
 
1. Edit the SSH daemon configuration to enable RSA / public-key authentication
[root@master ~]# vi   /etc/ssh/sshd_config
Uncomment the following lines (around lines 47-49 of the stock file):
RSAAuthentication yes              # enable RSA authentication
PubkeyAuthentication yes           # enable public/private key pair authentication
AuthorizedKeysFile      .ssh/authorized_keys  # path of the authorized-keys file on every target host
:wq                                # save and quit
2. Restart the sshd service
[root@master ~]# service  sshd  restart
Stopping sshd:                                             [  OK  ]
Starting sshd:                                             [  OK  ]
   
3. Switch to the hadoop account
[root@master ~]# su - hadoop
4. Generate an SSH key pair (press Enter at every prompt; do not set a passphrase)
[hadoop@master ~]$ ssh-keygen -t   rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
48:f4:29:86:4a:15:13:2b:24:c0:eb:02:db:41:1b:78 hadoop@master
The key's randomart image is:
+--[ RSA 2048]----+
|=.. =o.          |
|.+E. = . .       |
| o+oo + o        |
|.ooo o o         |
|oo..  . S        |
|o..              |
|.                |
|                 |
|                 |
+-----------------+
5. Once the key pair exists, append master's public key to every node so that master can reach them all (a loop version is sketched after the commands below)
[hadoop@master ~]$  ssh-copy-id -i  ~/.ssh/id_rsa.pub  hadoop@master
[hadoop@master ~]$  ssh-copy-id -i  ~/.ssh/id_rsa.pub  hadoop@slave
[hadoop@master ~]$  ssh-copy-id -i  ~/.ssh/id_rsa.pub  hadoop@node1
[hadoop@master ~]$  ssh-copy-id -i  ~/.ssh/id_rsa.pub  hadoop@node2
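The same distribution can be written as a loop (a sketch; it prompts for the hadoop password once per host until the keys are in place):
for h in master slave node1 node2; do
    ssh-copy-id -i ~/.ssh/id_rsa.pub "hadoop@$h"
done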
6. Verify that master can log in to the other nodes without being asked for a password
[hadoop@master ~]$ ssh  slave
Last login: Fri Apr 26 09:47:39 2013 from master.cs1cloud.internal
[hadoop@slave ~]$ exit
logout
Connection to slave closed.
[hadoop@master ~]$ ssh  node1
Last login: Fri Apr 26 09:48:16 2013 from master.cs1cloud.internal
[hadoop@node1 ~]$ exit
logout
Connection to node1 closed.
[hadoop@master ~]$ ssh  node2
[hadoop@node2 ~]$ exit
logout
Connection to node2 closed.
7. Configure the remaining nodes for passwordless SSH to the NameNode (master) and the rest of the cluster
Note: the following is performed on the slave host; node1 and node2 repeat the same steps (see the note at the end of this section).
1. Edit the SSH daemon configuration (as root)
[root@slave ~]# vim /etc/ssh/sshd_config
(uncomment the following lines, around lines 47-49 of the stock file)
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile      .ssh/authorized_keys
:wq                                # save and quit
2. Restart the sshd service
[root@slave ~]# service  sshd  restart
Stopping sshd:                                             [  OK  ]
Starting sshd:                                             [  OK  ]
3. Switch to the hadoop account
[root@slave ~]# su - hadoop
4. Generate an SSH key pair
[hadoop@slave ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
0b:91:c7:d2:0a:d9:4e:e2:84:a4:6a:12:12:ba:f8:71 hadoop@slave
The key's randomart image is:
+--[ RSA 2048]----+
|. .              |
|.+ . o +         |
|= . = * +        |
|+o o = =         |
|=.. E + S        |
|o. o   . .       |
|  .     .        |
|                 |
|                 |
+-----------------+
5. Append the generated public key to all other hosts (including this one)
[hadoop@slave ~]$ ssh-copy-id -i  ~/.ssh/id_rsa.pub  hadoop@slave
[hadoop@slave ~]$ ssh-copy-id -i  ~/.ssh/id_rsa.pub  hadoop@master
[hadoop@slave ~]$ ssh-copy-id -i  ~/.ssh/id_rsa.pub  hadoop@node1
[hadoop@slave ~]$ ssh-copy-id -i  ~/.ssh/id_rsa.pub  hadoop@node2
6. Verify that slave can log in to the other hosts without a password
[hadoop@slave ~]$ ssh  master
Last login: Fri Apr 26 09:43:00 2013 from master.cs1cloud.internal
[hadoop@master ~]$ exit
logout
Connection to master closed.
[hadoop@slave ~]$ ssh  node1
Last login: Fri Apr 26 09:51:22 2013 from master.cs1cloud.internal
[hadoop@node1 ~]$ exit
logout
Connection to node1 closed.
[hadoop@slave ~]$ ssh  node2
Last login: Fri Apr 26 09:51:44 2013 from master.cs1cloud.internal
[hadoop@node2 ~]$ exit
logout
Connection to node2 closed.
7. Verify that slave can log in to itself without a password
[hadoop@slave ~]$ ssh  slave
Last login: Fri Apr 12 21:00:30 2013 from slave
Note: repeat the same steps on the remaining nodes (node1 and node2) so that every node can log in to itself and to every other node without a password (a scripted version is sketched below).
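For node1 and node2 those repeated steps can be captured in a short script (a sketch; run it as the hadoop user on each remaining node, assuming the hostnames resolve and you can type the hadoop password when prompted):
# generate a key pair only if this node does not have one yet
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
# append this node's public key to every host in the cluster, including itself
for h in master slave node1 node2; do
    ssh-copy-id -i ~/.ssh/id_rsa.pub "hadoop@$h"
done
# non-interactive check: each line should print the remote hostname with no password prompt
for h in master slave node1 node2; do
    ssh -o BatchMode=yes "hadoop@$h" hostname
done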
8. Install the Sun JDK (Java environment)
Note: switch back to root and, on every node, remove OpenJDK, then upload and install the Sun JDK.
Note: OpenJDK has to be removed and the Sun JDK installed, otherwise the javac command is missing; Eclipse will be used as the development tool later.
① CentOS 6.3 x86_64 ships with a Java runtime
[root@master ~]# java  -version
java version "1.6.0_24"
OpenJDK Runtime Environment (IcedTea6 1.11.11) (rhel-1.61.1.11.11.el6_4-x86_64)
OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)
② The OpenJDK package installed by default:
[root@hadoop ~]# rpm  -qa  | grep  jdk
java-1.6.0-openjdk-1.6.0.0-1.61.1.11.11.el6_4.x86_64
③ Remove the system's default OpenJDK
[root@master ~]# rpm -e java-1.6.0-openjdk-1.6.0.0-1.61.1.11.11.el6_4.x86_64  --nodeps
④ Upload and install the Sun JDK
[root@master ~]# rpm -ivh  jdk-7u17-linux-x64.rpm
⑤ Verify the Java version
[root@master ~]# java  -version
java version "1.7.0_17"
Java(TM) SE Runtime Environment (build 1.7.0_17-b02)
Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)
Note: the Java home directory is /usr/java/jdk1.7.0_17/.
With the system's default Java environment, javac (the command-line Java compiler) is not installed.
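A quick way to confirm on each node that the Sun JDK is now the active environment (a sketch):
java -version                  # should report "Java(TM) SE Runtime Environment", not OpenJDK
javac -version                 # only present with a full JDK; expected: javac 1.7.0_17
ls -d /usr/java/jdk1.7.0_17    # the JAVA_HOME that hadoop-env.sh will point to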
9. Upload and unpack the Hadoop package on every node
Note: upload it to /home/hadoop; the hadoop-1.1.2 tar.gz release is used here.
[root@master ~]# cd /home/hadoop/
[root@master hadoop]# ls
[root@master hadoop]# tar -zxf  hadoop-1.1.2.tar.gz
[root@master hadoop]# chown  hadoop.hadoop  -R  hadoop-1.1.2     # change the owner of the extracted tree
Note: the package was uploaded with the file-transfer tool built into SecureCRT.
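Once passwordless SSH is working, the copy and extraction can also be scripted from master instead of uploading the tarball to each node separately (a sketch; run as the hadoop user on master, assuming hadoop-1.1.2.tar.gz already sits in /home/hadoop there):
for h in slave node1 node2; do
    scp ~/hadoop-1.1.2.tar.gz "$h":~/                # copy the release to the node
    ssh "$h" 'tar -zxf ~/hadoop-1.1.2.tar.gz -C ~/'  # unpack it under /home/hadoop
done
Because the commands run as the hadoop user, the extracted files are already owned by hadoop and no chown is needed.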
10. Configure Hadoop
The files are edited on the master host first and then pushed to the other nodes with scp (section 11).
1. Create the required directories on master
[root@master ~]# mkdir  -p  /usr/hadoop/tmp       # must be created by hand on every node
[root@master ~]# chmod  777 /usr/hadoop/tmp       # write permission is required, otherwise formatting fails
[root@master hadoop]# chown  hadoop.hadoop  -R  /usr/hadoop/
2. Edit the masters file, which names the SecondaryNameNode host
[root@master ~]# cd  /home/hadoop/hadoop-1.1.2/conf/
[root@master hadoop]# vi  masters
192.168.1.81                           # SecondaryNameNode host; the file is identical on all nodes
:wq                                    # save and quit
Note: despite its name, the masters file lists the host that runs the SecondaryNameNode, which periodically checkpoints the NameNode's metadata and so adds a measure of reliability; it is not a hot-standby NameNode.
3. Edit the slaves file
[root@master conf]# vi   slaves
192.168.1.88   #node1
192.168.1.89   #node2
:wq                                    # save and quit
Note: the slaves file lists the DataNode members of the cluster.
4. Edit core-site.xml
[root@master conf]# vi   core-site.xml
After editing, the file contains:
<configuration>
    <!-- global properties -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/hadoop/tmp</value>
    </property>
    <!-- file system properties -->
    <property>
        <name>fs.default.name</name>
        <value>hdfs://master:9000</value>
    </property>
</configuration>
5. Edit hdfs-site.xml
[root@master conf]# vi    hdfs-site.xml
Note: the default replication factor is 3; with only two DataNodes in this cluster, a block can be replicated at most twice, so it is set to 2.
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
</configuration>
6. Edit mapred-site.xml
Note: this is the MapReduce configuration file; it sets the address and port of the JobTracker.
[root@master conf]# vi   mapred-site.xml
After editing, the file contains:
<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>master:9001</value>
    </property>
</configuration>
7. Set the Java environment in hadoop-env.sh
[root@master conf]# vi  hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_17
:wq                                # save and quit
           
8. Create the same directories on the other nodes
On slave:
[root@slave ~]# mkdir  -p  /usr/hadoop/tmp
[root@slave ~]# chmod  777  /usr/hadoop/tmp
[root@slave ~]# chown  hadoop.hadoop   -R  /usr/hadoop/
On node1:
[root@node1 ~]# mkdir   -p  /usr/hadoop/tmp
[root@node1 ~]# chmod  777  /usr/hadoop/tmp
[root@node1 ~]# chown  hadoop.hadoop -R  /usr/hadoop/
On node2:
[root@node2 hadoop]# mkdir  -p  /usr/hadoop/tmp
[root@node2 hadoop]# chmod  777  /usr/hadoop/tmp
[root@node2 hadoop]# chown  hadoop.hadoop  -R  /usr/hadoop/
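The three per-node blocks above can be collapsed into one loop (a sketch, run as the hadoop user on master; it relies on the passwordless SSH configured earlier and on the hadoop account having been made root-equivalent, since /usr is normally writable only by root):
for h in slave node1 node2; do
    ssh "hadoop@$h" 'mkdir -p /usr/hadoop/tmp && chmod 777 /usr/hadoop/tmp && chown -R hadoop.hadoop /usr/hadoop'
done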
11. Synchronize the configuration files to all nodes (run the scp commands on master)
[root@master ~]# su  -  hadoop                 # switch to the hadoop account
[hadoop@master ~]$ scp -rpv  ~/hadoop-1.1.2/conf/*   slave:~/hadoop-1.1.2/conf/
[hadoop@master ~]$ scp -rpv  ~/hadoop-1.1.2/conf/*   node1:~/hadoop-1.1.2/conf/
[hadoop@master ~]$ scp -rpv  ~/hadoop-1.1.2/conf/*   node2:~/hadoop-1.1.2/conf/
The files being synchronized (all under /home/hadoop/hadoop-1.1.2/conf/) are:
masters
slaves
core-site.xml
hdfs-site.xml
mapred-site.xml
hadoop-env.sh
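After the copy, a quick consistency check from master can confirm that every node ended up with identical files (a sketch; each host should print the same checksum for a given file):
for h in master slave node1 node2; do
    echo "== $h =="
    ssh "$h" 'cd ~/hadoop-1.1.2/conf && md5sum masters slaves core-site.xml hdfs-site.xml mapred-site.xml hadoop-env.sh'
done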
12. Start and verify
12.1 Make sure iptables is stopped on every node
[root@master ~]# service  iptables  status
iptables: Firewall is not running.
12.2 Format the HDFS filesystem
Note: run this on the NameNode (master) as the hadoop user. It only needs to be done once; do not format again on later restarts.
[root@master ~]# su - hadoop
[hadoop@master ~]$ hadoop-1.1.2/bin/hadoop   namenode  -format
13/04/26 11:13:11 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = master/192.168.1.85
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.1.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1 -r 1440782; compiled by 'hortonfo' on Thu Jan 31 02:03:24 UTC 2013
************************************************************/
13/04/26 11:13:11 INFO util.GSet: VM type       = 64-bit
13/04/26 11:13:11 INFO util.GSet: 2% max memory = 19.33375 MB
13/04/26 11:13:11 INFO util.GSet: capacity      = 2^21 = 2097152 entries
13/04/26 11:13:11 INFO util.GSet: recommended=2097152, actual=2097152
13/04/26 11:13:12 INFO namenode.FSNamesystem: fsOwner=hadoop
13/04/26 11:13:12 INFO namenode.FSNamesystem: supergroup=supergroup
13/04/26 11:13:12 INFO namenode.FSNamesystem: isPermissionEnabled=true
13/04/26 11:13:12 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
13/04/26 11:13:12 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
13/04/26 11:13:12 INFO namenode.NameNode: Caching file names occuring more than 10 times
13/04/26 11:13:12 INFO common.Storage: Image file of size 112 saved in 0 seconds.
13/04/26 11:13:12 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/usr/hadoop/tmp/dfs/name/current/edits
13/04/26 11:13:12 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/usr/hadoop/tmp/dfs/name/current/edits
13/04/26 11:13:12 INFO common.Storage: Storage directory /usr/hadoop/tmp/dfs/name has been successfully formatted.
13/04/26 11:13:12 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.168.1.85
************************************************************/
12.3 Start Hadoop
[hadoop@master ~]$ hadoop-1.1.2/bin/start-all.sh
starting namenode, logging to /home/hadoop/hadoop-1.1.2/libexec/../logs/hadoop-hadoop-namenode-master.out
192.168.1.89: starting datanode, logging to /home/hadoop/hadoop-1.1.2/libexec/../logs/hadoop-hadoop-datanode-node2.out
192.168.1.88: starting datanode, logging to /home/hadoop/hadoop-1.1.2/libexec/../logs/hadoop-hadoop-datanode-node1.out
192.168.1.81: starting secondarynamenode, logging to /home/hadoop/hadoop-1.1.2/libexec/../logs/hadoop-hadoop-secondarynamenode-slave.out
starting jobtracker, logging to /home/hadoop/hadoop-1.1.2/libexec/../logs/hadoop-hadoop-jobtracker-master.out
192.168.1.89: starting tasktracker, logging to /home/hadoop/hadoop-1.1.2/libexec/../logs/hadoop-hadoop-tasktracker-node2.out
192.168.1.88: starting tasktracker, logging to /home/hadoop/hadoop-1.1.2/libexec/../logs/hadoop-hadoop-tasktracker-node1.out
The startup log shows the order of events: the NameNode starts first, then the DataNodes (node1, node2, ...), then the SecondaryNameNode, then the JobTracker, and finally the TaskTrackers.
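Which daemon ended up on which host can also be checked with jps, which ships with the Sun JDK (a sketch, run as the hadoop user on master; expect NameNode and JobTracker on master, SecondaryNameNode on slave, and DataNode plus TaskTracker on node1 and node2):
for h in master slave node1 node2; do
    echo "== $h =="
    ssh "$h" /usr/java/jdk1.7.0_17/bin/jps | grep -v Jps
done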
12.4 Verification, method 1:
After Hadoop starts successfully, a dfs directory appears under the tmp directory on master, and the tmp directory on each DataNode host now contains both a dfs and a mapred directory.
[hadoop@master ~]$ ll  /usr/hadoop/tmp/
total 4
drwxrwxr-x. 3 hadoop hadoop 4096 Apr 26 11:13 dfs
[root@node1 hadoop]# ll  /usr/hadoop/tmp/
total 8
drwxrwxr-x. 3 hadoop hadoop 4096 Apr 26 11:16 dfs
drwxrwxr-x. 3 hadoop hadoop 4096 Apr 26 11:16 mapred
12.5 Verification, method 2:
[hadoop@master ~]$ hadoop-1.1.2/bin/hadoop  dfsadmin  -report
Configured Capacity: 47641346048 (44.37 GB)
Present Capacity: 37069099038 (34.52 GB)
DFS Remaining: 37069041664 (34.52 GB)
DFS Used: 57374 (56.03 KB)
DFS Used%: 0%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 2 (2 total, 0 dead)
Name: 192.168.1.89:50010
Decommission Status : Normal
Configured Capacity: 23820673024 (22.18 GB)
DFS Used: 28687 (28.01 KB)
Non DFS Used: 5286121457 (4.92 GB)
DFS Remaining: 18534522880(17.26 GB)
DFS Used%: 0%
DFS Remaining%: 77.81%
Last contact: Fri Apr 26 11:21:13 CST 2013
Name: 192.168.1.88:50010
Decommission Status : Normal
Configured Capacity: 23820673024 (22.18 GB)
DFS Used: 28687 (28.01 KB)
Non DFS Used: 5286125553 (4.92 GB)
DFS Remaining: 18534518784(17.26 GB)
DFS Used%: 0%
DFS Remaining%: 77.81%
Last contact: Fri Apr 26 11:21:13 CST 2013
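A third check is to push a small file through HDFS end to end (a sketch, run as the hadoop user on master; the /test path and file name are arbitrary examples):
echo "hello hadoop" > /tmp/hello.txt
hadoop-1.1.2/bin/hadoop fs -mkdir /test
hadoop-1.1.2/bin/hadoop fs -put /tmp/hello.txt /test/
hadoop-1.1.2/bin/hadoop fs -ls /test
hadoop-1.1.2/bin/hadoop fs -cat /test/hello.txt    # should print: hello hadoop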
13. View the cluster in a browser
1. JobTracker UI:  http://192.168.1.85:50030
2. HDFS (NameNode) UI:  http://192.168.1.85:50070
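If no browser is available on the lab network, the same pages can be probed from the command line (a sketch; both requests should return HTTP 200 once the daemons are up):
curl -s -o /dev/null -w "%{http_code}\n" http://192.168.1.85:50030/jobtracker.jsp   # JobTracker UI
curl -s -o /dev/null -w "%{http_code}\n" http://192.168.1.85:50070/dfshealth.jsp    # NameNode / HDFS UI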

14. Stop the Hadoop cluster
[hadoop@master ~]$ hadoop-1.1.2/bin/stop-all.sh


stopping jobtracker
192.168.1.88: stopping tasktracker
192.168.1.89: stopping tasktracker
stopping namenode
192.168.1.88: stopping datanode
192.168.1.89: stopping datanode
192.168.1.81: stopping secondarynamenode
------ At this point, the fully distributed Hadoop cluster is up and running.
 