March 23, 2014
I recently took some time to organize my notes from an earlier Hadoop setup. This walkthrough uses the latest Hadoop release for some single-machine testing, and I put this document together along the way.
I. Preparing the Environment
1. Hadoop is written in Java, so JDK 1.6 or later must be installed:
apt-get install openjdk-6-jdk
2. Hadoop starts the daemons on slave hosts over SSH, so OpenSSH must be installed:
apt-get install openssh-server
3. Hadoop is updated frequently; we will install the latest release, Hadoop 2.3.0.
4. Configure the corresponding hosts entries and disable iptables and SELinux (steps omitted).
5. Create the same user on every node and configure passwordless SSH authentication (see the next section).
II. Configuring Passwordless SSH Login
helight:data1$ ssh-keygen -t rsa # press Enter at each prompt to generate the key pair
helight:data1$ cd
helight:~$ ls
helight:~$ ls -a
. .. .bash_logout .bashrc .profile .ssh
helight:~$ cd .ssh/
helight:.ssh$ ls
id_rsa id_rsa.pub
helight:.ssh$ cat id_rsa.pub >> authorized_keys
helight:.ssh$ ls
authorized_keys id_rsa id_rsa.pub
helight:.ssh$ chmod 600 authorized_keys
helight:.ssh$ chmod 700 ../.ssh # the directory permission must be 700
root@debian:/data1# vim /etc/ssh/sshd_config # enable public key authentication
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile %h/.ssh/authorized_keys
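After editing sshd_config, restart the SSH service and make sure you can log in without a password (a quick sanity check; the service script below is the usual one on Debian):
root@debian:/data1# /etc/init.d/ssh restart
helight:~$ ssh localhost # should log in without prompting for a password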
III. Installing and Configuring Hadoop
1. Download
Download a Hadoop release from http://hadoop.apache.org/. I downloaded version 2.3.0, the latest at the time of writing, and extracted it into the local file system; I put it under /data1/.
$tar xzf hadoop-2.3.0.tar.gz
2. Set user environment variables: edit ~/.bashrc and add JAVA_HOME, HADOOP_HOME, and PATH:
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64
export HADOOP_HOME=/data1/hadoop-2.3.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
After editing, open a new shell so the environment variables above take effect in the new terminal, or run source ~/.bashrc in the current one.
Now check the result: verify that the hadoop command can be executed.
helight:data1$ hadoop version
Hadoop 2.3.0
Subversion http://svn.apache.org/repos/asf/hadoop/common -r 1567123
Compiled by jenkins on 2014-02-11T13:40Z
Compiled with protoc 2.5.0
From source with checksum dfe46336fbc6a044bc124392ec06b85
This command was run using /data1/hadoop-2.3.0/share/hadoop/common/hadoop-common-2.3.0.jar
helight:data1$
3. Hadoop environment variables: hadoop-env.sh
cd hadoop-2.3.0/etc/hadoop
vim hadoop-env.sh
# The java implementation to use.
# Replace the default "export JAVA_HOME=${JAVA_HOME}" line with an explicit path:
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64
export HADOOP_HOME=/data1/hadoop-2.3.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
4. slaves: list the slave nodes
There is only one node here, so the file contains just localhost. With multiple nodes you would add hosts entries, list each hostname here, and set up passwordless SSH login to each of them; see the sketch after the file listing.
helight:hadoop$ vi slaves
localhost
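For reference, a multi-node slaves file is simply one hostname per line (hypothetical hostnames, assuming matching hosts entries and passwordless SSH to each):
slave1
slave2
slave3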
5. HDFS core configuration file: core-site.xml
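A minimal single-node sketch, following the standard Hadoop 2.x pseudo-distributed setup (the port is the conventional 9000; adjust to taste):
<configuration>
  <property>
    <!-- URI of the default file system; points at the local NameNode -->
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>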
6. HDFS storage configuration file: hdfs-site.xml
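A minimal sketch for a single DataNode: replication is dropped to 1. The /data1/hadoop-data paths are assumptions; any writable local directories work:
<configuration>
  <property>
    <!-- only one DataNode, so keep a single replica -->
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <!-- where the NameNode stores its metadata (assumed path) -->
    <name>dfs.namenode.name.dir</name>
    <value>file:///data1/hadoop-data/name</value>
  </property>
  <property>
    <!-- where the DataNode stores its blocks (assumed path) -->
    <name>dfs.datanode.data.dir</name>
    <value>file:///data1/hadoop-data/data</value>
  </property>
</configuration>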
7. MapReduce resource manager (YARN) configuration file: yarn-site.xml
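A minimal sketch enabling the shuffle auxiliary service that MapReduce jobs need on each NodeManager (standard for Hadoop 2.x):
<configuration>
  <property>
    <!-- the shuffle service MapReduce relies on -->
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>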
8. MapReduce framework configuration file: mapred-site.xml
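A minimal sketch telling MapReduce to run on YARN. Note that in the 2.3.0 distribution this file usually has to be created first by copying the bundled template:
helight:hadoop$ cp mapred-site.xml.template mapred-site.xml
<configuration>
  <property>
    <!-- run MapReduce jobs on the YARN framework -->
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>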
IV. Formatting the File System and Starting Hadoop
1. Format a new distributed file system (hdfs namenode -format)
This initializes the HDFS directory structure and data.
helight:sbin$ hdfs namenode -format
2. Start Hadoop:
start-all.sh is used here. This script actually just runs start-dfs.sh and start-yarn.sh, so you can also start the stack in stages: start-dfs.sh starts HDFS, and start-yarn.sh starts the MapReduce scheduling framework (YARN).
helight:sbin$ ./start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
14/03/23 22:38:45 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /data1/hadoop-2.3.0/logs/hadoop-helight-namenode-debian.out
localhost: starting datanode, logging to /data1/hadoop-2.3.0/logs/hadoop-helight-datanode-debian.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /data1/hadoop-2.3.0/logs/hadoop-helight-secondarynamenode-debian.out
14/03/23 22:39:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting yarn daemons
starting resourcemanager, logging to /data1/hadoop-2.3.0/logs/yarn-helight-resourcemanager-debian.out
localhost: starting nodemanager, logging to /data1/hadoop-2.3.0/logs/yarn-helight-nodemanager-debian.out
3. Use jps to check whether the daemons have started
helight:sbin$ jps
14506 NodeManager
14205 SecondaryNameNode
14720 Jps
14072 DataNode
13981 NameNode
14411 ResourceManager
4. Check the cluster status
hdfs dfsadmin -report
14/03/23 22:43:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 83225169920 (77.51 GB)
Present Capacity: 78803722240 (73.39 GB)
DFS Remaining: 78803697664 (73.39 GB)
DFS Used: 24576 (24 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Datanodes available: 1 (1 total, 0 dead)
Live datanodes:
Name: 127.0.0.1:50010 (localhost)
Hostname: debian.xu
Decommission Status : Normal
Configured Capacity: 83225169920 (77.51 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 4421447680 (4.12 GB)
DFS Remaining: 78803697664 (73.39 GB)
DFS Used%: 0.00%
DFS Remaining%: 94.69%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Last contact: Sun Mar 23 22:43:48 HKT 2014
5. View resources through the web UI (http://localhost:8088)
This mainly shows MapReduce job management.
6. Check HDFS status (http://localhost:50070)
Here you can see how HDFS is running; it serves as a simple operations dashboard for HDFS.
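As a final end-to-end check, you can run one of the bundled example jobs (a sketch; the examples jar path below assumes the standard 2.3.0 binary layout):
helight:hadoop-2.3.0$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.3.0.jar pi 2 5
If everything is configured correctly, the job runs through YARN and prints an estimate of pi.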