关于apache:HBase master won\\’t start, Can\\’t connect to hbase.rootdir

HBase master won't start, Can't connect to hbase.rootdir

我正在尝试根据 apache 网站上的设置以伪分布式模式运行 HBase,但我无法正确配置 hbase.root 目录。

这是我的配置文件的样子:

在 Hadoop 目录中:

conf/core-site.xml:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>

  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>

  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>

conf/hdfs-site.xml:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>

  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>

  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
  </property>

</configuration>

conf/mapred-site.xml:

1
2
3
4
5
6
7
8
9
10
11
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>

在我的 HBase 目录中

hbase-site.xml:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/**
 * Copyright 2010 The Apache Software Foundation
 *
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 *"License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an"AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
-->
<configuration>

  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>

  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>

  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>

  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>localhost</value>
  </property>

</configuration>

当我运行 start-hbase.sh 脚本时,它说它启动了 zookeeper、hbase 主服务器和区域服务器,并且我能够登录到它们。然后我可以访问 hbase shell,但我不能创建表或任何东西。我尝试使用我的网络浏览器连接到主状态 ui,但它无法连接。起初我以为是因为我在亚马逊实例上运行它,并且端口 9000 没有被授予权限,但我发现它是。端口 50030 和 50070 被授予相同的权限,我可以从它们访问作业跟踪器和名称节点。我检查了日志,发现了这个错误:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
2013-08-05 18:00:35,613 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-08-05 18:00:35,616 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on connection exception: java.net.ConnectException: Connection refused
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1136)
    at org.apache.hadoop.ipc.Client.call(Client.java:1112)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
    at com.sun.proxy.$Proxy10.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:411)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:135)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:276)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:241)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:100)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1411)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1429)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
    at org.apache.hadoop.hbase.util.FSUtils.getRootDir(FSUtils.java:667)
    at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:112)
    at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:560)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:419)
    at java.lang.Thread.run(Thread.java:724)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:708)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:453)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:579)
    at org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:202)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1243)
    at org.apache.hadoop.ipc.Client.call(Client.java:1087)
    ... 17 more

如您所见,它正在尝试访问 localhost/127.0.0.1:9000,这显然是错误的。:

Call to localhost/127.0.0.1:9000 failed on connection exception

这是我的 /etc/hosts 文件的样子:

1
2
3
4
5
6
7
8
9
127.0.0.1 localhost

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

还有:
用实例的公共 DNS 替换 localhost 也不起作用


先提几点建议。您实际上不需要将 dfs.replication、mapred.job.tracker 放在 core-site.xml 中,将 dfs.support.append 放在 hbase-site.xml 文件中。这不是必需的。

请确保 NN 运行良好且已退出安全模式。另外,最好关闭 IPv6 并在 hbase-site.xml 文件中添加 hbase.zookeeper.property.dataDir 和 hbase.zookeeper.property.clientPort 并将 hbase-env.sh 中的 export HBASE_MANAGES_ZK 设置为 true。更改配置文件后重新启动 HBase。