2017-02-24

HDFS HA cluster: standby NameNode does not become active when the real active NameNode is shut down

I have configured HDFS in HA mode, with one "active" and one "standby" NameNode, and I started ZKFC on both. When I stop the ZKFC on the active node, the standby node changes state and takes over as the "active" node. The problem is when, with ZKFC running on both servers, I shut down the active server itself: the standby server never changes state and always remains in standby.

My core-site.xml:

<configuration> 
<property> 
    <name>fs.default.name</name> 
    <value>hdfs://auto-ha</value> 
</property> 
</configuration> 

My hdfs-site.xml:

<configuration> 
<property> 
    <name>dfs.namenode.rpc-bind-host</name> 
    <value>0.0.0.0</value> 
    <description> 
    The actual address the RPC server will bind to. If this optional address is 
    set, it overrides only the hostname portion of dfs.namenode.rpc-address. 
    It can also be specified per name node or name service for HA/Federation. 
    This is useful for making the name node listen on all interfaces by 
    setting it to 0.0.0.0. 
    </description> 
</property> 

<property> 
    <name>dfs.namenode.servicerpc-bind-host</name> 
    <value>0.0.0.0</value> 
    <description> 
    The actual address the service RPC server will bind to. If this optional address is 
    set, it overrides only the hostname portion of dfs.namenode.servicerpc-address. 
    It can also be specified per name node or name service for HA/Federation. 
    This is useful for making the name node listen on all interfaces by 
    setting it to 0.0.0.0. 
    </description> 
</property> 

<property> 
    <name>dfs.namenode.http-bind-host</name> 
    <value>0.0.0.0</value> 
    <description> 
    The actual address the HTTP server will bind to. If this optional address 
    is set, it overrides only the hostname portion of dfs.namenode.http-address. 
    It can also be specified per name node or name service for HA/Federation. 
    This is useful for making the name node HTTP server listen on all 
    interfaces by setting it to 0.0.0.0. 
    </description> 
</property> 

<property> 
    <name>dfs.namenode.https-bind-host</name> 
    <value>0.0.0.0</value> 
    <description> 
    The actual address the HTTPS server will bind to. If this optional address 
    is set, it overrides only the hostname portion of dfs.namenode.https-address. 
    It can also be specified per name node or name service for HA/Federation. 
    This is useful for making the name node HTTPS server listen on all 
    interfaces by setting it to 0.0.0.0. 
    </description> 
</property> 
<property> 
    <name>dfs.replication</name> 
    <value>2</value> 
</property> 
<property> 
    <name>dfs.name.dir</name> 
    <value>file:///hdfs/name</value> 
</property> 
<property> 
    <name>dfs.data.dir</name> 
    <value>file:///hdfs/data</value> 
</property> 
<property> 
    <name>dfs.permissions</name> 
    <value>false</value> 
</property> 
<property> 
    <name>dfs.nameservices</name> 
    <value>auto-ha</value> 
</property> 
<property> 
    <name>dfs.ha.namenodes.auto-ha</name> 
    <value>nn01,nn02</value> 
</property> 
<property> 
    <name>dfs.namenode.rpc-address.auto-ha.nn01</name> 
    <value>master1:8020</value> 
</property> 
<property> 
    <name>dfs.namenode.http-address.auto-ha.nn01</name> 
    <value>master1:50070</value> 
</property> 
<property> 
    <name>dfs.namenode.rpc-address.auto-ha.nn02</name> 
    <value>master2:8020</value> 
</property> 
<property> 
    <name>dfs.namenode.http-address.auto-ha.nn02</name> 
    <value>master2:50070</value> 
</property> 
<property> 
    <name>dfs.namenode.shared.edits.dir</name> 
    <value>qjournal://master1:8485;master2:8485;master3:8485/auto-ha</value> 
</property> 
<property> 
    <name>dfs.journalnode.edits.dir</name> 
    <value>/hdfs/journalnode</value> 
</property> 
<property> 
    <name>dfs.ha.fencing.methods</name> 
    <value>sshfence</value> 
</property> 
<property> 
    <name>dfs.ha.fencing.ssh.private-key-files</name> 
    <value>/home/ikerlan/.ssh/id_rsa</value> 
</property> 
<property> 
    <name>dfs.ha.automatic-failover.enabled.auto-ha</name> 
    <value>true</value> 
</property> 
<property> 
    <name>ha.zookeeper.quorum</name> 
    <value>master1:2181,master2:2181,master3:2181</value> 
</property> 
<property> 
<name>dfs.client.failover.proxy.provider.auto-ha</name> 
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value> 
</property> 
<property> 
<name>dfs.namenode.datanode.registration.ip-hostname-check</name> 
<value>false</value> 
</property> 
</configuration> 

I checked the logs, and when it tries to fence I see the following failure:

2017-02-24 12:46:29,389 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master2/172.16.8.232:8020. Already tried 0 time$ 
2017-02-24 12:46:49,399 WARN org.apache.hadoop.ha.FailoverController: Unable to gracefully make NameNode at master2/172.16.8.232:8020 $ 
org.apache.hadoop.net.ConnectTimeoutException: Call From master1/172.16.8.231 to master2:8020 failed on socket timeout exception: org.$ 
     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) 
     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 
     at java.lang.reflect.Constructor.newInstance(Constructor.java:422) 
     at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792) 
     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:751) 
     at org.apache.hadoop.ipc.Client.call(Client.java:1479) 
     at org.apache.hadoop.ipc.Client.call(Client.java:1412) 
     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) 
     at com.sun.proxy.$Proxy9.transitionToStandby(Unknown Source) 
     at org.apache.hadoop.ha.protocolPB.HAServiceProtocolClientSideTranslatorPB.transitionToStandby(HAServiceProtocolClientSideTran$ 
     at org.apache.hadoop.ha.FailoverController.tryGracefulFence(FailoverController.java:172) 
     at org.apache.hadoop.ha.ZKFailoverController.doFence(ZKFailoverController.java:514) 
     at org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:505) 
     at org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverController.java:61) 
     at org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.fenceOldActive(ZKFailoverController.java:892) 
     at org.apache.hadoop.ha.ActiveStandbyElector.fenceOldActive(ActiveStandbyElector.java:910) 
     at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:809) 
     at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:418) 
     at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599) 
     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498) 
Caused by: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch :$ 
     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:534) 
     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495) 

Answer

I added the following properties and now it works correctly:

hdfs-site.xml

<property> 
    <name>dfs.ha.fencing.methods</name> 
    <value>shell(/bin/true)</value> 
</property> 
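For context, the `shell(...)` fencing method runs the given command and treats a zero exit status as a successful fence. Since `/bin/true` does nothing and always exits 0, the fence step always "succeeds" and can never block the standby from taking over — at the cost of providing no real fencing guarantee. A minimal sketch of why:

```shell
# shell(...) fencing treats exit status 0 as "old active successfully fenced".
# /bin/true always exits 0, so this fence can never fail -- but it also
# does nothing to actually isolate the old active NameNode.
/bin/true
echo "fence exit status: $?"
```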

core-site.xml

<property> 
    <name>ha.zookeeper.quorum</name> 
    <value>master1:2181,master2:2181,master3:2181</value> 
</property> 

The problem was that sshfence could not connect to the downed host; with shell(/bin/true) it works correctly.
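Note that `dfs.ha.fencing.methods` accepts a newline-separated list of methods, tried in order until one succeeds. If you want to keep real fencing for the cases where the old active host is still reachable, a common pattern is to list `sshfence` first and fall back to `shell(/bin/true)` only when SSH fails — a sketch of that configuration:

```
<property> 
    <name>dfs.ha.fencing.methods</name> 
    <value>sshfence
shell(/bin/true)</value> 
</property> 
```

With this, a shutdown of the whole active host (where sshfence necessarily times out) still lets the standby take over, while a NameNode process failure on a live host is fenced properly over SSH.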
