2016-07-10 17 views
0

Hadoop 2.7 2ノードクラスタを設定しようとしています。私がhadoopを開始するとき start-dfs.shとstart-yarn.shを使用しています。 。Hadoop 2ノードクラスタのUIが1つのライブノードを示しています

Here is my jps command on my master: 
23913 Jps 
22140 SecondaryNameNode 
22316 ResourceManager 
22457 NodeManager 
21916 DataNode 
21777 NameNode 

Here is my jps command on my slave: 
17223 Jps 
14225 DataNode 
14363 NodeManager 

しかし、私がHadoopクラスタのUIを見ると、ライブデータノードは1つしか表示されません。

Here is the dfs admin report : /bin/hdfs dfsadmin -report 
Live datanodes (1): 

Name: 192.168.1.104:50010 (nn1.cluster.com) 
Hostname: nn1.cluster.com 
Decommission Status : Normal 
Configured Capacity: 401224601600 (373.67 GB) 
DFS Used: 237568 (232 KB) 
Non DFS Used: 48905121792 (45.55 GB) 
DFS Remaining: 352319242240 (328.12 GB) 
DFS Used%: 0.00% 
DFS Remaining%: 87.81% 

すべてのマシンでsshを実行できます。

Here is the sample of name node logs(i.p = 192.168.1.104) : 
2016-07-12 01:17:34,293 INFO BlockStateChange: BLOCK* processReport: from storage DS-d9ed40cf-bd5d-4033-a6ca-14fb4a8c3587 node DatanodeRegistration(192.168.1.104:50010, datanodeUuid=b702b518-5daa-4fa1-8e69-e4d620a72470, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=CID-e86d0353-9f33-495b-88fa-16035abd3672;nsid=616310490;c=0), blocks: 24, hasStaleStorage: false, processing time: 0 msecs 
2016-07-12 01:17:35,501 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* registerDatanode: from DatanodeRegistration(192.168.1.104:50010, datanodeUuid=37038a9f-23ac-42e2-abea-bdf356aaefbe, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=CID-e86d0353-9f33-495b-88fa-16035abd3672;nsid=616310490;c=0) storage 37038a9f-23ac-42e2-abea-bdf356aaefbe 
2016-07-12 01:17:35,501 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: BLOCK* registerDatanode: 192.168.1.104:50010 
2016-07-12 01:17:35,501 INFO org.apache.hadoop.net.NetworkTopology: Removing a node: /default-rack/192.168.1.104:50010 
2016-07-12 01:17:35,501 INFO org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor: Number of failed storage changes from 0 to 0 
2016-07-12 01:17:35,502 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/192.168.1.104:50010 
2016-07-12 01:17:35,504 INFO org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor: Number of failed storage changes from 0 to 0 
2016-07-12 01:17:35,504 INFO org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor: Adding new storage ID DS-495b6b0e-f1fc-407c-bb9f-6c314c2fdaec for DN 192.168.1.104:50010 

here is the sample datanode logs (i.p = 192.168.1.104): 
2016-07-12 02:02:12,044 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand action : DNA_REGISTER from nn1.cluster.com/192.168.1.104:8020 with active state 
2016-07-12 02:02:12,045 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-1235752202-192.168.1.104-1468159707934 (Datanode Uuid b702b518-5daa-4fa1-8e69-e4d620a72470) service to nn1.cluster.com/192.168.1.104:8020 beginning handshake with NN 
2016-07-12 02:02:12,047 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool Block pool BP-1235752202-192.168.1.104-1468159707934 (Datanode Uuid b702b518-5daa-4fa1-8e69-e4d620a72470) service to nn1.cluster.com/192.168.1.104:8020 successfully registered with NN 
2016-07-12 02:02:12,050 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Successfully sent block report 0x236119eb3082, containing 1 storage report(s), of which we sent 1. The reports had 24 total blocks and used 1 RPC(s). This took 0 msec to generate and 1 msecs for RPC and NN processing. Got back one command: FinalizeCommand/5. 
2016-07-12 02:02:12,050 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Got finalize command for block pool BP-1235752202-192.168.1.104-1468159707934 
2016-07-12 02:02:15,049 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand action : DNA_REGISTER from nn1.cluster.com/192.168.1.104:8020 with active state 
2016-07-12 02:02:15,052 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-1235752202-192.168.1.104-1468159707934 (Datanode Uuid b702b518-5daa-4fa1-8e69-e4d620a72470) service to nn1.cluster.com/192.168.1.104:8020 beginning handshake with NN 
2016-07-12 02:02:15,056 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool Block pool BP-1235752202-192.168.1.104-1468159707934 (Datanode Uuid b702b518-5daa-4fa1-8e69-e4d620a72470) service to nn1.cluster.com/192.168.1.104:8020 successfully registered with NN 
2016-07-12 02:02:15,061 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Successfully sent block report 0x2361cd4be40d, containing 1 storage report(s), of which we sent 1. The reports had 24 total blocks and used 1 RPC(s). This took 0 msec to generate and 2 msecs for RPC and NN processing. Got back one command: FinalizeCommand/5. 
2016-07-12 02:02:15,061 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Got finalize command for block pool BP-1235752202-192.168.1.104-1468159707934 

Here is the sample of 2nd datanode logs(ip :192.168.35.128) 
2016-07-12 11:45:07,346 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand action : DNA_REGISTER from nn1.cluster.com/192.168.1.104:8020 with active state 
2016-07-12 11:45:07,349 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-1235752202-192.168.1.104-1468159707934 (Datanode Uuid 37038a9f-23ac-42e2-abea-bdf356aaefbe) service to nn1.cluster.com/192.168.1.104:8020 beginning handshake with NN 
2016-07-12 11:45:07,355 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool Block pool BP-1235752202-192.168.1.104-1468159707934 (Datanode Uuid 37038a9f-23ac-42e2-abea-bdf356aaefbe) service to nn1.cluster.com/192.168.1.104:8020 successfully registered with NN 
2016-07-12 11:45:07,364 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Successfully sent block report 0xb0de42ec7c, containing 1 storage report(s), of which we sent 1. The reports had 0 total blocks and used 1 RPC(s). This took 0 msec to generate and 4 msecs for RPC and NN processing. Got back one command: FinalizeCommand/5. 
2016-07-12 11:45:07,364 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Got finalize command for block pool BP-1235752202-192.168.1.104-1468159707934 
2016-07-12 11:45:10,360 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand action : DNA_REGISTER from nn1.cluster.com/192.168.1.104:8020 with active state 
2016-07-12 11:45:10,363 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-1235752202-192.168.1.104-1468159707934 (Datanode Uuid 37038a9f-23ac-42e2-abea-bdf356aaefbe) service to nn1.cluster.com/192.168.1.104:8020 beginning handshake with NN 
2016-07-12 11:45:10,370 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool Block pool BP-1235752202-192.168.1.104-1468159707934 (Datanode Uuid 37038a9f-23ac-42e2-abea-bdf356aaefbe) service to nn1.cluster.com/192.168.1.104:8020 successfully registered with NN 
2016-07-12 11:45:10,377 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Successfully sent block report 0xb191ea9cb9, containing 1 storage report(s), of which we sent 1. The reports had 0 total blocks and used 1 RPC(s). This took 0 msec to generate and 3 msecs for RPC and NN processing. Got back one command: FinalizeCommand/5. 
2016-07-12 11:45:10,377 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Got finalize command for block pool BP-1235752202-192.168.1.104-1468159707934 
2016-07-12 11:45:13,376 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand action : DNA_REGISTER from nn1.cluster.com/192.168.1.104:8020 with active state 
2016-07-12 11:45:13,380 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-1235752202-192.168.1.104-1468159707934 (Datanode Uuid 37038a9f-23ac-42e2-abea-bdf356aaefbe) service to nn1.cluster.com/192.168.1.104:8020 beginning handshake with NN 
2016-07-12 11:45:13,385 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool Block pool BP-1235752202-192.168.1.104-1468159707934 (Datanode Uuid 37038a9f-23ac-42e2-abea-bdf356aaefbe) service to nn1.cluster.com/192.168.1.104:8020 successfully registered with NN 
2016-07-12 11:45:13,395 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Successfully sent block report 0xb245b893c4, containing 1 storage report(s), of which we sent 1. The reports had 0 total blocks and used 1 RPC(s). This took 0 msec to generate and 5 msecs for RPC and NN processing. Got back one command: FinalizeCommand/5. 
2016-07-12 11:45:13,396 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Got finalize command for block pool BP-1235752202-192.168.1.104-1468159707934 

どうしてですか?本当に助けてくれてありがとう!

+0

hadoopのnamenodeとdatanodeのログを確認しましたか? – nat

+0

名前のノードとデータノードのログを確認しました。私は両方のログにどんな種類のエラーも見つけませんでした。私は名前ノードを再フォーマットしようとしましたが、手動でデータノードのディレクトリを削除してサービスを再開しましたが、UIライブノードではまだデッドノードが0を示しています。 –

+0

@nath - 質問namenodeとdatanodeのログを質問自体に追加しました。何か不足していますか? –

答えて

0

ガット解決策。問題はnamenode Ipaddressにありました。私はwlan0インターフェースからipaddressを考慮していました.を変更しています。VMwareワークステーションをインストールしたので、静的ipaddressだったvmnetインターフェイスからIpaddress を考えました。 。

0

解決策を得ました。スレーブノード/データノードが個別に、レポートに表示されていない場合は、hadoop dfsadmin-report ...通信に問題があります。データノードからマスタへの通信は利用できません。技術的に言えば、ファイアウォールに問題があります。マスターノードのFire wallが通信をブロックしています。

は「私たちはマスターで、ファイアウォールの接続を停止する必要があります」/「特定のIPは、マスターへのアクセスを許可

CentOSのにファイアウォールを停止するにはCentOSの

にコマンド下記行う
service iptables save 
service iptables stop 
chkconfig iptables off 
関連する問題