1. DataNode won't start

2016-11-25 09:46:43,685 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid dfs.datanode.data.dir /home/hadoop3/hadoop_data/data :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not writable: /home/hadoop3/hadoop_data/data
    at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:193)
    at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
    at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:157)
    at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2272)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2314)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2296)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2188)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2235)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2411)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2435)
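"Directory is not writable" is also what a plain ownership or permission mistake looks like, so it is worth ruling that out before blaming hardware. A minimal check, assuming the DataNode runs as user "hadoop" (substitute your cluster's service user); in this incident the permissions were fine and the write test failed because the disk itself had died:

    ls -ld /home/hadoop3/hadoop_data/data                        # owner and mode should match the DataNode user
    sudo -u hadoop touch /home/hadoop3/hadoop_data/data/.probe   # write test; fails on a read-only or failed disk
    chown -R hadoop:hadoop /home/hadoop3/hadoop_data/data        # only if ownership is actually wrong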
java.io.IOException: BlockPoolSliceStorage.recoverTransitionRead: attempt to load an used block storage: /home/hdmaster/hadoop_data/data/current/BP-994368505-192.168.30.223-1441944900262
    at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:210)
    at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:242)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:391)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:472)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1322)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1292)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:321)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:225)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:862)
    at java.lang.Thread.run(Thread.java:745)
2016-11-25 09:27:18,795 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to add storage for block pool: BP-994368505-192.168.30.223-1441944900262 : BlockPoolSliceStorage.recoverTransitionRead: attempt to load an used block storage: /home/hdmaster/hadoop_data/data/current/BP-994368505-192.168.30.223-1441944900262
2016-11-25 09:27:18,795 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory [DISK]file:/home/hadoop1/hadoop_data/data/ has already been used.
2016-11-25 09:27:18,818 INFO org.apache.hadoop.hdfs.server.common.Storage: Analyzing storage directories for bpid BP-994368505-192.168.30.223-1441944900262
2016-11-25 09:27:18,818 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to analyze storage directories for block pool BP-994368505-192.168.30.223-1441944900262
java.io.IOException: BlockPoolSliceStorage.recoverTransitionRead: attempt to load an used block storage: /home/hadoop1/hadoop_data/data/current/BP-994368505-192.168.30.223-1441944900262
    at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:210)
    at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:242)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:391)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:472)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1322)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1292)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:321)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:225)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:862)
    at java.lang.Thread.run(Thread.java:745)
2016-11-25 09:27:18,818 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to add storage for block pool: BP-994368505-192.168.30.223-1441944900262 : BlockPoolSliceStorage.recoverTransitionRead: attempt to load an used block storage: /home/hadoop1/hadoop_data/data/current/BP-994368505-192.168.30.223-1441944900262
2016-11-25 09:27:18,818 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory [DISK]file:/home/hadoop2/hadoop_data/data/ has already been used.
2016-11-25 09:27:18,839 INFO org.apache.hadoop.hdfs.server.common.Storage: Analyzing storage directories for bpid BP-994368505-192.168.30.223-1441944900262
2016-11-25 09:27:18,839 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to analyze storage directories for block pool BP-994368505-192.168.30.223-1441944900262
java.io.IOException: BlockPoolSliceStorage.recoverTransitionRead: attempt to load an used block storage: /home/hadoop2/hadoop_data/data/current/BP-994368505-192.168.30.223-1441944900262
    at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:210)
    at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:242)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:391)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:472)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1322)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1292)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:321)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:225)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:862)
    at java.lang.Thread.run(Thread.java:745)
2016-11-25 09:27:18,840 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to add storage for block pool: BP-994368505-192.168.30.223-1441944900262 : BlockPoolSliceStorage.recoverTransitionRead: attempt to load an used block storage: /home/hadoop2/hadoop_data/data/current/BP-994368505-192.168.30.223-1441944900262
2016-11-25 09:27:18,840 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to namenode01/192.168.30.223:9000. Exiting.
java.io.IOException: All specified directories are failed to load.
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:473)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1322)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1292)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:321)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:225)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:862)
    at java.lang.Thread.run(Thread.java:745)
2016-11-25 09:27:18,840 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to namenode02/192.168.32.124:9000. Exiting.
org.apache.hadoop.util.DiskChecker$DiskErrorException: Too many failed volumes - current valid volumes: 3, volumes configured: 4, volumes failed: 1, volume failures tolerated: 0
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.<init>(FsDatasetImpl.java:247)
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:34)
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:30)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1335)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1292)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:321)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:225)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:862)
    at java.lang.Thread.run(Thread.java:745)
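The message itself names the relevant knob: dfs.datanode.failed.volumes.tolerated defaults to 0, so a single bad volume is enough to stop the DataNode. As a stop-gap (a sketch only; whether running with a failed volume is acceptable depends on your replication and capacity), the threshold can be raised in hdfs-site.xml and the DataNode restarted:

    <!-- hdfs-site.xml: let the DataNode start even if one data volume has failed -->
    <property>
      <name>dfs.datanode.failed.volumes.tolerated</name>
      <value>1</value>
    </property>

In this case the permanent fix below (replacing the disk) was chosen instead.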
---Cause: the third data disk (/dev/sdd1, mounted at /home/hadoop3) has failed.

Fix, step 1: try to repair the disk.

[root@hdslave04 ~]# df -h
Filesystem                       Size  Used Avail Use% Mounted on
/dev/mapper/vg_server03-lv_root   50G   27G   20G  58% /
tmpfs                             16G   68K   16G   1% /dev/shm
/dev/sda1                        477M   59M  393M  13% /boot
/dev/mapper/vg_server03-lv_home  1.8T  1.5T  207G  88% /home
/dev/sdb1                        1.8T  1.5T  303G  83% /home/hadoop1
/dev/sdc1                        1.8T  1.5T  286G  84% /home/hadoop2
/dev/sdd1                        1.8T  1.5T  250G  86% /home/hadoop3
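Before touching the filesystem, the kernel log and the SMART status (if smartmontools is installed) can confirm whether the drive itself is failing; a quick sketch, assuming the suspect device is /dev/sdd:

    dmesg | grep -i -e sdd -e 'I/O error'    # look for I/O errors against the suspect device
    smartctl -H /dev/sdd                     # overall SMART health verdict (requires smartmontools)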
umount /dev/sdd1
If this fails with "umount: /dev/sdd1: device is busy", list the processes still using the mount point with fuser -m /home/hadoop3, kill those PIDs, then run umount /dev/sdd1 again.
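For reference, a concrete sequence for freeing a busy mount point (the -k variant kills every holder at once, so only use it when nothing important runs from that directory):

    fuser -mv /home/hadoop3     # list PIDs (with user and command) holding the mount point
    kill <pid>                  # stop them individually, or:
    fuser -km /home/hadoop3     # kill all holders in one go
    umount /dev/sdd1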
[root@hdslave04 ~]# fsck -y /dev/sdd1
fsck from util-linux-ng 2.17.2
e2fsck 1.41.12 (17-May-2010)
fsck.ext4: No such device or address while trying to open /dev/sdd1
Possibly non-existent or swap device?
--The above shows the disk is already damaged (the kernel cannot even open the device anymore).
Fix, step 2: replace the disk, then partition the new GPT disk with parted (reference: http://soft.chinabyte.com/os/447/12439447.shtml). A scripted, non-interactive equivalent is sketched right after this session.

parted /dev/sdd
(parted) mklabel                ---- create a disk label
New disk label type? gpt
(parted) p                      ---- print the partition table
Model: VMware, VMware Virtual S (scsi)
Disk /dev/sde: 2000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Number Start End Size File system Name Flags
(parted) mkpart
Partition name?  []? sdd1       --- set the partition name
File system type?  [ext2]? ext4 ---- set the filesystem type
Start? 1                        --- set the start position
End? 2000GB                     --- set the end position
(parted) P                      ---- print the partition table
Model: VMware, VMware Virtual S (scsi)
Disk /dev/sde: 2000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Number Start End Size File system Name Flags
1 17.4kB 2000GB 2000GB sdd1
(parted) Q                      --- quit
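For scripting, roughly the same result can be obtained non-interactively (a sketch; mkpart argument order can vary slightly between parted versions):

    parted -s /dev/sdd mklabel gpt
    parted -s /dev/sdd mkpart sdd1 ext4 0% 100%
    parted -s /dev/sdd print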
Fix, step 3: format the new partition and remount it.
mkfs.ext4 /dev/sdd1
Edit /etc/fstab so the new /dev/sdd1 is mounted at /home/hadoop3 again (the mount that previously showed as "/dev/sdd1  1.8T  1.5T  250G  86%  /home/hadoop3" in df -h), then reboot:
shutdown -r now
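The fstab entry itself is not shown in the original notes; a hedged example of what it might look like, preferably keyed by the UUID from blkid rather than the device name, which can change when disks are swapped:

    blkid /dev/sdd1                                   # prints the new filesystem's UUID
    # example /etc/fstab line (<uuid-from-blkid> is a placeholder):
    UUID=<uuid-from-blkid>  /home/hadoop3  ext4  defaults  0  2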
Fix, step 4: restart the DataNode and NodeManager on this host.
sh hadoop-daemon.sh start datanode
sh yarn-daemon.sh start nodemanager
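A quick way to confirm the node has rejoined the cluster (assuming the HDFS/YARN client environment is configured on this host):

    jps                      # DataNode and NodeManager should both be listed
    hdfs dfsadmin -report    # the recovered host should appear among the live datanodes
    yarn node -list          # the NodeManager should show up as RUNNING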
--Addendum to step 1: as a last-ditch attempt, simply reboot the server and see whether the disk comes back:
shutdown -r now
The machine then hung during boot (it could not mount the failed disk).
1. Boot into Linux single-user mode and run:
   mount -o remount,rw /
   vim /etc/fstab   and comment out the line that mounts /dev/sdd1 at /home/hadoop3 (prefix it with #)
2. Reboot; the server then comes up normally:
   shutdown -r now
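The same single-user-mode fix as a command sketch (assuming the fstab entry begins with /dev/sdd1):

    mount -o remount,rw /                            # root is usually mounted read-only in single-user mode
    sed -i 's|^/dev/sdd1|#/dev/sdd1|' /etc/fstab     # comment out the failed disk's mount
    shutdown -r now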