{"id":12073,"date":"2022-11-03T22:18:04","date_gmt":"2022-11-04T05:18:04","guid":{"rendered":"https:\/\/www.apolonio.com\/blog\/?p=12073"},"modified":"2022-11-03T22:18:04","modified_gmt":"2022-11-04T05:18:04","slug":"hard-drive-failure","status":"publish","type":"post","link":"https:\/\/www.apolonio.com\/blog\/?p=12073","title":{"rendered":"Hard Drive Failure"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">Lost a drive holding data that I did not have a full backup of. It sucks, but the data was not critical.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">There were a couple of lessons learned. First off, the root cause: I think a fan died. I have not opened it up yet, but I suspect the dead fan let the drives heat up, causing them to fail.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">So I have figured out how to monitor the drive temps. I am using a utility called hddtemp. It produces nice, simple info on drive temps, and in addition it has a daemon that a Nagios plugin can communicate with to produce an alert if temps go too high.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Furthermore, I set up smartd to let me know if a drive is failing as well.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Another thing I wanted to know was exactly what was on those drives, so I used find to walk through them and produce info using file and stat.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">I did replace the drive and I am watching temps. I have a large fan cooling the system down for now.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The temp at 9:57 PM was 59C for one drive; after the fan had been on it for about 15 minutes, the temp dropped to 42C.  
I have the check set to warn if temps go above 35C and to go critical above 48C.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Let's see if temps stay down.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Weight: 314,4<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Lost a drive holding data that I did not have a full backup of. It sucks, but the data was not critical. There were a couple of lessons learned. First off, the root cause: I think a fan died. I have not &hellip; <a href=\"https:\/\/www.apolonio.com\/blog\/?p=12073\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[11,3,9],"tags":[],"class_list":["post-12073","post","type-post","status-publish","format-standard","hentry","category-technical","category-training","category-weighin"],"_links":{"self":[{"href":"https:\/\/www.apolonio.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/12073","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.apolonio.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.apolonio.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.apolonio.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.apolonio.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=12073"}],"version-history":[{"count":1,"href":"https:\/\/www.apolonio.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/12073\/revisions"}],"predecessor-version":[{"id":12074,"href":"https:\/\/www.apolonio.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/12073\/revisions\/12074"}],"wp:attachment":[{"href":"https:\/\/www.apolonio.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=12073"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href"
:"https:\/\/www.apolonio.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=12073"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.apolonio.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=12073"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}