I have a RAID5 array configured with mdadm on Ubuntu 24.04 LTS. The array consists of five 8 TB drives, and there have been no changes to it recently.

Today I noticed that the array is not accessible.

  • mdadm --detail /dev/md0 shows that the array is Inactive:

    $ mdadm --detail /dev/md0
    /dev/md0:
               Version : 1.2
            Raid Level : raid5
         Total Devices : 5
           Persistence : Superblock is persistent
                 State : inactive
       Working Devices : 5
    
                  Name : Europa:0  (local to host Europa)
                  UUID : 62595935:e04505fc:3e79426a:40326185
                Events : 76498
    
        Number   Major   Minor   RaidDevice
    
           -       8        1        -        /dev/sda1
           -       8       81        -        /dev/sdf1
           -       8       65        -        /dev/sde1
           -       8       49        -        /dev/sdd1
           -       8       33        -        /dev/sdc1
    
  • Using mdadm --examine on each drive, I find that they all report their state as clean. Four of them show the Array State as AAAAA (all drives active), but one of them (sda1) shows the Array State as ....A.

  • Using cat /proc/mdstat, I find that only device sda1 appears in md0!

    $ sudo cat /proc/mdstat
    Personalities : [raid1] [raid0] [raid6] [raid5] [raid4] [raid10] 
    md1 : active raid1 sdg1[1] sdh1[0]
          2930132992 blocks super 1.2 [2/2] [UU]
          bitmap: 0/22 pages [0KB], 65536KB chunk
    
    md0 : inactive sda1[4]
          7813893632 blocks super 1.2
    
  • By looking at the event count and update time in mdadm --examine, I can see that sda1 has a slightly higher event count and a more recent update time than the other drives (a combined check is sketched after this list):

    $ mdadm --examine /dev/sda1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 | egrep 'Event|/dev/sd'
    /dev/sda1:
             Events : 76498
    /dev/sdc1:
             Events : 76490
    /dev/sdd1:
             Events : 76490
    /dev/sde1:
             Events : 76490
    /dev/sdf1:
             Events : 76490
    

    and

    $ mdadm --examine /dev/sda1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 | egrep 'Update Time|/dev/sd'
    /dev/sda1:
        Update Time : Mon Jan 13 14:51:59 2025
    /dev/sdc1:
        Update Time : Mon Jan 13 05:03:20 2025
    /dev/sdd1:
        Update Time : Mon Jan 13 05:03:20 2025
    /dev/sde1:
        Update Time : Mon Jan 13 05:03:20 2025
    /dev/sdf1:
        Update Time : Mon Jan 13 05:03:20 2025
    

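For reference, the three superblock fields compared above can be pulled in a single pass over the member partitions. This is only a sketch using the device names from the outputs above; verify them before relying on it:

    # One-pass view of the per-member superblock fields compared above.
    for dev in /dev/sda1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1; do
        echo "== $dev"
        sudo mdadm --examine "$dev" | grep -E 'Array State|Events|Update Time'
    done

Seen side by side, the four drives at event count 76490 agree with each other, while sda1 alone sits at 76498.
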
So my question is: how do I interpret this?

When searching online, I found that most people report the inverse: a single drive with fewer events and an earlier Update Time than the rest. I can't imagine that four drives went bad at exactly the same time, especially since there was no power failure or anything else that might explain a widespread hardware issue.

So what does this mean, and how do I recover the array?

  • I haven't changed anything since I posted this, but I ran --detail again to see if the status has changed, and now it shows the state as active, FAILED, not started. It shows that all 5 drives have been removed, but somehow device 4 is still there:

        Number   Major   Minor   RaidDevice   State
           -       0       0        0         removed
           -       0       0        1         removed
           -       0       0        2         removed
           -       0       0        3         removed
           -       0       0        4         removed
           -       8       1        4         sync   /dev/sda1

    – T Shoaf, Jan 17, 2025 at 20:01
  • I realized that all 4 failed drives are connected to one SATA controller and all the working drives in this machine are on another, so I suspect a failed SATA controller. The replacement is on its way, but what should I expect to happen after it's installed? I think there is metadata stored on each disk, so once all the drives come up on the new controller, will the array automatically come online like normal? Or do I have to do something to cause that to happen?
    – T Shoaf, Jan 17, 2025 at 21:20
  • Assemble manually using only the drives that agree with each other, or alternatively try your luck with mdadm --assemble --force; depending on how this happened, data corruption is possible in either case. (A sketch of both options follows below.)
    – frostschutz, Jan 17, 2025 at 21:39
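
For completeness, here is a minimal sketch of what those two options could look like, using /dev/md0 and the partition names from the question. It is not a verified recovery procedure: double-check the device names against mdadm --examine on the actual system, and ideally image the drives before writing anything.

    # Sketch only -- /dev/md0 and the member names are taken from the question.
    # Stop the half-assembled, inactive array before trying to reassemble it.
    sudo mdadm --stop /dev/md0

    # Option A: force-assemble all five members and let mdadm reconcile the
    # small event-count difference on sda1.
    sudo mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1

    # Option B (run instead of A): assemble only the four members that agree
    # with each other; --run starts the array degraded, which RAID5 tolerates.
    # sudo mdadm --assemble --run /dev/md0 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1

    # Verify the result before mounting any filesystem on it.
    cat /proc/mdstat
    sudo mdadm --detail /dev/md0

As for the replacement controller: the RAID metadata lives in each member's superblock, so nothing controller-specific needs to be reconfigured. Once the kernel sees all the partitions again, mdadm --assemble --scan (or simply a reboot) would normally bring the array up, but given the stale event counts here a forced or degraded assemble as above may still be needed.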
