When you receive notification that a disk has died, first run metastat to determine which disk has been swapped out:
# metastat
d0: Mirror
Submirror 0: d10
State: Okay
Submirror 1: d20
State: Okay
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 18876726 blocks
d10: Submirror of d0
State: Okay
Size: 18876726 blocks
Stripe 0:
Device Start Block Dbase State Hot Spare
c1t0d0s0 0 No Okay
d20: Submirror of d0
State: Okay
Size: 18876726 blocks
Stripe 0:
Device Start Block Dbase State Hot Spare
c1t1d0s0 0 No Okay
d1: Mirror
Submirror 0: d11
State: Okay
Submirror 1: d21
State: Okay
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 33555735 blocks
d11: Submirror of d1
State: Okay
Size: 33555735 blocks
Stripe 0:
Device Start Block Dbase State Hot Spare
c1t0d0s1 0 No Okay
d21: Submirror of d1
State: Okay
Size: 33555735 blocks
Stripe 0:
Device Start Block Dbase State Hot Spare
c1t1d0s1 0 No Okay
d3: Mirror
Submirror 0: d30
State: Okay
Submirror 1: d31
State: Okay
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 355604121 blocks
d30: Submirror of d3
State: Okay
Hot spare pool: hsp000
Size: 355604121 blocks
Stripe 0: (interlace: 64 blocks)
Device Start Block Dbase State Hot Spare
c2t0d0s0 0 No Okay
c2t1d0s0 2889 No Okay
c2t2d0s0 2889 No Okay
c2t3d0s0 2889 No Okay
c2t4d0s0 2889 No Okay
d31: Submirror of d3
State: Okay
Hot spare pool: hsp000
Size: 355604121 blocks
Stripe 0: (interlace: 64 blocks)
Device Start Block Dbase State Hot Spare
c2t5d0s0 0 No Okay c2t12d0s0
c2t8d0s0 2889 No Okay
c2t9d0s0 2889 No Okay
c2t10d0s0 2889 No Okay
c2t11d0s0 2889 No Okay
hsp000: 2 hot spares
c2t12d0s0 In use 71124291 blocks
c2t13d0s0 Available 71124291 blocks
#
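OK, the Hot Spare column for d31 shows that c2t12d0s0 has taken over from c2t5d0s0, so c2t5d0 is the disk to replace. It is the sixth disk from the left in the array, or to put that another way, it is the left one of the two middle disks!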
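With a long metastat listing it is easy to miss the one row that has a hot spare filled in. The sketch below picks those rows out automatically. The function name hot_spares_in_use is made up for illustration, and it assumes the column layout shown in the output above (device, start block, dbase, state, hot spare); other releases may print additional columns, in which case the field numbers would need adjusting.

```shell
# Print "failed-component hot-spare" for each stripe row that has a hot
# spare standing in. Reads metastat output on stdin.
# Assumes the five-column row layout shown in the listing above.
hot_spares_in_use() {
    awk '
        # Normal rows have four fields; a fifth field naming a slice
        # is the hot spare that has replaced the component.
        NF == 5 && $1 ~ /^c[0-9]+t[0-9]+d[0-9]+s[0-9]+$/ && $5 ~ /^c[0-9]+t[0-9]+d[0-9]+s[0-9]+$/ {
            print $1, $5
        }
    '
}
```

It would be used as: metastat | hot_spares_in_use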
You can run format -> analyze -> read to determine whether the disk is really dead.
# format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
0. c1t0d0
/pci@9,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cffde452,0
1. c1t1d0
/pci@9,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cffde367,0
2. c2t0d0
/pci@8,600000/pci@1/scsi@4/sd@0,0
3. c2t1d0
/pci@8,600000/pci@1/scsi@4/sd@1,0
4. c2t2d0
/pci@8,600000/pci@1/scsi@4/sd@2,0
5. c2t3d0
/pci@8,600000/pci@1/scsi@4/sd@3,0
6. c2t4d0
/pci@8,600000/pci@1/scsi@4/sd@4,0
7. c2t5d0
/pci@8,600000/pci@1/scsi@4/sd@5,0
8. c2t8d0
/pci@8,600000/pci@1/scsi@4/sd@8,0
9. c2t9d0
/pci@8,600000/pci@1/scsi@4/sd@9,0
10. c2t10d0
/pci@8,600000/pci@1/scsi@4/sd@a,0
11. c2t11d0
/pci@8,600000/pci@1/scsi@4/sd@b,0
12. c2t12d0
/pci@8,600000/pci@1/scsi@4/sd@c,0
13. c2t13d0
/pci@8,600000/pci@1/scsi@4/sd@d,0
Specify disk (enter its number): 7
selecting c2t5d0
[disk formatted]
FORMAT MENU:
disk - select a disk
type - select (define) a disk type
partition - select (define) a partition table
current - describe the current disk
format - format and analyze the disk
repair - repair a defective sector
label - write label to the disk
analyze - surface analysis
defect - defect list management
backup - search for backup labels
verify - read and display labels
save - save new disk/partition definitions
inquiry - show vendor, product and revision
volname - set 8-character volume name
!
quit
format> anal
ANALYZE MENU:
read - read only test (doesn't harm SunOS)
refresh - read then write (doesn't harm data)
test - pattern testing (doesn't harm data)
write - write then read (corrupts data)
compare - write, read, compare (corrupts data)
purge - write, read, write (corrupts data)
verify - write entire disk, then verify (corrupts data)
print - display data buffer
setup - set analysis parameters
config - show analysis parameters
!
quit
analyze> read
Ready to analyze (won't harm SunOS). This takes a long time,
but is interruptable with CTRL-C. Continue? y
pass 0
24619/26/53
pass 1
24619/26/53
Total of 0 defective blocks repaired.
analyze> q
FORMAT MENU:
disk - select a disk
type - select (define) a disk type
partition - select (define) a partition table
current - describe the current disk
format - format and analyze the disk
repair - repair a defective sector
label - write label to the disk
analyze - surface analysis
defect - defect list management
backup - search for backup labels
verify - read and display labels
save - save new disk/partition definitions
inquiry - show vendor, product and revision
volname - set 8-character volume name
!
quit
format> q
#
The excerpt above didn't show any repairs, but there may be some when you run the command.
As long as a hot spare has jumped in, the dead disk can just be removed from the array and the new one inserted. The disk will spin up immediately.
Wait for the green light to come on. And Bob's your uncle!
Run format to apply a disk label; it offers to label the disk when you select it. You won't be able to partition the disk until you do.
# format
Searching for disks...done
c2t5d0: configured with capacity of 33.92GB
AVAILABLE DISK SELECTIONS:
0. c1t0d0
/pci@9,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cffde452,0
1. c1t1d0
/pci@9,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cffde367,0
2. c2t0d0
/pci@8,600000/pci@1/scsi@4/sd@0,0
3. c2t1d0
/pci@8,600000/pci@1/scsi@4/sd@1,0
4. c2t2d0
/pci@8,600000/pci@1/scsi@4/sd@2,0
5. c2t3d0
/pci@8,600000/pci@1/scsi@4/sd@3,0
6. c2t4d0
/pci@8,600000/pci@1/scsi@4/sd@4,0
7. c2t5d0
/pci@8,600000/pci@1/scsi@4/sd@5,0
8. c2t8d0
/pci@8,600000/pci@1/scsi@4/sd@8,0
9. c2t9d0
/pci@8,600000/pci@1/scsi@4/sd@9,0
10. c2t10d0
/pci@8,600000/pci@1/scsi@4/sd@a,0
11. c2t11d0
/pci@8,600000/pci@1/scsi@4/sd@b,0
12. c2t12d0
/pci@8,600000/pci@1/scsi@4/sd@c,0
13. c2t13d0
/pci@8,600000/pci@1/scsi@4/sd@d,0
Specify disk (enter its number): 7
selecting c2t5d0
[disk formatted]
Disk not labeled. Label it now? y
FORMAT MENU:
disk - select a disk
type - select (define) a disk type
partition - select (define) a partition table
current - describe the current disk
format - format and analyze the disk
repair - repair a defective sector
label - write label to the disk
analyze - surface analysis
defect - defect list management
backup - search for backup labels
verify - read and display labels
save - save new disk/partition definitions
inquiry - show vendor, product and revision
volname - set 8-character volume name
!
quit
format> q
#
Enter a prtvtoc/fmthard command combination to ensure the disk has the same slices as the replaced disk. In the command below I use the disk that is in the equivalent position on the other side of the mirror as the source of the configuration.
# prtvtoc /dev/rdsk/c2t0d0s0 | fmthard -s - /dev/rdsk/c2t5d0s0
fmthard: New volume table of contents now in place.
#
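Before going further you may want to confirm that the two disks really do carry the same slice table. The following is a minimal sketch, not part of the original procedure: the helper names vtoc_slices and same_vtoc are hypothetical, and it assumes a shell that supports $(...) such as ksh or bash rather than the legacy /bin/sh. It relies only on the fact that prtvtoc prefixes its header lines with "*".

```shell
# Reduce prtvtoc output (on stdin) to just the slice rows, with runs of
# whitespace collapsed so that two tables can be compared as strings.
vtoc_slices() {
    grep -v '^\*' | awk '{ $1 = $1; print }'
}

# Succeed if both raw devices report identical slice rows.
# Usage: same_vtoc /dev/rdsk/c2t0d0s0 /dev/rdsk/c2t5d0s0
same_vtoc() {
    a=$(prtvtoc "$1" | vtoc_slices)
    b=$(prtvtoc "$2" | vtoc_slices)
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

For example: same_vtoc /dev/rdsk/c2t0d0s0 /dev/rdsk/c2t5d0s0 && echo "slice tables match"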
Check the disk format:
# format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
0. c1t0d0
/pci@9,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cffde452,0
1. c1t1d0
/pci@9,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cffde367,0
2. c2t0d0
/pci@8,600000/pci@1/scsi@4/sd@0,0
3. c2t1d0
/pci@8,600000/pci@1/scsi@4/sd@1,0
4. c2t2d0
/pci@8,600000/pci@1/scsi@4/sd@2,0
5. c2t3d0
/pci@8,600000/pci@1/scsi@4/sd@3,0
6. c2t4d0
/pci@8,600000/pci@1/scsi@4/sd@4,0
7. c2t5d0
/pci@8,600000/pci@1/scsi@4/sd@5,0
8. c2t8d0
/pci@8,600000/pci@1/scsi@4/sd@8,0
9. c2t9d0
/pci@8,600000/pci@1/scsi@4/sd@9,0
10. c2t10d0
/pci@8,600000/pci@1/scsi@4/sd@a,0
11. c2t11d0
/pci@8,600000/pci@1/scsi@4/sd@b,0
12. c2t12d0
/pci@8,600000/pci@1/scsi@4/sd@c,0
13. c2t13d0
/pci@8,600000/pci@1/scsi@4/sd@d,0
Specify disk (enter its number): 7
selecting c2t5d0
[disk formatted]
FORMAT MENU:
disk - select a disk
type - select (define) a disk type
partition - select (define) a partition table
current - describe the current disk
format - format and analyze the disk
repair - repair a defective sector
label - write label to the disk
analyze - surface analysis
defect - defect list management
backup - search for backup labels
verify - read and display labels
save - save new disk/partition definitions
inquiry - show vendor, product and revision
volname - set 8-character volume name
!
quit
format> p
PARTITION MENU:
0 - change `0' partition
1 - change `1' partition
2 - change `2' partition
3 - change `3' partition
4 - change `4' partition
5 - change `5' partition
6 - change `6' partition
7 - change `7' partition
select - select a predefined table
modify - modify a predefined partition table
name - name the current table
print - display the current table
label - write partition map and label to the disk
!
quit
partition> p
Current partition table (original):
Total disk cylinders available: 24620 + 2 (reserved cylinders)
Part Tag Flag Cylinders Size Blocks
0 root wm 0 - 24618 33.91GB (24619/0/0) 71124291
1 unassigned wu 0 0 (0/0/0) 0
2 backup wu 0 - 24619 33.92GB (24620/0/0) 71127180
3 unassigned wu 0 0 (0/0/0) 0
4 unassigned wu 0 0 (0/0/0) 0
5 unassigned wu 0 0 (0/0/0) 0
6 unassigned wu 0 0 (0/0/0) 0
7 unassigned wm 24619 - 24619 1.41MB (1/0/0) 2889
partition> q
FORMAT MENU:
disk - select a disk
type - select (define) a disk type
partition - select (define) a partition table
current - describe the current disk
format - format and analyze the disk
repair - repair a defective sector
label - write label to the disk
analyze - surface analysis
defect - defect list management
backup - search for backup labels
verify - read and display labels
save - save new disk/partition definitions
inquiry - show vendor, product and revision
volname - set 8-character volume name
!
quit
format> q
#
Swap the hot spare back out for the replaced disk. Note that the metareplace command works on the mirror (d3), not on the submirror (d31) that actually has the failed disk:
# metareplace -e d3 c2t5d0s0
d3: device c2t5d0s0 is enabled
#
Run metastat to confirm that the resync has started, then use metastat | grep to check when the mirror has finished resyncing:
# metastat
d0: Mirror
Submirror 0: d10
State: Okay
Submirror 1: d20
State: Okay
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 18876726 blocks
d10: Submirror of d0
State: Okay
Size: 18876726 blocks
Stripe 0:
Device Start Block Dbase State Hot Spare
c1t0d0s0 0 No Okay
d20: Submirror of d0
State: Okay
Size: 18876726 blocks
Stripe 0:
Device Start Block Dbase State Hot Spare
c1t1d0s0 0 No Okay
d1: Mirror
Submirror 0: d11
State: Okay
Submirror 1: d21
State: Okay
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 33555735 blocks
d11: Submirror of d1
State: Okay
Size: 33555735 blocks
Stripe 0:
Device Start Block Dbase State Hot Spare
c1t0d0s1 0 No Okay
d21: Submirror of d1
State: Okay
Size: 33555735 blocks
Stripe 0:
Device Start Block Dbase State Hot Spare
c1t1d0s1 0 No Okay
d3: Mirror
Submirror 0: d30
State: Okay
Submirror 1: d31
State: Resyncing
Resync in progress: 0 % done
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 355604121 blocks
d30: Submirror of d3
State: Okay
Hot spare pool: hsp000
Size: 355604121 blocks
Stripe 0: (interlace: 64 blocks)
Device Start Block Dbase State Hot Spare
c2t0d0s0 0 No Okay
c2t1d0s0 2889 No Okay
c2t2d0s0 2889 No Okay
c2t3d0s0 2889 No Okay
c2t4d0s0 2889 No Okay
d31: Submirror of d3
State: Resyncing
Hot spare pool: hsp000
Size: 355604121 blocks
Stripe 0: (interlace: 64 blocks)
Device Start Block Dbase State Hot Spare
c2t5d0s0 0 No Resyncing
c2t8d0s0 2889 No Okay
c2t9d0s0 2889 No Okay
c2t10d0s0 2889 No Okay
c2t11d0s0 2889 No Okay
hsp000: 2 hot spares
c2t12d0s0 Available 71124291 blocks
c2t13d0s0 Available 71124291 blocks
# metastat d3 | grep "Resync in progress"
Resync in progress: 5 % done
# metastat d3 | grep "Resync in progress"
Resync in progress: 5 % done
# metastat d3 | grep "Resync in progress"
Resync in progress: 6 % done
# metastat d3 | grep "Resync in progress"
Resync in progress: 6 % done
# metastat d3 | grep "Resync in progress"
Resync in progress: 8 % done
# metastat d3 | grep "Resync in progress"
Resync in progress: 51 % done
# metastat d3 | grep "Resync in progress"
Resync in progress: 60 % done
#
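Rather than re-running the grep by hand, the check can be wrapped in a small polling loop. This is only a sketch: resync_percent and wait_for_resync are invented names, the 60 second default interval is arbitrary, and it assumes a shell with $(...) support such as ksh or bash. It treats the absence of a "Resync in progress" line as meaning the resync is over, so it is worth running a plain metastat d3 afterwards to confirm that the state is Okay.

```shell
# Extract N from a "Resync in progress: N % done" line on stdin.
# Prints nothing if no such line is present.
resync_percent() {
    sed -n 's/.*Resync in progress: *\([0-9][0-9]*\) *% done.*/\1/p' | head -1
}

# Poll a mirror until metastat stops reporting a resync.
# Usage: wait_for_resync d3 [interval-in-seconds]
wait_for_resync() {
    mirror=$1
    interval=${2:-60}
    while pct=$(metastat "$mirror" | resync_percent) && [ -n "$pct" ]; do
        echo "$mirror: resync $pct % done"
        sleep "$interval"
    done
    echo "$mirror: no resync reported"
}
```

For example: wait_for_resync d3 300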