ZFS SATA Disk Replacement Example - atacontrol Edition

Posted on 2012/04/26(Thu) 20:51 in technical

A RAIDZ pool built from five 1TB disks is running out of free space, so I am swapping every disk for a 2TB one and migrating online.
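
As a rough sanity check on what the swap buys, here is a minimal sketch of the capacity arithmetic. It assumes the usual rule of thumb that a raidz1 vdev yields about (number of disks - 1) times the size of its smallest disk, and it ignores metadata overhead. The function name `raidz1_usable_tb` is made up for illustration.

```shell
# Rough usable capacity of a single raidz1 vdev, in TB.
# Assumption: one disk's worth of space goes to parity and the vdev is
# sized by its smallest member; metadata overhead is ignored.
raidz1_usable_tb() {
    disks=$1         # number of disks in the vdev
    smallest_tb=$2   # size of the smallest disk, in TB
    echo $(( (disks - 1) * smallest_tb ))
}

raidz1_usable_tb 5 1   # before: five 1TB disks
raidz1_usable_tb 5 1   # midway: at least one 1TB disk still in the vdev
raidz1_usable_tb 5 2   # after: five 2TB disks
```

Because the vdev is sized by its smallest member, the extra space only shows up once the last 1TB disk is gone.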

This post is nothing more than a paste of part of the log from that work.

Current connection state:

# atacontrol list
ATA channel 0:
    Master:      no device present
    Slave:       no device present
ATA channel 2:
    Master:  ad4  SATA revision 2.x
    Slave:       no device present
ATA channel 3:
    Master:  ad6  SATA revision 2.x
    Slave:       no device present
ATA channel 4:
    Master:  ad8  SATA revision 2.x
    Slave:       no device present
ATA channel 5:
    Master: ad10  SATA revision 2.x
    Slave:       no device present
ATA channel 6:
    Master: ad12  SATA revision 2.x
    Slave:       no device present
ATA channel 7:
    Master: ad14  SATA revision 2.x
    Slave:       no device present
# zpool status lib_01
  pool: lib_01
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        lib_01      ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad10    ONLINE       0     0     0
            ad8     ONLINE       0     0     0
            ad6     ONLINE       0     0     0
            da0     ONLINE       0     0     0
            ad12    ONLINE       0     0     0

errors: No known data errors

Picked ad10 (ATA channel 5) as the target.
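
The channel number was read off the listing by eye. A minimal sketch of doing the same lookup with awk follows, assuming the `atacontrol list` output has exactly the layout shown in this log. The helper name `ata_channel_of` is hypothetical.

```shell
# Print the ATA channel number that holds the given device (e.g. ad10),
# reading `atacontrol list` output on stdin.
# Assumption: the output looks like the listing in this post.
ata_channel_of() {
    awk -v dev="$1" '
        # Remember the channel number of the section we are in.
        /^ATA channel/ { ch = $3; sub(/:$/, "", ch) }
        # Report the channel when the Master or Slave entry is our device.
        ($1 == "Master:" || $1 == "Slave:") && $2 == dev { print ch; exit }
    '
}

# Usage on the live system (not run here):
#   atacontrol list | ata_channel_of ad10
```

The printed number is what goes after `ata` in the detach and attach commands.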

First, detach it while the system is online.

# atacontrol detach ata5
# atacontrol list
ATA channel 0:
    Master:      no device present
    Slave:       no device present
ATA channel 2:
    Master:  ad4  SATA revision 2.x
    Slave:       no device present
ATA channel 3:
    Master:  ad6  SATA revision 2.x
    Slave:       no device present
ATA channel 4:
    Master:  ad8  SATA revision 2.x
    Slave:       no device present
ATA channel 5:
    Master:      no device present
    Slave:       no device present
ATA channel 6:
    Master: ad12  SATA revision 2.x
    Slave:       no device present
ATA channel 7:
    Master: ad14  SATA revision 2.x
    Slave:       no device present
# zpool status lib_01
  pool: lib_01
 state: DEGRADED
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        lib_01      DEGRADED     0     0     0
          raidz1    DEGRADED     0     0     0
            ad10    REMOVED      0     0     0
            ad8     ONLINE       0     0     0
            ad6     ONLINE       0     0     0
            da0     ONLINE       0     0     0
            ad12    ONLINE       0     0     0

errors: No known data errors
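
The state of a single device can also be pulled out of the status text rather than read by eye. This is a minimal sketch that assumes the column layout of the `zpool status` output shown above. The helper name `vdev_state` is hypothetical.

```shell
# Print the STATE column for one device, reading `zpool status` output on stdin.
# Assumption: config rows look like "NAME STATE READ WRITE CKSUM" as in this log.
vdev_state() {
    awk -v dev="$1" '$1 == dev && NF >= 5 { print $2; exit }'
}

# Usage on the live system (not run here):
#   zpool status lib_01 | vdev_state ad10
```

After the detach, this should report REMOVED for ad10 while the other members stay ONLINE.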

It detached cleanly, so on to the physical work.

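There are no photos or anything for this part, but I waited about a minute for the platters to spin down and then pulled the drive out of the enclosure.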

Mount the replacement HDD in the cage.

Insert the HDD into the same slot and go back to the console.

# atacontrol list
ATA channel 0:
    Master:      no device present
    Slave:       no device present
ATA channel 2:
    Master:  ad4  SATA revision 2.x
    Slave:       no device present
ATA channel 3:
    Master:  ad6  SATA revision 2.x
    Slave:       no device present
ATA channel 4:
    Master:  ad8  SATA revision 2.x
    Slave:       no device present
ATA channel 5:
    Master:      no device present
    Slave:       no device present
ATA channel 6:
    Master: ad12  SATA revision 2.x
    Slave:       no device present
ATA channel 7:
    Master: ad14  SATA revision 2.x
    Slave:       no device present
# atacontrol attach ata5
Master: ad10  SATA revision 2.x
Slave:       no device present
# atacontrol list
ATA channel 0:
    Master:      no device present
    Slave:       no device present
ATA channel 2:
    Master:  ad4  SATA revision 2.x
    Slave:       no device present
ATA channel 3:
    Master:  ad6  SATA revision 2.x
    Slave:       no device present
ATA channel 4:
    Master:  ad8  SATA revision 2.x
    Slave:       no device present
ATA channel 5:
    Master: ad10  SATA revision 2.x
    Slave:       no device present
ATA channel 6:
    Master: ad12  SATA revision 2.x
    Slave:       no device present
ATA channel 7:
    Master: ad14  SATA revision 2.x
    Slave:       no device present

Confirmed that the reinserted disk is recognized.

At this stage, zpool has not yet noticed that the device is back.

# zpool status lib_01
  pool: lib_01
 state: DEGRADED
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        lib_01      DEGRADED     0     0     0
          raidz1    DEGRADED     0     0     0
            ad10    REMOVED      0     0     0
            ad8     ONLINE       0     0     0
            ad6     ONLINE       0     0     0
            da0     ONLINE       0     0     0
            ad12    ONLINE       0     0     0

errors: No known data errors

Run zpool replace. The new disk sits at the same device node as the old one, so naming only ad10 is enough.

After entering the command, wait a few seconds.

# zpool replace lib_01 ad10
# zpool status lib_01
  pool: lib_01
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h0m, 0.00% done, 869h49m to go
config:

        NAME            STATE     READ WRITE CKSUM
        lib_01          DEGRADED     0     0     0
          raidz1        DEGRADED     0     0     0
            replacing   DEGRADED     0     0     0
              ad10/old  REMOVED      0     0     0
              ad10      ONLINE       0     0     0  4.31M resilvered
            ad8         ONLINE       0     0     0
            ad6         ONLINE       0     0     0
            da0         ONLINE       0     0     0
            ad12        ONLINE       0     0     0

errors: No known data errors

Now all that is left is to wait for it to finish.
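
Rather than rerunning `zpool status` by hand, the wait can be scripted. The following is a minimal sketch that assumes the status text contains the string "resilver in progress" while the resilver runs, as it does in the log above. The function names and the 600-second default interval are my own choices, not anything ZFS prescribes.

```shell
# Succeeds while the `zpool status` text on stdin reports an ongoing resilver.
# Assumption: the text contains "resilver in progress", as in this log.
resilver_in_progress() {
    grep -q 'resilver in progress'
}

# Poll the pool until the resilver line disappears, then print the final status.
wait_for_resilver() {
    pool=$1
    interval=${2:-600}   # seconds between checks; arbitrary default
    while zpool status "$pool" | resilver_in_progress; do
        sleep "$interval"
    done
    zpool status "$pool"
}

# Usage on the live system (not run here):
#   wait_for_resilver lib_01
```

The loop only reads the pool status and changes nothing, so it is safe to stop it at any time.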

Taking a zfs snapshot interrupts the resilver, so remove the script that does it from crontab beforehand.
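
One way to take the job out of crontab without losing the entry is to comment it out. This is a minimal sketch only: the script name `zfs_snapshot.sh` is a placeholder, since the post does not name the real script, and the helper `disable_cron_entries` is hypothetical.

```shell
# Comment out every crontab line containing the given string,
# reading the crontab on stdin and writing the result to stdout.
# Lines that are already comments are left alone.
disable_cron_entries() {
    awk -v pat="$1" '
        index($0, pat) && $0 !~ /^#/ { print "#" $0; next }
        { print }
    '
}

# Usage on the live system (not run here); zfs_snapshot.sh is a placeholder:
#   crontab -l | disable_cron_entries zfs_snapshot.sh | crontab -
```

Once the resilver has finished, removing the leading `#` puts the job back.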

The problem is that I have to do the same thing three more times...