#Docker #btrfs #systemd #server-ops

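Disk and mount layout

Device        Size   Type    Mount point              Notes
sda (SSD)     119G   ext4    /                        System disk, 93% used
sdb (WD 4T)   3.6T   exfat   /smb                     External drive for media and file sharing
sdc (WD 2T)   1.8T   btrfs   /nfs                     btrfs multi-device volume, member 1
sdd (2T)      1.8T   btrfs   none (kernel-managed)    btrfs multi-device volume, member 2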

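About the btrfs volume

sdc and sdd share the same UUID and form a single btrfs multi-device volume (RAID1 mirror), mounted at /nfs, with about 1.8T of usable capacity. It is normal for sdd to show no mount point of its own; umount /nfs releases both disks at once. The system uses no software RAID (mdstat is empty).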

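Device names (sdb/sdc/sdd) may be reassigned on each boot depending on the order in which the disks are detected, but fstab mounts by UUID, so the mounts are unaffected by device names.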
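
One way to confirm that two devices belong to the same volume is to look for a UUID that appears on more than one device. The helper below is a minimal sketch, not part of the setup described here: shared_uuids is a hypothetical name, and it assumes its input is whitespace-separated "NAME UUID" pairs, such as lsblk -rno NAME,UUID prints.

```shell
# shared_uuids: read "NAME UUID" pairs on stdin and print every UUID that is
# carried by more than one device, followed by those devices.
# Lines without a UUID (only one field) are ignored.
shared_uuids() {
    awk 'NF >= 2 {
             devs[$2] = (devs[$2] ? devs[$2] " " : "") $1   # append device name
             count[$2]++
         }
         END {
             for (u in count) if (count[u] > 1) print u ": " devs[u]
         }'
}

# Typical read-only use on the live system (may need root to see UUIDs):
#   lsblk -rno NAME,UUID | shared_uuids
```

On this machine the expectation would be a single line naming sdc and sdd. Running btrfs filesystem show /nfs gives the same information from the btrfs side.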


Service dependencies

/smb (sdb)      ←── qbittorrent             download directory /smb/Downloads/
                ←── webdav                  shared directory /smb

/nfs (sdc+sdd)  ←── mt-photos               data /nfs/mt_photos/
                ←── mt-photos-ai            no external mount (depends on the mt-photos network)
                ←── mt-photos-insightface   no external mount
                ←── webdav                  Joplin directory /nfs/joplin

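Automated mount and start/stop configuration

Two custom files handle this: at boot they mount the external disks and start the Docker containers, and at shutdown they stop the containers and unmount the disks.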

/etc/systemd/system/ext-mount.service

[Unit]
Description=External Disk Mount Service with Retry
After=network-online.target
Requires=local-fs.target
After=local-fs.target

[Service]
Type=oneshot
RemainAfterExit=yes
TimeoutStartSec=300s
ExecStart=/usr/local/bin/mount-retry.sh
# On shutdown: stop every container that depends on the disks first, then unmount the disks
ExecStop=/bin/sh -c 'docker stop webdav qbittorrent mt-photos mt-photos-ai mt-photos-insightface-unofficial 2>/dev/null; /bin/umount -l /nfs 2>/dev/null; /bin/umount -l /smb 2>/dev/null; true'

[Install]
WantedBy=multi-user.target

/usr/local/bin/mount-retry.sh

#!/bin/bash
# Mount the fstab entries with retries, then start the containers that need them.

MAX_RETRY=6
RETRY_DELAY=15
MOUNT_POINT_PHOTOS="/nfs"
MOUNT_POINT_MEDIAS="/smb"
CONTAINERS=("qbittorrent" "mt-photos" "mt-photos-ai" "mt-photos-insightface-unofficial")

success=false

for attempt in $(seq 1 "$MAX_RETRY"); do
    echo "Attempt ${attempt}: Triggering fstab mounts..."

    if /bin/mount -a 2>&1; then
        echo "✅ Mount command executed successfully"
    else
        echo "⚠️ Mount command returned error (continuing anyway)"
    fi

    # Check directly that both mount points are actually mounted
    if mountpoint -q "$MOUNT_POINT_PHOTOS" && mountpoint -q "$MOUNT_POINT_MEDIAS"; then
        echo "✅ Both mount points accessible"
        success=true
        break
    fi

    # Wait before the next attempt; no sleep after the final one
    if [[ $attempt -lt $MAX_RETRY ]]; then
        echo "⏳ Mount failed. Retrying in $RETRY_DELAY seconds..."
        sleep "$RETRY_DELAY"
    fi
done

if $success; then
    echo "✅ Mount succeeded! Managing Docker containers..."
    for container in "${CONTAINERS[@]}"; do
        # Skip names that do not correspond to an existing container
        if ! docker inspect "$container" >/dev/null 2>&1; then
            echo "⚠️ Container '$container' does not exist. Skipping..."
            continue
        fi
        status=$(docker inspect -f '{{.State.Status}}' "$container" 2>/dev/null)
        case "$status" in
            "running") echo "🔵 Container '$container' is already running" ;;
            "exited"|"created"|"paused"|"")
                echo "🟢 Starting container: $container"
                if docker start "$container"; then
                    echo "✅ Started $container"
                else
                    echo "❌ Failed to start $container"
                fi
                ;;
            *) echo "⚠️ Container '$container' in unknown state: $status" ;;
        esac
    done
    exit 0
else
    echo "❌ ERROR: Mount failed after $MAX_RETRY attempts" >&2
    exit 1
fi

Key /etc/fstab entries

UUID=<btrfs-uuid>  /nfs  btrfs  defaults,nofail,noatime,compress=zstd,x-systemd.device-timeout=120s  0  0
UUID=<exfat-uuid> /smb exfat defaults,nofail,x-systemd.requires=network-online.target,uid=1000,gid=1000,fmask=000,dmask=000,errors=remount-ro,x-systemd.device-timeout=120s 0 0

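nofail lets the system boot normally even when a disk is not ready; x-systemd.device-timeout=120s gives the mechanical drives enough time to be detected after a cold start.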
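
Because a missing option only shows up at the next cold boot, it can help to check the entries right after editing the file. The function below is a minimal sketch under stated assumptions: check_fstab_opts is a hypothetical helper, it assumes the standard six-field fstab layout, and it matches options by name only, ignoring any =value part.

```shell
# check_fstab_opts FSTAB_FILE MOUNT_POINT OPTION...
# Succeeds only if MOUNT_POINT has an uncommented entry in FSTAB_FILE whose
# option field (field 4) contains every OPTION.
check_fstab_opts() {
    local fstab=$1 mp=$2 opts opt
    shift 2
    # Field 2 is the mount point, field 4 the comma-separated option list.
    opts=$(awk -v mp="$mp" '$1 !~ /^#/ && $2 == mp { print $4; exit }' "$fstab")
    if [ -z "$opts" ]; then
        echo "no entry for $mp" >&2
        return 1
    fi
    for opt in "$@"; do
        case ",$opts," in
            *",$opt,"*|*",$opt="*) ;;   # present, with or without a value
            *) echo "$mp: missing option $opt" >&2; return 1 ;;
        esac
    done
    return 0
}

# Example use against the real file:
#   check_fstab_opts /etc/fstab /nfs nofail x-systemd.device-timeout
#   check_fstab_opts /etc/fstab /smb nofail x-systemd.device-timeout
```

The function only reads the file and reports the first missing option on stderr, so it is safe to run at any time.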

Reload after making changes:

sudo systemctl daemon-reload
sudo systemctl restart ext-mount.service
systemctl status ext-mount.service

Manual shutdown procedure

Normally, stopping ext-mount.service takes care of the containers and disks automatically. To do it by hand:

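# 1. Stop the Docker containers (webdav also depends on /nfs, so stop it too)
docker stop webdav qbittorrent mt-photos mt-photos-ai mt-photos-insightface-unofficial

# 2. Confirm that all of them have stopped
docker ps

# 3. Unmount the external disks
sudo umount /nfs
sudo umount /smb

# 4. Confirm that the unmount succeeded
df -h | grep -E 'nfs|smb'
# Expected: no output

# 5. Shut down
sudo shutdown -h now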

Manual startup procedure

After boot, ext-mount.service mounts the disks and starts the containers automatically. To intervene manually:

# 1. Confirm that the disks are mounted
df -h | grep -E 'nfs|smb'

# 2. If they are not, trigger the mounts manually
sudo mount -a

# 3. Start the containers in dependency order
docker start qbittorrent
docker start webdav
docker start mt-photos-ai
docker start mt-photos-insightface-unofficial
docker start mt-photos

# 4. Confirm that all of them are running
docker ps --format 'table {{.Names}}\t{{.Status}}'
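
Reading the docker ps table by eye is easy to get wrong with five similarly named containers. The function below is a minimal sketch of an automated check; missing_containers is a hypothetical helper, and it assumes it receives the names of running containers on stdin, one per line, as docker ps --format '{{.Names}}' prints them.

```shell
# missing_containers NAME...
# Reads running container names on stdin (one per line) and prints every
# expected NAME that is absent. Returns non-zero if at least one is missing.
missing_containers() {
    local running name rc=0
    running=$(cat)
    for name in "$@"; do
        # -x matches whole lines only, so "mt-photos" does not match "mt-photos-ai";
        # -F treats the name as a fixed string rather than a regular expression.
        if ! printf '%s\n' "$running" | grep -qxF -- "$name"; then
            echo "$name"
            rc=1
        fi
    done
    return "$rc"
}

# Example use:
#   docker ps --format '{{.Names}}' | missing_containers \
#       webdav qbittorrent mt-photos mt-photos-ai mt-photos-insightface-unofficial
```

No output and a zero exit status mean every listed container is running.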

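Known issues and fix log

Issue: after a reboot, the mt-photos mount failed and the service misbehaved

Root causes (three, compounding):

  1. Success-check bug in mount-retry.sh: the $MOUNT_POINT variable was never defined in the script, so grep received an empty pattern, which matches every line. The check therefore always succeeded and masked the real mount failure

  2. Device detection timeout: on a cold start the mechanical drives took longer to be detected than systemd's default wait, so both /nfs and /smb reported "Timed out waiting for device"

  3. ExecStop omitted webdav: webdav mounts both /nfs and /smb, and because it was not stopped first, umount failed with "target is busy"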

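Fixes:

  • mount-retry.sh: the success check now tests both mount points directly with mountpoint -q, and the ineffective grep "$MOUNT_POINT" was removed

  • ext-mount.service: ExecStop now also stops webdav and also unmounts /smb; TimeoutStartSec was raised from 180s to 300s

  • fstab: both external-disk entries now include x-systemd.device-timeout=120s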
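
The sketch below reproduces the success-check bug in isolation. The exact original line is not recorded in these notes, so old_check is a hypothetical reconstruction that assumes the script searched a mount listing with grep "$MOUNT_POINT"; new_check mirrors the mountpoint -q test used now.

```shell
# Hypothetical reconstruction of the old check. $1 stands in for the output
# of `mount`; MOUNT_POINT is read from the environment, as in the old script.
old_check() {
    printf '%s\n' "$1" | grep -q "$MOUNT_POINT"
}

# Current approach: ask whether the directory itself is a mount point.
new_check() {
    mountpoint -q "$1"
}

MOUNT_POINT=""   # never assigned in the buggy script, so it expanded to nothing
sample='/dev/sda1 on / type ext4 (rw,relatime)'   # made-up listing without /nfs

# An empty pattern matches every line, so this branch is taken even though
# the listing contains no /nfs entry at all.
if old_check "$sample"; then
    echo "old check: reports success although /nfs is not in the listing"
fi
```

Once MOUNT_POINT is set to /nfs, the same old_check call fails as it should. That explains why the problem went unnoticed until a real mount failure occurred.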


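Notes

  1. System disk at 93% usage: regularly clean up /tmp and dangling Docker images (docker image prune)

  2. No separate action is needed for btrfs sdd: umount /nfs releases both sdc and sdd automatically

  3. umount failure: if it reports "target is busy", use lsof +D /nfs to find the processes holding the mount, kill them, then retry

  4. webdav is not in the auto-start list: the CONTAINERS array in mount-retry.sh does not currently include webdav; add it manually if it should start automatically at boot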
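
For the last note, the change amounts to one edited line in /usr/local/bin/mount-retry.sh. The fragment below is a sketch of that line with webdav added. Placing webdav second simply mirrors the manual startup order and assumes webdav needs nothing beyond /nfs and /smb being mounted.

```shell
# CONTAINERS line for mount-retry.sh with webdav added (bash array syntax).
# Assumption: the position of webdav in the list is not significant, because
# the script only starts these containers after both mounts have succeeded.
CONTAINERS=("qbittorrent" "webdav" "mt-photos" "mt-photos-ai" "mt-photos-insightface-unofficial")

# The existing start loop picks up the new entry without further changes;
# this stand-in loop only prints what would be started.
for container in "${CONTAINERS[@]}"; do
    echo "would start: $container"
done
```

The script is read afresh on each run, so no daemon-reload is needed. The change takes effect the next time ext-mount.service starts.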