Be careful: Upgrading Debian Jessie to Stretch, with Pacemaker, DRBD and a nested ext4 LVM hosted on VMware products

Detached DRBD (diskless)

Some time ago I set up new Pacemaker cluster nodes with a fresh Debian Stretch installation. I followed our standard installation guide and also created the shared replicated DRBD storage, but whenever I tried to mount the ext4 filesystem, DRBD detached the disks on both nodes with I/O errors. After recreating it, using other storage volumes and testing my ProLiant hardware (which I already suspected of being defective..), the problem still occurred. Somewhere in the middle of testing, however, a quicker setup without LVM worked fine, hm..

Much later I found this post (the only one about it at that time) on the DRBD-user mailing list: [0]
This means: if you use the combination VMware product -> Debian Stretch -> local storage -> DRBD -> LVM -> ext4, you are affected by this bug. It happens because VMware always advertises that the guest supports the "WRITE SAME" feature, which is wrong. The DRBD version shipped with Stretch now also supports WRITE SAME, so it tries to use the feature, which then fails.
This is, by the way, the same reason why VMware users see this in their dmesg:

WRITE SAME failed. Manually zeroing.
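
You can check whether the hypervisor advertises WRITE SAME to the guest by reading the same sysfs attribute that the workaround below writes to; a value greater than 0 means the kernel will try to use the feature (device paths depend on your setup):

grep . /sys/block/*/device/scsi_disk/*/max_write_same_blocks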

As a workaround I now use systemd to disable "WRITE SAME" for all attached block devices in the guest. Simply run the following:

for i in $(find /sys/block/*/device/scsi_disk/*/max_write_same_blocks); do echo "w $i - - - - 0"; done > /etc/tmpfiles.d/write_same.conf
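
The tmpfiles.d entry is only applied automatically at boot. If you want to disable WRITE SAME right away without rebooting, you should be able to apply the file by hand with systemd-tmpfiles and then verify that all values read 0:

systemd-tmpfiles --create /etc/tmpfiles.d/write_same.conf
grep . /sys/block/*/device/scsi_disk/*/max_write_same_blocks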

[0]: http://lists.linbit.com/pipermail/drbd-user/2017-January/022931.html

Pacemaker failovers with DRBD+LVM do not work

If you use DRBD with nested LVM, you already had to add the following lines to your /etc/lvm/lvm.conf in past Debian releases (assuming that sdb and sdc are the backing devices used by DRBD):

filter = [ "r|/dev/sdb.*|/dev/sdc.*|" ]
write_cache_state = 0
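
A quick way to check that the filter is effective (assuming your VG actually sits on a /dev/drbdX device): pvs should report the PV only on the DRBD device and no longer on /dev/sdb or /dev/sdc:

pvs -o pv_name,vg_name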

With Debian Stretch this is not enough. Your failovers will end up in a broken state on the second node, because it cannot find your LVs and VGs. I found out that killing lvmetad helps, so I also added a global_filter (which should apply to all LVM services):

global_filter = [ "r|/dev/sdb.*|/dev/sdc.*|" ]

But this also didn't help.. My only solution was to disable lvmetad (which I am not using anyway). Adding all of this in combination now works for me, and failovers are as smooth as with Jessie:

filter = [ "r|/dev/sdb.*|/dev/sdc.*|" ]
global_filter = [ "r|/dev/sdb.*|/dev/sdc.*|" ]
write_cache_state = 0
use_lvmetad = 0
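
Setting use_lvmetad = 0 only takes effect for new LVM invocations; the daemon itself keeps running until the next reboot. You may also want to stop and disable the corresponding systemd units (unit names as shipped by the Debian lvm2 package, check yours with systemctl list-unit-files | grep lvmetad):

systemctl disable --now lvm2-lvmetad.service lvm2-lvmetad.socket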

Do not forget to update your initrd, so that the LVM configuration is updated on booting your server:

update-initramfs -k all -u
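
If you want to double-check that the rebuilt initrd really picked up the LVM bits, you can list its contents with lsinitramfs (shipped by initramfs-tools; adjust the image name to your kernel version):

lsinitramfs /boot/initrd.img-$(uname -r) | grep lvm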

Reboot, that’s it :)
