Hardware: 3 Raspberry Pis, eth0 @ 100 Mbit
DRBD, Active/Active, istgt for iSCSI on 2 Raspberry Pis
ha-iscsi1 (= hostname), 10.250.30.202
ha-iscsi2 (= hostname), 10.250.30.203
Power supply: Raspberry Pi Foundation and Samsung
DRBD: one 64 GB USB stick per DRBD node
LXC host on one Raspberry Pi
lxc1pi (hostname lxc1), 10.250.30.204
Power supply: powered USB hub
LXC guest
wiki
Installation of ha-iscsi
Hosts: ha-iscsi1 and ha-iscsi2, Raspbian
apt-get install istgt pacemaker crmsh drbd8-utils vim mtr fio sysbench sysstat tcpdump
Configuration
istgt
/etc/istgt/istgt.conf on ha-iscsi2
[Global]
  Comment "Global section"
  NodeBase "ha-iscsi2.203.istgt"
  PidFile /var/run/istgt.pid
  AuthFile /etc/istgt/auth.conf
  MediaDirectory /var/istgt
  LogFacility "local7"
  Timeout 30
  NopInInterval 20
  DiscoveryAuthMethod Auto
  MaxSessions 16
  MaxConnections 4
  MaxR2T 32
  MaxOutstandingR2T 16
  DefaultTime2Wait 2
  DefaultTime2Retain 60
  FirstBurstLength 262144
  MaxBurstLength 1048576
  MaxRecvDataSegmentLength 262144
  InitialR2T Yes
  ImmediateData Yes
  DataPDUInOrder Yes
  DataSequenceInOrder Yes
  ErrorRecoveryLevel 0

[UnitControl]
  Comment "Internal Logical Unit Controller"
  AuthMethod CHAP Mutual
  AuthGroup AuthGroup10000
  Portal UC1 127.0.0.1:3261
  Netmask 127.0.0.1

[PortalGroup1]
  Comment "iscsi portal"
  Portal DA1 10.250.30.203:3260

[InitiatorGroup1]
  Comment "Initiator Group1"
  InitiatorName "iqn.xenx.debian:01:4a4e9e22cdb"
  Netmask 10.250.30.0/24

[LogicalUnit1]
  Comment "Hard Disk Sample"
  TargetName disk1
  TargetAlias "Data Disk1"
  Mapping PortalGroup1 InitiatorGroup1
  AuthMethod Auto
  AuthGroup AuthGroup1
  UseDigest Auto
  UnitType Disk
  LUN0 Storage /dev/drbd0 Auto
Differences in istgt.conf on ha-iscsi1:
...
NodeBase "ha-iscsi1.202.istgt"
...
Portal DA1 10.250.30.202:3260
...
/etc/istgt/auth.conf on ha-iscsi1 and ha-iscsi2
[AuthGroup1]
  Comment "x.xenx"
  Auth "xenx" "password"

[AuthGroup9999]
  Auth "xenx" "password"

[AuthGroup10000]
  Comment "Unit Controller's users"
  Auth "ctluser" "test" "mutualuser" "mutualsecret"
  Auth "onlysingle" "secret"
istgt is not controlled by systemd but by Pacemaker, so I disable istgt in systemd.
systemctl disable istgt
DRBD
/etc/drbd.conf is identical on ha-iscsi1 and ha-iscsi2.
global {
    dialog-refresh 1;
    minor-count 5;
    usage-count no;
}
common {
}
resource r0 {
    protocol C;
    handlers {
        split-brain "/usr/lib/drbd/notify-split-brain.sh root";
    }
    disk {
        on-io-error pass_on;
    }
    syncer {
        verify-alg sha1;
        rate 12M;
    }
    net {
        allow-two-primaries;
        after-sb-0pri discard-younger-primary;
        after-sb-1pri call-pri-lost-after-sb;
        after-sb-2pri call-pri-lost-after-sb;
    }
    startup {
        become-primary-on both;
    }
    on ha-iscsi1 {
        device /dev/drbd0;
        address 10.250.30.202:7788;
        meta-disk /dev/sda1[0];
        disk /dev/sda2;
    }
    on ha-iscsi2 {
        device /dev/drbd0;
        address 10.250.30.203:7788;
        meta-disk /dev/sda1[0];
        disk /dev/sda2;
    }
}
This is a dual-primary configuration controlled by Pacemaker. I use a dedicated meta-disk with indexes; the advantage is that the same meta-disk can also hold the metadata of another DRBD disk.
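The configuration above expects two partitions on the USB stick: /dev/sda1 for the indexed DRBD metadata and /dev/sda2 for the data. A minimal partitioning sketch (the stick is assumed to appear as /dev/sda; the 128 MiB metadata size is an illustrative choice, not taken from the original setup):

# WARNING: this destroys all data on /dev/sda
parted -s /dev/sda mklabel msdos
parted -s /dev/sda mkpart primary 1MiB 129MiB    # sda1: external DRBD metadata (indexed)
parted -s /dev/sda mkpart primary 129MiB 100%    # sda2: DRBD data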
On ha-iscsi1 and 2 I need to initialize DRBD ...
drbdadm create-md r0
... and bring the resource up
drbdadm up r0
"cat /proc/drbd" now shows:
... ds:Inconsistent/Inconsistent ...
In this state DRBD cannot be used yet. I resolve the "Inconsistent" state with the following command on ha-iscsi1:
drbdadm verify r0
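The progress can be followed with a simple watch loop (optional):

watch -n 10 cat /proc/drbd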
For 20 GByte this takes about 20 minutes, and afterwards "cat /proc/drbd" shows:
... ds:UpToDate/UpToDate ...
Corosync and Pacemaker
First, Corosync needs an authentication key, which I generate with ...
corosync-keygen
This creates the file /etc/corosync/authkey, which I copy to the other Raspberry Pi.
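A possible way to copy the key (assuming root ssh access between the nodes):

scp -p /etc/corosync/authkey root@ha-iscsi2:/etc/corosync/
ssh root@ha-iscsi2 chmod 400 /etc/corosync/authkey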
/etc/corosync/corosync.conf on ha-iscsi1 and 2
totem {
    version: 2
    transport: udpu
    cluster_name: ha-iscsi
    token: 3000
    token_retransmits_before_loss_const: 10
    clear_node_high_bit: yes
    crypto_cipher: aes256
    crypto_hash: sha1
    interface {
        ringnumber: 0
        bindnetaddr: 10.250.30.0
        ttl: 1
    }
}
logging {
    fileline: off
    to_stderr: no
    to_logfile: no
    to_syslog: yes
    syslog_facility: daemon
    debug: off
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}
quorum {
    provider: corosync_votequorum
    two_node: 1
}
nodelist {
    node {
        ring0_addr: 10.250.30.202
    }
    node {
        ring0_addr: 10.250.30.203
    }
}

Corosync may have to be restarted:

systemctl stop corosync
systemctl start corosync
I restart Pacemaker as well
systemctl stop pacemaker
systemctl start pacemaker
Now I can configure Pacemaker with crm.
DRBD can only be run as active/active under Pacemaker's control; the drbd-utils tools alone cannot achieve this.
Disable STONITH
crm configure property stonith-enabled=false
Configure and start DRBD Primary/Primary (i.e. Active/Active)
crm configure primitive drbd_r0 ocf:linbit:drbd params drbd_resource="r0" op monitor interval="20" role="Master" timeout="20" op monitor interval="30" role="Slave" timeout="20"
crm configure ms ms_drbd_r0 drbd_r0 meta resource-stickiness="100" master-max="2" notify="true" interleave="true"
Salida de "cat /proc/drbd"
0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:20971392
Salida de "crm status"
Stack: corosync
Current DC: ha-iscsi1 (version 1.1.16-94ff4df) - partition with quorum
Last updated: Sun Mar 25 06:33:25 2018
Last change: Sun Mar 25 06:27:48 2018 by root via cibadmin on ha-iscsi2

2 nodes configured
4 resources configured (2 DISABLED)

Online: [ ha-iscsi1 ha-iscsi2 ]

Full list of resources:

 Master/Slave Set: ms_drbd_r0 [drbd_r0]
     Masters: [ ha-iscsi1 ha-iscsi2 ]
Now the istgt configuration
crm configure primitive istgt lsb:istgt
crm configure clone istgt_clone istgt
Salida de "crm status"
...
 Clone Set: istgt_clone [istgt]
     Started: [ ha-iscsi1 ha-iscsi2 ]
...
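The configuration above has no ordering constraint between DRBD and istgt; both are simply cloned on both nodes. If one wanted istgt to start only after DRBD has been promoted, a possible (untested here) crmsh constraint would be:

crm configure order o_drbd_before_istgt inf: ms_drbd_r0:promote istgt_clone:start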
Installation of lxc1pi
apt install screen nmon ufw libncurses5-dev git bc multipath-tools lxc open-iscsi
For multipath a new kernel has to be built; without it multipathing cannot be used, because these modules are missing:
- scsi_dh_alua
- scsi_dh_emc
- scsi_dh_rdac
- dm-multipath
The kernel source has to be downloaded with git.
cd /usr/src
git clone --depth=1 https://github.com/raspberrypi/linux
cd linux
Configure the kernel.
KERNEL=kernel7
make bcm2709_defconfig
make menuconfig
In the "make menuconfig" menu:
Device Drivers > SCSI device support > SCSI Device Handlers
    <M> LSI RDAC Device Handler
    < > HP/COMPAQ MSA Device Handler
    <M> EMC CLARiiON Device Handler
    <M> SPC-3 ALUA Device Handler
Device Drivers > Multiple devices driver support (RAID and LVM)
    <M> Multipath target
Compile the kernel
make -j4 zImage modules dtbs
Install the modules
make modules_install
Install the new kernel and the other files
cp arch/arm/boot/dts/*.dtb /boot/
cp arch/arm/boot/dts/overlays/*.dtb* /boot/overlays/
cp arch/arm/boot/dts/overlays/README /boot/overlays/
cp arch/arm/boot/zImage /boot/$KERNEL.img
Reboot the system
shutdown -r now
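After the reboot, a quick sanity check that the newly built modules are present and loadable (module names as selected in menuconfig above):

uname -r
modprobe -a dm-multipath scsi_dh_alua scsi_dh_emc scsi_dh_rdac
lsmod | grep -E 'dm_multipath|scsi_dh'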
Configuration
Firewall
Protect the ssh service against traffic from the router (i.e. from the internet). The IP addresses are specific to my setup; anyone else will of course have to change them.
At home my router has the address 10.250.30.250 and my private network is 10.250.30.0/24.
ufw default deny incoming
ufw default allow outgoing
ufw deny from 10.250.30.250 to any port 22
ufw allow from 10.250.30.0/24 to any port 22
open-iscsi
Set the initiator name, which can be found on ha-iscsi1 and 2 in the file /etc/istgt/istgt.conf.
# /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.xenx.debian:01:4a4e9e22cdb
/etc/iscsi/iscsid.conf
iscsid.startup = /usr/sbin/iscsid
node.startup = automatic
node.startup = manual
node.leading_login = No
node.session.timeo.replacement_timeout = 120
node.conn[0].timeo.login_timeout = 15
node.conn[0].timeo.logout_timeout = 15
node.conn[0].timeo.noop_out_interval = 5
node.conn[0].timeo.noop_out_timeout = 5
node.session.err_timeo.abort_timeout = 15
node.session.err_timeo.lu_reset_timeout = 30
node.session.err_timeo.tgt_reset_timeout = 30
node.session.initial_login_retry_max = 8
node.session.cmds_max = 128
node.session.queue_depth = 32
node.session.xmit_thread_priority = -20
node.session.iscsi.InitialR2T = No
node.session.iscsi.ImmediateData = Yes
node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 16776192
Discovery of ha-iscsi1 and 2, login, and automatic startup
ha-iscsi1
iscsiadm -m discovery -t st -p 10.250.30.202
iscsiadm --mode node --targetname ha-iscsi1.202.istgt:disk1 --portal 10.250.30.202:3260 --login
iscsiadm -m node -T ha-iscsi1.202.istgt:disk1 --portal 10.250.30.202:3260 --op update -n node.startup -v automatic
ha-iscsi2
iscsiadm -m discovery -t st -p 10.250.30.203
iscsiadm --mode node --targetname ha-iscsi2.203.istgt:disk1 --portal 10.250.30.203:3260 --login
iscsiadm -m node -T ha-iscsi2.203.istgt:disk1 --portal 10.250.30.203:3260 --op update -n node.startup -v automatic
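An optional check that both sessions are actually logged in:

iscsiadm -m session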
Probar con "cat /proc/partitions"
...
   8        0   20971520 sda
   8       16   20971520 sdb
...
multipath-tools
/etc/multipath.conf
blacklist {
    devnode "ram"
    devnode "mmcblk0"
    devnode "sdc"
}
defaults {
    user_friendly_names yes
    path_selector "round-robin 0"
}
Restart multipath-tools
systemctl stop multipath-tools
systemctl start multipath-tools
Probar con "lsblk"
NAME       MAJ:MIN RM SIZE RO TYPE  MOUNTPOINT
sda          8:0    0  20G  0 disk
└─mpatha   254:0    0  20G  0 mpath
sdb          8:16   0  20G  0 disk
└─mpatha   254:0    0  20G  0 mpath
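In addition, "multipath -ll" should list one map (mpatha) with both paths, sda and sdb:

multipath -ll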
Installation of the first container, publicwiki
pvcreate /dev/mapper/mpatha
vgcreate cvg1 /dev/mapper/mpatha
Create an LXC container
lxc-create -t debian -n wiki -B lvm --vgname cvg1 --fssize 8G --fstype ext4
During this, lxc1 downloads a complete base system of 211 MByte. It can be found in:
/var/cache/lxc/debian/rootfs-stable-armhf
A further container is created with this command ...
lxc-create -t debian -n proxy -B lvm --vgname cvg1 --fssize 8G --fstype ext4
... and lxc reuses the cached system from /var/cache/lxc/debian/rootfs-stable-armhf
The configuration of the wiki container is located here:
/var/lib/lxc/wiki/config
Contents
#lxc.include = /usr/share/lxc/config/nesting.conf # (Be aware this has security implications)
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br0
lxc.network.ipv4 = 10.250.30.206/24
lxc.network.ipv4.gateway = 10.250.30.4
lxc.rootfs = /dev/cvg1/wiki
lxc.rootfs.backend = lvm

# Common configuration
lxc.include = /usr/share/lxc/config/debian.common.conf

# Container specific configuration
lxc.tty = 4
lxc.utsname = wiki
lxc.arch = armhf
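The container configuration refers to the bridge br0, which has to exist on the host lxc1pi. The bridge setup is not part of the original notes; a possible /etc/network/interfaces sketch (assuming eth0 as the physical interface, bridge-utils installed, the host address from above, and the router 10.250.30.250 as gateway):

# /etc/network/interfaces on lxc1pi (sketch)
auto br0
iface br0 inet static
    address 10.250.30.204
    netmask 255.255.255.0
    gateway 10.250.30.250
    bridge_ports eth0
    bridge_fd 0
    bridge_stp off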
Installations in wiki
apt install vim fio apache2 mediawiki php mariadb-server gnutls-bin ufw
MediaWiki configuration in the publicwiki container
/etc/mediawiki/LocalSettings.php
Restrict permissions
...
# prevent editing and reading by anons (except for exception listed below):
$wgGroupPermissions['*']['edit'] = false;
$wgGroupPermissions['*']['createaccount'] = false;
$wgGroupPermissions['*']['read'] = true;
# same for normal registered users:
$wgGroupPermissions['user']['edit'] = false;
$wgGroupPermissions['user']['read'] = true;
# allow everyone read access to these pages:
#$wgWhitelistRead = array( "Main Page", "Public stuff" );
# allow sysops to read and edit normally:
$wgGroupPermissions['sysop']['edit'] = true;
$wgGroupPermissions['sysop']['read'] = true;
Visual Editor
...
wfLoadExtension( 'VisualEditor' );
// Enable by default for everybody
$wgDefaultUserOptions['visualeditor-enable'] = 1;
// Optional: Set VisualEditor as the default for anonymous users
// otherwise they will have to switch to VE
$wgDefaultUserOptions['visualeditor-editor'] = "visualeditor";
// Don't allow users to disable it
$wgHiddenPrefs[] = 'visualeditor-enable';
...
Enable uploads
...
$wgEnableUploads = true;
...
Apache2 configuration in the publicwiki container
I modify this file:
/etc/apache2/sites-enabled/000-default.conf
Explanations are in the comments
...
<VirtualHost ...
...
    # Custom configs, redirect to the mediawiki app
    RewriteEngine on
    RewriteRule ^/$ "/mediawiki/" [R]

    ## Against DDoS attacks ##
    # https://httpd.apache.org/docs/2.4/mod/mod_reqtimeout.html#requestreadtimeout
    # 5/10 seconds (header/body) and one extra second per 1000 bytes
    RequestReadTimeout header=5,MinRate=1000 body=10,MinRate=1000
    # Timeout for GET/POST/PUT requests and for ACKs of TCP packets
    TimeOut 10

    # Some limits for body and header
    #
    # Uploads up to 10 MByte
    LimitRequestBody 10485760
    # Maximum number of request headers per client, default 100
    LimitRequestFields 20
    # Size of the request header in bytes, default 8190
    LimitRequestFieldSize 2048
    # Size of the client request line in bytes, default 8190
    LimitRequestLine 512
    # XML request body in bytes, default 1000000
    LimitXMLRequestBody 100000
</VirtualHost>
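The RewriteEngine/RewriteRule and RequestReadTimeout directives require mod_rewrite and mod_reqtimeout. If they are not already enabled, something along these lines should do it:

a2enmod rewrite reqtimeout
systemctl restart apache2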
Firewall configuration in the publicwiki container
On this system the firewall is very restrictive; outgoing traffic is denied by default.
I also make sure that traffic from the router 10.250.30.250 cannot attack via ssh.
ufw default deny incoming
ufw default deny outgoing
ufw deny from 10.250.30.250 to any port 22
ufw allow from 10.250.30.0/24 to any port 22
ufw allow http
ufw allow https
ufw allow out to any port 53
ufw allow out to any port 80
ufw allow out to any port 443
To test the firewall for 30 seconds:
ufw enable; sleep 30; ufw disable
SSL configuration for Apache2 in the publicwiki container
Not done yet. Let's Encrypt?
Operations
Shutting down the cluster
In this order:
- lxc1pi
- ha-iscsi1
- ha-iscsi2
lxc1pi
- Stop the containers
lxc-stop -n wiki
Check with ...
lxc-ls --fancy
- Shut down the system
shutdown -h now
ha-iscsi1 or ha-iscsi2
- Stop istgt
crm resource stop istgt
- Stop drbd
crm resource stop drbd_r0
- Shut down the system
shutdown -h now
Starting the cluster
In this order:
- ha-iscsi1 and ha-iscsi2
- lxc1pi
ha-iscsi1 or 2
- Start drbd
crm resource start drbd_r0
- Start istgt
crm resource start istgt
Check with ...
crm status
and ...
cat /proc/drbd
and ...
netstat -ntul | grep 3260
lxc1pi
Power on lxc1pi without the USB stick!
Check that the iSCSI disks sda and sdb exist
cat /proc/partitions
and ...
lsblk
and
lvs
Start the wiki container
lxc-start -n wiki
Recovering a failed cluster (workaround)
Problem: if one of the DRBD nodes is shut down and then powered on again, the DRBD nodes start to sync their data, but the sync never completes (cat /proc/drbd -> stalled).
Node lxc1pi is powered off.
Stop istgt and drbd_r0
crm resource stop istgt
crm resource stop drbd_r0
Stop Pacemaker on both nodes
systemctl stop pacemaker
Wipe the meta-disk contents on both nodes
drbdadm down r0
drbdadm wipe-md r0
Initialize DRBD again
On both nodes
drbdadm create-md r0
drbdadm up r0
On one of the nodes
drbdadm verify r0
It may happen that one or both Raspberry Pis lose their USB sticks; in that case re-insert the USB sticks.
Start Pacemaker
systemctl start pacemaker
Wait a few minutes
crm resource start drbd_r0
Wait a few minutes
crm resource start istgt
Then power on lxc1pi
Check whether the LVM volume wiki exists on lxc1pi
lvs
In my case it does
LV   VG   Attr       LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
wiki cvg1 -wi-a----- 8.00g
Failed cluster with DRBD standalone
If you want to do the full recovery at a later time, you can run DRBD standalone on the healthy node (the one showing "UpToDate").
First stop the containers on lxc1pi.
Simply run "shutdown -h now" on the "Inconsistent" node, or cut the power of the bad node.
If the healthy node has already lost its physical disk, shut down the bad node. Then, on the healthy node, stop the services with crm, stop Pacemaker with systemctl and start DRBD manually:
drbdadm up r0
drbdadm primary r0
systemctl start istgt
Then power on lxc1pi and its containers.
crm: Clearing error messages
ha-iscsi1 and ha-iscsi2
crm_resource --cleanup --node ha-iscsi1 --resource drbd_r0
crm_resource --cleanup --node ha-iscsi1 --resource istgt
crm_resource --cleanup --node ha-iscsi2 --resource drbd_r0
crm_resource --cleanup --node ha-iscsi2 --resource istgt
These commands can be run even when there are no errors.
One possible cause of such errors is the USB stick disappearing from Linux.
LVM backup of the LXC guest publicwiki on the LXC host lxc1pi
For backups I use another USB stick.
Via iSCSI the LXC host has the 8 GB LVM volume /dev/cvg1/wiki. So I plug the USB stick into the LXC Raspberry Pi and create another VG:
pvcreate /dev/sdc
vgcreate drbdbackup /dev/sdc
lvcreate -L +8G -n wiki drbdbackup
lvscan
ACTIVE   '/dev/drbdbackup/wiki' [8.00 GiB] inherit
ACTIVE   '/dev/cvg1/wiki' [8.00 GiB] inherit
The following command copies the data from /dev/cvg1/wiki to /dev/drbdbackup/wiki
dd if=/dev/cvg1/wiki of=/dev/drbdbackup/wiki bs=4096 status=progress
To restore:
dd if=/dev/drbdbackup/wiki of=/dev/cvg1/wiki bs=4096 status=progress
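The dd copy is only consistent if the container is not writing to the volume while it runs; a possible backup sequence (briefly stopping the container, names as above):

lxc-stop -n wiki
dd if=/dev/cvg1/wiki of=/dev/drbdbackup/wiki bs=4096 status=progress
lxc-start -n wiki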
DRBD: disk performance test in the LXC guest wiki: random writes
fio --name fio_test_file --direct=1 --rw=randwrite --bs=4k --size=1G --numjobs=4 --time_based --runtime=600 --group_reporting
Result: IOPS=90
fio-2.16
Starting 4 processes
fio_test_file: Laying out IO file(s) (1 file(s) / 1024MB)
fio_test_file: Laying out IO file(s) (1 file(s) / 1024MB)
fio_test_file: Laying out IO file(s) (1 file(s) / 1024MB)
fio_test_file: Laying out IO file(s) (1 file(s) / 1024MB)
Jobs: 4 (f=4): [w(4)] [5.2% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta 03h:03m:59s]
fio_test_file: (groupid=0, jobs=4): err= 0: pid=744: Mon Apr 9 02:09:20 2018
  write: io=218920KB, bw=368930B/s, iops=90, runt=607633msec
    clat (msec): min=1, max=85404, avg=44.37, stdev=1583.74
     lat (msec): min=1, max=85404, avg=44.38, stdev=1583.74
    clat percentiles (usec):
     |  1.00th=[ 1656],  5.00th=[ 1720], 10.00th=[ 1752], 20.00th=[ 1800],
     | 30.00th=[ 1848], 40.00th=[ 1912], 50.00th=[ 1960], 60.00th=[ 2040],
     | 70.00th=[ 2096], 80.00th=[ 2224], 90.00th=[ 2448], 95.00th=[ 2704],
     | 99.00th=[ 3440], 99.50th=[ 4640], 99.90th=[7831552], 99.95th=[16711680],
     | 99.99th=[16711680]
    lat (msec) : 2=55.49%, 4=43.87%, 10=0.36%, 20=0.04%, 50=0.02%
    lat (msec) : 100=0.07%, 250=0.03%, >=2000=0.13%
  cpu          : usr=0.07%, sys=0.69%, ctx=65968, majf=0, minf=81
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=54730/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: io=218920KB, aggrb=360KB/s, minb=360KB/s, maxb=360KB/s, mint=607633msec, maxt=607633msec

Disk stats (read/write):
    dm-1: ios=2/57656, merge=0/0, ticks=10/8063580, in_queue=8794040, util=100.00%, aggrios=2/54916, aggrmerge=0/2752, aggrticks=10/4883320, aggrin_queue=5059740, aggrutil=100.00%
    dm-0: ios=2/54916, merge=0/2752, ticks=10/4883320, in_queue=5059740, util=100.00%, aggrios=16/27453, aggrmerge=0/0, aggrticks=249775/2020915, aggrin_queue=2358900, aggrutil=60.84%
  sdb: ios=20/29301, merge=0/0, ticks=278230/2786920, in_queue=3241630, util=60.84%
  sda: ios=13/25606, merge=0/0, ticks=221320/1254910, in_queue=1476170, util=39.36%
DRBD: disk performance test in the LXC guest wiki: random reads
fio --name fio_test_file --direct=1 --rw=randread --bs=4k --size=1G --numjobs=4 --time_based --runtime=600 --group_reporting
Result: IOPS=487
fio-2.16
Starting 4 processes
fio_test_file: Laying out IO file(s) (1 file(s) / 1024MB)
fio_test_file: Laying out IO file(s) (1 file(s) / 1024MB)
fio_test_file: Laying out IO file(s) (1 file(s) / 1024MB)
fio_test_file: Laying out IO file(s) (1 file(s) / 1024MB)
Jobs: 4 (f=4): [r(4)] [100.0% done] [1704KB/0KB/0KB /s] [426/0/0 iops] [eta 00m:00s]
fio_test_file: (groupid=0, jobs=4): err= 0: pid=753: Mon Apr 9 02:49:54 2018
  read : io=1143.1MB, bw=1950.9KB/s, iops=487, runt=600009msec
    clat (msec): min=1, max=415, avg= 8.17, stdev= 2.92
     lat (msec): min=1, max=415, avg= 8.18, stdev= 2.92
    clat percentiles (usec):
     |  1.00th=[ 5088],  5.00th=[ 5920], 10.00th=[ 6368], 20.00th=[ 6944],
     | 30.00th=[ 7392], 40.00th=[ 7776], 50.00th=[ 8160], 60.00th=[ 8512],
     | 70.00th=[ 8896], 80.00th=[ 9280], 90.00th=[ 9792], 95.00th=[10432],
     | 99.00th=[11840], 99.50th=[12864], 99.90th=[16320], 99.95th=[17536],
     | 99.99th=[22400]
    lat (msec) : 2=0.01%, 4=0.11%, 10=91.65%, 20=8.23%, 50=0.01%
    lat (msec) : 500=0.01%
  cpu          : usr=0.30%, sys=0.84%, ctx=297716, majf=0, minf=74
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=292631/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: io=1143.1MB, aggrb=1950KB/s, minb=1950KB/s, maxb=1950KB/s, mint=600009msec, maxt=600009msec

Disk stats (read/write):
    dm-1: ios=292556/5, merge=0/0, ticks=2353450/40, in_queue=2353820, util=100.00%, aggrios=292631/5, aggrmerge=0/1, aggrticks=2368930/40, aggrin_queue=2368520, aggrutil=100.00%
    dm-0: ios=292631/5, merge=0/1, ticks=2368930/40, in_queue=2368520, util=100.00%, aggrios=146316/2, aggrmerge=0/0, aggrticks=1172205/5, aggrin_queue=1172085, aggrutil=100.00%
  sdb: ios=292633/4, merge=0/0, ticks=2344410/10, in_queue=2344170, util=100.00%
  sda: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
USB stick alone: disk performance test: sequential writes
With cache
dd if=/dev/zero of=zerofile bs=1M count=2048
Result
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 126.028 s, 17.0 MB/s
Without cache
dd if=/dev/zero of=zerofile bs=1M count=2048 oflag=direct
Result
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 107.316 s, 20.0 MB/s
USB stick alone: disk performance test: random writes
fio --name fio_test_file --direct=1 --rw=randwrite --bs=4k --size=1G --numjobs=4 --time_based --runtime=600 --group_reporting
Result: IOPS=92
fio_test_file: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
...
fio-2.2.10
Starting 4 processes
fio_test_file: Laying out IO file(s) (1 file(s) / 1024MB)
fio_test_file: Laying out IO file(s) (1 file(s) / 1024MB)
fio_test_file: Laying out IO file(s) (1 file(s) / 1024MB)
fio_test_file: Laying out IO file(s) (1 file(s) / 1024MB)
Jobs: 4 (f=4): [w(4)] [5.4% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta 02h:58m:59s]
fio_test_file: (groupid=0, jobs=4): err= 0: pid=2776: Wed Mar 14 18:02:10 2018
  write: io=224596KB, bw=378306B/s, iops=92, runt=607937msec
    clat (msec): min=2, max=29861, avg=43.28, stdev=471.46
     lat (msec): min=2, max=29861, avg=43.28, stdev=471.46
    clat percentiles (msec):
     |  1.00th=[   25],  5.00th=[   26], 10.00th=[   26], 20.00th=[   27],
     | 30.00th=[   28], 40.00th=[   28], 50.00th=[   28], 60.00th=[   29],
     | 70.00th=[   29], 80.00th=[   30], 90.00th=[   30], 95.00th=[   31],
     | 99.00th=[   44], 99.50th=[   73], 99.90th=[ 8225], 99.95th=[ 9503],
     | 99.99th=[16712]
    bw (KB  /s): min=    0, max=  164, per=37.09%, avg=136.85, stdev=24.83
    lat (msec) : 4=0.01%, 10=0.01%, 20=0.07%, 50=99.05%, 100=0.58%
    lat (msec) : 250=0.09%, 500=0.02%, 750=0.02%, >=2000=0.15%
  cpu          : usr=0.07%, sys=0.32%, ctx=56830, majf=0, minf=79
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=56149/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: io=224596KB, aggrb=369KB/s, minb=369KB/s, maxb=369KB/s, mint=607937msec, maxt=607937msec

Disk stats (read/write):
  sda: ios=0/58010, merge=0/21112, ticks=0/6911164, in_queue=7051628, util=100.00%
USB stick alone: disk performance test: random reads
fio --name fio_test_file --direct=1 --rw=randread --bs=4k --size=1G --numjobs=4 --time_based --runtime=600 --group_reporting
Result: IOPS=677
fio_test_file: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
...
fio-2.2.10
Starting 4 processes
fio_test_file: Laying out IO file(s) (1 file(s) / 1024MB)
fio_test_file: Laying out IO file(s) (1 file(s) / 1024MB)
fio_test_file: Laying out IO file(s) (1 file(s) / 1024MB)
fio_test_file: Laying out IO file(s) (1 file(s) / 1024MB)
Jobs: 4 (f=4): [r(4)] [100.0% done] [2718KB/0KB/0KB /s] [679/0/0 iops] [eta 00m:00s]
fio_test_file: (groupid=0, jobs=4): err= 0: pid=3482: Wed Mar 14 18:33:42 2018
  read : io=1588.3MB, bw=2710.6KB/s, iops=677, runt=600006msec
    clat (msec): min=2, max=1794, avg= 5.88, stdev=13.72
     lat (msec): min=2, max=1794, avg= 5.88, stdev=13.72
    clat percentiles (usec):
     |  1.00th=[ 4768],  5.00th=[ 4960], 10.00th=[ 5088], 20.00th=[ 5216],
     | 30.00th=[ 5280], 40.00th=[ 5408], 50.00th=[ 5600], 60.00th=[ 5792],
     | 70.00th=[ 6048], 80.00th=[ 6304], 90.00th=[ 6624], 95.00th=[ 7008],
     | 99.00th=[ 7648], 99.50th=[ 7904], 99.90th=[ 8512], 99.95th=[ 8640],
     | 99.99th=[13888]
    bw (KB  /s): min=   42, max=  731, per=25.39%, avg=688.09, stdev=52.28
    lat (msec) : 4=0.01%, 10=99.99%, 20=0.01%, 50=0.01%, 2000=0.01%
  cpu          : usr=0.33%, sys=1.04%, ctx=407412, majf=0, minf=73
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=406591/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: io=1588.3MB, aggrb=2710KB/s, minb=2710KB/s, maxb=2710KB/s, mint=600006msec, maxt=600006msec

Disk stats (read/write):
  sda: ios=406550/4, merge=0/1, ticks=2372436/84, in_queue=2371860, util=100.00%
Troubleshooting
Insufficient voltage causes many symptoms
Solution: a power supply from the Raspberry Pi Foundation or from Samsung
Symptoms:
- DRBD nodes: many I/O errors on sda, observed with "dmesg"
- LXC guest: the ext4 root filesystem becomes read-only
- DRBD nodes: one node becomes "Inconsistent", observed with "cat /proc/drbd" ("UpToDate" is the normal state)
How I realized that the voltage was insufficient:
/var/log/kern.log
...
... Voltage normalised ...
... Under-voltage detected ...
Split-brain in DRBD
Cause: the two DRBD nodes could not communicate on port 7788. There was heavy traffic on iSCSI (port 3260) and DRBD (port 7788) because I was benchmarking the disks with fio and dd. The iSCSI traffic was probably too high, and I assume that as a consequence the two DRBD nodes could not reach each other on port 7788.
Another possible cause: insufficient voltage.
Symptom: DRBD almost always failed while I was benchmarking the disks with fio or dd.
Solution: traffic shaping with this script, /root/tc.sh, on ha-iscsi1 and ha-iscsi2
DEV=eth0
IPT=/sbin/iptables
TC=/sbin/tc

$IPT -t mangle -F
$TC qdisc del dev $DEV ingress > /dev/null 2>&1
$TC qdisc del dev $DEV root > /dev/null 2>&1
$TC qdisc del dev lo root > /dev/null 2>&1

$TC qdisc add dev $DEV root handle 1:0 htb default 12 r2q 6
$TC class add dev $DEV parent 1:0 classid 1:1 htb rate 100mbit ceil 100mbit
$TC class add dev $DEV parent 1:1 classid 1:10 htb rate 10mbit ceil 100mbit prio 0
$TC class add dev $DEV parent 1:1 classid 1:11 htb rate 50mbit ceil 100mbit prio 1
$TC class add dev $DEV parent 1:1 classid 1:12 htb rate 40mbit ceil 100mbit prio 2

$IPT -A POSTROUTING -t mangle -o $DEV -p tcp -m length --length :64 -j MARK --set-mark 10
$IPT -A POSTROUTING -t mangle -o $DEV -p udp --dport 5405 -j MARK --set-mark 10
$IPT -A POSTROUTING -t mangle -o $DEV -p udp --sport 5405 -j MARK --set-mark 10
$IPT -A POSTROUTING -t mangle -o $DEV -p tcp --dport 7788 -j MARK --set-mark 11
$IPT -A POSTROUTING -t mangle -o $DEV -p tcp --sport 7788 -j MARK --set-mark 11

$TC filter add dev $DEV parent 1:0 prio 0 protocol ip handle 10 fw flowid 1:10
$TC filter add dev $DEV parent 1:0 prio 0 protocol ip handle 11 fw flowid 1:11

$TC qdisc add dev $DEV parent 1:10 handle 10: sfq perturb 10
$TC qdisc add dev $DEV parent 1:11 handle 11: sfq perturb 10
$TC qdisc add dev $DEV parent 1:12 handle 12: sfq perturb 10
What this means:
* Small TCP packets (ACKs) and the Corosync port 5405 get the highest priority (prio 0, mangle mark 10): 10 Mbit guaranteed.
* The DRBD port 7788 gets the second highest priority (prio 1, mangle mark 11): 50 Mbit guaranteed.
* All other ports, and therefore also iSCSI on port 3260, get the third priority (prio 2, tc "default 12"): 40 Mbit guaranteed.
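The script is not persistent across reboots, so it has to be run again after every boot (for example from /etc/rc.local). To apply it and inspect the resulting classes:

chmod +x /root/tc.sh
/root/tc.sh
tc -s class show dev eth0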
DRBD resync is "stalled"
Cause: I shut down one DRBD node for maintenance. After powering it back on, the sync ran for approximately 8 GByte (sometimes more, sometimes less) and then stalled.
Workaround: see the section "Recovering a failed cluster (workaround)" in this document.
Possible solution: a DRBD disk with its own power supply, connected via USB, i.e. a 3.5" disk enclosure (2.5" enclosures normally do not have their own power supply).
To_Do
- Dyndns
- Let's encrypt