HomeLab – Upgrade Main Storage Server from TrueNAS Core to TrueNAS Scale May Not Be a Good Idea

After waiting for a few months, I finally upgraded my main storage server from TrueNAS Core to TrueNAS Scale as part of my server upgrade. Now I am asking myself: did I really need this upgrade?

Homelab – Basic VM Disk IO Benchmark

I upgraded my storage server over the last few weeks, and have just run some IO benchmarks from a VM on my hypervisor.

I am using VMware ESXi as my hypervisor on a dedicated machine. It connects to my TrueNAS Scale storage server over a 10G iSCSI connection with the standard 1500 MTU (no jumbo frames).

The tests are performed with fio, following Google’s simple guideline for VM disk IO benchmarking: https://cloud.google.com/compute/docs/disks/benchmarking-pd-performance
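Two of the runs below stop early with "No space left on device" (ENOSPC), so it may be worth checking the free space under $TEST_DIR before starting. The following is a minimal sketch and not part of Google's guideline; `has_free_space` is a hypothetical helper name, and it assumes a POSIX `df` and `awk` are available in the VM.

```shell
# Hypothetical helper: succeed only if DIR has at least NEED_GIB GiB free.
# Usage: has_free_space DIR NEED_GIB
has_free_space() {
    # df -Pk prints one POSIX-format line per filesystem in 1024-byte blocks;
    # the fourth column of the second line is the available space.
    avail_kib=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    need_kib=$(( $2 * 1024 * 1024 ))
    [ "$avail_kib" -ge "$need_kib" ]
}
```

For example, `has_free_space "$TEST_DIR" 80 && sudo fio ...` would gate the throughput tests, which lay out eight 10 GiB files (`--numjobs=8 --size=10G`), while the single-job IOPS tests need 10 GiB.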

Test write throughput by performing sequential writes with multiple parallel streams (8+), using an I/O block size of 1 MB and an I/O depth of at least 64:

sudo fio --name=write_throughput --directory=$TEST_DIR --numjobs=8 \
--size=10G --time_based --runtime=60s --ramp_time=2s --ioengine=libaio \
--direct=1 --verify=0 --bs=1M --iodepth=64 --rw=write \
--group_reporting=1
# sudo fio --name=write_throughput --directory=$TEST_DIR --numjobs=8 \
> --size=10G --time_based --runtime=60s --ramp_time=2s --ioengine=libaio \
> --direct=1 --verify=0 --bs=1M --iodepth=64 --rw=write \
> --group_reporting=1
write_throughput: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=64
...
fio-3.19
Starting 8 processes
write_throughput: Laying out IO file (1 file / 10240MiB)
write_throughput: Laying out IO file (1 file / 10240MiB)
write_throughput: Laying out IO file (1 file / 10240MiB)
write_throughput: Laying out IO file (1 file / 10240MiB)
write_throughput: Laying out IO file (1 file / 10240MiB)
Jobs: 7 (f=7): [W(4),_(1),W(3)][52.5%][w=64.9MiB/s][w=64 IOPS][eta 00m:58s]
write_throughput: (groupid=0, jobs=8): err= 0: pid=2502: Thu Jun 16 17:30:06 2022
  write: IOPS=224, BW=229MiB/s (240MB/s)(13.7GiB/61466msec); 0 zone resets
    slat (usec): min=47, max=1918.2k, avg=22289.12, stdev=138620.30
    clat (msec): min=19, max=49855, avg=1843.07, stdev=5712.01
     lat (msec): min=176, max=50052, avg=1865.65, stdev=5811.18
    clat percentiles (msec):
     |  1.00th=[  180],  5.00th=[  184], 10.00th=[  300], 20.00th=[  735],
     | 30.00th=[  776], 40.00th=[  793], 50.00th=[  827], 60.00th=[  860],
     | 70.00th=[  919], 80.00th=[ 1301], 90.00th=[ 1569], 95.00th=[ 2072],
     | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113],
     | 99.99th=[17113]
   bw (  KiB/s): min=28617, max=1160657, per=100.00%, avg=252403.65, stdev=41662.93, samples=472
   iops        : min=   21, max= 1129, avg=240.85, stdev=40.74, samples=472
  lat (msec)   : 20=0.01%, 100=0.01%, 250=9.83%, 500=0.87%, 750=13.68%
  lat (msec)   : 1000=53.79%, 2000=18.23%, >=2000=5.37%
  cpu          : usr=0.20%, sys=0.38%, ctx=13534, majf=0, minf=467
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.1%, 16=0.6%, 32=1.2%, >=64=98.1%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=0,13827,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: bw=229MiB/s (240MB/s), 229MiB/s-229MiB/s (240MB/s-240MB/s), io=13.7GiB (14.8GB), run=61466-61466msec

Disk stats (read/write):
    dm-0: ios=0/16134, merge=0/0, ticks=0/12396824, in_queue=12396824, util=99.90%, aggrios=0/16134, aggrmerge=0/1, aggrticks=0/12424846, aggrin_queue=12424845, aggrutil=99.87%
  sda: ios=0/16134, merge=0/1, ticks=0/12424846, in_queue=12424845, util=99.87%
[root@localhost Data]#

Test write IOPS by performing random writes, using an I/O block size of 4 KB and an I/O depth of at least 64:

sudo fio --name=write_iops --directory=$TEST_DIR --size=10G \
--time_based --runtime=60s --ramp_time=2s --ioengine=libaio --direct=1 \
--verify=0 --bs=4K --iodepth=64 --rw=randwrite --group_reporting=1
[root@io-test ~]# clear
[root@io-test ~]# sudo fio --name=write_iops --directory=$TEST_DIR --size=10G \
> --time_based --runtime=60s --ramp_time=2s --ioengine=libaio --direct=1 \
> --verify=0 --bs=4K --iodepth=64 --rw=randwrite --group_reporting=1
write_iops: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.19
Starting 1 process
fio: io_u error on file /Data/fiotest/write_iops.0.0: No space left on device: write offset=303255552, buflen=4096
fio: pid=2596, err=28/file:io_u.c:1803, func=io_u error, error=No space left on device

write_iops: (groupid=0, jobs=1): err=28 (file:io_u.c:1803, func=io_u error, error=No space left on device): pid=2596: Thu Jun 16 17:32:46 2022
  write: IOPS=60.0k, BW=238MiB/s (250MB/s)(5709MiB/23972msec); 0 zone resets
    slat (usec): min=3, max=36791, avg= 8.45, stdev=39.28
    clat (usec): min=184, max=177493, avg=1040.16, stdev=2205.25
     lat (usec): min=198, max=177499, avg=1048.82, stdev=2206.88
    clat percentiles (usec):
     |  1.00th=[   635],  5.00th=[   717], 10.00th=[   750], 20.00th=[   799],
     | 30.00th=[   832], 40.00th=[   857], 50.00th=[   889], 60.00th=[   922],
     | 70.00th=[   963], 80.00th=[  1020], 90.00th=[  1139], 95.00th=[  1647],
     | 99.00th=[  3294], 99.50th=[  4686], 99.90th=[  9765], 99.95th=[ 15008],
     | 99.99th=[160433]
   bw (  KiB/s): min=82464, max=294088, per=100.00%, avg=245998.02, stdev=67413.87, samples=47
   iops        : min=20616, max=73522, avg=61499.38, stdev=16853.44, samples=47
  lat (usec)   : 250=0.01%, 500=0.01%, 750=9.57%, 1000=67.11%
  lat (msec)   : 2=18.91%, 4=3.75%, 10=0.55%, 20=0.05%, 50=0.02%
  lat (msec)   : 100=0.01%, 250=0.02%
  cpu          : usr=14.15%, sys=45.55%, ctx=156240, majf=0, minf=68
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,1461581,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: bw=238MiB/s (250MB/s), 238MiB/s-238MiB/s (250MB/s-250MB/s), io=5709MiB (5987MB), run=23972-23972msec

Disk stats (read/write):
    dm-0: ios=0/1592185, merge=0/0, ticks=0/1429699, in_queue=1429699, util=99.56%, aggrios=0/1591823, aggrmerge=0/362, aggrticks=0/1422162, aggrin_queue=1422162, aggrutil=99.56%
  sda: ios=0/1591823, merge=0/362, ticks=0/1422162, in_queue=1422162, util=99.56%
[root@io-test ~]#

Test read throughput by performing sequential reads with multiple parallel streams (8+), using an I/O block size of 1 MB and an I/O depth of at least 64:

sudo fio --name=read_throughput --directory=$TEST_DIR --numjobs=8 \
--size=10G --time_based --runtime=60s --ramp_time=2s --ioengine=libaio \
--direct=1 --verify=0 --bs=1M --iodepth=64 --rw=read \
--group_reporting=1
[root@io-test ~]# sudo fio --name=read_throughput --directory=$TEST_DIR --numjobs=8 \
> --size=10G --time_based --runtime=60s --ramp_time=2s --ioengine=libaio \
> --direct=1 --verify=0 --bs=1M --iodepth=64 --rw=read \
> --group_reporting=1
read_throughput: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=64
...
fio-3.19
Starting 8 processes
Jobs: 2 (f=2): [_(1),R(2),_(5)][100.0%][r=14.0GiB/s][r=15.3k IOPS][eta 00m:00s]
read_throughput: (groupid=0, jobs=8): err= 0: pid=1881: Thu Jun 16 17:48:29 2022
  read: IOPS=21.4k, BW=20.9GiB/s (22.4GB/s)(1259GiB/60292msec)
    slat (usec): min=13, max=76074, avg=197.95, stdev=422.19
    clat (nsec): min=1404, max=693868k, avg=23722148.51, stdev=59989033.56
     lat (usec): min=86, max=693925, avg=23920.38, stdev=60025.44
    clat percentiles (msec):
     |  1.00th=[   12],  5.00th=[   12], 10.00th=[   12], 20.00th=[   12],
     | 30.00th=[   12], 40.00th=[   13], 50.00th=[   13], 60.00th=[   13],
     | 70.00th=[   13], 80.00th=[   13], 90.00th=[   13], 95.00th=[   14],
     | 99.00th=[  342], 99.50th=[  347], 99.90th=[  384], 99.95th=[  414],
     | 99.99th=[  456]
   bw (  MiB/s): min=18584, max=26532, per=100.00%, avg=21483.09, stdev=176.33, samples=954
   iops        : min=18579, max=26529, avg=21478.41, stdev=176.35, samples=954
  lat (usec)   : 2=0.01%, 4=0.01%, 10=0.01%, 100=0.01%, 250=0.01%
  lat (usec)   : 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.07%, 20=96.31%, 50=0.02%
  lat (msec)   : 250=0.01%, 500=3.57%, 750=0.01%
  cpu          : usr=0.82%, sys=49.54%, ctx=41872, majf=0, minf=466
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=1288207,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=20.9GiB/s (22.4GB/s), 20.9GiB/s-20.9GiB/s (22.4GB/s-22.4GB/s), io=1259GiB (1351GB), run=60292-60292msec

Disk stats (read/write):
    dm-0: ios=47642/87, merge=0/0, ticks=15825386/36345, in_queue=15861731, util=100.00%, aggrios=47652/75, aggrmerge=1189/12, aggrticks=15817689/23298, aggrin_queue=15840987, aggrutil=100.00%
  sda: ios=47652/75, merge=1189/12, ticks=15817689/23298, in_queue=15840987, util=100.00%
[root@io-test ~]#
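
As a sanity check on the figure above, the reported read bandwidth can be compared with what the link itself can carry. This is a rough sketch that ignores protocol overhead; `link_limit_mbs` and `exceeds_link` are hypothetical helper names, and the only inputs are the 10G link speed and the 22.4GB/s that fio reports.

```shell
# Raw capacity of a link in MB/s: Gbit/s * 1000 / 8 (protocol overhead ignored).
link_limit_mbs() {
    awk -v g="$1" 'BEGIN { printf "%.0f\n", g * 1000 / 8 }'
}

# Succeed if a measured bandwidth (MB/s) is above the raw link capacity (Gbit/s).
exceeds_link() {
    awk -v m="$1" -v g="$2" 'BEGIN { exit !(m > g * 1000 / 8) }'
}

link_limit_mbs 10            # prints 1250
if exceeds_link 22400 10; then
    echo "read bandwidth is above the link limit"
fi
```

A 10G link tops out at about 1250 MB/s, so 22.4GB/s cannot have travelled over iSCSI. The disk stats point the same way: sda saw only 47652 reads for the 1288207 issued by fio. Most likely these reads never reached the storage server, so this number says little about the pool itself.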

Test read IOPS by performing random reads, using an I/O block size of 4 KB and an I/O depth of at least 64:

sudo fio --name=read_iops --directory=$TEST_DIR --size=10G \
--time_based --runtime=60s --ramp_time=2s --ioengine=libaio --direct=1 \
--verify=0 --bs=4K --iodepth=64 --rw=randread --group_reporting=1
[root@io-test ~]# sudo fio --name=read_iops --directory=$TEST_DIR --size=10G \
> --time_based --runtime=60s --ramp_time=2s --ioengine=libaio --direct=1 \
> --verify=0 --bs=4K --iodepth=64 --rw=randread --group_reporting=1
read_iops: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.19
Starting 1 process
read_iops: Laying out IO file (1 file / 10240MiB)
fio: ENOSPC on laying out file, stopping
Jobs: 1 (f=1): [r(1)][100.0%][r=1061MiB/s][r=272k IOPS][eta 00m:00s]
read_iops: (groupid=0, jobs=1): err= 0: pid=1913: Thu Jun 16 17:50:33 2022
  read: IOPS=271k, BW=1057MiB/s (1108MB/s)(61.9GiB/60001msec)
    slat (nsec): min=1569, max=697200, avg=2087.86, stdev=1413.17
    clat (usec): min=2, max=11769, avg=233.84, stdev=26.11
     lat (usec): min=5, max=11886, avg=236.04, stdev=26.43
    clat percentiles (usec):
     |  1.00th=[  182],  5.00th=[  204], 10.00th=[  217], 20.00th=[  225],
     | 30.00th=[  225], 40.00th=[  227], 50.00th=[  229], 60.00th=[  233],
     | 70.00th=[  237], 80.00th=[  243], 90.00th=[  260], 95.00th=[  273],
     | 99.00th=[  330], 99.50th=[  359], 99.90th=[  441], 99.95th=[  562],
     | 99.99th=[  816]
   bw (  MiB/s): min=  946, max= 1085, per=100.00%, avg=1058.05, stdev=22.12, samples=119
   iops        : min=242382, max=277826, avg=270859.52, stdev=5662.33, samples=119
  lat (usec)   : 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%, 100=0.01%
  lat (usec)   : 250=85.50%, 500=14.43%, 750=0.05%, 1000=0.02%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%
  cpu          : usr=48.33%, sys=51.25%, ctx=281, majf=0, minf=58
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=16234920,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=1057MiB/s (1108MB/s), 1057MiB/s-1057MiB/s (1108MB/s-1108MB/s), io=61.9GiB (66.5GB), run=60001-60001msec

Disk stats (read/write):
    dm-0: ios=2831/16, merge=0/0, ticks=1340/15, in_queue=1355, util=6.47%, aggrios=2833/10, aggrmerge=0/6, aggrticks=1338/9, aggrin_queue=1346, aggrutil=6.45%
  sda: ios=2833/10, merge=0/6, ticks=1338/9, in_queue=1346, util=6.45%
[root@io-test ~]#

Homelab – Mikrotik CCR1009-8G-1S-1S+ Noctua Fan Replacement

I bought a Mikrotik CCR1009-8G-1S-1S+ several years ago and use it as my main router at home. It was relatively quiet under low load while sitting on a shelf, but after I moved to a new apartment and put it into a network cabinet, the noise was much higher, especially in the hot summer. So I decided to replace the factory fan with something quieter.

Homelab – How to Organise Equipment

Title Photo by Magnus Engø on Unsplash

Group your equipment together

Group your equipment together so that you can access most of it in the same place at the same time. This saves a lot of effort when tracing an unknown device issue, because you don’t have to run between different places.

You can group your equipment in two or three locations, but you should not scatter it all over your home. The exceptions are special network devices such as WiFi access points, and communication devices such as VoIP phones.

Put in a fixed location

The system should sit in a fixed location at home. Depending on the country you live in, the structure of your house and the types of equipment you have, the best location for your homelab can differ.

Choosing a good location is one of the most challenging parts of a homelab setup. There are many variables to consider:

Power accessibility

First of all, there must be a power source close to your homelab devices. It is not a good idea to run long extension cables without consulting a certified electrician. Using extension cables with high-power equipment can be a fire hazard, and unsecured cables can cause accidents.

Provide a dedicated circuit and power outlets for your equipment if possible. Do not share a circuit with other high-power electrical appliances such as air conditioners, electric ovens or electric heaters. You don’t want your server to lose power because of an overcurrent or short circuit in a different room far from your homelab.

Network accessibility

Make sure enough network cabling already reaches your homelab location. You should also have a backup line for your core network in case of a cable problem.

Evaluate your cabling for future upgrades. For example, it will be very difficult to run a 10G network over CAT5e cable, which is not rated for 10GBASE-T. Consider the differences between a copper-based and a fibre-based network, and which type of network your devices support.

Server noise

Much enterprise equipment is designed to run inside a server room or datacenter, where noise is usually not a concern. Several newer server models from HP and DELL are much quieter than before, but their running noise may still be too loud for most people, especially at night.

Thermal control and ventilation

Homelab equipment consumes electricity and produces heat. Newer servers can operate at higher ambient temperatures, such as 28 °C, but some parts, such as hard drives, may have a shorter service life as a result.

Heat can also make the server fans run at very high speed, which causes extra noise. And we are human: we cannot work in an extremely hot environment either.

Working space and accessibility

You should keep enough workspace for your homelab and make sure all equipment is easy to access. This includes enough space both for maintaining your servers and for moving your equipment.

Most servers are flat and long, so they can be difficult to carry around narrow corners or through doors. A full-height server cabinet can easily be over 2 meters tall, which means it will not fit in some basements or attics.

Servers are generally made of steel, which means they are heavy, very heavy. An empty 2U DELL or HP rack server can easily weigh over 20 kg without any hard drives installed. Fully loaded 4U storage boxes can weigh 50 kg or more, and you need at least two people, or lifting equipment, to move them. So make sure you still have space for your friends and for the equipment used to carry your servers.

A fully loaded 42U server cabinet can weigh as much as 2000 kg, with a footprint of less than 2 square meters. Make sure your floor can support such a heavy load without being damaged.
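
To put those numbers in perspective, the load on the floor can be estimated by dividing the weight by the footprint. This is a small sketch using the figures above; `floor_load` is a hypothetical helper name, and the rated load of your own floor is something you would need to get from the building documentation or a structural engineer.

```shell
# Average floor load in kg per square meter: weight (kg) / footprint (m^2).
floor_load() {
    awk -v kg="$1" -v m2="$2" 'BEGIN { printf "%.0f\n", kg / m2 }'
}

floor_load 2000 2    # prints 1000
```

With a footprint smaller than 2 square meters the real figure is higher, and the load sits on the cabinet's feet or casters rather than being spread evenly.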

The place for your homelab

There are some common places for a homelab. They have different pros and cons.

Dedicated server room

Pros: This could be the best place for your home lab.

  • Easy to access because it is inside your house or apartment, and it provides a better work environment.
  • Generally quiet if the room is surrounded by thick walls or soundproofing material.
  • Generally cool if ventilation equipment is properly installed and used.
  • Blends in and hides your generally ugly server cabinet.
  • A dedicated room reduces the risk of accidental damage and natural hazards such as floods, heatwaves, bugs, etc. It also avoids damage from and to kids and pets.

Cons: This could be the most expensive plan.

  • Uses precious space in your house or apartment.
  • High cost of renovating an existing room and installing equipment.
  • Limited workspace if the room is small.

Study room or home office room

Pros:

  • Usually enough space to work.
  • Acceptable environment temperature.
  • Close to your desk or development space, and easy to access because it is inside your house or apartment.
  • Network cabling is generally better in a study or home office.

Cons:

  • Noise is an issue.
  • Could be ugly.

Living room

Pros:

  • Usually lots of working space.
  • Acceptable environment temperature.
  • Easy to access because it is inside your house or apartment.
  • Show off: every visitor knows you have a big rack.

Cons:

  • Could be ugly.
  • Noise is a major issue.
  • Your wife (or husband) may hate you.

Garage

Pros:

  • Generally quiet; noise is not a problem in the garage.
  • Enough space to work; a garage is a large space.
  • Hides your generally ugly server cabinet.

Cons:

  • Running network cable to other rooms can be difficult.
  • Can be far from your desk, which may require lots of walking.
  • May not have animal control. Be careful of bugs, cats, mice, etc.

Basement

Pros:

  • Generally quiet; noise is not a problem in the basement.
  • Enough space to work; the basement is a large space.
  • Hides your generally ugly server cabinet.
  • The temperature in the basement is generally low.

Cons:

  • Risk of flood and water damage.
  • Access to the basement can be difficult, and moving equipment into the basement may also be difficult.
  • Can be far from your desk, which may require lots of walking.
  • May not have animal control. Be careful of bugs, cats, mice, etc.

Attic

Pros:

  • Generally quiet; noise is not a problem in the attic.
  • Hides your generally ugly server cabinet.

Cons:

  • The temperature can be very high in summer.
  • Access to the attic can be difficult, and moving equipment into the attic may also be difficult.
  • Can be far from your desk, which may require lots of walking.
  • The attic floor may not be strong enough to hold heavy equipment.

Closet

Pros:

  • Easy to access because it is inside your house or apartment, and it provides a better work environment.
  • Can be quiet, depending on the material of the closet.
  • Blends in and hides your generally ugly server cabinet.
  • Reduces the risk of accidental damage and natural hazards such as floods, heatwaves, bugs, etc. It also avoids damage from and to kids and pets.

Cons:

  • Serious heat problems due to the enclosed environment.

You can keep your equipment in different places in your home. For example, I have my network gear in a small network cabinet just above the front door, and my main servers are in a larger cabinet in my study room.

Being in a fixed location does not mean you cannot move the equipment. You can put your servers in a cabinet with wheels, or build a cluster of Raspberry Pis in a box that you can move from one place to another. But your homelab equipment should not be like a laptop or iPad that you take wherever you go.

Use a server cabinet or rack

Using a server cabinet or rack can make your homelab significantly cleaner and your devices easier to access, because most industrial equipment is designed for rack mounting.

Whether to use a server cabinet depends on the type and quantity of equipment you have. For users with several heavy servers and other equipment, a standard server cabinet could be the better decision. A smaller network rack is also a good choice for users who only have network devices.