Getting Started: Hardware.

You can put lipstick on a .cnf – Part 2

The foundation for any change in variables is the hardware, the OS, and the file system. After that, you start looking at the workload, which means understanding what you have, what you want, and what the server is actually doing over time.

Tools:

  • pt-summary
  • sysbench
  • fio
  • hdparm
  • dd

Let’s start with the hardware:

Each MariaDB server runs Ubuntu 16.04 on an HP ProDesk 600 G1. Using pt-summary from the Percona Toolkit, I get the following (parts removed for compactness):

sysadmin@tpd81:~$ sudo pt-summary
# Percona Toolkit System Summary Report ######################
    Hostname | tpd81
      System | Hewlett-Packard; HP ProDesk 600 G1 DM; vNot Specified (Desktop)
    Platform | Linux
     Release | Ubuntu 16.04.6 LTS (xenial)
      Kernel | 4.4.0-169-generic
Architecture | CPU = 64-bit, OS = 64-bit
   Threading | NPTL 2.23
     SELinux | No SELinux detected
 Virtualized | No virtualization detected
# Processor ##################################################
  Processors | physical = 1, cores = 4, virtual = 4, hyperthreading = no
      Speeds | 1x2000.156, 1x2000.390, 1x2059.531, 1x2079.296
      Models | 4xIntel(R) Core(TM) i5-4590T CPU @ 2.00GHz
      Caches | 4x6144 KB
# Memory #####################################################
       Total | 7.7G
        Free | 6.5G
        Used | physical = 394.2M, swap allocated = 976.0M, swap used = 0.0, virtual = 394.2M
     Buffers | 820.6M
      Caches | 7.0G
  Swappiness | 60
  Locator   Size     Speed             Form Factor   Type          Type Detail
  ========= ======== ================= ============= ============= ===========
  DIMM1     4096 MB  1600 MHz          SODIMM        DDR3          Synchronous
  DIMM3     4096 MB  1600 MHz          SODIMM        DDR3          Synchronous
# Mounted Filesystems ########################################
  Filesystem  Size Used Type     Opts                                                                                                 Mountpoint
  /dev/sda1   511M   1% vfat     rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro /boot/efi
  /dev/sda2   457G   1% ext4     rw,relatime,errors=remount-ro,data=ordered                                                           /
	/*SNIP*/
# Disk Schedulers And Queue Size #############################
         sda | [deadline] 128
# Disk Partioning ############################################
Device       Type      Start        End               Size
============ ==== ========== ========== ==================
/dev/sda     Disk                             500107862016
/dev/sda1    Part       2048    1050623                  0
/dev/sda2    Part    1050624  974772223                  0
/dev/sda3    Part  974772224  976771071                  0
# Kernel Inode State #########################################
dentry-state | 59370    46232   45      0       0       0
     file-nr | 896      0       777238
    inode-nr | 55090    399
# RAID Controller ############################################
  Controller | No RAID controller detected
# Network Config #############################################
  Controller | Intel Corporation Ethernet Connection I217-LM (rev 04)
 FIN Timeout | 60
  Port Range | 60999
# Interface Statistics #######################################
  interface  rx_bytes rx_packets  rx_errors   tx_bytes tx_packets  tx_errors
  ========= ========= ========== ========== ========== ========== ==========
  eno1     1750000000    6000000          0 2000000000    1500000          0
# The End ####################################################

This is the starting point for my configuration, and really, the important parts are memory, CPU, and filesystem type.
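
If you want to pull those few values out of a saved report instead of reading it by eye, something like the following could work. It is only a sketch: it assumes the report keeps the "key | value" layout and "# Section ###" banners shown above, and parse_pt_summary is a name made up for this example, not part of the Percona Toolkit.

```python
def parse_pt_summary(text):
    """Return {section: {key: value}} for the 'key | value' lines of a report."""
    report = {}
    section = "Header"  # fallback name for lines before the first banner
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("# "):
            # Section banners look like "# Processor ######"
            section = stripped.strip("# ")
            report.setdefault(section, {})
        elif " | " in line:
            key, value = line.split(" | ", 1)
            report.setdefault(section, {})[key.strip()] = value.strip()
    return report


# A few lines copied from the report above
sample = """\
# Percona Toolkit System Summary Report ######################
    Hostname | tpd81
# Processor ##################################################
  Processors | physical = 1, cores = 4, virtual = 4, hyperthreading = no
      Caches | 4x6144 KB
# Memory #####################################################
       Total | 7.7G
      Caches | 7.0G
  Swappiness | 60
"""

report = parse_pt_summary(sample)
print(report["Memory"]["Total"])
print(report["Processor"]["Processors"])
```

Keying by section matters because some labels repeat: "Caches" appears under both Processor and Memory.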

Note: Hyperthreading is off because MariaDB 10 on Ubuntu 14.04 and below, at least, had issues with hyperthreading.

Note: I know about the recommendation to set “vm.swappiness = 0” on every MySQL server, but I believe the underlying problem may have been fixed. I am currently researching this.

from Percona

 

Sysbench:

We’re going to use sysbench 0.4 to get a better idea of the I/O the hard drive provides. This is the older version of sysbench available on Ubuntu; for the newer version, connect to the Percona repository. I plan to use sysbench 1.0 soon. Using the fileio test, I test with a 20GB total file size. The total file size matters because it should be bigger than your RAM (as seen above, 8GB), so that reads cannot all be served from the page cache. Complete the prepare:

sysbench --test=fileio --file-total-size=20G prepare
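
The rule that the test file set should be larger than RAM can be checked before running the prepare. Here is a minimal sketch, assuming a Linux host where /proc/meminfo is available. The function names and the sample MemTotal value are illustrative and were not taken from the servers in this post.

```python
def mem_total_gib(meminfo_text):
    """Extract MemTotal (reported in kB) from /proc/meminfo-style text, in GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])
            return kib / (1024 * 1024)
    raise ValueError("MemTotal not found")


def exceeds_ram(file_total_gib, ram_gib):
    """True when the benchmark file set cannot fit entirely in RAM."""
    return file_total_gib > ram_gib


# Illustrative sample only; roughly the size of an 8GB machine
SAMPLE_MEMINFO = "MemTotal:        8073740 kB\nMemFree:         6800000 kB\n"

ram = mem_total_gib(SAMPLE_MEMINFO)
print(f"RAM: {ram:.1f} GiB")
print("20G test set exceeds RAM:", exceeds_ram(20, ram))
# On a real host: mem_total_gib(open("/proc/meminfo").read())
```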

Once the prepare is complete, we can do a test on random read/write to get an idea of the numbers we want to look at:

sysadmin@tpd31:~$ sysbench --test=fileio --file-total-size=20G --file-test-mode=rndrw --init-rng=on --max-time=300 --max-requests=0 run
sysbench 0.4.12: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1
Initializing random number generator from timer.


Extra file open flags: 0
128 files, 160Mb each
20Gb total file size
Block size 16Kb
Number of random requests for random IO: 0
Read/Write ratio for combined random IO test: 1.50
Periodic FSYNC enabled, calling fsync() each 100 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random r/w test
Threads started!
Time limit exceeded, exiting...
Done.

Operations performed: 26557 Read, 17704 Write, 56576 Other = 100837 Total
Read 414.95Mb Written 276.62Mb Total transferred 691.58Mb (2.3053Mb/sec)
147.54 Requests/sec executed

Test execution summary:
total time: 300.0007s
total number of events: 44261
total time taken by event execution: 147.3784
per-request statistics:
min: 0.00ms
avg: 3.33ms
max: 56.85ms
approx. 95 percentile: 10.95ms

Threads fairness:
events (avg/stddev): 44261.0000/0.00
execution time (avg/stddev): 147.3784/0.00

This has interesting stuff in it. For example: fsync() is called every 100 requests, and the block size is 16KB, which matches the default InnoDB page size.
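
The summary figures in that output can be cross-checked with a little arithmetic. The sketch below uses only numbers copied from the run above (request counts, 16KB block size, total time); the variable names are mine. It assumes sysbench's "Mb" means mebibytes, which is what makes the numbers line up.

```python
BLOCK_KIB = 16            # "Block size 16Kb"
reads = 26557             # "26557 Read"
writes = 17704            # "17704 Write"
events = 44261            # "total number of events"
total_seconds = 300.0007  # "total time"


def to_mib(requests, block_kib=BLOCK_KIB):
    """Data volume in MiB for a number of fixed-size requests."""
    return requests * block_kib / 1024


read_mib = to_mib(reads)
write_mib = to_mib(writes)
total_mib = read_mib + write_mib

throughput = total_mib / total_seconds  # sysbench reported 2.3053Mb/sec
iops = events / total_seconds           # sysbench reported 147.54 Requests/sec
rw_ratio = reads / writes               # sysbench configured 1.50

print(f"read {read_mib:.2f} MiB, total {total_mib:.2f} MiB")
print(f"{throughput:.4f} MiB/s, {iops:.2f} requests/s, r/w ratio {rw_ratio:.2f}")
```

Note that the request rate counts only reads and writes (the 44261 events), not the 56576 "Other" operations such as fsync().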

FIO Tests

 

hdparm and dd
(This next section is probably not useful for NVMe and SAN!)

OK. I’m only adding this part to show what I do when figuring out the configuration of a server instance. The following is only useful for a sandbox system. If you’re using NVMe drives and a SAN, or have a RAID controller and 8 hard drives in a RAID 10+5 configuration . . . you’ve got to figure out what it’s doing and probably use dd. Keep scrolling.

Info and READ Test

The second piece of information I like to have is about the disk. This is an ATA disk, so I’m using hdparm, which is useful for obtaining information about, and controlling, ATA/IDE controllers and hard drives. Notice that it says ATA/IDE controllers. I believe NVMe drives have a similar utility called nvme-cli. As for other types of drives . . . /shrug . . . maybe smartctl?

sysadmin@tpd31:~$ sudo hdparm -I /dev/sda
/dev/sda:
ATA device, with non-removable media
        Model Number:       TOSHIBA MQ01ACF050
        Serial Number:      98EZC1XVT
        Firmware Revision:  AV0A3E
        Transport:          Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
Standards:
        Supported: 8 7 6 5
        Likely used: 8
Configuration:
        Logical         max     current
        cylinders       16383   0
        heads           16      0
        sectors/track   63      0
        --
        LBA    user addressable sectors:  268435455
        LBA48  user addressable sectors:  976773168
        Logical  Sector size:                   512 bytes
        Physical Sector size:                  4096 bytes
        Logical Sector-0 offset:                  0 bytes
        device size with M = 1024*1024:      476940 MBytes
        device size with M = 1000*1000:      500107 MBytes (500 GB)
        cache/buffer size  = unknown
        Form Factor: 2.5 inch
        Nominal Media Rotation Rate: 7200
Capabilities:
        LBA, IORDY(can be disabled)
        Queue depth: 32
        Standby timer values: spec'd by Standard, no device specific minimum
        R/W multiple sector transfer: Max = 16  Current = 16
        Advanced power management level: 254
        DMA: sdma0 sdma1 sdma2 mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5
             Cycle time: min=120ns recommended=120ns
        PIO: pio0 pio1 pio2 pio3 pio4
             Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
        Enabled Supported:
           *    SMART feature set
                Security Mode feature set
           *    Power Management feature set
           *    Write cache
           *    Look-ahead
           *    Host Protected Area feature set
           *    WRITE_BUFFER command
           *    READ_BUFFER command
           *    DOWNLOAD_MICROCODE
           *    Advanced Power Management feature set
                SET_MAX security extension
           *    48-bit Address feature set
           *    Device Configuration Overlay feature set
           *    Mandatory FLUSH_CACHE
           *    FLUSH_CACHE_EXT
           *    SMART error logging
           *    SMART self-test
           *    General Purpose Logging feature set
           *    WRITE_{DMA|MULTIPLE}_FUA_EXT
           *    64-bit World wide name
           *    IDLE_IMMEDIATE with UNLOAD
                Write-Read-Verify feature set
           *    WRITE_UNCORRECTABLE_EXT command
           *    {READ,WRITE}_DMA_EXT_GPL commands
           *    Segmented DOWNLOAD_MICROCODE
           *    Gen1 signaling speed (1.5Gb/s)
           *    Gen2 signaling speed (3.0Gb/s)
           *    Gen3 signaling speed (6.0Gb/s)
           *    Native Command Queueing (NCQ)
           *    Host-initiated interface power management
           *    Phy event counters
           *    Idle-Unload when NCQ is active
           *    Host automatic Partial to Slumber transitions
           *    Device automatic Partial to Slumber transitions
           *    READ_LOG_DMA_EXT equivalent to READ_LOG_EXT
           *    DMA Setup Auto-Activate optimization
                Device-initiated interface power management
           *    Software settings preservation
           *    SMART Command Transport (SCT) feature set
           *    SCT Write Same (AC2)
           *    SCT Error Recovery Control (AC3)
           *    SCT Features Control (AC4)
           *    SCT Data Tables (AC5)
           *    DOWNLOAD MICROCODE DMA command
Security:
        Master password revision code = 65534
                supported
        not     enabled
        not     locked
                frozen
        not     expired: security count
                supported: enhanced erase
        92min for SECURITY ERASE UNIT. 92min for ENHANCED SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 50000398c4405499
        NAA             : 5
        IEEE OUI        : 000039
        Unique ID       : 8c4405499
Checksum: correct

This also has a boatload of info, but then I do the following:

sysadmin@tpd31:~$ sudo hdparm -Tt /dev/sda
/dev/sda:
 Timing cached reads:   22436 MB in  1.99 seconds = 11254.34 MB/sec
 Timing buffered disk reads: 350 MB in  3.02 seconds = 116.01 MB/sec

sysadmin@tpd31:~$ sudo hdparm -t --direct /dev/sda
/dev/sda:
 Timing O_DIRECT disk reads: 348 MB in  3.00 seconds = 115.82 MB/sec

FYI: 115 MB/s isn’t that bad for an HDD.

WRITE Tests (and read too, but I like the above better)

Write tests are done with dd and are pretty simple. As with most tests, run it a couple of times and then take an average.

/* *** tempfile does not exist on the disk - WRITE is fast (sequential) *** */
sysadmin@tpd31:~$ sync; dd if=/dev/zero of=tempfile bs=1M count=1024; sync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.29691 s, 828 MB/s

/* *** tempfile exists on the disk - write speed drops to 134 MB/s *** */
sysadmin@tpd31:~$ sync; dd if=/dev/zero of=tempfile bs=1M count=1024; sync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 8.01597 s, 134 MB/s

sysadmin@tpd31:~$ sync; dd if=/dev/zero of=tempfile bs=1M count=1024; sync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 7.80114 s, 138 MB/s

sysadmin@tpd31:~$ sync; dd if=/dev/zero of=tempfile bs=1M count=1024; sync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 7.91887 s, 136 MB/s

sysadmin@tpd31:~$ sudo rm tempfile
sysadmin@tpd31:~$ sync; dd if=/dev/zero of=tempfile bs=1M count=1024; sync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.5276 s, 2.0 GB/s

/** READ TEST - check while tempfile is in cache **/
sysadmin@tpd31:~$ sync; dd if=tempfile of=/dev/null bs=1M count=1024; sync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.205524 s, 5.2 GB/s

/** DROP cache and then do it again. This is real READ speed **/
sysadmin@tpd31:~$ sudo /sbin/sysctl -w vm.drop_caches=3
vm.drop_caches = 3

sysadmin@tpd31:~$ sync; dd if=tempfile of=/dev/null bs=1M count=1024; sync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 9.10899 s, 118 MB/s
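
Since the advice is to run the test a few times and average, here is a rough sketch of doing that from the dd summary lines. It assumes the GNU dd output format shown above (bytes, then "copied, N s") and treats MB as 10^6 bytes, as dd does. The helper name dd_mb_per_s is made up for this example, and the three lines are the overwrite runs from the transcript.

```python
import re
from statistics import mean

# Matches e.g. "1073741824 bytes (1.1 GB, 1.0 GiB) copied, 8.01597 s, 134 MB/s"
DD_SUMMARY = re.compile(r"^(\d+) bytes .* copied, ([\d.]+) s,")


def dd_mb_per_s(summary_line):
    """Recompute throughput in MB/s (10^6 bytes) from a dd summary line."""
    match = DD_SUMMARY.search(summary_line)
    if match is None:
        raise ValueError(f"not a dd summary line: {summary_line!r}")
    nbytes = int(match.group(1))
    seconds = float(match.group(2))
    return nbytes / seconds / 1_000_000


# The three runs where tempfile already existed
overwrite_runs = [
    "1073741824 bytes (1.1 GB, 1.0 GiB) copied, 8.01597 s, 134 MB/s",
    "1073741824 bytes (1.1 GB, 1.0 GiB) copied, 7.80114 s, 138 MB/s",
    "1073741824 bytes (1.1 GB, 1.0 GiB) copied, 7.91887 s, 136 MB/s",
]

rates = [dd_mb_per_s(line) for line in overwrite_runs]
print(f"average overwrite speed: {mean(rates):.1f} MB/s")
```

Recomputing from bytes and seconds, rather than averaging the rounded MB/s column, keeps a little more precision and avoids mixing MB/s and GB/s units across runs.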

The above is going to take some discussion, so the real calculations will happen in Part 3.