Or rather, I’m going to show you how I organize my .cnf files. The configuration files do what they do – whether they are organized or not. I don’t actually see much out in the www about “my.cnf organization”.
Why you do that?
I don’t know if it’s a thing to have disorganized my.cnf files. I’ve been a DBA at 3 different large operations, and prior to “owning” the configuration, it seemed like each and every one was haphazardly put together: written as variables were changed, typically at the very top of the [mysqld] block. Sometimes randomly. Sometimes with duplicates that cancelled the earlier entry. Sometimes missing in other servers. Instead of a strategically defined file, it was 20 or 30 different files. What a mess!
And it was this mess that caused me to think about how to define the my.cnf when I had control of the file. It’s probably much ado about nothing, especially when you have automated configuration. If you have automated configuration. I never really did, so I needed a way to work with about 60 servers at a time. I’ve found this organization method to be essential for the sweatshop DBA work of standing up or reseeding many servers at a time.
Other resources for a my.cnf file.
Truthfully, you shouldn’t – I repeat should not – have a good idea of what your my.cnf file looks like. If it’s memorized – and you do not have a photographic memory – it’s because you have other, more substantial problems. I would suggest enforcing policies of segregation and security instead of tinkering with the my.cnf. Or spend money on NVME’s. Big ones. If it’s because there’s nothing in your my.cnf, it’s probably missing stuff.
The my.cnf file should be something you may deal with every couple of years – and usually because of patching issues or upgrades. Really, there’s should be just enough stuff in a my.cnf that it’s not memorizable. And it should be set once and forget until something requires you to take a look at it.
If you don’t currently have a my.cnf file, or can’t remember what one looks like, there are some resources on the www:
- FromDual: https://www.fromdual.com/mysql-configuration-file-sample This one is good to start with as well, and breaks things into blocks of configuration for both MySQL and MariaDB
- This Guy on Github: https://gist.github.com/rhtyd/d59078be4dc88123104e Simple start similar to what is on the defaults.
- The one they add to the packages from Oracle or Maria: I don’t like them, and they like to make things confusing. That’s why people add random stuff to it.
The following is from my Sandbox servers on Ubuntu 16.04 LTE using the default mysql directories (to a point) and MariaDB 10.4. These are little physical mini-pc’s networked together to simulate a cluster of servers, but with 8GB of memory and single HD’s. Not for production, definitely, but let’s me test out ideas when I have them. If you’re interested in what I did to make them, check out the TPD Sandbox. I’ll discuss more in detail what’s going on following the configuration.
# MariaDB database server configuration file.
#
[mysqld]
# Replication
report_host = 'tpd81'
server-id = 101181
auto_increment_increment = 10
auto_increment_offset = 1
binlog_format = ROW
# Networking
bind-address = 10.1.1.81
# GTID
gtid_domain_id=81
gtid_strict_mode=1
gtid_ignore_duplicates=1
# Host Specific Directories
pid-file = /var/run/mysqld/tpd81-mysqld.pid
socket = /var/run/mysqld/tpd81-mysqld.sock
# cd /var/log/mysql
# sudo mkdir error slow bin relay audit
log_error = /var/log/mysql/error/tpd81-mysql-error.log
slow_query_log_file = /var/log/mysql/slow/tpd81-slow.log
log_bin = /var/log/mysql/bin/tpd81-bin
relay_log = /var/log/mysql/relay/tpd81-relay
# * Basic Settings
default_storage_engine = InnoDB
user = mysql
port = 3306
basedir = /usr
datadir = /var/lib/mysql
tmpdir = /tmp
lc-messages-dir = /usr/share/mysql
skip-external-locking
max_connections = 151
max_user_connections = 145
max_allowed_packet = 16M
# * Query Cache Configuration
query_cache_limit = 1M
query_cache_size = 16M
# * Session Variables
sort_buffer_size = 2M
tmp_table_size = 32M
read-buffer_size = 128K
read_rnd_buffer_size = 256k
join_buffer_size = 128k
# Table Buffers
table_definition_cache = 1400
table_open_cache = 2000
# MyISAM variables
thread_stack = 192K
thread_cache_size = 8
# * Logging
#general_log_file = /var/log/mysql/mysql.log
#general_log = 1
#
# Slow Log
long_query_time = 0
#log_slow_rate_limit = 1000
log_slow_verbosity = query_plan
log-queries-not-using-indexes
# Bin Log
expire_logs_days = 3
max_binlog_size = 100M
log_slave_updates
binlog_annotate_row_events
replicate_annotate_row_events
#* InnoDB
innodb_buffer_pool_size=6GB
innodb_buffer_pool_instances=1
innodb_log_buffer_size=128MB
innodb_log_file_size=128MB
innodb_flush_method=O_DIRECT
#
# * Security Features
#
# For generating SSL certificates I recommend the OpenSSL GUI "tinyca".
#
# ssl-ca=/etc/mysql/cacert.pem
# ssl-cert=/etc/mysql/server-cert.pem
# ssl-key=/etc/mysql/server-key.pem
#
# * Character sets
character-set-server = utf8mb4
collation-server = utf8mb4_general_ci
# * Unix socket authentication plugin is built-in since 10.0.22-6
[mysqld_safe]
socket = /var/run/mysqld/tpd81-mysqld.sock
nice = 0
# This will be passed to all mysql clients
# It has been reported that passwords should be enclosed with ticks/quotes
# escpecially if they contain "#" chars...
# Remember to edit /etc/mysql/debian.cnf when changing the socket location.
[client]
port = 3306
socket = /var/run/mysqld/tpd81-mysqld.sock
[mysqldump]
quick
quote-names
max_allowed_packet = 16M
[mysql]
no_auto_rehash
max_allowed_packet = 16M
default-character-set = utf8mb4
prompt = '\u@\h [\d]> '
[isamchk]
key_buffer = 16M
#
# * IMPORTANT: Additional settings that can override those from this file!
# The files must end with '.cnf', otherwise they'll be ignored.
#
!include /etc/mysql/mariadb.cnf
!includedir /etc/mysql/conf.d/
It’s all about that Hostname and IP.
# Replication
report_host = 'tpd81'
server-id = 101181
auto_increment_increment = 10
auto_increment_offset = 1
binlog_format = ROW
The point of organizing the my.cnf file in this way is to make use of patterns. The hostname and IP serve as the pattern for all other identifying variables in the configuration file.
The top is where we identify the instance against other instances. I use report_host at the beginning to tell me what this host is. Okay – it’s good for “SHOW SLAVE HOSTS” – but truthfully, I’ve never had to use that command. If I’m running through 18 different hosts, I use this pattern.
The hostname is important for log analysis into a singular system, and it is very helpful if you are downing a server for patching/upgrading that requires a data wipe – but still need to keep the logs on a different server.
The server-id (UINT32) is the IPV4 at the time of creation. If you’re using IPV6, then maybe you choose the numeric values within the address, but my implementation is simple and I’m using the private 10.1.1.* addresses for 13 instances total within its own switch. Even in a corporate environment we used IPV4, so I didn’t worry about how an IPV6*pattern worked. The reason why this is “at time of creation”, is that sometimes networking admins just change things for no reason (at least from my point of view), and your IP changes – it’s harder to change the server id at that point. I use the IP or parts of the IP throughout the server variables to help identify everything from rows in a table, to where a GTID event originated.
For all master writes (and since I’m setting up a 3 node multi-primary asynchronous cluster), each instance will have an offset. The id’s for the nodes are 101181, 101182, 101183, so the offset only needs to be 1, 2 and 3 respectively with an increment of 10. This gives me up to 9 offsets to choose from. In a production environment, I’ve used an increment of 100 – but you must also use SERIAL for auto_increment columns’s. I think an increment of 1000 is too much. If you’ve never worked with auto_increment offset and increment, I have a quick post write up on it somewhere. Some places don’t use them. Suffice it to say: every auto_increment surrogate key that ends with a 1 originated from 101181 (it would look like 11,21,31,41), every auto_increment surrogate key that ends in a 2 would come from 101182 (it would look like 12,22,32,42), and so on.
I am playing with ROW replication at this point, and want to make sure that is at the top of the list.
# Networking
bind-address = 10.1.1.81
# GTID
gtid_domain_id=81
gtid_strict_mode=1
gtid_ignore_duplicates=1
The bind-address is the IP address unless this will be a utility mysql instance that only requires localhost (though inevitably end up being replicated in corporate policy) or binding to all interfaces (0.0.0.0).
The GTID domain id (UINT32) is separate for each write server (unless it truly is a multi-source cluster). Since domain id is a UINT32, I can use the last octet as the domain id. If you don’t understand MariaDB’s GTID – I have a quick writeup on what it is. For now, don’t worry about the non-identifying variables as those will be discussed in detail in another post.
Host Specific Directories
pid-file = /var/run/mysqld/tpd81-mysqld.pid
socket = /var/run/mysqld/tpd81-mysqld.sock
# cd /var/log/mysql
# sudo mkdir error slow bin relay audit
log_error = /var/log/mysql/error/tpd81-mysql-error.log
slow_query_log_file = /var/log/mysql/slow/tpd81-slow.log
log_bin = /var/log/mysql/bin/tpd81-bin
relay_log = /var/log/mysql/relay/tpd81-relay
Lastly, host specific directories round out the top. I like the idea of separating the individual log directories, so within /var/log/mysql I mkdir error, slow, bin, relay and audit. I am not using the audit log plugin on this server, but will in the future.
Now – I know that there are some defaults which will include <hostname>- to the name of a file if it is not explicitly defined, but by explicitly defining it in the my.cnf, I always know (or should know) what server I’m working with.
I will discuss the values for each of the configurations in a part 2 of this post.
Now for the why!
I’m sure that there are more automated ways to recreate or reseed a server, however, sometimes doing a manual run-through is also enough. Optimization includes the idea that if it is more expensive to automate a task than it is to do it manually, it is not worth the optimization. The my.cnf file may not change that much, but through time, you may start with a small number of servers and gradually increase the number of servers as requirements change or growth occurs.
Seeding a new server becomes simple. I typically request for homogeneous servers – this allows me to have a good idea of past performance (and mistakes) without a great deal of research prior to burn-in and a quick move to production. Since the servers are identical (or near identical), then to bring up a new server it’s just a matter of performing the following:
- sudo rsync -av /etc/mysql dba@newmachine:/sometempdir
- sudo rsync -av /var/lib/mysql dba@newmachine:/sometempdir
- sudo rsync -av -f”+ */” -f”- *” /var/log/mysql dba@newmachine:/sometempdir
I use vim, so within vim, it’s a matter of issueing a search-replace for the hostname, and then changing out the values at the top to match the IP without having to search all over the my.cnf to find it because everything is at the top.
Let the server replicate and do some test runs, and boom – into production. EZPZ.
An argument against default directories in Ubuntu (or anywhere else)
In my experience, I’ve never kept the mysql directories in the default /var/lib, /var/log and /var/run directories. There is often some risk of deletion during patching and upgrades, especially OS Major and Mysql major upgrades. Although in this example I have kept the default dirs, I did not on production machines, keeping all of the mysql dirs together under /opt/mysql.
I also think it’s a good idea to have the my.cnf (and the other files in /etc/mysql) symlinked in /etc/mysql to the physical file in /opt/mysql/scripts.
The reasoning also had to do with seeding. It was always necessary at some point to reseed a server, and sometimes Corporate Retention Policy required that logs followed (one of the reasons why the hostname is included with the log name explicitly). Also, it allowed for some different partitions to be used for different sections of the directory structure. The directory structure would be as follows:
- /opt/mysql : parent, typically mounted with / on Raid 15 platters
- /opt/mysql/data: the data directory, mounted on NVME
- /opt/mysql/logs: the log directory which would also include: audit, error, relay, bin and slow, typically mounted on larger RAID 5 platters
- /opt/mysql/run: where the socket and pid lived
- /opt/mysql/tmp: typically a partition on the NVME
- /opt/mysql/scripts: cron scripts for ETL, init scripts, configuration, etc.
With everything in /opt/mysql – it becomes a matter for just rsyncing the whole dir structure to the new machine, then rsyncing data and scripts. A quick cleanup script later (again, watch out for retention policies), and some search and replace on the my.cnf and boom – done.
The title is a play on the colloquialism: You can put lipstick on a pig, but it’s still a pig. It essentially means: even though you are dressing something up to be prettier – it doesn’t matter because it’s sole function is not changed.
- *I don’t really understand IPv6 on a deep level, but since server-id is UINT32, a lot of numbers can go into a server-id. From a conversion perspective, if I have the following IP address: 2001:0DB8:AC10:FE01::, then my best bet would be to convert each to it’s corresponding decimal and go from there, using the rightmost non-zero hex. AC10:FE01 = 172162541, which fits in a UINT32. If there were other 1’s in the network, then my best bet would be to autoincrement_increment at 41 and autoincrement offset at 100. Or I could just say it’s 1, keep it at 1 and continue on with the world. Lipstick on a pig.