Database Backups can take several formats. Backup size, backup time, and restore time will vary for each of these formats.

In this post, we will use a 13 GB database with a total of 66 million rows to compare and evaluate the backup" /> Database Backups can take several formats. Backup size, backup time, and restore time will vary for each of these formats.

In this post, we will use a 13 GB database with a total of 66 million rows to compare and evaluate the backup" /> \n Database Backups can take several formats. Backup size, backup time, and restore time will vary for each of these formats.

In this post, we will use a 13 GB database with a total of 66 million rows to compare and evaluate the backup" />

Loader

Sql Backups vs CSV Backups

Sql Backups vs CSV Backups

Database Backups can take several formats. Backup size, backup time, and restore time will vary for each of these formats.

In this post, we will use a 13 GB database with a total of 66 million rows to compare and evaluate the backup and restore time of the CSV and SQL backup format on a 4 Gig RAM and 2 cpu cores with mysql 5.5 cloud instance.

SQL Backups

SQL dump file is a dump of data or schema or both in SQL executable format, which can be inserted directly into MySQL using:

[code]shell> mysql < sqldump.sql[/code]

Or from within MySQL:

[code]mysql> SOURCE sqldump.sql;[/code]

Either way, it will execute SQL statements directly into MySQL.

One of the most popular tools for taking MySQL backups is “mysqldump”. “mysqldump” can be used to take SQL or CSV backups for all databases, certain databases or tables in addition to multiple functionalities.

[code]# time mysqldump –single-transaction testdb tbl1  > dump.sql[/code]

The usage of –single-transaction ensures that a consistent backup of the InnoDB tables will be taken. –single-transaction works by putting mysql mode to REPEATABLE READ and issue a START TRANSACTION statement before taking the backup.

The output of the previous command is:

[code]real    1m44.418s
user    0m48.184s
sys    0m11.621s
[/code]

The size of the sql dump is:

[code]# du -h dump.sql
2.5G    dump.sql
[/code]

Delimited text backup

CSV or (Comma Separated Values) backup is a file contain only the data of tables separated by specific delimiter (comma for example). There are two methods that can be used to take CSV backups from MySQL:

The first method is to use mysqldump with –tab option which generates the CSV file plus the schema file The second method is to use LOAD DATA INTO OUTFILE command from within mysql which results in generating normal text file with specific delimiter specified by FIELDS TERMINATED BY ‘,’

Testing the same database but with different kind of the backup, it results in:

[code]# time mysqldump –single-transaction –tab=/tmp/ dbtest
real    0m45.793s
user    0m0.006s
sys     0m0.000s[/code]

[code] # du -h /tmp/*
4.0K /tmp/table1.sql
2.2G /tmp/table1.txt
[/code]

The total size of the database CSV dump is  2.2G, which is 12% less than the SQL backup.

Restoring the dump

Restoring is a straight-forward process with both types of dumps. Initially, without any tweaks to the MySQL configuration, the restoring time will be horrifying – it took more than a day to restore the dump.

Tweaking The Restore Performance

There are some tweaks steps that needs to be done before importing data into MySQL:

– Since we use InnoDB tables, the innodb_buffer_pool_size needs to be increased to contain the biggest chunk of the database when performing insert operations.

[code]innodb_buffer_pool_size = 3000M[/code]

– Stopping the autocommit will result into improving the restore time by not committing each transaction.

[code]autocommit = 0[/code]

– Setting innodb_flush_log_at_trx_commit to 2, will result into flushing the data from the buffer to the log file at each commit. Combining this option with the stopping of autocommit will improve the restore performance dramatically.

[code]innodb_flush_log_at_trx_commit = 2[/code]

– Also turning off the unique and foreign key checks temporarily (turns off the check for uniqueness and foreign key constraints on insert operations).

[code]> SET unique_checks=0;[/code]

[code]> SET foreign_key_checks=0;[/code]

After configuring the server,  we re-compare the restore time between the SQL and CSV backups.

Restore SQL Backup

[code]# time mysql dbtest < dump.sql
<span class="message_content">real    146m36.868s
user    0m37.566s
sys     0m2.672s </span>[/code]

which means an average of 7500 row insertion per second.

Restore CSV Backup

[code]# mysql dbtest < /tmp/table1.sql
# time mysql -e ‘LOAD DATA INFILE "/tmp/table1.txt" INTO TABLE dbtest.tbl1;’
[/code]

[code]real 120m34.836s
user 0m35.566s
sys 0m2.752s[/code]

Conclusion

Logical backups are expensive and slow compared to the raw data backup but more reliable. By comparing between the SQL logical backups and the delimited text file logical backup it turns out that the size of the CSV backup is less than the sql backup and takes less time in backup and restore.

Backup Type Backup Size Backup Time Restore Time
SQL 2.5 G 1min 44sec 146 min
CSV 2.2 G (12%) 0min 45sec (42%) 120 min (14%)