Archive for the “mySQL” Category

Whilst there indeed seems to be a veritable plethora of SQL profiling / benchmarking tools, most of them with insane commercial license fees (>= $400 per annum on most)

I have found it intriguing that as of mySQL community edition >= 5.0.37 mySQL offers an inbuilt method for profiling SQL queries, as can be see here the downside is that this is session based, meaning it can only provide profiling information for the current connection, almost useless for trying to profile a running web app (that is without code modification to set profiling and harvest the data).

However it can be useful if you have a known slow query.

So lets work on the basis that we have a known slow SQL query we’d like profiling information for,

check to see if profiling is enabled:

1
SELECT  @@profiling;

The returned value is generally 0 so lets enable it.

1
2
SET profiling_history_size=100;
SET profiling=1;

This tells mySQL to retain the profile of 100 queries in memory, and to enable profiling.

Now at this point this can also be used to diagnose slow loading datases, simply

1
2
3
USE <dbname>;
SHOW profiles;
SHOW profile FOR 1;

Upon running the above you will now be using your database and will see an output similar to

1
2
3
4
5
6
7
+----------+------------+-------------------+
| Query_ID | Duration   | Query             |
+----------+------------+-------------------+
|        1 | 0.00011400 | SELECT DATABASE() |
|        2 | 0.00048900 | show databases    |
|        3 | 0.00026600 | show tables       |
+----------+------------+-------------------+

Followed by

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.000053 |
| checking permissions | 0.000004 |
| Opening tables       | 0.000009 |
| init                 | 0.000011 |
| optimizing           | 0.000004 |
| executing            | 0.000017 |
| end                  | 0.000003 |
| end                  | 0.000002 |
| query end            | 0.000002 |
| freeing items        | 0.000005 |
| logging slow query   | 0.000002 |
| cleaning up          | 0.000002 |
+----------------------+----------+

In my case here nothing really eventful, lets assume for the moment you are using a wordpress database, and you have numerous posts

1
2
SELECT count(*) FROM wp_posts WHERE ID > 100
SELECT count(ID) FROM wp_posts WHERE ID > 100

in my case I got the following results:

1
2
0.00072600 | select count(*) from wp_posts where ID &gt; 100
0.00069900 | select count(ID) from wp_posts where ID &gt; 100

a simple demonstration showing the difference between a count() on an indexed field vs *, in this case the saving is ~4%.

1
2
SHOW profiles;
SHOW profile FOR query <n>;

Will give you an output similar to:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
+--------------------+----------+
| Status             | Duration |
+--------------------+----------+
| starting           | 0.000079 |
| Opening tables     | 0.000014 |
| System lock        | 0.000005 |
| Table lock         | 0.000008 |
| init               | 0.000025 |
| optimizing         | 0.000012 |
| statistics         | 0.000049 |
| preparing          | 0.000012 |
| executing          | 0.000006 |
| Sending data       | 0.000461 |
| end                | 0.000004 |
| end                | 0.000003 |
| query end          | 0.000003 |
| freeing items      | 0.000007 |
| closing tables     | 0.000005 |
| logging slow query | 0.000003 |
| cleaning up        | 0.000003 |
+--------------------+----------+
17 rows in set (0.00 sec)

this is very similar to a stack trace you may run on a problematic script, or xdebug + webgrind, and will gain futher insight into your SQL should EXPLAIN no give you enough of an insight.

I’ll post more information on this as I get time to work with it more, this is still knew to me, and aside from knowing how to use it I know relatively little about this profiling functionality, please feel free to post references / examples in the comments.

Cheers

Buzz

Tags: , , , , , , ,

Comments No Comments »

I’ve no idea to this day why my bash script would not work with a CSV export from mysql by simply using mysql -e “SQL COMMAND HERE”.

So I had to come up with a workaround quickly.

This lead to using expect, scripting in this method can be used for numerous purposes, I am currently in the process of writing a few test scripts using tcl and this package for pop,imap,smtp testing.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
#!/usr/bin/expect -f
set DB "<database>"
set USER "<user>"
set PASS "<password>"

spawn mysql -u $USER -p $DB
match_max 100000
expect -exact "assword: "
send -- "$PASS\r"

set SQL "SELECT * INTO OUTFILE '/tmp/csvfile.csv' FROM table";

expect -exact "mysql> "
send -- "$SQL;\r"
expect -exact "mysql> "
sent -- "exit;/r"

Pretty simple realy once you have the hang of it, you tell it what to expect and what to reply with, there are more advanced methods going on from here, including conditional sends based on response.

I’ll be covering those soon.

Tags: , , ,

Comments No Comments »

    The problem

We’ve all been in this position at some point, working for a company who wants to internationalize their website, and so their mySQL CMS data …

But all is not so well as just using ‘SET NAMES utf8′ and changing all ‘charset’ on tables to utf8,

You may fall foul of seeing content like Á£

This is due to the fact in this case the latin1 encoded £ has not properly been converted to utf8 and as such is not rendering correctly, this is true of most ‘multibyte’ characters.

    The solution

What you need to do is actually convert the character set to utf8, in order to do this your going to need to run it through a program you could use iconv if you are already familiar with it, however if your system has python installed you can grab a copy of my sysadmin program which has iconv like functionality but is far more user friendly.

    What you will need
    Preparing the file

This assumes the database is currently using latin1, in theory this could be any encoding.

Get a dump of the database:

1
mysqldump --set-character-set=latin-1 --set-charset -u user -pPASSWORD databasename > databasename-latin1.sql

Now you have to be aware of what you need to replace using SED, you can’t just replace all instances of ‘latin1′ as Murphy’s law being as it is means that somewhere there will be ‘latin1′ in the physical content, especially for instance if I was using a mysql dump from this blog.

As such you need to replace the following:

1
/*!40101 SET NAMES latin1 */;

If your database dump is small enough (sub 100mb) you can edit this line directly in your text editor, alternatively you can do the following.

1
2
3
cat ./databasename-latin1.sql | sed 's/SET NAMES latin1/SET NAMES utf8/g' > tmp
cat ./tmp > ./databasename-latin1.sql
rm -f ./tmp

Now you need to replace all instances of ‘CHARSET=latin1′

1
2
3
cat ./databasename-latin1.sql | sed 's/CHARSET=latin1/CHARSET=utf8/g' > tmp
cat ./tmp > ./databasename-latin1.sql
rm -f ./tmp

Now we have to run the file through the charset converter

1
sysadmin -c iconv -d ./databasename-latin1.sql,latin-1,utf-8

If your sql dump is over 30mb, you will be prompted to confirm you wish to proceed, please remember that this will load the entire file into memory, so ensure you have enough available system memory before proceeding, I also suggest not running this on a production server.

If any characters could not be converted you will be alerted to their exact position within the file, from there you will either need to use sed to replace the character or use your text editor.

If all went well you now have ./databasename-latin1.sql.utf-8 (note the utf-8 extension), you now have a complete utf8 mySQL dump, all you need do now is import the dump.

    Further reading

  1. Force mySQL utf8 connections
  2. mySQL backup script
Tags: , , , , , , , , , ,

Comments 2 Comments »

Title for a cheesy sysadmin novel I know.

But over the last month or so I have been plagued by a mySQL server that was reporting a full root partition, when it wasn’t full …

Causing me some headaches

Allow me to explain:

1
2
3
4
5
df -h
Filesystem                     Size    Used    Avail   Use%   Mounted On
/dev/sda1                     20G   17G      1.8G    91%    /  

...

Looks simple enough right? I just need to free up some space?

Afraid not.

1
2
3
du -mcs /
2264 /
2264 Total

Just for some clarification this small partition is in use for the operating system only, the mysql instance itself is infact mounted on a much larger partition using the same method as detailed in mysql moving /var/lib/mysql and error121

So here’s the problem, df and therfor all the monitoring systems are reporting the disk as full where as du clearly shows it is not …

Leaving me in the position of if I can not find where the disk usage is with du I have no way of freeing the disk space and bringing the service back online …

Or do I?

After talking with Matthew Ife of ukfast he suggests there must be a an unclosed file IO (aka a file descriptor) that is using up the diskspace, these descriptors do not show up using du

After some searching around I find the command lsof this command will list the open file descriptors for a process including their current size …

1
2
3
4
5
6
7
8
psa ux | grep mysqld
mysql     8131  2.8  1.5 304088 63668 ?        Sl   09:37   0:36 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-external-locking --socket=/var/lib/mysql/mysql.sock

lsof -p 8131

...
mysqld  27878 mysql    3w   REG                8,2 14930490713   3290408 /var/log/mysql-slow.log.1 (deleted)
...

As you can see above the open file descriptor flagged as (deleted) was increasing in size until the diskspace ran out, for the time being I have since disabled mysql slow query logging whilst I sort out the log rolling as described in Mysql slow query log rotation

Comments No Comments »

I ment to note this down yesterday but everything is going ten to the dozen at the moment.

basically I have now authored a nagios addon for monitoring master-master replication between two servers, this carries out 4 stages of checks

  1. Validates all required data is passed by servers
  2. Slave IO is running on both servers
  3. Seconds_Behind_Master check, args can be passed to vary warn and critical thresholds
  4. (slave) Master_Log_File == (master) File

The 5th check was a comparison on the binlog positions themselves, comparing (slave) Read_Master_Log_Pos and (master) Position

Here in lies the problem, which took a while to track down, the problem is that no matter what I tried the slave was ALWAYS behind the master position … but why?

The reason is why I designed the High Availability solution in the first place … Very high traffic level, in the region of 20,800 transactions per second.

Why was this the problem? the two queries run to gather the data are done sequentially per server, using the python time library I was able to find that there is a 0.02s interval between gathering datasets (20 milliseconds) … in that time 416 transactions had take place.

i.e.

time: binlog pos

Slave A

0.000: 100

Master B

0.020: 516

This unfortunately has now lead to some 32 lines of code being commented out, as I can see no way to reliably use the binlog positions for monitoring the replication in this situation, if any delay occurs anywhere at any point during the dataset collection i.e. network latency, delay in query processing due to traffic peak on one server … etc. the collected samples will always be different

The only way I ever see this working is if you can validate that the datasets came from the same exact point in time down to the nanosecond, this however is again not possible, on the network the servers currently reside there is a 0.13 millisecond ping response time this works out to 13,000 nanoseconds (0.00013 * 10^9)

If anyone has any theories on how to overcome this please let me know.

NOTE: At present due to the programming of this addon being done during working hours the nagios addons are not for public release at this time, this may be subject to change in the future should my employers allow their release.

Tags: , , ,

Comments No Comments »

In on of those “oh ffs” moments I found myself writing a BASH script to quickly dump all database on a mySQL server.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
#!/bin/bash
MYSQL=`which mysql`;
MYSQLDUMP=`which mysqldump`;
GZIP=`which gzip`;
DEST="/path/to/dump/folder"

USER="root";
PWD="XXXXXX";

DBS=(`$MYSQL  -u $USER -p$PWD  -Bse 'show databases'`);

for db in ${DBS[@]};
do
        `$MYSQLDUMP --default-character-set=utf8 --set-charset -u $USER -p$PWD $db | $GZIP -9 > $DEST/$db.sql.gz`
        echo "$db - DONE";
done;

This script gets a list of all databases, dumps them out with UTF8 encoding, and gzip compresses the SQL file into the given “DEST” folder.

If you want to skip over certain databases i.e. “mysql”

Change this line:

1
DBS=(`$MYSQL  -u $USER -p$PWD  -Bse 'show databases'`);

To:

1
DBS=(`$MYSQL  -u $USER -p$PWD  -Bse 'show databases' | grep -v "database_to_exclude"`);

Or for multiple exclusions

1
DBS=(`$MYSQL  -u $USER -p$PWD  -Bse 'show databases' | grep -v "database_to_exclude" | grep -v "another_database_to_exclude" | grep -v "etc"`);

I may re-write this in Python, if I get time.

Tags: , , ,

Comments 1 Comment »

One of the issues facing log rotation in mySQL is that mySQL doesn’t seem to have the ability to perform a “reload”.

Meaning standard methods of rotating logs using logrotate leave mySQL logging an error in the syslog saying that the log file could not be found, and refusing to to any logging until the server is restarted, not fun if like me you manage high availability solutions and restarting a service is never the best option.

As such I have written this log rotate script, this does however make some assumptions.

  1. Your mysql user is “mysql”
  2. Your mysql slow query log is /var/log/mysql-slow.log

The script is written to perform the following actions:

  1. Rotate weekly
  2. Retain 3 rotations (3 files + live log)
  3. Compress on rotate (gzip)
  4. Create new logfile with 660 permissions chowned to mysql:mysql
  5. Run: mysqladmin flush-logs

Please be aware that the flush logs command will also rotate any binary logging currently in place, please ensure this will not adversely affect your deployment prior to use

Please ensure you carry out your own testing prior to deploying this script into a live environment.

http://svn.saiweb.co.uk/branches/linux-the-sysadmin-script/branches/logrotate.d/mysql

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
#
# mySQL slow log rotation script by D.Busby
# place this script in /etc/logrotate.d/ or your appropriate logrotate dir.
# http://creativecommons.org/licenses/by-sa/2.0/uk/ CC BY-SA
#
# NOTE: if you are reliant on binlogs i.e. for replication, 'flush logs' closes the current binlog and moves to the next itteration of log files
# You should test this does not cause an issue with your deployment before using this script
#
/var/log/mysql-slow.log {
    weekly
    rotate 3
    compress
    missingok
    notifempty
    sharedscripts
    create 660 mysql mysql
    postrotate
        /usr/bin/mysqladmin flush-logs
    endscript
}
Tags: , ,

Comments 2 Comments »

If you’ve seen the new twitter feed to the right you may of seem some ramblings about ‘cura’.

Cura is a PHP class I have authored in Co-operation with Psycle Interactive (The company I now work for, so be sure to thank them for allowing me to publish this write up!)

So what does it do?
Cura sets several call back objects in your PHP application that re-directs all session data to a mySQL database.

But why do I need that?
The average 1 server end user can stop reading here, as I can tell you now that Cura is not for you.

If however you are a business fielding multiple web servers then read on.

By passing all your PHP sessions to a database you remove the work around requirements for a load balanced solution.

i.e. web1 web2

1) Shopper arrives at web1 and logs in.
2) Shopper adds item to cart, which is logged against their session.
3) web1 is subjected to a search engine index.
4) web2 is now serving the shopper, shoppers basket is now logged out as their session id has changed …

There are numerous work around methods for this, such as having a single shared mount point for the PHP session files, the use of cookies etc …

The problem is in a high availability solution that a single mount point is just that, it’s singular and therefor a single point of failure.

Then there is the use of cookies, which is fine until you start to store a lot of data during your users session, at which point on each server change you are reliant on the cookie data being transmitted back to the server each time, raising the question what is the point of adding a load balanced solution if the user experience becomes degraded due to it’s deployment?

So secret option number 3 is to use a database, you can remove the single point of failure by having a mySQL cluster, and you haven’t got to worry too much about how much data you are storing.

Because everything is in a database whenever your web application is run (web1, web2) it will read the data from one source, allowing persistent sessions across your whole platform without the need for single mount points or session replication.

The source files are available from: http://svn.saiweb.co.uk/branches/cura-php/trunk/

1
svn co http://svn.saiweb.co.uk/branches/cura-php/trunk/

To deploy this solution simply add the following lines to any file that calls session_start();

1
2
3
4
5
require_once('/path/to/cura.class.php');
$cura = new cura($db, $user, $password, $host);
session_start();
...
the rest of your file...

Ensure that you have created a ‘sessions’ table as per the provided sessions.sql file in your database.

I will be adding simplified support for wordpress and joomla shortly these will become available from: http://svn.saiweb.co.uk/branches/cura-php/trunk/

Tags: , , , , , ,

Comments 5 Comments »

The problem that most people face when setting up a UTF-8 database in mySQL is that without calling ‘SET NAMES’ in the mySQL client prior to issuing any queries (PHP, C++ etc …) that the client connection will actually in most cases default to  latin-1.

However as of mySQL 5.x or higher you can issue a statement in the my.cnf file calling init_connect.

This will trigger a series of defined commands / queries every time a non super user connects (So if you are using root to connect to your mySQL database, stop reading now and slap yourself HARD).

i.e.

1
2
3
4
5
6
7
[mysqld]
init_connect='SET collation_connection = utf8_general_ci'
init_connect='SET NAMES utf8'
default-character-set=utf8
character-set-server=utf8
collation-server=utf8_general_ci
skip-character-set-client-handshake

UPDATE 04/09/09

my mySQL version 5.0.45 x64 only picks up the last entry of init_connect

Use this example in this case:

1
2
3
4
5
[mysqld]
init_connect='SET collation_connection = utf8_general_ci; SET NAMES utf8;'
default-character-set=utf8
character-set-server=utf8
collation-server=utf8_general_ci

Restart mySQL and check the mysqld.log has not returned any errors (Or your event viewer if you are using windows).

Every client connection will now default to utf-8 encoding and not latin-1, removing the need to add a SET NAMES call on every connection.

This will work for PHP, C++, ruby etc… as the client encoding is now handeled server side, rather that waiting on the client to issue a SET NAMES command.

UPDATE 30/03/09: Added “skip-character-set-client-handshake” this ignores the clients request to set the connection charset, this info courtesy of “wardo” http://word.wardosworld.com/?p=164

UPDATE 10/09/09

Been having some issues with this working the workaround is to add this config as a single line:

1
init_connect='SET collation_connection = utf8_general_ci; SET NAMES utf8;'
Tags: , , , , ,

Comments 9 Comments »

The downside of a development server is … it’s for development.

It is  not always cost effective to have the exact same setup as you you have in your production environment …

Especially if you have a multi server setup.

So I find myself today moving /var/lib/mysql … being as the OS drive is very small, and pulling down a near 20GB database backup and then trying to rebuild the database on the same drive … well as you can imagine caused a few problems *doh*

1
2
/etc/init.d/mysql stop
mv /var/lib/mysql /raid_5/

So surely you just symlink … right?

1
2
ln -s /raid_5/mysql /var/lib/mysql
/etc/init.d/mysql start

Well then answer would be no … upon importing the backup

1
2
mysql &lt; backup.sql
Can't create table './database/table.frm' (errno: 121)

A nice errorno: 121

1
2
3
4
5
6
/etc/init.d/mysql stop
rm -rf /var/lib/mysql
mkdir /var/lib/mysql
chown mysql:mysql /var/lib/mysql
mount --bind /raid_5/mysql /var/lib/mysql
/etc/init.d/mysql start

et voila …

Data directory is relocated and the import working smoothly. Feel free to suggest any “cleaner” methods.

UPDATE: Please rememeber to add the ‘mount’ line into your rc.local otherwise when you reboot this mount will be gone!

Tags: , ,

Comments 1 Comment »

Creative Commons License