Posts Tagged “high availability”

I meant to note this down yesterday, but everything is going ten to the dozen at the moment.

Basically, I have now authored a Nagios addon for monitoring master-master replication between two servers. It carries out 4 stages of checks:

  1. Validates that all required data is returned by both servers
  2. Checks that Slave IO is running on both servers
  3. Checks Seconds_Behind_Master; args can be passed to vary the warning and critical thresholds
  4. Checks that (slave) Master_Log_File == (master) File
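
The four checks above can be sketched roughly as follows. This is not the addon's actual code (which is not released); the function names, the default thresholds, and the shape of the input dictionaries are assumptions for illustration only. The field names are those returned by MySQL's SHOW SLAVE STATUS and SHOW MASTER STATUS, and the return codes follow the usual Nagios plugin exit codes.

```python
# Hypothetical sketch of the four checks; not the addon's real code.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard Nagios plugin exit codes

REQUIRED_SLAVE_FIELDS = ("Slave_IO_Running", "Seconds_Behind_Master", "Master_Log_File")
REQUIRED_MASTER_FIELDS = ("File",)


def check_pair(slave_status, master_status, warn=30, crit=60):
    """Check one direction of replication.

    slave_status:  one row of SHOW SLAVE STATUS from the replicating server (dict)
    master_status: one row of SHOW MASTER STATUS from its master (dict)
    warn, crit:    assumed default lag thresholds, in seconds
    """
    # 1. Validate that all required fields were returned.
    for field in REQUIRED_SLAVE_FIELDS:
        if field not in slave_status:
            return UNKNOWN, "slave status is missing %s" % field
    for field in REQUIRED_MASTER_FIELDS:
        if field not in master_status:
            return UNKNOWN, "master status is missing %s" % field

    # 2. The slave IO thread must be running.
    if slave_status["Slave_IO_Running"] != "Yes":
        return CRITICAL, "slave IO thread is not running"

    # 3. Compare the replication lag against the thresholds.
    lag = slave_status["Seconds_Behind_Master"]
    if lag is None:
        return CRITICAL, "replication lag is unknown"
    if lag >= crit:
        return CRITICAL, "%s seconds behind master" % lag
    if lag >= warn:
        return WARNING, "%s seconds behind master" % lag

    # 4. The slave must be reading the binlog file the master is writing.
    if slave_status["Master_Log_File"] != master_status["File"]:
        return CRITICAL, "slave is reading an older binlog file"

    return OK, "replication healthy"


def check_master_master(a_slave, a_master, b_slave, b_master, warn=30, crit=60):
    """Run the checks in both directions and report the more severe result."""
    results = [
        check_pair(a_slave, b_master, warn, crit),  # A replicates from B
        check_pair(b_slave, a_master, warn, crit),  # B replicates from A
    ]
    return max(results, key=lambda result: result[0])
```

In a real plugin the status code would be passed to sys.exit() and the message printed for Nagios to display.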

The 5th check was to be a comparison of the binlog positions themselves: (slave) Read_Master_Log_Pos against (master) Position.

Herein lies the problem, which took a while to track down: no matter what I tried, the slave was ALWAYS behind the master position … but why?

The reason is the same reason I designed the High Availability solution in the first place: a very high traffic level, in the region of 20,800 transactions per second.

Why was this a problem? The two queries that gather the data are run sequentially, one server after the other. Using the Python time library I was able to find that there is a 0.02s interval (20 milliseconds) between gathering the two datasets … in that time 416 transactions had taken place (20,800 * 0.02).

i.e.

time: binlog pos

Slave A

0.000: 100

Master B

0.020: 516
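
The arithmetic behind that gap can be reproduced in a few lines. The figures are the ones quoted above; the function name is made up for illustration, and the binlog position is treated as a simple transaction counter, as in the example.

```python
def transactions_during(rate_per_second, interval_seconds):
    """Approximate number of transactions committed during the sampling gap."""
    return round(rate_per_second * interval_seconds)


gap = transactions_during(20800, 0.02)  # 20,800 tps over a 20 ms gap
print(gap)  # 416

slave_pos = 100             # position sampled on Slave A at t = 0.000
master_pos = slave_pos + gap  # position sampled on Master B at t = 0.020
print(master_pos)  # 516
```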

This has unfortunately now led to some 32 lines of code being commented out, as I can see no way to reliably use the binlog positions for monitoring replication in this situation. If any delay occurs at any point during the dataset collection (network latency, a delay in query processing due to a traffic peak on one server, etc.), the collected samples will always be different.

The only way I can ever see this working is if you can validate that the datasets came from exactly the same point in time, down to the nanosecond. This, however, is again not possible: on the network where the servers currently reside there is a 0.13 millisecond ping response time, which works out to 130,000 nanoseconds (0.00013 * 10^9).
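
As a quick check on the unit conversion, and on what that latency means at the traffic level quoted earlier, here is a small sketch. The function name is illustrative only; 0.13 ms is 0.00013 s, which is 130,000 ns.

```python
def ms_to_ns(milliseconds):
    """Convert milliseconds to whole nanoseconds (1 ms = 1,000,000 ns)."""
    return round(milliseconds * 1000000)


ping_ns = ms_to_ns(0.13)
print(ping_ns)  # 130000

# Transactions committed during a single ping round trip at 20,800 per second.
rate_per_second = 20800
transactions_per_ping = rate_per_second * 0.13 / 1000
print(round(transactions_per_ping, 2))  # 2.7
```

So even if the two samples were separated by nothing more than network latency, the positions would still differ by a couple of transactions.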

If anyone has any theories on how to overcome this please let me know.

NOTE: As this addon was programmed during working hours, the Nagios addons are not for public release at this time. This may change in the future should my employers allow their release.


If you’ve seen the new Twitter feed to the right you may have seen some ramblings about ‘cura’.

Cura is a PHP class I have authored in co-operation with Psycle Interactive (the company I now work for, so be sure to thank them for allowing me to publish this write-up!).

So what does it do?
Cura registers several callbacks in your PHP application that redirect all session data to a MySQL database.

But why do I need that?
The average single-server user can stop reading here, as I can tell you now that Cura is not for you.

If however you are a business fielding multiple web servers then read on.

By passing all your PHP sessions to a database you remove the need for workarounds in a load-balanced solution.

e.g. with two web servers, web1 and web2:

1) Shopper arrives at web1 and logs in.
2) Shopper adds item to cart, which is logged against their session.
3) web1 is subjected to a search engine index.
4) web2 is now serving the shopper; the shopper’s basket is now gone, as their session id has changed …

There are numerous workarounds for this, such as having a single shared mount point for the PHP session files, the use of cookies, etc.

The problem, in a high availability solution, is that a single mount point is just that: it’s singular, and therefore a single point of failure.

Then there is the use of cookies, which is fine until you start to store a lot of data during your user’s session. At that point you are reliant on the cookie data being transmitted back to the server on every request, which raises the question: what is the point of adding a load-balanced solution if the user experience becomes degraded due to its deployment?

So secret option number 3 is to use a database. You can remove the single point of failure by having a MySQL cluster, and you don’t have to worry too much about how much data you are storing.

Because everything is in a database, whenever your web application runs (on web1 or web2) it reads the data from one source, allowing persistent sessions across your whole platform without the need for single mount points or session replication.
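
To illustrate the idea only (this is not Cura's own code, which is PHP and talks to MySQL), here is a minimal Python sketch in which an in-memory sqlite3 database stands in for the shared MySQL database. The class name, table layout, and method names are assumptions made up for this example.

```python
import json
import sqlite3


class DbSessionStore:
    """Hypothetical database-backed session store; one instance per web server."""

    def __init__(self, connection):
        self.db = connection
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS sessions ("
            "id TEXT PRIMARY KEY, data TEXT NOT NULL)"
        )

    def write(self, session_id, data):
        # Insert the serialised session data, overwriting any earlier copy.
        self.db.execute(
            "INSERT OR REPLACE INTO sessions (id, data) VALUES (?, ?)",
            (session_id, json.dumps(data)),
        )
        self.db.commit()

    def read(self, session_id):
        # Return the stored session, or an empty one if the id is unknown.
        row = self.db.execute(
            "SELECT data FROM sessions WHERE id = ?", (session_id,)
        ).fetchone()
        return json.loads(row[0]) if row else {}

    def destroy(self, session_id):
        self.db.execute("DELETE FROM sessions WHERE id = ?", (session_id,))
        self.db.commit()


shared_db = sqlite3.connect(":memory:")  # stand-in for the MySQL cluster
web1 = DbSessionStore(shared_db)
web2 = DbSessionStore(shared_db)

# The shopper logs in and fills their cart on web1 ...
web1.write("abc123", {"user": "shopper", "cart": ["item-1"]})
# ... and the cart is still there when web2 serves the next request.
print(web2.read("abc123")["cart"])  # ['item-1']
```

Both "servers" read and write the same table, so it does not matter which one handles a given request.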

The source files are available from: http://svn.saiweb.co.uk/branches/cura-php/trunk/

svn co http://svn.saiweb.co.uk/branches/cura-php/trunk/

To deploy this solution simply add the following lines to any file that calls session_start();

require_once('/path/to/cura.class.php');
// Create the Cura object before the session is started
$cura = new cura($db, $user, $password, $host);
session_start();
// ... the rest of your file ...

Ensure that you have created a ‘sessions’ table in your database as per the provided sessions.sql file.

I will be adding simplified support for WordPress and Joomla shortly; these will become available from: http://svn.saiweb.co.uk/branches/cura-php/trunk/


Creative Commons License