Archive for August, 2010

Sounds simple enough, right?

Use a cache to serve pages faster. Well yes, that is true, but people often do not understand the fundamentals of caching, or how it can actually hurt performance if it is not done properly.

The first thing you need to realize is that once you cache your content it is no longer dynamic … (short pause while we wait for the outrage in the back to die down).

The whole point of your cache is that it will be used instead of processing all your code. Why is this beneficial?

You have to remember that PHP is an interpreted language, meaning each request takes the following path:

Apache -> mod_php -> Script -> Interpreter -> Bytecode -> Execution -> Output Buffer

Now there are two types of caching to consider. The first is complete output caching, which also yields the best performance. The second is opcode caching, which caches the bytecode generated by the interpreter, thus removing that step from the chain of execution.

With me so far? Ok take a deep breath because here we go …

Output caching

This option often yields the best performance, but at the cost of removing the dynamic element from your web app.
But this can be summed up in a single line: What good is dynamic content if you can serve all of 5% of your audience at a given time?

Another turn of phrase is “the Slashdot effect”. There are many options for output caching, and you should ideally provide both gzipped and plain cache files to your end users. For instance, on this blog I use WP Super Cache and can highly recommend it; as new content is posted the relevant caches are regenerated. If you are writing your own web app, check for the “Accept-Encoding: gzip” header sent by the user’s browser.
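As a rough illustration of the "gzipped and plain cache files" idea, here is a minimal Python sketch. The directory layout (`<cache_root>/<host>/<path>`), the `index.html` fallback and the function names are my own assumptions for the example, not how WP Super Cache does it.

```python
import gzip
from pathlib import Path


def write_cache(cache_root: Path, host: str, request_path: str, html: bytes):
    """Store a rendered page as both a plain and a gzipped cache file."""
    # Assumed layout: <cache_root>/<host>/<request path>;
    # "/" is stored as index.html (a hypothetical convention).
    relative = request_path.strip("/") or "index.html"
    plain = cache_root / host / relative
    plain.parent.mkdir(parents=True, exist_ok=True)
    plain.write_bytes(html)
    compressed = plain.with_name(plain.name + ".gz")
    compressed.write_bytes(gzip.compress(html))
    return plain, compressed


def client_accepts_gzip(accept_encoding: str) -> bool:
    """Return True if the Accept-Encoding header value lists gzip.

    Simplification: quality values such as "gzip;q=0" are ignored.
    """
    codings = [part.split(";")[0].strip().lower()
               for part in accept_encoding.split(",")]
    return "gzip" in codings
```

You would call `write_cache` whenever a page is (re)generated, so that both variants are ready before the next visitor arrives.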

For end user transparency couple this with some mod_rewrite voodoo

RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteCond %{DOCUMENT_ROOT}/cache/%{HTTP_HOST}/%{REQUEST_FILENAME}.gz -f
RewriteRule ^(.*) "/cache/%{HTTP_HOST}/%{REQUEST_FILENAME}.gz" [L]

Line 1: if gzip is supported
Line 2: and the cache file exists
Line 3: rewrite the request to the compressed cached file

Your “chain of execution” is now

Apache -> readfile

To serve non-gzipped content:

RewriteCond %{HTTP:Accept-Encoding} !gzip
RewriteCond %{DOCUMENT_ROOT}/cache/%{HTTP_HOST}/%{REQUEST_FILENAME} -f
RewriteRule ^(.*) "/cache/%{HTTP_HOST}/%{REQUEST_FILENAME}" [L]
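To make the decision the two rule sets encode a little more explicit, here it is as a small Python sketch. The function name and the cache layout are assumptions for illustration only; in production the lookup is done by Apache itself, not by a script.

```python
from pathlib import Path
from typing import Optional


def pick_cache_file(cache_root: Path, host: str, request_path: str,
                    accept_encoding: str) -> Optional[Path]:
    """Mirror the two rule sets: return the cache file to serve, or None."""
    plain = cache_root / host / request_path.lstrip("/")
    gzipped = plain.with_name(plain.name + ".gz")
    # Plain substring match, as the RewriteCond pattern "gzip" does.
    wants_gzip = "gzip" in accept_encoding
    if wants_gzip and gzipped.is_file():
        return gzipped   # first rule set
    if not wants_gzip and plain.is_file():
        return plain     # second rule set
    return None          # no rule matched: the request falls through to PHP
```

Note that, exactly like the rules, a gzip-capable client gets nothing from the cache if only the plain file exists, which is one more reason to always generate both.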

Now to clarify a point: you should not be caching images, CSS, JS etc. this way. We’re only covering dynamic content here, and the above are only examples to get you started; you should write rules to exclude content specific to your needs.

And before going off on any more of a tangent, here are some figures for you!

ab -c 100 -n 500 -g ./saiweb-nocache-nogzip.bpl http://www.saiweb.co.uk/

  • No caching
  • No Gzip

Server Hostname: www.saiweb.co.uk
Server Port: 80

Document Path: /
Document Length: 109086 bytes

Concurrency Level: 100
Time taken for tests: 123.304 seconds
Complete requests: 500
Failed requests: 0
Write errors: 0
Total transferred: 54831652 bytes
HTML transferred: 54692607 bytes
Requests per second: 4.06 [#/sec] (mean)
Time per request: 24660.828 [ms] (mean)
Time per request: 246.608 [ms] (mean, across all concurrent requests)
Transfer rate: 434.26 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 57 423 225.5 374 1837
Processing: 2331 20460 16701.2 17232 115192
Waiting: 270 1835 4155.8 576 38549
Total: 2656 20882 16648.1 17692 115421

Percentage of the requests served within a certain time (ms)
50% 17692
66% 20700
75% 24063
80% 25770
90% 35157
95% 53328
98% 82957
99% 101497
100% 115421 (longest request)

As can be seen, as the number of requests grew the response time began to increase sharply and the overall performance of the site degraded. Bear in mind these benchmarks are being made over my home DSL for the time being.


ab -c 100 -n 500 -g ./saiweb-cached.bpl http://www.saiweb.co.uk/

Server Hostname: www.saiweb.co.uk
Server Port: 80

Document Path: /
Document Length: 109086 bytes

Concurrency Level: 100
Time taken for tests: 79.212 seconds
Complete requests: 500
Failed requests: 0
Write errors: 0
Total transferred: 54889292 bytes
HTML transferred: 54705058 bytes
Requests per second: 6.31 [#/sec] (mean)
Time per request: 15842.342 [ms] (mean)
Time per request: 158.423 [ms] (mean, across all concurrent requests)
Transfer rate: 676.70 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 56 314 112.5 322 1341
Processing: 2545 14721 5116.7 14296 36677
Waiting: 216 1283 2228.2 351 13776
Total: 2647 15035 5108.9 14624 36897

Percentage of the requests served within a certain time (ms)
50% 14624
66% 16675
75% 18058
80% 19093
90% 21608
95% 23489
98% 27684
99% 29972
100% 36897 (longest request)

A much more consistent line here. However, as you can clearly see, response times are roughly equal; this is due to my DSL connection. So let’s run these tests from somewhere with a little more bandwidth, say the webserver itself using a loopback connection.


ab -c 100 -n 500 -g ./saiweb-cached.bpl http://www.saiweb.co.uk/

Server Hostname: www.saiweb.co.uk
Server Port: 80

Document Path: /
Document Length: 109086 bytes

Concurrency Level: 100
Time taken for tests: 0.262199 seconds
Complete requests: 500
Failed requests: 0
Write errors: 0
Total transferred: 54945406 bytes
HTML transferred: 54761172 bytes
Requests per second: 1906.95 [#/sec] (mean)
Time per request: 52.440 [ms] (mean)
Time per request: 0.524 [ms] (mean, across all concurrent requests)
Transfer rate: 204642.27 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 1 2.6 0 9
Processing: 4 45 10.3 49 58
Waiting: 1 38 9.9 41 50
Total: 9 47 9.5 50 64

Percentage of the requests served within a certain time (ms)
50% 50
66% 51
75% 52
80% 52
90% 54
95% 56
98% 59
99% 61
100% 64 (longest request)

In this case the response times rise and then plateau, after which no further degradation occurs.


ab -c 100 -n 500 -g ./saiweb-nocache.bpl http://www.saiweb.co.uk/

Server Hostname: www.saiweb.co.uk
Server Port: 80

Document Path: /
Document Length: 109086 bytes

Concurrency Level: 100
Time taken for tests: 8.919565 seconds
Complete requests: 500
Failed requests: 0
Write errors: 0
Total transferred: 54680788 bytes
HTML transferred: 54543000 bytes
Requests per second: 56.06 [#/sec] (mean)
Time per request: 1783.913 [ms] (mean)
Time per request: 17.839 [ms] (mean, across all concurrent requests)
Transfer rate: 5986.73 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 14 30.7 0 85
Processing: 246 1556 714.3 1365 6735
Waiting: 241 1539 707.8 1360 6731
Total: 250 1571 708.0 1368 6735

Percentage of the requests served within a certain time (ms)
50% 1368
66% 1451
75% 1550
80% 1700
90% 2658
95% 3121
98% 3491
99% 3638
100% 6735 (longest request)

Oh dear, oh dear. Let’s cut to the hard facts, shall we?

We’ve gone from serving 1906.95 requests a second to 56.06

  • a 97.1% decrease in performance when removing caching
  • or a 3301.6% increase in performance (roughly 34 times the throughput) when implementing caching

We’ve gone from a response time of ~50ms to ~2000ms

  • a 97.5% decrease in performance when removing caching
  • or a 3900% increase in performance (40 times faster) when caching is on
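Percentages like these are easy to get wrong, because "34 times the throughput" and "a 3400% increase" are not the same thing. A quick Python check of the requests-per-second figures from the two loopback runs (the helper names are mine):

```python
def percent_decrease(before: float, after: float) -> float:
    """How far 'after' has dropped below 'before', as a percentage of 'before'."""
    return (before - after) / before * 100


def percent_increase(before: float, after: float) -> float:
    """How far 'after' has risen above 'before', as a percentage of 'before'."""
    return (after - before) / before * 100


cached_rps, uncached_rps = 1906.95, 56.06

# Removing the cache: cached -> uncached.
print(round(percent_decrease(cached_rps, uncached_rps), 1))
# Adding the cache: uncached -> cached.
print(round(percent_increase(uncached_rps, cached_rps), 1))
# Plain ratio between the two runs.
print(round(cached_rps / uncached_rps, 1))
```

The increase is always the ratio minus one, times 100: about 34 times the throughput works out to an increase of roughly 3300%, and a 40-fold speed-up is a 3900% increase.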

Then there are the CPU and memory overheads to consider. In this case a more prolonged test is required to gather the relevant sar data.
Now let me tell you that intentionally trying to get a test like this to run over a 10 minute period with the correct caching on is a lot harder than it sounds; the tests were in fact completing far too quickly …

The problem I face is making ab run for long enough to gather a useful set of results with caching on. I know for a fact that uncached the server will fail under the load, so I have no way at present of grabbing this data reliably.

What I can tell you is that this command: ab -c 300 -n 1000000 -g ./saiweb-cached.bpl http://www.saiweb.co.uk/

caused a load average of 2.96, 1.9, 0.93 with caching on, and got as high as 21 uncached before I killed it.

Now I am going to bring this post to an end as it is getting quite long. I plan to cover the following in a 2nd part.

  1. Opcode caching
  2. CPU & Memory usage, Cached vs. UNcached


Comments No Comments »

For some background you may want to read the Original Story leading to this write up.

The first thing that caught my attention was the fact that Logwatch was reporting login failures in the order of thousands from unassigned.psychz.net, without an accompanying fail2ban email notifying me that the offender had been banned.

As it would turn out, this was because the attack was clearly intended to defeat such protection methods. When the authentication failure is logged, vsftpd makes a reverse lookup to resolve the host; the PTR record returns unassigned.psychz.net, and that is what gets written into the log.

fail2ban now uses a regex to extract the host from the logs, and attempts a forward lookup on unassigned.psychz.net (A/CNAME records required) to resolve the IP address and ban the offending IP. This is where things go awry.

psychz.net maintains their own DNS servers,

  1. DNS1.PSYCHZ.NET
  2. DNS2.PSYCHZ.NET

These provide a PTR but no A/CNAME record. As such fail2ban cannot resolve an IP, and the attacking IP is left to run its attack unhindered; see this log file: fail2ban name resolution failure log
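The failure mode is easy to reproduce in a few lines of Python. This is only a sketch of the idea, not fail2ban's actual code: the `rhost=` regex, the sample log line and the function names are assumptions for illustration, and the resolver is injectable so the behaviour can be shown without touching real DNS.

```python
import re
import socket
from typing import Callable, Optional

# Assumed log format: a PAM-style failure line carrying "rhost=<host>".
RHOST_RE = re.compile(r"rhost=(\S+)")


def host_from_log(line: str) -> Optional[str]:
    """Extract whatever host the daemon wrote into the log line."""
    match = RHOST_RE.search(line)
    return match.group(1) if match else None


def ip_to_ban(line: str,
              resolve: Callable[[str], str] = socket.gethostbyname) -> Optional[str]:
    """Forward-resolve the logged host; None means there is nothing to ban."""
    host = host_from_log(line)
    if host is None:
        return None
    try:
        return resolve(host)
    except OSError:  # socket.gaierror is a subclass: no A/CNAME record
        return None
```

With a PTR-only name in the log, `ip_to_ban` comes back empty-handed and the attacker carries on, which is exactly what happened here. Logging the connecting IP rather than the reverse-resolved name avoids the problem entirely.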

The only way therefore to obtain the attacking IP was to match the FTP connection times to those of the reported login failures, using iptables to log all access to FTP. You can quickly get a count of connecting IPs using:

grep kernel /var/log/messages | awk '{print $9}' | sed 's/SRC=//' | sort | uniq -c | sort -n

390 173.224.217.41
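If you would rather not depend on the SRC= field being the ninth column, the same count can be done in Python by matching the field directly. A minimal sketch; the function name is mine and it assumes the usual iptables LOG format containing `SRC=<ip>`:

```python
import re
from collections import Counter

# Match the SRC=<ipv4> field wherever it appears in the line.
SRC_RE = re.compile(r"\bSRC=(\d{1,3}(?:\.\d{1,3}){3})\b")


def count_sources(lines):
    """Return (ip, hits) pairs, busiest source first."""
    counts = Counter()
    for line in lines:
        match = SRC_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts.most_common()
```

Feed it an open file handle on /var/log/messages (or any iterable of log lines) and print the result.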

A complete log can be found here: iptables.log, and a whois can be found here: whois.txt

Disclosure steps taken:

  1. 26/07/10 psychz support informed given deadline of 09/08/10 for resolution
  2. Same day standard reply of “thanks for contacting support we are looking into this” …
  3. 27/07/10 Attacks continue; the 173.224.208.0/20 network is black-holed as a result:
    iptables -A INPUT -s 173.224.208.0/20 -j DROP
  4. 09/08/10 deadline passes without update
  5. 25/08/10 this blog post published
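Before dropping a whole /20 it is worth double-checking what the range actually covers. A quick sanity check with Python's standard ipaddress module (the helper name is mine):

```python
import ipaddress

# The network dropped by the iptables rule in step 3.
BLOCKED = ipaddress.ip_network("173.224.208.0/20")


def is_blocked(ip: str) -> bool:
    """True if the address falls inside the black-holed network."""
    return ipaddress.ip_address(ip) in BLOCKED


print(BLOCKED[0], BLOCKED[-1], BLOCKED.num_addresses)
print(is_blocked("173.224.217.41"))
```

The attacking address from the iptables log sits comfortably inside the range, along with about four thousand of its neighbours.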


Comments 5 Comments »

This blog entry here: http://rackerhacker.com/2010/08/25/a-nerds-perspective-on-cloud-hosting/ prompted me to write this post, after I realized I’d filled the comment field without ending my “monologue”. Anyway, I thought it would be better to voice my opinions here, to you lot who are daft enough to read this blog.

I think the main problem is that the term “cloud” has been massively over-marketed and has possibly long since lost its original meaning, with providers trying to jump on the marketing bandwagon.

I’ve not made the jump to “the Cloud” yet, as frankly I can’t see a benefit over properly configured HA installations. For example, I would much rather be using several pre-configured servers with RHCS to handle the migration of critical services (MySQL etc.).

I begin to see the benefits for large hosting providers, where customers want the power of a dedicated server but only pay for what they actually use; in this instance a provider ensures uptime through live migration.

Some other misconceptions arising from over-marketing that I’d like to point out:

1) The “cloud” is not always on

Don’t get me wrong, it can be configured to come close, using distributed VMs for your critical services (e.g. Apache) and coupling this with load balancing and clustering setups.

The misconception for most “end users” is that if you buy a single cloud instance, through magic/voodoo it will always be on 100% of the time!

Simply put, if the hardware it is running on dies, it will go down. Regardless of the live migration measures in place, there will be downtime: do not pass go, do not collect HTTP 200, go directly to > /dev/null

2) The “cloud” is not secure

If you insist on putting your 5 year old Joomla website on a cloud VM, it can and will become compromised quickly. Security is only going to be as good as the configuration you have in place. You have mitigation measures such as

  • selinux
  • webapp updates/patches
  • fail2ban/banhosts packages

Whilst a VM in itself is largely seen as secure, as it protects the host machine should the VM become compromised, this is not always the case. For instance there have been several occurrences of VMware ESXi servers allowing code execution on the host (long since patched, don’t panic!), allowing attackers who have compromised a VM on the cloud to root the host machine and, as a cascading effect, every other VM instance on the box.

Let me point out a worst case scenario here:

  1. Hypervisor running on Host A with 30 VMs
  2. Host A is part of a resilient set with live migration in place, Hosts B, C, D
  3. VM A’s 5 year old Joomla app is subject to an XSS bug, and an attacker places the r57 shell on the web app
  4. Attacker proceeds to deploy backdoors (e.g. meterpreter)
  5. VM A is subject to remote code execution on host
  6. Attacker compromises Host
  7. Host A is now root’ed
  8. Attacker forces Migration of VM A onto Host B
  9. Host B rooted using same method
  10. Rinse & repeat for C & D

In summary, if you are looking at a cloud solution and your web presence is important, make an informed decision and go with one of the larger providers, and NEVER EVER go with the cheapest option you could find, probably on eBay …

The cloud is not some magical being created by the hosting fairies that will take all your hosting and maintenance woes away. It may or may not be the right thing for your business / web app, and in certain instances it can lower TCO. I for one will be sticking with my cluster services and high availability designs for a while yet.


Comments 1 Comment »

Time was when a photo was just a captured moment in time, /end nostalgia

Nowadays, though, what people do not realize is the sheer amount of “extra” information embedded in “that picture you just uploaded to Flickr/Facebook/Photobucket”, especially if you are uploading from a “smart phone”, as more and more people now are.

Most photos now contain embedded GPS data, and this information will survive a resize / upload process. At the time of writing, images tested from Facebook appear to have the EXIF data stripped out (thumbs up for Facebook, maybe), and it appears PHP GD by default replaces all EXIF data with its own (bug, maybe?).

For non-sanitized images, however, you can discern a wealth of information such as:

  1. Make of camera
  2. Model of camera
  3. Software version
  4. Unix timestamp of time taken
  5. DateTime stamp of time taken
  6. Focal length used
  7. Shutter speed
  8. Whether the flash was used

And if GPS is embedded:

  1. Longitude
  2. Latitude
  3. Altitude
  4. GPS timestamp
  5. Direction facing when photo taken
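To give a feel for how precise the GPS fields are: EXIF stores latitude and longitude as three rational numbers (degrees, minutes, seconds) plus a hemisphere reference (N/S/E/W). Converting that to the decimal degrees a mapping site expects takes a few lines of Python. The function name and input shape here are my own choices for the sketch, and reading the tags out of the file is left to whichever EXIF library you use.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert EXIF-style GPS rationals to signed decimal degrees.

    dms: three (numerator, denominator) pairs for degrees, minutes, seconds.
    ref: hemisphere reference, one of "N", "S", "E", "W".
    """
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # South and west are negative in decimal notation.
    if ref in ("S", "W"):
        value = -value
    return float(value)
```

Paste the resulting latitude/longitude pair into any online map and you are looking at where the photographer was standing.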

There is yet more data, such as the colour profile used and image resolutions. In my tests, photos taken with my iPhone 4 recorded a position within 10 meters of where I was actually standing when I took the picture, along with the direction I was facing.

So one more thing to add to your application’s “data sanity” checks is stripping EXIF tags from uploaded images, lest your contributors’ private details be leaked from your application.
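For JPEG uploads, stripping can be as simple as dropping the APP1 segment that carries the EXIF block. Below is a minimal, standard-library-only Python sketch of that idea. It is a simplification rather than a hardened sanitizer: it assumes a well-formed JPEG, only removes APP1 segments that start with the `Exif` identifier, and leaves other metadata (XMP, comments) alone.

```python
import struct


def strip_exif(jpeg: bytes) -> bytes:
    """Return a copy of the JPEG with its EXIF (APP1) segments removed."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError("malformed JPEG segment")
        marker = jpeg[pos + 1]
        if marker == 0xDA:
            # Start of scan: everything after this is image data, copy as-is.
            out += jpeg[pos:]
            return bytes(out)
        # Segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        segment = jpeg[pos:pos + 2 + length]
        is_exif = marker == 0xE1 and segment[4:10] == b"Exif\x00\x00"
        if not is_exif:
            out += segment
        pos += 2 + length
    out += jpeg[pos:]
    return bytes(out)
```

Run this on the upload before any resized copies are generated, so no derivative of the image ever carries the original tags.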

For example:

  1. User uploads photo for competition
  2. Site uses resized photo on competition page to allow visitor voting
  3. Malicious user saves the image from the site (or just uses the copy from their browser cache) and gets the GPS data from the photo
  4. Malicious user now knows exactly where the photo was taken, as well as when.

And it doesn’t have to be a malicious user, it could be anyone/anything. If you want to check your images for EXIF data you can use my tool here: http://www.saiweb.co.uk/tools/exif_data.php

No data is stored, and images are deleted immediately after processing. You use this at your own risk, however; if you misuse the tool you accept all liability for the legal action to follow. You have been warned.


Comments No Comments »