Archive for the “Hosting” Category

If you haven’t tried boxgrinder then you are missing out. It makes it extremely easy to script the generation of a virtual machine for output to EC2, VMware, VirtualBox, KVM etc. (Rackspace is not supported yet.)

In this post I will cover the basic generation of a LAMP (Linux Apache MySQL PHP) stack CentOS appliance. Nothing too complicated I assure you, and no magic like auto deployment, spin up etc … that’s for later … no skipping ahead!

First of all you’re going to need boxgrinder. I recommend downloading the Meta appliance, as it has all the tools you need already.

Now I am covering the following.

  1. basic use of boxgrinder-build on the meta appliance
  2. creation of a basic CentOS LAMP stack
  3. deploying the image to KVM

I’m going to have to assume that you are capable of downloading and starting up the meta appliance yourself, and focus more on the stack setup.

Grinding your VM

OK, so you are going to need a YAML file defining the CentOS LAMP stack. Save this on your meta appliance as CentOS-lamp.appl (the name used in the build command below):

name: CentOS-lamp
summary: Generic CentOS 5.6 LAMP stack, with some apache & php tuning
version: 1
release: 0
hardware:
  cpus: 2
  memory: 1024
  partitions:
    "/":
      size: 5
    "/var/www":
      size: 15
os:
  name: centos
  version: 5
  password: changeme

On your Meta appliance run.

boxgrinder-build -d CentOS-lamp.appl

This process will take a while, so go and get a coffee. Once complete it will produce ./build/appliances/x86_64/centos/5/CentOS-lamp/CentOS-lamp-sda.raw. If you run into issues, the -d flag is “debug”; paste your log output in the comments and I will do my best to diagnose and fix your issue.

Deploying to KVM

boxgrinder has SFTP support for pushing to remote servers; you can use this if you like to automate the “push” of the image to your KVM server. At the moment automated deployment to KVM is not supported, but it may be coming soon.

Assuming you have placed your image in /var/lib/libvirt/images/ as CentOS-lamp.raw

virt-install -n "Saiweb - CentOS-lamp Demo" -r 1024 --arch=x86_64 --vcpus=1 --os-type=linux --os-variant=rhel5.4 --disk path=/var/lib/libvirt/images/CentOS-lamp.raw,size=20,cache=none,device=disk --accelerate --network=bridge:br0 --vnc --import
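
A quick sanity check on the numbers: the size=20 in the virt-install line is just the two partitions from the appliance definition added together (5 + 15). Here is a minimal Python sketch, purely as an illustration, that rebuilds the same command from the definition. The `appliance` dict is hand-copied from the YAML above and `virt_install_args` is a made-up helper name, not part of boxgrinder or libvirt.

```python
import shlex

# Values hand-copied from the appliance definition above (assumption: kept in sync manually).
appliance = {
    "name": "CentOS-lamp",
    "hardware": {
        "memory": 1024,
        "partitions": {"/": {"size": 5}, "/var/www": {"size": 15}},
    },
}


def virt_install_args(appliance, image_path, vm_name, vcpus=1, bridge="br0"):
    """Build the virt-install argument list used in this post (hypothetical helper)."""
    hardware = appliance["hardware"]
    # Partition sizes are in GB; the disk has to hold all of them.
    disk_gb = sum(part["size"] for part in hardware["partitions"].values())
    return [
        "virt-install",
        "-n", vm_name,
        "-r", str(hardware["memory"]),
        "--arch=x86_64",
        f"--vcpus={vcpus}",
        "--os-type=linux",
        "--os-variant=rhel5.4",
        "--disk", f"path={image_path},size={disk_gb},cache=none,device=disk",
        "--accelerate",
        f"--network=bridge:{bridge}",
        "--vnc",
        "--import",
    ]


args = virt_install_args(
    appliance,
    "/var/lib/libvirt/images/CentOS-lamp.raw",
    "Saiweb - CentOS-lamp Demo",
)
# shlex.join quotes the VM name so the line can be pasted into a shell.
print(shlex.join(args))
```

If you change a partition size in the definition, the disk size in the command follows along, which saves you a mismatch later.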

Post startup

This is a VERY basic setup; I have not covered any of the post install options in this post (but I will in future posts), so:

chkconfig httpd on && service httpd start
chkconfig mysqld on && service mysqld start

This will set your services to automatically start at startup, and start them.


If you tie your web application in to automatically PURGE content when you modify it, keeping the content “fresh” while using Varnish, you may notice after making the jump from 2.x to 3.x that your PURGE VCL no longer works. I refer you to: https://www.varnish-software.com/blog/bans-and-purges-varnish-30

In short, replace your usual

sub vcl_hit {
        if (req.request == "PURGE") {
                set obj.ttl = 0s;
                error 200 "Purged."; #uses error function to return simple confirmation
        }
}
sub vcl_miss {
        if (req.request == "PURGE") {
                error 404 "Not in cache."; #request to purge non-existent item
        }
}

with

sub vcl_recv {
        if (req.request == "PURGE") {
                if (!client.ip ~ purge) {
                        error 405 "Not allowed.";
                }
                ban("req.url ~ "+req.url+" && req.http.host == "+req.http.host);
                error 200 "Purged.";
        }
...

Substitute “purge” with your ACL name. The above implements wildcard purging as well, because ~ is a regular expression match. If you do not want this and only want to PURGE the exact URL passed, replace

"req.url ~ "+req.url

with

"req.url == "+req.url
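
To make the difference between the two operators concrete, here is a rough Python sketch of which cached URLs a ban would hit. It is only an approximation under stated assumptions: Varnish uses PCRE where this uses Python's `re`, the host condition is left out, and `ban_hits` is a name invented for this example.

```python
import re


def ban_hits(ban_url, cached_url, exact=False):
    """Return True if a ban issued for ban_url would hit cached_url.

    exact=False models `req.url ~ <ban_url>` (unanchored regex match);
    exact=True models `req.url == <ban_url>` (plain string equality).
    """
    if exact:
        return cached_url == ban_url
    return re.search(ban_url, cached_url) is not None


# Hypothetical cache contents, just for the demonstration.
cached = ["/blog/", "/blog/varnish-part-1/", "/about/"]

for url in cached:
    print(url, ban_hits("/blog/", url), ban_hits("/blog/", url, exact=True))
```

With the regex form, a PURGE for /blog/ also takes out everything underneath it; with the equality form only /blog/ itself goes.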


Ok, so following up on PHP & Caching with Varnish, let’s cut to the hard facts shall we?

Using the same tests as

ab -c 100 -n 500 -g ./saiweb-nocache-nogzip.bpl http://www.saiweb.co.uk/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking www.saiweb.co.uk (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Finished 500 requests

Server Software: Apache
Server Hostname: www.saiweb.co.uk
Server Port: 80

Document Path: /
Document Length: 92719 bytes

Concurrency Level: 100
Time taken for tests: 0.184 seconds
Complete requests: 500
Failed requests: 0
Write errors: 0
Total transferred: 47597095 bytes
HTML transferred: 47379409 bytes
Requests per second: 2716.92 [#/sec] (mean)
Time per request: 36.806 [ms] (mean)
Time per request: 0.368 [ms] (mean, across all concurrent requests)
Transfer rate: 252573.13 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 1 4 1.1 4 6
Processing: 9 31 7.0 32 47
Waiting: 2 7 5.7 4 26
Total: 15 35 6.8 36 53

Percentage of the requests served within a certain time (ms)
50% 36
66% 38
75% 39
80% 39
90% 41
95% 44
98% 48
99% 51
100% 53 (longest request)


2716.92 requests per second with a server load average of 0.1, and in this case varnish is serving cache from disk.

Caching using varnish (or even nginx / mod_cache) means that PHP does not get executed at all; the cache system grabs the cached content and serves it.

This of course has the benefit of reducing the CPU and memory resources needed for the running of your application, but it does have some caveats.

  • This only works for GET requests, and content not reliant on cookies (truly dynamic content will not cache)
  • But on the “flipside” Varnish supports ESI, which when set up correctly lets you target the dynamic sections of a page for “passthrough” and have the rest cached

More details to come as I have time to add them; I have a lot of posts to make on boxgrinder, KVM, libvirtd etc.


Pre-req reading: Part 1

In this part we will cover setting up a backend. A backend is your application server, whether this be apache / nginx / iis (IIS – Is Inherently Stupid); you are telling varnish where it should send its requests.

Very basic configuration

backend app1 {
    .host = "127.0.0.1";
    .port = "8080";
}

For a quick start that’s it really: you tell varnish a backend and the port to connect to it on … just make sure you use it in vcl_recv. But you’re not here for a simple quick start, are you? Let’s add the following.

  • timeout settings
  • probe settings

Timeout settings

Your timeout settings define how long varnish should wait for a response from your backend.

backend app1 {
    .host = "127.0.0.1";
    .port = "8080";
    .connect_timeout = 0.05s;
    .first_byte_timeout = 2s;
    .between_bytes_timeout = 2s;
}
  • connect_timeout wait 50ms for a tcp connection to take place
  • first_byte_timeout wait 2s for the first byte of data to be sent from the backend
  • between_bytes_timeout wait 2s if there is a pause mid data stream

Timeouts are a basic way of spotting a backend that is down or misbehaving. If you have multiple backends and one keeps timing out, it can be marked as sick (this is what the probe settings below are for) and the other backends will be used.

probe settings – Trust me I’m a doctor

backend app1 {
    .host = "127.0.0.1";
    .port = "8080";
    .connect_timeout = 0.05s;
    .first_byte_timeout = 2s;
    .between_bytes_timeout = 2s;
    .probe = {
        .url = "/status.html";
        .timeout = 0.05s;
        .window = 5;
        .threshold = 3; # 3 of the last 5 checks (60%) must have been OK for this backend to be healthy
        .interval = 2s; # how often to run the checks
    }
}
  • url the URL to query; this must return a 200 OK response. You could use a PHP script to return a 500 on, say, a mySQL outage
  • timeout how long to wait for a 200 OK response from the URL
  • window keep the result of the last 5 probes in memory
  • threshold how many of the window total must be OK for the backend to be “healthy”
  • interval how often to run the probe
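
If the window / threshold relationship is not obvious, here is a toy Python model of it. This is a simplified sketch and not how varnish implements it internally (for one, it ignores how the window is seeded at startup); `ProbeWindow` is a name made up for this example.

```python
from collections import deque


class ProbeWindow:
    """Toy model of .window / .threshold health tracking."""

    def __init__(self, window=5, threshold=3):
        self.threshold = threshold
        # Only the most recent `window` probe results are remembered.
        self.results = deque(maxlen=window)

    def record(self, ok):
        """Store one probe result (True means a 200 OK arrived within .timeout)."""
        self.results.append(bool(ok))

    @property
    def healthy(self):
        # Healthy when at least `threshold` of the remembered probes passed.
        return sum(self.results) >= self.threshold


probe = ProbeWindow(window=5, threshold=3)
for ok in [True, True, True, False, False, False]:
    probe.record(ok)
    print(ok, probe.healthy)
```

Note how the backend stays healthy through the first two failures and only flips once the third failure pushes an old success out of the window.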

And that about wraps up this post.


Part 1, what is varnish?

The varnish cache project is one you really need to get familiar with if you manage any high volume websites. It can mean the difference between a self-destructing web app that buckles under its own load, and an apparently seamless web app serving thousands of concurrent connections with relative ease.

How does it work?

Varnish acts as a proxy server: when a user sends a GET request, varnish will look in its internal cache for a cached version, and if it cannot find one it will pass the request to the “back end” (in this case an apache server). Varnish will then cache the response for subsequent accesses.
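
The lookup / miss / store flow can be sketched in a few lines of Python. To be clear, this is only a toy illustration of the idea and says nothing about how varnish is built; `TTLCache` and `slow_backend` are invented names.

```python
import time


class TTLCache:
    """Toy caching proxy: serve from cache, fall through to the backend on a miss."""

    def __init__(self, backend, ttl=60.0, clock=time.monotonic):
        self.backend = backend  # callable: url -> response body
        self.ttl = ttl
        self.clock = clock
        self.store = {}  # url -> (expires_at, body)
        self.backend_hits = 0

    def get(self, url):
        now = self.clock()
        entry = self.store.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the backend is not touched
        body = self.backend(url)  # cache miss: pass the request to the backend
        self.backend_hits += 1
        self.store[url] = (now + self.ttl, body)  # keep it for subsequent requests
        return body


def slow_backend(url):
    """Stand-in for the apache server rendering a page."""
    return f"rendered {url}"


cache = TTLCache(slow_backend, ttl=60.0)
for _ in range(1000):
    cache.get("/")
print(cache.backend_hits)
```

A thousand requests for the same URL inside the ttl cost the backend a single render.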

Now you may ask yourself, why do you need this? It boils down to what you are trying to achieve with your web application. If your application is heavily reliant on dynamic content and regularly gets some 400 concurrent users, for example, let’s assume the following:

  1. 400 concurrent unique users
  2. Average page render time is 0.85s

The Math

Based on this, if you were to place varnish in front of your application with a 60 second ttl (time to live, the length of time varnish will hold an object in cache):

  1. Varnish ttl 60 seconds
  2. 400/0.85 = 470.59/second
  3. 28235.29/minute
  4. Factor of reduction to “back end”: x28235.29

So in the example above, simply by caching a page for as little as 60 seconds, the requests per minute reaching the back end are reduced from 28235.29 to 1. Even reducing the cache time to 10 seconds in this example would give a x4705.88 reduction.
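
If you want to plug your own numbers into the math above, here it is as a small Python sketch. It assumes, as the example does, a single cached page and exactly one backend fetch per ttl window; `backend_reduction` is just a name chosen for this illustration.

```python
def backend_reduction(concurrent_users, render_time_s, ttl_s):
    """Factor by which backend requests drop when one page is cached for ttl_s seconds."""
    requests_per_second = concurrent_users / render_time_s
    # One backend fetch per ttl window, so the reduction factor is simply
    # the number of requests that arrive during that window.
    return requests_per_second * ttl_s


for ttl in (60, 10):
    print(ttl, round(backend_reduction(400, 0.85, ttl), 2))
```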

How is this reduction a good thing? Well, time on CPU for one. Varnish, when configured correctly, is very, very fast, and even with an out-of-the-box configuration it’s still going to be much faster than your dynamic web application.

Summary

So here ends a brief introduction to varnish and why you really want to start using it. In the following parts we will cover

  • Configuration overview
    • brief overview of each sub section based on the 2.1 syntax
    • Advanced configuration
      • Load balancing
      • Failover handling
      • Raising cache hitrate
      • Pros and cons of each setup
      • Benchmarks


Sounds simple enough, right?

Use a cache to serve pages faster. Well, yes, that is true, but people often do not realize the fundamentals of caching and how, if not done properly, it can be detrimental to performance.

The first thing you need to realize is that by caching, your content is no longer dynamic … (short pause while we wait for the outrage in the back to die down).

The whole point behind your cache is that it will be used instead of processing all your code. Why is this beneficial?

You have to remember that PHP is an interpreted language, meaning it takes the following I/O flow:

Apache -> mod_php -> Script -> Interpreter -> Bytecode -> Execution -> Output Buffer

Now there are two types of caching to consider. The first is complete output caching, which also yields the best performance. The second is opcode caching, which caches the bytecode generated by the interpreter, thus removing that step from the chain of execution.

With me so far? Ok take a deep breath because here we go …

Output caching

This option often yields the best performance, but at the cost of removing the dynamic element from your web app.
But this can be summed up in a single line: what good is dynamic content if you can only serve 5% of your audience at a given time?

Another turn of phrase is “the slashdot effect”. There are many options for output caching, and you should ideally provide both gzipped and plain cache files to your end user. For instance, on this blog I use WP Super Cache and can highly recommend it; as new content is posted the relevant caches are regenerated. If you are writing your own web app, check for the “Accept-Encoding: gzip” header sent by the user’s browser.

For end user transparency, couple this with some mod_rewrite voodoo

RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteCond %{DOCUMENT_ROOT}/cache/%{HTTP_HOST}/%{REQUEST_FILENAME}.gz -f
RewriteRule ^(.*) "/cache/%{HTTP_HOST}/%{REQUEST_FILENAME}.gz" [L]

Line 1: If gzip is supported
Line 2: and the cache file exists
Line 3: Rewrite the request to the compressed cached file

Your “chain of execution” is now

Apache -> readfile

To serve non-gzipped content:

RewriteCond %{HTTP:Accept-Encoding} !gzip
RewriteCond %{DOCUMENT_ROOT}/cache/%{HTTP_HOST}/%{REQUEST_FILENAME} -f
RewriteRule ^(.*) "/cache/%{HTTP_HOST}/%{REQUEST_FILENAME}" [L]

Now to clarify a point: you should not be caching images, css, js etc. this way; we’re only covering dynamic content here. The above are only examples to get you started, and you should write rules to exclude certain content specific to your needs.
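
For the "writing your own web app" case, the write side and the pick-a-file side of an output cache might look something like the following Python sketch. Treat it as a rough illustration under assumptions: the directory layout (cache root / host / file) mirrors the rewrite rules above, the Accept-Encoding check is a naive substring test just like the RewriteCond, and `write_cache` / `pick_cached` are names invented here.

```python
import gzip
import tempfile
from pathlib import Path


def write_cache(cache_root, host, name, html):
    """Store a rendered page as both a plain and a gzipped cache file."""
    plain = Path(cache_root) / host / name
    plain.parent.mkdir(parents=True, exist_ok=True)
    plain.write_text(html, encoding="utf-8")
    gz = plain.with_name(plain.name + ".gz")
    gz.write_bytes(gzip.compress(html.encode("utf-8")))
    return plain, gz


def pick_cached(cache_root, host, name, accept_encoding):
    """Return the cache file to serve, or None when nothing is cached."""
    plain = Path(cache_root) / host / name
    gz = plain.with_name(plain.name + ".gz")
    # Naive substring check, like the RewriteCond; ignores q-values.
    if "gzip" in accept_encoding.lower() and gz.is_file():
        return gz
    if plain.is_file():
        return plain
    return None


# Demonstration in a throwaway directory; www.example.com is a placeholder host.
with tempfile.TemporaryDirectory() as root:
    write_cache(root, "www.example.com", "index.html", "<h1>hello</h1>")
    served_gzip = pick_cached(root, "www.example.com", "index.html", "gzip, deflate").name
    served_plain = pick_cached(root, "www.example.com", "index.html", "identity").name
    served_missing = pick_cached(root, "www.example.com", "missing.html", "gzip")

print(served_gzip, served_plain, served_missing)
```

When `pick_cached` returns None you fall through to the normal dynamic render and write the result to the cache on the way out.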

And before going off on any more of a tangent, here are some figures for you!

ab -c 100 -n 500 -g ./saiweb-nocache-nogzip.bpl http://www.saiweb.co.uk/

  • No caching
  • No Gzip

Server Hostname: www.saiweb.co.uk
Server Port: 80

Document Path: /
Document Length: 109086 bytes

Concurrency Level: 100
Time taken for tests: 123.304 seconds
Complete requests: 500
Failed requests: 0
Write errors: 0
Total transferred: 54831652 bytes
HTML transferred: 54692607 bytes
Requests per second: 4.06 [#/sec] (mean)
Time per request: 24660.828 [ms] (mean)
Time per request: 246.608 [ms] (mean, across all concurrent requests)
Transfer rate: 434.26 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 57 423 225.5 374 1837
Processing: 2331 20460 16701.2 17232 115192
Waiting: 270 1835 4155.8 576 38549
Total: 2656 20882 16648.1 17692 115421

Percentage of the requests served within a certain time (ms)
50% 17692
66% 20700
75% 24063
80% 25770
90% 35157
95% 53328
98% 82957
99% 101497
100% 115421 (longest request)

As the number of requests grew, the response time began to increase sharply and the overall performance of the site degraded. Bear in mind these benchmarks are being made over my home DSL for the time being.


ab -c 100 -n 500 -g ./saiweb-cached.bpl http://www.saiweb.co.uk/

Server Hostname: www.saiweb.co.uk
Server Port: 80

Document Path: /
Document Length: 109086 bytes

Concurrency Level: 100
Time taken for tests: 79.212 seconds
Complete requests: 500
Failed requests: 0
Write errors: 0
Total transferred: 54889292 bytes
HTML transferred: 54705058 bytes
Requests per second: 6.31 [#/sec] (mean)
Time per request: 15842.342 [ms] (mean)
Time per request: 158.423 [ms] (mean, across all concurrent requests)
Transfer rate: 676.70 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 56 314 112.5 322 1341
Processing: 2545 14721 5116.7 14296 36677
Waiting: 216 1283 2228.2 351 13776
Total: 2647 15035 5108.9 14624 36897

Percentage of the requests served within a certain time (ms)
50% 14624
66% 16675
75% 18058
80% 19093
90% 21608
95% 23489
98% 27684
99% 29972
100% 36897 (longest request)

A much more consistent line here. However, as you can clearly see, response times are roughly equal; this is due to my DSL connection. So let’s run these tests from somewhere with a little more bandwidth, say the webserver itself, using a loopback connection.


ab -c 100 -n 500 -g ./saiweb-cached.bpl http://www.saiweb.co.uk/

Server Hostname: www.saiweb.co.uk
Server Port: 80

Document Path: /
Document Length: 109086 bytes

Concurrency Level: 100
Time taken for tests: 0.262199 seconds
Complete requests: 500
Failed requests: 0
Write errors: 0
Total transferred: 54945406 bytes
HTML transferred: 54761172 bytes
Requests per second: 1906.95 [#/sec] (mean)
Time per request: 52.440 [ms] (mean)
Time per request: 0.524 [ms] (mean, across all concurrent requests)
Transfer rate: 204642.27 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 1 2.6 0 9
Processing: 4 45 10.3 49 58
Waiting: 1 38 9.9 41 50
Total: 9 47 9.5 50 64

Percentage of the requests served within a certain time (ms)
50% 50
66% 51
75% 52
80% 52
90% 54
95% 56
98% 59
99% 61
100% 64 (longest request)

In this case the response times rise and then plateau, after which no further degradation occurs.


ab -c 100 -n 500 -g ./saiweb-nocache.bpl http://www.saiweb.co.uk/

Server Hostname: www.saiweb.co.uk
Server Port: 80

Document Path: /
Document Length: 109086 bytes

Concurrency Level: 100
Time taken for tests: 8.919565 seconds
Complete requests: 500
Failed requests: 0
Write errors: 0
Total transferred: 54680788 bytes
HTML transferred: 54543000 bytes
Requests per second: 56.06 [#/sec] (mean)
Time per request: 1783.913 [ms] (mean)
Time per request: 17.839 [ms] (mean, across all concurrent requests)
Transfer rate: 5986.73 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 14 30.7 0 85
Processing: 246 1556 714.3 1365 6735
Waiting: 241 1539 707.8 1360 6731
Total: 250 1571 708.0 1368 6735

Percentage of the requests served within a certain time (ms)
50% 1368
66% 1451
75% 1550
80% 1700
90% 2658
95% 3121
98% 3491
99% 3638
100% 6735 (longest request)

Oh dear oh dear, let’s cut to the hard facts shall we?

We’ve gone from serving 1906.95 requests a second to 56.06

  • a 97.1% decrease in performance when removing caching
  • or a roughly 3300% increase in performance (about 34 times the throughput) when implementing caching

We’ve gone from a response time of ~50ms to ~2000ms

  • a 97.5% decrease in performance when removing caching
  • or a 3900% increase in performance (about 40 times faster) when caching is on
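
The percentages are easy to recompute from the raw ab figures. A small Python sketch follows; keep in mind that a percentage increase is (new - old) / old, which is not the same number as the plain ratio new / old. The function names are just for this illustration.

```python
def pct_decrease(before, after):
    """Percentage drop going from `before` to `after`."""
    return (before - after) / before * 100


def pct_increase(before, after):
    """Percentage gain going from `before` to `after`."""
    return (after - before) / before * 100


# Requests per second from the two loopback runs above.
cached_rps, uncached_rps = 1906.95, 56.06

print(round(pct_decrease(cached_rps, uncached_rps), 1))  # drop when caching is removed
print(round(pct_increase(uncached_rps, cached_rps), 1))  # gain when caching is added
print(round(cached_rps / uncached_rps, 1))               # plain speed-up factor
```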

Then there are the CPU and memory overheads to consider. In this case a more prolonged test is required to gather the relevant sar data. Let me tell you that intentionally trying to get a test like this to run over a 10 minute period with caching on is a lot harder than it sounds; the tests were in fact completing far too quickly …

The problem I face is making ab run for long enough against the cached site. I know for a fact that uncached the server will fail under the load, so I have no way at present of capturing this reliably.

What I can tell you is that this command: ab -c 300 -n 1000000 -g ./saiweb-cached.bpl http://www.saiweb.co.uk/

caused a load average of 2.96, 1.9, 0.93 cached, and got as high as 21 uncached before I killed it.

Now I am going to bring this post to an end as it is getting quite long. I plan to cover the following in a 2nd part.

  1. Opcode caching
  2. CPU & Memory usage, Cached vs. UNcached


This blog entry: http://rackerhacker.com/2010/08/25/a-nerds-perspective-on-cloud-hosting/ prompted me to write this post, after I realized I’d filled the comment field without ending my “monologue”. Anyway, I thought it would be better to voice my opinions here, to you lot who are daft enough to read this blog.

I think the main problem is that the term “cloud” has been massively over-marketed and has possibly long since lost its original meaning, with providers trying to jump on the marketing bandwagon.

I’ve not made the jump to “the Cloud” yet, as frankly I can’t see a benefit over properly configured HA installations. For example, I would much rather be using several pre-configured servers using RHCS to handle the migration of critical services (mySQL etc.).

I begin to see the benefits for large hosting providers, where customers want the power of a dedicated server but only pay for what they actually use; in this instance a provider ensures uptime through live migration.

Some other misconceptions through over-marketing I’d like to point out:

1) The “cloud” is not always on

Don’t get me wrong, it can be configured to be close, using distributed VMs for your critical services (i.e. apache), and coupling this with load balancing and clustering setups.

The misconception for most “end users” is that if you buy a single cloud instance, through magic/voodoo it will always be on 100% of the time!

Simply put, if the hardware it is running on dies, it will go down. Regardless of the live migration measures in place, there will be downtime: do not pass go, do not collect HTTP 200, go directly to > /dev/null

2) The “cloud” is not secure

If you insist on putting your 5 year old joomla website on a cloud VM, it can and will be compromised quickly. Security is only going to be as good as the configuration you have in place; you have mitigation measures such as

  • selinux
  • webapp updates/patches
  • fail2ban/banhosts packages

Whilst a VM in itself is largely seen as secure, as it protects the host machine should the VM become compromised, this is not always the case. For instance, there have been several occurrences of VMware ESXi servers allowing code execution on the host (long since patched, don’t panic!), allowing attackers who have compromised a VM on the cloud to root the host machine and, as a cascading effect, every other VM instance on the box.

Let me point out a worst case scenario here:

  1. Hypervisor running on Host A with 30 VMs
  2. Host A is part of a resilient set with live migration in place, Hosts B,C,D
  3. VM A’s 5 year old joomla app is subject to an XSS bug, and an attacker places the r57 shell on the webapp
  4. Attacker proceeds to deploy backdoors (i.e. meterpreter)
  5. VM A is subject to remote code execution on host
  6. Attacker compromises Host
  7. Host A is now root’ed
  8. Attacker forces Migration of VM A onto Host B
  9. Host B rooted using same method
  10. Rinse & repeat for C & D

In summary, if you are looking at a cloud solution and your web presence is important, make an informed decision and go with one of the larger providers, and NEVER EVER go with the cheapest option you could find, probably on ebay …

The cloud is not some magical being created by the hosting fairies that will take all your hosting and maintenance woes away. It may or may not be the right thing for your business / web app, and in certain instances it can lower TCO. I for one will be sticking with my cluster services and high availability designs for a while yet.


Welcome to part one of the ‘zen of secured shared hosting’ series.

In this part I will be covering the concepts of secured shared hosting, and why you as a shared hosting provider should be taking steps to ensure this is how you deploy your hosting environments.

Let’s first take a typical L.A.M.P setup:

PHP Compiled from source as apache module.
mySQL installed from RPM or update package (yum / up2date).
HTTPD installed as RPM or update package (yum / up2date).

Please note that, at the time of writing, if you install your PHP package via yum / apt-get / up2date you will have varying results when attempting to compile and install suPHP. As such, grab the source code from php.net and follow this series.

As a shared hosting provider, let’s say you have 5 clients all hosted on the one server. Each client uses vsftpd and is chrooted into their home directory, and their ssh access is disabled; supposedly secure enough.

Unfortunately not so. Due to the L.A.M.P configuration, the ‘apache’ user needs a minimum of read and execute permissions over all the PHP files on the system. Why is this a problem?

This is a problem largely due to the human nature of the client. Your ‘joe bloggs’ client doesn’t care about the technical aspects of web hosting or websites; they just want an easy, pretty interface to get their corner of the internet online, so they download something like drupal or joomla.

Now this isn’t a dig at open source CMS, this is an insight into human nature. Look at the changelog for any open CMS and you will see ‘security fixes’. Unfortunately all ‘joe bloggs’ cares about is that their website is working, and this is where things take a turn for the worse.

Joe Bloggs never updates his open CMS platform, meaning any vulnerabilities patched in subsequent releases are still exploitable on his website. The worst case scenario is that this is an XSSI (Cross Server Script Includes) vulnerability, more commonly known as remote file inclusion (RFI).

An attacker finds this website and identifies the security hole, using XSSI to install a PHP interactive shell, giving the attacker SSH-like access to the hosting environment. Most people at this point think: so the attacker has compromised one site … so what? We can restore that site from backups and it’s only one site that’s affected; the other 4 users either do not use an open CMS or are up to date with all the security patches.

Well, that’s where you would be wrong. With the hosting setup outlined above, the SSH-like PHP shell is now running as the apache user, meaning the attacker can go anywhere and read anything apache can, and that means reading things like database connection files. Suddenly all the clients on the hosting environment have their websites compromised, as the attacker gains mySQL access and starts changing content on the websites, despite the fact that the other 4 sites themselves were never exploited.
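
To get a feel for how exposed a box is, you can list the files that any local user (apache included) is able to read. The Python sketch below is a rough illustration only: it checks just the "other" read bit on regular files and ignores group membership, ACLs and directory traversal permissions. `world_readable` and the file names in the demo are made up for this example.

```python
import os
import stat
import tempfile


def world_readable(root):
    """Yield regular files under root that have the 'other' read bit set."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode
            if stat.S_ISREG(mode) and mode & stat.S_IROTH:
                yield path


# Demonstration in a throwaway directory standing in for a client's web root.
with tempfile.TemporaryDirectory() as root:
    for filename, mode in (("config.php", 0o644), ("private.php", 0o600)):
        path = os.path.join(root, filename)
        with open(path, "w") as handle:
            handle.write("<?php // database credentials would live here ?>")
        os.chmod(path, mode)
    found = sorted(os.path.basename(p) for p in world_readable(root))

print(found)
```

On a typical mod_php setup nearly every client's database config shows up in a listing like this, which is exactly the problem.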

One client’s error just became a cascading exploit on your hosting platform. Now make that a more realistic platform, say 30 clients on the box, some of which are online shops. The issue just became a whole lot bigger: there is lost revenue due to downtime of the shop sites, and worse still the attacker now has access to any customer details those shops were storing! But it’s not Joe Bloggs that’s accountable, it’s YOU as the hosting provider. You can take steps to prevent one exploited site becoming 30, and this web series will tell you how to do it.

coming in part 2:

an introduction to suPHP
compiling php as a cgi binary, and why you need to do so


So I have a concept for a 24 node cluster that I want to build to run folding@home and other cancer research projects.

The spec is to make it as “green” as possible: the lower the watts/GHz the better, whilst still maintaining performance.

The downside is the pricing of pico-ITX at the moment, I’m getting quotes for about £200 / node … and that’s without ram, or storage …

What I need is the following.

pico-ITX form factor motherboard with ~1GHz CPU, in quantities of 5 or more

So please leave a comment or use the contact me form if you think you can help, or have any information on suppliers.


Due to latency issues and the lack of multi-site support, I have ditched my old web host.

In favour of an all singing, all dancing NEW ONE! nativespace. Thus far I have had excellent ticket turnaround (all in 30 mins or less), and my initial sales enquiry (consisting of a lot of lengthy questions) was responded to in …. 6 minutes!

So thus far, definitely on my recommended list.
