Archive for the “Bash Script” Category

This came about while migrating a positively huge number of files and having to check the integrity of the transfer.

Build the manifest

find /path/to/folder -type f -print0 | xargs --null md5sum > /path/to/manifest
  • -type f: This flag tells find to return only regular files
  • -print0: This flag tells find to null-terminate each filename, which lets us handle filenames containing spaces
  • --null: This flag tells xargs to accept null-terminated strings
  • NOTE: PUT THE MANIFEST OUTSIDE THE FOLDER YOU ARE INDEXING!

Checking the manifest

md5sum --check /path/to/manifest | grep FAILED

The above will return all failed checks. If you want a simple count (maybe for automated reporting), just add | wc -l
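The build and check steps can be wrapped into two small functions. This is only a minimal sketch, assuming GNU findutils and coreutils; the function names build_manifest and count_failed are made up for illustration, and --quiet (a GNU md5sum option that suppresses the OK lines) is used here in place of the grep.

```shell
#!/bin/bash
# Hypothetical helpers wrapping the two commands above (names are illustrative).

# build_manifest DIR MANIFEST: hash every regular file under DIR.
build_manifest() {
    # -r (GNU xargs) skips running md5sum when find returns nothing
    find "$1" -type f -print0 | xargs --null -r md5sum > "$2"
}

# count_failed MANIFEST: print how many entries did not verify.
count_failed() {
    # --quiet prints only the entries that are not OK, one per line;
    # md5sum's summary warning goes to stderr, so it is discarded
    md5sum --check --quiet "$1" 2>/dev/null | wc -l
}
```

A count of 0 from count_failed means every file in the manifest verified, which makes it easy to drop into a cron job or a monitoring check.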

FAQ

How big is the manifest?

This depends entirely on the length of your file paths, so there is no way to estimate the manifest size without knowing them. Each manifest line consists of the 32-character MD5 hash, a two-character separator, the file path, and a newline.

So each line is always 32 + 2 + len(path) + 1 bytes, where len(path) is the length of the path in bytes (in UTF-8 a character takes between 1 and 4 bytes, and plain ASCII characters take 1).

The more deeply nested your sub directories are, the longer the paths and the larger the manifest will be.
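If you want a number before building the manifest, the per-line arithmetic can be summed over the file list. A rough sketch, assuming no filename contains a newline or a backslash (md5sum escapes those, which changes the line length); estimate_manifest_bytes is a name invented here.

```shell
#!/bin/bash
# estimate_manifest_bytes DIR: predicted size in bytes of an md5sum
# manifest for DIR (hypothetical helper, not part of any tool).
estimate_manifest_bytes() {
    # LC_ALL=C makes awk's length() count bytes rather than characters.
    # Each line costs 32 (hash) + 2 (separator) + path bytes + 1 (newline).
    find "$1" -type f | LC_ALL=C awk '{ total += 35 + length($0) } END { print total + 0 }'
}
```

Pass the same path you would give to find when building the manifest, since the stored paths are exactly what find prints.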

How long does the manifest take to build?

This depends on the number and size of the files you have to index, along with other factors such as network shares. In test runs, 2819 files were indexed in 1.493 seconds.


In one of those “oh ffs” moments I found myself writing a Bash script to quickly dump all databases on a MySQL server.

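#!/bin/bash
# Dump every database on a MySQL server to its own gzip-compressed SQL file.
MYSQL=$(which mysql)
MYSQLDUMP=$(which mysqldump)
GZIP=$(which gzip)
DEST="/path/to/dump/folder"

# Not named USER and PWD: bash already uses those for the login name and the working directory.
DB_USER="root"
DB_PASS="XXXXXX"

DBS=($("$MYSQL" -u "$DB_USER" -p"$DB_PASS" -Bse 'show databases'))

for db in "${DBS[@]}"
do
        # No backticks around the pipeline: it should simply run, not have its output executed
        "$MYSQLDUMP" --default-character-set=utf8 --set-charset -u "$DB_USER" -p"$DB_PASS" "$db" | "$GZIP" -9 > "$DEST/$db.sql.gz"
        echo "$db - DONE"
done

This script gets a list of all databases, dumps each one with UTF-8 encoding, and gzip-compresses the SQL file into the given “DEST” folder.

If you want to skip over certain databases, e.g. “mysql”, change this line:

DBS=($("$MYSQL" -u "$DB_USER" -p"$DB_PASS" -Bse 'show databases'))

To:

DBS=($("$MYSQL" -u "$DB_USER" -p"$DB_PASS" -Bse 'show databases' | grep -vx "database_to_exclude"))

Or for multiple exclusions:

DBS=($("$MYSQL" -u "$DB_USER" -p"$DB_PASS" -Bse 'show databases' | grep -vx -e "database_to_exclude" -e "another_database_to_exclude" -e "etc"))

The -x flag makes grep match the whole line, so excluding “mysql” does not also drop a database called “mysql_backup”.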

I may re-write this in Python, if I get time.
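Until that Python version turns up, the exclusion list can be kept tidy with a small filter function instead of a growing chain of greps. A sketch only: filter_dbs is a hypothetical name, and it assumes database names arrive one per line, as the -Bse output does.

```shell
#!/bin/bash
# filter_dbs NAME...: read database names on stdin (one per line) and
# print them minus the names given as arguments. Hypothetical helper.
filter_dbs() {
    if [ "$#" -eq 0 ]; then
        cat
        return
    fi
    # grep treats a newline-separated string as a list of patterns.
    # -F: fixed strings, -x: match the whole line, -v: print non-matches.
    grep -vxF -e "$(printf '%s\n' "$@")" || true
}
```

To use it, pipe the output of the 'show databases' command through something like filter_dbs mysql information_schema before it is captured into the DBS array.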


How to write a bash ‘hello world’ script in 60 seconds. Admittedly it could have been faster … damn typos.

Also, you can add the first line as an alias if you’re going to be writing a lot of bash scripts.

Or you can copy paste and have it done in about 5 seconds :-P

BPATH=`which bash`; echo "#! $BPATH" | awk '{print $1$2}' > script.sh

The reason for the echo and awk is that when I tried echo “#!$BPATH” > script.sh my shell wouldn’t cooperate: in an interactive bash session the ! inside double quotes triggers history expansion. All the awk does is take out the space :-) .
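An alternative that sidesteps the history expansion problem altogether is to keep the #! inside single quotes and let printf fill in the path. A minimal sketch; write_shebang is a name made up here, and command -v is used in place of which.

```shell
#!/bin/bash
# write_shebang FILE: start a new script with a shebang pointing at the
# local bash. Hypothetical helper for illustration.
write_shebang() {
    # Single quotes stop an interactive shell from treating "!" as a
    # history expansion; %s is filled in with the interpreter path.
    printf '#!%s\n' "$(command -v bash)" > "$1"
}
```

Calling write_shebang script.sh leaves you with a one-line file ready for the rest of the script.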


In part 4, I am going to cover what is more of an improvement to part 3 than anything else.

Part 3 itself is not incorrect: it correctly takes a memory footprint for each running process, the same as VIRT in top …

However, in processes such as Apache the VIRT memory includes the size of all shared libraries, as correctly shown by pmap …

So what does this really mean?

The memory usage is in fact made up of two parts, VIRT + RSS. RSS is the resident set size, a representation of the memory in use by the individual PID, while VIRT is largely shared between the child processes.

[buzz@buzz_srv ~]# ps aux | grep httpd | grep -v 'grep'
root     16378  0.0  0.1 148640  3024 ?        Ss   Nov13   0:00 /usr/sbin/httpd
apache   20088  0.0  0.1 148640  3304 ?        S    Nov13   0:00 /usr/sbin/httpd
apache   20101  0.0  0.1 148640  3304 ?        S    Nov13   0:00 /usr/sbin/httpd
apache   20756  0.0  0.1 148640  3312 ?        S    Nov13   0:00 /usr/sbin/httpd
apache   20759  0.0  0.1 148640  3300 ?        S    Nov13   0:00 /usr/sbin/httpd
apache   20790  0.0  0.1 148640  3284 ?        S    Nov13   0:00 /usr/sbin/httpd
apache   20792  0.0  0.1 148640  3312 ?        S    Nov13   0:00 /usr/sbin/httpd
apache   20798  0.0  0.1 148640  3308 ?        S    Nov13   0:00 /usr/sbin/httpd
apache   20804  0.0  0.1 148640  3308 ?        S    Nov13   0:00 /usr/sbin/httpd
apache   20886  0.0  0.1 148640  3304 ?        S    Nov13   0:00 /usr/sbin/httpd
apache   20906  0.0  0.1 148640  3300 ?        S    Nov13   0:00 /usr/sbin/httpd
apache   20907  0.0  0.1 148640  3308 ?        S    Nov13   0:00 /usr/sbin/httpd
apache   20912  0.0  0.1 148640  3304 ?        S    Nov13   0:00 /usr/sbin/httpd
apache   20915  0.0  0.1 148640  3312 ?        S    Nov13   0:00 /usr/sbin/httpd
apache   20959  0.0  0.1 148640  3304 ?        S    Nov13   0:00 /usr/sbin/httpd
apache   20969  0.0  0.1 148640  3300 ?        S    Nov13   0:00 /usr/sbin/httpd
apache   20994  0.0  0.1 148640  3320 ?        S    Nov13   0:00 /usr/sbin/httpd
apache   20995  0.0  0.1 148640  3288 ?        S    Nov13   0:00 /usr/sbin/httpd
apache   20996  0.0  0.1 148640  3320 ?        S    Nov13   0:00 /usr/sbin/httpd
apache   20997  0.0  0.1 148640  3320 ?        S    Nov13   0:00 /usr/sbin/httpd
apache   20999  0.0  0.1 148640  3296 ?        S    Nov13   0:00 /usr/sbin/httpd

As can be seen above, the ‘VIRT’ figure (column 5, labelled VSZ by ps) does not change between the child processes, whereas the RSS (column 6) does, depending on what each process is doing at that time.

So below is an improved appmem function to allow for this:

function appmem {
        if [ -z "$1" ]; then
                echo "Usage: sysadmin appmem app_name i.e. (sysadmin appmem apache)";
        else
                # ps aux columns: $5 = VSZ (VIRT) in KB, $6 = RSS in KB
                RRES=(`ps aux | grep "$1" | grep -v 'grep' | grep -v "$0" | awk '{print $6}'`);
                VRES=(`ps aux | grep "$1" | grep -v 'grep' | grep -v "$0" | awk '{print $5}'`);
                COUNT=0;
                VMEM=0;
                RMEM=0;
                for RSS in "${RRES[@]}"
                do
                        RMEM=$(($RSS+$RMEM));
                done;
                for VIRT in "${VRES[@]}"
                do
                        VMEM=$(($VIRT+$VMEM));
                        COUNT=$(($COUNT+1));
                done;
                # Nothing matched: bail out before dividing by zero below
                if [ "$COUNT" -eq 0 ]; then
                        echo "No processes found matching '$1'";
                        return 1;
                fi
                # VIRT is shared between the children, so report the average rather than the sum
                VMEM=$(($VMEM/$COUNT));
                VMEM=$(($VMEM/1024));
                RMEM=$(($RMEM/1024));
                echo -e "$YELLOW ----- MEMORY USAGE REPORT FOR '$1' ----- $CLEAR";
                echo "PID Count: $COUNT";
                echo "Shared Mem usage: $VMEM MB";
                echo "Total Resident Set Size: $RMEM MB";
                echo "Mem/PID: $(($RMEM/$COUNT)) MB";
        fi
}

Example output:

 ----- MEMORY USAGE REPORT FOR 'httpd' -----
PID Count: 41
Shared Mem usage: 140 MB
Total Resident Set Size: 95 MB
Mem/PID: 2 MB
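
The same totals can also be worked out in a single awk pass over the ps output, which avoids calling ps twice. This is just a sketch of the idea rather than a replacement for appmem; summarise_ps is an invented name, and it assumes the standard ps aux layout with VSZ in column 5 and RSS in column 6, both in KB.

```shell
#!/bin/bash
# summarise_ps: read "ps aux" lines on stdin and print the process count,
# the average VSZ in MB and the total RSS in MB. Hypothetical helper.
summarise_ps() {
    awk '
        { vsz += $5; rss += $6; n++ }
        END {
            if (n == 0) { print "no processes"; exit 1 }
            # %d truncates, matching the integer maths in the shell version
            printf "count=%d avg_vsz_mb=%d total_rss_mb=%d\n", n, vsz / n / 1024, rss / 1024
        }'
}
```

Something like ps aux | grep '[h]ttpd' | summarise_ps would feed it; the brackets stop grep from matching its own entry in the process list.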

PART 3 IS INACCURATE. THE SCRIPT BELOW IS FOR REFERENCE ONLY; IT HAS BEEN REPLACED IN PART 4.

In part 3, I am going to cover a bash function that will allow you to profile the memory usage of any application by name.

By adding the function below into your script you can execute a command such as: sysadmin appmem apache

function appmem {
        if [ -z "$1" ]; then
                echo "Usage: sysadmin appmem app_name i.e. (sysadmin appmem apache)";
        else
                if [ -x '/usr/bin/pmap' ]; then
                        APID=(`ps aux | grep "$1" | grep -v 'grep' | grep -v "$0" | awk '{print $2}'`);
                        COUNT=0;
                        AMEM=0
                        for PID in "${APID[@]}"
                        do
                                # The "total" line of pmap -x gives the mapped size in KB in column 3
                                TMP=$((`pmap -x $PID | grep "total" | awk '{print $3}'`));
                                AMEM=$(($AMEM+$TMP));
                                COUNT=$(($COUNT+1));
                        done
                        # Nothing matched: bail out before dividing by zero below
                        if [ "$COUNT" -eq 0 ]; then
                                echo "No processes found matching '$1'";
                                return 1;
                        fi
                        AMEM=$(($AMEM/1024));
                        echo -e "$YELLOW ----- MEMORY USAGE REPORT FOR '$1' ----- $CLEAR";
                        echo "PID Count: $COUNT";
                        echo "Mem usage: $AMEM MB";
                        echo "Mem/PID: $(($AMEM/$COUNT)) MB";
                        echo -e "$RED"
                        echo -e "For more information run: pmap -x $PID $CLEAR";
                else
                        echo 'Could not execute /usr/bin/pmap ... aborting';
                        exit;
                fi
        fi
}

Sample output:

----- MEMORY USAGE REPORT FOR 'apache' -----
PID Count: 6
Mem usage: 1134 MB
Mem/PID: 189 MB

For more information run: pmap -x 123456

You can of course replace ‘apache’ with the application or daemon name you want to profile the memory usage of.

This script requires pmap to be installed; if the script cannot find it, it will abort.

As always any problems, post a comment.

UPDATE: Apparently I need to point out that if you haven’t read PART 2, the colored output will not work … That’s why this entry is titled part 3; it does assume a degree of competence on your part in realizing parts 1 and 2 may just be required reading …

NOTE: The above provides a complete memory footprint of the individual PID, the same as VIRT in top.

VIRT — Virtual Image (kb)
* The total amount of virtual memory used by the task. It includes all code, data and shared libraries plus pages that have been swapped out.
* VIRT = SWAP + RES


Part 2 has finally arrived …. don’t all cheer at once now …

In part two I will cover how to run an IP range scan using a bash script and, if the host can be pinged, retrieve the MAC address of the connected host.

Now bear in mind this script was written to run on a Mac running OS X Leopard.

#!/bin/bash
#colours
function colours {
        CLEAR='\e[00m';
        GREEN='\e[0;32m';
        RED='\e[0;31m';
        YELLOW='\e[1;33m';
}
#ipscan
function ipscan {
        IPS_START=1;
        IPS_END=254;
        IPS_RANGE=192.168.1.
        echo "Now running IPSCAN $IPS_RANGE$IPS_START - $IPS_RANGE$IPS_END"
        for ((i=$IPS_START;i<=$IPS_END;i+=1)); do
                RESULT=`ping -c 1 -t 1 $IPS_RANGE$i | grep "bytes from"`;
                if [ -z "$RESULT" ]; then
                        echo -e "$IPS_RANGE$i:$RED DEAD $CLEAR";
                        # If you comment out the above to report just the alive hosts, bash gets a bit funny about not processing anything here, so uncomment the below to keep it happy
                        #holder=$i;
                else
                        MAC=`arp $IPS_RANGE$i | awk '{ print $4 }';`;
                        echo -e "$IPS_RANGE$i:$GREEN ALIVE $CLEAR ($MAC)";
                fi
        done
}
colours;
$1 $2

To make this work on your Linux distro, replace -t in the ping command with -W (on Linux -t sets the TTL rather than the timeout), check the awk field used on the arp output, and change the IP range to that of your network. A display of (no) means that no ARP entry could be found for the host.
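As a rough idea of what the Linux side of the arp parsing might look like: the sketch below assumes the net-tools arp layout (a header line, then Address, HWtype and HWaddress columns) and only recognises entries whose HWtype is ether. The name mac_from_arp is invented for this example, so check it against the output on your own distro.

```shell
#!/bin/bash
# mac_from_arp: read Linux (net-tools) "arp <host>" output on stdin and
# print the hardware address, or "no entry" when none is listed.
# Hypothetical helper; the column positions are an assumption.
mac_from_arp() {
    awk '$2 == "ether" { mac = $3 } END { print (mac == "" ? "no entry" : mac) }'
}

# Intended use on Linux (not run here):
#   ping -c 1 -W 1 192.168.1.10 > /dev/null && arp 192.168.1.10 | mac_from_arp
```

Keeping the parsing in its own function means the ipscan loop only needs one line changed when moving between OS X and Linux.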

That’s it for this part. Dump this in a file, chmod +x as usual, and run with ./script.sh ipscan.


Prompted by the following remarks today …

Kerm: “;) there is always an abbreviation in the CLI as all sysadmins are lazy feckers”

Kerm: “Someone might think you actually do work occasionally, god forbid!”

Sysadmins are NOT inherently lazy, we just know how to save time, and are quite adept at doing so …ok?

You cheeky sods!

So let me clear up one instance in which I take a lot of information, and make it quickly and easily accessible using a “Lazy feckers” abbreviation …

Be warned this is a very jaded write up, read on at your own peril.

Right then, onto the point of this post: the sysadmin script part 1. This is going to cover how to check how many connections to a specific port you have on your server.

Trust me this becomes very useful when you have exhausted all other options when trying to figure out why your web server is running like a dog with no legs …

netstat -ant

After running the above in your SSH session you will see lines, and lines … and yet more lines of network connection information, especially if you run this on a busy server.

Example (colours added):

tcp 0 0 ***.***.***.***:25 ***.***.***.***:32794 ESTABLISHED

Key:

PROTOCOL Recv-Q Send-Q LOCAL_HOST:PORT FOREIGN_HOST:PORT CONNECTION_STATE

From this information it’s pretty easy to spot this is an inbound SMTP connection.

(If you can’t see why, don’t worry it’s ok maybe it’s genetic)

Now this may be handy, but other than taking all this information and dumping it into a spreadsheet (god knows you love those spreadsheets !!! ), how are you going to figure out how many connections are occurring from that external host?

How in fact are you going to be able to easily see how many total connections to that port you have ?!?!

Bash script, now for some history, Bash is the Bourne Again Shell, or as I like to think of it, it is the verb for what I will do to your head if you ask me what BASH / SSH / Shell is again …

Now create a directory:

mkdir ~/.sysadmin
cd ~/.sysadmin

Note the leading dot: this creates a “hidden” directory in your home directory (~). The reason for this is so you don’t have system admin scripts sat in your home directory; if you are like me, all sorts of crap moves in and out of that directory on a daily basis, and the last thing you want is to have to rummage through backups trying to find “that script you wrote to diagnose connection problems a year ago”.

The point is these scripts will become part of your workflow, once written they will rarely need updating, and should never be called directly, (I mean we’re lazy right? WTH do we want to be typing the full script path for? … oh yeh it saves time!).

In this case:

vi ~/.sysadmin/buzz.sh

You can of course call your script whatever you want, and use any text editor you want, if you don’t like / know vi …

#!/bin/bash
# Sysadmin script PART 1 http://www.saiweb.co.uk
# Provided under the MIT license (http://www.opensource.org/licenses/mit-license.php)
# © D.Busby
function usage {
        echo "Usage: portcon port";
        echo "i.e. portcon 80";
}
function portcon {
        echo "----- Active Connections For Port $1 -----";
        netstat -ant | grep "ABC.DEF.HIJ.KLM:$1 " | wc -l
        netstat -ant | grep "ABC.DEF.HIJ.KLM:$1 " | awk '{ print $5 }'  | awk -F \: '{ print $1  }' | sort | uniq -c  | sort -n
}
if [ -z "$1" ]; then
        usage;
        exit
fi
$1 $2

Ok so the above code provides two functions, usage and portcon.

MAKE SURE YOU REPLACE “ABC.DEF.HIJ.KLM” WITH YOUR LOCAL IP ADDRESS
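
If you would rather not hard-code the address at all, the filtering can be done on the port alone. The following is a sketch of that variation, not the script used in this post: count_by_remote is a made-up name, it reads netstat -ant output on stdin, and it assumes the usual column order (local address in column 4, foreign address in column 5, state in column 6).

```shell
#!/bin/bash
# count_by_remote PORT: read "netstat -ant" output on stdin and print a
# per-remote-host connection count for the given local port.
# Hypothetical helper; matches any local address, not one specific IP.
count_by_remote() {
    awk -v port="$1" '
        # keep TCP rows whose local address ends in :PORT, skipping listeners
        $1 ~ /^tcp/ && $6 != "LISTEN" && $4 ~ (":" port "$") {
            host = $5
            sub(/:[0-9]+$/, "", host)   # strip the remote port
            print host
        }' | sort | uniq -c | sort -n
}
```

Run as netstat -ant | count_by_remote 80, it gives the same kind of per-host breakdown as portcon, counted across every local address on the box.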

CHMOD this file to allow execution.

chmod +x ~/.sysadmin/buzz.sh

Now edit your bashrc file.

vi ~/.bashrc

And add the following:

alias buzz='~/.sysadmin/buzz.sh'

Now exit (logout) your SSH session and log back in (or SU root > SU your_user for testing).

[buzz@buzz_srv ~]$ buzz
Usage: portcon port
i.e. portcon 80
[buzz@buzz_srv ~]$

Now run the portcon check …

[buzz@buzz_srv ~]$ buzz portcon 80
----- Active Connections For Port 80 -----
505
1 ***.***.***.***
3 ***.***.***.***
3 ***.***.***.***
4 ***.***.***.***
4 ***.***.***.***
5 ***.***.***.***
11 ***.***.***.***
14 ***.***.***.***
16 ***.***.***.***
76 ***.***.***.***
373 ***.***.***.***

(Yes before you ask ***.***.***.*** does display the correct IP address, I have purposely removed them for security).

So, I have taken something that would have meant netstat output > spreadsheet > formulas > at an estimate 30 mins of analysis each time, and turned it into something that takes less than 5 seconds to type and gives the relevant output, for roughly the same initial effort (30 mins scripting time).

You could argue you can keep a spreadsheet pre-setup with the right formulas / pivot tables and just dump the data each time. Well yes you could, but that’s nowhere near as quick as this …

And no trying to convince me it is as quick and better than the script above, for

  1. You have to wait for excel to open the spreadsheet
  2. You have to copy paste the data
  3. You have to wait for excel to process the formulas

If you have a machine that can do that in time equal to or less than the time it takes the script above to output the data, the only thing I have to say is, stop spending such a budget on desktops and get a better server.

Final Thoughts:

This write up is in jest, and is intended to be read as such, the code and methods provided above are factual. etc …


Creative Commons License