Tuesday, December 14, 2010

Set up Link Aggregation for fault tolerance and build fat network pipes from multiple network interfaces without any trunking software.

So anyway, one of the things I'd been wanting to do was figure out how to bond multiple network interfaces to give me some resilience on the net connection. Turns out it's pretty easy in Solaris using 'dladm' and ifconfig. Solaris 10 introduced the nemo framework for drivers and Solaris Nevada has more projects which build on said framework. In Update 2 of Solaris 10 support for data link aggregation was added.

From the manual for dladm

The dladm command is used to configure data-links. A configured data-link is represented in the system as a STREAMS DLPI (v2) interface which may be plumbed under protocol stacks such as TCP/IP. Each data-link relies on either a single network device or an aggregation of devices to send packets to or receive packets from a network.

The instructions here are correct for a server with at least 4 e1000 interfaces, where the first one was configured during the install as the network interface. You'll also want to do this from the console, because the first thing we're going to do is take down the network interface:

harrysolvm# pfexec ifconfig e1000g0 unplumb

harrysolvm# dladm show-dev

e1000g0         link: up        speed: 1000  Mbps       duplex: full
e1000g1         link: up        speed: 1000  Mbps       duplex: full
e1000g2         link: up        speed: 1000  Mbps       duplex: full
e1000g3         link: up        speed: 1000  Mbps       duplex: full

Now we know what devices are available for aggregation. Let's use `dladm` to bond two of the network interfaces together and bring the bonded interface back up.

harrysolvm# pfexec dladm create-aggr -d e1000g0 -d e1000g3 1

harrysolvm# pfexec dladm show-aggr 1

key: 1 (0x0001) policy: L4      address: 0:14:4f:1:c8:b0 (auto)
device       address            speed         duplex  link    state
e1000g0      0:14:4f:1:c8:b0    1000  Mbps    full    up      standby
e1000g3      0:14:4f:1:c8:b3    1000  Mbps    full    up      standby

Aggregation completes and the links are in standby mode; next we need to plumb it.

harrysolvm# pfexec ifconfig aggr1 plumb

Now we give it an IP address:

harrysolvm# pfexec ifconfig aggr1 <ip-address> netmask <netmask> up

It's just a regular ifconfig to set up the link (substitute your own address and netmask). Let's check the device state now.

harrysolvm# pfexec dladm show-aggr 1

key: 1 (0x0001) policy: L4      address: 0:14:4f:1:c8:b0 (auto)
device       address            speed         duplex  link    state
e1000g0      0:14:4f:1:c8:b0    1000  Mbps    full    up      attached
e1000g3      0:14:4f:1:c8:b3    1000  Mbps    full    up      attached

We can add more NICs to the aggregation while the device is up and running.

harrysolvm# pfexec dladm add-aggr -d e1000g1 -d e1000g2 1

harrysolvm# pfexec dladm show-aggr 1

key: 1 (0x0001) policy: L4      address: 0:14:4f:1:c8:b0 (auto)
device       address            speed         duplex  link    state
e1000g0      0:14:4f:1:c8:b0    1000  Mbps    full    up      attached
e1000g3      0:14:4f:1:c8:b3    1000  Mbps    full    up      attached
e1000g1      0:14:4f:1:c8:b1    1000  Mbps    full    up      attached
e1000g2      0:14:4f:1:c8:b2    1000  Mbps    full    up      attached

Now let's show off and remove a NIC from the link:

harrysolvm# pfexec dladm remove-aggr -d e1000g0 1

harrysolvm# pfexec dladm show-aggr 1

key: 1 (0x0001) policy: L4      address: 0:14:4f:1:c8:b3 (auto)
device       address            speed         duplex  link    state
e1000g3      0:14:4f:1:c8:b3    1000  Mbps    full    up      attached
e1000g1      0:14:4f:1:c8:b1    1000  Mbps    full    up      attached
e1000g2      0:14:4f:1:c8:b2    1000  Mbps    full    up      attached

There's one last thing you'll want to do:

harrysolvm# pfexec mv /etc/hostname.{e1000g0,aggr1}

The changes made with `ifconfig` don't automatically persist. At boot, Solaris uses a combination of `/etc/hostname.<interface>`, `/etc/hosts` and `/etc/inet/netmasks` to infer the IP address of the interface, so there's no need to go putting the information in `/etc/network/interfaces` or some file in `/etc/sysconfig` as you would on Linux.
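For reference, here's roughly what the persistent configuration could look like after the rename above. This is a sketch: the hostname, address and netmask are made-up examples, not values from the session above.

```
# /etc/hostname.aggr1 -- the hostname (or IP) to plumb on aggr1 at boot
harrysolvm

# /etc/inet/netmasks -- network/netmask pair for the subnet
192.168.1.0     255.255.255.0

# /etc/hosts -- maps the hostname to its address
192.168.1.10    harrysolvm
```

At boot the startup scripts find `/etc/hostname.aggr1`, plumb aggr1, look the name up in `/etc/hosts`, and apply the matching netmask from `/etc/inet/netmasks`.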

Wednesday, November 24, 2010

vol copy

This is effectively the same as a vol clone, but you need to do the entire operation before the volume is online and available. You need to create the destination volume first and then restrict it so that it is ready for the copy. Then you start the copy process.

# vol copy start -s snap_name source_vol dest_vol

“-s snap_name” defines the snapshot you want to base the copy on, and “source_vol” and “dest_vol” define the source and destination for the copy. “-S” can also be used to copy across all the snapshots that are also included in the volume. This can be very useful if you need to copy all backups within a volume as well as just the volume data.

# vol copy start [ -S | -s snapshot ] source destination

Copies all data, including snapshots, from one volume to another. If the -S flag is used, the command copies all snapshots in the source volume to the destination volume. To specify a particular snapshot to copy, use the -s flag followed by the name of the snapshot. If neither the -S nor -s flag is used in the command, the filer automatically creates a distinctively-named snapshot at the time the vol copy start command is executed and copies only that snapshot to the destination volume.

The source and destination volumes must either both be traditional volumes or both be flexible volumes. The vol copy command will abort if an attempt is made to copy between different volume types.

The source and destination volumes can be on the same filer or on different filers. If the source or destination volume is on a filer other than the one on which the vol copy start command was entered, specify the volume name in the filer_name:volume_name format.

The filers involved in a volume copy must meet the following requirements for the vol copy start command to be completed successfully:

The source volume must be online and the destination volume must be offline.

If data is copied between two filers, each filer must be defined as a trusted host of the other filer. That is, the filer’s name must be in the /etc/hosts.equiv file of the other filer. If one filer is not in the /etc/hosts.equiv file of the other, a "Permission denied" error message is displayed to the user.

If data is copied on the same filer, localhost must be included in the filer’s /etc/hosts.equiv file. Also, the loopback address must be in the filer’s /etc/hosts file. Otherwise, the filer cannot send packets to itself through the loopback address when trying to copy data.

The usable disk space of the destination volume must be greater than or equal to the usable disk space of the source volume. Use the df pathname command to see the amount of usable disk space of a particular volume.

Each vol copy start command generates two volume copy operations: one for reading data from the source volume and one for writing data to the destination volume. Each filer supports up to four simultaneous volume copy operations.
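Putting the requirements above together, a full copy session might look like the following console sketch. The volume names, aggregate and size are made-up examples; the destination is created and restricted first, per the requirements above:

```
# vol create dest_vol aggr0 100g        (destination at least as large as the source)
# vol restrict dest_vol                 (the destination must be offline/restricted)
# vol copy start -S source_vol dest_vol
# vol copy status                       (watch the two copy operations)
# vol online dest_vol                   (bring the copy online once finished)
```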

# vol copy status [ operation_number]

Displays the progress of one or all active volume copy operations, if any. The operations are numbered from 0 through 3. If no operation_number is specified, then status for all active vol copy operations is provided.

# vol copy throttle [ operation_number ] value

This command controls the performance of the volume copy operation. The value ranges from 10 (full speed) down to 1 (one-tenth of full speed). The default value is maintained in the filer’s vol.copy.throttle option and is set to 10 (full speed) at the factory. The performance value can be applied to an operation specified by the operation_number parameter. If an operation number is not specified, the command applies to all active volume copy operations.

Use this command to limit the speed of volume copy operations if they are suspected to be causing performance problems on a filer. In particular, the throttle is designed to help limit the volume copy’s CPU usage. It cannot be used to fine-tune network bandwidth consumption patterns.

The vol copy throttle command only enables the speed of a volume copy operation that is in progress to be set. To set the default volume copy speed to be used by future volume copy operations, use the options command to set the vol.copy.throttle option.

How to clear the cache from memory

Linux has a supposedly good memory management feature that will use up any "extra" RAM you have to cache stuff. This section of the memory being used is SUPPOSED to be freely available to be taken over by any other process that actually needs it, but unfortunately Linux thinks that cache memory is too important to move over for anything else that actually needs it.

I noticed that whenever the server is booted, everything runs great. But as soon as it fills up with cache, performance degrades. It's terrible..

Up until just now, I have been forced to restart every time this happens because I simply cannot get any work done while the machine is in this unresponsive state. I can close every single program I'm running - and even then, simply right-clicking would require some extended thinking before loading the context menu. Ridiculous.

What consumes System Memory?

The kernel - The kernel will consume a couple of MB of memory. The memory that the kernel consumes can not be swapped out to disk. This memory is not reported by commands such as "free" or "ps".

Running programs - Programs that have been executed will consume memory while they run.

Memory Buffers - The amount of memory used is managed by the kernel. You can get the amount with "free".

Memory Cached - The amount of memory used is managed by the kernel. You can get the amount with "free".

Cached memory is freed on demand by the kernel.... it's done this way to make your system more responsive. Trust Linus Torvalds, he knows what he's doing.

There are a few ways to check memory usage in Linux. Using commands like free, vmstat and ps you can check your Linux memory usage:

# free

The free command displays the total amount of free and used physical and swap memory in the system, as well as the buffers used by the kernel (run free -m to get the figures in megabytes).

             total       used       free     shared    buffers     cached
Mem:       1033612     354636     678976          0        240     201508
-/+ buffers/cache:     152888     880724
Swap:      1967952          0    1967952

You can also ask free to display results every 5 seconds, in order to track increases and decreases in memory usage.

# free -s 5
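The "-/+ buffers/cache" line is the number that actually matters. If free isn't handy, you can compute roughly the same figure straight from /proc/meminfo. A minimal sketch (Linux-only; values are in kB, and Cached can include a little non-reclaimable tmpfs data):

```shell
#!/bin/sh
# Memory that is genuinely available: MemFree plus the reclaimable
# buffer and page-cache pages, summed from /proc/meminfo (kB).
mem_available_kb() {
    awk '/^(MemFree|Buffers|Cached):/ { sum += $2 } END { print sum }' /proc/meminfo
}
mem_available_kb
```

The result should land close to the "free" column of free's "-/+ buffers/cache" line.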

# vmstat

The vmstat command reports information about processes, memory, paging, block IO, traps, and cpu activity.


procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
1 0 0 131620 35432 341496 0 0 42 82 737 1364 15 3 81 1

# ps

The "ps" command is a C program that reads the "/proc" filesystem.

There are two fields that are useful when determining per-process memory usage.

They are:
a. RSS (resident set size)
b. VSZ (virtual memory size; the vsize column in ps)

Per Process Memory Usage

The inputs to this section were obtained with the command:

# ps -eo pid,ppid,rss,vsize,pcpu,pmem,cmd -ww --sort=pid

Luckily, I found a way to clear out the cache being used. Simply run the following command as root and the cache will be cleared out.

Linux Command :

# sync; echo 3 > /proc/sys/vm/drop_caches

The main point "Don't Panic! Your ram is fine!" is absolutely correct.

The Linux disk cache is very unobtrusive. It uses spare memory to greatly increase disk access speeds, without taking any memory away from applications. A fully used store of RAM on Linux is efficient hardware use, not a warning sign.

Tuesday, November 16, 2010

Creating a FlexClone volume in OnTap 7.3

Flexible volumes let users do some really cool things. One of my favorites involves cloning. We can take an existing FlexVol volume and create a clone from it, either using the volume itself (right now) or using an existing Snapshot copy from some point in the past (like before your server blew up). A FlexClone volume looks exactly like the volume it was created from (its parent FlexVol volume or another FlexClone volume), but it uses no additional physical storage!

This code shows you how easy it is to create a FlexClone volume by:

1) Cloning a flexible volume and
2) Splitting a FlexClone volume from the parent volume.

1) Clone a flexible volume

Display list of snapshots in vol1
FAS1> snap list vol1

Create a clone volume named newvol using the nightly.1 snapshot in vol1
FAS1> vol clone create newvol -b vol1 nightly.1

verify newvol was created
FAS1> vol status -v newvol

Look for snapshots listed as busy, vclone. These are shared with flexclones of vol1 and should not be deleted or the clone will grow to full size
FAS1> snap list vol1

Display space consumed by new and changed data in the flexclone volume.
FAS1> df -m newvol

2) Split a FlexClone volume from the parent volume.

Determine amount of space required to split newvol from its parent flexvol.
FAS1> vol clone split estimate newvol

Display space available in the aggregate containing the parent volume (vol1)
FAS1> df -A

Begin splitting newvol from its parent volume (vol1)
FAS1> vol clone split start newvol

Check the status of the splitting operation
FAS1> vol clone split status newvol

Halt split process. NOTE: All data copied to this point remains duplicated and snapshots of the FlexClone volume are deleted.
FAS1> vol clone split stop newvol

Verify newvol has been split from its parent volume
FAS1> vol clone status -v newvol

Wednesday, November 10, 2010

How to clear NFS locks during network crash or outage for Oracle datafiles.

Symptoms :

Database cannot be opened because old locks still exist on the filer stored data files.

Error : ORA-27086

Cause of this problem :

Data files that were open during the network episode were left in an NFS-locked state. Oracle cannot open the locked files.

The lock recovery manager (NLM) in the Linux kernel uses uname -n to determine the host name, while the rpc.statd process (NSM) uses gethostbyname() to determine the client's name. If these do not match, the recovery process will not work.

Solution :

The recovery steps detailed here should be taken in case Oracle is hung, but PROPER ORACLE TROUBLESHOOTING AND SUPPORT METHODS SHOULD BE FOLLOWED FOR ANY DATABASE-HANG ISSUES, independently of NetApp.

Oracle's database product does not typically hang after a network crash.

Summary of Corrective steps :

1) Shutdown Oracle databases

2) Unmount database volumes

3) Kill lockd/statd processes on UNIX host

4) Clear locks on filer

5) Remove the NFS lock files on the host.

6) Restart lockd/statd processes on UNIX host

7) Remount the database volumes on the UNIX host

8) Restart databases

Detailed Procedure:

1) Shutdown all Oracle databases being run by the affected server.

Issue the Oracle shutdown immediate command and verify that no database processes are still running by issuing the UNIX command ps -ef |grep -i ora on the UNIX database host.

If database processes are still running issue the Oracle shutdown abort command and use the UNIX command ps -ef | grep -i ora to verify that no database processes are still running.

If database processes are still running do the following from the UNIX command line:
ps -ef | grep ora to get process id's (pid's) of remaining Oracle processes

kill -9 pid for each remaining Oracle process.

2) Unmount all database volumes using the UNIX umount command.

3) Kill statd and lockd processes on the UNIX host in the order specified below:
Determine the process id's (pid's) of statd and lockd from the UNIX command line:

ps -ef |grep lockd

ps -ef |grep statd

kill [lockd_process_id]

kill [statd_process_id]

4) Remove locks from filer

Execute the following from the filer command line:

filer> priv set advanced

filer> sm_mon -l (In many cases specifying the host name does not clear all the affecting locks, so the recommendation is to NOT specify a hostname)

Delete all files in the filer's "/etc/sm" directory. (Remove the files only. Do NOT remove the "/etc/sm" directory itself.)

If the filer is running Data ONTAP 7.1 or higher run 'lock break -h [hostname]' to release any locks that still exist.

If the 'lock break -h [hostname]' doesn't work, ensure that the server name that you are entering is not the same as the one that the filer has.

If the locks are not cleared, run 'lock break -p nlm' (This also requires Data ONTAP 7.1 or higher). This will clear all the NFS locks on the filer. This will not sever any NFS connections, it will simply force the processes to re-request the locks for the files they are writing to.

5) Remove the NFS lock files on the host.

From TR-3183 - Using the Linux NFS Client with Network Appliance Storage,
rpc.statd uses gethostbyname() to determine the client's name, but lockd (in the Linux kernel) uses uname -n.
By setting HOSTNAME= to the fully qualified domain name, lockd will use an FQDN when contacting the storage. If there is a lnx_node1.iop.eng.netapp.com and also a lnx_node5.ppe.iop.eng.netapp.com contacting the same NetApp storage, the storage will be able to correctly distinguish the locks owned by each client. Therefore, we recommend using the fully qualified name in /etc/sysconfig/network. In addition to this, sm_mon -l or lock break on the storage will also clear the locks on the storage, which will fix the lock recovery problem.

Additionally, if the client's nodename is fully qualified (that is, it contains the hostname and the domain name spelled out), then rpc.statd must also use a fully qualified name. Likewise, if the nodename is unqualified, then rpc.statd must use an unqualified name. If the two values do not match, lock recovery will not work. Be sure the result of gethostbyname(3) matches the output of uname -n by adjusting your client's nodename in /etc/hosts, DNS, or your NIS databases.
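A quick way to check for the mismatch described above is to compare uname -n with the canonical name the resolver returns for it. This sketch assumes getent is available (it is on any glibc-based Linux):

```shell
#!/bin/sh
# Compare the nodename lockd uses (uname -n) with the canonical name
# the resolver returns, which is what rpc.statd sees via gethostbyname().
node=$(uname -n)
resolved=$(getent hosts "$node" | awk '{ print $2; exit }')
if [ "$node" = "$resolved" ]; then
    echo "OK: nodename and resolved name match ($node)"
else
    echo "MISMATCH: uname -n says '$node', resolver says '$resolved'"
fi
```

If this prints MISMATCH, fix /etc/hosts, DNS, or NIS (or the nodename itself) before expecting lock recovery to work.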

6) Start the statd and lockd processes on the UNIX host. The exact commands vary by platform; on Solaris, for example, start /usr/lib/nfs/statd followed by /usr/lib/nfs/lockd.



7) Mount the database volumes on the UNIX host.

8) Start the database(s) and test for availability.

Monday, November 08, 2010

How to send email from the Linux command line using mail and mutt.

The Linux command line can be very powerful once you know how to use it. You can parse data, monitor processes, and do a lot of other useful and cool things using it. We will begin with the “mail” command.


First run a quick test to make sure the “sendmail” application is installed and working correctly.

# rpm -qa | grep sendmail

The output should give you the version of sendmail installed "sendmail-8.13.8-2.el5".

Then execute the following command, replacing “abc@xyzemail.com” with your e-mail address.

# mail -s "Test email" abc@xyzemail.com

Hit the return key and you will come to a new line. Enter the text “This is a test mail”.
Follow up the text by hitting the return key again. Then hit the key combination of Control+D to continue.
The command prompt will ask you if you want to mark a copy of the mail to any other address, hit Control+D again.
Check your mailbox. This command will send out a mail to the email id mentioned with the subject, “Test email”.

To add content to the body of the mail while running the command you can use the following options. If you want to add text on your own:

# echo "This will go into the body of the mail." | mail -s "Test email" abc@xyzemail.com

And if you want mail to read the content from a file:

# mail -s "Test email" abc@xyzemail.com < /home/oracle/test.log

Some other useful options in the mail command are:

-s subject (The subject of the mail)
-c email-address (Mark a copy to this “email-address”, or CC)
-b email-address (Mark a blind carbon copy to this “email-address”, or BCC)

Here’s how you might use these options:

# echo "This will go into the body of the mail" | mail -s "Test email" abc@xyzemail.com -c lmn@xyzemail.com -b pqr@xyzemail.com


One of major drawbacks of using the mail command is that it does not support the sending of attachments. mutt, on the other hand, does support it. I’ve found this feature particularly useful for scripts that generate non-textual reports or backups which are relatively small in size which I’d like to backup elsewhere. Of course, mutt allows you to do a lot more than just send attachments. It is a much more complete command line mail client than the “mail” command. Right now we’ll just explore the basic stuff we might need often. Here’s how you would attach a file to a mail:

# echo "Sending an attachment." | mutt -a backup.zip -s "attachment" abc@xyzemail.com

This command will send a mail to abc@xyzemail.com with the subject (-s) “attachment”, the body text “Sending an attachment.”, containing the attachment (-a) backup.zip. Like with the mail command you can use the “-c” option to mark a copy to another mail id.


Now, with the basics covered you can send mails from your shell scripts. Here’s a simple shell script that gives you a reading of the usage of space on your partitions and mails the data to you.

df -h | mail -s "disk space report" abc@xyzemail.com

Save these lines in a file on your Linux server and run it. You should receive a mail containing the results of the command. If, however, you need to send more data than just this, you will need to write the data to a text file and enter it into the mail body while composing the mail. Here's an example of a shell script that gets the disk usage as well as the memory usage, writes the data into a temporary file, and then enters it all into the body of the mail being sent out:

df -h > /tmp/mail_report.log
free -m >> /tmp/mail_report.log
mail -s "disk and RAM report" abc@xyzemail.com < /tmp/mail_report.log

Now here’s a more complicated problem. You have to take a backup of a few files and mail them out. First the directory to be mailed out is archived. Then it is sent as an email attachment using mutt. Here’s a script to do just that:

tar -zcf /tmp/backup.tar.gz /home/oracle/files
echo | mutt -a /tmp/backup.tar.gz -s "daily backup of data" abc@xyzemail.com

The echo at the start of the last line adds a blank body to the mail being sent out.
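The two steps can be wrapped into one script. This sketch defaults to a dry run that only prints the mutt command, since actually sending requires mutt plus a working MTA; the directory, archive path and address are made-up examples:

```shell
#!/bin/sh
# Archive a directory and mail it as an attachment (dry run by default).
set -e
SRC=${SRC:-/tmp/demo_files}
ARCHIVE=${ARCHIVE:-/tmp/backup.tar.gz}
TO=${TO:-abc@xyzemail.com}

mkdir -p "$SRC"    # ensure there is something to archive
tar -zcf "$ARCHIVE" -C "$(dirname "$SRC")" "$(basename "$SRC")"

if [ "${DRY_RUN:-1}" -eq 1 ]; then
    echo "would run: mutt -a $ARCHIVE -s 'daily backup of data' $TO"
else
    echo | mutt -a "$ARCHIVE" -s "daily backup of data" "$TO"
fi
```

Run it with DRY_RUN=0 on a box with mutt configured to actually send the mail.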

Thursday, September 09, 2010

Using NFS mount to sort out a diskspace outage issue on MySQL server

Using NFS mount to sort out a diskspace outage issue on MySQL server

Identify if the disk space is full

[root@mysql01 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 7.5G 7.1G 0 100% /
/dev/sda1 289M 17M 258M 6% /boot
tmpfs 1006M 0 1006M 0% /dev/shm

This clearly shows that the MySQL server has no disk space available.

Identify the folder that stores the innodb files

[root@mysql01 ~]# lsof | grep mysql

[root@mysql01 ~]# lsof /var/lib/mysql

[root@mysql01 ~]# du -h ibdata1

[root@mysql01 ~]# lsof -p 10928

[root@mysql01 ~]# lsof -p 10928 | grep /var/lib/mysql

Stop the MySQL service

[root@mysql01 ~]# service mysqld stop

Create an NFS export on an NFS server 192.168.yyy.zzz (e.g. /vol/mysql) larger than your current MySQL data, and mount it on a temporary local folder on your system.

[root@mysql01 ~]# mkdir -p /tmp/mysql

[root@mysql01 ~]# mount -o tcp,rsize=32768,wsize=32768,hard,intr,timeo=600 192.168.yyy.zzz:/vol/mysql /tmp/mysql

Change the folder permissions

[root@mysql01 ~]# cd /tmp/

[root@mysql01 ~]# chown -Rf mysql:mysql mysql

Move all the data from the /var/lib/mysql directory to the temporary folder /tmp/mysql/

[root@mysql01 ~]# cd /var/lib/mysql

[root@mysql01 ~]# mv ibdata1 /tmp/mysql/
[root@mysql01 ~]# mv ib_logfile0 /tmp/mysql/
[root@mysql01 ~]# mv ib_logfile1 /tmp/mysql/
[root@mysql01 ~]# cp -R mysql /tmp/mysql/
[root@mysql01 ~]# rm -rf mysql

Change the directory permissions.

[root@mysql01 ~]# chown -Rf mysql:mysql mysql

Unmount the NFS mount from the temporary location.

[root@mysql01 ~]# umount /tmp/mysql

Mount the NFS export containing the MySQL data at the original location, /var/lib/mysql.

[root@mysql01 ~]# mount -o tcp,rsize=32768,wsize=32768,hard,intr,timeo=600 192.168.yyy.zzz:/vol/mysql /var/lib/mysql

[root@mysql01 ~]# service mysqld start

Add the following line to your /etc/fstab file

[root@mysql01 ~]# vi /etc/fstab

192.168.yyy.zzz:/vol/mysql /var/lib/mysql nfs proto=tcp,rsize=32768,wsize=32768,hard,intr,timeo=600 0 0
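If your setup scripts might run more than once, append the line only when it's missing. A minimal sketch, demonstrated here against a scratch file in /tmp (point FSTAB at /etc/fstab, as root, to do it for real):

```shell
#!/bin/sh
# Append the NFS mount line to an fstab-style file only if it's not already there.
FSTAB=${FSTAB:-/tmp/fstab.demo}
LINE='192.168.yyy.zzz:/vol/mysql /var/lib/mysql nfs proto=tcp,rsize=32768,wsize=32768,hard,intr,timeo=600 0 0'

touch "$FSTAB"
grep -qxF "$LINE" "$FSTAB" || echo "$LINE" >> "$FSTAB"
grep -qxF "$LINE" "$FSTAB" || echo "$LINE" >> "$FSTAB"   # second run is a no-op

grep -cxF "$LINE" "$FSTAB"                               # prints 1
```

grep -x matches the whole line and -F disables regex interpretation, so the colons and dots in the entry are compared literally.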

Enjoy !

Monday, September 06, 2010

Support for Jumbo Frames is not enabled by default in ESXi 4.1, so to enable Jumbo Frames you must log in using "Tech Support Mode" or "Remote Support Mode (SSH)".

Jumbo Frames should be enabled on every hop in the network, otherwise it won’t work and you’ll find that performance on the storage network is actually worse than with Jumbo Frames disabled.


  • An ESXi host with multiple vmnics and the following environment:
  • Switch and NFS storage both configured for Jumbo Frames and an MTU of 9000 bytes.
  • ESXi 4.1 configured with 2 vSwitches - vSwitch0 and vSwitch1
  • vSwitch0 has the default MTU (1500 bytes) and connected to the VLAN facing the users.
  • vSwitch1 connected to the same VLAN as the NFS Storage.
  • Full permission to the Host
  • Working knowledge of making changes to networking and storage.
The Process:

  • Login over SSH to your ESXi host
  • $ esxcfg-vswitch -l… that will list the current MTU
  • $ esxcfg-vswitch -m 9000 vSwitch1… set the MTU to 9000 (Jumbo Frames)
  • $ esxcfg-vswitch -l… verify the change
  • $ esxcfg-nics -l… verify that every NIC has the new MTU-Value

That's it.

Reboot to see that the changes are persistent.

Tuesday, August 24, 2010

How To Convert Thick Virtual Disks to Thin in VMWare

How To Convert Thick Virtual Disks to Thin in VMWare ESX / ESXi

To convert a thick format disk to a thin format disk, clone the virtual disk with the destination disk type set to thin.
To clone a disk and convert it to thin format, run this command from the Service Console or from the VMware Command-Line Interface:
vmkfstools -i <source-disk>.vmdk -d thin <destination-disk>.vmdk

For example:

vmkfstools -i vDiskName.vmdk -d thin vDiskNameThin.vmdk

If you are using the VMware RCLI application, use:

vmkfstools.pl --server 192.168.xxx.xxx -i /vmfs/volumes/datastore1/VMSRV0004-VM1/VMSRV0004-VM1.vmdk -d thin /vmfs/volumes/datastore1/VMSRV0004-VM1/VMSRV0004-VM1-thin.vmdk

Note: Don't forget to go back to the ESX server and remove the old .vmdk and -flat.vmdk files once you are sure that your VM is operating normally off the thin disk.

Wednesday, July 28, 2010

Zabbix 1.8.2 on CentOS 5.5 (x86_64)

Installing Zabbix 1.8.2 on CentOS 5.5 and generating fake load to check the installation

Install the basic CentOS 5.5 on a machine.


Update CentOS using the following command:

yum -y update

Note: Disable the firewall and SELinux only for testing.


Install EPEL using the following command:

rpm -Uvh http://download.fedora.redhat.com/pub/epel/5/i386/epel-release-5-3.noarch.rpm


Install mysql and php using yum:

yum install mysql mysql-server php php-gd php-bcmath php-xml php-mbstring php-mysql


Download Zabbix rpm file from http://fedora.danny.cz/danny-el/5Server/

wget http://fedora.danny.cz/danny-el/5Server/x86_64/zabbix-web-mysql-1.8.2-2.el5.x86_64.rpm
wget http://fedora.danny.cz/danny-el/5Server/x86_64/zabbix-server-1.8.2-2.el5.x86_64.rpm
wget http://fedora.danny.cz/danny-el/5Server/x86_64/zabbix-1.8.2-2.el5.x86_64.rpm
wget http://fedora.danny.cz/danny-el/5Server/x86_64/zabbix-docs-1.8.2-2.el5.x86_64.rpm
wget http://fedora.danny.cz/danny-el/5Server/x86_64/zabbix-agent-1.8.2-2.el5.x86_64.rpm
wget http://fedora.danny.cz/danny-el/5Server/x86_64/zabbix-web-1.8.2-2.el5.x86_64.rpm
wget http://fedora.danny.cz/danny-el/5Server/x86_64/zabbix-server-mysql-1.8.2-2.el5.x86_64.rpm


Install the downloaded Zabbix rpm's using the following command :

yum --nogpgcheck localinstall zabbix-1.8.2-2.el5.x86_64.rpm zabbix-server-1.8.2-2.el5.x86_64.rpm zabbix-server-mysql-1.8.2-2.el5.x86_64.rpm zabbix-agent-1.8.2-2.el5.x86_64.rpm zabbix-docs-1.8.2-2.el5.x86_64.rpm zabbix-web-1.8.2-2.el5.x86_64.rpm zabbix-web-mysql-1.8.2-2.el5.x86_64.rpm


service httpd restart

chkconfig httpd on

Configuring the Database part (MySQL)

service mysqld restart

chkconfig mysqld on

mysqladmin -u root password passw0rd!

mysql --user=root --password=passw0rd!

Welcome to the MySQL monitor. Commands end with ; or \g.

mysql> create database zabbix character set utf8;

mysql> create user 'zabbix'@'localhost' identified by 'password';

mysql> grant all on zabbix.* to zabbix;

mysql> flush privileges;

mysql> quit

Insert schema, data and images

cd /usr/share/doc/zabbix-server-mysql-1.8.2/create/schema

mysql --user=zabbix --password=password zabbix < mysql.sql

cd ../data/

mysql --user=zabbix --password=password zabbix < data.sql

mysql --user=zabbix --password=password zabbix < images_mysql.sql


Configuring PHP

edit /etc/php.ini and set the following values (remove the leading ; to uncomment the lines you change):

max_execution_time = 600
max_input_time = 600
memory_limit = 256M
upload_max_filesize = 16M
post_max_size = 32M
date.timezone = Europe/Brussels
mbstring.func_overload = 2


Config Zabbix-server

edit /etc/zabbix/zabbix_server.conf and set the database connection parameters (DBName, DBUser and DBPassword) to match the MySQL account created above.


Config Zabbix-webinterface

create /usr/share/zabbix/conf/zabbix.conf.php (remove the leading ;):

global $DB_TYPE, $DB_SERVER, $DB_PORT, $DB_DATABASE, $DB_USER, $DB_PASSWORD;
$DB_TYPE = "MYSQL";
$DB_SERVER = "localhost";
$DB_PORT = "0";
$DB_DATABASE = "zabbix";
$DB_USER = "zabbix";
$DB_PASSWORD = "password";


service zabbix-server start

chkconfig zabbix-server on

service zabbix-agent start

chkconfig zabbix-agent on

lsof -i:10050

more /var/log/zabbix/zabbix_agentd.log



Login using the following credentials :

login: admin / password: zabbix


Generate Fake CPU load :

perl -e 'while (--$ARGV[0] and fork) {}; while (1) {}' 4

(the trailing argument is the number of busy-looping processes to spawn; kill them afterwards with killall perl)



Generate fake disk I/O load, and free unused memory:

dd if=/dev/zero of=junk bs=1M count=1K (This will create a 1 GB file)

dd if=/dev/zero of=junk bs=1M count=10K (This will create a 10 GB file)

This is a useful command for when your OS is reporting less free RAM than it actually has. If terminated processes did not free their memory correctly, the previously allocated RAM can make the system a bit sluggish over time.

This command creates a huge file made out of zeroes and then removes it, thus freeing the amount of memory occupied by the file in the RAM.
In this example, the sequence will free up to 1 GB (1M * 1K) of unused RAM. This will not free memory which is genuinely being used by active processes.

rm -rf junk
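Scaled down to 10 MB, the same trick looks like this (quick to run; raise count for the real 1 GB or 10 GB versions):

```shell
#!/bin/sh
# Write a file of zeroes, confirm its size, then delete it. Writing the
# file pushes other cached data out; deleting it releases its cache pages.
dd if=/dev/zero of=/tmp/junk bs=1M count=10 2>/dev/null
stat -c %s /tmp/junk      # prints 10485760 (10 * 1024 * 1024 bytes)
rm -f /tmp/junk
```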


Monitor Disk IO performance using iostat:

yum -y install sysstat

iostat -x 10 > /tmp/iostat.txt

tail -f /tmp/iostat.txt


Test network speed without wasting disk

dd if=/dev/zero bs=1M count=1K | ssh root@<remote-host> 'cat > /dev/null'

The above command will send 1 GB of data from one host to the next over the network, without consuming any unnecessary disk on either the client or the host. This is a quick and dirty way to benchmark network speed without wasting any time or disk space.


Wednesday, July 07, 2010

Set timezone using /etc/localtime configuration file in Linux

Often /etc/localtime is a symlink to the file localtime or to the correct time zone file in the system time zone directory.

1 Log in as root and check which timezone your machine is currently using by executing `date`.
You'll see something like Thu 10 Jun 2010 12:15:08 PM IST; IST in this case is the current timezone.

2 Change to the directory /usr/share/zoneinfo; here you will find a list of time zone regions. Choose the most appropriate region; if you live in Canada or the US, this is the "America" directory.

3 If you wish, backup the previous timezone configuration by copying it to a different location. Such as
#mv /etc/localtime /etc/localtime-old

4 Create a symbolic link to the appropriate timezone from /etc/localtime. Example:
#ln -sf /usr/share/zoneinfo/America/New_York /etc/localtime

5 Some distros use the /usr/share/zoneinfo/dirname/zonefile format (Red Hat and friends):
# ln -sf /usr/share/zoneinfo/EST /etc/localtime

6 Set the ZONE entry in the file /etc/sysconfig/clock file (e.g. "America/New_York")

7 Set the hardware clock by executing:
#/sbin/hwclock --systohc
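Before symlinking anything, you can preview a zone without touching /etc/localtime: the TZ environment variable overrides it for a single command.

```shell
#!/bin/sh
# Show the current time as it would appear under two different zones.
TZ=America/New_York date
TZ=UTC date +%Z           # prints UTC
```

If the zone name is valid, the first line shows the local time in that zone; a typo in the name silently falls back to UTC, so this is also a cheap sanity check.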

Using NTP (Network Time Protocol)

NTP will connect to a server to get the atomic time. It can be downloaded from www.ntp.org/downloads.html. To get started with NTP simply download it, install it, use the ntpdate command followed by a public time server, and update your hardware clock.
$ ntpdate 0.pool.ntp.org (1.pool.ntp.org and 2.pool.ntp.org also work)
4 Nov 22:31:28 ntpdate[26157]: step time server offset 22317290.440932 sec
$ hwclock --systohc

To keep your time accurate you can create a cron job that executes:
(the -w option is the same as --systohc)
#ntpdate "server name" && hwclock -w