NetBackup Important Commands and Explanations

Master Server

1) Check the license details
/usr/openv/netbackup/bin/admincmd/get_license_key
2) Stop and start the NetBackup services (a typical restart sequence is sketched below)
i) /etc/init.d/netbackup stop (start)          —> graceful stop and start
ii) /usr/openv/netbackup/bin/bp.kill_all       —> stop NetBackup, including GUI sessions (ungraceful)
iii) /usr/openv/netbackup/bin/bp.start_all     —> start NetBackup
iv) /usr/openv/netbackup/bin/initbprd          —> starts the master server daemon (bprd)
v) /usr/openv/netbackup/bin/vmd                —> starts the Media Manager volume daemon
vi) /usr/openv/netbackup/bin/jnbSA             —> starts the Java administration GUI
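A typical full restart on the master, using only the commands above (a sketch; paths assume the default /usr/openv install location):
# /usr/openv/netbackup/bin/bp.kill_all         -> stop all daemons and GUI sessions
# /usr/openv/netbackup/bin/bpps -x             -> confirm nothing is left running
# /usr/openv/netbackup/bin/bp.start_all        -> start everything again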

3) Scan the tape devices
#sgscan (in Solaris)
#/usr/openv/volmgr/bin/scan (in AIX)

4) Display all the NetBackup processes
#bpps -x

5) Check the backup status
In GUI —>  Activity monitor
In CLI —>  #bpdbjobs -report

6) List the server errors
#bperror
#bperror -U -problems -hoursago 1
#bperror -U -backstat -by_statcode -hoursago 1

7) Display information about the error code
#bperror -S <statuscode>

8) Reread the bp.conf file without stopping/starting NetBackup
#bprdreq -rereadconfig

Media Server (starts with bpxxx )
1) List the storage units
#bpstulist -U
2) List media details
# /usr/openv/netbackup/bin/goodies/available_media
This command retrieves information about all media that are available, full, expired or frozen.
3) List all the NetBackup jobs
#bpdbjobs -report <hoursago>
4) Freeze or Unfreeze media
In GUI,
In CLI, #bpmedia -unfreeze [-freeze] -ev <media ID>
5) List media details
#bpmedialist -ev <media ID>
6) List the media contents
#bpmedialist -U -mcontents -m <mediaID>
7) List the information about NB images on media
#bpimmedia -mediaid <ID> -L
8) List backup image information
#bpimagelist -U (general)
# bpimagelist -media -U (for media)
9) Expire a tape
# bpexpdate -d 0 -ev <mediaID> -force
10) Expire client images
#bpimage -cleanup -allclients
11) Which tapes are used for taking backup
In GUI, Backup and Restore –> Find the Filesystem –> Preview Media Button
In CLI, #bpimagelist -media -U

Volume Commands (starts with vmxxx)
1) Tape Drive (vmoprcmd)
1) List the drive status, detailed drive information and pending requests
In GUI, Device mgmt
In CLI, #vmoprcmd
#vmoprcmd -d ds (drive status)
#vmoprcmd -d pr (pending requests)
2) Control a tape device
In GUI, Device mgmt
In CLI, #vmoprcmd [-reset] [-up] [-down] <drive number>

2) Tape Media commands (vmpool,vmquery,vmchange,vmdelete)

1) List all the pools
In CLI, #vmpool -listall -bx
2) List the scratch pool available
#vmpool -list_scratch
3) List tapes in pool
In GUI,
In CLI, #vmquery -pn <pool name> -bx
4) List all tapes in the robot
In GUI,
In CLI, #vmquery -rn 0 -bx
5) List cleaning tapes
In CLI, #vmquery -mt dlt_clean -bx
6) List tape volume details
#vmquery -m <media ID>
7) Delete a volume from the catalog
#vmdelete -m <mediaID>
8) Change a tape's expiry date
#vmchange -exp 12/31/2012 hr:mm:ss -m <media ID>
9) Change a tape's media pool
#vmchange -p <pool number> -m <media ID>

3) Tape/Robot commands (starts with tpxxx)
1) List the tape drives
#tpconfig -d
2) List the cleaning times on drives
#tpclean -L
3) Clean a drive
#tpclean -C <drive number>

Client Commands
i) List the clients
#bpplclients

Policy Commands
i) List the policies
#bppllist -U
ii) List detailed information about all policies
#bppllist -U -allpolicies

Posted in NETBACKUP

AIX Startup Modes

Startup Modes

There are usually four startup modes available in AIX: Normal, SMS, Maintenance and Diagnostics.

Normal

  • System is at a working stage
  • Multi-user mode
  • All processes are up and running
  • It shows a login screen to log on to the system

System Management Services (SMS)

  • Press F1 during startup to enter SMS mode (RIPL)
  • Not part of AIX
  • Runs from the firmware
  • Used to select and set the normal bootlist, i.e. the device the system will boot from (see the bootlist sketch after this list)
  • Used to set the power-on password and supervisor password
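Once AIX is running, the same bootlist the firmware uses can also be viewed or changed with the standard bootlist command (a sketch; the device names are examples):
# bootlist -m normal -o                -> display the current normal-mode bootlist
# bootlist -m normal hdisk0 hdisk1     -> set hdisk0, then hdisk1, as the normal boot devices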

Maintenance Mode

  • Single-user maintenance mode
  • Used to fix problems on a machine that won't boot in normal mode
  • Used to recover the root password
  • Restoration of a mksysb backup from tape should be performed in this mode
  • Press F5 during startup to boot from the service bootlist

Diagnostics

  • When the system doesn't boot and you suspect a device-related problem, you can start up in diagnostics mode
  • Diagnose and check the devices for potential problems, then try a normal startup

 

Posted in AIX, AIX LESSONS

NetBackup Important Error Codes & Their Solutions

The following are important Veritas NetBackup error codes and their solutions.
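The full description of any status code can also be pulled from the CLI with bperror (the path assumes a default install):
# /usr/openv/netbackup/bin/admincmd/bperror -S 59     -> prints the description and recommended action for status 59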

1. Status code 2
Reason: None of the requested files were backed up.
Action: Verify that files exist in the target path.

2. Status code 13
Reason: File read failed.
Action: Check network connectivity between the client and server.

3. Status code 25
Reason: Cannot connect to socket.
Action: Check that the bpcd daemon is running and reachable on the client.

4. Status code 50
Reason: Client process aborted.
Action: Check the client for errors and restart the backup manually.

5. Status code 59
Reason: Access to the client was not allowed.
Action: Check the bp.conf entries and the master-to-client access/connectivity.

6. Status code 71
Reason: The backup path (backup selection) has changed.
Action: Correct the path in the policy.

7. Status code 84
Reason: Media write (I/O) error.
Action: Clean the drive; change the tape or adjust the tape default parameters to reduce backup failures due to I/O errors.

8. Status code 96
Reason: Backup failed because no media was available in the scratch pool.
Action: Allocate volumes to the scratch pool.

9. Status code 129
Reason: Disk storage unit is full.
Action: Remove old images.

10. Status code 196
Reason: The client backup was not attempted because the backup window closed (elapsed time).
Action: Restart the backup manually; if this recurs, adjust the schedule frequency or widen the backup window.

11. Status code 2001
Reason: Tape library down / robotic path changed.
Action: Manually bring the robot back up.

Posted in NETBACKUP | Leave a comment

NetBackup Important Ports

The following are the important NetBackup ports and their corresponding daemons; these ports must be reachable in order to control NetBackup from the work area (a quick check is sketched after the list).

13720 – bprd
13721 – bpdbm
13722 – bpjava-msvc
13723 – bpjobd
13724 – vnetd
13782 – bpcd
13783 – vopied
1556 – PBX (Veritas Private Branch Exchange, also used by the Java console)
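A quick, generic way to confirm a daemon is actually listening on its port (a plain netstat sketch, nothing NetBackup-specific):
# netstat -an | grep 13724        -> vnetd should show a LISTEN entry on the master/media server
# grep 13724 /etc/services        -> the port-to-service mapping, if it is registered there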

Posted in NETBACKUP

RAID NOTES

What is RAID?

Redundant Array of Inexpensive Disks is a set of technology standards for teaming disk drives to provide a high level of storage availability. RAID is not a backup solution. It is used to improve the disk I/O performance and reliability (fault tolerance) of a system.

Hardware RAID Vs Software RAID

RAID can be deployed in both software and hardware. The comparison of software and hardware RAID is based on the following features:

Cost, complexity, write-back caching (BBU – battery backup unit), performance, overheads (CPU, RAM, etc.), disk hot swapping, hot spare support, /boot partition support, open-source factor, faster rebuilds, higher write throughput.

Can RAID Array Fail?

Yes. The entire RAID array can fail, taking down all your data (hardware RAID cards do die). Use tapes and other servers that can hold copies of the data but don't allow much interaction with it, and move your data offsite. Another option is to use two or three RAID cards combined to protect your data; this ensures you can get your data back when one of the RAID cards dies.

 Types of RAID Levels:

RAID 0, RAID 1, RAID 5, RAID 1+0, RAID 0+1

RAID 0

Minimum requirement of hard disks – 2. Data is striped across all the disks, with no parity or mirroring.

Advantages

  • High performance
  • Easy to implement
  • No parity overhead
  • Read/Write is good

 

 Disadvantages

  • No fault tolerance because no redundancy.

 

RAID 1 (Mirroring)

 

  • Minimum number of hard disks is 2 (2N). All data is written to both mirrored disks.

Advantages

  • Fault tolerant: if one disk fails, data can be retrieved from the working disk with no data loss
  • Easy to recover data
  • High read performance.
  • Easy to implement

 

Disadvantages

  • Low  write performance
  • Very  costly

Suggested Users

Small databases and Critical applications

RAID 5           

Stripes the data at the block level across several drives, with parity distributed equally among the drives. The parity information allows recovery from the failure of any single drive.

  • Block-level striping
  • Minimum disk requirement is 3
  • Capacity calculation: N + 1 (N data disks plus one disk's worth of parity)
  • Disk limits: practically about 15 disks, theoretically 32 disks

Example

If three 36 GB drives would hold the necessary data, then four 36 GB drives are needed to implement RAID 5 and still maintain a total of 108 GB of available data space.
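In general the usable space of a RAID 5 set of N disks is (N - 1) times the size of one disk, which a quick shell check confirms for the example above:
# echo "$(( (4 - 1) * 36 )) GB usable"      -> 108 GB usable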

Advantages

  • Read performance is good
  • Fault tolerant and redundant because of the distributed parity
  • Can tolerate the loss of one drive
  • Provides more usable disk space than RAID 1

Disadvantages

  • A disk failure has a medium impact on throughput
  • If two disks fail, data is lost
  • More complex to design

Suggested servers

Mid-size financial databases and applications

Parity:

Parity is an error correction technique commonly used in certain RAID levels. It is used to reconstruct data on a drive that has failed in an array.

There are two types of parity bits: even parity bit and odd parity bit. An even parity bit is set to 1 if the number of ones in a given set of bits is odd (making the total number of ones even). An odd parity bit is set to 1 if the number of ones in a given set of bits is even (making the total number of ones odd).

 Parity Block

The parity block is used by certain RAID levels; redundancy is achieved by the use of parity blocks. If a single drive of the array fails, data blocks and a parity block from the working drives can be combined to reconstruct the data.

Assume three data disks holding A1 = 00000111, A2 = 00000101, and A3 = 00000000. The parity block Ap, generated by XORing A1, A2, and A3, will then equal 00000010. If the second drive fails, A2 is no longer accessible, but it can be reconstructed by XORing A1, A3, and Ap:

 

A1 XOR A3 XOR Ap = 00000101
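The same XOR arithmetic can be reproduced directly in a shell (a minimal sketch assuming bash or ksh93 for the 2#... base-2 arithmetic and bc for binary output; leading zeros are dropped):
# echo "obase=2; $(( 2#00000111 ^ 2#00000101 ^ 2#00000000 ))" | bc     -> 10, i.e. Ap = 00000010
# echo "obase=2; $(( 2#00000111 ^ 2#00000000 ^ 2#00000010 ))" | bc     -> 101, i.e. the lost A2 = 00000101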

 RAID 1+0 (Striped Mirroring)

RAID 10 is also called RAID 1+0

  • It is also called a stripe of mirrors
  • It requires a minimum of 4 disks
  • Disk 1 and Disk 2 form a mirrored pair (Group 1), and the mirrored pairs are then striped; hence if Disk 1 fails there is no disruption in data access

 

Advantages

  • Fault tolerance is high
  • High I/O performance
  • Faster rebuild when compared to RAID 0+1
  • Under certain circumstances, a RAID 10 array can sustain multiple simultaneous drive failures

 

Disadvantages

  • Very expensive
  • High overhead
  • Very limited scalability

 

RAID 0+1 (Mirrored Striping)

 

  • RAID 01 or RAID 0+1 is also called mirrored striping: two (or more) stripes of several disks are built first and are then mirrored onto each other.
  • Minimum disk requirement is 4

 

Advantages

  • Fault tolerant
  • High I/O
  • Performance and availability are the same as RAID 1+0

 

Disadvantages

  • One failing disk invalidates a whole stripe set; two disks failing on either side of the mirror render the volume unusable
  • The recovery takes longer, as the volume's whole content needs to be re-mirrored to the repaired stripe set

How does RAID 10 overcome RAID 5?

RAID 10 performs better than RAID 5 in both reads and writes, since it does not need to manage parity.

Only cost counts against RAID 10 when compared to RAID 5; otherwise it provides higher data security and performance.

How is RAID 1+0 different from RAID 0+1?

 Scenario:

 We have 20 disks to form either a RAID 1+0 or a RAID 0+1 array.

a) If we chose to do RAID 1+0 (RAID 1 first and then RAID 0), then we would divide those 20 disks into 10 sets of two. Then we would turn each set into a RAID 1 array and then stripe it across the 10 mirrored sets.

b) If, on the other hand, we choose to do RAID 0+1 (i.e. RAID 0 first and then RAID 1), we would divide the 20 disks into 2 sets of 10 each. Then we would turn each set into a RAID 0 array containing 10 disks each, and then we would mirror those two arrays. So, is there a difference at all? The storage is the same, the drive requirements are the same, and based on testing there is not much difference in performance either. The difference is actually in the fault tolerance. Let's look at the two setups mentioned above in more detail:

RAID 1+0:

Drives 1+2 = RAID 1 (Mirror Set A)

Drives 3+4 = RAID 1 (Mirror Set B)

Drives 5+6 = RAID 1 (Mirror Set C)

Drives 7+8 = RAID 1 (Mirror Set D)

Drives 9+10 = RAID 1 (Mirror Set E)

Drives 11+12 = RAID 1 (Mirror Set F)

Drives 13+14 = RAID 1 (Mirror Set G)

Drives 15+16 = RAID 1 (Mirror Set H)

Drives 17+18 = RAID 1 (Mirror Set I)

Drives 19+20 = RAID 1 (Mirror Set J)

 

Now, we do a RAID 0 stripe across sets A through J. If drive 5 fails, then only mirror set C is affected. It still has drive 6, so it will continue to function and the entire RAID 1+0 array will keep functioning. Now, suppose that while drive 5 is being replaced, drive 17 fails; the array is still fine because drive 17 is in a different mirror set. So, the bottom line is that in the above configuration at most 10 drives can fail without affecting the array, as long as they are all in different mirror sets.

 


 

In Simple words

 

Fault tolerance is higher than in RAID 0+1, since either disk of any mirrored pair can fail without harming the volume (provided no mirrored pair loses both disks, of course). In an "ideal" failure, half of all participating disks on either side of their mirrors can fail without the volume becoming unavailable. Recovery time is also lower compared to RAID 0+1, since only one disk pair needs to be re-synced.

 

Now, let’s look at what happens in RAID 0+1:

 RAID 0+1:

Drives 1+2+3+4+5+6+7+8+9+10 = RAID 0 (Stripe Set A)

Drives 11+12+13+14+15+16+17+18+19+20 = RAID 0 (Stripe Set B)

Now, these two stripe sets are mirrored. If one of the drives, say drive 5, fails, the entire stripe set A fails. The RAID 0+1 is still fine since we have stripe set B. If, say, drive 17 also goes down, the volume is down. One can argue that this is not always the case and that it depends upon the type of controller you have. Say you had a smart controller that would continue to stripe to the other 9 drives in stripe set A when drive 5 fails; if later on drive 17 fails, it could use drive 7, since it would have the same data. If the controller can do that, then theoretically speaking RAID 0+1 would be as fault tolerant as RAID 1+0. Most controllers do not do that, though.

 In simple words

Fault tolerance is lower than with RAID 1+0: one failing disk invalidates a whole stripe set, and two disks failing on either side of the mirror render the volume unusable. Furthermore, the recovery takes longer, as the volume's whole content needs to be re-mirrored to the repaired stripe set.

 

 

 

Posted in RAID, STORAGE

AIX BOOT PROCESS

System Startup Procedure

There are three phases involved in the AIX startup procedure:

i) ROS (Read Only Storage) Phase
ii) Device Configuration Phase
iii) Init Phase

ROS (Read Only Storage) Phase

i) Hardware devices are verified and checked for possible issues — POST
ii) The bootlist is read – system ROS detects the first boot device specified in the bootlist
iii) The boot image is loaded into memory – the first 512-byte block (sector) of the hdisk, which contains the bootstrap code, is loaded into RAM
iv) Initialization starts – the bootstrap code locates the BLV (hd5) on the disk

During this process the following activities are performed (a few related commands are sketched after this list):

  • The BLV contains the kernel, boot commands, a reduced ODM and the rc.boot script.
  • The BLV is uncompressed in RAM, releasing the kernel.
  • The AIX kernel then gets control.
  • The AIX kernel creates a temporary RAMFS with /, /etc, /usr, /dev, /mnt, etc.
  • The kernel starts the init process from the BLV in RAM.
  • init executes the rc.boot script from the BLV in RAM; rc.boot runs in three phases (rc.boot 1, rc.boot 2 and rc.boot 3).
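A couple of standard AIX commands can confirm where the BLV lives and what was booted from (a sketch, run from a running system):
# lslv -m hd5          -> shows which physical disk holds the boot logical volume
# bootinfo -b          -> shows the device the system last booted from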

ii) Base Device Configuration Phase

i) Devices are configured with the "cfgmgr" command
ii) The init process executes rc.boot 1 from the RAMFS
iii) The "restbase" command copies the reduced ODM from the BLV to the RAMFS
iv) "cfgmgr" runs and configures the base devices

i) LVs are varied on

  • rc.boot 2 is executed
  • rootvg is varied on
  • fsck is run on /, /usr and /var, and they are mounted onto the RAMFS

ii) Paging is started

  • The "copycore" command checks for the occurrence of a dump and, if found, copies it to /var/adm/ras
  • /var is unmounted and paging is activated
  • /var is mounted again
  • Now /, /usr and /var are mounted on rootvg on disk

iii) /etc/inittab is processed

  • The kernel removes the RAMFS
  • The init process is started from / in rootvg
  • /etc/init processes /etc/inittab and runs rc.boot 3
  • /etc/inittab decides the run level
  • fsck is run on /tmp and it is mounted
  • syncvg runs for rootvg and reports any stale PPs
  • The "savebase" command saves the customized ODM data to the BLV
  • The rc scripts exit

iv) The services relevant to the run level start: the srcmstr daemon (System Resource Controller) is started and it starts the relevant subsystems (a quick check is sketched below)
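A couple of standard commands to inspect the result of this phase (a sketch, run after boot):
# grep ':initdefault:' /etc/inittab     -> shows the default run level init will use
# lssrc -a                              -> lists the subsystems controlled by srcmstr and their status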

Posted in AIX, AIX LESSONS

Welcome to the World of Unix …

UNIX is basically a simple operating system, but you have to be a genius to understand the simplicity…..

Dennis Ritchie (co-creator of C and Unix)


Posted in GENERAL