Viewing the World in Binary

OSPF – Virtual Links

The topology is as follows (borrowed from the Multiarea OSPF topology post).

The first step in debugging a virtual link is the ‘sh ip ospf virtual-links’ command. In the output from the ABR connected to Area 0, the current status of the virtual link is DOWN.

! ABR-1-0  ! Area 1 – Area 0 – ABR Router

! Loopback interface
interface Loopback0
 ip address 101.0.0.2 255.255.255.255
!
! Interface in Area 1
interface Serial0/0
 ip address 1.1.1.6 255.255.255.252
 serial restart-delay 0
!
! Interface in Area 0
interface Serial0/1
 ip address 192.0.0.17 255.255.255.252
 serial restart-delay 0
!
router ospf 1
 log-adjacency-changes
 ! Virtual link via transit area 1 to router ID 101.0.0.1
 area 1 virtual-link 101.0.0.1
 network 1.1.1.4 0.0.0.3 area 1
 ! Loopback added to Area 0
 network 101.0.0.2 0.0.0.0 area 0
 network 192.0.0.16 0.0.0.3 area 0

! ABR-VL-R8 ! Area 1 – Area 10 – ABR Router with Virtual Link to Area 0 via Area 1

interface Loopback0
 ip address 101.0.0.1 255.255.255.255
!
interface Serial0/0
 ip address 1.1.1.9 255.255.255.252
 serial restart-delay 0
!
interface Serial0/1
 ip address 10.10.1.6 255.255.255.252
 serial restart-delay 0
!
router ospf 1
 log-adjacency-changes
 area 1 virtual-link 101.0.0.2
 network 1.1.1.8 0.0.0.3 area 1
 network 10.10.10.4 0.0.0.3 area 10
 network 101.0.0.1 0.0.0.0 area 0
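
A quick way to double-check "looks good" is to test each interface address against the ‘network’ statements. The sketch below is only an illustration: the helper names are made up, the addresses are copied from the ABR-VL-R8 listing above, and it uses a simple first-match rule rather than whatever ordering IOS applies internally. As transcribed here, the area 10 statement (10.10.10.4 0.0.0.3) does not cover Serial0/1 (10.10.1.6). That may just be a typo in the listing, and the virtual link itself only depends on the area 1 adjacency.

```python
import ipaddress

def matches(interface_ip, network, wildcard):
    """True if 'network <network> <wildcard>' covers interface_ip."""
    ip = int(ipaddress.IPv4Address(interface_ip))
    net = int(ipaddress.IPv4Address(network))
    wild = int(ipaddress.IPv4Address(wildcard))
    care = ~wild & 0xFFFFFFFF  # bits that must match (wildcard bits are "don't care")
    return (ip & care) == (net & care)

# Copied from the ABR-VL-R8 configuration above.
interfaces = {"Loopback0": "101.0.0.1", "Serial0/0": "1.1.1.9", "Serial0/1": "10.10.1.6"}
statements = [
    ("1.1.1.8", "0.0.0.3", 1),
    ("10.10.10.4", "0.0.0.3", 10),
    ("101.0.0.1", "0.0.0.0", 0),
]

def area_of(ip):
    """Area of the first statement that covers ip, or None (simplified first-match rule)."""
    for network, wildcard, area in statements:
        if matches(ip, network, wildcard):
            return area
    return None

for name, ip in interfaces.items():
    print(name, ip, "-> area", area_of(ip))
```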

Everything looks good on the configuration side. Let’s reboot the two ABRs (that would be fun on a production network). The virtual link came up after the reboot, so it may be a Dynamips issue; I will take that and close my case. I will see whether I can log this with Dynamips and whether anyone else has faced the same.

OSPF treats a virtual link as a demand circuit, so LSAs learned over it are flooded with the DoNotAge bit set and are not aged out or periodically refreshed.

The ‘sh ip ospf neighbor’ command on the ABR at the Area 0 / Area 1 boundary should show the router that is not directly connected to Area 0 as a neighbor.

There are a few things that should be on the debugging checklist for virtual links:

  • Make sure the two ABRs can reach each other through the transit area. The ‘virtual-link’ command takes the neighbor’s router ID (the loopback addresses here).
  • A stub area cannot act as the transit area for a virtual link.
  • Configure authentication on the virtual link when Area 0 has authentication enabled, because the virtual link is part of Area 0.
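
The last two checklist items can be roughly screened from a saved configuration. This is a minimal sketch, not a real IOS parser: `check_virtual_links` is a hypothetical helper, it only understands the exact ‘area ...’ line forms used in this post, and it only sees one router’s configuration (a stub setting on a different transit router, as in the experiment below, would not be caught).

```python
import re

def check_virtual_links(config):
    """Return warnings for virtual-link lines that break the checklist."""
    stub_areas = set(re.findall(r"^\s*area (\S+) stub\b", config, re.M))
    area0_auth = re.search(r"^\s*area 0 authentication\b", config, re.M) is not None
    warnings = []
    for m in re.finditer(r"^\s*area (\S+) virtual-link (\S+)(.*)$", config, re.M):
        transit, peer, options = m.groups()
        if transit in stub_areas:
            warnings.append(f"area {transit} is a stub area and cannot be the transit area for {peer}")
        has_auth = "message-digest-key" in options or "authentication" in options
        if area0_auth and not has_auth:
            warnings.append(f"area 0 uses authentication but the virtual link to {peer} has none")
    return warnings

# Made-up configuration that violates both items.
bad = """router ospf 1
 area 0 authentication message-digest
 area 1 stub
 area 1 virtual-link 101.0.0.2
"""
for warning in check_virtual_links(bad):
    print(warning)
```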

Since I have started writing about virtual links, I will take this opportunity to experiment with the last two scenarios.

Configuring the VL transit area router (A1-R2) as a stub router with the ‘area 1 stub’ command:

A1-R2(config-router)#
*Mar  1 04:05:01.122: %OSPF-5-ADJCHG: Process 1, Nbr 101.0.0.2 on Serial0/2 from FULL to DOWN, Neighbor Down: Adjacency forced to reset
*Mar  1 04:05:01.126: %OSPF-5-ADJCHG: Process 1, Nbr 101.0.0.1 on Serial0/1 from FULL to DOWN, Neighbor Down: Adjacency forced to reset

ABR-VL-R8#
*Mar  1 00:48:12.587: OSPF: 1.1.1.10 address 1.1.1.10 on Serial0/0 is dead
*Mar  1 00:48:12.587: OSPF: 1.1.1.10 address 1.1.1.10 on Serial0/0 is dead, state DOWN
*Mar  1 00:48:12.591: %OSPF-5-ADJCHG: Process 1, Nbr 1.1.1.10 on Serial0/0 from FULL to DOWN, Neighbor Down: Dead timer expired
*Mar  1 00:48:13.091: OSPF: Build router LSA for area 1, router ID 101.0.0.1, seq 0x8000000E
*Mar  1 00:48:18.103: OSPF: Interface OSPF_VL0 going Down
*Mar  1 00:48:18.107: OSPF: 101.0.0.1 address 0.0.0.0 on OSPF_VL0 is dead, state DOWN
*Mar  1 00:48:18.111: OSPF: 101.0.0.2 address 1.1.1.6 on OSPF_VL0 is dead, state DOWN
*Mar  1 00:48:18.111: %OSPF-5-ADJCHG: Process 1, Nbr 101.0.0.2 on OSPF_VL0 from FULL to DOWN, Neighbor Down: Interface down or detached

*Mar  1 00:48:18.615: OSPF: Build router LSA for area 1, router ID 101.0.0.1, seq 0x8000000F
*Mar  1 00:48:18.619: OSPF: Build router LSA for area 0, router ID 101.0.0.1, seq 0x8000000D
*Mar  1 00:48:20.731: OSPF: Rcv pkt from OSPF_VL0 src 1.1.1.6 dst 1.1.1.9 id 101.0.0.2 type 5 if_state 0 : ignored due to unknown neighbor
*Mar  1 00:48:22.887: OSPF: Rcv pkt from OSPF_VL0 src 1.1.1.6 dst 1.1.1.9 id 101.0.0.2 type 4 if_state 0 : ignored due to unknown neighbor

 

Reverting the configuration on A1-R2 to a non-stub area with ‘no area 1 stub’:

‘debug ip ospf adjacency’ output from ABR-VL-R8. It shows the neighbor states for the adjacency with 1.1.1.10 on Serial0/0 first, followed by the virtual link (OSPF_VL0) transitioning to the UP state.

*Mar  1 00:49:15.123: OSPF: 2 Way Communication to 1.1.1.10 on Serial0/0, state 2WAY
*Mar  1 00:49:15.127: OSPF: Send DBD to 1.1.1.10 on Serial0/0 seq 0x1719 opt 0x52 flag 0x7 len 32
*Mar  1 00:49:15.263: OSPF: Rcv DBD from 1.1.1.10 on Serial0/0 seq 0x7F9 opt 0x52 flag 0x7 len 32  mtu 1500 state EXSTART
*Mar  1 00:49:15.267: OSPF: First DBD and we are not SLAVE
*Mar  1 00:49:15.267: OSPF: Rcv DBD from 1.1.1.10 on Serial0/0 seq 0x1719 opt 0x52 flag 0x2 len 132  mtu 1500 state EXSTART
*Mar  1 00:49:15.271: OSPF: NBR Negotiation Done. We are the MASTER
*Mar  1 00:49:15.275: OSPF: Send DBD to 1.1.1.10 on Serial0/0 seq 0x171A opt 0x52 flag 0x3 len 132
*Mar  1 00:49:15.543: OSPF: Rcv DBD from 1.1.1.10 on Serial0/0 seq 0x171A opt 0x52 flag 0x0 len 32  mtu 1500 state EXCHANGE
*Mar  1 00:49:15.547: OSPF: Send DBD to 1.1.1.10 on Serial0/0 seq 0x171B opt 0x52 flag 0x1 len 32
*Mar  1 00:49:15.551: OSPF: Send LS REQ to 1.1.1.10 length 24 LSA count 2
*Mar  1 00:49:15.559: OSPF: Rcv LS UPD from 1.1.1.10 on Serial0/0 length 100 LSA count 1
*Mar  1 00:49:15.903: OSPF: Rcv LS UPD from 1.1.1.10 on Serial0/0 length 76 LSA count 1
*Mar  1 00:49:15.907: OSPF: Rcv LS REQ from 1.1.1.10 on Serial0/0 length 36 LSA count 1
*Mar  1 00:49:15.907: OSPF: Send UPD to 1.1.1.10 on Serial0/0 length 40 LSA count 1
*Mar  1 00:49:15.999: OSPF: Rcv DBD from 1.1.1.10 on Serial0/0 seq 0x171B opt 0x52 flag 0x0 len 32  mtu 1500 state EXCHANGE
*Mar  1 00:49:15.999: OSPF: Exchange Done with 1.1.1.10 on Serial0/0
*Mar  1 00:49:16.003: OSPF: Synchronized with 1.1.1.10 on Serial0/0, state FULL
*Mar  1 00:49:16.003: %OSPF-5-ADJCHG: Process 1, Nbr 1.1.1.10 on Serial0/0 from LOADING to FULL, Loading Done
*Mar  1 00:49:16.007: OSPF: Rcv LS UPD from 1.1.1.10 on Serial0/0 length 148 LSA count 2
*Mar  1 00:49:16.507: OSPF: Build router LSA for area 1, router ID 101.0.0.1, seq 0x80000010
*Mar  1 00:49:19.971: OSPF: Rcv LS UPD from 1.1.1.10 on Serial0/0 length 104 LSA count 2
*Mar  1 00:49:20.751: OSPF: Rcv LS UPD from 1.1.1.10 on Serial0/0 length 112 LSA count 1
*Mar  1 00:49:24.835: OSPF: Rcv LS UPD from 1.1.1.10 on Serial0/0 length 104 LSA count 2
*Mar  1 00:49:30.579: OSPF: Interface OSPF_VL0 going Up
*Mar  1 00:49:30.687: OSPF: 2 Way Communication to 101.0.0.2 on OSPF_VL0, state 2WAY
*Mar  1 00:49:30.687: OSPF: Send DBD to 101.0.0.2 on OSPF_VL0 seq 0x6EB opt 0x72 flag 0x7 len 32
*Mar  1 00:49:31.387: OSPF: Rcv DBD from 101.0.0.2 on OSPF_VL0 seq 0x1FDE opt 0x72 flag 0x7 len 32  mtu 0 state EXSTART
*Mar  1 00:49:31.391: OSPF: NBR Negotiation Done. We are the SLAVE
*Mar  1 00:49:31.391: OSPF: Send DBD to 101.0.0.2 on OSPF_VL0 seq 0x1FDE opt 0x72 flag 0x2 len 192
*Mar  1 00:49:31.767: OSPF: Rcv DBD from 101.0.0.2 on OSPF_VL0 seq 0x1FDF opt 0x72 flag 0x3 len 172  mtu 0 state EXCHANGE
*Mar  1 00:49:31.767: OSPF: Send DBD to 101.0.0.2 on OSPF_VL0 seq 0x1FDF opt 0x72 flag 0x0 len 32
*Mar  1 00:49:31.947: OSPF: Rcv DBD from 101.0.0.2 on OSPF_VL0 seq 0x1FE0 opt 0x72 flag 0x1 len 32  mtu 0 state EXCHANGE
*Mar  1 00:49:31.951: OSPF: Exchange Done with 101.0.0.2 on OSPF_VL0
*Mar  1 00:49:31.951: OSPF: Send LS REQ to 101.0.0.2 length 12 LSA count 1
*Mar  1 00:49:31.955: OSPF: Send DBD to 101.0.0.2 on OSPF_VL0 seq 0x1FE0 opt 0x72 flag 0x0 len 32
*Mar  1 00:49:31.959: OSPF: Rcv LS REQ from 101.0.0.2 on OSPF_VL0 length 48 LSA count 2
*Mar  1 00:49:31.963: OSPF: Send UPD to 1.1.1.6 on OSPF_VL0 length 68 LSA count 2
*Mar  1 00:49:32.023: OSPF: Rcv LS UPD from 101.0.0.2 on OSPF_VL0 length 64 LSA count 1
*Mar  1 00:49:32.027: OSPF: Synchronized with 101.0.0.2 on OSPF_VL0, state FULL
*Mar  1 00:49:32.027: %OSPF-5-ADJCHG: Process 1, Nbr 101.0.0.2 on OSPF_VL0 from LOADING to FULL, Loading Done

*Mar  1 00:49:32.535: OSPF: Build router LSA for area 1, router ID 101.0.0.1, seq 0x80000011
*Mar  1 00:49:32.543: OSPF: Build router LSA for area 0, router ID 101.0.0.1, seq 0x8000000E
*Mar  1 00:49:32.551: OSPF: Rcv LS UPD from 101.0.0.2 on OSPF_VL0 length 76 LSA count 1
*Mar  1 00:49:32.555: OSPF: Rcv LS UPD from 1.1.1.10 on Serial0/0 length 76 LSA count 1
*Mar  1 00:49:40.335: OSPF: Rcv LS UPD from 1.1.1.10 on Serial0/0 length 76 LSA count 1
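
With this much debug output, it can help to pull out just the %OSPF-5-ADJCHG lines and read the state transitions per neighbor. The sketch below is one way to do that; the function name is made up, the sample lines are taken from the logs in this post, and it assumes each message sits on a single line (terminal-wrapped lines need to be rejoined first).

```python
import re

ADJCHG = re.compile(
    r"%OSPF-5-ADJCHG: Process (?P<process>\d+), Nbr (?P<nbr>\S+) on (?P<intf>\S+) "
    r"from (?P<old>\S+) to (?P<new>\S+), (?P<reason>.+)"
)

def adjacency_changes(log_text):
    """Return (neighbor, interface, old_state, new_state, reason) per ADJCHG line."""
    events = []
    for line in log_text.splitlines():
        m = ADJCHG.search(line)
        if m:
            events.append((m["nbr"], m["intf"], m["old"], m["new"], m["reason"].strip()))
    return events

sample = """\
*Mar  1 00:48:18.111: %OSPF-5-ADJCHG: Process 1, Nbr 101.0.0.2 on OSPF_VL0 from FULL to DOWN, Neighbor Down: Interface down or detached
*Mar  1 00:49:16.003: %OSPF-5-ADJCHG: Process 1, Nbr 1.1.1.10 on Serial0/0 from LOADING to FULL, Loading Done
*Mar  1 00:49:32.027: %OSPF-5-ADJCHG: Process 1, Nbr 101.0.0.2 on OSPF_VL0 from LOADING to FULL, Loading Done
"""

events = adjacency_changes(sample)
for nbr, intf, old, new, reason in events:
    print(f"{nbr} on {intf}: {old} -> {new} ({reason})")
```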

 

Configuring Area 0 router with Authentication – Effects and logs

ABR-1-0#

interface Serial0/1
 ip address 192.0.0.17 255.255.255.252
 ip ospf message-digest-key 1 md5 dracula
 serial restart-delay 0
!
router ospf 1
 log-adjacency-changes
 area 0 authentication message-digest
 area 1 virtual-link 101.0.0.1 message-digest-key 1 md5 dracula
 network 1.1.1.4 0.0.0.3 area 1
 network 101.0.0.2 0.0.0.0 area 0
 network 192.0.0.16 0.0.0.3 area 0

ABR-1-0(config-router)#

*Mar  1 00:24:33.475: %OSPF-5-ADJCHG: Process 1, Nbr 192.0.0.18 on Serial0/1 from FULL to DOWN, Neighbor Down: Dead timer expired

ABR-VL-R8#debug ip ospf adj

*Mar  1 00:26:29.507: OSPF: Rcv pkt from 1.1.1.6, OSPF_VL0 : Mismatch Authentication type. Input packet specified type 2, we use type 0
*Mar  1 00:26:31.491: OSPF: 101.0.0.2 address 1.1.1.6 on OSPF_VL0 is dead
*Mar  1 00:26:31.491: OSPF: 101.0.0.2 address 1.1.1.6 on OSPF_VL0 is dead, state DOWN
*Mar  1 00:26:31.495: %OSPF-5-ADJCHG: Process 1, Nbr 101.0.0.2 on OSPF_VL0 from FULL to DOWN, Neighbor Down: Dead timer expired
*Mar  1 00:26:32.003: OSPF: Build router LSA for area 1, router ID 101.0.0.1, seq 0x80000004
*Mar  1 00:26:32.003: OSPF: Build router LSA for area 0, router ID 101.0.0.1, seq 0x80000003
*Mar  1 00:26:34.311: OSPF: Rcv pkt from 1.1.1.6, OSPF_VL0 : Mismatch Authentication type. Input packet specified type 2, we use type 0
*Mar  1 00:26:39.059: OSPF: Rcv pkt from 1.1.1.6, OSPF_VL0 : Mismatch Authentication type. Input packet specified type 2, we use type 0

! !  After adding the md5 authentication statement  ! !

*Mar  1 00:42:50.971: OSPF: Send with youngest Key 1
*Mar  1 00:42:51.291: OSPF: 2 Way Communication to 101.0.0.2 on OSPF_VL1, state 2WAY
*Mar  1 00:42:51.291: OSPF: Send DBD to 101.0.0.2 on OSPF_VL1 seq 0xF8F opt 0x72 flag 0x7 len 32
*Mar  1 00:42:51.295: OSPF: Send with youngest Key 1
*Mar  1 00:42:51.299: OSPF: Send with youngest Key 1
*Mar  1 00:42:51.575: OSPF: Rcv DBD from 101.0.0.2 on OSPF_VL1 seq 0xD88 opt 0x72 flag 0x7 len 32  mtu 0 state EXSTART
*Mar  1 00:42:51.575: OSPF: NBR Negotiation Done. We are the SLAVE
*Mar  1 00:42:51.579: OSPF: Send DBD to 101.0.0.2 on OSPF_VL1 seq 0xD88 opt 0x72 flag 0x2 len 212
*Mar  1 00:42:51.583: OSPF: Send with youngest Key 1
*Mar  1 00:42:51.659: OSPF: Rcv DBD from 101.0.0.2 on OSPF_VL1 seq 0xD89 opt 0x72 flag 0x3 len 212  mtu 0 state EXCHANGE
*Mar  1 00:42:51.663: OSPF: Send DBD to 101.0.0.2 on OSPF_VL1 seq 0xD89 opt 0x72 flag 0x0 len 32
*Mar  1 00:42:51.667: OSPF: Send with youngest Key 1
*Mar  1 00:42:51.787: OSPF: Rcv DBD from 101.0.0.2 on OSPF_VL1 seq 0xD8A opt 0x72 flag 0x1 len 32  mtu 0 state EXCHANGE
*Mar  1 00:42:51.791: OSPF: Exchange Done with 101.0.0.2 on OSPF_VL1
*Mar  1 00:42:51.791: OSPF: Send LS REQ to 101.0.0.2 length 48 LSA count 4
*Mar  1 00:42:51.795: OSPF: Send with youngest Key 1
*Mar  1 00:42:51.795: OSPF: Send DBD to 101.0.0.2 on OSPF_VL1 seq 0xD8A opt 0x72 flag 0x0 len 32
*Mar  1 00:42:51.799: OSPF: Send with youngest Key 1
*Mar  1 00:42:51.803: OSPF: Rcv LS REQ from 101.0.0.2 on OSPF_VL1 length 84 LSA count 5
*Mar  1 00:42:51.807: OSPF: Send with youngest Key 1
*Mar  1 00:42:51.807: OSPF: Send UPD to 1.1.1.6 on OSPF_VL1 length 172 LSA count 5
*Mar  1 00:42:52.139: OSPF: Rcv LS UPD from 101.0.0.2 on OSPF_VL1 length 160 LSA count 4
*Mar  1 00:42:52.143: OSPF: Synchronized with 101.0.0.2 on OSPF_VL1, state FULL
*Mar  1 00:42:52.143: %OSPF-5-ADJCHG: Process 1, Nbr 101.0.0.2 on OSPF_VL1 from LOADING to FULL, Loading Done

*Mar  1 00:42:52.651: OSPF: Build router LSA for area 1, router ID 101.0.0.1, seq 0x80000006
*Mar  1 00:42:52.655: OSPF: Send with youngest Key 1
*Mar  1 00:42:52.659: OSPF: Build router LSA for area 0, router ID 101.0.0.1, seq 0x80000005
*Mar  1 00:42:52.667: OSPF: Rcv LS UPD from 101.0.0.2 on OSPF_VL1 length 88 LSA count 1
*Mar  1 00:42:52.671: OSPF: Rcv LS UPD from 1.1.30.1 on Serial0/0 length 76 LSA count 1
*Mar  1 00:42:54.647: OSPF: Send with youngest Key 1

 

‘sh ip ospf virtual-links’ indicates Message digest authentication is enabled for the Virtual Link.
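
The "type 2 ... type 0" numbers in the mismatch messages above are the OSPF authentication types: 0 is null, 1 is a simple password, and 2 is cryptographic (MD5 in this setup). The sketch below turns such a debug line into a readable sentence; the function name is hypothetical and the regular expression is written against the message text exactly as it appears in this post, assuming the message is on a single line.

```python
import re

AUTH_TYPES = {
    0: "null (no authentication)",
    1: "simple password",
    2: "cryptographic (MD5 here)",
}

MISMATCH = re.compile(
    r"Mismatch Authentication type\. Input packet specified type (\d+), we use type (\d+)"
)

def explain_mismatch(line):
    """Describe an authentication-type mismatch line, or return None if it is not one."""
    m = MISMATCH.search(line)
    if not m:
        return None
    theirs, ours = int(m.group(1)), int(m.group(2))
    return (f"neighbor sends {AUTH_TYPES.get(theirs, theirs)}, "
            f"local side uses {AUTH_TYPES.get(ours, ours)}")

line = ("*Mar  1 00:26:29.507: OSPF: Rcv pkt from 1.1.1.6, OSPF_VL0 : Mismatch Authentication type. "
        "Input packet specified type 2, we use type 0")
print(explain_mismatch(line))
```

Here the neighbor (ABR-1-0) is already sending MD5 while ABR-VL-R8 still uses null authentication on the virtual link, which is why the adjacency drops until the matching key is added.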

October 29, 2008 | Protocols - OSPF

ESX – Insuff. resources .. for HA

Got this error when trying to start a new virtual machine. Bad news!! The cluster is running out of memory.

No problem!! Check “Allow virtual machines to be powered on even if they violate availability constraints” and you are good to go. This setting is found under Cluster Settings -> ‘VMware HA’. This is a quick but dangerous way to start additional VMs.

You really have to make sure that the ESX hosts are using less than 70% of their memory (this is purely my guesstimate!!); the threshold can be much lower in scenarios where servers tend to take big chunks at intervals. A quick check of the host memory % in the cluster can be made under cluster -> hosts.

How do I estimate how many VMs I can run on my clustered infrastructure?

Simple!

  • Take the VM that has been assigned the most memory, e.g. 1 GB. Figure out which ESX host in the cluster has the least memory (if all ESX hosts in the cluster have the same amount, there is nothing to figure out), e.g. 24 GB. So 24/1 = 24 VMs per host; with three ESX hosts in the cluster, 24*3 = 72 VMs.
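
The rule of thumb above is easy to write down as a small function. This is only a sketch of that back-of-the-envelope estimate: the function and parameter names are made up, and the `reserve_hosts` parameter is my own addition (not part of the estimate above) for leaving whole hosts spare for failover.

```python
def estimate_vm_capacity(largest_vm_gb, host_memory_gb, reserve_hosts=0):
    """Rough VM count: (smallest host memory // largest VM memory) * usable hosts."""
    per_host = int(min(host_memory_gb) // largest_vm_gb)
    usable_hosts = max(len(host_memory_gb) - reserve_hosts, 0)
    return per_host * usable_hosts

# The example from the text: 1 GB VMs on three 24 GB hosts.
print(estimate_vm_capacity(1, [24, 24, 24]))                   # 72
# Same cluster, keeping one host's worth of capacity free (assumption, see above).
print(estimate_vm_capacity(1, [24, 24, 24], reserve_hosts=1))  # 48
```

It ignores hypervisor overhead and the 70% guideline mentioned earlier, so treat the result as an upper bound rather than a target.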

To power on another VM without breaking a sweat while keeping ‘violate availability constraints’ enabled, add another ESX host to the cluster.

October 24, 2008 | ESX-VMware

ESX – Orphaned VM

Log in to the host where the VM was last running.

  • vmkfstools -D /vmfs/volumes/datastore/<any dir to save>    // dumps the information to /var/log/vmkernel
  • cat /var/log/vmkernel | grep owner

The ‘cat’ command above will print something like ‘44444444-55555555-6666-00166666666’; check the latest owner message.

The last field (after the final hyphen) is the part to compare with the system UUID.

The following command will print out the system uuid when run on each of the ESX hosts:

  • esxcfg-info | grep -i 'system uuid' | awk -F '-' '{print $NF}'

Compare the ‘cat’ command output with the ‘esxcfg-info’ command output to identify the host that owns the lock. Run the following command on that host to list the process that is holding the VM hostage:

  • ps -elf | grep <hostname i.e virtual machine name>

If you get back the number of the process that is holding the file, kill the process and that’s it.
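
With more than a couple of hosts, the comparison step can be scripted once the one-liner output has been collected from each host. The sketch below is hypothetical: the helper names and host names are made up, and it assumes the last hyphen-separated field of the owner string is what the awk one-liner prints on the owning host.

```python
def owner_suffix(owner_uuid):
    """Last hyphen-separated field of the owner string from the vmkernel log."""
    return owner_uuid.strip().rsplit("-", 1)[-1].lower()

def find_lock_owner(owner_uuid, host_suffixes):
    """host_suffixes maps host name -> output of the esxcfg-info/awk one-liner on that host."""
    wanted = owner_suffix(owner_uuid)
    return [host for host, suffix in host_suffixes.items()
            if suffix.strip().lower() == wanted]

# Placeholder values in the same style as the example owner string above.
hosts = {"esx01": "00166666666", "esx02": "00177777777"}
print(find_lock_owner("44444444-55555555-6666-00166666666", hosts))
```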

This never worked for me :D; the process status (ps) command did not show any process with the ‘vmname’. I tried deleting the files from the Virtual Center server console and it deleted everything except the vmdk file. A quick VMware Communities lookup indicated that the resolution is to reboot the ESX host. The next day I tried deleting the directory and the vmdk file, and it worked!! So something was holding the file and released it overnight.

October 22, 2008 | ESX-VMware