Thursday, September 24, 2009

QoS – Congestion Management demystified!!

  

For each queueing mechanism below: what it does, the number of queues, the configuration commands, and how to verify.

Priority Queue


 

Play with 4 static queues: high, medium, normal, low.

Queues: 4 static

priority-list 1 protocol ip high list 101
priority-list 1 protocol ip medium gt 120
access-list 101 permit ip any 150.1.1.0 0.0.0.127
!
int fa0/0
 priority-group 1


 

#show queueing interface fa0/0

#sh queueing priority

High is always serviced before Medium, Medium before Normal, and Normal before Low; a lower queue is served only when all higher queues are empty.
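A slightly fuller legacy PQ sketch (the default queue and the queue depths below are illustrative additions, not from the original config): unmatched traffic should be pinned to a queue, and per-queue depths can be tuned.

priority-list 1 default normal -> unmatched traffic goes to the normal queue
priority-list 1 queue-limit 20 40 60 80 -> per-queue depth for high/medium/normal/low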

Custom queue


 

Play with 16 static queues, serviced round-robin based on a per-queue byte count and packet limit.

Queues: 16 static

access-list 101 permit ip any 150.1.1.0 0.0.0.127
queue-list 1 protocol ip 1 list 101
queue-list 1 protocol ip 2 tcp 80
…..
queue-list 1 queue 6 limit 100 -> the number of packets that can be queued = 100 (default: 20 packets)
queue-list 1 queue 6 byte-count 2000 -> 2000 bytes will be served before moving to the next queue (default: 1500 bytes)

int fa0/0

custom-queue-list 1


 

#show queueing interface fa0/0

#sh queueing custom

Custom queueing is round-robin: frames in each queue are serviced until that queue's byte-count threshold is met, then the frames in the next queue are serviced.
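One hedged addition to the sketch above (the queue number is illustrative): unmatched traffic can be steered to a default queue.

queue-list 1 default 3 -> unmatched traffic lands in queue 3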

Weighted Fair queue or

Fair queue


 

This queuing strategy only makes sense if IPP is used. Else it behaves exactly like FIFO. Using IPP assigns weights and thereby customizes the flow.

Or if CDT is configured.


 

Even without IPP values, WFQ still works dynamically based on source/destination IP and port. Each such flow is assigned a queue, and multiple flows can share the same queue when there are more flows than queues. Each queue gets its turn; kind of a multi-door exit.


 


 

For better control:

Pre-mark the traffic with an IP Precedence value. The higher the precedence, the lower the weight and the more priority the traffic gets.
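A minimal MQC premarking sketch (the class name CM_VOICE and ACL 101 are illustrative, not from the original post):

class-map match-all CM_VOICE
 match access-group 101
!
policy-map PM_PREMARK
 class CM_VOICE
  set ip precedence 5
!
int fa0/0
 service-policy input PM_PREMARK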


 

Queues are dynamic (one per flow/conversation), but the count can still be configured manually; tune the CDT value to drop packets from any queue that crosses a certain limit.

Queues: dynamic, per flow/conversation; 256 by default (16-4096 configurable manually)


 

Weight = 32384/(IPP+1)

Default weight = 32384 (IPP=0); e.g. IPP 5 gives 32384/6 ≈ 5397.


 

int s0/0
 hold-queue <N> out -> total max buffer across all queues; outbound software queue length of N packets

hold-queue 256 out


 

fair-queue <CDT> <N flow queues> <N reservable queues>

fair-queue 16 128 8

16 -> the congestive discard threshold (CDT): each queue can hold 16 packets before drops begin, instead of the default 64.


 

128 -> the number of dynamic queues;


 

8 -> the number of queues reserved for RSVP (Resource Reservation Protocol); a 0 here would mean no queues are used with RSVP.


 

If any queue crosses 16 packets, one packet is dropped from the most aggressive queue, i.e. the one with the maximum schedule time (which can be a different queue from the one that crossed the threshold).


 

EHDF-PMM-RTR1#sh queueing fair
Current fair queue configuration:

Interface     Discard    Dynamic  Reserved  Link    Priority
              threshold  queues   queues    queues  queues
Serial0/2/0   64         256      0         8       1


 

#sh queueing s2/0 -> remember this queue is different from the traffic-shaping queue. The TS queue is shown using #sh traffic-shape queue


 

DFM#sh queue fastEthernet 0/1
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 3945592
Queueing strategy: weighted fair
Output queue: 47/1000/64/3945592 (size/max total/threshold/drops)
Conversations 1/2/16 (active/max active/max total)
1 -> only one conversation queued right now (at that very second)
2 -> the maximum queued at any given moment
16 -> the maximum number of queues, as defined with "fair-queue 64 16 0"
Reserved Conversations 0/0 (allocated/max allocated)
Available Bandwidth 7500 kilobits/sec


 

(depth/weight/total drops/no-buffer drops/interleaves) 47/32384/3526957/0/0
Conversation 7, linktype: ip, length: 62
source: 169.253.10.2, destination: 192.168.58.2, id: 0x5B67, ttl: 127, prot: 17

WFQ is the default on all interfaces slower than an E1 line (2.048 Mbps).


The IOS implementation of WFQ assigns weights automatically based on the IP Precedence (IPP) value in the packet's IP header. Worked example with four flows carrying IPP values 0, 1, 1 and 3:

Step 1:
Compute the weights:
Weight(1) = 32384/(0+1) = 32384
Weight(2) = 32384/(1+1) = 16192
Weight(3) = 32384/(1+1) = 16192
Weight(4) = 32384/(3+1) = 8096
Step 2:
Compute the sum of all weights:
Sum(Weight(i),1…4) = 32384+16192*2+8096 = 72864
Step 3:
Compute the shares:
Share(1) = 72864/Weight(1) = 72864/32384 = 2.25
Share(2) = 72864/Weight(2) = 72864/16192 = 4.5
Share(3) = 72864/Weight(3) = 72864/16192 = 4.5
Share(4) = 72864/Weight(4) = 72864/8096 = 9
The proportion is 2.25:4.5:4.5:9 = 1:2:2:4
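Equivalently (a restatement of the arithmetic above, not from the original post), each flow's fraction of the link is inversely proportional to its weight:

$$\frac{BW_i}{BW_{link}} = \frac{1/W_i}{\sum_j 1/W_j}$$

With the weights above, the 1/W values come out 1:2:2:4, so flow 4 gets 4/9 of the link.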

Once any queue crosses the CDT value, one packet is dropped from whichever queue has the maximum schedule time, i.e. the most aggressive queue (which may not be the queue that crossed the threshold). This is somewhat similar to WRED.


 

During congestion,

if one flow of 96 kbps is assigned IP PREC = 5,

and another flow of 500 kbps is assigned IP PREC = 0 (default),

the scheduling ratio works out to roughly 6:1 in favor of the IP PREC 5 flow (worked out below).

So six packets of the 96 kbps flow are sent for every one packet of the 500 kbps flow. This ratio depends only on the IP PREC values, not on the received throughput: even if the IP PREC 0 traffic ramps up to 1 Mbps, the ratio stays the same.

That's the purpose of WFQ!!
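The 6:1 figure follows directly from the weight formula:

$$\frac{W_{IPP=0}}{W_{IPP=5}} = \frac{32384/(0+1)}{32384/(5+1)} = \frac{32384}{5397} \approx 6$$

Since bandwidth share is inversely proportional to weight, the IPP 5 flow is serviced six times as often.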

--------------------------------

Test:

policy-map PM_MARK
 class CM_TELNET
  set ip precedence 5
 class CM_UDP1000
  set ip precedence 4
 class CM_UDP1001
  set ip precedence 1
 class CM_UDP1002
  set ip precedence 2
 class CM_UDP1003
  set ip precedence 3

The stats below were taken at different points in time. They show that the queues are totally dynamic and the flow-to-queue mapping changes every second: conversation 11 is assigned to both UDP1000 and UDP1002, whereas their IPP values are different.

DFM#sh queue fastEthernet 0/1

(depth/weight/total drops/no-buffer drops/interleaves) 426/10794/92394/13989/0

Conversation 11, linktype: ip, length: 1242

source: 169.253.10.2, destination: 192.168.58.2, id: 0x6213, ttl: 127,

TOS: 64 prot: 17, source port 4932, destination port 1002


 

(depth/weight/total drops/no-buffer drops/interleaves) 571/8096/15724/2483/0

Conversation 2, linktype: ip, length: 1242

source: 169.253.10.2, destination: 192.168.58.2, id: 0x6218, ttl: 127,

TOS: 96 prot: 17, source port 4938, destination port 1003


 

(depth/weight/total drops/no-buffer drops/interleaves) 517/6476/157534/24106/0

Conversation 11, linktype: ip, length: 1242

source: 169.253.10.2, destination: 192.168.58.2, id: 0xF60C, ttl: 127,

TOS: 128 prot: 17, source port 4934, destination port 1000

Class based Weighted Fair queue (CBWFQ)


 

CBWFQ is not supported on sub-interfaces on their own; IOS reports "CBWFQ : Not supported on sub-interfaces without hierarchical policy".

CBWFQ is supported on sub-interfaces if combined with a hierarchical policy that has "shaping" enabled; otherwise it can be applied only to physical interfaces.


 

policy-map child
 class voice
  priority 512 -> or "bandwidth 512"
!
policy-map parent
 class class-default
  shape average 2000000
  service-policy child
!
interface ethernet0/0.1
 service-policy output parent -> valid
 service-policy output child -> not valid if the child is applied directly
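For completeness, a hedged sketch of the class-map the child policy refers to (matching on DSCP EF is an assumption, not from the original post):

class-map match-all voice
 match ip dscp ef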


Classes are defined:


 

Static: non-default classes ->
Match custom traffic; bandwidth is defined statically for the class. The weight is derived from a formula such that a static class always supersedes a dynamic queue/class.


 

Dynamic: class-default ->
Behaves like fair-queue, with dynamic queues whose weights are assigned based on IP Precedence values.


 


 

Note that if shaping is enabled, CBWFQ queues are numbered according to the traffic-shaping queue numbers, not the WFQ numbers.

Queues: dynamic queues per flow + static queues with lower weight (higher priority), based on the number of flows (not classes).


CBWFQ Queue numbering Part 1:

Assuming WFQ queues = 256, and traffic shaping disabled.

(If TS were used, the TS queues would number the queues instead; the TS queue count is 16 by default.)

0-255 = WFQ dynamic
256-263 = system reserved
264 (256+8) = priority
265 onwards = CBWFQ

The count depends on the number of classes used. (And on the ACEs in a class?)


WFQ flows   Constant
16          64
32          64
64          57
128         30
256         16
512         8
1024        4
2048        2
4096        1


 

interface Serial 0/1
 bandwidth 128
 max-reserved-bandwidth 100 -> default is 75%
 no fair-queue -> WFQ disabled here and configured under class-default instead; manual disabling is needed on interfaces below 2 Mbps, where fair-queue is the default
 hold-queue 512 out -> total buffers across all classes combined
!
policy-map SERIAL_LINK
 class HTTP
  bandwidth 32
  queue-limit 16 -> number of packets this queue can hold
 class SCAVENGER
  bandwidth 32 -> or "bandwidth percent 25"
  queue-limit 24
 class class-default
  fair-queue -> practically not needed, since class-default is auto-treated as fair-queue; the only reason I can think of is that without it the next command, "queue-limit", can't be used
  queue-limit 32 -> the maximum number of packets the queue can hold, default 64


 


 

#sh service-pol int fa0/0


 


 

CBWFQ Queue numbering Part 2:

If TS is enabled, queue numbering for CBWFQ starts from 25:

0-15 = shaping queues
16-23 = system reserved
24 (16+8) = priority
25 onwards = CBWFQ

The count depends on the number of classes used. (One class gives one queue??? Test using more than one ACE inside a class.)


 


 

If the idea is to allocate "bandwidth" to a few classes and then police the full policy to some specific CIR, this is not supported:

#CBWFQ : Hierarchy supported only if shaping is configured in this class

So the full policy can't be policed; it can only be shaped, since the shaper supplies the internal shaping queues that CBWFQ needs for its separate queues.


 

This problem occurs even if we manually configure "fair-queue" under class-default.

It's better to use LLQ, which has an inbuilt policer.

Or put shaping on the same reserved BW, as sketched below.
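A minimal sketch of that shaping workaround (policy names and rates are illustrative): shape the whole policy to the CIR in a parent policy and allocate bandwidth in the child.

policy-map CHILD
 class HTTP
  bandwidth 32
!
policy-map PARENT
 class class-default
  shape average 256000
  service-policy CHILD
!
int s0/1
 service-policy output PARENT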


 

E.g. 2:

UDP1000 sending 4 Mbps with LLQ 6 Mbps

UDP1001 sending 20 Mbps

Class-map: CM_UDP1000 (match-all)
140021 packets, 173906082 bytes
30 second offered rate 4140000 bps -> CBWFQ gets its 4 Mbps
Match: access-group 100

Class-map: CM_UDP1001 (match-all)
190593 packets, 236716506 bytes
30 second offered rate 5669000 bps -> the rest, unclassified, gets ~6 Mbps
Match: access-group 101

 


 

On 12.4 mainline:

CBWFQ leaves its unused bandwidth for the rest.

But if the CBWFQ flow offers more, it will eat more and the others will suffer!


 

On 12.4T, CBWFQ has an inbuilt policer.

CBWFQ is the MQC equivalent of a combination of the legacy interface-level weighted fair-queue command and custom queueing.

It has dynamic queues + link queues (8 in total) + static queues.
IOS auto-calculates the number of dynamic queues.
Static queues are defined manually and are numbered after the dynamic queues, then the link queues (system queues, 8 in total). E.g. 32 dynamic queues means manual queues start from 41 onward (33-40 are link queues used for L2 keepalives and L3 routing updates).
In summary, the key point about CBWFQ is that it uses the same scheduling logic as legacy WFQ, but user-configurable classes get a specially low weight based on a constant, making them more important than any dynamic conversation.


 

Weights are assigned based on WFQ logic: Weight(dynamic) = 32384/(IP Precedence+1).

Static Weight(i) = Constant * Interface_Bandwidth/Bandwidth(i); the constant is inversely proportional to the number of dynamic queues (see the table above).
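Putting the two formulas together (a worked sketch; the 10 Mbps interface and 6 Mbps class are illustrative values, not from the original post). With 256 dynamic queues the constant is 16, so:

$$W_{static} = C \cdot \frac{BW_{int}}{BW_{class}} = 16 \cdot \frac{10000}{6000} \approx 27 \qquad \text{vs.} \qquad W_{dyn,min} = \frac{32384}{7+1} = 4048$$

Even the best possible dynamic conversation (IPP 7) carries weight 4048, far above the static class's 27, which is why bandwidth classes always win over dynamic flows.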


 


 

So traffic that falls into a class with the "bandwidth" keyword defined is treated as CBWFQ; individual flows are allotted individual CBWFQ queues.

Any flows inside a class without "bandwidth" are allotted a WFQ queue, even if only a single class has "bandwidth" configured.

 

On 12.4 mainline (behavior changed in 12.4T(24), which has a kind of inbuilt policer):

This is dangerous if a heavy traffic stream is configured with "bandwidth" and class-default is left as it is.

E.g. 4 flows, each of 20 Mbps; total link is 10 Mbps; one flow is reserved 6 Mbps; all the rest default (default flows use fair queueing with a much higher weight, i.e. lower priority).

Working: the heavy reserved traffic is guaranteed 6 Mbps, plus it eats more thanks to its lower weight. Practically it went to 9.2 Mbps, and the others got only kbps.

So make sure class-default is also allocated some of the remaining bandwidth, e.g. 3990 kbps in this example.
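A minimal sketch of that fix (same class names as the example; the 3990 kbps figure comes from the text above):

policy-map PM_QOS
 class CM_UDP1000
  bandwidth 6000
 class class-default
  bandwidth 3990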


 

Or the same LLQ-type behavior can be achieved by shaping the 6 Mbps reserved class so that it doesn't cross a certain limit. E.g.:

policy-map PM_QOS
 class CM_UDP1000
  bandwidth 6000
  shape average 6000000
!
interface FastEthernet0/1
 bandwidth 10000
 ip address 192.168.58.1 255.255.255.0
 load-interval 30
 duplex auto
 speed 10
 no keepalive
 max-reserved-bandwidth 100
 service-policy output PM_QOS


 

On 12.4T, CBWFQ behaves as if it has an inbuilt policer for the reserved BW.
E.g. 1:

UDP1000 with BW=6000 kbps sending at the rate of 8 Mbps

UDP1001 sending at 20 Mbps

In 12.4T,

WAN#sh policy-map int fa0/1
FastEthernet0/1

Service-policy input: PM_QOS_STATS

Class-map: CM_UDP1000 (match-all)
226396 packets, 281183832 bytes
30 second offered rate 5945000 bps -> CBWFQ gets 6 Mbps even when sending at 8 Mbps
Match: access-group 100

Class-map: CM_UDP1001 (match-all)
146264 packets, 181659888 bytes
30 second offered rate 3864000 bps -> the other flow gets the remaining 4 Mbps
Match: access-group 101


 

In 12.4 mainline,

WAN#sh policy-map int fa0/1
FastEthernet0/1

Service-policy input: PM_QOS_STATS

Class-map: CM_UDP1000 (match-all)
1185545 packets, 1472446890 bytes
30 second offered rate 8161000 bps -> CBWFQ gets the full 8 Mbps it is sending, thanks to its lower weight
Match: access-group 100

Class-map: CM_UDP1001 (match-all)
373738 packets, 464182596 bytes
30 second offered rate 1649000 bps -> the other flow gets only about 1.6 Mbps
Match: access-group 101

LLQ - Low Latency queue


 

A combo of one priority queue + CBWFQ.


 

LLQ on IOS has a BW parameter in the priority keyword.


 

PIX/ASA don't have this BW parameter; they don't police the priority queue.
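A hedged ASA/PIX sketch for contrast (interface and class names are illustrative): the priority action takes no rate argument, so nothing caps the priority queue.

priority-queue outside
!
class-map CM_VOICE
 match dscp ef
!
policy-map PM_OUT
 class CM_VOICE
  priority
!
service-policy PM_OUT interface outside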


 


 

max-reserved-bandwidth on the interface applies to both LLQ and bandwidth reservations.

Both values should fit within the reservable bandwidth. E.g. interface = 10 Mbps, LLQ = 6000 kbps, as below:

DFM(config-if)# max-reserved-bandwidth 50

Reservable bandwidth is being reduced.

Some existing reservations may be terminated.

CBWFQ: Not enough available bandwidth for all classes Available 5000 (kbps) Needed 6010 (kbps)

Queues: 1 static priority queue + CBWFQ (dynamic queues per flow + static queues with lower weight, i.e. higher priority).


 


 

LLQ auto-enables CBWFQ on the interface for traffic that doesn't carry the "priority" keyword.

That traffic is treated per CBWFQ if "bandwidth" is assigned,

else per WFQ if IPP is assigned,

or just treated equally.


On some IOS versions the priority command is accepted without a kbps argument; in that case the entire interface bandwidth is available to the LLQ. This is seen in IOS 12.1 on the Cisco 7300.

policy-map SERIAL_LINK
 class VOICE
  priority 27 -> 27 kbps (e.g. a 60-byte L3 packet at 50 packets per second = (60 bytes L3 + 7 bytes L2) * 50 * 8 bits = 26800 bps ≈ 27 kbps); traffic exceeding this is auto-dropped based on a single token bucket system; auto-maps to dscp ef
 class HTTP
  no bandwidth
  bandwidth remaining percent 33
 class SCAVENGER
  no bandwidth
  bandwidth remaining percent 33


 

#sh service-pol int fa0/0


 


 

e.g. 3:

UDP1000 sending 20 Mbps with LLQ 6 Mbps

UDP1001 sending 5 Mbps with IPP = 0 (no marking)

Total link = 10 Mbps


 

Class-map: CM_UDP1000 (match-all)

88004 packets, 109300968 bytes

30 second offered rate 6015000 bps -> 6 Mbps

Match: access-group 100


 

Class-map: CM_UDP1003 (match-all)

54388 packets, 67549896 bytes

30 second offered rate 3795000 bps -> 3.8 Mbps

Match: access-group 103


 

This means that during slight congestion the LLQ doesn't fight for anything more: it stays within its limit and the other flows get the rest.


 

Without congestion, LLQ can go up to 100% of the interface bandwidth (even when only 75% is available for allotment).

Queues: 1 priority queue with a defined, internally policed BW + class-based weighted fair queueing.


 

e.g. 1: 4 flows of 20 Mbps each, with LLQ

Class-map: CM_UDP1000 (match-all) -> LLQ 6000kbps

30 second offered rate 5981000 bps = 5.9 Mbps

Match: access-group 100


 

Class-map: CM_UDP1001 (match-all) -> set ip precedence 1

30 second offered rate 846000 bps -> 846Kbps

Match: access-group 101    


 

Class-map: CM_UDP1002 (match-all) -> set ip precedence 2

30 second offered rate 1269000 bps -> 1.26 Mbps

Match: access-group 102


 

Class-map: CM_UDP1003 (match-all) -> IPP 3

30 second offered rate 1693000 bps -> 1.7 Mbps

Match: access-group 103


 

Class-map: class-default (match-any)

97 packets, 5820 bytes

30 second offered rate 0 bps, drop rate 0 bps


 

e.g. 2:

UDP1000 sending 20 Mbps with LLQ 6 Mbps

UDP1001 sending 3 Mbps with IPP 1

Total link = 10 Mbps


 

Class-map: CM_UDP1000 (match-all) -> LLQ 6000
66054 packets, 82039068 bytes
30 second offered rate 6997000 bps -> ~7 Mbps
Match: access-group 100

Class-map: CM_UDP1001 (match-all) -> IPP 1
26277 packets, 32636034 bytes
30 second offered rate 2811000 bps -> ~2.8 Mbps
Match: access-group 101


 

e.g. 3:

UDP1000 sending 2 Mbps with LLQ 6 Mbps

UDP1001 sending 20 Mbps

Result:

LLQ – UDP1000 gets its 2 Mbps and the remaining 8 Mbps stays free.

That remaining 8 Mbps can be used by any other flow without issue.


 


 

e.g. 4:

UDP1000 sending 8 Mbps with LLQ 6 Mbps

UDP1001 sending 20 Mbps


 

Class-map: CM_UDP1000 (match-all)
455068 packets, 565194456 bytes
30 second offered rate 6000000 bps -> LLQ doesn't try to get more than 6 Mbps
Match: access-group 100

Class-map: CM_UDP1001 (match-all)
288129 packets, 357856218 bytes
30 second offered rate 3810000 bps -> the rest of the flows get the remaining 4 Mbps
Match: access-group 101
