“Packets were dropped” Faults in ACI
ACI can raise several faults containing the text “packets were dropped during” or similar. They can have different causes, but most of the time they come from a limitation of the atomic counters.
Cisco itself states in Bug #99892:
When packets are forwarded from one vPC peer to the other vPC peer and the vPC virtual IP address [VIP] (which is configured on both vPC peers) is used as a source address, then received packets that are counted by atomic counters are not counted correctly.
This miscounting then leads to different faults in ACI, for example F1545.
So if you see faults related to Cisco ACI packet drops, this is normally nothing to worry about!
When are the Atomic Counters not reliable?
There are two main cases in which the atomic counters are not reliable and generate F1545 and similar faults:
- Using 2nd generation hardware as leaf switches in ACI, e.g. 93180YC-EX, 93180YC-FX, 93108TC-EX, etc.
- Having more than 64 TEPs (the sum of leafs, spines and VPC pairs), in which case ACI uses path mode for the atomic counters.
Whether the second case applies can be checked by querying the class “dbgOngoingAcMode”. There are different ways to query this class:
- ACI API with Postman
- Moquery: SSH to the APIC and run
apic# moquery -c dbgOngoingAcMode | grep mode
- With Visore, go to https://<APIC_IP>/visore.html and then search for the class.
If the mode is shown as path, then your Atomic counters aren’t reliable and the Faults F1545 and similar can be ignored.
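The same check can also be scripted against the REST API. The following is a minimal sketch, not a tested tool: it assumes the standard `aaaLogin` endpoint and a class query on `dbgOngoingAcMode`, and the function names (`class_query_url`, `extract_modes`, `counters_unreliable`, `fetch_ac_mode`) are made up for this example. The exact shape of the reply on your APIC version should be verified before relying on it.

```python
import json
import ssl
import urllib.request


def class_query_url(apic_host, class_name="dbgOngoingAcMode"):
    """Build the REST class-query URL for the given APIC host."""
    return f"https://{apic_host}/api/class/{class_name}.json"


def extract_modes(response):
    """Collect the 'mode' attribute of every dbgOngoingAcMode object in a reply."""
    modes = []
    for item in response.get("imdata", []):
        attrs = item.get("dbgOngoingAcMode", {}).get("attributes", {})
        if "mode" in attrs:
            modes.append(attrs["mode"])
    return modes


def counters_unreliable(response):
    """True if any returned object reports path mode."""
    return "path" in extract_modes(response)


def fetch_ac_mode(apic_host, user, password, verify_tls=True):
    """Log in to the APIC and return the parsed class-query reply."""
    ctx = ssl.create_default_context()
    if not verify_tls:
        # Only for labs with self-signed certificates.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE

    login_body = json.dumps(
        {"aaaUser": {"attributes": {"name": user, "pwd": password}}}
    ).encode()
    login_req = urllib.request.Request(
        f"https://{apic_host}/api/aaaLogin.json",
        data=login_body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(login_req, context=ctx) as resp:
        token = json.load(resp)["imdata"][0]["aaaLogin"]["attributes"]["token"]

    # The session token is sent back as the APIC-cookie cookie.
    query_req = urllib.request.Request(
        class_query_url(apic_host),
        headers={"Cookie": f"APIC-cookie={token}"},
    )
    with urllib.request.urlopen(query_req, context=ctx) as resp:
        return json.load(resp)
```

A call such as `counters_unreliable(fetch_ac_mode("<APIC_IP>", "admin", "<password>"))` would then tell you whether the fabric runs the atomic counters in path mode.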
How to clean Faults F1545 and others related to “packets were dropped during”?
After you have confirmed that the counters aren’t reliable in your fabric, you can squelch the faults, which makes sure that they no longer appear in the Faults list of ACI.
This can be done under Fabric -> Fabric Policies -> Policies -> Monitoring -> default -> Fault Severity Assignment Policies
Create squelch policies to suppress faults F1545 and related
First you need to modify the existing Policies.
Then you need to enable the different underlying monitoring objects to be able to squelch the related faults. Click on the pencil icon to do that.
Scroll down until you find this block of monitoring objects and enable all of them.
Then do a first submit. Now you can start to apply the Squelch action to the different monitoring objects. Select one of them through the dropdown, then add the Fault and the Initial Severity.
Repeat this for all the related monitoring objects, namely:
- Atomic Counter for a Path (dbg.AcPath)
- Atomic Counter for a Path from a VPC pair to a VPC pair (dbg.SDVPCPath)
- Atomic Counter for a Path from a VPC pair to a single node (dbg.SVPCPath)
- Atomic Counter for a Path from a single node to a VPC pair (dbg.DVPCPath)
- Atomic Counter for a Trail (dbg.AcTrail)
It’s quite a bit of clicking :) But in the end it should look like this.
JSON to create the squelch policy
I prepared the JSON to squelch all the mentioned faults:
- F1545
- F1546
- F1547
- F1548
- F1549
- F1550
- F1551
- F1552
You can find it as a gist here. POST it against https://<APIC_IP>/api/mo/uni/fabric/monfab-default.json and all the policies will appear.
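If you would rather generate such a payload yourself, the structure can be sketched in a few lines of Python. This is only an illustration under assumptions and is not the content of the gist: the class names `monFabricTarget` and `faultSevAsnP`, the attribute names `scope`, `code` and `initial`, and the value `squelched` are my understanding of the object model. The scope name `dbgAcPath` is derived from the `dbg.AcPath` label above, and which fault code belongs to which monitoring object is left to the caller. A safe way to verify all of this is to create one policy in the GUI and save it as JSON.

```python
import json


def build_squelch_payload(faults_by_scope):
    """Build a monitoring-policy payload that squelches the given fault codes.

    faults_by_scope maps a monitoring object (scope) to a list of fault codes.
    Class and attribute names are assumptions, see the text above.
    """
    targets = []
    for scope, codes in faults_by_scope.items():
        policies = [
            {"faultSevAsnP": {"attributes": {"code": code, "initial": "squelched"}}}
            for code in codes
        ]
        targets.append(
            {"monFabricTarget": {"attributes": {"scope": scope}, "children": policies}}
        )
    return {
        "monFabricPol": {
            # DN taken from the POST URL: uni/fabric/monfab-default
            "attributes": {"dn": "uni/fabric/monfab-default"},
            "children": targets,
        }
    }


# Placeholder mapping for illustration only; fill in your own scopes and codes.
example = build_squelch_payload({"dbgAcPath": ["F1545"]})
print(json.dumps(example, indent=2))
```

The printed JSON is what would be sent as the body of the POST request.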
Overview of the Faults
F1545 This fault occurs when a significant number of packet drops are detected by a configured and enabled Atomic Counter
F1546 This fault occurs when a small number of packet drops are detected by a configured and enabled Atomic Counter
F1547 This fault occurs when a significant number of excess packets are detected by a configured and enabled Atomic Counter
F1548 This fault occurs when a small number of excess packets are detected by a configured and enabled Atomic Counter
F1549 This fault occurs when a significant number of packet drops are detected by a configured and enabled Atomic Counter
F1550 This fault occurs when a small number of packet drops are detected by a configured and enabled Atomic Counter
F1551 This fault occurs when a significant number of excess packets are detected by a configured and enabled Atomic Counter
F1552 This fault occurs when a small number of excess packets are detected by a configured and enabled Atomic Counter