Once a network incident occurs, the system starts a multi-stage impact evaluation sequence:
- First, it uses intelligent event correlation and root cause analysis algorithms to pinpoint a single point of failure.
- Next, the system masks and suppresses all non-root failure events, effectively reducing the number of operator-level alarms to one main alert.
- Then the system finds all affected services, i.e. services whose paths pass through the root failure point. If service paths are dynamic, the system applies its network topology knowledge to determine how they are currently switched/routed.
- At the next step, every affected service is analyzed to figure out whether any alternative routes are available. That's necessary to distinguish between complete service failures and partial degradations caused by switching to failover routes.
- The overall incident impact is then evaluated based on the number of affected services, their importance, and their degradation ratio.
- Finally, the initial alert's severity is updated according to its service impact. This happens right before the alert is routed to ITSM/Service Desk for further processing.
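The sequence above can be illustrated with a minimal Python sketch. It is not AggreGate's implementation or API: the `Service` class, the `upstream` topology map, the importance weights, the 0.5 degradation ratio for services on a failover route, and the severity thresholds are all assumptions chosen for illustration. Root cause analysis is replaced here by a deliberately simple heuristic (a failed node whose upstream neighbor has not failed).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Service:
    """Hypothetical service descriptor used only for this sketch."""
    name: str
    importance: int                      # assumed weight: 1 (low) to 5 (critical)
    active_path: list[str]               # nodes currently carrying the service
    backup_paths: list[list[str]] = field(default_factory=list)


def find_root(failed: set[str], upstream: dict[str, str | None]) -> str:
    """Simplified root cause heuristic: the failed node whose upstream is healthy."""
    roots = sorted(n for n in failed if upstream.get(n) not in failed)
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root failure, found {roots}")
    return roots[0]


def affected_services(services: list[Service], root: str) -> list[Service]:
    """Services whose currently active path passes through the root failure point."""
    return [s for s in services if root in s.active_path]


def degradation(service: Service, root: str) -> float:
    """1.0 for a complete failure, 0.5 (assumed ratio) if a failover route avoids the root."""
    has_alternative = any(root not in path for path in service.backup_paths)
    return 0.5 if has_alternative else 1.0


def impact_score(services: list[Service], root: str) -> float:
    """Sum of importance weighted by degradation over all affected services."""
    return sum(s.importance * degradation(s, root)
               for s in affected_services(services, root))


def severity(score: float) -> str:
    """Map the impact score to an alert severity (thresholds are assumptions)."""
    if score >= 8:
        return "critical"
    if score >= 4:
        return "major"
    if score > 0:
        return "minor"
    return "info"


# Example topology: each node maps to its upstream neighbor.
upstream = {"core": None, "agg1": "core", "agg2": "core", "acc1": "agg1"}
failed = {"agg1", "acc1"}                # raw failure events received

services = [
    Service("voip", 5, ["core", "agg1"], [["core", "agg2"]]),
    Service("iptv", 3, ["core", "agg1", "acc1"]),
    Service("vpn", 4, ["core", "agg2"]),
]

root = find_root(failed, upstream)
suppressed = failed - {root}             # non-root events masked from operators
score = impact_score(services, root)
print(root, sorted(suppressed), score, severity(score))
```

In this sketch, a service with a working failover route contributes half of its importance to the score, while a service with no alternative contributes its full importance. A real deployment would derive both the weights and the thresholds from its own service-level policies.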
The IT service registry is normally maintained in a third-party OSS/BSS system that includes billing, CRM, and other modules. AggreGate Platform is flexible enough to retrieve structured service data from that inventory system using SOAP, CORBA, HTTP/REST, SQL, or a custom API. Alternatively, the IT service registry can be set up and maintained within AggreGate Network Manager, in which case service descriptors are stored in the server database.
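As a rough illustration of the registry idea, the following Python sketch normalizes service records from an external inventory export and stores them in a local database. Everything here is hypothetical: the JSON field names (`name`, `importance`, `path`), the `services` table layout, and the use of SQLite are assumptions for the example and do not reflect AggreGate's actual schema or integration API.

```python
import json
import sqlite3


def parse_inventory(payload: str) -> list[tuple[str, int, str]]:
    """Convert a JSON export (assumed format) into (name, importance, path) rows."""
    records = json.loads(payload)
    return [(r["name"], int(r["importance"]), ",".join(r["path"]))
            for r in records]


def store_descriptors(conn: sqlite3.Connection,
                      rows: list[tuple[str, int, str]]) -> None:
    """Create the hypothetical registry table if needed and upsert the rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS services ("
        "name TEXT PRIMARY KEY, importance INTEGER NOT NULL, path TEXT NOT NULL)"
    )
    conn.executemany("INSERT OR REPLACE INTO services VALUES (?, ?, ?)", rows)
    conn.commit()


# Example payload, standing in for data fetched from an OSS/BSS inventory system.
payload = json.dumps([
    {"name": "voip", "importance": 5, "path": ["core", "agg1"]},
    {"name": "iptv", "importance": 3, "path": ["core", "agg1", "acc1"]},
])

conn = sqlite3.connect(":memory:")       # in-memory database for the sketch
store_descriptors(conn, parse_inventory(payload))
```

Keying the table on the service name makes repeated imports idempotent, so the registry can be refreshed from the inventory system without creating duplicate descriptors.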