AggreGate is one of the few device monitoring and management systems that supports a truly distributed architecture. This architecture is designed to ensure virtually unlimited scalability by balancing all operations between AggreGate servers subdivided into multiple layers. Such a distributed architecture is expected to serve as a base for any modern, forward-looking management software.
Unlike the nodes of an AggreGate failover cluster, servers participating in the distributed architecture are completely independent. Each server has its own database, local user accounts, associated permissions, and so on.
The AggreGate distributed architecture is extremely flexible. It is technically based on establishing peering relationships between servers and attaching parts of the unified data model of certain servers ("providers") to other servers ("consumers").
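As a rough illustration, the provider/consumer attachment can be sketched as follows. The `Server` class, its `publish`/`attach`/`resolve` methods, and the flat path layout are hypothetical stand-ins for the real context tree, not the actual AggreGate API:

```python
# Hypothetical sketch (NOT the actual AggreGate API): a "provider" server
# exposes part of its context tree, and a "consumer" server mounts it.

class Server:
    def __init__(self, name):
        self.name = name
        self.contexts = {}   # path -> value, a flat stand-in for the context tree
        self.mounts = {}     # local mount point -> (provider server, exported prefix)

    def publish(self, path, value):
        self.contexts[path] = value

    def attach(self, mount_point, provider, prefix):
        """Mount a provider's subtree under a local path (consumer side)."""
        self.mounts[mount_point] = (provider, prefix)

    def resolve(self, path):
        """Look up a path locally, or forward it to the owning provider."""
        if path in self.contexts:
            return self.contexts[path]
        for mount, (provider, prefix) in self.mounts.items():
            if path.startswith(mount + "/"):
                remote = prefix + path[len(mount):]
                return provider.resolve(remote)
        raise KeyError(path)

probe = Server("probe")
probe.publish("devices/router1/status", "up")

central = Server("central")
central.attach("remote/probe", probe, "devices")
print(central.resolve("remote/probe/router1/status"))  # -> up
```

The consumer never copies the provider's data; it simply resolves attached paths through the peering link, which is the essence of the provider/consumer model.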
Objectives of Distributed Operations
The primary purposes of distributed architecture are:
- Scalability. Lower-level servers may be heavily loaded with near-real-time control and extensive polling of devices. In practice, the number of devices that can be managed by a single server is limited to several thousand. To scale the system to a larger number of devices, it is reasonable to install multiple servers and join them into a distributed installation.
- Load Balancing. Each server in a distributed installation solves its own task. Network management servers check the availability and operability of the IP network infrastructure, while physical access control servers serve requests from door and turnstile controllers. In addition, supervising operations (such as generating and emailing reports) can be performed by a central server.
- Firewall Penetration. Secondary "probe" servers may be installed in remote locations and connect to the central server themselves. System operators connect to the central server only, so there is no need to set up VPNs or port forwarding to the secondary servers.
- Centralization. Secondary servers may work in a fully automated mode, while their overall operation is supervised by a single operator via a primary server installed in the central control room.
Example 1: Smart City Management
Here is an example of multi-tier AggreGate-based architecture employed in a large metro area automation project:
- Layer 1: physical hardware (network routers, controllers, industrial equipment, etc.)
- Layer 2: direct management servers (network monitoring server, access control server, building automation server, etc.)
- Layer 3: building control center servers (one server per building, consolidates information from all "specialized" Layer 2 servers in this building)
- Layer 4: urban district servers (the final destination for escalated lower-level alerts, real-time monitoring by human operators, integration with Service Desk systems)
- Layer 5: HQ server (overall supervision, incident and situation management, global reporting and alerting)
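The escalation path through Layers 2-5 can be sketched as follows. The chain names and the severity-based rule are illustrative assumptions, not AggreGate's actual escalation logic:

```python
# Hypothetical sketch of alert escalation through the layers above;
# the severity rule is an illustrative assumption.

CHAIN = [
    "direct management server",    # Layer 2
    "building control center",     # Layer 3
    "urban district server",       # Layer 4
    "HQ server",                   # Layer 5
]

def escalation_path(severity):
    """Severity 1..4: an alert of severity N is visible to the first N layers."""
    n = max(1, min(severity, len(CHAIN)))
    return CHAIN[:n]

print(escalation_path(1))  # stays on the direct management server
print(escalation_path(4))  # escalates all the way up to HQ
```

The point of the tiering is exactly this containment: routine alerts are handled where they originate, and only escalated ones consume attention and bandwidth at the upper layers.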
Any particular AggreGate server in the above scheme may actually be a multi-node failover cluster.
Example 2: Multi-segment Network Management
The AggreGate Network Manager product, built atop the AggreGate Platform, is one of the most typical use cases for the distributed architecture. Large segmented networks of corporations and telecom operators cannot be monitored from a single location due to routing restrictions, security concerns, and limited bandwidth between geographically separated segments.
Thus, a unified monitoring system is usually composed of several components:
- A primary or central server consolidating aggregated information from all network segments
- Secondary or probe servers that perform polling of devices in isolated segments
- Specialized servers, such as a traffic decomposition server that handles billions of NetFlow events per day
Secondary and specialized servers act as data providers for the primary server, exposing a part of their data model to the control center. The exposed part can be:
- The whole content of the probe server's context tree, allowing full monitoring and configuration through the central server. In this case, the probe server is used simply as a proxy to overcome network segmentation issues.
- Alerts generated by probe servers. In this case, nearly all processing is performed remotely, but central server operators are immediately notified about issues raised in secondary segments.
- A custom set of the probe server's data, such as certain mission-critical devices and important overview reports. The actual polling and report generation are performed by the secondary server, which makes it possible to balance the system load properly.
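The three export modes above can be sketched as follows. The flat path/value model and the `export` function are hypothetical simplifications of the real context tree, used only to show how each mode narrows what the central server sees:

```python
# Hypothetical sketch (not AggreGate's API): a probe server decides
# which part of its data model a consumer may see.

PROBE_MODEL = {
    "devices/core-router/status": "up",
    "devices/office-printer/status": "up",
    "alerts/core-router-down": "cleared",
    "reports/weekly-overview": "<report>",
}

def export(model, mode, custom_prefixes=()):
    if mode == "full":        # whole context tree: probe acts as a proxy
        return dict(model)
    if mode == "alerts":      # alerts only: local processing, central notification
        return {k: v for k, v in model.items() if k.startswith("alerts/")}
    if mode == "custom":      # hand-picked mission-critical paths
        return {k: v for k, v in model.items()
                if any(k.startswith(p) for p in custom_prefixes)}
    raise ValueError(mode)

central_view = export(PROBE_MODEL, "custom",
                      ("devices/core-router", "reports/"))
print(sorted(central_view))
# -> ['devices/core-router/status', 'reports/weekly-overview']
```

Narrowing the exported subset is what shifts load to the secondary server: everything outside the exported paths is polled, processed, and stored without ever crossing the link to the control center.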
Example 3: High-performance Event Management
Some AggreGate Platform usage scenarios, such as centralized incident management, assume that a vast number of events must be received, processed, and persistently stored in a structured format. Some event streams may reach millions of events per second coming from several sources.
In such cases, a single AggreGate server cannot handle the whole event stream. The distributed architecture helps organize event processing:
- Multiple local event processing servers are installed at event source locations. Several sources (probes) can be connected to a single processing server.
- A dedicated storage server or a multi-server Big Data storage cluster is associated with each local processing server. The number of cluster nodes may vary according to the event rate.
- All local storage servers perform event pre-filtering, deduplication, correlation (using rules applicable to locally connected probes), enrichment, and storage.
- Local storage servers are connected to a central aggregation server. The aggregation server is in charge of system-wide correlation of important events.
- Central server operators may query the whole event database, while the actual data search tasks are distributed among the storage servers. Thus, centralized reporting and alerting based on the whole event database are possible.
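The pre-processing stage of a local storage server can be sketched as follows. The event field names, the severity threshold, and the `site` enrichment tag are illustrative assumptions rather than AggreGate's actual event schema:

```python
# Hypothetical sketch of a local server's event pipeline: pre-filtering,
# deduplication, and enrichment before forwarding to the central aggregator.

seen = set()  # keys of events already stored locally

def preprocess(events, min_severity=2):
    out = []
    for e in events:
        if e["severity"] < min_severity:   # pre-filtering: drop noise
            continue
        key = (e["source"], e["message"])
        if key in seen:                    # deduplication of repeated events
            continue
        seen.add(key)
        e = dict(e, site="plant-7")        # enrichment with local metadata
        out.append(e)
    return out

stream = [
    {"source": "plc-1", "message": "overheat", "severity": 4},
    {"source": "plc-1", "message": "overheat", "severity": 4},   # duplicate
    {"source": "plc-2", "message": "heartbeat", "severity": 1},  # filtered out
]
forwarded = preprocess(stream)
print(len(forwarded))  # -> 1 event reaches the central aggregation server
```

Even this toy version shows why the architecture scales: of three raw events, only one survives local processing, so the central server sees a fraction of the original stream.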