In the past several years, log analysis technologies have matured, becoming a mainstream solution for troubleshooting a variety of problems across the IT layers (infrastructure elements as well as applications). Although at first sight these technologies seem to do the same thing, i.e. enable the analysis of log events, different technologies have evolved to deal with different use cases. One of the main distinctions is between log analysis for data security and log analysis for troubleshooting applications. In this post I will explain the main differences between these two use cases and the technologies that are most appropriate for each one.
Log analysis for security
One of the use cases for log analysis is data security. Also known as SIEM (security information and event management) or security intelligence, the idea is to analyze data from a variety of systems in order to identify anomalies that may indicate cyber security attacks. The technologies that address this use case provide the means to collect and analyze huge amounts of data, and then employ a variety of algorithms and rules to identify anomalies (i.e. what is normal and what isn’t). A typical example is the analysis of HTTP communications, where IP addresses, ports, protocols, etc. are examined to find anomalies that can indicate a cyber attack.
Although this is undoubtedly a challenging task in itself, at least the data that is being analyzed is relatively structured, predictable, and consistent, allowing the use of relatively static rules to identify these anomalies.
For example, a rule can correlate a user's log-in from one site with indications that the same user has also logged in from another location. In the logs below we can see that the data used for this analysis is structured (IP addresses, ports, etc.).
This means that the aggregation and correlation of events for analysis does not rely on semantic analysis technologies to understand the meaning of each event. Instead, these technologies focus on a large number of static rules, which are applied to the relatively consistent data structures.
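To make the idea of a static correlation rule concrete, here is a minimal sketch in Python. It is not taken from any particular SIEM product: the event tuples, the `concurrent_logins` helper, and the ten-minute window are all assumptions chosen for illustration. The point is only that, because the fields are structured, the rule needs no understanding of what the event text means.

```python
from datetime import datetime, timedelta

# Hypothetical, already-parsed login events: (ISO timestamp, user, source IP).
# The IP addresses come from ranges reserved for documentation.
LOGINS = [
    ("2024-01-01T10:00:00", "alice", "203.0.113.5"),
    ("2024-01-01T10:04:00", "alice", "198.51.100.7"),
    ("2024-01-01T10:05:00", "bob", "203.0.113.9"),
    ("2024-01-01T12:30:00", "bob", "203.0.113.9"),
]


def concurrent_logins(events, window=timedelta(minutes=10)):
    """Flag users who log in from two different IPs within `window`.

    Returns a list of (user, previous_ip, new_ip) tuples.
    """
    alerts = []
    last_seen = {}  # user -> (timestamp, ip) of that user's most recent login
    # ISO timestamps in the same format sort chronologically as strings.
    for ts_text, user, ip in sorted(events):
        ts = datetime.fromisoformat(ts_text)
        if user in last_seen:
            prev_ts, prev_ip = last_seen[user]
            if ip != prev_ip and ts - prev_ts <= window:
                alerts.append((user, prev_ip, ip))
        last_seen[user] = (ts, ip)
    return alerts


ALERTS = concurrent_logins(LOGINS)
print(ALERTS)
```

With the sample data, only the first user is flagged, since the second user logs in twice from the same address. A real deployment would apply many such rules to far larger volumes of data, but each rule stays a simple comparison of structured fields.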
Log analysis for applications
Another major use case is log analysis for applications. As part of IT operations analytics, the idea is to identify and troubleshoot problems in any application by collecting and analyzing log data produced by the application, while correlating it with log data produced by the infrastructure elements.
Application faults often have immediate business impact, such as the unavailability of a service or feature, or even data corruption. In some cases application faults may also be related to security issues. Identifying these events can therefore be a critical task.
As opposed to security logs, which are structured and predictable, application logs are written by a multitude of developers, including in-house teams, commercial vendors, etc. Since there are no industry standards for documenting and managing log events, application logs are extremely inconsistent, both in the IDs used for events and in their wording.
The following example illustrates a connectivity problem (the user was not able to connect to the app). In order to investigate the source of the problem, the user searches for the term “socket”, which returns hundreds of results. However, using the Augmented Search semantic analysis, the user can immediately identify the indications of connection problems and authentication failures, although they are very different in their wording and ID. Additionally, in order to expedite the identification process, Augmented Search identified them as critical, allowing the user to focus on the most critical issues along the timeline.
Without the semantic analysis of Augmented Search, the user would have received thousands of results, making it almost impossible to find the root cause of the problem. With Augmented Search, the user can immediately focus on the most important and most relevant issues, even in cases where the query itself yields tens of thousands of results. Furthermore, the grouping of these events across the timeline, with intuitive visualization methods, assists in the immediate identification of interrelated events, which in most cases point to the root cause of the issue at hand.
As we have seen, due to the nature of the data being analyzed, the disciplines and methods used for security log analysis are very different from those used for application log analysis. This difference is what led to the emergence of new technologies, such as Augmented Search, which are designed to address the challenges of the chaotic application layer while providing the means to rapidly focus on the most important issues.
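The contrast between a literal keyword search and grouping by meaning can be sketched in a few lines of Python. This is a deliberately naive illustration and not how Augmented Search works internally: the sample log lines, the hand-written `CONCEPTS` phrase map, and the `classify` and `timeline` helpers are all hypothetical, with the phrase map standing in for real semantic analysis.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical application log events: (ISO timestamp, level, message).
EVENTS = [
    ("2024-01-01T10:00:05", "INFO", "socket opened on port 8080"),
    ("2024-01-01T10:00:40", "ERROR", "Connection refused by remote host"),
    ("2024-01-01T10:00:52", "WARN", "ERR-4012: unable to reach db server"),
    ("2024-01-01T10:01:10", "ERROR", "auth failure for user admin"),
    ("2024-01-01T10:01:15", "ERROR", "Login denied: invalid credentials"),
]

# Assumption: a hand-written phrase map standing in for semantic analysis.
CONCEPTS = {
    "connectivity": ("connection refused", "unable to reach", "timed out"),
    "authentication": ("auth failure", "login denied", "invalid credentials"),
}


def classify(message):
    """Return the first concept whose phrases appear in the message, else None."""
    text = message.lower()
    for concept, phrases in CONCEPTS.items():
        if any(phrase in text for phrase in phrases):
            return concept
    return None


def timeline(events):
    """Group classified events into (minute, concept) buckets."""
    buckets = defaultdict(list)
    for ts_text, level, message in events:
        concept = classify(message)
        if concept is None:
            continue  # ignore events that match no known concept
        minute = datetime.fromisoformat(ts_text).strftime("%H:%M")
        buckets[(minute, concept)].append((level, message))
    return dict(buckets)


# A literal search only finds lines that contain the exact term.
LITERAL_HITS = [msg for _, _, msg in EVENTS if "socket" in msg.lower()]
TIMELINE = timeline(EVENTS)

print(LITERAL_HITS)
for key in sorted(TIMELINE):
    print(key, TIMELINE[key])
```

In this toy data set the literal search for "socket" returns only the harmless informational line, while the concept grouping surfaces the connection and authentication failures side by side on the timeline, even though none of them share an ID or wording. A fixed phrase map like this would not scale to real application logs, which is the gap that semantic analysis is meant to fill.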