Big data has been a buzzword for the past 4-5 years. Many companies use the term to show that they are aware of the latest technology trends and in sync with the latest happenings. Big data can be simply defined as extremely large data sets that may be analysed computationally to reveal patterns, trends, and associations, especially relating to human behaviour and interactions. The two words “big” and “data” are largely self-explanatory.

Since IT environments generate a lot of complex logs every day, they are part of the big data problem. These logs are collected at a central place and stored in a centralized database so that they can be processed, retrieved easily and turned into meaningful, actionable items. Once the log data is collected at a centralized location, it is processed, analyzed, integrated and correlated. In the past, logs were used only for troubleshooting, but over the years they have become useful for functions such as monitoring network performance, recording user actions, providing helpful data for investigating suspicious or unauthorized activity, and assisting with proactive monitoring of the environment. In many cases, having log data easily available can provide early warnings about problems before they become bigger issues.
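To make the idea of collecting and correlating logs in one place more concrete, here is a minimal Python sketch. It assumes each source already produces lines in time order and that each line looks like "<ISO-8601 timestamp> <message>"; that format, the `LogEvent` type and the `centralize` function are illustrative assumptions, not part of any particular logging product.

```python
import heapq
from datetime import datetime
from typing import Iterable, Iterator, Mapping, NamedTuple


class LogEvent(NamedTuple):
    """One parsed log line (hypothetical structure for this sketch)."""
    timestamp: datetime
    source: str
    message: str


def parse_stream(source: str, lines: Iterable[str]) -> Iterator[LogEvent]:
    # Assumed line format: "<ISO-8601 timestamp> <message>"
    for line in lines:
        stamp, _, message = line.partition(" ")
        yield LogEvent(datetime.fromisoformat(stamp), source, message)


def centralize(sources: Mapping[str, Iterable[str]]) -> Iterator[LogEvent]:
    """Merge per-source logs into a single stream ordered by timestamp.

    Each source must already be sorted by time; heapq.merge then
    interleaves the streams without loading everything into memory.
    """
    streams = [parse_stream(name, lines) for name, lines in sources.items()]
    return heapq.merge(*streams)


if __name__ == "__main__":
    sample = {
        "firewall": [
            "2024-01-01T10:00:00 blocked connection",
            "2024-01-01T10:00:05 allowed connection",
        ],
        "webserver": ["2024-01-01T10:00:02 GET /login 401"],
    }
    for event in centralize(sample):
        print(event.timestamp.isoformat(), event.source, event.message)
```

With every event on one timeline, a failed login on the web server can be read next to the firewall activity around it, which is the kind of correlation a centralized store makes possible.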

Logs are also required for compliance reasons. Centralized logging and event management is required or heavily implied by many security standards, making it unavoidable and one of the fastest growing segments of the industry. If your organization is subject to FISMA, GLBA, HIPAA, SOX, COBIT, or PCI DSS compliance, then this is an area you should already be thinking about or have implemented. Also, even though ISO 27002 compliance is not mandated for any organization, it is always good to be prepared for forthcoming requirements.

Finding the Needle In the Haystack

As with any data, not all log data is useful or relevant. Only some of it actually helps in resolving issues. The challenge is to identify the important part of the logs and turn it into actionable information that achieves the desired results. Log generation and storage are just the beginning. The next steps are to ensure log confidentiality, integrity and availability; once these are in place, log analysis follows. These steps are required in centralized logging for the data to serve any purpose. Centralized logging is the most common method of storing logs in an IT department.
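As a rough illustration of separating the useful entries from the rest, the Python sketch below keeps only entries at or above a chosen severity and counts them per source. The severity names, the `(source, severity, message)` tuple layout and both function names are assumptions made for this example; real log formats and relevance criteria will differ.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical entry layout: (source, severity, message)
Entry = Tuple[str, str, str]

# Assumed severity scale, lowest to highest.
SEVERITY_RANK = {"DEBUG": 0, "INFO": 1, "WARNING": 2, "ERROR": 3, "CRITICAL": 4}


def relevant_entries(entries: Iterable[Entry], minimum: str = "WARNING") -> List[Entry]:
    """Keep entries whose severity is at least `minimum`.

    Unknown severity labels are treated as the lowest rank and dropped
    unless `minimum` is "DEBUG".
    """
    threshold = SEVERITY_RANK[minimum]
    return [e for e in entries if SEVERITY_RANK.get(e[1], 0) >= threshold]


def count_by_source(entries: Iterable[Entry]) -> Counter:
    """Count how many entries each source contributed."""
    return Counter(source for source, _, _ in entries)


if __name__ == "__main__":
    logs = [
        ("db", "INFO", "connection opened"),
        ("web", "ERROR", "HTTP 500 on /checkout"),
        ("web", "WARNING", "slow response"),
        ("db", "DEBUG", "query plan cached"),
    ]
    important = relevant_entries(logs)
    print(important)
    print(count_by_source(important))
```

Severity is only one possible filter; the same pattern applies to any rule that marks an entry as worth a closer look.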

Once the log data is collected from all the different sources and stored in the centralized database, it is easier to process, retrieve, access and manipulate. If centralized logging is well planned, it can be a big advantage for an organisation; otherwise, it is a big challenge. Let’s look at the factors that help an organisation get centralized logging right.

  • Define and prioritise the function of log management: It’s important to start by defining the requirements and goals for log performance and monitoring based on applicable laws, regulations, and existing organisational policies. Once this is done, the next step is to prioritise the goals by balancing the need to reduce risk against the time and resources necessary to perform log management functions.
  • Properly establish the log management policies: Once policies and procedures for the storage and access of logs are established in an organisation, it becomes possible to follow laws and regulatory requirements consistently. A regular audit of these policies and procedures will ensure that the regulatory requirements are met and followed.
  • Create and maintain a secure centralized log management infrastructure: It is very important that the infrastructure needed for log management is properly identified and made available. All the components should be able to interact with each other without friction so that the system functions smoothly. Furthermore, the infrastructure should be scaled to handle peak log volumes, if required, and should support the three important functions of the log management architecture, namely log generation, log storage and analysis, and log monitoring.

Once these factors are considered, the next step is to make sure that the staff responsible for log management are trained and provided with the necessary tools to carry out their tasks properly. It is imperative to provide log management tools, documentation, and technical guidance so that log management staff can succeed at their jobs. Centralized logging is definitely a big data challenge, but proper planning, tools and implementation can make it work really well within an organisation.

The approach should be to tackle this challenge head on and make it work with the right tools, the right policies and procedures, and adequate planning. For a well-managed IT infrastructure, centralized logging is an invaluable asset. It is a source of information that can be used in a number of business processes and in security assessments, and various laws mandate that logs be maintained and reviewed.