Thursday, 4 July 2019

What is Splunk? Beginners Tutorial

Splunk is a software technology used for monitoring, searching, analyzing, and visualizing machine-generated data in real time. It can monitor and read different types of log files, and it stores data as events in indexers. The tool lets you visualize data in various kinds of dashboards.
In this tutorial, you will learn:
  • What is Splunk?
  • Why we need Splunk?
  • Features of Splunk
  • Splunk Products
  • Splunk Architecture
  • How Splunk Works?
  • Applications of Splunk
  • Best Practices of using Splunk
  • Famous companies using Splunk
  • Alternative to Splunk
  • Disadvantages of using Splunk

Why we need Splunk?

Splunk offers plenty of benefits for an organization. Some of the benefits of using Splunk are:
  • Offers enhanced GUI and real-time visibility in a dashboard
  • It reduces troubleshooting and resolving time by offering instant results.
  • It is well suited to root cause analysis.
  • Splunk allows you to generate graphs, alerts, and dashboards.
  • You can easily search and investigate specific results using Splunk.
  • It allows you to troubleshoot any condition of failure for improved performance.
  • Helps you to monitor any business metrics and make an informed decision.
  • Splunk allows you to incorporate Artificial Intelligence into your data strategy.
  • Allows you to gather useful Operational Intelligence from your machine data
  • Summarizing and collecting valuable information from different logs
  • Splunk accepts data in many formats, such as .csv, JSON, and log files.
  • Offers powerful search, analysis, and visualization capabilities to empower users of all types.
  • Allows you to create a central repository for searching Splunk data from various sources.

Features of Splunk

Important features of Splunk are:
  • Accelerate Development & Testing
  • Allows you to build Real-time Data Applications
  • Generate ROI faster
  • Agile statistics and reporting with Real-time architecture
  • Offers search, analysis and visualization capabilities to empower users of all types

Splunk Products

Splunk is available in three different versions.
  • Splunk Enterprise
  • Splunk Light
  • Splunk Cloud

Splunk Enterprise

Splunk Enterprise is used by large IT businesses. It helps you gather and analyze data from applications, websites, etc.

Splunk Cloud

Splunk Cloud is a hosted platform. It has the same features as the Enterprise version. It is available from Splunk itself or through the AWS cloud platform.

Splunk Light

Splunk Light is a free version. It lets you search, report on, and alert on your log data. It has limited functionality and fewer features compared to the other versions.

Splunk Architecture

Splunk Architecture Diagram
Here are the fundamental components of the Splunk architecture:

Universal Forwarder (UF):

The Universal Forwarder, or UF, is a lightweight component that pushes data to the heavy Splunk forwarder. You can install the Universal Forwarder on the client side or on an application server. Its only job is to forward log data.
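To make the forwarding idea concrete, here is a minimal Python sketch. It is not Splunk code and does not reflect how the real forwarder is implemented. The function name `forward_new_lines`, the `send` callback, and the file name `app.log` are all hypothetical. It assumes the forwarder only needs to remember how far it has read and pass on whatever was appended since then.

```python
import os
import tempfile
from typing import Callable, List


def forward_new_lines(path: str, offset: int, send: Callable[[str], None]) -> int:
    """Send every line appended to `path` after `offset`; return the new offset."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        data = handle.read()
    for line in data.decode("utf-8").splitlines():
        send(line)  # hypothetical destination, e.g. a heavy forwarder
    return offset + len(data)


# Demo: the "destination" is just a list that collects forwarded lines.
received: List[str] = []
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "app.log")
    with open(log_path, "w", encoding="utf-8") as log_file:
        log_file.write("first line\n")
    offset = forward_new_lines(log_path, 0, received.append)

    with open(log_path, "a", encoding="utf-8") as log_file:
        log_file.write("second line\n")
    offset = forward_new_lines(log_path, offset, received.append)

print(received)
```

Keeping only an offset means the sketch does no filtering or parsing of its own, which mirrors the "forward only" role described above.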

Load Balancer (LB):

Splunk provides a default load balancer. However, it also allows you to use your own load balancer.

Heavy Forwarder (HF):

The Heavy Forwarder is a heavier component. It allows you to filter the data before it is forwarded, for example, by collecting only error logs.
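The "collect only error logs" example can be sketched in a few lines of Python. This is an illustration of the filtering idea, not Splunk configuration. The function name `filter_error_lines` and the sample log lines are hypothetical, and the sketch assumes each line carries its severity as a standalone word such as ERROR.

```python
from typing import Iterable, List


def filter_error_lines(lines: Iterable[str]) -> List[str]:
    """Keep only lines that contain ERROR as a standalone word (assumed format)."""
    return [line for line in lines if " ERROR " in f" {line} "]


sample_logs = [
    "2019-07-04 10:00:01 INFO service started",
    "2019-07-04 10:00:05 ERROR payment gateway timeout",
    "2019-07-04 10:00:09 WARN slow response",
    "2019-07-04 10:00:12 ERROR database connection lost",
]

errors_only = filter_error_lines(sample_logs)
for line in errors_only:
    print(line)
```

Filtering before forwarding means less data reaches the indexer, which matters when licensing is based on daily volume.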

Indexer:

The indexer stores and indexes the data, which improves Splunk search performance. By default, Splunk performs the indexing automatically, for example, on host, source, and date and time.
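The sketch below shows, in plain Python, why indexing on fields such as host and source speeds up searching: a lookup table points straight at the matching events instead of scanning all of them. The class `TinyIndex`, the host names, and the file paths are hypothetical, and this toy in-memory structure is only an assumption-laden illustration, not how Splunk stores data on disk.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, List, Set, Tuple


class TinyIndex:
    """Toy in-memory index keyed on host and source (illustration only)."""

    def __init__(self) -> None:
        self.events: List[dict] = []
        self.lookup: Dict[Tuple[str, str], Set[int]] = defaultdict(set)

    def add(self, host: str, source: str, time: datetime, raw: str) -> int:
        """Store one event and register it under its host and source."""
        event_id = len(self.events)
        self.events.append({"host": host, "source": source, "time": time, "raw": raw})
        self.lookup[("host", host)].add(event_id)
        self.lookup[("source", source)].add(event_id)
        return event_id

    def search(self, field: str, value: str) -> List[dict]:
        """Return matching events in time order, using the lookup table."""
        ids = self.lookup.get((field, value), set())
        return sorted((self.events[i] for i in ids), key=lambda e: e["time"])


index = TinyIndex()
index.add("web-01", "/var/log/app.log", datetime(2019, 7, 4, 10, 0, 5), "ERROR timeout")
index.add("web-02", "/var/log/app.log", datetime(2019, 7, 4, 10, 0, 7), "INFO ok")
index.add("web-01", "/var/log/auth.log", datetime(2019, 7, 4, 9, 59, 0), "login alice")

for event in index.search("host", "web-01"):
    print(event["time"], event["raw"])
```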

Search head (SH):

Search head is used to gain intelligence and perform reporting.

Deployment Server(DS):

The deployment server helps to deploy configuration, for example, by updating the UF configuration file. It can also be used to share configuration between components.

License manager (LM):

The license is based on volume and usage, for example, 50 GB per day. Splunk regularly checks the licensing details.
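As a small worked example of volume-based licensing, the Python sketch below compares a day's indexed volume with the 50 GB per day figure used above. The function names are hypothetical, the definition of a gigabyte as 1024^3 bytes is an assumption, and the sketch says nothing about how Splunk itself measures or enforces usage.

```python
GB = 1024 ** 3  # assumption: binary gigabytes
DAILY_QUOTA_BYTES = 50 * GB  # the 50 GB per day example from the text


def quota_used_fraction(bytes_indexed_today: int, quota_bytes: int = DAILY_QUOTA_BYTES) -> float:
    """Return the share of the daily quota already consumed (1.0 means fully used)."""
    return bytes_indexed_today / quota_bytes


def is_over_quota(bytes_indexed_today: int, quota_bytes: int = DAILY_QUOTA_BYTES) -> bool:
    """Return True when the day's indexed volume exceeds the licensed volume."""
    return bytes_indexed_today > quota_bytes


print(f"{quota_used_fraction(40 * GB):.0%} of the daily quota used")
print("Over quota:", is_over_quota(55 * GB))
```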

How Splunk Works?

Forwarder:

The forwarder collects data from remote machines and then forwards it to the indexer in real time.

Indexer:

The indexer processes the incoming data in real time. It also stores and indexes the data on disk.

Search Head:

End users interact with Splunk through the search head. It allows users to search, analyze, and visualize the data.

Applications of Splunk

Problem Statement: McDonald's had no clear visibility into which offers work best in terms of:
  • Offer type ( For example 20% off)
  • Cultural differences at a region level
  • Time of Purchase
  • Device used by the customer
  • Revenue generated per order
They needed insight into consumer behaviors and customer response.
The entire process uses three types of data source:
  1. Orders placed in a McDonald's outlet
  2. Orders placed in the mobile application
  3. Orders placed using the web application
The process then moves from one step to the next, as shown in the diagram below.

Input

Input data moves to the parsing stage.

Parsing

In Parsing Stage, relevant data is converted into events:
  • Customer Region
  • Revenue per order
  • Time of Order (Morning, Afternoon, Evening, Night)
  • Device used by customers (Mobile, PC, Tablet)
  • Discount Coupons applied
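The parsing step above can be sketched in Python as turning one raw order record into an event that carries the listed fields. Everything specific here is an assumption made for illustration: the `key=value` record layout, the field names, the coupon code, and the hour boundaries for Morning, Afternoon, Evening, and Night are hypothetical and do not come from Splunk or from the case study.

```python
from datetime import datetime


def time_of_day(hour: int) -> str:
    """Bucket an hour (0-23) into a time-of-order label (assumed boundaries)."""
    if 5 <= hour < 12:
        return "Morning"
    if 12 <= hour < 17:
        return "Afternoon"
    if 17 <= hour < 21:
        return "Evening"
    return "Night"


def parse_order(raw: str) -> dict:
    """Convert a raw 'key=value ...' order record into a structured event."""
    fields = dict(pair.split("=", 1) for pair in raw.split())
    ordered_at = datetime.strptime(fields["time"], "%Y-%m-%dT%H:%M:%S")
    return {
        "region": fields["region"],
        "revenue": float(fields["revenue"]),
        "time_of_order": time_of_day(ordered_at.hour),
        "device": fields["device"],
        "coupon": fields.get("coupon", "none"),
    }


raw_order = "time=2019-07-04T19:30:00 region=North revenue=12.50 device=Mobile coupon=20OFF"
event = parse_order(raw_order)
print(event)
```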

Indexing stage

In this stage, events are sorted and indexed for storage based on:
  • Sales by Geographical location
  • Order Revenue
  • Time of order (Morning, Afternoon, Evening, Night)
  • Device used by the customer
  • Coupon offers applied

Search Head

It is used to gain intelligence and perform reporting.
McDonald's used it to get the following information:
  • Which sales offer works best in which geographical location?
  • How does customer behavior change in terms of order revenue?
  • What is the best time to apply burger or combo offers?
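A question such as "which offer works best in which location" comes down to grouping events and comparing totals. The Python sketch below shows that aggregation on made-up data. The function name `best_offer_by_region`, the sample events, and the choice to define "best" as the highest total revenue are all assumptions for illustration, and in Splunk itself this kind of report would be written as a search rather than in Python.

```python
from collections import defaultdict
from typing import Dict, List


def best_offer_by_region(events: List[dict]) -> Dict[str, str]:
    """For each region, return the coupon with the highest total revenue."""
    totals: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for event in events:
        totals[event["region"]][event["coupon"]] += event["revenue"]
    return {
        region: max(by_coupon, key=by_coupon.get)
        for region, by_coupon in totals.items()
    }


# Hypothetical events, shaped like the output of the parsing stage.
sample_events = [
    {"region": "North", "coupon": "20OFF", "revenue": 12.5},
    {"region": "North", "coupon": "COMBO", "revenue": 30.0},
    {"region": "North", "coupon": "20OFF", "revenue": 10.0},
    {"region": "South", "coupon": "20OFF", "revenue": 18.0},
    {"region": "South", "coupon": "COMBO", "revenue": 9.0},
]

best = best_offer_by_region(sample_events)
print(best)
```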

How Splunk Helped?

  • Show all the orders coming in from a specific region in real time.
  • Determine the impact of different promotional offers in real time.
  • Monitor the performance of McDonald's in-house developed point-of-sale systems.
  • An employee can monitor what customers are saying and help understand customer expectations.
  • Analyze the speed of different payment modes.
  • Determine error-free payment modes.

Best Practices of using Splunk

  • You should use a test index so you can quickly run tests.
  • There are specific fields you must get right at index time. Everything else can be created or modified after indexing.
  • Event breaking happens automatically in Splunk, so it's important to check that Splunk correctly detected the beginning and end of each event.
  • Splunk can automatically detect the timestamp. However, if your log format uses a different timestamp format, you need to configure the timestamp.
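The last two points, event breaking and timestamp detection, are easier to check once you see what can go wrong. The Python sketch below splits a raw log into events wherever a line starts with a timestamp, so that a multi-line stack trace stays attached to its event, and then parses a day-first timestamp with an explicit format. The pattern, the format string, and the sample log are hypothetical, and this is only an illustration of the idea, not Splunk's own detection logic or configuration syntax.

```python
import re
from datetime import datetime
from typing import List

# Assumed log layout: every new event starts with "DD/MM/YYYY HH:MM:SS".
EVENT_START = re.compile(r"^\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}")
TIME_FORMAT = "%d/%m/%Y %H:%M:%S"


def break_events(text: str) -> List[str]:
    """Start a new event at each timestamped line; attach other lines to the last event."""
    events: List[str] = []
    for line in text.splitlines():
        if EVENT_START.match(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events


def event_time(event: str) -> datetime:
    """Parse the leading 19-character timestamp using the explicit format."""
    return datetime.strptime(event[:19], TIME_FORMAT)


raw_log = (
    "04/07/2019 10:00:05 ERROR payment failed\n"
    "  Traceback (most recent call last):\n"
    "  ValueError: card declined\n"
    "04/07/2019 10:00:09 INFO retry scheduled\n"
)

log_events = break_events(raw_log)
for log_event in log_events:
    print(event_time(log_event), "|", log_event.splitlines()[0])
```

Without the explicit format, a date like 04/07/2019 could be read as 7 April instead of 4 July, which is exactly the kind of mistake a test index helps you catch early.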

Famous companies using Splunk

Some famous companies using Splunk are:
  • Cisco
  • Bosch
  • IBM
  • Motorola
  • PepsiCo
  • Adobe
  • Visa
  • Adidas
  • Facebook
  • Salesforce
  • Walmart

Alternative to Splunk

Sumo Logic

Sumo Logic helps you maintain the infrastructure of your application. It makes searching and analyzing log data in real time simple. The tool allows you to monitor and visualize historical and real-time events.

Loggly

Loggly allows you to analyze logs and search them quickly. The tool helps you collect data from the system using Syslog compatibility.
Download link: https://www.loggly.com/

Fluentd

Fluentd is a free and open source data collector. It can save logs in a file system (FS) buffer, so you can retrieve them whenever you want. It also offers features such as load balancing and retries to maintain robustness.
Download link: https://www.fluentd.org/

ELK stack

ELK Stack allows users to take data from any source, in any format, and to search, analyze, and visualize that data. The tool offers centralized logging, which is helpful when attempting to identify problems with servers or applications.

LogFaces

LogFaces is another alternative to Splunk that allows you to email your queries. This tool keeps log data on premises. It comes with an easy-to-use desktop application.

Disadvantages of using Splunk

Some disadvantages of using Splunk tool are:
  • Splunk can prove expensive for large data volumes.
  • Dashboards are functional but not as effective as some other monitoring tools.
  • Its learning curve is steep, and you need Splunk training because it is a multi-tier architecture. Expect to spend a lot of time learning the tool.
  • Searches are difficult to understand, especially regular expressions and search syntax.

Summary

  • Splunk is software used for monitoring, searching, analyzing, and visualizing machine-generated data in real time.
  • Splunk reduces troubleshooting and resolving time by offering instant results.
  • Splunk is available in three different versions: 1) Splunk Enterprise 2) Splunk Light 3) Splunk Cloud.
  • The essential components of Splunk are: 1) Universal Forwarder (UF) 2) Load Balancer (LB) 3) Heavy Forwarder (HF) 4) Indexer 5) Search Head (SH) 6) Deployment Server (DS) 7) License Manager (LM).
  • Important applications of Splunk are: 1) Interactive map 2) Promotional support 3) Performance monitor 4) Real-time feedback 5) Dashboard and payment process.
  • The most important best practice of using Splunk is to use a test index so you can quickly run tests.
  • Famous companies like Cisco, Bosch, IBM, Motorola, Adobe, and Visa are using this tool.
  • Some alternatives to Splunk are: 1) Sumo Logic 2) ELK Stack 3) LogFaces 4) Fluentd.
  • The biggest drawback of Splunk is that it can prove expensive for large data volumes.
