ELK Stack Setup in Azure: Fetching Data from Event Hub

Chirag Patel
7 min read · Oct 22, 2020

Prerequisites

1. Basic knowledge of the ELK stack (Elasticsearch, Logstash, Kibana).
2. Familiarity with the Azure Portal, and an active Azure account.

Basic Intro About Used Services

Elasticsearch is a real-time, distributed storage, search, and analytics engine. It can be used for many purposes, but one context where it excels is indexing streams of semi-structured data, such as logs or decoded network packets.
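To make "indexing streams of semi-structured data" concrete, here is a minimal, hypothetical example against an Elasticsearch node on localhost:9200: index one JSON log document, then search it back. The index name app-logs is an arbitrary placeholder:

curl -X POST "http://localhost:9200/app-logs/_doc" -H "Content-Type: application/json" -d '{"level": "error", "message": "connection timed out"}'
curl "http://localhost:9200/app-logs/_search?q=message:timed"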

Logstash is an open-source data collection engine with real-time pipelining capabilities. Logstash can dynamically unify data from disparate sources and normalize the data into destinations of your choice. Cleanse and democratize all your data for diverse advanced downstream analytics and visualization use cases.

Kibana is an open-source analytics and visualization platform designed to work with Elasticsearch. You use Kibana to search, view, and interact with data stored in Elasticsearch indices. You can easily perform advanced data analysis and visualize your data in a variety of charts, tables, and maps.

Azure Event Hubs is a big data streaming platform and event ingestion service. It can receive and process millions of events per second. Data sent to an event hub can be transformed and stored by using any real-time analytics provider or batching/storage adapters.

Let's Start!

1. Log in to the Azure portal: https://portal.azure.com/

2. In the portal search bar, type "Elasticsearch (Self-Managed)", find it in the Marketplace section, and click it.

3. On the Elasticsearch page, click the Create button. This redirects you to the configuration page for creating the Elasticsearch cluster, Kibana, and Logstash setup.

4. Basic Section:
Subscription: <select your subscription> (ex: subscription-1)
Resource group: <select your resource group> (ex: elk-rg). Create a new one if it does not exist.
Region: <select your region for this deployment> (ex: South Central US)
Username: <username for logging in to the ELK virtual machines> (ex: elkadmin)
Authentication type: <select Password> (you can also use an SSH public key if you prefer)
Password: <enter a strong password for logging in to the ELK virtual machines> (ex: ElK$Set!78%!!1)
Confirm password: <enter the same password as above>
=> Click Next: Cluster Settings

5. Cluster Settings:
Elasticsearch version: <select the latest version> (v7.9.0)
Cluster name: <enter a cluster name> (ex: elk-cluster)
Virtual network: <keep the default> (you can also create a new one or select an existing one)
Elasticsearch node subnet: <keep the default> (you can also create a new one or select an existing one)
=> Click on Next: Nodes Configuration

6. Nodes Configuration:
Hostname prefix: the prefix used when naming the cluster's virtual machines. Hostnames are used to resolve master nodes, so if you are deploying into an existing virtual network that already contains an Elasticsearch cluster, set this to a unique prefix to differentiate this cluster's hostnames from the existing cluster's. (ex: elk)
=> For the Data nodes section:
Number of data nodes: 3
Data node VM size: DS1 v2 (1 vCPU, 3.5 GB memory)
Data nodes are master eligible: setting this to Yes makes the data nodes master eligible, and the 3 dedicated master nodes will no longer be deployed. Select Yes.
=> Data node disks
Number of managed disks per data node: 1
Size of each managed disk: 32 GiB
Type of managed disks: the storage type of the managed disks. The default is Premium disks for VM sizes that support them, and Standard disks for those that do not. Choose "Standard disks".
=> Master nodes
Master node VM size: DS1 v2 (1 vCPU, 3.5 GB memory)
Client nodes (optional): 0
=> Choose options based on your load and requirements.
=> Click Next: Kibana & Logstash

7. Kibana & Logstash
=> Kibana
Install Kibana: Yes
Kibana VM size: Standard A2 v2 (2 vCPU, 4 GB memory)
=> Logstash
Install Logstash: Yes
Number of Logstash VMs: 1
Logstash VM size: Standard DS1 v2 (1 vCPU, 3.5 GB memory)
Logstash config file: skip this for now; we will add it manually later.
Additional Logstash plugins: logstash-input-azure_event_hubs
=> External Access

Use a jump box: No (a jump box lets you connect to your cluster from a public access point such as SSH. This is usually unnecessary when Kibana is installed, since the Kibana VM itself can act as a jump box.)
Load balancer type: External (choose whether the load balancer should be public-facing (external) or internal).
=> Click Next: Security

8. Security:
=> In this section, set passwords for all the built-in users of the ELK Stack.
=> click on Next: Certificates

9. Certificates:
=> In this section you can set up certificates for HTTP and transport-layer TLS.
=> Configure them if you need to; otherwise keep the defaults and continue.
=> Click Next: Review + Create

10. Review + Create:
=> Wait for Azure to validate the details, then click Create.
=> Wait a while for the deployment to succeed.
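If you prefer the command line, you can also watch the deployment status from the Azure CLI (a sketch, assuming the resource group elk-rg from step 4):

az deployment group list --resource-group elk-rg --query "[].{name:name, state:properties.provisioningState}" -o table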

***Let's create an Event Hubs namespace and an event hub in Azure***

11. Create an Event Hubs namespace, event hub & consumer group:
=> Follow the guide at the link below to create an Event Hubs namespace and an event hub:
https://docs.microsoft.com/en-us/azure/event-hubs/event-hubs-create
=> After that, open the event hub and create a consumer group.
=> Then copy the event hub's "Connection string–primary key".
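If you would rather script this step, the same resources can be created with the Azure CLI. This is a sketch; the names elk-ns, elk-hub, and elk-cg are placeholders, and the last command reads the default namespace-level connection string (the portal's per-hub "Connection string–primary key" works the same way):

az eventhubs namespace create --resource-group elk-rg --name elk-ns --location southcentralus
az eventhubs eventhub create --resource-group elk-rg --namespace-name elk-ns --name elk-hub
az eventhubs eventhub consumer-group create --resource-group elk-rg --namespace-name elk-ns --eventhub-name elk-hub --name elk-cg
az eventhubs namespace authorization-rule keys list --resource-group elk-rg --namespace-name elk-ns --name RootManageSharedAccessKey --query primaryConnectionString -o tsv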

***Now you are ready: an Elasticsearch cluster, Logstash, and Kibana are running on virtual machines in your Azure environment. Let's configure the event hub and Logstash to run a near real-time log-fetching pipeline and visualize the data in a Kibana dashboard.***

12. Go to the resource group you used for this deployment
=> In the portal search bar, type "Resource groups" and click it.
=> Click the resource group you chose or created when deploying the ELK service (from step 4).

13. SSH into the Logstash virtual machine via the Kibana virtual machine
=> Find the Kibana virtual machine and click on it.
=> You will find Kibana's public IP address in the Overview section.
=> Open a terminal on your local machine and SSH into the Kibana VM.
=> Command: ssh <admin>@<public IP of kibana> (ex: ssh elkadmin@255.255.255.255)
=> Here <admin> is the username from step 4. On the first connection, SSH asks to add the host to the known hosts; type yes and enter the password. You should now be inside the Kibana virtual machine.
=> From that Kibana SSH session, log in to the Logstash virtual machine the same way (ex: ssh <admin>@<private IP of logstash>). You will find the private IP in the Overview section of the Logstash VM.
=> You are now SSHed into the Logstash VM.
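As a shortcut, recent versions of OpenSSH can make this two-hop connection in a single command using the -J (jump host) flag, with the Kibana VM acting as the jump box:

ssh -J <admin>@<public IP of kibana> <admin>@<private IP of logstash>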

14. Run the pipeline on the Logstash virtual machine
=> Go to the config folder: cd /etc/logstash/conf.d/
=> In that folder, create the file logstash.conf and add the content below.

input {
  azure_event_hubs {
    event_hub_connections => ["<event-hub-connection-string>"]
    threads => 16
    decorate_events => true
    consumer_group => "<event-hub-consumer-group>"
    initial_position => "end"
    storage_connection => "<storage-account-connection-string>"
    storage_container => "<storage-container-name>"
  }
}
## Add your filters / logstash plugins configuration here
filter {
  ## Parse the JSON payload of each event and drop the raw message field.
  json {
    source => "message"
    remove_field => ["message"]
  }
  ## Ruby code cannot sit directly in the filter block; it must be wrapped
  ## in a ruby filter. For non-delete events, lift the fields of
  ## payload.after to the top level, then record the operation type in lowercase.
  ruby {
    code => "
      if event.get('[payload][op]') != 'd'
        after = event.get('[payload][after]')
        after.each { |k, v| event.set(k, v) } if after
      end
      op = event.get('[payload][op]')
      event.set('op', op.downcase) if op
    "
  }
  mutate {
    remove_field => ["schema", "payload"]
  }
}
output {
  stdout { codec => rubydebug }
  ## Index everything except delete events into Elasticsearch.
  if [op] != "d" {
    elasticsearch {
      hosts => ["<elasticsearch hosts>:9200"]
      index => "sql-server-%{+YYYY-MM-dd}"
      user => "elastic"
      password => "<password for built-in elastic user>"
      sniffing => true
    }
  }
}

=> This configuration filters the logs that flow from SQL Server into Event Hub.
=> Change the following parameters in the file:

event_hub_connections: your event hub connection string (from step 11)
consumer_group: your event hub consumer group (from step 11)
storage_connection: a storage account connection string (create a storage account if one does not exist; see the CLI sketch after this list)
storage_container: the name of a blob container in that storage account (not the storage account name)
hosts: the Elasticsearch hosts (you can find them on the internal load balancer)
user: the built-in 'elastic' user
password: the password for the built-in 'elastic' user
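The storage account is used by the azure_event_hubs plugin to checkpoint its position in the event stream. If you do not already have one, here is a hedged Azure CLI sketch; the names elkcheckpoints (storage account) and elk-offsets (container) are placeholders of my choosing:

az storage account create --resource-group elk-rg --name elkcheckpoints --sku Standard_LRS
az storage account show-connection-string --resource-group elk-rg --name elkcheckpoints -o tsv
az storage container create --name elk-offsets --connection-string "<output of the previous command>"

Use the connection string for storage_connection and the container name for storage_container.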

=> Make sure you replace any curly quotes ("") with normal straight quotes (") when you add the content to the file; editors like Medium often convert them.
=> Run the command below to start the pipeline.

sudo /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash.conf
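Tip: before starting the pipeline, you can validate the file with Logstash's built-in config check, which prints "Configuration OK" on success:

sudo /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash.conf --config.test_and_exit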

Congratulations, you have successfully created a near real-time pipeline from Event Hub to the ELK Stack.
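To confirm documents are arriving, send a few events to the event hub and query the index directly (a sketch, reusing the hosts and the built-in elastic credentials from the Logstash config):

curl -u elastic:<password> "http://<elasticsearch hosts>:9200/sql-server-*/_search?size=1&pretty"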

Feel free to comment if you face any issues, and share the article if you liked it.

