Guide for Set Up of Telegraf for Monitoring Windows

This entry is part 2 of 5 in the series Collecting Performance Metrics

Problem

As good DBAs, we sometimes need to monitor Windows stats ourselves, or we just want to be nice to our system admins by sharing our shiny new toys and watching Windows with the same tools we use for SQL Server.

Install and Configure Telegraf on Windows

The solution is fairly simple now that you have set up Part 1 of this series. You can download the Windows conf file for Telegraf from my presentation. Below are the important pieces of the file. The main job of the OUTPUT PLUGINS section is to place the data in the InfluxDB database; the data will be housed in the same database as our SQL performance metrics. In the INPUT PLUGINS section, you can collect any Windows Performance Counters you want and group them into a “Measurement”. I’m using the dashboard that is on the Grafana website, along with the performance counters its authors set up to be collected.

###############################################################################
#                            OUTPUT PLUGINS                                   #
###############################################################################
[[outputs.influxdb]]
  ## The full HTTP or UDP URL for your InfluxDB instance.
  ##
  ## Multiple urls can be specified as part of the same cluster,
  ## this means that only ONE of the urls will be written to each interval.
  # urls = ["udp://127.0.0.1:8089"] # UDP endpoint example
  urls = ["http://10.1.1.11:8086"] # required
  ## The target database for metrics (telegraf will create it if not exists).
  database = "telegraf" # required

###############################################################################
#                            INPUT PLUGINS                                    #
###############################################################################
# Parent plugin table; every .object table below belongs to it.
[[inputs.win_perf_counters]]

[[inputs.win_perf_counters.object]]
  ObjectName = "PhysicalDisk"
  Instances = ["*"]
  Counters = ["Avg. disk sec/read","Avg. disk sec/write"]
  Measurement = "win_disk"

[[inputs.win_perf_counters.object]]
  # Processor usage, alternative to native; reports per core.
  ObjectName = "Processor"
  Instances = ["*"]
  Counters = ["% Idle Time", "% Interrupt Time", "% Privileged Time", "% User Time", "% Processor Time"]
  Measurement = "win_cpu"
  IncludeTotal=true #Set to true to include _Total instance when querying for all (*).

[[inputs.win_perf_counters.object]]
  # Disk times and queues
  ObjectName = "LogicalDisk"
  Instances = ["*"]
  Counters = ["% Idle Time", "% Disk Time","% Disk Read Time", "% Disk Write Time", "% User Time", "Current Disk Queue Length", "% free space", "free megabytes"]
  Measurement = "win_disk"
  #IncludeTotal=false #Set to true to include _Total instance when querying for all (*).

[[inputs.win_perf_counters.object]]
  ObjectName = "System"
  Counters = ["Context Switches/sec","System Calls/sec", "Processor Queue Length"]
  Instances = ["------"]
  Measurement = "win_system"
  #IncludeTotal=false #Set to true to include _Total instance when querying for all (*).

[[inputs.win_perf_counters.object]]
  # Example query where the Instance portion must be removed to get data back, such as from the Memory object.
  ObjectName = "Memory"
  Counters = ["Available Bytes","Cache Faults/sec","Demand Zero Faults/sec","Page Faults/sec","Pages/sec","Transition Faults/sec","Pool Nonpaged Bytes","Pool Paged Bytes"]
  Instances = ["------"] # Use 6 x - to remove the Instance bit from the query.
  Measurement = "win_mem"
  #IncludeTotal=false #Set to true to include _Total instance when querying for all (*).

[[inputs.win_perf_counters.object]]
  # more counters for the Network Interface Object can be found at
  # https://msdn.microsoft.com/en-us/library/ms803962.aspx
  ObjectName = "Network Interface"
  Counters = ["Bytes Received/sec","Bytes Sent/sec","Packets Received/sec","Packets Sent/sec","Output queue length"]
  Instances = ["*"] # All network interfaces.
  Measurement = "win_net"
  #IncludeTotal=false #Set to true to include _Total instance when querying for all (*).
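
If you want to add counters beyond the ones above, the names have to match what Windows exposes. Here is a minimal sketch, assuming you run it on the Windows server itself. `Get-Counter -ListSet` is a built-in cmdlet that lists the counter paths for an object; `ConvertTo-TelegrafObject` is a hypothetical helper (an illustrative name, not part of Telegraf) that turns those paths into a stanza shaped like the ones in the conf file. The `Paging File` object and the `win_swap` measurement name are only examples.

```powershell
# Hypothetical helper (not part of Telegraf): build a win_perf_counters
# object stanza from counter paths such as '\Paging File(*)\% Usage'.
function ConvertTo-TelegrafObject {
    param(
        [Parameter(Mandatory)] [string]   $ObjectName,
        [Parameter(Mandatory)] [string[]] $Paths,
        [Parameter(Mandatory)] [string]   $Measurement,
        [string] $Instance = '*'   # use '------' for objects without instances, such as Memory
    )
    # The counter name is everything after the last backslash in the path.
    $counters = $Paths | ForEach-Object { ($_ -split '\\')[-1] } | Sort-Object -Unique
    $quoted   = ($counters | ForEach-Object { '"' + $_ + '"' }) -join ', '
    @(
        '[[inputs.win_perf_counters.object]]'
        "  ObjectName = `"$ObjectName`""
        "  Instances = [`"$Instance`"]"
        "  Counters = [$quoted]"
        "  Measurement = `"$Measurement`""
    ) -join "`n"
}

# Example (Windows only): list the counters of one object and print a stanza.
# $set = Get-Counter -ListSet 'Paging File'
# ConvertTo-TelegrafObject -ObjectName $set.CounterSetName -Paths @($set.Paths) -Measurement 'win_swap'
```

The output is plain text to paste into telegraf.conf. You will probably want to trim the counter list to what your dashboard actually uses.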

Next, download the current Telegraf zip file for Windows from the website and unzip it to a network location so that we can use PowerShell to install it remotely on multiple servers.
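
The unzip step can also be scripted. This is a rough sketch, assuming PowerShell 5 or later (for `Expand-Archive`) and assuming the archive may unpack into a subfolder; `Publish-TelegrafBinary` is a hypothetical helper name, and both paths in the example are placeholders. It copies telegraf.exe to the top of the share, which is where the install script in this post looks for it.

```powershell
# Hypothetical helper: unzip a downloaded Telegraf archive and copy
# telegraf.exe to the top level of a share.
function Publish-TelegrafBinary {
    param(
        [Parameter(Mandatory)] [string] $ZipPath,    # the zip file you downloaded
        [Parameter(Mandatory)] [string] $SharePath   # for example \\server\telegraf
    )
    # Extract into a throwaway folder under the temp directory.
    $staging = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
    Expand-Archive -Path $ZipPath -DestinationPath $staging

    # The exe may sit in a subfolder of the archive, so search for it.
    $exe = Get-ChildItem -Path $staging -Recurse -Filter 'telegraf.exe' | Select-Object -First 1
    if (-not $exe) { throw "telegraf.exe not found in $ZipPath" }

    New-Item -Path $SharePath -ItemType Directory -Force | Out-Null
    Copy-Item -Path $exe.FullName -Destination $SharePath -Force
    Remove-Item -Path $staging -Recurse -Force
}

# Example (placeholder paths):
# Publish-TelegrafBinary -ZipPath 'C:\Temp\telegraf.zip' -SharePath '\\server\telegraf'
```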

Lastly, save the telegraf_windows.conf file to the same network location. We can then install and start the service on our Windows machines using PowerShell.

# Servers that should run Telegraf.
$servers = @(
    'server1', 'server2'
)
$servers | ForEach-Object {
    $server = $_
    Write-Host "$server..."
    # Copy the binary and config to the target over its admin share.
    # On a re-run, the copy can fail while the running service holds telegraf.exe open.
    New-Item -Path "\\$server\c$\Program Files\telegraf" -ItemType Directory -Force | Out-Null
    Copy-Item -Path "\\server\telegraf\telegraf.exe" -Destination "\\$server\c$\Program Files\telegraf\" -Force
    Copy-Item -Path "\\server\telegraf\telegraf_windows.conf" -Destination "\\$server\c$\Program Files\telegraf\telegraf.conf" -Force
    Invoke-Command -ComputerName $server -ScriptBlock {
        # Stop any existing service, register Telegraf as a service, then start it.
        Stop-Service -Name telegraf -ErrorAction SilentlyContinue
        & "c:\program files\telegraf\telegraf.exe" --service install --config "c:\program files\telegraf\telegraf.conf"
        Start-Service -Name telegraf
    }
}
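
After the install, it can help to confirm that the service actually came up on every server. The following is a small sketch, assuming PowerShell remoting is enabled on the targets (the install script above already relies on it). `Get-TelegrafServiceState` is a hypothetical helper name, and `NotInstalled` is just a label this sketch uses when no service is found.

```powershell
# Hypothetical helper: report the state of the telegraf service on each server.
function Get-TelegrafServiceState {
    param(
        [Parameter(Mandatory)] [string[]] $ComputerName
    )
    Invoke-Command -ComputerName $ComputerName -ScriptBlock {
        # No service object comes back if the install step did not register it.
        $svc = Get-Service -Name telegraf -ErrorAction SilentlyContinue
        [pscustomobject]@{
            Server = $env:COMPUTERNAME
            Status = if ($svc) { $svc.Status.ToString() } else { 'NotInstalled' }
        }
    } | Select-Object Server, Status
}

# Example, using the same placeholder names as the install script:
# Get-TelegrafServiceState -ComputerName 'server1', 'server2'
```

If a server reports the service as stopped, running `telegraf.exe --config` with the path to telegraf.conf and the `--test` flag on that server should print one round of collected metrics, or the config error that is keeping the service from starting.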

Setting up the Dashboard

Next, let’s set up the dashboard so you can see the data. The dashboard is available on the Grafana site here, and all of my dashboards are on GitHub. As before, click the orange icon in the right-hand corner, click Dashboards, then click Import. Click Upload .json File and select the file. Then click Load; the name of the dashboard will auto-populate, and you can change it if you like.
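
If the dashboard comes up empty, a quick check is to ask InfluxDB which measurements it has received. Here is a minimal sketch, assuming an InfluxDB 1.x server whose HTTP API is reachable without authentication. The URL and database name are the ones from the conf file above, and both function names are hypothetical helpers.

```powershell
# Hypothetical helper: pull the measurement names out of a SHOW MEASUREMENTS response.
function Select-InfluxMeasurementName {
    param(
        [Parameter(Mandatory)] $Response
    )
    $series = $Response.results[0].series
    if (-not $series) { return }   # nothing has been written to the database yet
    # Each row in "values" is a one-element array holding a measurement name.
    foreach ($row in $series[0].values) { $row[0] }
}

# Hypothetical helper: query the InfluxDB 1.x /query endpoint.
function Get-InfluxMeasurement {
    param(
        [string] $BaseUrl  = 'http://10.1.1.11:8086',
        [string] $Database = 'telegraf'
    )
    # For a GET request, the hashtable body is sent as query-string parameters.
    $response = Invoke-RestMethod -Method Get -Uri "$BaseUrl/query" -Body @{ db = $Database; q = 'SHOW MEASUREMENTS' }
    Select-InfluxMeasurementName -Response $response
}

# Example: Get-InfluxMeasurement
```

With the conf file from this post, you would expect to see names such as win_cpu, win_disk, win_mem, win_net and win_system once Telegraf has been running for a short while.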

Next

Next, we will set up the dashboard that monitors your InfluxDB. After that, we get to the post on the SQL performance metrics and how to edit the dashboards.

Series Navigation: << Guide for Set Up of Telegraf for Monitoring SQL Server xPlat | Setting up Telegraf to Startup on Windows >>
