Troubleshoot Logs ingestion issues for Kubernetes Operator

Overview

This KB details troubleshooting steps to follow when there are issues with collecting and sending logs to VMware Aria Operations for Applications using the Kubernetes Operator. Reported symptoms may include not being able to see your Kubernetes cluster on the Kubernetes Integration page, or no logs appearing when queried in the Logs user interface.

 

Kubernetes Operator

Wavefront Operator deploys Wavefront Logging as a Fluentd DaemonSet that collects, formats, filters, and adds Kubernetes metadata to container logs.
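As an illustration of what that Fluentd pipeline does, a minimal metadata-enrichment stanza might look like the following. This is a hedged sketch, not the operator's actual shipped configuration; the match pattern is an assumption, and the plugin named is inferred from the filter_kube_metadata log prefix that appears later in this article.

```
# Illustrative only - not the operator's shipped configuration.
# Assumes fluent-plugin-kubernetes_metadata_filter, which produces the
# [filter_kube_metadata] log prefix seen in the DaemonSet logs below.
<filter kubernetes.**>
  @type kubernetes_metadata
</filter>
```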


Note: The log excerpts and messages in this article are examples only. Dates, times, and environment variables will vary depending on your environment.

 

Procedure

  • Ensure the Wavefront integration is healthy locally
kubectl get wavefront -n observability-system

The result should look like this:


NAME        STATUS    PROXY           CLUSTER-COLLECTOR   NODE-COLLECTOR   LOGGING         AGE    MESSAGE
wavefront   Healthy   Running (1/1)   Running (1/1)       Running (2/2)    Running (2/2)   3h3m   All components are healthy

If STATUS is Unhealthy, check the Wavefront Operator troubleshooting documentation.
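The check above can also be scripted. The sketch below hard-codes the sample output for illustration; in practice you would pipe the real command (kubectl get wavefront -n observability-system --no-headers) instead of the sample variable.

```shell
# Sketch: scriptable version of the health check. The sample output is
# hard-coded for illustration; in practice pipe the real command:
#   kubectl get wavefront -n observability-system --no-headers
sample_output='wavefront   Healthy   Running (1/1)   Running (1/1)   Running (2/2)   Running (2/2)   3h3m   All components are healthy'

# STATUS is the second column of the output
status=$(echo "$sample_output" | awk '{print $2}')
if [ "$status" = "Healthy" ]; then
  echo "Wavefront integration is healthy"
else
  echo "Wavefront integration status: $status - check the MESSAGE column"
fi
```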

  • Check the Wavefront Logging Fluentd DaemonSet pod logs

Check the logs for the wavefront-logging component:

kubectl logs daemonset/wavefront-logging -n observability-system

You may find errors such as:

[error]: [filter_kube_metadata] Exception 'HTTP status code , Timed out connecting to server' encountered fetching pod metadata from Kubernetes API v1 endpoint https://10.0.0.1:443/api
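Errors like the one above can be surfaced quickly by filtering the DaemonSet logs for Fluentd error and warning markers. The sketch below hard-codes one sample line for illustration; in practice you would pipe the real command (kubectl logs daemonset/wavefront-logging -n observability-system) into the grep.

```shell
# Sketch: scan the DaemonSet logs for Fluentd error/warn lines. The sample
# line is hard-coded for illustration; in practice pipe the real command:
#   kubectl logs daemonset/wavefront-logging -n observability-system
logs='[error]: [filter_kube_metadata] Exception encountered fetching pod metadata from Kubernetes API v1 endpoint https://10.0.0.1:443/api'

# Print any error or warning lines emitted by Fluentd
echo "$logs" | grep -E '\[(error|warn)\]'
```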
  • Check Proxy logs

The proxy forwards logs, metrics, traces, and spans from all other components to VMware Aria Operations for Applications. If no data is flowing, the proxy is likely failing.

Errors are the first thing to look for, but INFO logs are also useful to confirm whether the proxy is sending logs or whether they are being blocked for some reason.

Check the proxy logs for INFO entries:

kubectl logs deployment/wavefront-proxy -n observability-system | grep INFO

You might find entries such as INFO [AbstractReportableEntityHandler:reject] [2878] blocked input, which indicate that some logs are being blocked.


Note: If you see ~proxy.log.*TooLong or ~proxy.log.tooManyLogTags metrics, check the limits for logs. If you want to increase the log limits for your Wavefront instance, contact Technical Support.
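If you want to watch those limit metrics directly, queries like the following can be charted in the metrics UI. The metric names are taken from the note above; the exact names matched by the wildcard may vary by proxy version, so treat this as a sketch:

```
ts(~proxy.log.tooManyLogTags)
rate(ts(~proxy.log.*TooLong))
```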

The output of a healthy proxy that is processing logs could look like:


2022-11-19 00:18:20,807 INFO [AbstractReportableEntityHandler:printTotal] [2878] Logs processed since start: 0; blocked: 4
2022-11-19 00:18:20,807 INFO [AbstractReportableEntityHandler:printStats] [2878] Logs received rate: 3 logs/s (1 min), 2 logs/s (5 min), 3 logs/s (current).
2022-11-19 00:18:28,300 INFO [AbstractReportableEntityHandler:printStats] [2878] Points received rate: 80 pps (1 min), 64 pps (5 min), 0 pps (current).
2022-11-19 00:18:30,808 INFO [AbstractReportableEntityHandler:printStats] [2878] Logs received rate: 3 logs/s (1 min), 2 logs/s (5 min), 1 logs/s (current).
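The counters in the printTotal lines can be extracted for a quick summary. The sketch below hard-codes one sample line copied from the excerpt above; in practice you would pipe the real command (kubectl logs deployment/wavefront-proxy -n observability-system | grep printTotal) instead.

```shell
# Sketch: extract the blocked-log count from a printTotal INFO line. The
# sample line is copied from the excerpt above; in practice pipe:
#   kubectl logs deployment/wavefront-proxy -n observability-system | grep printTotal
line='2022-11-19 00:18:20,807 INFO [AbstractReportableEntityHandler:printTotal] [2878] Logs processed since start: 0; blocked: 4'

# "blocked: N" is the final counter on the line
blocked=$(echo "$line" | grep -o 'blocked: [0-9]*' | awk '{print $2}')
echo "Logs blocked since proxy start: $blocked"
```

A steadily growing blocked count alongside a processed count of 0 is a sign the proxy is receiving logs but rejecting them, which points back to the limits note above.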
  • Verify Ingestion

Use the Wavefront Service and Proxy Data dashboard and check the Logs Received by Proxy (Bytes per Second) chart. If ingestion is working, the cluster's proxy pod should be reporting data in this metric.

For example, if three clusters are ingesting logs, three proxies will report in the chart.

 

These are initial validation steps for the ingestion flow. If you still encounter issues or have further questions, please contact Technical Support (see How to Engage Technical Support).

See also

Overview of Wavefront Operator for Kubernetes

Logs Troubleshooting
