This KB article details troubleshooting steps to follow when there are issues with collecting and sending logs to VMware Aria Operations for Applications using the Kubernetes Operator. Reported symptoms may include: the Kubernetes cluster does not appear on the Kubernetes Integration page, or no logs are returned when queried in the Logs user interface.
Wavefront Operator deploys Wavefront Logging as a Fluentd DaemonSet that collects, formats, filters, and adds Kubernetes metadata to container logs.
Note: The following log excerpts/messages are only examples. Date, time, and environment variables may vary depending on your environment.
- Ensure the Wavefront integration is healthy on the cluster
kubectl get wavefront -n observability-system
The result should look like this:
NAME STATUS PROXY CLUSTER-COLLECTOR NODE-COLLECTOR LOGGING AGE MESSAGE
wavefront Healthy Running (1/1) Running (1/1) Running (2/2) Running (2/2) 3h3m All components are healthy
If the STATUS is Unhealthy, the MESSAGE column indicates which component is failing; continue with the troubleshooting steps below.
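If the integration is Unhealthy, the following sketch drills into the failing component. It assumes the default observability-system namespace and the custom resource name wavefront shown above; adjust both if your deployment was customized.

```shell
# Namespace and resource name assumed from the defaults above.
NS=observability-system

# Skip gracefully when kubectl is not on the PATH.
if command -v kubectl >/dev/null 2>&1; then
  # Show the full status block of the Wavefront custom resource,
  # including per-component health messages.
  kubectl describe wavefront wavefront -n "$NS"

  # List the operator-managed pods to spot crash loops or pending pods.
  kubectl get pods -n "$NS"
fi
```

The pod listing is often the fastest way to see whether the proxy, collectors, or logging DaemonSet are the unhealthy piece.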
- Check Wavefront Logging Fluentd DaemonSet pods logs
Check Logs for wavefront-logging component:
kubectl logs daemonset/wavefront-logging -n observability-system
You may find errors such as:
[error]: [filter_kube_metadata] Exception 'HTTP status code , Timed out connecting to server' encountered fetching pod metadata from Kubernetes API v1 endpoint https://10.0.0.1:443/api
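This error indicates the Fluentd filter_kube_metadata plugin cannot reach the Kubernetes API server to fetch pod metadata, which is often an RBAC or network restriction. A minimal sketch for checking the RBAC side, assuming the wavefront-logging DaemonSet name and namespace from the command above (the service account lookup path is an assumption; verify it against your DaemonSet spec):

```shell
NS=observability-system

if command -v kubectl >/dev/null 2>&1; then
  # Read the service account the logging DaemonSet runs as.
  SA=$(kubectl get daemonset wavefront-logging -n "$NS" \
         -o jsonpath='{.spec.template.spec.serviceAccountName}')

  # Confirm that account may read pod metadata,
  # which filter_kube_metadata requires.
  kubectl auth can-i get pods --as="system:serviceaccount:${NS}:${SA}"
fi
```

If the answer is yes, look instead at network policies or firewalls between the nodes and the API server endpoint shown in the error.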
- Check Proxy logs
The proxy forwards logs, metrics, traces, and spans from all other components to VMware Aria Operations for Applications. If no data is flowing, the proxy is likely failing.
Errors may be present, but INFO logs are also useful to confirm whether the proxy is sending logs or is being blocked for some reason. Check the proxy logs for INFO entries:
kubectl logs deployment/wavefront-proxy -n observability-system | grep INFO
You might find an entry like:
INFO [AbstractReportableEntityHandler:reject]  blocked input: some logs are getting blocked.
If blocked logs correlate with the ~proxy.log.tooManyLogTags metric, check the limits for logs. If you want to increase the log limits for your Wavefront instance, contact Technical Support.
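To see which log lines the proxy is rejecting and why, a quick filter over recent proxy output can help. This is a sketch using the namespace and deployment name from the commands above; the --since window is an arbitrary choice.

```shell
NS=observability-system

if command -v kubectl >/dev/null 2>&1; then
  # Surface recent reject/blocked messages from the proxy,
  # keeping only the last few for readability.
  kubectl logs deployment/wavefront-proxy -n "$NS" --since=15m \
    | grep -iE 'blocked|reject' | tail -n 20
fi
```

The blocked-input messages usually name the offending tag or limit, which tells you whether to trim log tags or request a limit increase.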
A healthy proxy processing logs could produce output like:
2022-11-19 00:18:20,807 INFO [AbstractReportableEntityHandler:printTotal]  Logs processed since start: 0; blocked: 4
2022-11-19 00:18:20,807 INFO [AbstractReportableEntityHandler:printStats]  Logs received rate: 3 logs/s (1 min), 2 logs/s (5 min), 3 logs/s (current).
2022-11-19 00:18:28,300 INFO [AbstractReportableEntityHandler:printStats]  Points received rate: 80 pps (1 min), 64 pps (5 min), 0 pps (current).
2022-11-19 00:18:30,808 INFO [AbstractReportableEntityHandler:printStats]  Logs received rate: 3 logs/s (1 min), 2 logs/s (5 min), 1 logs/s (current).
- Verify Ingestion
Use the Wavefront Service and Proxy Data dashboard and check the Logs Received by Proxy (Bytes per Second) chart. If ingestion is working, the cluster's proxy pod reports a rate in this metric.
These are the initial validation steps for the ingestion flow. If you still encounter issues or have further questions, please contact Technical Support: How to Engage Technical Support.