Log Smarter, Not Harder - Transform Your Organization's Logs with Vector

7 mins
Vector Keycloak Logging Monitoring
Joshua Brewer

Transforming Logs Like a Pro with Vector.dev

Vector is a high-performance observability data pipeline that enables you to collect, transform, and route logs with ease. In this post, we’ll walk through how to use Vector to transform logs for better observability and operational efficiency.


🧠 Why Log Transformation Matters

Log data is often messy, inconsistent, or too verbose. Before sending logs to your SIEM, log aggregator, or storage system, it’s often necessary to:

  • Normalize field names and formats
  • Redact sensitive information
  • Enrich logs with metadata
  • Reduce noise by filtering unimportant logs
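Each of these operations maps to a short snippet of Vector Remap Language (VRL). Here is a minimal sketch — the `severity`, `password`, and `environment` field names are hypothetical, and noise filtering would live in a separate filter transform:

```vrl
# Normalize: rename severity -> level
.level = del(.severity)

# Redact: overwrite a sensitive field if present
if exists(.password) { .password = "[REDACTED]" }

# Enrich: attach static metadata
.environment = "dev"

# Reduce noise in a filter transform instead, e.g.:
#   condition: .level != "DEBUG"
```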

⚙️ What is Vector?

Vector.dev is a lightweight, ultra-fast tool written in Rust that lets you build streaming data pipelines. Key features:

  • Sources: Where data comes from (e.g., files, syslog, journald, Kubernetes)
  • Transforms: Modify logs using VRL (Vector Remap Language)
  • Sinks: Where logs go (e.g., Elasticsearch, Loki, S3, Kafka)
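To see how these three pieces fit together, here is a sketch of a standalone Vector config using the built-in demo_logs source, so nothing external is required:

```yaml
# vector.yaml - a complete pipeline: source -> transform -> sink
sources:
  demo:
    type: demo_logs
    format: json
transforms:
  tag_env:
    type: remap
    inputs:
      - demo
    source: |
      # Enrich every event with a static field
      .environment = "dev"
sinks:
  out:
    type: console
    inputs:
      - tag_env
    encoding:
      codec: json
```

Running `vector --config vector.yaml` prints the enriched demo events to stdout.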

Prerequisites

This walkthrough assumes some familiarity with Kubernetes and Helm, though you can follow along without deep expertise in either.

Tools used in this article

  • k3d
  • Helm
  • kubectl
  • jq

🛠️ Gathering logs from Kubernetes pods

For this walkthrough, we will deploy a dev Keycloak instance to gather logs from.

We will use Vector's kubernetes_logs source to collect logs from our pods.

Create a k3d cluster

Create our k3d cluster:
k3d cluster create keycloak-cluster

Deploy and Configure Keycloak

Deploy Keycloak in dev mode and ensure it is streaming logs in JSON format:
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install keycloak bitnami/keycloak -n keycloak --create-namespace --set logging.output=json --set logging.level=debug
We can now take a look at the JSON logs from Keycloak to get an idea of what data we are getting back. It may take a few minutes for Keycloak to start up and begin streaming logs.
kubectl logs -n keycloak -l app.kubernetes.io/name=keycloak -c keycloak | jq .

output:

{
  "timestamp": "2025-04-22T15:11:48.22487318Z",
  "sequence": 60413,
  "loggerClassName": "org.jboss.logging.Logger",
  "loggerName": "org.keycloak.transaction.JtaTransactionWrapper",
  "level": "DEBUG",
  "message": "JtaTransactionWrapper end. Request Context: HTTP GET /realms/master",
  "threadName": "executor-thread-1",
  "threadId": 24,
  "mdc": {},
  "ndc": "",
  "hostName": "keycloak-0",
  "processName": "/opt/bitnami/java/bin/java",
  "processId": 1
}

As we can see, the log output from Keycloak is not bad, but we can enrich it with some useful information.

Deploy and Configure Vector

We can now deploy Vector to our cluster to start gathering logs. I will be using the Vector Helm chart to deploy it. You may see an error, Error: INSTALLATION FAILED: 1 error occurred: * Service "vector" is invalid: spec.ports: Required value; this can be safely ignored.
helm repo add vector https://helm.vector.dev
helm repo update 
cat <<-'VALUES' > values.yaml
role: Agent
customConfig:
  data_dir: "vector-console-data/"
  sources:
    pod_logs:
      type: "kubernetes_logs"
  sinks:
    standard_out:
      type: "console"
      inputs:
        - pod_logs
      encoding:
        codec: "json"

extraVolumeMounts:
  - name: vector-console-data
    mountPath: /vector-console-data
extraVolumes:
  - name: vector-console-data
    emptyDir: {}
VALUES
helm install vector vector/vector -n vector --create-namespace --values values.yaml
In the values above we specify that we want to collect logs with the kubernetes_logs Vector source and send them to a console sink, which we can then view in the Vector pod's own logs.

After waiting a couple of minutes we can get the pod logs of our Vector pod and see that it is indeed collecting logs from all pods running in our cluster.
kubectl logs -n vector --selector=app.kubernetes.io/name=vector | jq . 
{
  "file": "/var/log/pods/kube-system_coredns-ccb96694c-fpqvr_cf31755d-9f6f-4749-bbe0-2dec6b633769/coredns/0.log",
  "kubernetes": {
    "container_id": "containerd://1066c3be6fa76983bb72231bc9975e3c7c65d28dc6bc60abf7168b9f289b0539",
    "container_image": "rancher/mirrored-coredns-coredns:1.12.0",
    "container_image_id": "docker.io/rancher/mirrored-coredns-coredns@sha256:82979ddf442c593027a57239ad90616deb874e90c365d1a96ad508c2104bdea5",
    "container_name": "coredns",
    "namespace_labels": {
      "kubernetes.io/metadata.name": "kube-system"
    },
    "node_labels": {
      "beta.kubernetes.io/arch": "arm64",
      "beta.kubernetes.io/instance-type": "k3s",
      "beta.kubernetes.io/os": "linux",
      "kubernetes.io/arch": "arm64",
      "kubernetes.io/hostname": "k3d-keycloak-cluster-server-0",
      "kubernetes.io/os": "linux",
      "node-role.kubernetes.io/control-plane": "true",
      "node-role.kubernetes.io/master": "true",
      "node.kubernetes.io/instance-type": "k3s"
    },
    "pod_ip": "10.42.0.3",
    "pod_ips": [
      "10.42.0.3"
    ],
    "pod_labels": {
      "k8s-app": "kube-dns",
      "pod-template-hash": "ccb96694c"
    },
    "pod_name": "coredns-ccb96694c-fpqvr",
    "pod_namespace": "kube-system",
    "pod_node_name": "k3d-keycloak-cluster-server-0",
    "pod_owner": "ReplicaSet/coredns-ccb96694c",
    "pod_uid": "cf31755d-9f6f-4749-bbe0-2dec6b633769"
  },
  "message": "[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.server",
  "source_type": "kubernetes_logs",
  "stream": "stdout",
  "timestamp": "2025-04-22T14:13:27.503695712Z"
}
{
  "file": "/var/log/pods/keycloak_keycloak-0_4b293248-6e0e-46d3-ab3d-75dfc0d379e8/keycloak/0.log",
  "kubernetes": {
    "container_id": "containerd://364e92023c3c31ae10497683cfe66ef91c45e5f559dafeedb1f5ad702c3d1d7e",
    "container_image": "docker.io/bitnami/keycloak:26.2.0-debian-12-r2",
    "container_image_id": "docker.io/bitnami/keycloak@sha256:eb39d4ec77208b724167d183a89c37612edd8efb3e6c0395ad5abb608d52362b",
    "container_name": "keycloak",
    "namespace_labels": {
      "kubernetes.io/metadata.name": "keycloak",
      "name": "keycloak"
    },
    "node_labels": {
      "beta.kubernetes.io/arch": "arm64",
      "beta.kubernetes.io/instance-type": "k3s",
      "beta.kubernetes.io/os": "linux",
      "kubernetes.io/arch": "arm64",
      "kubernetes.io/hostname": "k3d-keycloak-cluster-server-0",
      "kubernetes.io/os": "linux",
      "node-role.kubernetes.io/control-plane": "true",
      "node-role.kubernetes.io/master": "true",
      "node.kubernetes.io/instance-type": "k3s"
    },
    "pod_annotations": {
      "checksum/configmap-env-vars": "f0fa0260c40367946f68087eb6c2ea8768ce554818b5e2475e10d14a1ba47240",
      "checksum/secrets": "83224322b8ddbe1a5a3114b90e493f7b2522a29956749247581ea63241a2abfb"
    },
    "pod_ip": "10.42.0.7",
    "pod_ips": [
      "10.42.0.7"
    ],
    "pod_labels": {
      "app.kubernetes.io/app-version": "26.2.0",
      "app.kubernetes.io/component": "keycloak",
      "app.kubernetes.io/instance": "keycloak",
      "app.kubernetes.io/managed-by": "Helm",
      "app.kubernetes.io/name": "keycloak",
      "app.kubernetes.io/version": "26.2.0",
      "apps.kubernetes.io/pod-index": "0",
      "controller-revision-hash": "keycloak-654c49d649",
      "helm.sh/chart": "keycloak-24.5.7",
      "statefulset.kubernetes.io/pod-name": "keycloak-0"
    },
    "pod_name": "keycloak-0",
    "pod_namespace": "keycloak",
    "pod_node_name": "k3d-keycloak-cluster-server-0",
    "pod_owner": "StatefulSet/keycloak",
    "pod_uid": "4b293248-6e0e-46d3-ab3d-75dfc0d379e8"
  },
  "message": "{\"timestamp\":\"2025-04-22T14:11:38.202612884Z\",\"sequence\":57936,\"loggerClassName\":\"org.jboss.logging.Logger\",\"loggerName\":\"org.keycloak.transaction.JtaTransactionWrapper\",\"level\":\"DEBUG\",\"message\":\"JtaTransactionWrapper end. Request Context: HTTP GET /realms/master\",\"threadName\":\"executor-thread-1\",\"threadId\":24,\"mdc\":{},\"ndc\":\"\",\"hostName\":\"keycloak-0\",\"processName\":\"/opt/bitnami/java/bin/java\",\"processId\":1}",
  "source_type": "kubernetes_logs",
  "stream": "stdout",
  "timestamp": "2025-04-22T14:11:38.202753467Z"
}
As you can now see, we get much more data in our logs, but some of this data is not useful to us, and we are receiving logs from every pod in the cluster. Let's filter this down to just the Keycloak logs by adding a filter transform that selects events from the Keycloak pod based on its pod name.
cat <<-'VALUES' > values.yaml
role: Agent
customConfig:
  data_dir: "vector-console-data/"
  sources:
    pod_logs:
      type: "kubernetes_logs"
  transforms:
    keycloak_logs:
      type: "filter"
      inputs:
        - pod_logs
      condition: '.kubernetes.pod_name == "keycloak-0"' 
  sinks:
    standard_out:
      type: "console"
      inputs:
        - keycloak_logs 
      encoding:
        codec: "json"

extraVolumeMounts:
  - name: vector-console-data
    mountPath: /vector-console-data
extraVolumes:
  - name: vector-console-data
    emptyDir: {}
VALUES


helm upgrade --install vector vector/vector -n vector --create-namespace --values values.yaml
kubectl rollout restart daemonset vector -n vector
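The filter above matches the literal pod name, which breaks if the pod is renamed or the StatefulSet is scaled out. A more robust condition — a sketch, assuming the chart's standard labels — matches on the app.kubernetes.io/name pod label instead:

```yaml
  transforms:
    keycloak_logs:
      type: "filter"
      inputs:
        - pod_logs
      # Match the stable app label rather than one pod's name,
      # so the filter survives renames and scale-out
      condition: '.kubernetes.pod_labels."app.kubernetes.io/name" == "keycloak"'
```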
We can now see that we are getting logs from only the Keycloak pod. It can take a moment for Vector to come back up.
kubectl logs -n vector --selector=app.kubernetes.io/name=vector | jq .
{
  "file": "/var/log/pods/keycloak_keycloak-0_4b293248-6e0e-46d3-ab3d-75dfc0d379e8/keycloak/0.log",
  "kubernetes": {
    "container_id": "containerd://364e92023c3c31ae10497683cfe66ef91c45e5f559dafeedb1f5ad702c3d1d7e",
    "container_image": "docker.io/bitnami/keycloak:26.2.0-debian-12-r2",
    "container_image_id": "docker.io/bitnami/keycloak@sha256:eb39d4ec77208b724167d183a89c37612edd8efb3e6c0395ad5abb608d52362b",
    "container_name": "keycloak",
    "namespace_labels": {
      "kubernetes.io/metadata.name": "keycloak",
      "name": "keycloak"
    },
    "node_labels": {
      "beta.kubernetes.io/arch": "arm64",
      "beta.kubernetes.io/instance-type": "k3s",
      "beta.kubernetes.io/os": "linux",
      "kubernetes.io/arch": "arm64",
      "kubernetes.io/hostname": "k3d-keycloak-cluster-server-0",
      "kubernetes.io/os": "linux",
      "node-role.kubernetes.io/control-plane": "true",
      "node-role.kubernetes.io/master": "true",
      "node.kubernetes.io/instance-type": "k3s"
    },
    "pod_annotations": {
      "checksum/configmap-env-vars": "f0fa0260c40367946f68087eb6c2ea8768ce554818b5e2475e10d14a1ba47240",
      "checksum/secrets": "83224322b8ddbe1a5a3114b90e493f7b2522a29956749247581ea63241a2abfb"
    },
    "pod_ip": "10.42.0.7",
    "pod_ips": [
      "10.42.0.7"
    ],
    "pod_labels": {
      "app.kubernetes.io/app-version": "26.2.0",
      "app.kubernetes.io/component": "keycloak",
      "app.kubernetes.io/instance": "keycloak",
      "app.kubernetes.io/managed-by": "Helm",
      "app.kubernetes.io/name": "keycloak",
      "app.kubernetes.io/version": "26.2.0",
      "apps.kubernetes.io/pod-index": "0",
      "controller-revision-hash": "keycloak-654c49d649",
      "helm.sh/chart": "keycloak-24.5.7",
      "statefulset.kubernetes.io/pod-name": "keycloak-0"
    },
    "pod_name": "keycloak-0",
    "pod_namespace": "keycloak",
    "pod_node_name": "k3d-keycloak-cluster-server-0",
    "pod_owner": "StatefulSet/keycloak",
    "pod_uid": "4b293248-6e0e-46d3-ab3d-75dfc0d379e8"
  },
  "message": "{\"timestamp\":\"2025-04-22T14:36:39.317766759Z\",\"sequence\":59036,\"loggerClassName\":\"org.slf4j.impl.Slf4jLogger\",\"loggerName\":\"org.jgroups.protocols.dns.DNS_PING\",\"level\":\"DEBUG\",\"message\":\"keycloak-0-28856: sending discovery requests to hosts [10.42.0.7:0] on ports [7800 .. 7810]\",\"threadName\":\"jgroups-9,keycloak-0-28856\",\"threadId\":169,\"mdc\":{},\"ndc\":\"\",\"hostName\":\"keycloak-0\",\"processName\":\"/opt/bitnami/java/bin/java\",\"processId\":1}",
  "source_type": "kubernetes_logs",
  "stream": "stdout",
  "timestamp": "2025-04-22T14:36:39.317930801Z"
}
We can see that we still have a lot of data bloat and noise to filter out. We can reshape these events with Vector Remap Language (VRL).
cat <<-'VALUES' > values.yaml
role: Agent
customConfig:
  data_dir: "vector-console-data/"
  sources:
    pod_logs:
      type: "kubernetes_logs"
  transforms:
    keycloak_logs:
      type: "filter"
      inputs:
        - pod_logs
      condition: '.kubernetes.pod_name == "keycloak-0"' 
    keycloak_logs_filtered:
      type: remap
      inputs:
        - keycloak_logs
      source: |
        .pod_ip = .kubernetes.pod_ip 
        .node_name = .kubernetes.node_labels."kubernetes.io/hostname"
        .message = parse_json!(.message) 
  sinks:
    standard_out:
      type: "console"
      inputs:
        - keycloak_logs_filtered
      encoding:
        codec: "json"
        only_fields:
          - message
          - pod_ip
          - node_name

extraVolumeMounts:
  - name: vector-console-data
    mountPath: /vector-console-data
extraVolumes:
  - name: vector-console-data
    emptyDir: {}
VALUES


helm upgrade --install vector vector/vector -n vector --create-namespace --values values.yaml
kubectl rollout restart daemonset vector -n vector
As you can see, we have transformed the naming of our log fields and now emit only the message, pod IP, and node name.
{
  "message": {
    "hostName": "keycloak-0",
    "level": "DEBUG",
    "loggerClassName": "org.jboss.logging.Logger",
    "loggerName": "org.keycloak.transaction.JtaTransactionWrapper",
    "mdc": {},
    "message": "JtaTransactionWrapper  commit. Request Context: HTTP GET /realms/master",
    "ndc": "",
    "processId": 1,
    "processName": "/opt/bitnami/java/bin/java",
    "sequence": 59680,
    "threadId": 24,
    "threadName": "executor-thread-1",
    "timestamp": "2025-04-22T14:52:58.218502337Z"
  },
  "node_name": "k3d-keycloak-cluster-server-0",
  "pod_ip": "10.42.0.7"
}
{
  "message": {
    "hostName": "keycloak-0",
    "level": "DEBUG",
    "loggerClassName": "org.jboss.logging.Logger",
    "loggerName": "org.keycloak.transaction.JtaTransactionWrapper",
    "mdc": {},
    "message": "JtaTransactionWrapper end. Request Context: HTTP GET /realms/master",
    "ndc": "",
    "processId": 1,
    "processName": "/opt/bitnami/java/bin/java",
    "sequence": 59681,
    "threadId": 24,
    "threadName": "executor-thread-1",
    "timestamp": "2025-04-22T14:52:58.218646421Z"
  },
  "node_name": "k3d-keycloak-cluster-server-0",
  "pod_ip": "10.42.0.7"
}
We can now use this same method to standardize logs across the various applications throughout your organization.
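One caveat: parse_json! in the remap above aborts the event if the message is not valid JSON (for example, startup banners or stack traces). A more defensive version of that remap source, sketched here, falls back to keeping the raw line:

```yaml
      source: |
        .pod_ip = .kubernetes.pod_ip
        .node_name = .kubernetes.node_labels."kubernetes.io/hostname"
        # Fallible parse: keep the raw message if it is not valid JSON
        parsed, err = parse_json(.message)
        if err == null {
          .message = parsed
        }
```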

Cleaning Up

k3d cluster delete keycloak-cluster

Conclusion

We have seen how to use Vector to transform logs for better observability and operational efficiency. Vector is a very powerful tool and can be extended well beyond what we have seen here to completely transform the way your organization handles data.

What's Next?

I will be adding more content to this series, such as shipping Vector data into Loki and visualizing it with Grafana. Check back soon for more content.