Although the OpenTelemetry Collector sits at the heart of the OpenTelemetry architecture, it is unrelated to the W3C Trace Context. In my tracing demo, I used Jaeger instead of the Collector. Yet, it appears in virtually every OpenTelemetry-related post, so I wanted to explore it further.
In this post, we will look at the different aspects of the Collector:
- Data types: logs, metrics, traces
- Push and pull models
- Operations: read, convert, write
First steps
A long time ago, observability as we know it didn't exist; what we had instead was monitoring. Back then, monitoring meant a group of people looking at screens displaying dashboards. The dashboards themselves consisted only of metrics, and mainly system metrics: CPU, memory, and disk usage. For this reason, let's start with metrics.
Prometheus is one of the leading monitoring solutions. It works on a pull-based model: Prometheus scrapes your application's compatible endpoints and stores the collected metrics internally.
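As an illustration of the pull model, here is a minimal sketch of a prometheus.yml scrape job; the job name and target are hypothetical:
scrape_configs:
  - job_name: my-app                        # hypothetical job name
    scrape_interval: 15s
    static_configs:
      - targets: [ "my-app:8080" ]          # Prometheus pulls metrics from this endpoint
We will see the same scrape_configs structure again below, embedded in the OTEL Collector configuration.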
We will use the OTEL Collector to scrape a Prometheus-compatible endpoint and print the results to the console. Grafana Labs offers a project that generates random metrics to play with. For simplicity's sake, we'll use Docker Compose. The setup looks like this:
version: "3"
services:
  fake-metrics:
    build: ./fake-metrics-generator                                #1
  collector:
    image: otel/opentelemetry-collector:0.87.0                     #2
    environment:                                                   #3
      - METRICS_HOST=fake-metrics
      - METRICS_PORT=5000
    volumes:
      - ./config/collector/config.yml:/etc/otelcol/config.yaml:ro  #4
1. No Docker image is available for the fake metrics project, so we need to build it
2. Latest version of the OTEL Collector at the time of writing
3. Parameterize the following configuration file
4. Everything happens here
As mentioned above, the OTEL Collector can do a lot of things; hence, configuration is everything.
receivers:                                                #1
  prometheus:                                             #2
    config:
      scrape_configs:                                     #3
        - job_name: fake-metrics                          #4
          scrape_interval: 3s
          static_configs:
            - targets: [ "$env:METRICS_HOST:$env:METRICS_PORT" ]
exporters:                                                #5
  logging:                                                #6
    loglevel: debug
service:
  pipelines:                                              #7
    metrics:                                              #8
      receivers: [ "prometheus" ]                         #9
      exporters: [ "logging" ]                            #9
1. List of receivers. A receiver reads data; it can be either push-based or pull-based.
2. We use the prometheus pre-defined receiver
3. Define pull jobs
4. The job's configuration
5. List of exporters. In contrast to receivers, an exporter writes data.
6. The simplest exporter writes data to the standard output.
7. Pipelines assemble receivers and exporters
8. Define a metrics-related pipeline
9. The pipeline gets data from the previously defined prometheus receiver and sends it to the logging exporter, i.e., prints it
Here is a sample of the results:
2023-11-11 08:28:54 otel-collector-collector-1 | StartTimestamp: 1970-01-01 00:00:00 +0000 UTC
2023-11-11 08:28:54 otel-collector-collector-1 | Timestamp: 2023-11-11 07:28:54.14 +0000 UTC
2023-11-11 08:28:54 otel-collector-collector-1 | Value: 83.090000
2023-11-11 08:28:54 otel-collector-collector-1 | NumberDataPoints #1
2023-11-11 08:28:54 otel-collector-collector-1 | Data point attributes:
2023-11-11 08:28:54 otel-collector-collector-1 | -> fake__embrace_world_class_systems: Str(concept)
2023-11-11 08:28:54 otel-collector-collector-1 | -> fake__exploit_magnetic_applications: Str(concept)
2023-11-11 08:28:54 otel-collector-collector-1 | -> fake__facilitate_wireless_architectures: Str(extranet)
2023-11-11 08:28:54 otel-collector-collector-1 | -> fake__grow_magnetic_communities: Str(challenge)
2023-11-11 08:28:54 otel-collector-collector-1 | -> fake__reinvent_revolutionary_applications: Str(support)
2023-11-11 08:28:54 otel-collector-collector-1 | -> fake__strategize_strategic_initiatives: Str(internet_solution)
2023-11-11 08:28:54 otel-collector-collector-1 | -> fake__target_customized_eyeballs: Str(concept)
2023-11-11 08:28:54 otel-collector-collector-1 | -> fake__transform_turn_key_technologies: Str(framework)
2023-11-11 08:28:54 otel-collector-collector-1 | -> fake__whiteboard_innovative_partnerships: Str(matrices)
2023-11-11 08:28:54 otel-collector-collector-1 | StartTimestamp: 1970-01-01 00:00:00 +0000 UTC
2023-11-11 08:28:54 otel-collector-collector-1 | Timestamp: 2023-11-11 07:28:54.14 +0000 UTC
2023-11-11 08:28:54 otel-collector-collector-1 | Value: 53.090000
2023-11-11 08:28:54 otel-collector-collector-1 | NumberDataPoints #2
2023-11-11 08:28:54 otel-collector-collector-1 | Data point attributes:
2023-11-11 08:28:54 otel-collector-collector-1 | -> fake__expedite_distributed_partnerships: Str(approach)
2023-11-11 08:28:54 otel-collector-collector-1 | -> fake__facilitate_wireless_architectures: Str(graphical_user_interface)
2023-11-11 08:28:54 otel-collector-collector-1 | -> fake__grow_magnetic_communities: Str(policy)
2023-11-11 08:28:54 otel-collector-collector-1 | -> fake__reinvent_revolutionary_applications: Str(algorithm)
2023-11-11 08:28:54 otel-collector-collector-1 | -> fake__transform_turn_key_technologies: Str(framework)
2023-11-11 08:28:54 otel-collector-collector-1 | StartTimestamp: 1970-01-01 00:00:00 +0000 UTC
2023-11-11 08:28:54 otel-collector-collector-1 | Timestamp: 2023-11-11 07:28:54.14 +0000 UTC
2023-11-11 08:28:54 otel-collector-collector-1 | Value: 16.440000
2023-11-11 08:28:54 otel-collector-collector-1 | NumberDataPoints #3
2023-11-11 08:28:54 otel-collector-collector-1 | Data point attributes:
2023-11-11 08:28:54 otel-collector-collector-1 | -> fake__exploit_magnetic_applications: Str(concept)
2023-11-11 08:28:54 otel-collector-collector-1 | -> fake__grow_magnetic_communities: Str(graphical_user_interface)
2023-11-11 08:28:54 otel-collector-collector-1 | -> fake__target_customized_eyeballs: Str(extranet)
Beyond printing
The above is an excellent first step, but there's more to it than printing to the console. We will expose the metrics so that a regular Prometheus instance can scrape them, and we can add a Grafana dashboard to visualize them. While it may seem pointless, bear with it: it's only a stepping stone.
To achieve this, we only need to change the OTEL Collector configuration:
exporters:
  prometheus:                                             #1
    endpoint: ":$env:PROMETHEUS_PORT"                     #2
service:
  pipelines:
    metrics:
      receivers: [ "prometheus" ]
      exporters: [ "prometheus" ]                         #3
1. Add the prometheus exporter
2. Expose a Prometheus-compliant endpoint
3. Replace printing with exposing
That's it: the OTEL Collector is very flexible.
Note that the Collector is multi-input and multi-output. To both print the data and expose it via the endpoint, we add both exporters to the pipeline:
exporters:
  prometheus:                                             #1
    endpoint: ":$env:PROMETHEUS_PORT"
  logging:                                                #2
    loglevel: debug
service:
  pipelines:
    metrics:
      receivers: [ "prometheus" ]
      exporters: [ "prometheus", "logging" ]              #3
1. Expose the data
2. Print the data
3. The pipeline both prints and exposes the data
Configuring the Prometheus exporter allows you to visualize metrics in Grafana.
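For instance, a regular Prometheus instance could scrape the collector's newly exposed endpoint with a configuration along these lines; the collector service name and the 8889 port are assumptions based on the Compose setup:
scrape_configs:
  - job_name: otel-collector
    static_configs:
      - targets: [ "collector:8889" ]       # the endpoint exposed via PROMETHEUS_PORT
Grafana then uses this Prometheus instance as its data source.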
Note that receivers and exporters specify their type, and every one of them must be unique. To comply with the latter requirement, you can append a qualifier to distinguish between them, e.g., prometheus/foo and prometheus/bar, as in the sketch below.
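For example, a sketch with two exporters of the same type, distinguished by hypothetical qualifiers and ports:
exporters:
  prometheus/public:                        # same type, different qualifier
    endpoint: ":8889"
  prometheus/internal:
    endpoint: ":8890"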
Intermediate data processing
It's natural to ask why the OTEL Collector sits between the source and Prometheus, as it makes the overall design more fragile. This is the stage where we can leverage the true power of the OTEL Collector: data processing. So far, we have ingested raw metrics, but the source format may not suit how we want to visualize the data. For example, in our setup, metrics come from our fake generator, "business", and from the underlying NodeJS platform, "technical". This is reflected in the metrics' names. We could add a dedicated source label and remove an unnecessary prefix to filter more efficiently.
You declare data processors in the processors section of the configuration file. The collector executes them in the order in which they are declared. Let's implement the above transformation.
The first step toward our goal is to understand that the collector comes in two flavors: a "core" one and a contrib one that builds upon it. The processors included in the former are limited in both number and capabilities; therefore, we need to switch to the contrib version:
collector:
  image: otel/opentelemetry-collector-contrib:0.87.0                      #1
  environment:
    - METRICS_HOST=fake-metrics
    - METRICS_PORT=5000
    - PROMETHEUS_PORT=8889
  volumes:
    - ./config/collector/config.yml:/etc/otelcol-contrib/config.yaml:ro   #2
1. Use the contrib flavor
2. For extra fun, the configuration file is on a different path
At this point, we can add the processor itself:
processors:
  metricstransform:                                       #1
    transforms:                                           #2
      - include: ^fake_(.*)$                              #3
        match_type: regexp                                #3
        action: update
        operations:                                       #4
          - action: add_label                             #5
            new_label: origin
            new_value: fake
      - include: ^fake_(.*)$
        match_type: regexp
        action: update                                    #6
        new_name: $$1                                     #6-7
      # Do the same for the metrics generated by NodeJS
1. Invoke the metrics transform processor
2. List of transforms applied in order
3. Match all metrics against the defined regular expression
4. List of operations applied in order
5. Add the label
6. Rename the metric by removing the regular expression group prefix (illustrated after this list)
7. Interesting bit: the syntax is $$x
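To make the effect concrete, here is what a hypothetical metric would look like before and after the two transforms:
# Hypothetical metric, before the transforms:
#   fake_http_requests_total{path="/"}
# After the transforms (prefix stripped, origin label added):
#   http_requests_total{origin="fake", path="/"}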
Finally, add the defined processor to the pipeline.
service:
  pipelines:
    metrics:
      receivers: [ "prometheus" ]
      processors: [ "metricstransform" ]
      exporters: [ "prometheus" ]
With this in place, the exposed metrics have the prefix stripped and carry the new origin label.
Connecting the receiver and exporter
A connector is both a receiver and an exporter and connects two pipelines. The example from the documentation receives the number of spans (traces) and exports the count as a metric. I tried to achieve the same with 500 errors – spoiler: it doesn't work as intended.
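Before trying the 500-errors variant, here is a sketch of that documented span-counting pattern; the otlp receiver and the prometheus exporter shown here are assumptions, only the connector wiring is the point:
receivers:
  otlp:
    protocols:
      grpc:
exporters:
  prometheus:
    endpoint: ":8889"
connectors:
  count:                                    # counts the spans it receives
service:
  pipelines:
    traces:
      receivers: [ "otlp" ]
      exporters: [ "count" ]                # the connector is the exporter of the traces pipeline...
    metrics:
      receivers: [ "count" ]                # ...and the receiver of the metrics pipeline
      exporters: [ "prometheus" ]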
Let’s start by adding a log receiver.
receivers:
  filelog:
    include: [ "/var/logs/generated.log" ]
Next, add the connector.
connectors:
  count:
    requests.errors:
      description: Number of 500 errors
      condition: [ "status == 500 " ]
Finally, connect the log receiver and metrics exporter.
service:
  pipelines:
    logs:
      receivers: [ "filelog" ]
      exporters: [ "count" ]
    metrics:
      receivers: [ "prometheus", "count" ]
The metric is named log_record_count_total, but its value stays at 1.
Log operations
Processors allow data manipulation; operators are specialized processors dedicated to logs. If you're familiar with the ELK stack, they are the equivalent of Logstash.
As of now, the log's timestamp is the ingestion timestamp. Let's change it to the timestamp of its creation:
receivers:
  filelog:
    include: [ "/var/logs/generated.log" ]
    operators:
      - type: json_parser                                 #1
        timestamp:                                        #2
          parse_from: attributes.datetime                 #3
          layout: "%d/%b/%Y:%H:%M:%S %z"                  #4
        severity:                                         #2
          parse_from: attributes.status                   #3
          mapping:                                        #5
            error: 5xx                                    #6
            warn: 4xx
            info: 3xx
            debug: 2xx
      - id: remove_body                                   #7
        type: remove
        field: body
      - id: remove_datetime                               #7
        type: remove
        field: attributes.datetime
      - id: remove_status                                 #7
        type: remove
        field: attributes.status
1. The log is in JSON format; we can use the provided JSON parser
2. Metadata attributes to set
3. Field to read from
4. Parsing pattern
5. Mapping table
6. Accepts a range, e.g., 501-599. The operator has a special interpreted value, 5xx (and similar), for HTTP statuses.
7. Remove duplicated data
Logs
At this point, we can send the logs to any log aggregation component. We will stay in the Grafana Labs sphere and use Loki:
exporters:
  loki:
    endpoint: "
You can also use logs from the collector itself.
service:
  telemetry:
    logs:
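As a sketch, this section can also tune the collector's own logging, for example its verbosity; the value below is only an illustration:
service:
  telemetry:
    logs:
      level: debug                          # the collector's own log level (assumed value)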
Finally, let’s add another pipeline.
service:
  pipelines:
    logs:
      receivers: [ "filelog" ]
      exporters: [ "loki" ]
Grafana can also visualize the logs; just select Loki as the data source.
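A minimal sketch of provisioning Loki as a Grafana data source; the file path, data source name, and URL are assumptions:
# e.g., ./config/grafana/provisioning/datasources/loki.yml (hypothetical path)
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    url: http://loki:3100                   # assumed Loki service name and default port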
Conclusion
In this post, we learned more about the OpenTelemetry Collector. While it is not a mandatory part of the OTEL architecture, it is a useful Swiss Army knife for all your data processing needs. It is a tremendous help if you are not, or don't want to be, tied to a specific stack.
The complete source code for this post can be found on GitHub.
To proceed further:
Originally published at A Java Geek on November 12th, 2023