Fields in Logs

A log line contains different attributes attached to it. These attributes helps you to filter your logs so that you can write effiecient queries and get your results faster. These attributes are reffered to as fields in SigNoz.

There are two kind of fields interesting and selected .

Interesting Log Fields

These kind of fields are the resource and log attributes which are parsed by the otel collector but is not indexed. These fields are also not auto suggested by the query builder. But you can still use these fields for querying by writing the query manually.

Selected Log Fields

These are created by converting an interesting field. When a interesting field is converted to selected field, an index is added to the field so that queries for this fields are faster. In addition to that when you write a query this fields will be autosuggested. Selected fields are also displyed explicitly with each log line.

Configuring the SigNoz Collector

If you take a look in the SigNoz repository, you’ll find the configuration for the collector in /deploy/docker/clickhouse-setup/otel-collector-config.yaml You can edit this file to filter what logs are being stored after being received by the collector.

After editing this file, you’ll need to restart the collector. If using Docker, the command would be docker restart signoz-otel-collector

Adding Atttributes

To add attributes to all lines logged by this collector, add a section to the processors to add an attribute. This example is a bit contrived, but let’s say that this collector is only gathering data for a single client organization:

      - key: client_id
        value: 1123
        action: insert

Adding this mapping, however, isn’t enough to add the attribute, we haven’t yet added this processor to our pipeline, check the pipelines mapping for logs:

      receivers: [otlp, tcplog/docker]
      processors: [logstransform/internal, batch]
      exporters: [clickhouselogsexporter]'

To actually affect our data, we must add it to the pipeline. The revised version looks like this:

      receivers: [otlp, tcplog/docker]
      processors: [attributes/clientid, logstransform/internal, batch]
      exporters: [clickhouselogsexporter]

Creating Log Fields

By default whenever you receive a log from a non OTLP receivers it will be stored directly in the body and you won't be able to filter logs based on fields/attributes. Opentelemetry provides different ways to parse attributes from your logs using different operators that the available. These parsed attributes are referred to as fields in signoz.

Ex :- Lets say we have our logs formatted as

{"time": "2022-09-20,15:27:17 +0530", "message": "Logging test...", "service": "python"}

Here we have a timestamp, a message and an attribute named service. Now we will have to parse these in our otel collector config.

    include: [ /tmp/app.log ]
    start_at: beginning
        - type: json_parser
            parse_from: attributes.time
            layout: '%Y-%m-%d,%H:%M:%S %z'
        - type: move
        from: attributes.message
        to: body
        - type: remove
        field: attributes.time
  • In the yaml file above we are using the json parser. This will parse the json log line and add it in the attributes key.
  • Since we want to populate timestamp we are using the timestamp parser and pointing it to attributes.time which was parsed by the json parser.
  • Now we want the value inside message key to be in the log body, so we are moving it to body using move operator.
  • And finally we are removing time from attributes as we have already populated the value of timestamp from it.

Transforming Attributes

Logs data can be transformed arbitrarily using the OpenTelemetry Transformation Language (OTTL). It’s not necessary or practical to learn all the ins and outs of this language, but let’s start with a simple bit of processing that adds some useful attributes.

Within the processors section. we’ll add a transform

    - context: log
        - set(severity_text, "FAIL") where body == "request failed"

When faced with excessively high cardinality data, it may be useful to 'crush' attribute values by replacing with generics. Here’s an example where some metric crushing will make our lives easier later

  replace_match(attributes[""], "/user/*/list/*", "/user/{userId}/list/{listId}")

Removing Sensitive Data from Logs

The collector is one more place where you can control potentially sensitive data from being collected or transmitted. You can remove attributes with simple regex-style matching, like in this example where all values should be removed:

    - context: log
        - set(severity_text, "FAIL") where body == "request failed"
        - replace_match(attributes["social_security_number"], "*", "{userSocial}")

Or you can use pattern matching to find any strings that look similar:

    - context: log
        - set(severity_text, "FAIL") where body == "request failed"
        - replace_all_patterns(attributes, "value", "^\\D*\\d{3}-\\d{2}-\\d{4}", "{ss_number}")

Remember that in these examples you'll need to add transform to the pipeline section of your config and restart the collector for these changes to take effect. The result is a value replaced with a placeholder

a detail view of a log with the attribute for a social security number replaced with a placeholder
Either of the above configurations results in this much safer placeholder.