Writing traces based ClickHouse queries for building dashboard panels

Traces Schema

✅ Info

The distributed tables in clickhouse have been named by prefixing distributed_ to existing single shard table names. If you want to use clickhouse queries in dashboard or alerts, you should use the distributed table names. Eg, signoz_index_v3 now corresponds to the table of a single shard. To query all the shards, query against distributed_signoz_index_v3.

distributed_signoz_index_v3

This is primary table of Traces which is queried to fetch spans and apply aggregation on spans. It has over 30 different columns which helps in faster filtering on most common attributes following OpenTelemetry Trace Semantic conventions.

(
    `ts_bucket_start` UInt64 CODEC(DoubleDelta, LZ4),
    `resource_fingerprint` String CODEC(ZSTD(1)),
    `timestamp` DateTime64(9) CODEC(DoubleDelta, LZ4),
    `trace_id` FixedString(32) CODEC(ZSTD(1)),
    `span_id` String CODEC(ZSTD(1)),
    `trace_state` String CODEC(ZSTD(1)),
    `parent_span_id` String CODEC(ZSTD(1)),
    `flags` UInt32 CODEC(T64, ZSTD(1)),
    `name` LowCardinality(String) CODEC(ZSTD(1)),
    `kind` Int8 CODEC(T64, ZSTD(1)),
    `kind_string` String CODEC(ZSTD(1)),
    `duration_nano` UInt64 CODEC(T64, ZSTD(1)),
    `status_code` Int16 CODEC(T64, ZSTD(1)),
    `status_message` String CODEC(ZSTD(1)),
    `status_code_string` String CODEC(ZSTD(1)),
    `attributes_string` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    `attributes_number` Map(LowCardinality(String), Float64) CODEC(ZSTD(1)),
    `attributes_bool` Map(LowCardinality(String), Bool) CODEC(ZSTD(1)),
    `resources_string` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    `events` Array(String) CODEC(ZSTD(2)),
    `links` String CODEC(ZSTD(1)),
    `response_status_code` LowCardinality(String) CODEC(ZSTD(1)),
    `external_http_url` LowCardinality(String) CODEC(ZSTD(1)),
    `http_url` LowCardinality(String) CODEC(ZSTD(1)),
    `external_http_method` LowCardinality(String) CODEC(ZSTD(1)),
    `http_method` LowCardinality(String) CODEC(ZSTD(1)),
    `http_host` LowCardinality(String) CODEC(ZSTD(1)),
    `db_name` LowCardinality(String) CODEC(ZSTD(1)),
    `db_operation` LowCardinality(String) CODEC(ZSTD(1)),
    `has_error` Bool CODEC(T64, ZSTD(1)),
    `is_remote` LowCardinality(String) CODEC(ZSTD(1)),
    `resource_string_service$$name` LowCardinality(String) DEFAULT resources_string['service.name'] CODEC(ZSTD(1)),
    `attribute_string_http$$route` LowCardinality(String) DEFAULT attributes_string['http.route'] CODEC(ZSTD(1)),
    `attribute_string_messaging$$system` LowCardinality(String) DEFAULT attributes_string['messaging.system'] CODEC(ZSTD(1)),
    `attribute_string_messaging$$operation` LowCardinality(String) DEFAULT attributes_string['messaging.operation'] CODEC(ZSTD(1)),
    `attribute_string_db$$system` LowCardinality(String) DEFAULT attributes_string['db.system'] CODEC(ZSTD(1)),
    `attribute_string_rpc$$system` LowCardinality(String) DEFAULT attributes_string['rpc.system'] CODEC(ZSTD(1)),
    `attribute_string_rpc$$service` LowCardinality(String) DEFAULT attributes_string['rpc.service'] CODEC(ZSTD(1)),
    `attribute_string_rpc$$method` LowCardinality(String) DEFAULT attributes_string['rpc.method'] CODEC(ZSTD(1)),
    `attribute_string_peer$$service` LowCardinality(String) DEFAULT attributes_string['peer.service'] CODEC(ZSTD(1)),
    INDEX idx_trace_id trace_id TYPE tokenbf_v1(10000, 5, 0) GRANULARITY 1,
    INDEX idx_span_id span_id TYPE tokenbf_v1(5000, 5, 0) GRANULARITY 1,
    INDEX idx_duration duration_nano TYPE minmax GRANULARITY 1,
    INDEX idx_name name TYPE ngrambf_v1(4, 5000, 2, 0) GRANULARITY 1,
    INDEX idx_kind kind TYPE minmax GRANULARITY 4,
    INDEX idx_http_route `attribute_string_http$$route` TYPE bloom_filter GRANULARITY 4,
    INDEX idx_http_url http_url TYPE bloom_filter GRANULARITY 4,
    INDEX idx_http_host http_host TYPE bloom_filter GRANULARITY 4,
    INDEX idx_http_method http_method TYPE bloom_filter GRANULARITY 4,
    INDEX idx_timestamp timestamp TYPE minmax GRANULARITY 1,
    INDEX idx_rpc_method `attribute_string_rpc$$method` TYPE bloom_filter GRANULARITY 4,
    INDEX idx_response_statusCode response_status_code TYPE set(0) GRANULARITY 1,
    INDEX idx_status_code_string status_code_string TYPE set(3) GRANULARITY 4,
    INDEX idx_kind_string kind_string TYPE set(5) GRANULARITY 4,
    INDEX attributes_string_idx_key mapKeys(attributes_string) TYPE tokenbf_v1(1024, 2, 0) GRANULARITY 1,
    INDEX attributes_string_idx_val mapValues(attributes_string) TYPE ngrambf_v1(4, 5000, 2, 0) GRANULARITY 1,
    INDEX attributes_number_idx_key mapKeys(attributes_number) TYPE tokenbf_v1(1024, 2, 0) GRANULARITY 1,
    INDEX attributes_number_idx_val mapValues(attributes_number) TYPE bloom_filter GRANULARITY 1,
    INDEX attributes_bool_idx_key mapKeys(attributes_bool) TYPE tokenbf_v1(1024, 2, 0) GRANULARITY 1,
    INDEX resources_string_idx_key mapKeys(resources_string) TYPE tokenbf_v1(1024, 2, 0) GRANULARITY 1,
    INDEX resources_string_idx_val mapValues(resources_string) TYPE ngrambf_v1(4, 5000, 2, 0) GRANULARITY 1
)
ORDER BY (ts_bucket_start, resource_fingerprint, has_error, name, timestamp)

Columns in the distributed_signoz_index_v3 table

NAME	DESCRIPTION
ts_bucket_start	Start timestamp of the bucket
resource_fingerprint	Fingerprint of the resource, generated by combining all the resource attributes
timestamp	Time when the span generated at the source
trace_id	Trace ID. W3C Trace Context
span_id	Span ID
trace_state	Trace state of the span
parent_span_id	Parent Span ID
flags	Flags of the span
name	Name of the span
kind	Kind of the span. OpenTelemetry Span Kind
kind_string	String representation of the kind of the span
duration_nano	Duration of the span in nanoseconds
status_code	Status code of the span. OpenTelemetry Status Code
status_message	Status message of the span
status_code_string	String representation of the status code of the span
attributes_string	Map of all string tags/attributes of the span
attributes_number	Map of all number tags/attributes of the span
attributes_bool	Map of all bool tags/attributes of the span
resources_string	Map of all resource tags/attributes of the span
events	Events of the span. It is an array of stringified json of span events
links	Links of the span. It is a stringified json of span links
response_status_code	Response status code of the span. Derived from `http.status_code`, `rpc.grpc.status_code` and `rpc.jsonrpc.error_code` attribute of a span
external_http_url	External HTTP url of the span
http_url	HTTP url of the span
external_http_method	External HTTP method of the span
http_method	HTTP method of the span
http_host	HTTP host of the span
db_name	Database name of the span. Derived from `db.name` attribute of a span
db_operation	Database operation of the span. Derived from `db.operation` attribute of a span
has_error	Whether the span has error or not
is_remote	Whether the span is remote or not
resource_string_service$$name	Name of the service. Derived from `resources_string['service.name']` attribute of a span
attribute_string_http$$route	HTTP route of the span. Derived from `attributes_string['http.route']` attribute of a span
attribute_string_messaging$$system	Messaging system of the span. Derived from `attributes_string['messaging.system']` attribute of a span
attribute_string_messaging$$operation	Messaging operation of the span. Derived from `attributes_string['messaging.operation']` attribute of a span
attribute_string_db$$system	Database system of the span. Derived from `attributes_string['db.system']` attribute of a span
attribute_string_rpc$$system	RPC system of the span. Derived from `attributes_string['rpc.system']` attribute of a span
attribute_string_rpc$$service	RPC service of the span. Derived from `attributes_string['rpc.service']` attribute of a span
attribute_string_rpc$$method	RPC method of the span. Derived from `attributes_string['rpc.method']` attribute of a span
attribute_string_peer$$service	Peer service of the span. Derived from `attributes_string['peer.service']` attribute of a span

Timestamp Bucketing

In the new schema for traces i.e distributed_signoz_index_v3, we have a new column ts_bucket_start which is used to store the start timestamp of the bucket. This is used for faster filtering on timestamp attribute.

How to use this column in the queries? If your timestamp query is timestamp BETWEEN {{.start_datetime}} AND {{.end_datetime}} then you will have to add ts_bucket_start BETWEEN {{.start_timestamp}} - 1800 AND {{.end_timestamp}}

We have added -1800, as bucketing is done in 30 minute intervals.

distributed_signoz_error_index_v2

This table stores error events derived from spans

(
    `timestamp` DateTime64(9) CODEC(DoubleDelta, LZ4),
    `errorID` FixedString(32) CODEC(ZSTD(1)),
    `groupID` FixedString(32) CODEC(ZSTD(1)),
    `traceID` FixedString(32) CODEC(ZSTD(1)),
    `spanID` String CODEC(ZSTD(1)),
    `serviceName` LowCardinality(String) CODEC(ZSTD(1)),
    `exceptionType` LowCardinality(String) CODEC(ZSTD(1)),
    `exceptionMessage` String CODEC(ZSTD(1)),
    `exceptionStacktrace` String CODEC(ZSTD(1)),
    `exceptionEscaped` Bool CODEC(T64, ZSTD(1)),
    `resourceTagsMap` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    INDEX idx_error_id errorID TYPE bloom_filter GRANULARITY 4,
    INDEX idx_resourceTagsMapKeys mapKeys(resourceTagsMap) TYPE bloom_filter(0.01) GRANULARITY 64,
    INDEX idx_resourceTagsMapValues mapValues(resourceTagsMap) TYPE bloom_filter(0.01) GRANULARITY 64
)

Columns in the distributed_signoz_error_index_v2 table

NAME	DESCRIPTION
timestamp	Time when the span generated at the source
errorID	Error ID. W3C Trace Context
groupID	Group ID of the error
traceID	Trace ID. W3C Trace Context
spanID	Span ID
serviceName	Name of the service. Derived from `service.name` attribute of a span
exceptionType	Exception type of the error. Derived from `exception.type` attribute of a span
exceptionMessage	Exception message of the error. Derived from `exception.message` attribute of a span
exceptionStacktrace	Exception stacktrace of the error. Derived from `exception.stacktrace` attribute of a span
exceptionEscaped	Whether the exception is escaped or not. Derived from `exception.escaped` attribute of a span
resourceTagsMap	Map of all resource tags/attributes of the span

distributed_top_level_operations

This table stores top operations and service name.

(
    `name` LowCardinality(String) CODEC(ZSTD(1)),
    `serviceName` LowCardinality(String) CODEC(ZSTD(1))
)

Columns in the distributed_top_level_operations table

NAME	DESCRIPTION
name	Name of the span
serviceName	Name of the service

distributed_span_attributes_keys

This table stores all the attributes keys of the span.

(
    `tagKey` LowCardinality(String) CODEC(ZSTD(1)),
    `tagType` Enum8('tag' = 1, 'resource' = 2) CODEC(ZSTD(1)),
    `dataType` Enum8('string' = 1, 'bool' = 2, 'float64' = 3) CODEC(ZSTD(1)),
    `isColumn` Bool CODEC(ZSTD(1))
)

Columns in the distributed_span_attributes_keys table

NAME	DESCRIPTION
tagKey	Name of the attribute
tagType	Type of the attribute. It can be `tag` or `resource`
dataType	Data type of the attribute. It can be `string`, `bool` or `float64`
isColumn	Whether the attribute is a column or not

distributed_span_attributes

This table stores all the attributes of the span.

(
    `timestamp` DateTime CODEC(DoubleDelta, ZSTD(1)),
    `tagKey` LowCardinality(String) CODEC(ZSTD(1)),
    `tagType` Enum8('tag' = 1, 'resource' = 2) CODEC(ZSTD(1)),
    `dataType` Enum8('string' = 1, 'bool' = 2, 'float64' = 3) CODEC(ZSTD(1)),
    `stringTagValue` String CODEC(ZSTD(1)),
    `float64TagValue` Nullable(Float64) CODEC(ZSTD(1)),
    `isColumn` Bool CODEC(ZSTD(1))
)

Columns in the distributed_span_attributes table

NAME	DESCRIPTION
timestamp	Time when the span generated at the source
tagKey	Name of the attribute
tagType	Type of the attribute. It can be `tag` or `resource`
dataType	Data type of the attribute. It can be `string`, `bool` or `float64`
stringTagValue	String value of the attribute
float64TagValue	Float64 value of the attribute
isColumn	Whether the attribute is a column or not

Writing Clickhouse Queries for Dashboard Panels

Timeseries

This panel is used when you want to view your aggregated data in a timeseries.

Examples

Plotting a chart on 100ms interval

Plot a chart of 1 minute showing count of spans in 100ms interval of service frontend with duration > 50ms

SELECT fromUnixTimestamp64Milli(intDiv( toUnixTimestamp64Milli ( timestamp ), 100) * 100) AS interval, 
toFloat64(count()) AS count 
FROM (
 SELECT timestamp 
 FROM signoz_traces.distributed_signoz_index_v3
 WHERE resource_string_service$$name='frontend' 
 AND duration_nano>=50*exp10(6) 
 AND timestamp BETWEEN {{.start_datetime}} AND {{.end_datetime}}
 AND ts_bucket_start BETWEEN {{.start_timestamp}} - 1800 AND {{.end_timestamp}})
 GROUP BY interval ORDER BY interval ASC;

Value

For the value type panel, the overall query will be similar to timeseries, just that you will have to get the absolute value at the end. You can reduce your end result to either average, latest, sum, min, or max.

Examples

Average duration of spans where `method = 'POST'` , `service_name = 'sample-service'`

SELECT 
    avg(value) as value, 
    any(ts) as ts FROM (
        SELECT 
            toStartOfInterval((timestamp), INTERVAL 1 MINUTE) AS ts, 
            toFloat64(count()) AS value 
        FROM 
            signoz_traces.distributed_signoz_index_v3
        WHERE 
            timestamp BETWEEN {{.start_datetime}} AND {{.end_datetime}} AND
            ts_bucket_start BETWEEN {{.start_timestamp}} - 1800 AND {{.end_timestamp}} AND
            http_method = 'POST' AND 
            resource_string_service$$name = 'sample-service'
        GROUP BY ts 
        ORDER BY ts ASC
    )

Table

This is used when you want to view the timeseries data in a tabular format.

The query is similar to timeseries query but instead of using time interval we use just use now() as ts in select.

Examples

GroupBy a tag/attribute in distributed tracing data

SELECT now() as ts, 
attributes_string['example_string_attribute'] AS attribute_name, 
toFloat64(avg(duration_nano)) AS value 
FROM signoz_traces.distributed_signoz_index_v3  
WHERE timestamp BETWEEN {{.start_datetime}} AND {{.end_datetime}}
AND ts_bucket_start BETWEEN {{.start_timestamp}} - 1800 AND {{.end_timestamp}}
GROUP BY (attribute_name, ts) order by (attribute_name, ts) ASC;

Last updated: June 6, 2024

Was this page helpful?

Logs

PromQl Query New Format

On this page

Traces Schema

distributed_signoz_index_v3

Timestamp Bucketing

distributed_signoz_error_index_v2

distributed_top_level_operations

distributed_span_attributes_keys

distributed_span_attributes

Writing Clickhouse Queries for Dashboard Panels

Timeseries

Value

Table