Writing traces based ClickHouse queries for building dashboard panels

Traces Schema

Traces has 6 tables used for different usecases

All Traces table follows OpenTelemetry Trace Semantic conventions

Info

The distributed tables in clickhouse have been named by prefixing distributed_ to existing single shard table names. If you want to use clickhouse queries in dashboard or alerts, you should use the distributed table names. Eg, signoz_index_v2 now corresponds to the table of a single shard. To query all the shards, query against distributed_signoz_index_v2.

distributed_signoz_index_v2

This is primary table of Traces which is queried to fetch spans and apply aggregation on spans. It has over 30 different columns which helps in faster filtering on most common attributes following OpenTelemetry Trace Semantic conventions.

(
    `timestamp` DateTime64(9) CODEC(DoubleDelta, LZ4),
    `traceID` FixedString(32) CODEC(ZSTD(1)),
    `spanID` String CODEC(ZSTD(1)),
    `parentSpanID` String CODEC(ZSTD(1)),
    `serviceName` LowCardinality(String) CODEC(ZSTD(1)),
    `name` LowCardinality(String) CODEC(ZSTD(1)),
    `kind` Int8 CODEC(T64, ZSTD(1)),
    `durationNano` UInt64 CODEC(T64, ZSTD(1)),
    `statusCode` Int16 CODEC(T64, ZSTD(1)),
    `component` LowCardinality(String) CODEC(ZSTD(1)),
    `dbSystem` LowCardinality(String) CODEC(ZSTD(1)),
    `dbName` LowCardinality(String) CODEC(ZSTD(1)),
    `dbOperation` LowCardinality(String) CODEC(ZSTD(1)),
    `peerService` LowCardinality(String) CODEC(ZSTD(1)),
    `events` Array(String) CODEC(ZSTD(2)),
    `httpMethod` LowCardinality(String) CODEC(ZSTD(1)),
    `httpUrl` LowCardinality(String) CODEC(ZSTD(1)),
    `httpRoute` LowCardinality(String) CODEC(ZSTD(1)),
    `httpHost` LowCardinality(String) CODEC(ZSTD(1)),
    `msgSystem` LowCardinality(String) CODEC(ZSTD(1)),
    `msgOperation` LowCardinality(String) CODEC(ZSTD(1)),
    `hasError` Bool CODEC(T64, ZSTD(1)),
    `rpcSystem` LowCardinality(String) CODEC(ZSTD(1)),
    `rpcService` LowCardinality(String) CODEC(ZSTD(1)),
    `rpcMethod` LowCardinality(String) CODEC(ZSTD(1)),
    `responseStatusCode` LowCardinality(String) CODEC(ZSTD(1)),
    `stringTagMap` Map(String, String) CODEC(ZSTD(1)),
    `numberTagMap` Map(String, Float64) CODEC(ZSTD(1)),
    `boolTagMap` Map(String, Bool) CODEC(ZSTD(1)),
    `resourceTagsMap` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    INDEX idx_service serviceName TYPE bloom_filter GRANULARITY 4,
    INDEX idx_name name TYPE bloom_filter GRANULARITY 4,
    INDEX idx_kind kind TYPE minmax GRANULARITY 4,
    INDEX idx_duration durationNano TYPE minmax GRANULARITY 1,
    INDEX idx_hasError hasError TYPE set(2) GRANULARITY 1,
    INDEX idx_httpRoute httpRoute TYPE bloom_filter GRANULARITY 4,
    INDEX idx_httpUrl httpUrl TYPE bloom_filter GRANULARITY 4,
    INDEX idx_httpHost httpHost TYPE bloom_filter GRANULARITY 4,
    INDEX idx_httpMethod httpMethod TYPE bloom_filter GRANULARITY 4,
    INDEX idx_timestamp timestamp TYPE minmax GRANULARITY 1,
    INDEX idx_rpcMethod rpcMethod TYPE bloom_filter GRANULARITY 4,
    INDEX idx_responseStatusCode responseStatusCode TYPE set(0) GRANULARITY 1,
    INDEX idx_resourceTagsMapKeys mapKeys(resourceTagsMap) TYPE bloom_filter(0.01) GRANULARITY 64,
    INDEX idx_resourceTagsMapValues mapValues(resourceTagsMap) TYPE bloom_filter(0.01) GRANULARITY 64,
    PROJECTION timestampSort
    (
        SELECT *
        ORDER BY timestamp
    )
)

Columns in the distributed_signoz_index_v2 table

NAMEDESCRIPTION
timestampTime when the span generated at the source
traceIDTrace ID. W3C Trace Context
spanIDSpan ID
parentSpanIDParent Span ID
serviceNameName of the service
nameName of the span
kindKind of the span. OpenTelemetry Span Kind
durationNanoDuration of the span in nanoseconds
statusCodeStatus code of the span. OpenTelemetry Status Code
componentComponent of the span. Eg: http, grpc etc. Derived from component attribute of a span
dbSystemDatabase system of the span. Eg: mysql, postgres etc. Derived from db.system attribute of a span
dbNameDatabase name of the span. Derived from db.name attribute of a span
dbOperationDatabase operation of the span. Derived from db.operation attribute of a span
peerServicePeer service of the span. Derived from peer.service attribute of a span
eventsEvents of the span. It is an array of stringified json of span events
httpMethodHTTP method of the span. Derived from http.method attribute of a span
httpUrlHTTP url of the span. Derived from http.url attribute of a span
httpRouteHTTP route of the span. Derived from http.route attribute of a span
httpHostHTTP host of the span. Derived from http.host attribute of a span
msgSystemMessaging system of the span. Derived from messaging.system attribute of a span
msgOperationMessaging operation of the span. Derived from messaging.operation attribute of a span
hasErrorWhether the span has error or not
rpcSystemRPC system of the span. Derived from rpc.system attribute of a span
rpcServiceRPC service of the span. Derived from rpc.service attribute of a span
rpcMethodRPC method of the span. Derived from rpc.method attribute of a span
responseStatusCodeResponse status code of the span. Derived from http.status_code, rpc.grpc.status_code and rpc.jsonrpc.error_code attribute of a span
stringTagMapMap of all string tags/attributes of the span
numberTagMapMap of all number tags/attributes of the span
boolTagMapMap of all bool tags/attributes of the span
resourceTagsMapMap of all resource tags/attributes of the span

distributed_signoz_spans

This table stores span json model which is used to create trace tree and displayed on trace detail page.

(
    `timestamp` DateTime64(9) CODEC(DoubleDelta, LZ4),
    `traceID` FixedString(32) CODEC(ZSTD(1)),
    `model` String CODEC(ZSTD(9))
)

Columns in the distributed_signoz_spans table

NAMEDESCRIPTION
timestampTime when the span generated at the source
traceIDTrace ID. W3C Trace Context
modelStringified json of the span

distributed_signoz_error_index_v2

This table stores error events derived from spans

(
    `timestamp` DateTime64(9) CODEC(DoubleDelta, LZ4),
    `errorID` FixedString(32) CODEC(ZSTD(1)),
    `groupID` FixedString(32) CODEC(ZSTD(1)),
    `traceID` FixedString(32) CODEC(ZSTD(1)),
    `spanID` String CODEC(ZSTD(1)),
    `serviceName` LowCardinality(String) CODEC(ZSTD(1)),
    `exceptionType` LowCardinality(String) CODEC(ZSTD(1)),
    `exceptionMessage` String CODEC(ZSTD(1)),
    `exceptionStacktrace` String CODEC(ZSTD(1)),
    `exceptionEscaped` Bool CODEC(T64, ZSTD(1)),
    `resourceTagsMap` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    INDEX idx_error_id errorID TYPE bloom_filter GRANULARITY 4,
    INDEX idx_resourceTagsMapKeys mapKeys(resourceTagsMap) TYPE bloom_filter(0.01) GRANULARITY 64,
    INDEX idx_resourceTagsMapValues mapValues(resourceTagsMap) TYPE bloom_filter(0.01) GRANULARITY 64
)

Columns in the distributed_signoz_error_index_v2 table

NAMEDESCRIPTION
timestampTime when the span generated at the source
errorIDError ID. W3C Trace Context
groupIDGroup ID of the error
traceIDTrace ID. W3C Trace Context
spanIDSpan ID
serviceNameName of the service. Derived from service.name attribute of a span
exceptionTypeException type of the error. Derived from exception.type attribute of a span
exceptionMessageException message of the error. Derived from exception.message attribute of a span
exceptionStacktraceException stacktrace of the error. Derived from exception.stacktrace attribute of a span
exceptionEscapedWhether the exception is escaped or not. Derived from exception.escaped attribute of a span
resourceTagsMapMap of all resource tags/attributes of the span

distributed_top_level_operations

This table stores top operations and service name.

(
    `name` LowCardinality(String) CODEC(ZSTD(1)),
    `serviceName` LowCardinality(String) CODEC(ZSTD(1))
)

Columns in the distributed_top_level_operations table

NAMEDESCRIPTION
nameName of the span
serviceNameName of the service

distributed_span_attributes_keys

This table stores all the attributes keys of the span.

(
    `tagKey` LowCardinality(String) CODEC(ZSTD(1)),
    `tagType` Enum8('tag' = 1, 'resource' = 2) CODEC(ZSTD(1)),
    `dataType` Enum8('string' = 1, 'bool' = 2, 'float64' = 3) CODEC(ZSTD(1)),
    `isColumn` Bool CODEC(ZSTD(1))
)

Columns in the distributed_span_attributes_keys table

NAMEDESCRIPTION
tagKeyName of the attribute
tagTypeType of the attribute. It can be tag or resource
dataTypeData type of the attribute. It can be string, bool or float64
isColumnWhether the attribute is a column or not

distributed_span_attributes

This table stores all the attributes of the span.

(
    `timestamp` DateTime CODEC(DoubleDelta, ZSTD(1)),
    `tagKey` LowCardinality(String) CODEC(ZSTD(1)),
    `tagType` Enum8('tag' = 1, 'resource' = 2) CODEC(ZSTD(1)),
    `dataType` Enum8('string' = 1, 'bool' = 2, 'float64' = 3) CODEC(ZSTD(1)),
    `stringTagValue` String CODEC(ZSTD(1)),
    `float64TagValue` Nullable(Float64) CODEC(ZSTD(1)),
    `isColumn` Bool CODEC(ZSTD(1))
)

Columns in the distributed_span_attributes table

NAMEDESCRIPTION
timestampTime when the span generated at the source
tagKeyName of the attribute
tagTypeType of the attribute. It can be tag or resource
dataTypeData type of the attribute. It can be string, bool or float64
stringTagValueString value of the attribute
float64TagValueFloat64 value of the attribute
isColumnWhether the attribute is a column or not

Writing Clickhouse Queries for Dashboard Panels

Timeseries

This panel is used when you want to view your aggregated data in a timeseries.

Examples

Plotting a chart on 100ms interval

Plot a chart of 1 minute showing count of spans in 100ms interval of service frontend with duration > 50ms

SELECT fromUnixTimestamp64Milli(intDiv( toUnixTimestamp64Milli ( timestamp ), 100) * 100) AS interval, 
toFloat64(count()) AS count 
FROM (
 SELECT timestamp 
 FROM signoz_traces.distributed_signoz_index_v2 
 WHERE serviceName='frontend' 
 AND durationNano>=50*exp10(6) 
 AND timestamp BETWEEN {{.start_datetime}} AND {{.end_datetime}})
 GROUP BY interval ORDER BY interval ASC;

Value

For the value type panel, the overall query will be similar to timeseries, just that you will have to get the absolute value at the end. You can reduce your end result to either average, latest, sum, min, or max.

Examples

Average duration of spans where method = 'POST' , service_name = 'sample-service'
SELECT 
    avg(value) as value, 
    any(ts) as ts FROM (
        SELECT 
            toStartOfInterval((timestamp), INTERVAL 1 MINUTE) AS ts, 
            toFloat64(count()) AS value 
        FROM 
            signoz_traces.distributed_signoz_index_v2
        WHERE 
            timestamp BETWEEN {{.start_datetime}} AND {{.end_datetime}} AND
            httpMethod = 'POST' AND 
            serviceName = 'sample-service'
        GROUP BY ts 
        ORDER BY ts ASC
    )

Table

This is used when you want to view the timeseries data in a tabular format.

The query is similar to timeseries query but instead of using time interval we use just use now() as ts in select.

Examples

GroupBy a tag/attribute in distributed tracing data
SELECT now() as ts, 
stringTagMap['example_string_attribute'] AS attribute_name, 
toFloat64(avg(durationNano)) AS value 
FROM signoz_traces.distributed_signoz_index_v2  
WHERE timestamp BETWEEN {{.start_datetime}} AND {{.end_datetime}}
GROUP BY (attribute_name, ts) order by (attribute_name, ts) ASC;