Troubleshooting
This troubleshooting guide includes step-by-step instructions that should resolve most issues with logs.
v0.53.0: Docker: Schema Migrator: Dirty database version
If you are migrating from an older version of SigNoz to v0.53.0, you might encounter an issue where the signoz-schema-migrator fails to run in Docker deployments.
clickhouse migrate failed to run, error: Dirty database version 14. Fix and force version.
Here are the steps to resolve it.
- Check that this change is present, i.e. the macros block is uncommented in the clickhouse-config.xml file.
<macros>
<shard>01</shard>
<replica>example01-01-1</replica>
</macros>
- Make sure ClickHouse has restarted after the above change; you can restart it by running
docker container restart signoz-clickhouse
If you try to run ./install.sh or docker compose up -d at this point, the migrator will fail again, but that's fine; we will fix it in the next step.
- Exec into the clickhouse container
docker exec -it signoz-clickhouse /bin/bash
- Run the client
clickhouse client
- Delete the migration entry by running this SQL command
delete from signoz_logs.schema_migrations where version=14;
- Now you can run ./install.sh or docker compose up -d, and the migrator will run successfully.
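For reference, here is a minimal consolidated sketch of the steps above, assuming the default container name signoz-clickhouse and a standard Docker deployment (the config path inside the container may differ in your setup):

```bash
# 1. Confirm the macros block is uncommented in the ClickHouse config
docker exec signoz-clickhouse grep -A 3 "<macros>" /etc/clickhouse-server/config.xml

# 2. Restart ClickHouse so the change takes effect
docker container restart signoz-clickhouse

# 3. Remove the dirty migration entry
docker exec -it signoz-clickhouse clickhouse client -q "delete from signoz_logs.schema_migrations where version=14"

# 4. Re-run the deployment so the schema migrator runs again
docker compose up -d   # or ./install.sh
```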
If you face any issues, please reach out to us on Slack.
Schema Migrator: Dirty database version
If you are migrating from an older version of SigNoz to v0.49.1 or v0.50.0, you might encounter an issue where the signoz-schema-migrator fails to run and results in an error message like this.
Dirty database version 10. Fix and force version.
Similar error messages include
Dirty database version 11. Fix and force version.
Dirty database version 12. Fix and force version.
To fix this issue, first disable the schema migrator.
- On k8s you can delete the job.
- On Docker you can delete the schema migrator container.
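For example, a minimal sketch of disabling it (the job and container names are placeholders; find the actual names with kubectl get jobs and docker ps -a):

```bash
# Kubernetes: delete the schema-migrator job
kubectl delete job -n platform <schema-migrator-job-name>

# Docker: remove the schema-migrator container
docker rm -f <schema-migrator-container-name>
```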
Run The Migrations Manually
- Exec into the clickhouse container.
- Start the clickhouse client by running
clickhouse client
- Check the mutations table and clear it.
- Check the running mutations
select * from system.mutations where is_done=0
- If a mutation is stuck with an error message, you can kill it by running
KILL MUTATION where mutation_id='<mutation_id>'
- If you have any doubts, please reach out to us on our community Slack.
- Run the migrations manually by executing each command from the files below in the clickhouse console. Start with the version at which the migrator failed (e.g. if the 10th migration failed, start with the 10th file).
- 10:- https://github.com/SigNoz/signoz-otel-collector/blob/main/migrationmanager/migrators/logs/migrations/000010_body_ngram.up.sql
- 11:- https://github.com/SigNoz/signoz-otel-collector/blob/main/migrationmanager/migrators/logs/migrations/000011_add_instrumentation_scope.up.sql
- 12:- https://github.com/SigNoz/signoz-otel-collector/blob/main/migrationmanager/migrators/logs/migrations/000012_rename_instrumentation_scope.up.sql
- 13:- https://github.com/SigNoz/signoz-otel-collector/blob/main/migrationmanager/migrators/logs/migrations/000013_rename_instrumentation_scope.up.sql
While running them, replace {{.SIGNOZ_CLUSTER}} with cluster (see the illustrative example after this list).
- Once the commands run successfully, update the schema migrations table.
- Truncate the schema_migrations table
truncate table signoz_logs.schema_migrations
- Insert data into the migrations table
insert into signoz_logs.schema_migrations values (1 ,1 ,1720103647412982569), (1 ,0 ,1720103647810031709), (2 ,1 ,1720103647811955242), (2 ,0 ,1720103647869983948), (3 ,1 ,1720103647875299543), (3 ,0 ,1720103648053893053), (4 ,1 ,1720103648055714316), (4 ,0 ,1720103648113219798), (5 ,1 ,1720103648115626118), (5 ,0 ,1720103648678466995), (6 ,1 ,1720103648680332053), (6 ,0 ,1720103648795967894), (7 ,1 ,1720103648797859904), (7 ,0 ,1720103648967002426), (8 ,1 ,1720103648969563518), (8 ,0 ,1720103649251171410), (9 ,1 ,1720103649252821091), (9 ,0 ,1720103649365897523), (10 ,1 ,1720103649367680188), (10 ,0 ,1720103660451466954), (11 ,1 ,1720103660454458733), (11 ,0 ,1720103661073573745), (12 ,1 ,1720115099220469277), (12 ,0 ,1720115146203106526), (13 ,1 ,1720115146205945060), (13 ,0 ,1720115146319582687)
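To illustrate the {{.SIGNOZ_CLUSTER}} substitution mentioned above, here is a hypothetical before/after; it is not the exact contents of the migration files, so always copy the real SQL from the linked files:

```bash
# A statement written in a migration file as:
#   ALTER TABLE signoz_logs.logs ON CLUSTER {{.SIGNOZ_CLUSTER}} ADD INDEX IF NOT EXISTS body_idx lower(body) TYPE ngrambf_v1(4, 60000, 5, 0) GRANULARITY 1;
# is run with the cluster name substituted:
clickhouse client -q "ALTER TABLE signoz_logs.logs ON CLUSTER cluster ADD INDEX IF NOT EXISTS body_idx lower(body) TYPE ngrambf_v1(4, 60000, 5, 0) GRANULARITY 1"
```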
Verify The Changes
- Check the schema of the logs table by running
show create table signoz_logs.logs format Vertical;
The following should be present in the response
`scope_name` String CODEC(ZSTD(1)),
`scope_version` String CODEC(ZSTD(1)),
`scope_string_key` Array(String) CODEC(ZSTD(1)),
`scope_string_value` Array(String) CODEC(ZSTD(1)),
...
INDEX body_idx lower(body) TYPE ngrambf_v1(4, 60000, 5, 0) GRANULARITY 1,
...
INDEX scope_name_idx scope_name TYPE tokenbf_v1(10240, 3, 0) GRANULARITY 4,
...
- Check the schema of the distributed_logs table by running
show create table signoz_logs.distributed_logs format Vertical;
The following should be present in the response
`scope_name` String CODEC(ZSTD(1)),
`scope_version` String CODEC(ZSTD(1)),
`scope_string_key` Array(String) CODEC(ZSTD(1)),
`scope_string_value` Array(String) CODEC(ZSTD(1)),
- Check the schema of the tag_attributes table by running
SHOW CREATE TABLE signoz_logs.tag_attributes
The following should be present in the response
`tagType` Enum8('tag' = 1, 'resource' = 2, 'scope' = 3) CODEC(ZSTD(1)),
Once verified, run the schema migrator again
- On k8s run helm upgrade.
- On Docker rerun the docker compose file.
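For example (the release name and namespace are illustrative; use your own):

```bash
# Kubernetes: upgrade the release so the schema-migrator job is recreated
helm --namespace platform upgrade my-release signoz/signoz

# Docker: re-run the compose file
docker compose up -d
```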
If you face any issues, please reach out to us on Slack.
Filter logs of the same application/service from different hosts
If you have an application/service that emits similar logs from multiple instances running on different hosts, you can distinguish them by adding a resource attribute.
There are different ways to add a resource attribute.
- By passing it as an environment variable if you are using the OpenTelemetry SDK or auto-instrumentation
OTEL_RESOURCE_ATTRIBUTES=service.name=<service_name>,hostname=<host_name>
Replace <service_name> with the actual service name and <host_name> with the actual hostname.
- If you have an otel collector running on each host, you can add a processor to add the hostname
processors:
  attributes/add_hostname:
    actions:
      - key: hostname
        value: <hostname>
        action: upsert
...
service:
  pipelines:
    logs:
      processors: [attributes/add_hostname, batch]
Replace <hostname> with the actual hostname.
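For example, a minimal sketch for a Linux host running an auto-instrumented service (the service name is a placeholder):

```bash
# Set the resource attributes, then start the application from the same shell
# so it inherits the variable; $(hostname) resolves to the actual host name.
export OTEL_RESOURCE_ATTRIBUTES="service.name=checkout-service,hostname=$(hostname)"
```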
Missing Columns Issue
If you are using SigNoz version 0.27 or newer and you ran the 0.27 migration in the past, you may face a missing columns issue when using distributed ClickHouse with multiple shards.
To solve this issue, exec into each of the shards and follow the steps below. Check the schema of the distributed_logs table on each shard:
show create table signoz_logs.distributed_logs
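For example, a minimal sketch of comparing the schema across two shards on Kubernetes (pod and namespace names are illustrative; list yours with kubectl get pods):

```bash
# Pod names follow the chi-<release>-clickhouse-cluster-<shard>-<replica>-0 pattern;
# adjust the namespace and names to match your deployment.
kubectl exec -n platform chi-my-release-clickhouse-cluster-0-0-0 -- \
  clickhouse client -q "show create table signoz_logs.distributed_logs"
kubectl exec -n platform chi-my-release-clickhouse-cluster-1-0-0 -- \
  clickhouse client -q "show create table signoz_logs.distributed_logs"
```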
If in one shard a column is named host_name and in another shard it is named attribute_string_host_name, run the following command
alter table signoz_logs.distributed_logs on cluster cluster rename column if exists host_name to attribute_string_host_name
Run the above command for all the column names which were not migrated.
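For example, a minimal sketch (the column names are illustrative) that renames several un-migrated columns in one pass from a shell inside a ClickHouse container:

```bash
# Replace the list with the columns whose names differ across your shards.
for col in host_name k8s_pod_name; do
  clickhouse client -q "alter table signoz_logs.distributed_logs on cluster cluster rename column if exists ${col} to attribute_string_${col}"
done
```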
K8s Attribute Filtering Issue in Logs
In the SigNoz chart releases v0.9.1, v0.10.0 and v0.10.1, some users face issues querying the following selected fields.
k8s_container_name
k8s_namespace_name
k8s_pod_name
k8s_container_restart_count
k8s_pod_uid
If you have added any of the above as selected fields and you get empty data when you filter using those fields, you will have to perform the following steps.
Exec into your clickhouse container
kubectl exec -n platform -it chi-my-release-clickhouse-cluster-0-0-0 -- sh
clickhouse client
Run the following queries
use signoz_logs; show create table logs;
For the corresponding field, you will find a materialized column and an index. For example, for
k8s_namespace_name
you will have
k8s_namespace_name String MATERIALIZED attributes_string_value[indexOf(attributes_string_key, 'k8s_namespace_name')] CODEC(LZ4)
and the index
INDEX k8s_namespace_name_idx k8s_namespace_name TYPE bloom_filter(0.01) GRANULARITY 64
You will have to drop the index and then remove the materialized column
alter table logs drop index k8s_namespace_name_idx; alter table logs drop column k8s_namespace_name;
Perform the above steps for all the k8s fields listed.
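For example, a minimal sketch that drops the index and the materialized column for each listed field, assuming the <field>_idx index naming shown above; run it from a shell inside the ClickHouse container:

```bash
# Drop the index first, then the materialized column, for every affected field.
for field in k8s_container_name k8s_namespace_name k8s_pod_name k8s_container_restart_count k8s_pod_uid; do
  clickhouse client -q "alter table signoz_logs.logs drop index if exists ${field}_idx"
  clickhouse client -q "alter table signoz_logs.logs drop column if exists ${field}"
done
```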
Once done, truncate the attribute keys table
truncate table logs_attribute_keys;
Now you can go back to the UI and convert them back to selected fields, and the filters will work as expected.