More and more companies are now shifting to a cloud-native & microservices-based architecture. Having an application monitoring tool is critical in this world because you can’t just log into a machine and figure out what’s going wrong.
We have spent years learning about application monitoring & observability. What are the key features an observability tool should have to enable fast resolution of issues?
In our opinion, good observability tools should have:
- Out of the box application metrics
- A way to go from metrics to traces to find out why an issue is happening
- Seamless flow between metrics, traces & logs — the three pillars of observability
- Filtering of traces based on different tags and filters
- Ability to set dynamic thresholds for alerts
- Transparency in pricing
We found that though there are open-source tools like Prometheus & Jaeger, they don’t provide as good a user experience as SaaS products do. It takes a lot of time and effort to get them working, figure out long-term storage, etc. And if you want both metrics and traces in one place, you can’t get that easily, because Prometheus metrics & Jaeger traces have different formats.
SaaS tools like DataDog and NewRelic do a much better job at many of these aspects:
- They are easy to set up & get started with
- They provide out-of-the-box application metrics
- They provide correlation between metrics & traces
But they have the following issues:
- Crazy node-based pricing, which doesn’t make sense in today’s microservices architecture. Any node which is live for more than 8 hours in a month is charged, so it is unsuitable for spiky workloads
- Very costly. They charge $5 per 100 custom metrics
- They are cloud-only, so not suitable for companies that have concerns about sending data outside their infra
- For any small feature, you are dependent on their roadmap. We think this is an unnecessary restriction for a product which developers use. A product used by developers should be extensible
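To make the node-based pricing point concrete, here is a minimal sketch of how a spiky workload gets counted under the "live for more than 8 hours in a month" rule described above. The `NodeUsage` type, the node names, and the example fleet (2 steady nodes plus 10 short-lived autoscaled nodes) are hypothetical and only illustrate the arithmetic; they are not any vendor's actual billing logic.

```python
from dataclasses import dataclass

# From the text: a node live for more than 8 hours in a month is charged.
BILLABLE_HOURS_THRESHOLD = 8
HOURS_IN_MONTH = 720  # assumption: a 30-day month


@dataclass
class NodeUsage:
    name: str
    hours_live: float  # total hours the node was up during the month


def billable_nodes(nodes, threshold_hours=BILLABLE_HOURS_THRESHOLD):
    """Return the names of nodes that cross the billing threshold."""
    return [n.name for n in nodes if n.hours_live > threshold_hours]


# Hypothetical month: 2 always-on nodes and 10 autoscaled nodes that each ran 10 hours.
steady = [NodeUsage(f"steady-{i}", HOURS_IN_MONTH) for i in range(2)]
burst = [NodeUsage(f"burst-{i}", 10) for i in range(10)]
fleet = steady + burst

charged = billable_nodes(fleet)
# Actual compute used, expressed as "full nodes running all month".
full_node_equivalents = sum(n.hours_live for n in fleet) / HOURS_IN_MONTH

print(f"nodes charged: {len(charged)}")
print(f"full-node equivalents actually used: {full_node_equivalents:.2f}")
```

In this made-up fleet, every burst node crosses the threshold, so the number of charged nodes is far higher than the compute actually used.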
To fill this gap we built SigNoz, an open-source alternative to DataDog.
Some of the key features which make SigNoz vastly superior to current open-source products and a great alternative to DataDog are:
- Out of the box application metrics
- Seamless flow between metrics & traces
- Filtering based on tags
- Custom aggregates on filtered traces
- Transparent usage data
- Detailed Flamegraphs & Gantt charts
Get p90, p99 latencies, RPS, Error rates and top endpoints for a service out of the box.
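As a rough illustration of what these out-of-the-box numbers mean, the sketch below derives p90/p99 latency, RPS, error rate, and top endpoints from a list of spans. The `Span` type, its field names, and the nearest-rank percentile method are assumptions made for this example; this is not how SigNoz computes them internally.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Span:
    endpoint: str
    duration_ms: float
    is_error: bool


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def service_summary(spans, window_seconds):
    """Summarise the spans observed for one service over a time window."""
    if not spans:
        raise ValueError("need at least one span")
    durations = [s.duration_ms for s in spans]
    return {
        "p90_ms": percentile(durations, 90),
        "p99_ms": percentile(durations, 99),
        "rps": len(spans) / window_seconds,
        "error_rate": sum(s.is_error for s in spans) / len(spans),
        "top_endpoints": Counter(s.endpoint for s in spans).most_common(3),
    }


# Hypothetical data: 100 requests over 50 seconds, every tenth one failing.
spans = [Span("/checkout", float(i), i % 10 == 0) for i in range(1, 101)]
summary = service_summary(spans, window_seconds=50)
print(summary)
```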
Found something suspicious in a metric? Just click that point in the graph & get details of the traces which may be causing the issue. Seamless, intuitive.
For example, you can find latency experienced by customers who have customer_type set as
Create custom metrics from filtered traces to find metrics for any type of request. Want to find the p99 latency of customer_type: premium users who are seeing status_code: 400? Just set the filters, and you have the graph. Boom!
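Conceptually, a custom aggregate on filtered traces is a filter on span tags followed by an aggregation. Here is a minimal sketch of that idea using the customer_type and status_code tags from the example above. The span dictionaries, the `filter_spans` helper, and the sample values are hypothetical; in SigNoz you would set these filters in the UI rather than write code.

```python
import math


def p99(values):
    """Nearest-rank 99th percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(99 * len(ordered) / 100))
    return ordered[rank - 1]


def filter_spans(spans, **tag_filters):
    """Keep only the spans whose tags match every given key/value pair."""
    return [
        s for s in spans
        if all(s["tags"].get(key) == value for key, value in tag_filters.items())
    ]


# Hypothetical spans with made-up durations.
spans = [
    {"duration_ms": 120.0, "tags": {"customer_type": "premium", "status_code": 400}},
    {"duration_ms": 340.0, "tags": {"customer_type": "premium", "status_code": 400}},
    {"duration_ms": 50.0, "tags": {"customer_type": "premium", "status_code": 200}},
    {"duration_ms": 900.0, "tags": {"customer_type": "free", "status_code": 400}},
]

matching = filter_spans(spans, customer_type="premium", status_code=400)
p99_ms = p99([s["duration_ms"] for s in matching])
print(f"{len(matching)} matching spans, p99 = {p99_ms} ms")
```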
You can drill down into how many events each application is sending and at what granularity, so that you can adjust your sampling rate as needed and not get a shock at the end of the month (as is often the case with SaaS vendors).
Detailed flamegraphs & Gantt charts help you find the exact cause of an issue, and which of the underlying requests is causing the problem. Is it a SQL query gone rogue, or is a Redis operation causing the issue?
If you have Docker installed, getting started with SigNoz takes just three easy steps at the command line:
You can read more about deploying SigNoz from its documentation.
If you liked what you read, then check out our GitHub repo 👇
Our Slack community is a great place to get your queries answered quickly and get community support for SigNoz. Link to join 👇
SigNoz slack community