If you want to check out our GitHub repo before diving in 👇
https://github.com/SigNoz/signoz
The cost of a millisecond.
TABB Group, a financial services industry research firm, estimates that if a broker's electronic trading platform is 5 milliseconds behind the competition, it could cost $4 million in revenue per millisecond.
Latency is costly in the financial services industry, and the same is true for almost any software-based business today. For Google, a half-second delay in search results caused a 20% drop in traffic. Half a second is enough to erode user satisfaction to the point where users abandon an app.
While a user sees a screen, there are thousands of services in the background taking care of a user's request. In a microservices architecture, the challenge for engineering teams is to constantly figure out areas of optimization in a complex distributed network. And the solution starts with setting up a robust monitoring infrastructure for the application's production environment.
Capturing and analyzing data about your production environment is critical. You need to proactively solve stability and performance issues in your web application to avoid system failures and ensure a smooth user experience.
And to do that, you need insights into how your infrastructure handles user requests. With SigNoz, you can start monitoring your app in a few simple steps, and with an easy-to-use dashboard, you can quickly identify bottlenecks in your services.
SigNoz is a full-stack open-source application monitoring and observability platform which can be installed within your infra. You can track metrics like p99 latency, error rates for your services, external API calls, and individual endpoints. With service maps, you can quickly assess the health of your services.
And once you know the affected service, trace data can help you identify the exact code causing the issue. Using the SigNoz dashboard, you can visualize your traces easily with flamegraphs.
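To make the p99 latency and error-rate metrics mentioned above concrete, here is a minimal sketch of how they can be computed from raw request samples. It uses the nearest-rank percentile method and counts 5xx responses as errors; both are assumptions for illustration, and the sample data is made up. SigNoz's own aggregation may differ.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]

def error_rate(status_codes):
    """Fraction of requests that ended in a 5xx server error."""
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if code >= 500)
    return errors / len(status_codes)

# Hypothetical samples: 98 fast requests and 2 slow ones (milliseconds).
latencies_ms = [20] * 98 + [900, 1500]
statuses = [200] * 97 + [500] * 3

p99 = percentile(latencies_ms, 99)
rate = error_rate(statuses)
print(f"p99 latency: {p99} ms, error rate: {rate:.1%}")
```

The average latency of this sample is about 44 ms, which hides the slow requests entirely. The p99 value of 900 ms is what the slowest 1% of users actually experience, which is why tail percentiles are the metric to watch.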
Now let's get down to some action and see everything for yourself.
To demonstrate how SigNoz works, we have set up a sample ToDo app in Python, built on the Flask web framework with MongoDB as its database. We will divide the tutorial into two parts:
- Installing SigNoz
- Instrumenting sample app to start monitoring
Part 1 - Installing SigNoz
1. Install Docker
You can install Docker by following the steps listed on their website here. For this tutorial, you can choose the Docker Desktop option based on the system you have.
2. Clone SigNoz GitHub repository
From your terminal use the following command to clone SigNoz's GitHub repository.
git clone https://github.com/SigNoz/signoz.git
3. Update path to signoz/deploy and install SigNoz
The deploy folder contains the files necessary for deploying SigNoz through Docker. Change into it and run the install script:
cd signoz/deploy/
./install.sh
You will be asked to select one of the 2 ways to proceed:
- ClickHouse as database (default)
- Kafka + Druid setup to handle scale (recommended for production use)
Trying out SigNoz with the ClickHouse database takes less than 1GB of memory, so we will use that option for this tutorial.
You will get the following message once the installation is complete.
Note that this setup is just for demo/testing purposes and you need to proceed with Kafka + Druid set up option in case you want to set up SigNoz for use in production.
Once the installation completes successfully, the UI should be accessible at http://localhost:3000. Wait 2-3 minutes for the data to become available to the frontend.
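If you would rather not refresh the browser by hand, a small polling helper can tell you when the UI port starts accepting connections. This is only a sketch: the function name, timeout, and polling interval are our own choices, and an open port means the frontend is listening, not that trace data has already arrived.

```python
import socket
import time

def wait_for_port(host, port, timeout_s=180.0, interval_s=5.0):
    """Return True once a TCP connection to host:port succeeds, False if timeout_s passes first."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=2.0):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

Calling wait_for_port("localhost", 3000) right after the installation finishes blocks until the SigNoz frontend is reachable, or returns False after three minutes.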
The applications shown in the dashboard are from a sample app called Hot R.O.D that comes with the installation bundle. It has 4 microservices being monitored: Frontend, Customer, Driver and Route. You can access the Hot R.O.D application UI at: http://localhost:9000/
Now that you have SigNoz up and running, let's see how instrumentation works. Instrumentation is the process of adding code to your application so that it reports data about its own performance. It is key to seeing how your application handles real-world traffic.
SigNoz supports OpenTelemetry as the primary way for users to instrument their application. OpenTelemetry is a single, vendor-agnostic instrumentation library per language with support for both automatic and manual instrumentation. You don't need to write any instrumentation code in this tutorial, because OpenTelemetry's automatic instrumentation takes care of it.
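To get a feel for what instrumentation does under the hood, here is a hand-rolled sketch that times nested blocks of code and records them as spans. It deliberately uses only the standard library and is not the OpenTelemetry API: the span helper, the recorded_spans list, and the list_todos handler are all hypothetical names for illustration.

```python
import time
from contextlib import contextmanager

# Hypothetical in-memory sink; a real exporter would ship spans to a collector.
recorded_spans = []

@contextmanager
def span(name):
    """Time the enclosed block and record it, even if the block raises."""
    start = time.perf_counter()
    error = None
    try:
        yield
    except Exception as exc:
        error = repr(exc)
        raise
    finally:
        recorded_spans.append({
            "name": name,
            "duration_ms": (time.perf_counter() - start) * 1000,
            "error": error,
        })

def list_todos():
    # Stand-in for a request handler that queries a database.
    with span("list_todos"):
        with span("db.find"):
            time.sleep(0.01)
        return ["buy milk"]

list_todos()
for s in recorded_spans:
    print(f'{s["name"]}: {s["duration_ms"]:.1f} ms')
```

Automatic instrumentation saves you from scattering wrappers like this through your code: it patches supported libraries so that spans are created for you.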
Part 2 - Instrumenting sample app to start monitoring
1. Python 3.4 or newer
If you do not have Python installed on your system, you can download it from the link here. Run python3 --version on your terminal to check whether Python is properly installed and which version you have.
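If you want to script this check, compare version numbers as tuples rather than strings, because "3.10" sorts before "3.4" as text. The sketch below assumes the usual "Python X.Y.Z" output format; the helper names are ours.

```python
import re
import sys

MIN_VERSION = (3, 4)  # minimum version stated in this tutorial

def parse_python_version(output):
    """Turn text like 'Python 3.9.7' into a tuple such as (3, 9, 7)."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        raise ValueError(f"no version number in {output!r}")
    return tuple(int(part) for part in match.groups() if part is not None)

def meets_minimum(version, minimum=MIN_VERSION):
    """Tuples compare element by element, so (3, 10) >= (3, 4) holds."""
    return tuple(version) >= minimum

# A string in the format that python3 --version prints.
print(meets_minimum(parse_python_version("Python 3.9.7")))
# The interpreter running this script.
print(meets_minimum(sys.version_info[:3]))
```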
2. MongoDB
If you already have MongoDB services running on your system, you can skip this step.
Download link: https://docs.mongodb.com/manual/tutorial/install-mongodb-on-os-x/
On macOS, the installation is done using the Homebrew package manager (brew). Once the installation is done, don't forget to start MongoDB services by running brew services start mongodb-community@<version> on your macOS terminal, replacing <version> with the MongoDB version you installed.
1. Clone sample Flask app repository
From your terminal use the following command to clone sample Flask app GitHub repository.
git clone https://github.com/SigNoz/sample-flask-app.git
2. Update path to sample-flask-app & check if the app is running
Check if the app is working or not using the following command:
cd sample-flask-app
python3 app.py
You can now access the UI of the app on your local host: http://localhost:5000/
Press 'Ctrl + C' to exit the app once you have made sure it is running properly.
3. Set up OpenTelemetry Python instrumentation library
Your app folder contains a file called requirements.txt. This file lists all the packages needed to set up the OpenTelemetry Python instrumentation library, so every mandatory package required to start instrumenting is installed from it. Make sure you are in the root directory of your sample app and run the following command:
pip3 install -r requirements.txt
If it hangs while installing grpcio during pip3 install opentelemetry-exporter-otlp, follow the steps below, as suggested in this Stack Overflow link:
- pip3 install --upgrade pip
- python3 -m pip install --upgrade setuptools
- pip3 install --no-cache-dir --force-reinstall -Iv grpcio
4. Install application specific packages
This step is required to install packages specific to the application. Make sure to run this command in the root directory of your installed application. It figures out which instrumentation packages you might want and installs them for you:
opentelemetry-bootstrap --action=install
5. Configure a span exporter and run your application
You're almost done. In the last step, you just need to configure a few environment variables for your OTLP exporter. Values that need to be configured:
- SERVICE_NAME - application service name (you can name it as you like), passed as service.name in OTEL_RESOURCE_ATTRIBUTES
- ENDPOINT_ADDRESS - OTLP gRPC collector endpoint address (IP of SigNoz), passed in OTEL_EXPORTER_OTLP_ENDPOINT
After taking care of these environment variables, you only need to run your instrumented application. You can do all of this with the following command in your terminal.
OTEL_RESOURCE_ATTRIBUTES=service.name=pythonApp OTEL_EXPORTER_OTLP_ENDPOINT="http://<IP of SigNoz>:4317" opentelemetry-instrument python3 app.py
<IP of SigNoz> can be replaced with localhost in this case, since SigNoz is running on the same machine. Hence, the final command becomes:
OTEL_RESOURCE_ATTRIBUTES=service.name=pythonApp OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317" opentelemetry-instrument python3 app.py
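If you prefer to launch the app from a script instead of typing the variables each time, the command above can be assembled in Python. This is a sketch under our own naming: build_launch and its parameters are hypothetical, while the two environment variables and the opentelemetry-instrument command are the ones used above.

```python
import os

def build_launch(service_name, signoz_host, port=4317, app="app.py"):
    """Return (env, argv) mirroring the shell command above."""
    env = dict(os.environ)  # keep PATH and friends so the command can be found
    env["OTEL_RESOURCE_ATTRIBUTES"] = f"service.name={service_name}"
    env["OTEL_EXPORTER_OTLP_ENDPOINT"] = f"http://{signoz_host}:{port}"
    argv = ["opentelemetry-instrument", "python3", app]
    return env, argv

env, argv = build_launch("pythonApp", "localhost")
print(env["OTEL_RESOURCE_ATTRIBUTES"])
print(env["OTEL_EXPORTER_OTLP_ENDPOINT"])
print(" ".join(argv))
```

Passing the result to subprocess.run(argv, env=env, check=True) from the sample app's root directory starts the instrumented app the same way the one-line shell command does.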
And, congratulations! You have instrumented your sample Python app. You can now access the SigNoz dashboard at http://localhost:3000 to monitor your app for performance metrics.
Using SigNoz dashboard to identify issues causing high latency in your app
Now that you have installed SigNoz, let's see how you can identify specific events causing high latency in your deployed applications.
In just 5 easy steps, our dashboard lets you drill down to events causing a delay in your deployed apps 👇
1. Choose the service you want to inspect
2. Choose the timestamp where latency is high and click on view traces
3. Choose the trace ID with the highest latency
4. Inspect distributed traces with flamegraph
5. Zero in on the highest latency event and take action
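The last two steps boil down to finding the span where time is actually spent, rather than the parent span that merely contains it. Here is a rough sketch of that idea on made-up trace data. The field names are illustrative and not SigNoz's schema, and the calculation assumes child spans run one after another.

```python
from collections import defaultdict

# Hypothetical spans from one trace; field names are illustrative only.
spans = [
    {"id": "a", "parent": None, "name": "GET /list", "duration_ms": 480},
    {"id": "b", "parent": "a", "name": "render_template", "duration_ms": 30},
    {"id": "c", "parent": "a", "name": "mongodb.find", "duration_ms": 420},
]

def self_times(spans):
    """Map span id -> time spent in the span itself, excluding its direct children.

    Assumes children run sequentially; overlapping children would be over-subtracted.
    """
    child_total = defaultdict(float)
    for s in spans:
        if s["parent"] is not None:
            child_total[s["parent"]] += s["duration_ms"]
    return {s["id"]: s["duration_ms"] - child_total[s["id"]] for s in spans}

def slowest_span(spans):
    """Return the span with the largest self time."""
    times = self_times(spans)
    return max(spans, key=lambda s: times[s["id"]])

worst = slowest_span(spans)
print(worst["name"], self_times(spans)[worst["id"]])
```

Here the root span GET /list has the longest total duration, but almost all of it is spent waiting on mongodb.find, so the database call is the place to optimize. A flamegraph shows the same thing visually.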
If you need any help with trying out SigNoz, feel free to mail me at email@example.com.
Check out our documentation for more installation guides and troubleshooting instructions.
They say, "If it's not monitored, then it's not in production." And with SigNoz you can start monitoring your applications now. Enabling your team to resolve issues quickly in production is critical to maintaining complex distributed systems in fine health.
At SigNoz, we are committed to making the best open-source, self-hosted tool for application performance monitoring. Feel free to check out our GitHub repo here: https://github.com/SigNoz/signoz