In this article, I’ll show you how to use pinba with clickhouse and grafana instead of pinba_engine and pinboard.

On a PHP project, pinba is perhaps the only reliable way to understand what is happening with performance. In practice, though, pinba is usually introduced only once problems are already visible and it is not clear where to dig.

Often no one knows how many times per second or per minute a particular script is called, so people start optimizing by feel, beginning with the places that seem most likely.

Some analyze nginx logs, others look at slow queries in the database.

Pinba would certainly help here, but there are several reasons why it is not used on every project.

And the first reason is the installation.

To get any real value from introducing pinba, you need to see metrics not only for the last few minutes but also over a long period (from days to months).

To do this, you need:

  • install the extension for php (and possibly the module for nginx)
  • compile the extension for mysql
  • install pinboard and configure cron

Because so little has been written about pinba, many people have the impression that it only worked on php5 and is long obsolete, but as we will see later, this is not the case.

The first step is the easiest, all you need to do is run the command:

The repositories carry this extension for versions up to php 7.3, so you do not need to compile anything.

After executing the installation command, we immediately get a working extension that collects the metrics of each script (duration, memory, etc.) and sends them in protobuf format over udp to 127.0.0.1:30002.

Nothing is catching or processing these udp packets yet, but this does not affect the speed or stability of your php scripts in any way.
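
To check that the extension really is sending something, it is enough to catch a datagram on that port. Below is a minimal Python sketch of such a check. It is not part of pinba, the function names are invented for illustration, and it assumes the destination 127.0.0.1:30002. It only returns the raw bytes and makes no attempt to decode the protobuf payload.

```python
import socket

# Destination assumed for the pinba extension in this article.
PINBA_HOST = "127.0.0.1"
PINBA_PORT = 30002


def open_listener(host: str = PINBA_HOST, port: int = PINBA_PORT) -> socket.socket:
    """Bind a UDP socket on the address the extension reports to."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_one(sock: socket.socket, timeout: float = 5.0) -> bytes:
    """Wait for one datagram and return its raw, still encoded payload.

    Raises socket.timeout if nothing arrives within `timeout` seconds.
    """
    sock.settimeout(timeout)
    payload, _sender = sock.recvfrom(65535)
    return payload
```

Calling `receive_one(open_listener())` and then opening any php page should return a non-empty byte string if the extension is enabled.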

Until recently, pinba_engine was the only application that could catch and process these udp packets. The description of its “simple and concise” installation kills any desire to ever read it again. The long dependency lists mix package names, program names, and links to separate pages with their own installation instructions, which in turn link to further dependencies.

The pinba2 installation process has not become much easier.

Perhaps someday pinba10 will install with one or two commands, without the need to read a lot of material to understand how to do it, but so far this is not the case.

If you did manage to install pinba_engine, that is only half the battle. Without pinboard, you will be limited to data from the last few minutes, or you will have to aggregate, save, and visualize the data yourself. Fortunately, pinboard is quite easy to install.

But why suffer like this if all the metrics from php already arrive on a udp port in protobuf format, and all that is needed is an application that catches them and puts them into some kind of storage? It seems that the developers who came up with this idea immediately sat down to write their own solutions, some of which are available on GitHub.

Below is a review of four open source projects that save metrics to a storage from which the data is easy to retrieve and visualize, for example with grafana.

olegfedoseev / pinba-server (November 2017)

A udp server written in go that stores metrics in OpenTSDB. If you already use OpenTSDB on your project, this solution may suit you; otherwise I recommend passing it by.

olegfedoseev / pinba-influxdb (June 2018)

A udp server written in go by the same author, which this time saves metrics in InfluxDB. Many projects already use InfluxDB for monitoring, so this solution can be a great fit for them.

Pros:

  • InfluxDB allows you to aggregate the received metrics and delete the originals after a specified time

Cons:

  • this solution does not save timer information
  • InfluxDB saves page addresses as tags, so if you have many unique page addresses, memory consumption will grow

ClickHouse-Ninja / Proton (January 2019)

A udp server written in go that stores metrics in ClickHouse.

Pros:

  • clickhouse is ideal for such tasks: it compresses data so well that you can store all the raw data even without aggregation
  • if required, you can easily aggregate the resulting metrics
  • a ready-made template for grafana
  • saves timer information

Cons:

  • there is no configuration file in which you could set the names of the database and tables or the address and port of the server
  • when saving raw data, an auxiliary dictionary table is used to store page and domain addresses, which complicates later queries

pinba-server / pinba-server (April 2019)

A udp server written in php that stores metrics in ClickHouse. It is a “proof of concept” which, to my surprise, does not consume significant resources (30 MB of RAM and less than 1% of one of the eight processor cores).

The pros are the same as for the previous solution, and it also uses the familiar names from the original pinba_engine. In addition, there is a config that allows you to run several pinba-server instances at once and save metrics to different tables, which is useful if you want to collect data not only from php but also from nginx.

Principle of operation

The server listens on udp port 30002. All incoming packets are decoded according to the protobuf schema and aggregated. Once a minute, the accumulated batch is inserted into the pinba.requests table in clickhouse (all parameters are set in the config).
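
The accumulate-and-flush loop described above can be sketched roughly as follows. This is an illustration in Python, not the actual pinba-server code (which is written in php); the class name `BatchBuffer`, the record fields, and the `sink` callback standing in for the clickhouse insert are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, List

# A decoded request; the field names used with it are hypothetical.
Record = Dict[str, object]


class BatchBuffer:
    """Collect decoded records and hand them to `sink` in one batch per interval."""

    def __init__(
        self,
        sink: Callable[[List[Record]], None],
        interval: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._sink = sink          # stands in for "insert into pinba.requests"
        self._interval = interval  # seconds between batch inserts
        self._clock = clock        # injectable so the logic can be tested
        self._pending: List[Record] = []
        self._last_flush = clock()

    def add(self, record: Record) -> None:
        """Store one record; flush if the interval has elapsed."""
        self._pending.append(record)
        if self._clock() - self._last_flush >= self._interval:
            self.flush()

    def flush(self) -> None:
        """Send everything collected so far as a single batch."""
        batch, self._pending = self._pending, []
        self._last_flush = self._clock()
        if batch:
            self._sink(batch)
```

Inserting once a minute in a single batch fits clickhouse well, since it prefers a few large inserts over many small ones. A real server would also flush on a timer, so that a quiet minute does not delay the data that has already arrived.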

A little about clickhouse

Clickhouse supports various storage engines. The most commonly used is MergeTree.

If at some point you decide to keep aggregated data forever and raw data only for a recent period, you can create a materialized view with grouping and periodically clean the main pinba.requests table; all the data will remain in the materialized view. Moreover, if you specify “engine = Null” when creating the pinba.requests table, the raw data will not be saved to disk at all, yet it will still pass into the materialized view and be stored in aggregated form. I use this scheme for nginx metrics, because I get 50 times more requests on nginx than on php.
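
As a rough illustration of the Null-engine scheme, the two statements could look like the ones below. They are kept as Python strings so they can be passed to whichever clickhouse client you use. Only the table name pinba.requests comes from this article; the column names, the view name pinba.requests_by_minute, and the choice of SummingMergeTree are assumptions for the sketch, and the real pinba-server schema has more columns.

```python
# Raw table: with the Null engine, inserted rows are not written to disk,
# but they are still passed to any materialized view attached to the table.
# Column names here are hypothetical.
RAW_TABLE_DDL = """
CREATE TABLE pinba.requests
(
    timestamp   DateTime,
    script_name String,
    req_time    Float32
)
ENGINE = Null
"""

# Aggregated view: one row per minute and script. SummingMergeTree adds up
# the numeric columns of rows that share the same ORDER BY key.
AGGREGATE_VIEW_DDL = """
CREATE MATERIALIZED VIEW pinba.requests_by_minute
ENGINE = SummingMergeTree
PARTITION BY toYYYYMM(minute)
ORDER BY (minute, script_name)
AS SELECT
    toStartOfMinute(timestamp) AS minute,
    script_name,
    count()       AS req_count,
    sum(req_time) AS req_time_total
FROM pinba.requests
GROUP BY minute, script_name
"""


def schema_statements() -> list:
    """Return the statements in the order they have to be executed."""
    return [RAW_TABLE_DDL.strip(), AGGREGATE_VIEW_DDL.strip()]
```

Because rows with the same key are summed only when parts are merged, queries against such a view should still use sum() with GROUP BY rather than read the columns directly.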

Below is a detailed description of installing and configuring the solution and everything it needs, along with the pitfalls on which more than one “ship” has been wrecked. The entire installation process is described for Ubuntu 18.04 LTS and Centos 7; on other distributions and versions the process may differ slightly.

Installation

All the necessary commands are collected in the Dockerfile to make the instructions reproducible. Only the pitfalls are described below.

php pinba

After installation, make sure that all the options in the /etc/php/7.2/fpm/conf.d/20-pinba.ini file are uncommented. In some distributions (for example, centos) they may be commented out.

clickhouse

During installation, clickhouse will ask you to set a password for the default user. By default, this user is accessible from any IP address, so if you do not have a firewall on the server, be sure to set a password. This can also be done after installation in the /etc/clickhouse-server/users.xml file.

It is also worth mentioning that clickhouse uses several ports, including 9000. In some distributions (for example, centos) this port is also used by php-fpm. If the port is already in use, you can change it to another one in the /etc/clickhouse-server/config.xml file.
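
Before installing, it is easy to check whether something is already listening on port 9000. The following Python sketch is just one way to do it; the function name `port_in_use` is made up for the example, and it only checks tcp ports on the given host.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if a tcp connection to host:port is accepted."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise,
        # so no exception handling is needed for a refused connection.
        return probe.connect_ex((host, port)) == 0
```

If `port_in_use(9000)` returns True before clickhouse is installed, the port is probably taken by php-fpm or another service, and one of the two needs to be moved to a different port.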

grafana with plugin for clickhouse

After installing grafana, log in with the username admin and the password admin. On first login, it will ask you to set a new password.

Next, go to the “+” -> Import menu and enter 10011 as the number of the dashboard to import.

Grafana works with clickhouse through a third-party plugin, but alerts do not work for third-party plugins.

pinba server

Installing protobuf and libevent is optional, but it improves pinba-server performance. If you install pinba-server in a folder other than /opt, you will also need to fix the paths in the systemd unit file.

pinba module under nginx

To compile the module, you need the sources of the same nginx version that is already installed on your server, as well as the same compilation options. Otherwise the build will succeed, but loading the module will fail with an error saying that the module is not binary compatible. The compilation options can be viewed with the nginx -V command.
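
To avoid copying the options by hand, they can be extracted from the nginx -V output. The sketch below is an illustration in Python with made-up function names. It relies on nginx printing a line that starts with “configure arguments:” and on the fact that nginx -V writes its output to stderr.

```python
import shlex
import subprocess
from typing import List


def parse_configure_arguments(nginx_v_output: str) -> List[str]:
    """Extract the configure options from the text printed by `nginx -V`."""
    prefix = "configure arguments:"
    for line in nginx_v_output.splitlines():
        if line.startswith(prefix):
            # shlex keeps quoted values such as --with-cc-opt='-g -O2' together.
            return shlex.split(line[len(prefix):])
    return []


def installed_nginx_arguments(binary: str = "nginx") -> List[str]:
    """Run `nginx -V` and return its configure options (read from stderr)."""
    result = subprocess.run(
        [binary, "-V"], capture_output=True, text=True, check=True
    )
    return parse_configure_arguments(result.stderr)
```

The returned list can then be passed to ./configure in the nginx source tree, together with the option that adds the pinba module.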

Life hacks

All my sites work only over https. The schema field therefore becomes meaningless, so I use it to distinguish web requests from console ones.

In scripts that are accessible from the web I use:

And in the console (for example, cron scripts):

In the dashboard there is a web / console switch for viewing statistics separately.

You can also pass your own tags to pinba, for example: