This article describes the nginx-log-collector project, which reads nginx logs and sends them to a ClickHouse cluster. ElasticSearch is the usual choice for logs, but ClickHouse requires fewer resources (disk space, RAM, CPU) and writes data faster. It also compresses data, so logs take up even less space on disk.
- To view log analytics, create a dashboard in Grafana.
- Install nginx and Grafana in the standard way.
- Install the ClickHouse cluster using ansible-playbook.
Creating Databases and Tables in Clickhouse
This file contains the SQL queries that create the database and tables for nginx-log-collector in ClickHouse.
Run each query in turn on every server in the ClickHouse cluster.
Important: in the line below, replace logs_cluster with your own cluster name. It is the element that sits between “remote_servers” and “shard” in the clickhouse_remote_servers.xml file.
    ENGINE = Distributed('logs_cluster', 'nginx', 'access_log_shard', rand())
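To find the cluster name without reading the XML by eye, something like the following sketch could be used. It assumes the usual layout in which each cluster is a child element of remote_servers and contains shard elements. The sample XML, its host names, and the helper names cluster_names and distributed_engine are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of clickhouse_remote_servers.xml; host names are made up.
SAMPLE_XML = """
<yandex>
  <remote_servers>
    <logs_cluster>
      <shard>
        <replica><host>ch1.example</host><port>9000</port></replica>
      </shard>
      <shard>
        <replica><host>ch2.example</host><port>9000</port></replica>
      </shard>
    </logs_cluster>
  </remote_servers>
</yandex>
"""


def cluster_names(xml_text):
    """Return the tags under <remote_servers> that contain at least one <shard>."""
    root = ET.fromstring(xml_text)
    if root.tag == "remote_servers":
        remote = root
    else:
        remote = root.find(".//remote_servers")
    if remote is None:
        return []
    return [child.tag for child in remote if child.find("shard") is not None]


def distributed_engine(cluster):
    """Render the ENGINE line from the article for a given cluster name."""
    return f"ENGINE = Distributed('{cluster}', 'nginx', 'access_log_shard', rand())"


for name in cluster_names(SAMPLE_XML):
    print(distributed_engine(name))
```

To check a real server, read the file contents and pass them to cluster_names instead of SAMPLE_XML.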
Installing and Configuring nginx-log-collector-rpm
nginx-log-collector does not ship an RPM package. The repository https://github.com/patsevanton/nginx-log-collector-rpm builds one using Fedora Copr.
Install the nginx-log-collector RPM package:
    yum -y install yum-plugin-copr
    yum copr enable antonpatsev/nginx-log-collector-rpm
    yum -y install nginx-log-collector
    systemctl start nginx-log-collector
Edit the config /etc/nginx-log-collector/config.yaml:
    .......
      upload:
        table: nginx.access_log
        dsn: http://clickhouse-cluster-ip-address:8123/

    - tag: "nginx_error:"
      format: error  # access | error
      buffer_size: 1048576
      upload:
        table: nginx.error_log
        dsn: http://clickhouse-cluster-ip-address:8123/
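Once the collector is running, one way to check that rows are reaching the configured table is to query ClickHouse over the same HTTP interface the dsn points to. The sketch below only builds the request URL. The address 127.0.0.1 and the helper names query_url and run_query are assumptions for illustration, and this is not necessarily how nginx-log-collector itself talks to ClickHouse.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def query_url(dsn, sql):
    """Build a ClickHouse HTTP-interface URL that carries the SQL in ?query=."""
    return dsn.rstrip("/") + "/?" + urlencode({"query": sql})


def run_query(dsn, sql, timeout=5):
    """Send a read-only query and return the response body as text."""
    with urlopen(query_url(dsn, sql), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Hypothetical address; substitute the dsn value from config.yaml.
DSN = "http://127.0.0.1:8123/"
print(query_url(DSN, "SELECT count() FROM nginx.access_log"))
```

Calling run_query(DSN, "SELECT count() FROM nginx.access_log") against a live cluster should return a row count that grows while nginx is serving traffic.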
Nginx setup
General nginx config:
    user  nginx;
    worker_processes  auto;

    #error_log  /var/log/nginx/error.log warn;
    pid        /var/run/nginx.pid;

    events {
        worker_connections  1024;
    }

    http {
        include       /etc/nginx/mime.types;
        default_type  application/octet-stream;

        log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                          '$status $body_bytes_sent "$http_referer" '
                          '"$http_user_agent" "$http_x_forwarded_for"';

        log_format avito_json escape=json
                         '{'
                         '"event_datetime": "$time_iso8601", '
                         '"server_name": "$server_name", '
                         '"remote_addr": "$remote_addr", '
                         '"remote_user": "$remote_user", '
                         '"http_x_real_ip": "$http_x_real_ip", '
                         '"status": "$status", '
                         '"scheme": "$scheme", '
                         '"request_method": "$request_method", '
                         '"request_uri": "$request_uri", '
                         '"server_protocol": "$server_protocol", '
                         '"body_bytes_sent": $body_bytes_sent, '
                         '"http_referer": "$http_referer", '
                         '"http_user_agent": "$http_user_agent", '
                         '"request_bytes": "$request_length", '
                         '"request_time": "$request_time", '
                         '"upstream_addr": "$upstream_addr", '
                         '"upstream_response_time": "$upstream_response_time", '
                         '"hostname": "$hostname", '
                         '"host": "$host"'
                         '}';

        access_log syslog:server=unix:/var/run/nginx_log.sock,nohostname,tag=nginx avito_json; #ClickHouse
        error_log syslog:server=unix:/var/run/nginx_log.sock,nohostname,tag=nginx_error; #ClickHouse

        #access_log  /var/log/nginx/access.log  main;

        proxy_ignore_client_abort on;
        sendfile        on;
        keepalive_timeout  65;
        include /etc/nginx/conf.d/*.conf;
    }
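To see what the avito_json format produces, here is a small sketch that decodes one record. The sample line and all of its values are made up. The helper parse_upstream_times is hypothetical and only illustrates how a consumer might interpret the field, not how nginx-log-collector parses it. In this format most fields arrive as JSON strings, body_bytes_sent is a bare number, and upstream_response_time can hold several values when a request was passed to more than one upstream server.

```python
import json

# Hypothetical access-log record in the avito_json format; all values are made up.
SAMPLE_LINE = (
    '{"event_datetime": "2020-01-01T12:00:00+03:00", "server_name": "vhost1", '
    '"status": "200", "body_bytes_sent": 612, "request_time": "0.005", '
    '"upstream_addr": "10.0.0.1:8080, 10.0.0.2:8080", '
    '"upstream_response_time": "0.002, 0.003"}'
)


def parse_upstream_times(value):
    """Split nginx's comma/colon separated upstream times into a list of floats."""
    times = []
    for part in value.replace(":", ",").split(","):
        part = part.strip()
        # nginx can write "-" when no time is available; skip such entries.
        if part and part != "-":
            times.append(float(part))
    return times


record = json.loads(SAMPLE_LINE)
status = int(record["status"])              # quoted in the log format, so a string
sent = record["body_bytes_sent"]            # unquoted in the log format, so an int
times = parse_upstream_times(record["upstream_response_time"])
print(status, sent, times)
```

The list form corresponds to the array access used later in the upstream response time panel, where the first element is the time of the first upstream.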
vhost1.conf:
    upstream backend {
        server stub_http_server-ip-address:8080;
        server stub_http_server-ip-address:8080;
        server stub_http_server-ip-address:8080;
        server stub_http_server-ip-address:8080;
        server stub_http_server-ip-address:8080;
    }

    server {
        listen 80;
        server_name vhost1;
        location / {
            proxy_pass http://backend;
        }
    }
Add the virtual hosts to the /etc/hosts file:
    nginx-server-ip-address vhost1
HTTP server emulator
As an HTTP server emulator we will use nodejs-stub-server.
nodejs-stub-server does not ship an RPM package. The repository https://github.com/patsevanton/nodejs-stub-server builds one using Fedora Copr.
Install the nodejs-stub-server RPM package on the servers that act as nginx upstreams:
    yum -y install yum-plugin-copr
    yum copr enable antonpatsev/nodejs-stub-server
    yum -y install stub_http_server
    systemctl start stub_http_server
Stress Testing
Testing is done with Apache Benchmark (ab).
Install it:
    yum install -y httpd-tools
Start the test with Apache Benchmark from 5 different servers, running one of these commands on each:
    while true; do ab -H "User-Agent: 1server" -c 10 -n 10 -t 10 http://vhost1/; sleep 1; done
    while true; do ab -H "User-Agent: 2server" -c 10 -n 10 -t 10 http://vhost1/; sleep 1; done
    while true; do ab -H "User-Agent: 3server" -c 10 -n 10 -t 10 http://vhost1/; sleep 1; done
    while true; do ab -H "User-Agent: 4server" -c 10 -n 10 -t 10 http://vhost1/; sleep 1; done
    while true; do ab -H "User-Agent: 5server" -c 10 -n 10 -t 10 http://vhost1/; sleep 1; done
Grafana Setup
You also need to create a dashboard variable named table with the value nginx.access_log. The queries below refer to it as $table.
Singlestat Total Requests:
    SELECT
        1 as t,
        count(*) as c
    FROM $table
    WHERE $timeFilter GROUP BY t
Singlestat Failed Requests:
    SELECT
        1 as t,
        count(*) as c
    FROM $table
    WHERE $timeFilter AND status NOT IN (200, 201, 401) GROUP BY t
Singlestat Failing Percent:
    SELECT 1 as t, (sum(status = 500 or status = 499)/sum(status = 200 or status = 201 or status = 401))*100
    FROM $table
    WHERE $timeFilter GROUP BY t
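The arithmetic behind the Failing Percent panel can be reproduced on a handful of made-up status codes. The sketch below mirrors the query: requests with status 499 or 500 are divided by requests with status 200, 201 or 401, so the result is a ratio of failures to successes rather than failures to all requests. The function name failing_percent and the sample data are hypothetical.

```python
def failing_percent(statuses):
    """Percentage of 499/500 responses relative to 200/201/401 responses."""
    failed = sum(1 for s in statuses if s in (499, 500))
    succeeded = sum(1 for s in statuses if s in (200, 201, 401))
    if succeeded == 0:
        # The SQL query has no such guard; this sketch returns None instead of dividing by zero.
        return None
    return 100 * failed / succeeded


# Made-up sample: 100 successful requests, 5 failed ones, 2 ignored by both sides.
sample = [200] * 90 + [201] * 5 + [401] * 5 + [500] * 3 + [499] * 2 + [404] * 2
print(failing_percent(sample))
```

Statuses such as 404 or 502 fall into neither group, so they affect the Failed Requests panel but not this percentage.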
Singlestat Avg Response Time:
    SELECT 1, avg(request_time) FROM $table
    WHERE $timeFilter
    GROUP BY 1
Singlestat Max Response Time:
    SELECT 1 as t, max(request_time) as c
    FROM $table
    WHERE $timeFilter
    GROUP BY t
Count Status:
    $columns(status, count(*) as c) from $table
To display the pie chart, install the piechart panel plugin and restart Grafana:
    grafana-cli plugins install grafana-piechart-panel
    service grafana-server restart
Pie TOP 5 Status:
    SELECT
        1, /* fake timestamp value */
        status,
        count() AS Reqs /* number of requests per status; sum(status) would add up the codes themselves */
    FROM $table
    WHERE $timeFilter
    GROUP BY status
    ORDER BY Reqs desc
    LIMIT 5
Count http_user_agent:
    $columns(http_user_agent, count(*) c) FROM $table
GoodRate/BadRate:
    $rate(countIf(status = 200) AS good, countIf(status != 200) AS bad) FROM $table
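As a rough illustration of what the GoodRate/BadRate panel plots, the sketch below buckets made-up requests by time and divides each bucket's counts by the seconds elapsed since the previous bucket. This is an assumption about how the $rate macro of the Grafana ClickHouse datasource behaves, based on its documented expansion, and not a reimplementation of it. The function name, the 60-second interval and the sample events are hypothetical.

```python
from collections import defaultdict


def good_bad_rate(events, interval=60):
    """events: iterable of (unix_timestamp, status) pairs.

    Returns a list of (bucket_start, good_per_second, bad_per_second) tuples.
    """
    buckets = defaultdict(lambda: [0, 0])  # bucket_start -> [good, bad]
    for ts, status in events:
        start = int(ts) // interval * interval
        buckets[start][0 if status == 200 else 1] += 1

    rows = []
    previous = None
    for start in sorted(buckets):
        good, bad = buckets[start]
        # The first bucket has no predecessor, so no rate is emitted for it.
        if previous is not None:
            elapsed = start - previous
            rows.append((start, good / elapsed, bad / elapsed))
        previous = start
    return rows


# Made-up sample: timestamps in seconds, statuses as integers.
events = [(0, 200), (10, 200), (30, 500), (60, 200), (61, 200),
          (62, 200), (90, 404), (120, 200)]
for row in good_bad_rate(events):
    print(row)
```

Note that, as in the query, every status other than 200 counts as bad here, including 201 and 401, which the Failed Requests panel treats as successful.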
Response Timing:
    $rate(avg(request_time) as request_time) FROM $table
Upstream response time (response time of the 1st upstream):
    $rate(avg(arrayElement(upstream_response_time,1)) as upstream_response_time) FROM $table
Conclusion:
Hopefully the community will get involved in developing, testing, and using nginx-log-collector.
And perhaps someone who deploys nginx-log-collector will report how much disk space, RAM, and CPU it saved them.