Feat: integrations: clickhouse (#4879)

* chore: get built-in clickhouse integration started

* chore: update config pre-requisites for clickhouse integration

* chore: add details of metrics data collected for clickhouse integration

* chore: clickhouse integration: move list of data-collected to its own file

* chore: clickhouse integration: get overview dashboard started

* chore: start with logs collection instructions for clickhouse

* chore: regex parsing for clickhouse text logs

* chore: timestamp parsing for clickhouse logs

* chore: severity parsing for clickhouse logs

* chore: clickhouse logs parsing: move parsed message to body if available

* chore: update pre-reqs for collecting from system.query_log table

* feat: add instructions for collecting from system.query_log table

* feat: add logs attribs collected

* chore: some cleanup of clickhouse overview dashboard

* feat: finish up with clickhouse overview dashboard for clickhouse integration
Raj Kamal Singh 2024-04-26 09:45:57 +05:30 committed by GitHub
parent b2c170c752
commit e6e0a59f5f
9 changed files with 18510 additions and 0 deletions

@ -0,0 +1,125 @@
### Collect Clickhouse Logs
You can configure Clickhouse logs collection by providing the required collector config to your collector.
#### Create collector config file
Save the following config for collecting Clickhouse logs in a file named `clickhouse-logs-collection-config.yaml`:
```yaml
receivers:
  filelog/clickhouse:
    include: ["${env:CLICKHOUSE_LOG_FILE}"]
    operators:
      # Parse default clickhouse text log format.
      # See https://github.com/ClickHouse/ClickHouse/blob/master/src/Loggers/OwnPatternFormatter.cpp
      - type: recombine
        source_identifier: attributes["log.file.name"]
        is_first_entry: body matches '^\\d{4}\\.\\d{2}\\.\\d{2}\\s+'
        combine_field: body
        overwrite_with: oldest
      - type: regex_parser
        parse_from: body
        if: body matches '^(?P<ts>\\d{4}\\.\\d{2}\\.\\d{2} \\d{2}:\\d{2}:\\d{2}.?[0-9]*)\\s+\\[\\s+(\\x1b.*?m)?(?P<thread_id>\\d*)(\\x1b.*?m)?\\s+\\]\\s+{((\\x1b.*?m)?(?P<query_id>[0-9a-zA-Z-_]*)(\\x1b.*?m)?)?}\\s+<(\\x1b.*?m)?(?P<log_level>\\w*)(\\x1b.*?m)?>\\s+((\\x1b.*?m)?(?P<clickhouse_component>[a-zA-Z0-9_]+)(\\x1b.*?m)?:)?\\s+(?s)(?P<message>.*)$'
        regex: '^(?P<ts>\d{4}\.\d{2}\.\d{2} \d{2}:\d{2}:\d{2}.?[0-9]*)\s+\[\s+(\x1b.*?m)?(?P<thread_id>\d*)(\x1b.*?m)?\s+\]\s+{((\x1b.*?m)?(?P<query_id>[0-9a-zA-Z-_]*)(\x1b.*?m)?)?}\s+<(\x1b.*?m)?(?P<log_level>\w*)(\x1b.*?m)?>\s+((\x1b.*?m)?(?P<clickhouse_component>[a-zA-Z0-9_]+)(\x1b.*?m)?:)?\s+(?s)(?P<message>.*)$'
      - type: time_parser
        if: attributes.ts != nil
        parse_from: attributes.ts
        layout_type: gotime
        layout: 2006.01.02 15:04:05.999999
        location: ${env:CLICKHOUSE_TIMEZONE}
      - type: remove
        if: attributes.ts != nil
        field: attributes.ts
      - type: severity_parser
        if: attributes.log_level != nil
        parse_from: attributes.log_level
        overwrite_text: true
        # For mapping details, see getPriorityName defined in https://github.com/ClickHouse/ClickHouse/blob/master/src/Interpreters/InternalTextLogsQueue.cpp
        mapping:
          trace:
            - Trace
            - Test
          debug: Debug
          info:
            - Information
            - Notice
          warn: Warning
          error: Error
          fatal:
            - Fatal
            - Critical
      - type: remove
        if: attributes.log_level != nil
        field: attributes.log_level
      - type: move
        if: attributes.message != nil
        from: attributes.message
        to: body
      - type: add
        field: attributes.source
        value: clickhouse
processors:
  batch:
    send_batch_size: 10000
    send_batch_max_size: 11000
    timeout: 10s
exporters:
  # export to SigNoz cloud
  otlp/clickhouse-logs:
    endpoint: "${env:OTLP_DESTINATION_ENDPOINT}"
    tls:
      insecure: false
    headers:
      "signoz-access-token": "${env:SIGNOZ_INGESTION_KEY}"
  # export to local collector
  # otlp/clickhouse-logs:
  #   endpoint: "localhost:4317"
  #   tls:
  #     insecure: true
service:
  pipelines:
    logs/clickhouse:
      receivers: [filelog/clickhouse]
      processors: [batch]
      exporters: [otlp/clickhouse-logs]
```
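To get a feel for what the `regex_parser` and `severity_parser` operators above extract, here is a minimal shell sketch. It is not the collector's implementation: the `sed` pattern is a simplified stand-in for the regex in the config (it ignores ANSI color codes), the function names are hypothetical, and the sample log line is made up for illustration.

```bash
# Simplified stand-in for the regex_parser operator (no ANSI color-code handling).
# Prints the extracted fields as key=value pairs separated by "|".
parse_clickhouse_log_line() {
  printf '%s\n' "$1" | sed -nE 's/^([0-9]{4}\.[0-9]{2}\.[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}\.?[0-9]*)[[:space:]]+\[[[:space:]]+([0-9]*)[[:space:]]+\][[:space:]]+\{([0-9a-zA-Z_-]*)\}[[:space:]]+<([A-Za-z]*)>[[:space:]]+(([a-zA-Z0-9_]+):)?[[:space:]]+(.*)$/ts=\1|thread_id=\2|query_id=\3|log_level=\4|component=\6|message=\7/p'
}

# Mirrors the severity_parser mapping from the config above.
# The "unknown" fallback is only for this illustration.
map_clickhouse_severity() {
  case "$1" in
    Trace|Test) echo trace ;;
    Debug) echo debug ;;
    Information|Notice) echo info ;;
    Warning) echo warn ;;
    Error) echo error ;;
    Fatal|Critical) echo fatal ;;
    *) echo unknown ;;
  esac
}

# Made-up sample line in the default Clickhouse text log layout.
sample='2024.04.26 09:45:57.123456 [ 4242 ] {} <Information> Application: Ready for connections.'
parse_clickhouse_log_line "$sample"
map_clickhouse_severity Information
```

Running a few lines from your own server log through `parse_clickhouse_log_line` is a quick way to spot a customized log format before deploying the collector config.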
#### Set Environment Variables
Set the following environment variables in your otel-collector environment:
```bash
# Path of the Clickhouse server log file. It must be readable by the otel collector.
# Typically found at /var/log/clickhouse-server/clickhouse-server.log
# The log file location can be found in the clickhouse server config.
# See https://clickhouse.com/docs/en/operations/server-configuration-parameters/settings#logger
export CLICKHOUSE_LOG_FILE="/var/log/clickhouse-server/clickhouse-server.log"
# Timezone of the clickhouse server.
# Clickhouse logs timestamps in its local timezone without TZ info.
# The timezone setting can be found in the clickhouse config. For details see https://clickhouse.com/docs/en/operations/server-configuration-parameters/settings#timezone
# Must be an IANA timezone name like Asia/Kolkata. For examples, see https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
export CLICKHOUSE_TIMEZONE="Etc/UTC"
# region specific SigNoz cloud ingestion endpoint
export OTLP_DESTINATION_ENDPOINT="ingest.us.signoz.cloud:443"
# your SigNoz ingestion key
export SIGNOZ_INGESTION_KEY="signoz-ingestion-key"
```
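Before starting the collector, it can help to confirm that the log file is readable and looks like the default text format. The following is a rough sketch; the function name is hypothetical, and it only inspects the first line, which may be a continuation line right after log rotation.

```bash
# Checks that a log file is readable and that its first line starts with a
# Clickhouse-style date (YYYY.MM.DD), like the is_first_entry check in the config.
check_clickhouse_log_file() {
  log_file="$1"
  if [ ! -r "$log_file" ]; then
    echo "not readable: $log_file"
    return 1
  fi
  if head -n 1 "$log_file" | grep -Eq '^[0-9]{4}\.[0-9]{2}\.[0-9]{2}[[:space:]]+'; then
    echo "ok: $log_file"
  else
    echo "unexpected first line: $log_file"
    return 1
  fi
}
```

Run it as the same OS user that runs the collector, for example `check_clickhouse_log_file "$CLICKHOUSE_LOG_FILE"`, so that file permissions are checked realistically.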
#### Use collector config file
Make the collector config file available to your otel collector and use it by adding the following flag to the command for running your collector:
```bash
--config clickhouse-logs-collection-config.yaml
```
Note: the collector can use multiple config files, specified by multiple occurrences of the --config flag.

@ -0,0 +1,82 @@
### Collect Clickhouse Metrics
You can configure Clickhouse metrics collection by providing the required collector config to your collector.
#### Create collector config file
Save the following config for collecting Clickhouse metrics in a file named `clickhouse-metrics-collection-config.yaml`:
```yaml
receivers:
  prometheus/clickhouse:
    config:
      global:
        scrape_interval: 60s
      scrape_configs:
        - job_name: clickhouse
          static_configs:
            - targets:
                - ${env:CLICKHOUSE_PROM_METRICS_ENDPOINT}
          metrics_path: ${env:CLICKHOUSE_PROM_METRICS_PATH}
processors:
  # enriches the data with additional host information
  # see https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/resourcedetectionprocessor#resource-detection-processor
  resourcedetection/system:
    # add additional detectors if needed
    detectors: ["system"]
    system:
      hostname_sources: ["os"]
exporters:
  # export to SigNoz cloud
  otlp/clickhouse:
    endpoint: "${env:OTLP_DESTINATION_ENDPOINT}"
    tls:
      insecure: false
    headers:
      "signoz-access-token": "${env:SIGNOZ_INGESTION_KEY}"
  # export to local collector
  # otlp/clickhouse:
  #   endpoint: "localhost:4317"
  #   tls:
  #     insecure: true
service:
  pipelines:
    metrics/clickhouse:
      receivers: [prometheus/clickhouse]
      # note: remove this processor if the collector is not running on the same host as the clickhouse instance
      processors: [resourcedetection/system]
      exporters: [otlp/clickhouse]
```
#### Set Environment Variables
Set the following environment variables in your otel-collector environment:
```bash
# Prometheus metrics endpoint on the clickhouse server reachable from the otel collector.
# You can examine clickhouse server configuration to find it. For details see https://clickhouse.com/docs/en/operations/server-configuration-parameters/settings#prometheus
export CLICKHOUSE_PROM_METRICS_ENDPOINT="clickhouse:9363"
# Prometheus metrics path on the clickhouse server
# You can examine clickhouse server configuration to find it. For details see https://clickhouse.com/docs/en/operations/server-configuration-parameters/settings#prometheus
export CLICKHOUSE_PROM_METRICS_PATH="/metrics"
# region specific SigNoz cloud ingestion endpoint
export OTLP_DESTINATION_ENDPOINT="ingest.us.signoz.cloud:443"
# your SigNoz ingestion key
export SIGNOZ_INGESTION_KEY="signoz-ingestion-key"
```
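To check that the two Prometheus variables point at something scrapeable, you can assemble the URL the receiver will hit and fetch it by hand. This is a small sketch: the `metrics_url` helper is hypothetical, and it assumes the endpoint is served over plain HTTP.

```bash
# Builds the scrape URL from the endpoint and path.
# Falls back to the environment variables, then to the example values above.
# Assumes plain HTTP; adjust the scheme if your server uses TLS.
metrics_url() {
  endpoint="${1:-${CLICKHOUSE_PROM_METRICS_ENDPOINT:-clickhouse:9363}}"
  path="${2:-${CLICKHOUSE_PROM_METRICS_PATH:-/metrics}}"
  printf 'http://%s%s\n' "$endpoint" "$path"
}
```

From the collector host, `curl -sf "$(metrics_url)" | head -n 5` should then print a few metric lines if the server is exposing Prometheus metrics.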
#### Use collector config file
Make the collector config file available to your otel collector and use it by adding the following flag to the command for running your collector:
```bash
--config clickhouse-metrics-collection-config.yaml
```
Note: the collector can use multiple config files, specified by multiple occurrences of the --config flag.

@ -0,0 +1,80 @@
### Collect Clickhouse Query Logs
You can configure collection from the `system.query_log` table in Clickhouse by providing the required collector config to your collector.
#### Create collector config file
Save the following config for collecting Clickhouse query logs in a file named `clickhouse-query-logs-collection-config.yaml`:
```yaml
receivers:
  clickhousesystemtablesreceiver/query_log:
    dsn: "${env:CLICKHOUSE_MONITORING_DSN}"
    cluster_name: "${env:CLICKHOUSE_CLUSTER_NAME}"
    query_log_scrape_config:
      scrape_interval_seconds: ${env:QUERY_LOG_SCRAPE_INTERVAL_SECONDS}
      min_scrape_delay_seconds: ${env:QUERY_LOG_SCRAPE_DELAY_SECONDS}
exporters:
  # export to SigNoz cloud
  otlp/clickhouse-query-logs:
    endpoint: "${env:OTLP_DESTINATION_ENDPOINT}"
    tls:
      insecure: false
    headers:
      "signoz-access-token": "${env:SIGNOZ_INGESTION_KEY}"
  # export to local collector
  # otlp/clickhouse-query-logs:
  #   endpoint: "localhost:4317"
  #   tls:
  #     insecure: true
service:
  pipelines:
    logs/clickhouse-query-logs:
      receivers: [clickhousesystemtablesreceiver/query_log]
      processors: []
      exporters: [otlp/clickhouse-query-logs]
```
#### Set Environment Variables
Set the following environment variables in your otel-collector environment:
```bash
# DSN for connecting to clickhouse with the monitoring user
# Replace monitoring:<PASSWORD> with `username:password` for your monitoring user
# Note: The monitoring user must be able to issue select queries on system.query_log table.
export CLICKHOUSE_MONITORING_DSN="tcp://monitoring:<PASSWORD>@clickhouse:9000/"
# If collecting query logs from a clustered deployment, specify a non-empty cluster name.
export CLICKHOUSE_CLUSTER_NAME=""
# Rows from query_log table will be collected periodically based on this setting
export QUERY_LOG_SCRAPE_INTERVAL_SECONDS=20
# Must be greater than the flush_interval_milliseconds setting for query_log.
# Note the units: this value is in seconds, while the server setting is in milliseconds.
# The setting can be found in the clickhouse server config.
# For details see https://clickhouse.com/docs/en/operations/server-configuration-parameters/settings#query-log
# A large enough value ensures that all query logs for a particular time interval have been
# flushed before an attempt to collect them is made.
export QUERY_LOG_SCRAPE_DELAY_SECONDS=8
# region specific SigNoz cloud ingestion endpoint
export OTLP_DESTINATION_ENDPOINT="ingest.us.signoz.cloud:443"
# your SigNoz ingestion key
export SIGNOZ_INGESTION_KEY="signoz-ingestion-key"
```
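Because the scrape delay is given in seconds and the flush interval in milliseconds, a quick arithmetic check can catch a mismatch. The helper below is a hypothetical sketch; the `7500` in the usage note is only an example value, so read the actual `flush_interval_milliseconds` from your server config.

```bash
# Succeeds when the scrape delay (seconds) exceeds the query_log flush interval (milliseconds).
scrape_delay_is_safe() {
  delay_seconds="$1"
  flush_interval_ms="$2"
  [ $((delay_seconds * 1000)) -gt "$flush_interval_ms" ]
}
```

For example, `scrape_delay_is_safe "$QUERY_LOG_SCRAPE_DELAY_SECONDS" 7500 && echo "delay ok"` prints `delay ok` when the delay comfortably covers a 7500 ms flush interval.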
#### Use collector config file
Make the collector config file available to your otel collector and use it by adding the following flag to the command for running your collector:
```bash
--config clickhouse-query-logs-collection-config.yaml
```
Note: the collector can use multiple config files, specified by multiple occurrences of the --config flag.
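As a sketch of how several config files can be combined, the snippet below assembles a collector command line with one `--config` flag per file. The binary name `otelcol-contrib` is an assumption used only for illustration; substitute the command you actually use to run your collector.

```bash
# Hypothetical binary name; replace with the command used to run your collector.
COLLECTOR_BIN="${COLLECTOR_BIN:-otelcol-contrib}"

# Prints a command line with one --config flag per config file given.
build_collector_command() {
  cmd="$COLLECTOR_BIN"
  for config_file in "$@"; do
    cmd="$cmd --config $config_file"
  done
  printf '%s\n' "$cmd"
}

build_collector_command \
  clickhouse-logs-collection-config.yaml \
  clickhouse-metrics-collection-config.yaml \
  clickhouse-query-logs-collection-config.yaml
```

The printed command can be reviewed first and then run in the environment where the variables described above are exported.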

@ -0,0 +1,42 @@
## Before You Begin
To configure metrics and logs collection for a Clickhouse server, you need the following.
### Ensure Clickhouse server is prepared for monitoring
- **Ensure that the Clickhouse server is running a supported version**
  Clickhouse versions v23 and newer are supported.
  You can use the following SQL statement to determine the server version:
  ```SQL
  SELECT version();
  ```
- **If collecting metrics, ensure that Clickhouse is configured to export prometheus metrics**
  If needed, please [configure Clickhouse to expose prometheus metrics](https://clickhouse.com/docs/en/operations/server-configuration-parameters/settings#prometheus).
- **If collecting query_log, ensure that there is a clickhouse user with required permissions**
  To create a monitoring user for clickhouse, you can run:
  ```SQL
  CREATE USER monitoring IDENTIFIED BY 'monitoring_password';
  GRANT SELECT ON system.query_log TO monitoring;
  -- If monitoring a clustered deployment, also grant privilege for executing remote queries
  GRANT ON CLUSTER 'cluster_name' REMOTE ON *.* TO monitoring;
  ```
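The version requirement above can be checked from a script once you have the output of `SELECT version()`. This is a minimal sketch with a hypothetical function name; it looks only at the major version number.

```bash
# Succeeds when the major version of a Clickhouse version string is 23 or newer.
# Returns 2 if the string does not start with a numeric major version.
is_supported_clickhouse_version() {
  major="${1%%.*}"
  case "$major" in
    ''|*[!0-9]*) return 2 ;;
  esac
  [ "$major" -ge 23 ]
}
```

One way to use it, assuming `clickhouse-client` is installed and can reach the server: `is_supported_clickhouse_version "$(clickhouse-client --query 'SELECT version()')" && echo supported`.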
### Ensure OTEL Collector is running and has access to the Clickhouse server
- **Ensure that an OTEL collector is running in your deployment environment**
  If needed, please [install SigNoz OTEL Collector](https://signoz.io/docs/tutorial/opentelemetry-binary-usage-in-virtual-machine/).
  If already installed, ensure that the collector version is v0.88.0 or newer.
  If collecting logs from the system.query_log table, ensure that the collector version is v0.88.22 or newer.
  Also ensure that you can provide config files to the collector and that you can set the environment variables and command line flags used for running it.
- **Ensure that the OTEL collector can access the Clickhouse server**
  To collect metrics, the collector must be able to reach the clickhouse server and access the port on which prometheus metrics are exposed.
  To collect server logs, the collector must be able to read the Clickhouse server log file.
  To collect logs from the query_log table, the collector must be able to reach the server and connect to it as a clickhouse user with the required permissions.
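
The collector version requirements above (v0.88.0 in general, v0.88.22 for query logs) can be compared in a script. The sketch below uses a hypothetical helper and assumes versions with exactly three numeric components, with or without a leading `v`; how you obtain your collector's version string depends on your installation.

```bash
# Succeeds when version $1 is greater than or equal to version $2.
# Assumes exactly three numeric components, e.g. v0.88.22 or 0.88.0.
version_at_least() {
  have="${1#v}"
  want="${2#v}"
  old_ifs="$IFS"
  IFS=.
  # Split both versions on dots into positional parameters $1..$6.
  set -- $have $want
  IFS="$old_ifs"
  [ "$1" -gt "$4" ] && return 0
  [ "$1" -lt "$4" ] && return 1
  [ "$2" -gt "$5" ] && return 0
  [ "$2" -lt "$5" ] && return 1
  [ "$3" -ge "$6" ]
}
```

For example, `version_at_least v0.88.10 v0.88.22` fails, which would indicate that the collector needs an upgrade before collecting from the query_log table.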

@ -0,0 +1,33 @@
<svg version="1.1" id="Layer_1" xmlns:x="ns_extend;" xmlns:i="ns_ai;" xmlns:graph="ns_graphs;"
xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" x="0px" y="0px"
viewBox="0 0 50.6 50.6" style="enable-background:new 0 0 50.6 50.6;" xml:space="preserve">
<style type="text/css">
.st0{fill:#FFFFFF;}
</style>
<metadata>
<sfw xmlns="ns_sfw;">
<slices>
</slices>
<sliceSourceBounds bottomLeftOrigin="true" height="24" width="24" x="0" y="0">
</sliceSourceBounds>
</sfw>
</metadata>
<g>
<g>
<path class="st0" d="M0.6,0H5c0.3,0,0.6,0.3,0.6,0.6V50c0,0.3-0.3,0.6-0.6,0.6H0.6C0.3,50.6,0,50.4,0,50V0.6C0,0.3,0.3,0,0.6,0z">
</path>
<path class="st0" d="M11.8,0h4.4c0.3,0,0.6,0.3,0.6,0.6V50c0,0.3-0.3,0.6-0.6,0.6h-4.4c-0.3,0-0.6-0.3-0.6-0.6V0.6
C11.3,0.3,11.5,0,11.8,0z">
</path>
<path class="st0" d="M23.1,0h4.4c0.3,0,0.6,0.3,0.6,0.6V50c0,0.3-0.3,0.6-0.6,0.6h-4.4c-0.3,0-0.6-0.3-0.6-0.6V0.6
C22.5,0.3,22.8,0,23.1,0z">
</path>
<path class="st0" d="M34.3,0h4.4c0.3,0,0.6,0.3,0.6,0.6V50c0,0.3-0.3,0.6-0.6,0.6h-4.4c-0.3,0-0.6-0.3-0.6-0.6V0.6
C33.7,0.3,34,0,34.3,0z">
</path>
<path class="st0" d="M45.6,19.7H50c0.3,0,0.6,0.3,0.6,0.6v10.1c0,0.3-0.3,0.6-0.6,0.6h-4.4c-0.3,0-0.6-0.3-0.6-0.6V20.3
C45,20,45.3,19.7,45.6,19.7z">
</path>
</g>
</g>
</svg>

@ -0,0 +1,59 @@
{
"id": "clickhouse",
"title": "Clickhouse",
"description": "Monitor Clickhouse with metrics and logs",
"author": {
"name": "SigNoz",
"email": "integrations@signoz.io",
"homepage": "https://signoz.io"
},
"icon": "file://icon.svg",
"categories": [
"Database"
],
"overview": "file://overview.md",
"configuration": [
{
"title": "Prerequisites",
"instructions": "file://config/prerequisites.md"
},
{
"title": "Collect Metrics",
"instructions": "file://config/collect-metrics.md"
},
{
"title": "Collect Server Logs",
"instructions": "file://config/collect-logs.md"
},
{
"title": "Collect Query Logs",
"instructions": "file://config/collect-query-logs.md"
}
],
"assets": {
"logs": {
"pipelines": []
},
"dashboards": [
"file://assets/dashboards/overview.json"
],
"alerts": []
},
"connection_tests": {
"logs": {
"op": "AND",
"items": [
{
"key": {
"type": "tag",
"key": "source",
"dataType": "string"
},
"op": "=",
"value": "clickhouse"
}
]
}
},
"data_collected": "file://data-collected.json"
}

@ -0,0 +1,7 @@
### Monitor Clickhouse with SigNoz
Collect key Clickhouse metrics and view them with an out-of-the-box dashboard.
Collect and parse Clickhouse logs to populate timestamp, severity, and other log attributes for better querying and aggregation.
Collect Clickhouse query logs from the system.query_log table and view them in SigNoz.