优质项目源码阅读,来自:
https://github.com/vectorengine/vectorsql
NOTICE: This project have moved to Databend
VectorSQL is a free analytics DBMS for IoT & Big Data, compatible with ClickHouse.
- High Performance
- High Scalability
- High Reliability
$git clone https://github.com/vectorengine/vectorsql
$cd vectorsql
$make build
$./bin/vectorsql-server -c conf/vectorsql-default.toml
2020/01/27 19:02:39.245654 [DEBUG] Database->Attach Table:system.tables, engine:SYSTEM_TABLES <attachTable@database_system.go:116>
2020/01/27 19:02:39.245670 [DEBUG] Database->Attach Table:system.databases, engine:SYSTEM_DATABASES <attachTable@database_system.go:116>
2020/01/27 19:02:39.245680 [INFO] Database->Load Database:system <[email protected]:110>
2020/01/27 19:02:39.245794 [INFO] Listening for connections with native protocol (tcp)::9000 <[email protected]:33>
2020/01/27 19:02:39.245806 [INFO] Servers start... <[email protected]:62>
- clickhouse-client
$clickhouse-client --compression=0
VectorSQL :) SELECT SUM(IF(status!=200, 1, 0)) AS errors, SUM(IF(status=200, 1, 0)) as success, (errors/COUNT(server)) AS error_rate,(success/COUNT(server)) as success_rate, (SUM(response_time)/COUNT(server)) AS load_avg, MIN(response_time), MAX(response_time), path, server FROM logmock(rows->15) GROUP BY server, path HAVING errors>0 ORDER BY server ASC, load_avg DESC;
SELECT
SUM(IF(status != 200, 1, 0)) AS errors,
SUM(IF(status = 200, 1, 0)) AS success,
errors / COUNT(server) AS error_rate,
success / COUNT(server) AS success_rate,
SUM(response_time) / COUNT(server) AS load_avg,
MIN(response_time),
MAX(response_time),
path,
server
FROM logmock(rows -> 15)
GROUP BY
server,
path
HAVING errors > 0
ORDER BY
server ASC,
load_avg DESC
┌─errors─┬─success─┬─error_rate─┬─success_rate─┬─load_avg─┬─MIN(response_time)─┬─MAX(response_time)─┬─path───┬─server──────┐
│ 2 │ 1 │ 0.6667 │ 0.3333 │ 12 │ 10 │ 13 │ /login │ 192.168.0.1 │
│ 1 │ 5 │ 0.1667 │ 0.8333 │ 11.1667 │ 10 │ 12 │ /index │ 192.168.0.1 │
│ 1 │ 3 │ 0.25 │ 0.75 │ 11.25 │ 10 │ 14 │ /index │ 192.168.0.2 │
│ 1 │ 1 │ 0.5 │ 0.5 │ 11 │ 10 │ 12 │ /login │ 192.168.0.2 │
└────────┴─────────┴────────────┴──────────────┴──────────┴────────────────────┴────────────────────┴────────┴─────────────┘
↓ Progress: 0.00 rows, 0.00 B (0.00 rows/s., 0.00 B/s.)
4 rows in set. Elapsed: 0.005 sec.
- http-client
curl -XPOST http://127.0.0.1:8123 -d "SELECT SUM(IF(status!=200, 1, 0)) AS errors, SUM(IF(status=200, 1, 0)) as success, (errors/COUNT(server)) AS error_rate,(success/COUNT(server)) as success_rate, (SUM(response_time)/COUNT(server)) AS load_avg, MIN(response_time), MAX(response_time), path, server FROM logmock(rows->15) GROUP BY server, path HAVING errors>0 ORDER BY server ASC, load_avg DESC"
2 1 0.6667 0.3333 12.0000 10 13 /login 192.168.0.1
1 5 0.1667 0.8333 11.1667 10 12 /index 192.168.0.1
1 3 0.2500 0.7500 11.2500 10 14 /index 192.168.0.2
1 1 0.5000 0.5000 11.0000 10 12 /login 192.168.0.2
Query language | Current version | Future versions | Example |
---|---|---|---|
Scans by Value | + | + | SELECT a,b |
Scans by Expression | + | + | SELECT IF(a>2,a,b),SUM(a) |
Filter by Value | + | + | WHERE a>10 |
Filter by Expression | + | + | WHERE a>(b+10) |
Group-Aggregate by Value | + | + | GROUP BY a |
Group-Aggregate by Expression | + | + | GROUP BY (a+1) |
Group-Having by Value | + | + | HAVING count_a>2 |
Group-Having by Expression | + | + | HAVING (count_a+1)>2 |
Order by Value | + | + | ORDER BY a desc |
Order by Expression | + | + | ORDER BY (a+b) |
Window Functions | - | + | |
Common Table Expressions | - | + | |
Join | - | + |
- Dataset: 10,000,000 (10 Million)
- Hardware: 16vCPUx16G KVM Cloud Instance
- Benchmark
Query | Cost(second) |
---|---|
SELECT COUNT(id) FROM testdata | 0.269s |
SELECT COUNT(id) FROM testdata WHERE id!=0 | 0.438s |
SELECT SUM(data1) FROM testdata | 0.287s |
SELECT SUM(data1) AS sum, COUNT(data1) AS count, sum/count AS avg FROM testdata | 1.814s |
SELECT MAX(id), MIN(id) FROM testdata | 0.473s |
SELECT COUNT(data1) AS count, data1 FROM testdata GROUP BY data1 ORDER BY count DESC LIMIT 10 | 0.728s |
SELECT email FROM testdata WHERE email like '%[email protected]%' LIMIT 1 | 0.076s |
SELECT COUNT(email) FROM testdata WHERE email like '%[email protected]%' | 1.470s |
SELECT data1 AS x, x - 1, x - 2, x - 3, count(data1) AS c FROM testdata GROUP BY x, x - 1, x - 2, x - 3 ORDER BY c DESC LIMIT 10 | 2.396s |