单机事务数据库内核。
围绕执行tpch query而设计的,有计算引擎和存储引擎。
文档:快速实现数据库内核
两个bin:
plandb:psql交互执行
tester:非交互。
- create schema
创建schema. 内置的schema是
public
。
create schema if not exists s1;
- create table 创建表。
CREATE TABLE NATION ( N_NATIONKEY INTEGER NOT NULL,
N_NAME VARCHAR(25) /*CHAR(25)*/ NOT NULL,
N_REGIONKEY INTEGER NOT NULL,
N_COMMENT VARCHAR(152),
PRIMARY KEY (N_NATIONKEY)
);
- copy 从parquet,csv文件中导入数据到表中。
copy nation
from '/home/pengzhen/Documents/tpch-parquet/nation.parquet'
with (FORMAT 'parquet');
-
insert ... values
-
tpch 22条query
tpch query 1
select
l_returnflag,
l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(*) as count_order
from
lineitem
where
l_shipdate <= date '1998-12-01' - interval '112 day'
group by
l_returnflag,
l_linestatus
order by
l_returnflag,
l_linestatus
;
使用psql连接交互执行。
make plandb
./plandb
psql -h 127.0.0.1
make tester
配置文件tester.toml
配置文件搜索路径。以下路径都没有配置文件,报错退出。
./
./etc/tpch/1g/
执行支持的语句。不限于tpch的query.
tester tpch1gddl --ddl 'sql'
创建tpch表并从parquet中导入数据。
tester tpch1gddl --ddl "
CREATE TABLE NATION ( N_NATIONKEY INTEGER NOT NULL,
N_NAME VARCHAR(25) /*CHAR(25)*/ NOT NULL,
N_REGIONKEY INTEGER NOT NULL,
N_COMMENT VARCHAR(152),
PRIMARY KEY (N_NATIONKEY)
);
CREATE TABLE REGION ( R_REGIONKEY INTEGER NOT NULL,
R_NAME VARCHAR(25) /*CHAR(25)*/ NOT NULL,
R_COMMENT VARCHAR(152),
PRIMARY KEY (R_REGIONKEY)
);
CREATE TABLE PART ( P_PARTKEY INTEGER NOT NULL,
P_NAME VARCHAR(55) NOT NULL,
P_MFGR VARCHAR(25) /*CHAR(25)*/ NOT NULL,
P_BRAND VARCHAR(10) /*CHAR(10)*/ NOT NULL,
P_TYPE VARCHAR(25) NOT NULL,
P_SIZE INTEGER NOT NULL,
P_CONTAINER VARCHAR(10) /*CHAR(10)*/ NOT NULL,
P_RETAILPRICE DECIMAL(15,2) NOT NULL,
P_COMMENT VARCHAR(23) NOT NULL,
PRIMARY KEY (P_PARTKEY)
);
CREATE TABLE SUPPLIER ( S_SUPPKEY INTEGER NOT NULL,
S_NAME VARCHAR(25) /*CHAR(25)*/ NOT NULL,
S_ADDRESS VARCHAR(40) NOT NULL,
S_NATIONKEY INTEGER NOT NULL,
S_PHONE VARCHAR(15) /*CHAR(15)*/ NOT NULL,
S_ACCTBAL DECIMAL(15,2) NOT NULL,
S_COMMENT VARCHAR(101) NOT NULL,
PRIMARY KEY (S_SUPPKEY)
);
CREATE TABLE PARTSUPP ( PS_PARTKEY INTEGER NOT NULL,
PS_SUPPKEY INTEGER NOT NULL,
PS_AVAILQTY INTEGER NOT NULL,
PS_SUPPLYCOST DECIMAL(15,2) NOT NULL,
PS_COMMENT VARCHAR(199) NOT NULL,
PRIMARY KEY (PS_PARTKEY, PS_SUPPKEY)
);
CREATE TABLE CUSTOMER ( C_CUSTKEY INTEGER NOT NULL,
C_NAME VARCHAR(25) NOT NULL,
C_ADDRESS VARCHAR(40) NOT NULL,
C_NATIONKEY INTEGER NOT NULL,
C_PHONE VARCHAR(15) /*CHAR(15)*/ NOT NULL,
C_ACCTBAL DECIMAL(15,2) NOT NULL,
C_MKTSEGMENT VARCHAR(10) /*CHAR(10)*/ NOT NULL,
C_COMMENT VARCHAR(117) NOT NULL,
PRIMARY KEY (C_CUSTKEY)
);
CREATE TABLE ORDERS ( O_ORDERKEY BIGINT NOT NULL,
O_CUSTKEY INTEGER NOT NULL,
O_ORDERSTATUS VARCHAR(1)/*CHAR(1)*/ NOT NULL,
O_TOTALPRICE DECIMAL(15,2) NOT NULL,
O_ORDERDATE DATE NOT NULL,
O_ORDERPRIORITY VARCHAR(15) /*CHAR(15)*/ NOT NULL,
O_CLERK VARCHAR(15) /*CHAR(15)*/ NOT NULL,
O_SHIPPRIORITY INTEGER NOT NULL,
O_COMMENT VARCHAR(79) NOT NULL,
PRIMARY KEY (O_ORDERKEY)
);
CREATE TABLE LINEITEM ( L_ORDERKEY BIGINT NOT NULL,
L_PARTKEY INTEGER NOT NULL,
L_SUPPKEY INTEGER NOT NULL,
L_LINENUMBER INTEGER NOT NULL,
L_QUANTITY INTEGER /*DECIMAL(15,2)*/ NOT NULL,
L_EXTENDEDPRICE DECIMAL(15,2) NOT NULL,
L_DISCOUNT DECIMAL(15,2) NOT NULL,
L_TAX DECIMAL(15,2) NOT NULL,
L_RETURNFLAG VARCHAR(1) NOT NULL,
L_LINESTATUS VARCHAR(1) NOT NULL,
L_SHIPDATE DATE NOT NULL,
L_COMMITDATE DATE NOT NULL,
L_RECEIPTDATE DATE NOT NULL,
L_SHIPINSTRUCT VARCHAR(25) /*CHAR(25)*/ NOT NULL,
L_SHIPMODE VARCHAR(10) /*CHAR(10)*/ NOT NULL,
L_COMMENT VARCHAR(44) NOT NULL,
PRIMARY KEY (L_ORDERKEY, L_LINENUMBER)
);
copy nation
from '/home/pengzhen/Documents/tpch-parquet/nation.parquet'
with (FORMAT 'parquet');
copy region
from '/home/pengzhen/Documents/tpch-parquet/region.parquet'
with (FORMAT 'parquet');
copy part
from '/home/pengzhen/Documents/tpch-parquet/part.parquet'
with (FORMAT 'parquet');
copy supplier
from '/home/pengzhen/Documents/tpch-parquet/supplier.parquet'
with (FORMAT 'parquet');
copy partsupp
from '/home/pengzhen/Documents/tpch-parquet/partsupp.parquet'
with (FORMAT 'parquet');
copy customer
from '/home/pengzhen/Documents/tpch-parquet/customer.parquet'
with (FORMAT 'parquet');
copy orders
from '/home/pengzhen/Documents/tpch-parquet/orders.parquet'
with (FORMAT 'parquet');
copy lineitem
from '/home/pengzhen/Documents/tpch-parquet/lineitem.parquet'
with (FORMAT 'parquet');
"
tester help tpch1g
//测试tpch query
tester tpch1g
--query_id int
运行指定序号的query。范围[1,22]。为0时,按顺序执行22条query。
--data_path string
tpch1g 数据位置
--data_format string
tpch1g 数据格式。csv,parquet
--result_path string
query结果位置
--need_headline bool
query结果第一行为headline行
tpch1g测试结果
**注意:在忽略精度和结果标题前提下,进行对比
tpch 1g qX | status | 与duckdb相同 | 与mo相同 |
---|---|---|---|
q1 | right | y | y |
q2 | right | n (s_ddress,s_comment 不同) | y |
q3 | right | y | y |
q4 | right | y | y |
q5 | right | y | y |
q6 | right | y | y |
q7 | right | y | y |
q8 | right | y | y |
q9 | right | y | y |
q10 | right (use topN further) | n (c_address,c_comment 不同) | y |
q11 | right | y | y |
q12 | right | y | y |
q13 | almost right. (duckdb convert left join to inner join) | n | almost y. count(NULL) is diff |
q14 | right | y | y |
q15 | right | n (s_address 不同) | y |
q16 | right | y | y |
q17 | right | y | y |
q18 | right | y | y |
q19 | right | y | y |
q20 | right | n (s_address 不同) | y |
q21 | right | y | y |
q22 | right | y | y |
**