大家好,我是砸锅。一个摸鱼八年的后端开发。熟悉 Go、Lua。第十五天还是继续和大家一起学习 Rust😊

設計一個 SQL 查询工具

流程大部分都是从各种数据源里操作数据,例如数据获取、过滤、投影、排序。首先需要一个 SQL 解析器,然后需要一个将数据源加载为 DataFrame。所以选定了 sqlparser-rs polars ,然后接下来就是将 sqlparser 解析出来的抽象语法树(Abstract Syntax Tree),映射到 polars 的 DataFrame 的操作

抽象语法树是用来描述复杂语法规则的工具,SQL 或者 DSL、编程语言都有通过 AST 描述,polars 内部也有自己的 AST,所以 SQL 查询工具的需求本质就是如何将一个 SQL AST 转换成另一个 DataFrame AST,在两个数据结构之间进行转换

对于不熟悉的第三方库,可以先写 example 测试一下

Cargo.toml

[package] name = "queryer" version = "0.1.0" edition = "2021" [[example]] name = "dialect" [dependencies] anyhow = "1" # 错误处理 async-trait = "0.1" # 允许 trait 里有 async fn sqlparser = "0.10" # SQL 解析器 polars = { version = "0.15" , features = [ "json" , "lazy" ] } # DataFrame 库 reqwest = { version = "0.11" , default-features = false , features = [ "rustls-tls" , ] } # 我们的老朋友 HTTP 客户端 tokio = { version = "1" , features = [ "fs" ] } # 我们的老朋友异步库,我们这里需要异步文件处理 tracing = "0.1" # 日志处理 [dev-dependencies] tracing-subscriber = "0.2" # 日志处理 tokio = { version = "1" , features = [ "full" , ] } # 在 example 下我们需要更多的 tokio feature

dialect.rs

use sqlparser::{dialect::GenericDialect, parser::Parser};
fn main() {
    tracing_subscriber::fmt::init();
    let sql = "SELECT a a1, b, 123, myfunc(b), * \
    FROM data_source \
    WHERE a > b AND b < 100 AND c BETWEEN 10 AND 20 \
    ORDER BY a DESC, b \
    LIMIT 50 OFFSET 10";
    let ast = Parser::parse_sql(&GenericDialect::default(), sql);
    println!("{:#?}", ast)

执行结果:

queryer git:(master) ✗ cargo run --example dialect
    Finished dev [unoptimized + debuginfo] target(s) in 0.47s
     Running `target/debug/examples/dialect`
        Query(
            Query {
                with: None,
                body: Select(
                    Select {
                        distinct: false,
                        top: None,
                        projection: [
                            ExprWithAlias {
                                expr: Identifier(
                                    Ident {
                                        value: "a",
                                        quote_style: None,
                                alias: Ident {
                                    value: "a1",
                                    quote_style: None,

完整代码:github.com/tyrchen/gee…

最终解析效果:

➜  queryer git:(master) ✗ cargo run --example covid --quiet
Ok(shape: (13, 5)
+-----------------------+--------------+-----------+--------------+------------+
| location              | total_cases  | new_cases | total_deaths | new_deaths |
| ---                   | ---          | ---       | ---          | ---        |
| str                   | f64          | f64       | f64          | f64        |
+=======================+==============+===========+==============+============+
| "Asia"                | 2.08859351e8 | 8.4798e4  | 1.605175e6   | 377        |
+-----------------------+--------------+-----------+--------------+------------+
| "Colombia"            | 6.352923e6   | 4567      | 1.42186e5    | 101        |
+-----------------------+--------------+-----------+--------------+------------+
| "Germany"             | 3.7739472e7  | 1.1317e4  | 1.65441e5    | 127        |
+-----------------------+--------------+-----------+--------------+------------+
| "High income"         | 4.26794547e8 | 1.01266e5 | 2.838402e6   | 376        |
+-----------------------+--------------+-----------+--------------+------------+
| ...                   | ...          | ...       | ...          | ...        |
+-----------------------+--------------+-----------+--------------+------------+
| "Spain"               | 1.3731478e7  | 8801      | 1.18434e5    | 251        |
+-----------------------+--------------+-----------+--------------+------------+
| "Sweden"              | 2.693458e6   | 2985      | 2.3279e4     | 259        |
+-----------------------+--------------+-----------+--------------+------------+
| "United Kingdom"      | 2.4274361e7  | 1.5121e4  | 2.17262e5    | 1007       |
+-----------------------+--------------+-----------+--------------+------------+
| "Upper middle income" | 1.4404686e8  | 2.0479e4  | 2.592276e6   | 194        |
+-----------------------+--------------+-----------+--------------+------------+
| "World"               | 6.70246597e8 | 1.22817e5 | 6.823644e6   | 588        |
+-----------------------+--------------+-----------+--------------+------------+
➜  queryer git:(master) ✗
    后端要努力
        后端开发 @ 某互联网公司
      
粉丝