MaxCompute SQL aggregate functions: command syntax, parameters, and examples - MaxCompute (cloud-native big data computing service) - Alibaba Cloud Help Center

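An aggregate function has a many-to-one relationship between input and output: it aggregates multiple input records into a single output value, and it can be combined with the group by clause in MaxCompute SQL. This topic describes the command syntax, parameters, and examples of the aggregate functions that MaxCompute SQL supports, to guide you in using them in development.

MaxCompute SQL supports the following aggregate functions.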

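Function	Description
ANY_VALUE	Returns an arbitrary value from the specified range.
APPROX_DISTINCT	Returns the approximate number of distinct input values.
ARG_MAX	Returns the value of a column from the row that holds the maximum value of the specified column.
ARG_MIN	Returns the value of a column from the row that holds the minimum value of the specified column.
AVG	Calculates the average.
BITWISE_AND_AGG	Calculates the bitwise AND aggregate of the input values.
BITWISE_OR_AGG	Calculates the bitwise OR aggregate of the input values.
COLLECT_LIST	Aggregates the specified column into an array.
COLLECT_SET	Aggregates the specified column into an array without duplicate elements.
COUNT	Counts the number of records.
COUNT_IF	Counts the records for which the specified expression is True.
COVAR_POP	Calculates the population covariance of two specified numeric columns.
COVAR_SAMP	Calculates the sample covariance of two specified numeric columns.
HISTOGRAM	Builds a map from each input value (the key) to the number of times it occurs.
MAP_AGG	Builds a map from two input fields.
MAP_UNION	Builds the output map by taking the union of the input maps.
MAP_UNION_SUM	Builds the output map by taking the union of the input maps and summing the values of identical keys.
MAX	Calculates the maximum value.
MAX_BY	Returns the value of a column from the row that holds the maximum value of the specified column.
MEDIAN	Calculates the median.
MIN	Calculates the minimum value.
MIN_BY	Returns the value of a column from the row that holds the minimum value of the specified column.
MULTIMAP_AGG	Builds a map from two input fields: the first field is the map key, and the values of the second field are collected into an array that becomes the map value.
NUMERIC_HISTOGRAM	Computes an approximate histogram of the specified column.
PERCENTILE	Calculates the exact percentile. Suitable for small data volumes.
PERCENTILE_APPROX	Calculates the approximate percentile. Suitable for large data volumes.
STDDEV	Calculates the population standard deviation.
STDDEV_SAMP	Calculates the sample standard deviation.
SUM	Calculates the sum.
VAR_SAMP	Calculates the sample variance of the specified numeric column.
VARIANCE/VAR_POP	Calculates the population variance of the specified numeric column.
WM_CONCAT	Concatenates strings with the specified separator.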

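Usage notes

MaxCompute 2.0 extended a number of functions. If a function that you use involves the new data types (TINYINT, SMALLINT, INT, FLOAT, VARCHAR, TIMESTAMP, or BINARY), run one of the following statements to enable the new data types before you use the extended function:

Session level: to use the new data types, add the statement set odps.sql.type.system.odps2=true; before your SQL statement, and submit both together for execution.
Project level: the project owner can enable the setting for the whole project as needed. The change takes effect after 10 to 15 minutes. The command is as follows.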
```
setproject odps.sql.type.system.odps2=true;
```
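For details about setproject, see Project operations. For considerations when enabling the data type edition at the project level, see Data type editions.
The number of elements in a single worker cannot exceed two million.

If a single SQL statement uses multiple aggregate functions and the project does not have enough resources, a memory overflow can occur. Optimize the SQL or purchase additional computing resources based on your workload.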

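Aggregate function syntax

The syntax of an aggregate function is declared as follows.

<aggregate_name>(<expression>[,...]) [within group (order by <col1>[,<col2>…])] [filter (where <where_condition>)]

<aggregate_name>(<expression>[,...]): a built-in aggregate function or a user-defined aggregate function (UDAF). The exact format depends on the syntax of the specific aggregate function.

within group (order by <col1>[,<col2>…]): when an aggregate function carries this expression, the input data is sorted by <col1>[,<col2>…] in ascending order by default. To sort in descending order, use within group (order by <col1>[,<col2>…] [desc]).

When you use this expression, note the following:

Only WM_CONCAT, COLLECT_LIST, COLLECT_SET, and UDAFs support this expression.
If several aggregate functions in one SELECT statement carry a within group (order by <col1>[,<col2>…]) expression, the order by <col1>[,<col2>…] part must be identical in all of them.
If the arguments of the aggregate function include the DISTINCT keyword, order by <col1>[,<col2>…] may use only the DISTINCT columns. That is, the set of order by columns must be a subset of the DISTINCT columns, and the data types of <col1>[,<col2>…] must match the input argument types of the aggregate function.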

--Example 1: sort the input data in ascending order before output.
select x,
  wm_concat(',', y) within group (order by y)
from values('k', 1),('k', 3),('k', 2) as t(x, y)
group by x;
--The following result is returned.
+------------+------------+
| x          | _c1        |
+------------+------------+
| k          | 1,2,3      |
+------------+------------+
--Example 2: sort the input data in descending order before output.
select x,
  wm_concat(',', y) within group (order by y desc)
from values('k', 1),('k', 3),('k', 2) as t(x, y)
group by x;
--The following result is returned.
+------------+------------+
| x          | _c1        |
+------------+------------+
| k          | 3,2,1      |
+------------+------------+
--Example 3
select id,
wm_concat(distinct ',', name) within group (order by name desc)
from values('k', '1'),('k', '3'),('k', '2') as t(id, name)
group by id;
--The following result is returned.
+------------+------------+
| id         | _c1        |
+------------+------------+
| k          | 3,2,1      |
+------------+------------+
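--Example 4
--Because the arguments of the aggregate function include the DISTINCT keyword, the BIGINT argument sal of wm_concat is implicitly converted to STRING.
--To match the input type of wm_concat, order by must use cast to convert sal to STRING; otherwise an error is returned.
select deptno,
wm_concat(distinct ',', sal) 
within group (order by cast(sal as STRING) desc) 
from emp group by deptno order by deptno;
--The following result is returned.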
+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 5000,2450,1300 |
| 20         | 800,3000,2975,1100 |
| 30         | 950,2850,1600,1500,1250 |
+------------+------------+
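
The string ordering in Example 4 can be reproduced outside MaxCompute. The following Python sketch only illustrates the comparison rule (values compared as strings, not as numbers); it is not a description of how the engine is implemented. The list `dept20_sal` is copied from the `emp` sample data for deptno 20, and the helper name `wm_concat_distinct_desc` is hypothetical.

```python
def wm_concat_distinct_desc(separator, values):
    """Join distinct values after sorting them as strings, descending."""
    # DISTINCT applied after the implicit cast to STRING.
    as_strings = {str(v) for v in values}
    # String comparison is character by character, so "800" sorts above "3000".
    return separator.join(sorted(as_strings, reverse=True))

dept20_sal = [800, 2975, 3000, 1100, 3000]  # sal values for deptno 20 in emp
result = wm_concat_distinct_desc(",", dept20_sal)
print(result)  # 800,3000,2975,1100
```

Sorting the same values as numbers would place 3000 first, which is why the cast in order by changes the visible result.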

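filter (where <where_condition>): when an aggregate function carries this expression, it processes only the rows that satisfy <where_condition>.

--Example 1: filter and aggregate the data.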
select
  sum(x),
  sum(x) filter (where y > 1),
  sum(x) filter (where y > 2)
  from values(null, 1),(1, 2),(2, 3),(3, null) as t(x, y);
--The following result is returned.
+------------+------------+------------+
| _c0        | _c1        | _c2        |
+------------+------------+------------+
| 6          | 3          | 2          |
+------------+------------+------------+
--Example 2: filter and aggregate the data with multiple aggregate functions.
select
  count_if(x > 2),
  sum(x) filter (where y > 1),
  sum(x) filter (where y > 2)
  from values(null, 1),(1, 2),(2, 3),(3, null) as t(x, y);
--The following result is returned.
+------------+------------+------------+
| _c0        | _c1        | _c2        |
+------------+------------+------------+
| 1          | 3          | 2          |
+------------+------------+------------+
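
To make the interaction between filter and NULL values concrete, here is a minimal Python model of the first filter example. It assumes the usual SQL rules that SUM skips NULL inputs and that a comparison with NULL is not true; the helper names `sum_filter` and `gt` are hypothetical and exist only for this sketch.

```python
rows = [(None, 1), (1, 2), (2, 3), (3, None)]  # (x, y) pairs from the example

def sum_filter(rows, predicate=lambda y: True):
    """Sum x over the rows whose y passes the predicate; NULL x is skipped."""
    kept = [x for x, y in rows if x is not None and predicate(y)]
    # Assumption: a SUM that receives no non-NULL input yields NULL.
    return sum(kept) if kept else None

def gt(threshold):
    # A comparison with NULL is not true, so such rows are filtered out.
    return lambda y: y is not None and y > threshold

totals = (sum_filter(rows), sum_filter(rows, gt(1)), sum_filter(rows, gt(2)))
print(totals)  # (6, 3, 2)
```

The row (3, null) contributes to the unfiltered sum but to neither filtered sum, which accounts for the drop from 6 to 3.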

Sample data

create table if not exists emp
   (empno bigint,
    ename string,
    job string,
    mgr bigint,
    hiredate datetime,
    sal bigint,
    comm bigint,
    deptno bigint);
tunnel upload emp.txt emp;

The emp.txt file contains the following data:

7369,SMITH,CLERK,7902,1980-12-17 00:00:00,800,,20
7499,ALLEN,SALESMAN,7698,1981-02-20 00:00:00,1600,300,30
7521,WARD,SALESMAN,7698,1981-02-22 00:00:00,1250,500,30
7566,JONES,MANAGER,7839,1981-04-02 00:00:00,2975,,20
7654,MARTIN,SALESMAN,7698,1981-09-28 00:00:00,1250,1400,30
7698,BLAKE,MANAGER,7839,1981-05-01 00:00:00,2850,,30
7782,CLARK,MANAGER,7839,1981-06-09 00:00:00,2450,,10
7788,SCOTT,ANALYST,7566,1987-04-19 00:00:00,3000,,20
7839,KING,PRESIDENT,,1981-11-17 00:00:00,5000,,10
7844,TURNER,SALESMAN,7698,1981-09-08 00:00:00,1500,0,30
7876,ADAMS,CLERK,7788,1987-05-23 00:00:00,1100,,20
7900,JAMES,CLERK,7698,1981-12-03 00:00:00,950,,30
7902,FORD,ANALYST,7566,1981-12-03 00:00:00,3000,,20
7934,MILLER,CLERK,7782,1982-01-23 00:00:00,1300,,10
7948,JACCKA,CLERK,7782,1981-04-12 00:00:00,5000,,10
7956,WELAN,CLERK,7649,1982-07-20 00:00:00,2450,,10
7956,TEBAGE,CLERK,7748,1982-12-30 00:00:00,1300,,10

Filter expression

<aggregate_name>(<expression>[,...]) [filter (where <where_condition>)]

select sum(sal) filter (where deptno=10), sum(sal) filter (where deptno=20), sum(sal) filter (where deptno=30) from emp;

+------------+------------+------------+
| _c0        | _c1        | _c2        |
+------------+------------+------------+
| 17500      | 10875      | 9400       |
+------------+------------+------------+

ANY_VALUE

```
any_value(<colname>)
```

select any_value(ename) from emp;

+------------+
| _c0        |
+------------+
| SMITH      |
+------------+

select deptno, any_value(ename) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | CLARK      |
| 20         | SMITH      |
| 30         | ALLEN      |
+------------+------------+

APPROX_DISTINCT

```
approx_distinct(<colname>)
```

select approx_distinct(sal) from emp;

+-------------------+
| numdistinctvalues |
+-------------------+
| 12                |
+-------------------+

select deptno, approx_distinct(sal) from emp group by deptno;

+------------+-------------------+
| deptno     | numdistinctvalues |
+------------+-------------------+
| 10         | 3                 |
| 20         | 4                 |
| 30         | 5                 |
+------------+-------------------+

ARG_MAX

arg_max(<valueToMaximize>, <valueToReturn>)

select arg_max(sal, ename) from emp;

+------------+
| _c0        |
+------------+
| KING       |
+------------+

select deptno, arg_max(sal, ename) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | KING       |
| 20         | SCOTT      |
| 30         | BLAKE      |
+------------+------------+

ARG_MIN

arg_min(<valueToMinimize>, <valueToReturn>)

select arg_min(sal, ename) from emp;

+------------+
| _c0        |
+------------+
| SMITH      |
+------------+

select deptno, arg_min(sal, ename) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | MILLER     |
| 20         | SMITH      |
| 30         | JAMES      |
+------------+------------+
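
The ARG_MAX and ARG_MIN results for deptno 30 can be cross-checked with a short Python sketch. The `dept30` list is copied from the `emp` sample data, and the function names simply mirror the SQL functions for readability. How MaxCompute breaks ties between rows with equal salaries is not stated in this topic, so the sketch deliberately uses a group without a tie at either extreme.

```python
# (ename, sal) pairs for deptno 30 in the sample emp data
dept30 = [("ALLEN", 1600), ("WARD", 1250), ("MARTIN", 1250),
          ("BLAKE", 2850), ("TURNER", 1500), ("JAMES", 950)]

def arg_max(rows):
    """Return the name from the row with the largest salary."""
    name, _ = max(rows, key=lambda row: row[1])
    return name

def arg_min(rows):
    """Return the name from the row with the smallest salary."""
    name, _ = min(rows, key=lambda row: row[1])
    return name

print(arg_max(dept30), arg_min(dept30))  # BLAKE JAMES
```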

AVG

```
DECIMAL|DOUBLE  avg(<colname>)
```

Input type	Return type
TINYINT	DOUBLE
SMALLINT	DOUBLE
INT	DOUBLE
BIGINT	DOUBLE
FLOAT	DOUBLE
DOUBLE	DOUBLE
DECIMAL	DECIMAL

select avg(sal) from emp;

+------------+
| _c0        |
+------------+
| 2222.0588235294117 |
+------------+

select deptno, avg(sal) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 2916.6666666666665 |
| 20         | 2175.0     |
| 30         | 1566.6666666666667 |
+------------+------------+

BITWISE_AND_AGG

```
bigint bitwise_and_agg(bigint value)
```

select id, bitwise_and_agg(v) from
    values (1L, 2L), (1L, 1L), (2L, null), (1L, null) t(id, v) group by id;

+------------+------------+
| id         | _c1        |
+------------+------------+
| 1          | 0          |
| 2          | NULL       |
+------------+------------+

BITWISE_OR_AGG

```
bigint bitwise_or_agg(bigint value)
```

select id, bitwise_or_agg(v) from
    values (1L, 2L), (1L, 1L), (2L, null), (1L, null) t(id, v) group by id;

+------------+------------+
| id         | _c1        |
+------------+------------+
| 1          | 3          |
| 2          | NULL       |
+------------+------------+
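
The two bitwise examples behave as if NULL inputs were skipped and a group with only NULL inputs produced NULL. The Python sketch below models exactly that reading of the example output; the helper `bitwise_agg` is a hypothetical name, and `groups` repeats the literal values from the queries.

```python
from functools import reduce
from operator import and_, or_

def bitwise_agg(op, values):
    """Fold the non-NULL values with op; return None when nothing is left."""
    non_null = [v for v in values if v is not None]
    return reduce(op, non_null) if non_null else None

# id -> list of v, as in the values clause of the examples
groups = {1: [2, 1, None], 2: [None]}

and_result = {k: bitwise_agg(and_, v) for k, v in groups.items()}
or_result = {k: bitwise_agg(or_, v) for k, v in groups.items()}
print(and_result)  # {1: 0, 2: None}
print(or_result)   # {1: 3, 2: None}
```

For id 1, the binary values 10 and 01 share no set bit, so AND gives 0 and OR gives 3.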

COLLECT_LIST

```
array collect_list(<colname>)
```

select collect_list(sal) from emp;

+------------+
| _c0        |
+------------+
| [800,1600,1250,2975,1250,2850,2450,3000,5000,1500,1100,950,3000,1300,5000,2450,1300] |
+------------+

select deptno, collect_list(sal) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | [2450,5000,1300,5000,2450,1300] |
| 20         | [800,2975,3000,1100,3000] |
| 30         | [1600,1250,1250,2850,1500,950] |
+------------+------------+

select deptno, collect_list(distinct sal) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | [1300,2450,5000] |
| 20         | [800,1100,2975,3000] |
| 30         | [950,1250,1500,1600,2850] |
+------------+------------+

COLLECT_SET

```
array collect_set(<colname>)
```

select collect_set(sal) from emp;

+------------+
| _c0        |
+------------+
| [800,950,1100,1250,1300,1500,1600,2450,2850,2975,3000,5000] |
+------------+

select deptno, collect_set(sal) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | [1300,2450,5000] |
| 20         | [800,1100,2975,3000] |
| 30         | [950,1250,1500,1600,2850] |
+------------+------------+

COUNT

```
bigint count([distinct|all] <colname>)
```

select count(*) from emp;

+------------+
| _c0        |
+------------+
| 17         |
+------------+

select deptno, count(*) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 6          |
| 20         | 5          |
| 30         | 6          |
+------------+------------+

select count(distinct deptno) from emp;

+------------+
| _c0        |
+------------+
| 3          |
+------------+

COUNT_IF

```
bigint count_if(boolean <expr>)
```

select count_if(sal > 1000), count_if(sal <=1000) from emp;

+------------+------------+
| _c0        | _c1        |
+------------+------------+
| 15         | 2          |
+------------+------------+

COVAR_POP

double covar_pop(<colname1>, <colname2>)

--sal_new is the new salary column.
alter table emp add columns (sal_new bigint);
insert overwrite table emp select empno, ename, job, mgr, hiredate, sal, comm, deptno, sal+1000 from emp;

select covar_pop(sal, sal_new) from emp;

+------------+
| _c0        |
+------------+
| 1594550.1730103805 |
+------------+

select deptno, covar_pop(sal, sal_new) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 2390555.5555555555 |
| 20         | 1009500.0  |
| 30         | 372222.2222222222 |
+------------+------------+

COVAR_SAMP

double covar_samp(<colname1>, <colname2>)

--sal_new is the new salary column.
alter table emp add columns (sal_new bigint);
insert overwrite table emp select empno, ename, job, mgr, hiredate, sal, comm, deptno, sal+1000 from emp;

select covar_samp(sal, sal_new) from emp;

+------------+
| _c0        |
+------------+
| 1694209.5588235292 |
+------------+

select deptno, covar_samp(sal, sal_new) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 2868666.6666666665 |
| 20         | 1261875.0  |
| 30         | 446666.6666666666 |
+------------+------------+
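
Because sal_new is defined as sal+1000, the covariance of the two columns equals the variance of sal, which is why the COVAR_POP and COVAR_SAMP results coincide with the VAR_POP and VAR_SAMP results for the same data. The following Python sketch recomputes the deptno 20 values from the textbook definitions; the function `covar` is a hypothetical helper, and the lists are copied from the `emp` sample data.

```python
def covar(xs, ys, sample=False):
    """Population covariance by default; sample covariance when sample=True."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Population divides by n, sample divides by n - 1.
    return total / (n - 1 if sample else n)

sal = [800, 2975, 3000, 1100, 3000]   # sal for deptno 20
sal_new = [s + 1000 for s in sal]     # same shift as the insert overwrite above

covar_pop_20 = covar(sal, sal_new)
covar_samp_20 = covar(sal, sal_new, sample=True)
print(covar_pop_20, covar_samp_20)  # 1009500.0 1261875.0
```

Shifting one column by a constant moves its mean by the same constant, so every deviation from the mean, and therefore the covariance, is unchanged.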

HISTOGRAM

```
map<K, bigint> histogram(K input);
```

select histogram(a) from values
    ('hi'), (null), ('apple'), ('pie'), ('apple') t(a);

+----------------------------+
| _c0                        |
+----------------------------+
| {"pie":1,"hi":1,"apple":2} |
+----------------------------+
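
The result above is consistent with counting each non-NULL input value. A rough Python equivalent, using `collections.Counter` and the literal values from the query (the variable names are chosen for this sketch only):

```python
from collections import Counter

values = ["hi", None, "apple", "pie", "apple"]

# The NULL row does not appear in the example output, so it is skipped here.
hist = dict(Counter(v for v in values if v is not None))
print(hist)  # {'hi': 1, 'apple': 2, 'pie': 1}
```

The key order of the returned map is not meaningful; only the key-to-count pairs matter.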

MAP_AGG

```
map<K, V> map_agg(K a, V b);
```

select map_agg(a, b) from
        values (1L, 'apple'), (2L, 'hi'), (null, 'good'), (1L, 'pie') t(a, b);

+------------------------+
| _c0                    |
+------------------------+
| {"2":"hi","1":"apple"} |
+------------------------+

MAP_UNION

```
map<K, V> map_union(map<K, V> input);
```

select map_union(a) from values
    (map(1L, 'hi', 2L, 'apple', 3L, 'pie')), (map(1L, 'good', 4L, 'this')), (null) t(a);

+-----------------------------------------------+
| _c0                                           |
+-----------------------------------------------+
| {"4":"this","1":"good","2":"apple","3":"pie"} |
+-----------------------------------------------+

MAP_UNION_SUM

map<K, V> map_union_sum(map<K, V> input);

select map_union_sum(a) from values
    (map('hi', 2L, 'apple', 3L, 'pie', 1L)), (map('apple', null, 'hi', 4L)), (null) t(a);

+----------------------------+
| _c0                        |
+----------------------------+
| {"apple":3,"hi":6,"pie":1} |
+----------------------------+
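
The example output suggests that NULL maps are skipped and that a NULL value adds nothing to the sum for its key. The Python sketch below models that reading with plain dictionaries. The function name mirrors the SQL function for readability, and the case of a key whose values are all NULL is not covered by the example, so the sketch makes no claim about it.

```python
def map_union_sum(maps):
    """Union the input maps, adding up the values that share a key."""
    out = {}
    for m in maps:
        if m is None:          # a NULL map contributes nothing
            continue
        for key, value in m.items():
            # Assumption drawn from the example: a NULL value is treated as absent.
            out[key] = out.get(key, 0) + (value if value is not None else 0)
    return out

inputs = [
    {"hi": 2, "apple": 3, "pie": 1},
    {"apple": None, "hi": 4},
    None,
]
merged = map_union_sum(inputs)
print(merged)  # {'hi': 6, 'apple': 3, 'pie': 1}
```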

MAX

```
max(<colname>)
```

select max(sal) from emp;

+------------+
| _c0        |
+------------+
| 5000       |
+------------+

select deptno, max(sal) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 5000       |
| 20         | 3000       |
| 30         | 2850       |
+------------+------------+

MAX_BY

max_by(<valueToReturn>,<valueToMaximize>)

select max_by(ename,sal) from emp;

+------------+
| _c0        |
+------------+
| KING       |
+------------+

select deptno, max_by(ename,sal) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | KING       |
| 20         | SCOTT      |
| 30         | BLAKE      |
+------------+------------+

MEDIAN

double median(double <colname>)
decimal median(decimal <colname>)

Input type	Return type
TINYINT	DOUBLE
SMALLINT	DOUBLE
INT	DOUBLE
BIGINT	DOUBLE
FLOAT	DOUBLE
DOUBLE	DOUBLE
DECIMAL	DECIMAL

select median(sal) from emp;

+------------+
| _c0        |
+------------+
| 1600.0     |
+------------+

select deptno, median(sal) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 2450.0     |
| 20         | 2975.0     |
| 30         | 1375.0     |
+------------+------------+
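
The per-department medians can be verified with Python's `statistics.median`, which averages the two middle values when a group has an even number of rows. The dictionary `sal_by_dept` is copied from the `emp` sample data; the conversion to `float` only mirrors the DOUBLE return type for BIGINT input.

```python
import statistics

# sal values per deptno, from the sample emp data
sal_by_dept = {
    10: [2450, 5000, 1300, 5000, 2450, 1300],
    20: [800, 2975, 3000, 1100, 3000],
    30: [1600, 1250, 1250, 2850, 1500, 950],
}

medians = {dept: float(statistics.median(sal)) for dept, sal in sal_by_dept.items()}
print(medians)  # {10: 2450.0, 20: 2975.0, 30: 1375.0}
```

For deptno 30 the sorted values are 950, 1250, 1250, 1500, 1600, 2850, so the median is the mean of 1250 and 1500.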

MIN

```
min(<colname>)
```

select min(sal) from emp;

+------------+
| _c0        |
+------------+
| 800        |
+------------+

select deptno, min(sal) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 1300       |
| 20         | 800        |
| 30         | 950        |
+------------+------------+

MIN_BY

min_by(<valueToReturn>,<valueToMinimize>)

select min_by(ename,sal) from emp;

+------------+
| _c0        |
+------------+
| SMITH      |
+------------+

select deptno, min_by(ename,sal) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | MILLER     |
| 20         | SMITH      |
| 30         | JAMES      |
+------------+------------+

MULTIMAP_AGG

map<K, array<V>> multimap_agg(K a, V b);

select multimap_agg(a, b) from
        values (1L, 'apple'), (2L, 'hi'), (null, 'good'), (1L, 'pie') t(a, b);

+----------------------------------+
| _c0                              |
+----------------------------------+
| {"2":["hi"],"1":["apple","pie"]} |
+----------------------------------+
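
The grouping performed by MULTIMAP_AGG resembles building a dictionary of lists. A small Python sketch with the same input pairs follows. Skipping the row with a NULL key is inferred from the example output, where 'good' does not appear, and the variable names are chosen for this sketch only.

```python
from collections import defaultdict

pairs = [(1, "apple"), (2, "hi"), (None, "good"), (1, "pie")]

grouped = defaultdict(list)
for key, value in pairs:
    if key is not None:            # the NULL-key row is absent from the result
        grouped[key].append(value)

multimap = dict(grouped)
print(multimap)  # {1: ['apple', 'pie'], 2: ['hi']}
```

MAP_AGG differs in that it keeps a single value per key, so key 1 maps to one string there instead of an array.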

NUMERIC_HISTOGRAM

map<double key, double value> numeric_histogram(bigint <buckets>,
                                                double <colname>
                                                [, double <weight>])

select numeric_histogram(5, sal) from emp;

+------------+
| _c0        |
+------------+
| {"1328.5714285714287":7.0,"2450.0":2.0,"5000.0":2.0,"875.0":2.0,"2956.25":4.0} |
+------------+

select numeric_histogram(5, sal, deptno) from emp;

+------------+
| _c0        |
+------------+
| {"2944.4444444444443":90.0,"2450.0":20.0,"5000.0":20.0,"890.0":50.0,"1350.0":160.0} |
+------------+

PERCENTILE

double percentile(bigint <colname>, <p>)
--Returns the exact results for multiple percentiles as an array.
array percentile(bigint <colname>, array(<p1> [, <p2>...]))

select percentile(sal, 0.3) from emp;

+------------+
| _c0        |
+------------+
| 1290.0     |
+------------+

select deptno, percentile(sal, 0.3) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 1875.0     |
| 20         | 1475.0     |
| 30         | 1250.0     |
+------------+------------+

set odps.sql.type.system.odps2=true;
select deptno, percentile(sal, array(0.3, 0.5, 0.8)) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | [1875.0,2450.0,5000.0] |
| 20         | [1475.0,2975.0,3000.0] |
| 30         | [1250.0,1375.0,1600.0] |
+------------+------------+
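
The PERCENTILE results above are consistent with linear interpolation between the two nearest sorted values, at the zero-based position p × (n − 1). This is an inference from the example output rather than a statement about the engine's implementation. The Python sketch below applies that rule; the function name mirrors the SQL function, and `sal_by_dept` is copied from the `emp` sample data.

```python
import math

def percentile(values, p):
    """Linear interpolation at zero-based position p * (n - 1) in the sorted values."""
    ordered = sorted(values)
    pos = p * (len(ordered) - 1)
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower             # how far pos lies between the two neighbours
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])

sal_by_dept = {
    10: [2450, 5000, 1300, 5000, 2450, 1300],
    20: [800, 2975, 3000, 1100, 3000],
    30: [1600, 1250, 1250, 2850, 1500, 950],
}

for dept, sal in sal_by_dept.items():
    print(dept, [percentile(sal, p) for p in (0.3, 0.5, 0.8)])
```

For deptno 10 and p = 0.3, the position is 1.5, halfway between the sorted values 1300 and 2450, which gives 1875.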

PERCENTILE_APPROX

double percentile_approx (double <colname>[, double <weight>], <p> [, <B>])
--Returns the approximate results for multiple percentiles as an array.
array<double> percentile_approx (double <colname>
                                 [, double <weight>],
                                 array(<p1> [, <p2>...])
                                 [, <B>])

select percentile_approx(sal, 0.3) from emp;

+------------+
| _c0        |
+------------+
| 1252.5     |
+------------+

select deptno, percentile_approx(sal, 0.3) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 1300.0     |
| 20         | 950.0      |
| 30         | 1070.0     |
+------------+------------+

set odps.sql.type.system.odps2=true;
select deptno, percentile_approx(sal, array(0.3, 0.5, 0.8), 1000) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | [1300.0,1875.0,3470.000000000001] |
| 20         | [950.0,2037.5,2987.5] |
| 30         | [1070.0,1250.0,1580.0] |
+------------+------------+

select deptno, percentile_approx(sal, deptno, array(0.3, 0.5, 0.8), 1000)
  from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | [1300.0,1875.0,3470.0] |
| 20         | [950.0,2037.5,2987.5] |
| 30         | [1070.0,1250.0,1580.0] |
+------------+------------+

STDDEV

double stddev(double <colname>)
decimal stddev(decimal <colname>)

Input type	Return type
TINYINT	DOUBLE
SMALLINT	DOUBLE
INT	DOUBLE
BIGINT	DOUBLE
FLOAT	DOUBLE
DOUBLE	DOUBLE
DECIMAL	DECIMAL

select stddev(sal) from emp;

+------------+
| _c0        |
+------------+
| 1262.7549932628976 |
+------------+

select deptno, stddev(sal) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 1546.1421524412158 |
| 20         | 1004.7387720198718 |
| 30         | 610.1001739241043 |
+------------+------------+

STDDEV_SAMP

double stddev_samp(double <colname>)
decimal stddev_samp(decimal <colname>)

Input type	Return type
TINYINT	DOUBLE
SMALLINT	DOUBLE
INT	DOUBLE
BIGINT	DOUBLE
FLOAT	DOUBLE
DOUBLE	DOUBLE
DECIMAL	DECIMAL

select stddev_samp(sal) from emp;

+------------+
| _c0        |
+------------+
| 1301.6180541247609 |
+------------+

select deptno, stddev_samp(sal) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 1693.7138680032901 |
| 20         | 1123.3320969330487 |
| 30         | 668.3312551921141 |
+------------+------------+

SUM

DECIMAL|DOUBLE|BIGINT  sum(<colname>)

Input type	Return type
TINYINT	BIGINT
SMALLINT	BIGINT
INT	BIGINT
BIGINT	BIGINT
FLOAT	DOUBLE
DOUBLE	DOUBLE
DECIMAL	DECIMAL

select sum(sal) from emp;

+------------+
| _c0        |
+------------+
| 37775      |
+------------+

select deptno, sum(sal) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 17500      |
| 20         | 10875      |
| 30         | 9400       |
+------------+------------+

VAR_SAMP

```
double var_samp(<colname>)
```

select var_samp(sal) from emp;

+------------+
| _c0        |
+------------+
| 1694209.5588235292 |
+------------+

select deptno, var_samp(sal) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 2868666.666666667 |
| 20         | 1261875.0  |
| 30         | 446666.6666666667 |
+------------+------------+

VARIANCE/VAR_POP

double variance(<colname>)
double var_pop(<colname>)

select variance(sal) from emp;
--Equivalent to the following statement.
select var_pop(sal) from emp;

+------------+
| _c0        |
+------------+
| 1594550.1730103805 |
+------------+

select deptno, variance(sal) from emp group by deptno;
--Equivalent to the following statement.
select deptno, var_pop(sal) from emp group by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 2390555.5555555555 |
| 20         | 1009500.0  |
| 30         | 372222.22222222225 |
+------------+------------+
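
The four dispersion functions are tied together by two rules: the population forms divide by n while the sample forms divide by n − 1, and each standard deviation is the square root of the matching variance. The Python sketch below recomputes the whole-table figures with the `statistics` module. The list `sal` is copied from the `emp` sample data, and the last printed digits may differ slightly from the engine's output because of floating-point rounding.

```python
import math
import statistics

# All 17 sal values from the sample emp data
sal = [800, 1600, 1250, 2975, 1250, 2850, 2450, 3000, 5000,
       1500, 1100, 950, 3000, 1300, 5000, 2450, 1300]

var_pop = statistics.pvariance(sal)    # counterpart of VARIANCE / VAR_POP
var_samp = statistics.variance(sal)    # counterpart of VAR_SAMP
stddev = math.sqrt(var_pop)            # counterpart of STDDEV
stddev_samp = math.sqrt(var_samp)      # counterpart of STDDEV_SAMP

print(var_pop, var_samp)
print(stddev, stddev_samp)
```

With 17 rows, the sample variance is the population variance multiplied by 17/16, which matches the ratio between the VAR_SAMP and VARIANCE results shown in this topic.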

WM_CONCAT

string wm_concat(string <separator>, string <colname>)

select wm_concat(',', ename) from emp;

+------------+
| _c0        |
+------------+
| SMITH,ALLEN,WARD,JONES,MARTIN,BLAKE,CLARK,SCOTT,KING,TURNER,ADAMS,JAMES,FORD,MILLER,JACCKA,WELAN,TEBAGE |
+------------+

select deptno, wm_concat(',', ename) from emp group by deptno order by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | CLARK,KING,MILLER,JACCKA,WELAN,TEBAGE |
| 20         | SMITH,JONES,SCOTT,ADAMS,FORD |
| 30         | ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES |
+------------+------------+

select deptno, wm_concat(distinct ',', sal) from emp group by deptno order by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 1300,2450,5000 |
| 20         | 1100,2975,3000,800 |
| 30         | 1250,1500,1600,2850,950 |
+------------+------------+

select deptno, wm_concat(',',sal) within group(order by sal) from emp group by deptno order by deptno;

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 1300,1300,2450,2450,5000,5000 |
| 20         | 800,1100,2975,3000,3000 |
| 30         | 950,1250,1250,1500,1600,2850 |
+------------+------------+
