Java 8: Collecting the Elements of a Stream

Collecting the elements of a Stream means converting the Stream into an ordinary data structure such as an array, a List, or a Set.

1. Converting to an array

Stream::toArray() accepts an array constructor reference that determines the array type. Called without arguments, it returns an Object[].

Stream<Integer> nums = Stream.of(1, 2, 3);
//Object[] objectArray = nums.toArray();
Integer[] intArray = nums.toArray(Integer[]::new);
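
The two forms of toArray can be compared in a small runnable sketch. The class name ToArrayDemo and its method names are made up for illustration, and the IntStream variant is included only to show that primitive Streams return primitive arrays.

```java
import java.util.Arrays;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class ToArrayDemo {
    // Without a generator the result is an Object[], even for a Stream<Integer>.
    static Object[] untyped() {
        return Stream.of(1, 2, 3).toArray();
    }

    // Passing an array constructor reference yields a typed array.
    static Integer[] typed() {
        return Stream.of(1, 2, 3).toArray(Integer[]::new);
    }

    // Primitive Streams return primitive arrays directly.
    static int[] primitive() {
        return IntStream.rangeClosed(1, 3).toArray();
    }

    public static void main(String[] args) {
        System.out.println(untyped().getClass().getSimpleName()); // Object[]
        System.out.println(Arrays.toString(typed()));             // [1, 2, 3]
        System.out.println(Arrays.toString(primitive()));         // [1, 2, 3]
    }
}
```

Each method builds its own Stream, because a Stream can only be consumed by one terminal operation.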

2. Converting to an Iterator

Iterator is how collections were traversed before Java 8. Calling Stream::iterator() returns an Iterator over the elements of the Stream.

Stream<String> words = Stream.of("city", "country", "world");
Iterator<String> iterator = words.iterator();
while (iterator.hasNext()) {
  System.out.println(iterator.next());
}
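
3. Converting to an arbitrary container

Stream::collect(Supplier, BiConsumer, BiConsumer) takes three arguments:

A supplier that creates an instance of the target type, for example the HashSet constructor.
An accumulator that adds an element to the target, for example the add method.
A combiner that merges two target objects into one, for example the addAll method.
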
Converting a Stream to a Map (converting to a Set or a List works the same way). [Rant] Eclipse has trouble inferring complex types. The Eclipse I use is the latest at the time of writing, Version: Mars.2 (4.5.2), yet it cannot infer the types of map and otherMap in the lambda expressions below. As the commented-out lines show, Eclipse needs explicit casts before the code compiles. NetBeans infers the lambda parameter types without help.

Stream<String> words = Stream.of("city", "country", "world");
HashMap<String, Integer> resultMap = words.collect(() -> {
  return new HashMap<String, Integer>();
}, (map, word) -> {
  map.put(word, word.length());
  //Map.class.cast(map).put(word, word.length());
}, (map, otherMap) -> {
  map.putAll(otherMap);
  //Map.class.cast(map).putAll((Map) otherMap);
});

The target does not have to be a collection. It can be a StringBuilder or any container type you write yourself. Converting a Stream to a StringBuilder:

Stream<String> words = Stream.of("city", "country", "world");
StringBuilder builder = words.collect(StringBuilder::new, (strBuilder, s) -> {
  strBuilder.append(s);
}, (strBuilder, otherStrBuilder) -> {
  strBuilder.append(otherStrBuilder);
});
System.out.println(builder.toString());
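
The combiner (third argument) only matters when partial results have to be merged, which is what happens on a parallel Stream. The following is a minimal sketch of that case; the class name ThreeArgCollectDemo and the method lengths are hypothetical, and the target is an ArrayList of word lengths rather than a Map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ThreeArgCollectDemo {
    // Collects the length of every word into an ArrayList.
    static ArrayList<Integer> lengths(List<String> words) {
        return words.parallelStream().collect(
                () -> new ArrayList<Integer>(),          // supplier: a fresh container per chunk
                (list, word) -> list.add(word.length()), // accumulator: add one element
                (left, right) -> left.addAll(right));    // combiner: merge two partial results
    }

    public static void main(String[] args) {
        System.out.println(lengths(Arrays.asList("city", "country", "world"))); // [4, 7, 5]
    }
}
```

Because the source List is ordered, collect keeps the encounter order even though the chunks are processed in parallel.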
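
4. Converting to a Set or a List

Passing three arguments to collect just to build a collection is tedious. In practice there is no need, because Collectors provides factory methods for the common cases.

Producing a List or a Set:

stream.collect(Collectors.toList());
stream.collect(Collectors.toSet());

In the current JDK implementation these return an ArrayList and a HashSet, although the API makes no guarantee about the concrete type. To choose the List or Set implementation yourself, use toCollection:

stream.collect(Collectors.toCollection(LinkedList::new));
stream.collect(Collectors.toCollection(TreeSet::new));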
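
As a small illustration of why the concrete type can matter, the sketch below collects the same words into a TreeSet (sorted, no duplicates) and a LinkedList (encounter order, duplicates kept). The class and method names are invented for this example.

```java
import java.util.LinkedList;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ToCollectionDemo {
    // TreeSet sorts the elements and drops duplicates.
    static TreeSet<String> sortedUnique(String... words) {
        return Stream.of(words).collect(Collectors.toCollection(TreeSet::new));
    }

    // LinkedList keeps encounter order and duplicates.
    static LinkedList<String> linked(String... words) {
        return Stream.of(words).collect(Collectors.toCollection(LinkedList::new));
    }

    public static void main(String[] args) {
        System.out.println(sortedUnique("b", "a", "b")); // [a, b]
        System.out.println(linked("b", "a", "b"));       // [b, a, b]
    }
}
```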

5. Joining strings

This is similar to Joiner in Guava. When every element of the Stream is a String, the elements can be joined into a single string with an optional delimiter, prefix, and suffix. A Stream can be consumed only once, so each call below starts from a fresh Stream.

List<String> strs = Arrays.asList("Hello", "the", "stream");
// No delimiter
String hail1 = strs.stream().collect(Collectors.joining());
// Result: Hellothestream
// Specify a delimiter
String hail2 = strs.stream().collect(Collectors.joining(", "));
// Result: Hello, the, stream
// Specify a delimiter, prefix and suffix.
String hail3 = strs.stream().collect(Collectors.joining(", ", "__", "**"));
// Result: __Hello, the, stream**
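
The three joining overloads can be tried in a self-contained form as follows. This is only a sketch: the class JoiningDemo and its methods are hypothetical, and the reuseFails method is added to show that collecting the same Stream object a second time throws an IllegalStateException.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class JoiningDemo {
    static final List<String> WORDS = Arrays.asList("Hello", "the", "stream");

    static String plain() {
        return WORDS.stream().collect(Collectors.joining());
    }

    static String delimited() {
        return WORDS.stream().collect(Collectors.joining(", "));
    }

    static String wrapped() {
        return WORDS.stream().collect(Collectors.joining(", ", "__", "**"));
    }

    // A second terminal operation on the same Stream fails at runtime.
    static boolean reuseFails() {
        Stream<String> once = WORDS.stream();
        once.collect(Collectors.joining());
        try {
            once.collect(Collectors.joining());
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(plain());      // Hellothestream
        System.out.println(delimited());  // Hello, the, stream
        System.out.println(wrapped());    // __Hello, the, stream**
        System.out.println(reuseFails()); // true
    }
}
```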

6. SummaryStatistics for object Streams

For Streams of primitive types, summaryStatistics() produces the matching IntSummaryStatistics, DoubleSummaryStatistics, or LongSummaryStatistics.

An object Stream has no summaryStatistics() method, but Collectors.summarizingInt, summarizingDouble, and summarizingLong produce the same SummaryStatistics objects.

Stream<Beef> beefs = ...;
// Assumes Beef::getPrice returns an int here; use summarizingDouble for a double price.
IntSummaryStatistics beefPriceSummary =
    beefs.collect(Collectors.summarizingInt(Beef::getPrice));
beefPriceSummary.getAverage();
beefPriceSummary.getSum();
beefPriceSummary.getMax();
beefPriceSummary.getMin();
beefPriceSummary.getCount();
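
A runnable sketch of summarizingInt follows. Item is a hypothetical stand-in for the Beef class with an int price, and the prices are made-up values chosen so the results are easy to check.

```java
import java.util.IntSummaryStatistics;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class SummarizingDemo {
    // Hypothetical element type with a single int property.
    static class Item {
        private final int price;
        Item(int price) { this.price = price; }
        int getPrice() { return price; }
    }

    // Builds a Stream<Item> from the prices and summarizes the price property.
    static IntSummaryStatistics priceStats(int... prices) {
        return IntStream.of(prices)
                .mapToObj(Item::new)
                .collect(Collectors.summarizingInt(Item::getPrice));
    }

    public static void main(String[] args) {
        IntSummaryStatistics stats = priceStats(10, 20, 60);
        System.out.println(stats.getCount());   // 3
        System.out.println(stats.getSum());     // 90
        System.out.println(stats.getMin());     // 10
        System.out.println(stats.getMax());     // 60
        System.out.println(stats.getAverage()); // 30.0
    }
}
```

For an empty Stream, getMin() returns Integer.MAX_VALUE, getMax() returns Integer.MIN_VALUE, and getAverage() returns 0.0, so it is worth checking getCount() before trusting the other values.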
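
7. Converting to a Map

Collectors.toMap takes the following arguments:

Argument 1: a function that produces the map key.
Argument 2: a function that produces the map value.
[Overload] Argument 3: a merge function that decides what happens when several elements share a key. Without it, a duplicate key throws java.lang.IllegalStateException.
[Overload] Argument 4: a supplier for the Map to create. The default is a HashMap.

// A Stream can be consumed only once, so each call below opens a new Stream from the List.
List<Beef> beefs = Arrays.asList(new Beef(1, 12.5), new Beef(1, 12), new Beef(2, 8.8));
// Specify map key and map value.
// With this data the call throws IllegalStateException, because two elements share the key 1.
Map<Integer, Double> map1 = beefs.stream().collect(Collectors.toMap(
		Beef::getId,
		Beef::getPrice));
// Handle two values that have the same key by keeping the existing one.
Map<Integer, Double> map2 = beefs.stream().collect(Collectors.toMap(
		Beef::getId,
		Beef::getPrice,
		(existingValue, newValue) -> existingValue));
// Use a TreeMap instead of the default HashMap.
// The merge function here still throws on the duplicate key 1.
Map<Integer, Double> map3 = beefs.stream().collect(Collectors.toMap(
		Beef::getId,
		Beef::getPrice,
		(existingValue, newValue) -> { throw new IllegalStateException(); },
		TreeMap::new));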
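
The behaviour on duplicate keys is easy to verify with a self-contained sketch. The Beef class below is a hypothetical reconstruction (an int id and a double price), and the helper names are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;

class ToMapDemo {
    // Hypothetical element type: an id and a price.
    static class Beef {
        private final int id;
        private final double price;
        Beef(int id, double price) { this.id = id; this.price = price; }
        int getId() { return id; }
        double getPrice() { return price; }
    }

    static List<Beef> sample() {
        return Arrays.asList(new Beef(1, 12.5), new Beef(1, 12), new Beef(2, 8.8));
    }

    // The two-argument toMap rejects duplicate keys.
    static boolean twoArgFormThrows() {
        try {
            sample().stream().collect(Collectors.toMap(Beef::getId, Beef::getPrice));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // A merge function that keeps the first value, collected into a TreeMap.
    static TreeMap<Integer, Double> keepFirst() {
        return sample().stream().collect(Collectors.toMap(
                Beef::getId,
                Beef::getPrice,
                (existingValue, newValue) -> existingValue,
                TreeMap::new));
    }

    public static void main(String[] args) {
        System.out.println(twoArgFormThrows()); // true
        System.out.println(keepFirst());        // {1=12.5, 2=8.8}
    }
}
```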

Each of the three toMap methods has a matching toConcurrentMap method, intended for parallel Streams where it can be more efficient. Collectors offers concurrent collectors only for Maps, not for List or Set. Function.identity() stands for the element itself and is equivalent to (element) -> element.

Stream<String> words = Stream.of("Sun", "Earth", "Moon");
ConcurrentHashMap<String, Integer> wordMap =
    words.parallel().collect(Collectors.toConcurrentMap(
        Function.identity(),
        String::length,
        (existingValue, newValue) -> newValue,
        ConcurrentHashMap::new));
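
8. Grouping a Stream

Every example below works on a Stream of Locale objects, so here is a short introduction to Locale.

java.util.Locale is a data class that ships with the JDK. The first constructor argument is the language code, the second is the country code, and the third (the variant) is rarely used. For mainland China the language code is zh and the country code is CN, so new Locale("zh", "CN") constructs the Locale for mainland China. Locale.getAvailableLocales() returns an array of all Locales the Java runtime supports.

Note: language codes are defined by ISO 639 and country codes by ISO 3166, which makes them convenient for international exchange. Taiwan is new Locale("zh", "TW") and Hong Kong is new Locale("zh", "HK").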

8.1 Grouping by country

groupingBy(Function) produces a Map whose keys are the values returned by the Function. In the example below, the Map key is the country code.

Stream<Locale> locales = Stream.of(Locale.getAvailableLocales());
Map<String, List<Locale>> countryToLocales =
    locales.collect(Collectors.groupingBy(Locale::getCountry));
System.out.println(countryToLocales.get("US"));
// Example output (depends on the JDK): [en_US, es_US]

P.S. groupingByConcurrent() is the counterpart for parallel Streams and can speed up processing.
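
Because Locale.getAvailableLocales() differs between JDK versions, a fixed set of Locales gives a reproducible result. The following sketch groups five hand-picked Locales by country; the class and method names are made up for this example.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class GroupByCountryDemo {
    // Groups the given Locales by country code; each value keeps encounter order.
    static Map<String, List<Locale>> byCountry(Locale... locales) {
        return Stream.of(locales).collect(Collectors.groupingBy(Locale::getCountry));
    }

    public static void main(String[] args) {
        Map<String, List<Locale>> grouped = byCountry(
                Locale.US, new Locale("es", "US"),
                Locale.CANADA, Locale.CANADA_FRENCH,
                Locale.FRANCE);
        System.out.println(grouped.get("US")); // [en_US, es_US]
        System.out.println(grouped.get("CA")); // [en_CA, fr_CA]
        System.out.println(grouped.get("FR")); // [fr_FR]
    }
}
```

A key that no element maps to is simply absent, so grouped.get("DE") returns null here.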
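
8.2 Splitting into two groups: English-speaking and not

Use Collectors.partitioningBy(Predicate). The Predicate returns a boolean; elements that yield true form one group and elements that yield false form the other.

Map<Boolean, List<Locale>> enToLocales = locales.collect(Collectors.partitioningBy(
    locale -> locale.getLanguage().equals("en")));
System.out.print(enToLocales.get(true).size());
// Example output (depends on the JDK): 12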
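
A deterministic variant of the partitioning example, using a few fixed Locales, might look like this. The names PartitionDemo and englishOrNot are hypothetical.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PartitionDemo {
    // true -> Locales whose language is English, false -> all others.
    static Map<Boolean, List<Locale>> englishOrNot(Locale... locales) {
        return Stream.of(locales).collect(Collectors.partitioningBy(
                locale -> locale.getLanguage().equals("en")));
    }

    public static void main(String[] args) {
        Map<Boolean, List<Locale>> parts = englishOrNot(Locale.US, Locale.FRANCE, Locale.UK);
        System.out.println(parts.get(true));  // [en_US, en_GB]
        System.out.println(parts.get(false)); // [fr_FR]
    }
}
```

Unlike groupingBy, the Map returned by partitioningBy contains both the true and the false key even when one of the groups is empty.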
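
8.3 Processing the groups while grouping

In the examples above, each group's value is a List containing all elements of the group. A downstream collector gives more control over that value. To keep the code short, the examples below assume a static import of java.util.stream.Collectors.*.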

Collecting each group into a Set

The default value container is a List. Sometimes a Set is a better fit because it removes duplicates.

Map<String, Set<Locale>> countryToLocaleSet =
    locales.collect(groupingBy(Locale::getCountry, toSet()));
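
Counting the elements of each group

counting() counts the elements in each group. The example below counts the Locales of each country, which is roughly the number of languages spoken there.

Map<String, Long> countryToLocaleCounts =
    locales.collect(groupingBy(Locale::getCountry, counting()));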

Summing the elements of each group, or one of their properties

summingInt, summingLong, and summingDouble take a Function and sum its return values. The example below totals the population of each State.

Map<State, Integer> stateToPopulation =
    cities.collect(groupingBy(City::getState, summingInt(City::getPopulation)));
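
Finding the maximum and minimum of each group

maxBy and minBy take a Comparator and find the largest and smallest element. The example below finds the most populous city of each State.

Map<State, Optional<City>> stateToLargeCity =
    cities.collect(groupingBy(City::getState, maxBy(Comparator.comparing(City::getPopulation))));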
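
The City and State types are not defined in this post, so the sketch below uses a hypothetical City class with a String state code and invented sample data to make the summingInt and maxBy examples runnable.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

class CityStatsDemo {
    // Hypothetical City type; the state is a plain String code here.
    static class City {
        private final String name;
        private final String state;
        private final int population;
        City(String name, String state, int population) {
            this.name = name;
            this.state = state;
            this.population = population;
        }
        String getName() { return name; }
        String getState() { return state; }
        int getPopulation() { return population; }
    }

    // Invented sample data.
    static List<City> sample() {
        return Arrays.asList(
                new City("Alpha", "X", 100),
                new City("Beta", "X", 250),
                new City("Gamma", "Y", 70));
    }

    static Map<String, Integer> populationByState() {
        return sample().stream().collect(Collectors.groupingBy(
                City::getState, Collectors.summingInt(City::getPopulation)));
    }

    static Map<String, Optional<City>> largestByState() {
        return sample().stream().collect(Collectors.groupingBy(
                City::getState,
                Collectors.maxBy(Comparator.comparingInt(City::getPopulation))));
    }

    public static void main(String[] args) {
        System.out.println(populationByState().get("X"));                   // 350
        System.out.println(largestByState().get("X").get().getName());      // Beta
    }
}
```

maxBy yields an Optional because the collector is also usable on an empty Stream; inside groupingBy every group has at least one element, so the Optional is always present there.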

Applying a mapping to each group

mapping(mapper, downstream) first maps (transforms) each element and then passes the mapped values to the downstream collector. The example below finds the longest city name in each State.

Map<State, Optional<String>> stateToLongestCityName =
    cities.collect(
        groupingBy(City::getState,
        mapping(City::getName,
            maxBy(Comparator.comparing(String::length)))));

For the earlier example of collecting the set of languages per country, mapping is a better solution:

Map<String, Set<String>> countryToLanguages =
    locales.collect(groupingBy(Locale::getCountry, mapping(l -> l.getLanguage(), toSet())));
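
To see counting and mapping side by side on reproducible input, here is a sketch over a few fixed Locales. The class name and both method names are hypothetical.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LanguagesByCountryDemo {
    // Number of Locales per country code.
    static Map<String, Long> localeCountByCountry(Locale... locales) {
        return Stream.of(locales).collect(Collectors.groupingBy(
                Locale::getCountry, Collectors.counting()));
    }

    // Distinct language codes per country code.
    static Map<String, Set<String>> languagesByCountry(Locale... locales) {
        return Stream.of(locales).collect(Collectors.groupingBy(
                Locale::getCountry,
                Collectors.mapping(Locale::getLanguage, Collectors.toSet())));
    }

    public static void main(String[] args) {
        Locale[] sample = {Locale.CANADA, Locale.CANADA_FRENCH, Locale.US};
        System.out.println(localeCountByCountry(sample).get("CA")); // 2
        System.out.println(languagesByCountry(sample).get("US"));   // [en]
    }
}
```

The mapping variant stores only the language codes, while grouping into a Set of Locale keeps the whole Locale objects.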

Producing SummaryStatistics for each group

summarizingInt, summarizingLong, and summarizingDouble produce an IntSummaryStatistics, LongSummaryStatistics, and DoubleSummaryStatistics respectively.

Inside groupingBy they can summarize an element or one of its properties directly. The example below produces the SummaryStatistics of the city populations in each State.

Map<State, IntSummaryStatistics> stateToCityPopulationSummary =
    cities.collect(groupingBy(City::getState, summarizingInt(city -> city.getPopulation())));

The same can be done through mapping. The example below gives the same result:

Map<State, IntSummaryStatistics> stateToCityPopulationSummary =
    cities.collect(groupingBy(City::getState,
                   mapping(City::getPopulation, summarizingInt(p -> p))));
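
Reducing each group

reducing(binaryOperator): reduces the elements themselves; the result is an Optional.
reducing(identity, binaryOperator): reduces starting from the given identity value.
reducing(identity, mapper, binaryOperator): maps each element with mapper, then reduces the mapped values starting from identity.

The example below groups cities by State and joins the city names of each group into one comma-separated string. The operator skips the delimiter while the accumulated string is still empty, so the result has no leading comma.

Map<State, String> stateToCityNames =
    cities.collect(groupingBy(
        City::getState, reducing("", City::getName,
            (c1, c2) -> c1.isEmpty() ? c2 : c1 + "," + c2)));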
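
The following sketch makes the reducing example runnable and compares it with mapping plus joining, which produces the same string. The City class, its String state code, and the sample data are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ReducingDemo {
    // Hypothetical City type with a String state code.
    static class City {
        private final String name;
        private final String state;
        City(String name, String state) { this.name = name; this.state = state; }
        String getName() { return name; }
        String getState() { return state; }
    }

    // Invented sample data.
    static List<City> sample() {
        return Arrays.asList(new City("Alpha", "X"), new City("Beta", "X"), new City("Gamma", "Y"));
    }

    // reducing with an empty-string identity; the operator avoids a leading comma.
    static Map<String, String> namesWithReducing() {
        return sample().stream().collect(Collectors.groupingBy(
                City::getState,
                Collectors.reducing("", City::getName,
                        (a, b) -> a.isEmpty() ? b : a + "," + b)));
    }

    // The same result expressed with mapping and joining.
    static Map<String, String> namesWithJoining() {
        return sample().stream().collect(Collectors.groupingBy(
                City::getState,
                Collectors.mapping(City::getName, Collectors.joining(","))));
    }

    public static void main(String[] args) {
        System.out.println(namesWithReducing().get("X")); // Alpha,Beta
        System.out.println(namesWithJoining().get("X"));  // Alpha,Beta
    }
}
```

A plain (c1, c2) -> c1 + "," + c2 operator would prepend a comma, because the first name is combined with the empty identity string.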
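
Summary of downstream collectors

Downstream collectors can produce very complex expressions. Use them only when building a "downstream" Map with groupingBy or partitioningBy. In all other cases, operate on the Stream directly.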