The Streams API

Collectors in Depth

By the time you reach this lesson you already know how to call collect(Collectors.toList()) to materialise a stream into a list. That is just the surface. The java.util.stream.Collectors utility class ships with over twenty factory methods, and four of them — groupingBy, partitioningBy, joining, and counting — cover the majority of real-world aggregation work. Mastering them lets you replace loops that span a dozen lines with a single, readable expression.

counting — the simplest aggregation

Collectors.counting() is a collector that counts the elements flowing into it and returns the result as a Long. On its own it is not very interesting, since stream.count() does the same job, but it becomes powerful when used as a downstream collector inside another collector.

import java.util.List;
import java.util.stream.Collectors;

List<String> words = List.of("apple", "fig", "banana", "avocado", "blueberry", "date");

long total = words.stream().collect(Collectors.counting());
System.out.println(total); // 6

You will see counting() again when we look at groupingBy.

groupingBy — splitting a stream into buckets

Collectors.groupingBy(classifier) sorts stream elements into a Map<K, List<T>>: the classifier function computes a key of type K for each element of type T, and every element that produces the same key lands in the same bucket.

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

record Employee(String name, String department, double salary) {}

List<Employee> employees = List.of(
    new Employee("Alice",   "Engineering", 95_000),
    new Employee("Bob",     "Engineering", 88_000),
    new Employee("Carol",   "Marketing",   72_000),
    new Employee("Dave",    "Marketing",   68_000),
    new Employee("Eve",     "HR",          61_000)
);

Map<String, List<Employee>> byDept =
    employees.stream()
             .collect(Collectors.groupingBy(Employee::department));

byDept.forEach((dept, list) ->
    System.out.println(dept + ": " + list.stream()
                                         .map(Employee::name)
                                         .toList()));
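// Prints one line per department, for example:
// Engineering: [Alice, Bob]
// Marketing: [Carol, Dave]
// HR: [Eve]
// (groupingBy does not guarantee the iteration order of the returned map)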

The real power arrives when you add a downstream collector as a second argument. Instead of collecting the bucket members into a list, you can aggregate them further:

// Count employees per department
Map<String, Long> countByDept =
    employees.stream()
             .collect(Collectors.groupingBy(
                 Employee::department,
                 Collectors.counting()
             ));
// e.g. {Engineering=2, Marketing=2, HR=1}; key order may differ

// Average salary per department
Map<String, Double> avgSalaryByDept =
    employees.stream()
             .collect(Collectors.groupingBy(
                 Employee::department,
                 Collectors.averagingDouble(Employee::salary)
             ));
// e.g. {Engineering=91500.0, Marketing=70000.0, HR=61000.0}; key order may differ
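
The two-argument groupingBy makes no guarantee about the type of map it returns (the current implementation uses a HashMap), so the key order printed above should not be relied on. If a report needs predictable ordering, one option is the three-argument overload, which takes a map factory between the classifier and the downstream collector. The following is a minimal sketch that reuses the lesson's Employee data; the class name SortedGroupingSketch and the method countByDeptSorted are invented for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class SortedGroupingSketch {

    record Employee(String name, String department, double salary) {}

    // Same sample data as the lesson.
    static final List<Employee> EMPLOYEES = List.of(
        new Employee("Alice", "Engineering", 95_000),
        new Employee("Bob",   "Engineering", 88_000),
        new Employee("Carol", "Marketing",   72_000),
        new Employee("Dave",  "Marketing",   68_000),
        new Employee("Eve",   "HR",          61_000)
    );

    // Three-argument groupingBy: classifier, map factory, downstream collector.
    // TreeMap keeps its keys in natural (alphabetical) order.
    static Map<String, Long> countByDeptSorted(List<Employee> employees) {
        return employees.stream()
                        .collect(Collectors.groupingBy(
                            Employee::department,
                            TreeMap::new,
                            Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(countByDeptSorted(EMPLOYEES));
        // {Engineering=2, HR=1, Marketing=2}
    }
}
```

Passing LinkedHashMap::new instead would keep departments in the order they are first encountered in the stream.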

Multi-level grouping: the downstream collector can itself be another groupingBy. You can create Map<String, Map<String, Long>> structures — department then seniority level, for example — with no imperative loops at all.
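
A two-level grouping of this kind might look like the sketch below. The lesson's Employee record has no seniority field, so the sketch assumes a hypothetical seniorityOf helper that labels anyone earning at least 90,000 as "Senior" and everyone else as "Junior". Both the threshold and the labels are invented for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MultiLevelGroupingSketch {

    record Employee(String name, String department, double salary) {}

    // Same sample data as the lesson.
    static final List<Employee> EMPLOYEES = List.of(
        new Employee("Alice", "Engineering", 95_000),
        new Employee("Bob",   "Engineering", 88_000),
        new Employee("Carol", "Marketing",   72_000),
        new Employee("Dave",  "Marketing",   68_000),
        new Employee("Eve",   "HR",          61_000)
    );

    // Hypothetical rule, invented for this sketch: the lesson defines no seniority levels.
    static String seniorityOf(Employee e) {
        return e.salary() >= 90_000 ? "Senior" : "Junior";
    }

    // Outer groupingBy keys on department; the inner groupingBy is its downstream
    // collector and keys on seniority, counting the members of each inner bucket.
    static Map<String, Map<String, Long>> headcount(List<Employee> employees) {
        return employees.stream()
                        .collect(Collectors.groupingBy(
                            Employee::department,
                            Collectors.groupingBy(
                                MultiLevelGroupingSketch::seniorityOf,
                                Collectors.counting())));
    }

    public static void main(String[] args) {
        Map<String, Map<String, Long>> result = headcount(EMPLOYEES);
        System.out.println(result.get("Engineering").get("Senior")); // 1
        System.out.println(result.get("Marketing"));                 // {Junior=2}
    }
}
```

Note that an inner map only contains the seniority levels that actually occur in that department, so result.get("Marketing").get("Senior") returns null rather than 0.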

partitioningBy — a boolean split

Collectors.partitioningBy(predicate) is a specialised form of grouping in which the key is always a Boolean. The result is a Map<Boolean, List<T>> with exactly two entries: true (the elements that satisfy the predicate) and false (the elements that do not).

Map<Boolean, List<Employee>> highEarners =
    employees.stream()
             .collect(Collectors.partitioningBy(
                 e -> e.salary() >= 80_000
             ));

System.out.println("High earners: " + highEarners.get(true)
                                                  .stream()
                                                  .map(Employee::name)
                                                  .toList());
// High earners: [Alice, Bob]

System.out.println("Others: " + highEarners.get(false)
                                            .stream()
                                            .map(Employee::name)
                                            .toList());
// Others: [Carol, Dave, Eve]

Like groupingBy, it accepts a downstream collector as a second argument:

Map<Boolean, Long> counts =
    employees.stream()
             .collect(Collectors.partitioningBy(
                 e -> e.salary() >= 80_000,
                 Collectors.counting()
             ));
// {false=3, true=2}

When to choose partitioningBy over groupingBy: whenever your classifier is inherently binary — active/inactive, pass/fail, above-threshold/below-threshold. partitioningBy makes the intent crystal-clear and always guarantees both keys exist in the result map (even if one bucket is empty), whereas groupingBy only includes keys that actually appear in the data.
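
The difference in empty-bucket behaviour can be seen in a small sketch. The scores and the pass mark of 50 are invented sample data, and the class and method names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class EmptyBucketSketch {

    // Both methods split scores at a hypothetical pass mark of 50.
    static Map<Boolean, List<Integer>> partition(List<Integer> scores) {
        return scores.stream()
                     .collect(Collectors.partitioningBy(s -> s >= 50));
    }

    static Map<Boolean, List<Integer>> group(List<Integer> scores) {
        return scores.stream()
                     .collect(Collectors.groupingBy(s -> s >= 50));
    }

    public static void main(String[] args) {
        List<Integer> allPassed = List.of(72, 88, 95); // nobody failed

        // partitioningBy still reports the empty "false" bucket.
        System.out.println(partition(allPassed)); // {false=[], true=[72, 88, 95]}

        // groupingBy only creates keys it has seen, so "false" is absent.
        System.out.println(group(allPassed));            // {true=[72, 88, 95]}
        System.out.println(group(allPassed).get(false)); // null
    }
}
```

With the partitioned map, callers can read get(false) without a null check; with the grouped map they would have to guard against the missing key.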

joining — assembling strings from a stream

Collectors.joining() concatenates the elements of a stream of CharSequence values (in practice, usually String) into a single string, in encounter order. Three overloads are available:

joining() — plain concatenation, no separator.
joining(delimiter) — elements separated by delimiter.
joining(delimiter, prefix, suffix) — wraps the result too.

List<String> tags = List.of("java", "streams", "collectors", "functional");

// plain
String plain = tags.stream().collect(Collectors.joining());
System.out.println(plain); // javastreamscollectorsfunctional

// comma-separated
String csv = tags.stream().collect(Collectors.joining(", "));
System.out.println(csv); // java, streams, collectors, functional

// SQL-style IN clause
String inClause = tags.stream()
                      .collect(Collectors.joining("', '", "('", "')"));
System.out.println(inClause); // ('java', 'streams', 'collectors', 'functional')

joining only works on streams whose elements are CharSequence values such as String. If your stream holds other objects you must call .map(Object::toString) (or a more specific mapper) before collecting. Forgetting this causes a compile-time type error.
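
As a minimal sketch of that mapping step, the example below joins a list of integer IDs. The class name, the joinIds method, and the ID values are all invented for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

class JoiningNonStringsSketch {

    static String joinIds(List<Integer> ids) {
        // ids.stream().collect(Collectors.joining(", ")) would not compile here,
        // because Integer is not a CharSequence. Mapping to String first fixes that.
        return ids.stream()
                  .map(Object::toString)
                  .collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        System.out.println(joinIds(List.of(101, 102, 103))); // [101, 102, 103]
        System.out.println(joinIds(List.of()));               // []
    }
}
```

The second call shows that the prefix and suffix are still emitted when the stream is empty.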

Composing collectors — a realistic example

Real code often combines several collectors in one expression. Suppose you need a report that shows, per department, the comma-separated list of employee names:

Map<String, String> namesByDept =
    employees.stream()
             .collect(Collectors.groupingBy(
                 Employee::department,
                 Collectors.mapping(
                     Employee::name,
                     Collectors.joining(", ")
                 )
             ));

namesByDept.forEach((dept, names) ->
    System.out.println(dept + " -> " + names));
// Prints, in no guaranteed order:
// Engineering -> Alice, Bob
// Marketing -> Carol, Dave
// HR -> Eve

Here Collectors.mapping() is used as a downstream adapter: it first maps each Employee to its name (a String), then feeds those strings into joining. This three-level composition replaces what would otherwise be a nested loop with a map of lists, a second loop, and a StringBuilder.
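
For comparison, the loop-based version described above might look roughly like the following sketch. It is one plausible way to write it, not the only one; the class and method names are hypothetical, and the collector version is repeated so the two results can be compared.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ImperativeEquivalentSketch {

    record Employee(String name, String department, double salary) {}

    // Same sample data as the lesson.
    static final List<Employee> EMPLOYEES = List.of(
        new Employee("Alice", "Engineering", 95_000),
        new Employee("Bob",   "Engineering", 88_000),
        new Employee("Carol", "Marketing",   72_000),
        new Employee("Dave",  "Marketing",   68_000),
        new Employee("Eve",   "HR",          61_000)
    );

    // Loop version: first bucket the names per department, then join each bucket.
    static Map<String, String> namesByDeptWithLoops(List<Employee> employees) {
        Map<String, List<String>> buckets = new HashMap<>();
        for (Employee e : employees) {
            buckets.computeIfAbsent(e.department(), k -> new ArrayList<>())
                   .add(e.name());
        }
        Map<String, String> result = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : buckets.entrySet()) {
            StringBuilder sb = new StringBuilder();
            boolean first = true;
            for (String name : entry.getValue()) {
                if (!first) {
                    sb.append(", "); // separator goes between names only
                }
                sb.append(name);
                first = false;
            }
            result.put(entry.getKey(), sb.toString());
        }
        return result;
    }

    // Collector version from the lesson.
    static Map<String, String> namesByDeptWithCollectors(List<Employee> employees) {
        return employees.stream()
                        .collect(Collectors.groupingBy(
                            Employee::department,
                            Collectors.mapping(
                                Employee::name,
                                Collectors.joining(", "))));
    }

    public static void main(String[] args) {
        System.out.println(namesByDeptWithLoops(EMPLOYEES)
                               .equals(namesByDeptWithCollectors(EMPLOYEES))); // true
    }
}
```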

Summary

The four collectors you learned in this lesson unlock the core of data-aggregation work in Java:

counting(): counts elements and returns a Long; most useful as a downstream collector.
groupingBy(classifier): buckets elements by key; compose a downstream collector to aggregate each bucket.
partitioningBy(predicate): binary split; always produces both keys; clearer intent than a boolean groupingBy.
joining(delimiter, prefix, suffix): assembles strings from a stream; requires the elements to be CharSequence values such as String.

In the next lesson we will look at numeric streams — IntStream, LongStream, and DoubleStream — and the specialised numeric collectors that complement them.