Grouping queries in CakePHP 4

Posted on December 26th 2020 in PHP by Pim Debaere.

Grouping queries in CakePHP 4

Grouping queries in SQL is easy, so it should come as no surprise that it doesn't have to be complex in CakePHP either. Yet there are several ways, two of which I am discussing today. But let's start with a description of the situation.

Situation

We have a database containing information of television shows. It has different tables, like actors, shows and the intermediate table actors_shows. The latter is necessary because there is a many-to-many relationship between the first two and it also includes the role(s) of an actor in the show.

A simple GROUP BY

We now want to write a query, where we get a simple overview of the television shows in which a particular actor played.

1
2
3
4
5
6
7
SELECT shows.title
FROM actors
JOIN actors_shows
  ON actors_shows.actor_id = actors.id
JOIN shows 
  ON shows.id = actors_shows.show_id
WHERE actors.name = 'Seth MacFarlane'
Fig. 1 – Select all shows that include a given actor.

Running the above query, we soon notice that in this case certain series appear more than once, as Seth MacFarlane takes on both the roles of Stan Smith and Roger in the animated series American Dad!. We solve this quickly by adding a GROUP BY at the end of the query.

1
2
3
4
5
6
7
8
SELECT shows.title
FROM actors
JOIN actors_shows
  ON actors_shows.actor_id = actors.id
JOIN shows 
  ON shows.id = actors_shows.show_id
WHERE actors.name = 'Seth MacFarlane'
GROUP BY actors_shows.show_id
Fig. 2 – Select all shows that include a given actor, grouped by the show.

In CakePHP we can write this query as follows.

1
2
3
4
5
6
7
8
9
10
11
12
$actor = $this->Actors->find()
  ->select(['Actors.id'])
  ->where(['Actors.name' => 'Seth MacFarlane'])
  ->contain(['ActorsShows' => function(Query $query) {
    return $query
      ->select(['ActorsShows.actor_id', 'ActorsShows.serie_id'])
      ->group(['ActorsShows.serie_id'])
      ->contain(['Shows' => function(Query $query) {
        return $query
          ->select(['Shows.title']);
        }]);
      }]);
Fig. 3 – Grouping query in CakePHP.

The above gives the same result as in SQL, but what if we have an extra field in the table actors_shows that also contains the role of the actor. We may want to group by television series, but we also want to be able to see the role(s).

The function GROUP_CONCAT

In other words, we now want to write a query, where we have an overview of all the roles that a particular actor took on, grouped by television series. In SQL there is a function GROUP_CONCAT for this.

1
2
3
4
5
6
7
8
SELECT series.title, GROUP_CONCAT(actors_series.role SEPARATOR '/') AS roles
FROM actors
JOIN actors_series
  ON actors_series.actor_id = actors.id
JOIN series 
  ON series.id = actors_series.serie_id 
WHERE actors.name = 'Seth MacFarlane'
GROUP BY actors_series.serie_id
Fig. 4 – SQL query to merge roles per television series.

Note that we can also provide GROUP_CONCAT with a separator, such as a forward slash (/) in this case. In no time we transformed this and added it to the piece of CakePHP we wrote earlier, as it works the same way as other standard embedded SQL functions. We call the function with the identifier in an array as parameter.

1
2
3
4
5
6
7
8
9
10
11
12
$actor = $this->Actors->find()
  ->select(['Actors.id'])
  ->where(['Actors.name' => 'Seth MacFarlane'])
  ->contain(['ActorsShows' => function(Query $query) {
    return $query
      ->select(['ActorsShows.actor_id', 'ActorsShows.serie_id', 'role' => $query->func()->GROUP_CONCAT(['role SEPARATOR \'/\'' => 'identifier'])])
      ->group(['ActorsShows.serie_id'])
      ->contain(['Shows' => function(Query $query) {
        return $query
          ->select(['Shows.title']);
        }]);
      }]);
Fig. 5 – Using GROUP_CONCAT in CakePHP.