AoC 2023 D2P1: List of Lists of Lists Extraction

Day 2′s problem 1 requires parsing this format:

Game 1: 3 blue, 4 red; 1 red, 2 green, 6 blue; 2 green
Game 2: 1 blue, 2 green; 3 green, 4 blue, 1 red; 1 green, 1 blue
Game 3: 8 green, 6 blue, 20 red; 5 blue, 4 red, 13 green; 5 green, 1 red
Game 4: 1 green, 3 red, 6 blue; 3 green, 6 red; 3 green, 15 blue, 14 red
Game 5: 6 red, 1 blue, 3 green; 2 blue, 1 red, 2 green

to determine which rows have no more than 12 red, 13 green, and 14 blue cubes in the bag, then summing the game ID of those rows. In other words, interpreting lists of colors from lists of pulls from a list of games.

my %max = ( red => 12, green => 13, blue => 14);

After my standard opening, define the maximum permissible count of each color. Note Perl’s syntax here: It’s such a common operation to make a hash with string keys that as long as the strings contain no spaces nor special characters that would require disambiguation, the strings needn’t be quoted.

game: while (<>) {

Then loop through all the lines of the input, with a new twist: I have a label to mark the name of this loop.

# Split into header and list of pulls.
my ($i, $pulls) = /^Game\s+(\d+):\s+(.*)/;

Each row should start with ^ the word Game, then whitespace characters \s (at least 1 +), then some digits \d+ that we hang onto for later ( ), a : followed by more whitespace \s+ [and note that this whitespace might be optional and the repeat character could be changed from one or more + to zero or more *], and then capture ( ) everything else on the input line.

Then take the two things that were captured (game number and the rest of the line) and assign those to two new variables, $i and $pulls, which are local to this code block.

As I also used it yesterday, the regular expression matching operation // will match against the built-in loop-index variable $_ (in this case, containing each line of input) if not given a different string against which to match.

foreach my $pull (split(/;\s*/, $pulls)) {

Now within each line of input, split the list of pulls on semicolon optionally followed by any amount of whitespace; loop sequentially through that list foreach; and put each value into a new, local loop index variable $pull. That is, loop through the semicolon-delimited lists of separate pulls from the bag.

foreach my $colorcount (split(/,\s*/, $pull)) {

Within each of those pulls from the bag, loop through the comma-delimited list of colors and the count of the number of cubes of that color.

my ($count, $color) = $colorcount =~ /^(\d+)\s+(\w+)$/
or die "unintelligible color count \"$colorcount\" " .
"on row $.\n";

Here, use a regular expression against an explicitly-specified variable $colorcount =~ // rather than against the intrinsic row-of-input variable $_. Each color and count should begin with some digits, then some whitespace, then end with some "word characters" \w+. Extract the digits and the word into new local variables.

And then if unable to match the expected format, stop the program and report what we failed to parse and which line of input $. it was. Why would the content not match the expected format? In the real world and in the puzzle, because I made an unwarranted assumption about what the format actually is -- common when determining the format of someone else's data by visual inspection. In the real world, because files are corrupted -- a partial row of output filled a block that hit a quota or filled a partition, or many other reasons. In the real world and in the puzzle, because I wrote my regular expression wrong -- like in the first draft of the program when I sleepily reversed the positions of the count and color. Why not exercise this level of caution at every step along the way? In the real world, I do. In these puzzles, I find that AoC is very generous about keeping format consistent and only messes with your unwarranted assumptions in content; so I tend to exercise this level of caution in the deepest of nested loops to ensure that the previous regular expressions have really matched something and given me a string to work with.

next game if $count > $max{$color};

Then the crux of the code: check whether the number of cubes $count of the current color $color is more than the previously-defined maximum count of this color. If so, skip control flow to the next iteration next of the loop, normally the tightest-enclosing loop, but here the loop specified by label game. We only need to see one color with too many cubes out of the three possible colors, in one pull out of however many pulls were made from this bag, to know that this particular bag doesn't meet the maximum-per-color specification; so we don't need to consider the other colors in this pull, nor the other pulls in this bag; it's already rejected.

# All the pulls passed.
}
}
$sum += $i;
}

If we make it through all of the colors in this pull and make it through all of the pulls in this game without exceeding any max, then this game is valid and we're to add its index to the sum that we report at the end.

Full Program

#!/usr/bin/perl

use warnings;
use strict;

my %max = ( red => 12, green => 13, blue => 14);

my $sum = 0;

game: while (<>) {
# Split into header and list of pulls.
my ($i, $pulls) = /^Game\s+(\d+):\s+(.*)/;

foreach my $pull (split(/;\s*/, $pulls)) {
foreach my $colorcount (split(/,\s*/, $pull)) {
my ($count, $color) = $colorcount =~ /^(\d+)\s+(\w+)$/
or die "unintelligible color count \"$colorcount\" " .
"on row $.\n";
next game if $count > $max{$color};
} # colors
} # pulls

# All the pulls passed.
$sum += $i;
}

print "sum is $sum\n";

Leave a Reply