AoC 2023 D1P1: Extract Digits from Text

As with most Advent of Code problems, today’s first problem wraps its task in a lot of cute story, with some ambiguous phrasing that needs to be teased out.

It works out to: Extract the first and last digits on each line of text, use them to construct a two-digit number, and sum those numbers. Unstated but visible in the example is that if a line has a single digit in it, you use that for both digits of that line’s contribution.

The sample input:

1abc2
pqr3stu8vwx
a1b2c3d4e5f
treb7uchet

This produces 12 + 38 + 15 + 77 = 142.
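The arithmetic above can be checked with a minimal sketch. The sub name calibration_value and the inline sample list are made up for illustration. Note that it collects every digit with a global match and takes the first and last, which is a different technique from the two patterns used in the program below.

```perl
use strict;
use warnings;

# Hypothetical helper: two-digit value built from the first and last digit of a line.
sub calibration_value {
    my ($line) = @_;
    my @digits = $line =~ /\d/g;    # every digit on the line, in order
    return 0 unless @digits;        # assumption: a digit-free line contributes nothing
    return 10 * $digits[0] + $digits[-1];
}

my @sample = qw(1abc2 pqr3stu8vwx a1b2c3d4e5f treb7uchet);
my $total  = 0;
for my $line (@sample) {
    my $value = calibration_value($line);
    printf "%-12s -> %d\n", $line, $value;
    $total += $value;
}
print "total is $total\n";    # 12 + 38 + 15 + 77 = 142
```

A line with a single digit, such as treb7uchet, ends up with the same element as both first and last, which gives 77.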

My program:

#!/usr/bin/perl

use warnings;
use strict;

my $sum;

I’m in the habit of always enabling warnings and strict mode unless I’m pulling in an external library that doesn’t check clean (in which case I’ll activate strict checking after bringing in the library). Strict mode forces me to declare running-total variables at the top instead of having them lazily instantiated on first use.
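To see what strict mode does with an undeclared running total, here is a small sketch. It uses string eval only so that the compile-time error can be caught and printed instead of stopping the script. The variable name $running_total is hypothetical.

```perl
use strict;
use warnings;

# String eval compiles its snippet at run time under the enclosing pragmas,
# so a strict violation ends up in $@ rather than aborting this script.
my $undeclared_ok = eval q{ $running_total += 1; 1 };
print "without my: ", ($undeclared_ok ? "compiled\n" : "rejected: $@");

# Declaring the variable first satisfies strict.
my $declared_ok = eval q{ my $running_total; $running_total += 1; 1 };
print "with my:    ", ($declared_ok ? "compiled\n" : "rejected: $@");
```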

while (<>) {

This is the standard Perl syntax to loop through all lines of input, whether fed to the program on stdin or read from files named as command-line arguments.
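A self-contained sketch of that behaviour: it writes two sample lines to a temporary file and then puts the file's name in @ARGV, as if it had been given on the command line. The temporary file and the @seen array exist only for this illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write two sample lines to a temporary file so the sketch needs no outside input.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "1abc2\ntreb7uchet\n";
close $fh;

# Pretend the file was named on the command line.
@ARGV = ($path);

my @seen;
while (<>) {
    chomp;             # each line arrives in $_ with its trailing newline
    push @seen, $_;
}
print "read ", scalar(@seen), " lines: @seen\n";
```

With an empty @ARGV the same loop would read from stdin instead.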

# Extract first and last digits (if separate) and add to running total.
/^[^\d]*(\d).*(\d)[^\d]*$/ and $sum += 10 * $1 + $2;

Here I’m using lots of idiomatic Perl programming, or at least these were the idioms when I learned it.

The and operator means: evaluate (perform) the right expression only if the left expression is true. So if this line matches the pattern, add something to our running total. The and operator has lower precedence than just about everything else, so it does only this logic and doesn’t risk glomming the separate expressions together and changing what either of them does.
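As a minimal sketch of that idiom, the loop below performs the same update twice, once in the statement form with and, once as an explicit if block. The variable names and the three test strings are assumptions for the example.

```perl
use strict;
use warnings;

my ($with_and, $with_if) = (0, 0);

for (qw(1abc2 nodigits treb7uchet)) {
    # Statement form: the right side runs only when the match succeeds.
    # Because "and" binds more loosely than "+=", no parentheses are needed;
    # "&&" binds tighter than "+=", so it would need them around the assignment.
    /(\d)/ and $with_and += $1;

    # The same logic spelled out with an if block.
    if (/(\d)/) {
        $with_if += $1;
    }
}
print "and: $with_and, if: $with_if\n";    # both add 1 + 7
```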

In the pattern, ^ matches the start of the line and $ matches the end of the line. \d matches a digit; [^\d] matches anything except a digit; . matches any character except a newline. * matches zero or more of something, taking as many as it can. ( ) captures whatever’s inside it to use later. So this allows any number of non-digits at the beginning of a line, then matches and saves the first digit, then matches any amount of stuff, then matches and saves the last digit, then allows any number of non-digits at the end of the line.

$1 and $2 then hold the results of their respective parenthesized matches: the first and last digits, if there were indeed two. We increment the running total by 10 * the first digit + the last digit.
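The pattern is easier to read when spread out with the /x modifier, which lets each piece carry its own comment. This is only a sketch for experimenting; the sub name first_and_last is hypothetical, and the regular expression is the same one as above.

```perl
use strict;
use warnings;

# Same pattern as in the program, laid out with /x.
my $two_digits = qr/
    ^ [^\d]*    # any number of non-digits at the start of the line
    (\d)        # first digit, captured as group 1
    .*          # anything at all, as much as possible
    (\d)        # last digit, captured as group 2
    [^\d]* $    # any number of non-digits up to the end of the line
/x;

# Hypothetical helper: returns (first, last), or an empty list when there is no match.
sub first_and_last {
    my ($line) = @_;
    return $line =~ $two_digits ? ($1, $2) : ();
}

for my $line (qw(1abc2 a1b2c3d4e5f treb7uchet)) {
    my @pair = first_and_last($line);
    if (@pair) {
        printf "%s: first=%d last=%d value=%d\n", $line, @pair, 10 * $pair[0] + $pair[1];
    } else {
        print "$line has fewer than two digits, so this pattern does not match\n";
    }
}
```

The last sample line has only one digit, which is why the program needs a second pattern.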

I’ll try to find a nice WordPress plugin for displaying code cleanly; losing my indentation is a little soul-crushing even though it doesn’t change the meaning of the code.

# Extract lone digit and add value to running total.
/^[^\d]*(\d)[^\d]*$/ and $sum += 10 * $1 + $1;
}

Now do the same for lines that have a single digit in them.

Note that we haven’t modified the line of text in the buffer, and these two matches are mutually exclusive (the first needs at least two digits, the second exactly one), so there’s no need to explicitly skip the rest of the loop after the first match.
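The mutual exclusivity can be spot-checked with a short sketch that counts how many of the two patterns match each line. The sub name patterns_matched and the extra digit-free line are assumptions added for the check.

```perl
use strict;
use warnings;

# Hypothetical helper: how many of the program's two patterns match this line?
sub patterns_matched {
    my ($line) = @_;
    my $count = 0;
    $count++ if $line =~ /^[^\d]*(\d).*(\d)[^\d]*$/;    # two or more digits
    $count++ if $line =~ /^[^\d]*(\d)[^\d]*$/;          # exactly one digit
    return $count;
}

for my $line (qw(1abc2 pqr3stu8vwx a1b2c3d4e5f treb7uchet nodigits)) {
    printf "%-12s matched by %d pattern(s)\n", $line, patterns_matched($line);
}
```

Each line containing a digit is matched by exactly one pattern, and a line with no digits by neither, so no line is ever counted twice.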

print "sum is $sum\n";

And then show the result.

I saved this as d01p1, copied the sample input from AoC into d01-example1, copied my real unique input from AoC into d01-input, ran

d01p1 d01-example1

and confirmed that it matched the AoC expected result, ran

d01p1 d01-input

and pasted the result into AoC, which confirmed that I had correctly completed part 1 and let me see part 2.

The Whole Program

#!/usr/bin/perl

use warnings;
use strict;

my $sum;

while (<>) {
# Extract first and last digits (if separate) and add to running total.
/^[^\d]*(\d).*(\d)[^\d]*$/ and $sum += 10 * $1 + $2;

# Extract lone digit and add value to running total.
/^[^\d]*(\d)[^\d]*$/ and $sum += 10 * $1 + $1;
}

print "sum is $sum\n";
