Advent of Code 2023¶
I was planning on writing more about this because I spent a bit more time on it this year, but for reasons I won't go into right now I've just not had the energy or inclination to do more, but there are a few comments I think I'd like to make about it and my thoughts on the focus of the puzzles.
Day 1 Part 2¶
First up, I'll start with something that is the bain of every software engineer... badly specified problems. That is to say problems where things like edge cases may not be fully known or understood, and even some core information is missing from the problem definition.
This puzzle required the program to read numbers from an input file. This is the example they provided.
1abc2
pqr3stu8vwx
a1b2c3d4e5f
treb7uchet
You took the first and last numbers from each line to form a two digit number, so in this case 12, 38, 15 and 77. Easy enough.
I solved this by pushing each line through a couple of regular expressions that found the first and last digit...
Easy enough and it worked great. Then you get part two.
two1nine
eightwothree
abcone2threexyz
xtwone3four
4nineeightseven2
zoneight234
7pqrstsixteen
Suddenly you have to handle numbers as text, and they give you the results they would expect from this sample which are 29, 83, 13, 24, 42, 14 and 76. But what do you notice about those results? There is no uncertainty. Now consider this example.
If you are looking for the first number, you could search for the textual representation (one in this case) and replace them with the single digit. That would yield the line.
Which using the rules from part 1 would yield the result 11. But it turns out this is not what you should do, you should treat that as 18, i.e. you text processing should not be greedy and consume the source data, it should instead read ahead looking for digits (1..9) and their equivalent textual repesentation (one, two, three, four etc.) and as they are found push them into some structure so you can simply read the first and last ones.
const
DIGITTEXT: array[1..9] of string = (
'one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine'
);
function parseToDigits(src: string): string;
var
srcIdx: integer;
lcSrc: string;
digitLoop: integer;
begin
lcSrc := lowerCase(src);
srcIdx := 1;
result := '';
while (srcIdx <= length(src)) do
begin
if (charInSet(src[srcIdx], ['0'..'9'])) then
begin
// Simple numeric char, transfer directly
result := result + src[srcIdx];
end
else
begin
if (charInSet(lcSrc[srcIdx], ['e', 'f', 'n', 'o', 's', 't'])) then
begin
for digitLoop := low(DIGITTEXT) to high(DIGITTEXT) do
begin
if (copy(lcSrc, srcIdx, length(DIGITTEXT[digitLoop])) = DIGITTEXT[digitLoop]) then
begin
result := result + chr(48+digitLoop);
break;
end;
end;
end;
end;
inc(srcIdx);
end;
end;
And that works a treat as I get a string of the digits found in the source data, from which I can simply get the first and last to generate my two digit result for that line.
With no clear guidance on how to handle this situation (the example data provided for the puzzle doesn't appear to clarify how to handle it... at least not that I can see, nor was I able to find any clarifying information elsewhere in that puzzle) you are left to try various different options with only the ability to get the required answer and submit it to see if you are correct or not. Now to be clear, it could be that I missed a hint about how to handle the situation with overlapping textual representations, but I've looked and looked and I just can't find any, so I resorted to relying on information I found on Reddit, but this is the kind of edge case clarification that can often lead to issues in software design specifications and it highlights the importance of such simple things.
My General Advice For Taking Part¶
Having had a couple of goes now, I'm clearly not a seasoned competitor but I do think I can offer some advice if you fancy having a go.
The Key Things You Need To Do¶
The puzzles typically provide some data which you have to parse and then process in a particular way in order to extract some answer. So, the three key things you're going to need to be able to do with relative ease are:-
- Parse plain text data to extract required values
- Store that data efficiently to allow for future processing
- Process the data to extract the required answer
Parsing Plain Text Data¶
To make your life easy, learn how to parse text files efficiently. I loaded the source data into a TStringList and then read each line as I needed. I could have used a more traditional file handling approach and read the data using TextFile and ReadLn, but I figured it might be nice to have it all in memory available for re-use... in reality I don't think I needed to reuse any of it once I'd done my initial parsing.
The methods I employed to solve the problems I did this year were simple regular expressions and more advanced parsing routines that were typically written on the fly for each particular problem. I could have optimised the time it took me to complete the problems by having a library of common routines, and this is something I will probably look at producing for next years event 😄
Data Parsing Takeaways
- Learn some common parsing techniques as practice (e.g. learn how to parse a CSV well)
- Learn the basics of pattern matching with regular expressions
Storing Data¶
When storing your data, try and think about what you actually need to store and how (during processing) you are going to be needing to access it. i.e. can you simply stuff it in an array or are you going to need more complex access to it than say a simple index? For example are you going to want to find items with a particular piece of data present?
If you need more complex access to it, you may need to consider secondary data stores to facilitate that access (dictionaries for example), or maybe you can keep it simple and add markers during the parsing phase. Wherever you can, keep it simple.
I used Delphi for my solutions and as a consequence I have access to lovely object oriented options when it comes to representing/working with my data, but I elected to use simple record types which I stored in dynamic arrays. Whilst records are simply definitions of what you are storing, they can include helper methods and I made use of this feature for doing repetitive things like intialising the structure and some of the processing/parsing tasks.
Data Storage Takeaways
- Try to store your data in a way that makes processing efficient
- Try to store only what you actually need
- Make use of things like dictionaries when appropriate otherwise keep things simple and use basic storage mechanisms like arrays
Processing Data¶
I would try and roll this up into the two previous steps as much as possible. If you can do all the processing whilst you're reading the source data then you've not got to cycle through a whole bunch of data again to extract some information from it. Clearly this isn't always going to be possible, but if you want to write efficient processing routines you should always be looking to maximise the benefit you gain from a single processing iteration, particularly when dealing with large datasets.
The idea of a large dataset can be deceptive, for example Day 4 Part 2 starts with 194 cards read from the source data, at the end of solving the problem though I had 5,554,894 cards in memory where I had duped them to make my solver simple. In total my solution spent 26ms solving the problem which I don't think was too bad especially as it uses dynamic arrays and can handle variable data sizes (the example data for day 4 has less numbers on each line than the real data - my solution handles that with no code changes). But I'm not trying to turn this into a my solution is faster than your solution post, because that's not the take away...
Data Processing Takeaways
- Try to do as much processing as you can during the initial parsing/storage phase so you can avoid unecessary iterations through the whole dataset
- Don't be afraid to store lots more data if it makes processing simpler (simpler can often mean less expensive in terms of time)
- Be wary of variability between the sample data and the real puzzle data