Today’s problem is a good illustration of the difference between a naive algorithm and a more clever one. I solved the first part of the puzzle with a “naive” brute-force algorithm, but I was indeed forced to use a more “clever” algorithm in the second part.

Problem description

A chemical formula (string composed of capital letters) is given, with a list of chemical reactions (pair insertions) with the format AB -> C. This notation means that for every AB pair, a C is inserted between the A and the B.

All insertions are considered simultaneously, and pairs overlap, so the second element of a pair is the first element of the following pair. This process is repeated a given number of times (different for part 1 and part 2).

Part 1

To begin, the chemical reaction is repeated 10 times.

First, let’s write a little parsing function. The ruleset is stored in an HashMap, where keys are the pairs, and values are the inserted elements.

pub fn parse_rules(input: &Vec<String>) -> HashMap<String, char> {
    let mut rules = HashMap::<String, char>::new();

    for i in 2..input.len() {
        let tokens = input[i].split(" -> ").collect::<Vec<&str>>();
        rules.insert(tokens[0].to_string(), tokens[1].chars().nth(0).unwrap());
    }

    rules
}

The calculations are done in the solve_part1() function, with a “naive” algorithm. The algorithm merely computes the formulas generated by the chemical reaction step after step, and counts at the end.

const STEPS1: usize = 10;

pub fn solve_part1(input: &Vec<String>) -> usize {
    let mut last_formula = input[0].clone();
    let rules = parse_rules(input);
    
    let first_char = last_formula.chars().next().unwrap();
    let mut next_formula = String::new();

    for _step in 1..=STEPS1 {
        next_formula.push(first_char);

        for w in last_formula.clone().chars().collect::<Vec<char>>().windows(2) {
            let mut key = String::new();
            key.push(*w.first().unwrap());
            key.push(*w.last().unwrap());

            match rules.get(&key) {
                Some(v) => { next_formula.push(*v) },
                None => {}
            }
            next_formula.push(*w.last().unwrap())
        }

        last_formula = next_formula.clone();
        next_formula = String::new();
    }

    let mut frequencies = HashMap::new();
    for ch in last_formula.chars() {
        *frequencies.entry(ch).or_insert(0) += 1;
    }

    frequencies.values().max().unwrap() - frequencies.values().min().unwrap()
}

The most difficult thing for me was to compile the code. I did not know how to correct the error E0716 “A temporary value is being dropped while a borrow is still in active use” and I spent more than 1 hour to get the correct answer to the part 1. The purely algorithmic part did not pose any challenge.

Part 2

Now, 40 steps must be done from the initial formula. The issue with the previously used algorithm is that its temporal complexity and its spatial complexity (i.e the amount of memory used) are O(e n)O(e^n), as the length of our chemical formula grows exponentially. You know, Computer Science is a field where exponential growths really exist, as opposed to another currently popular field. Having 30 more steps to execute means that the execution time and the needed RAM would be 1 billion times greater now.

Fortunately, we are not required to compute the formula generated by the reaction, only the elements respective frequencies.

In order to do this, we will first improve the part 1 parsing by generating a different data structure. We still build a HashMap, but the values are now the pairs generated with a rule, not the inserted character.

pub fn rules_pairs(rules: &HashMap<String, char>) -> HashMap<String, (String, String)> {
    let mut pairs = HashMap::new();

    for (k, v) in rules {
        let mut pair1 = String::new();
        let mut pair2 = String::new();

        pair1.push(k.chars().nth(0).unwrap());
        pair1.push(*v);
        pair2.push(*v);
        pair2.push(k.chars().nth(1).unwrap());

        pairs.insert(k.clone(), (pair1, pair2));
    }

    pairs
}

This additional operation allows us to compute a pair frequencies map after each step. Contrary to the formula size, the pair frequencies map size remains constant. So, this algorithm has a temporal complexity of O(n)O(n) (linear) and a spatial complexity in O(1)O(1) (constant).

After the execution of the 40 steps, elements frequencies are computed from pair frequencies, and from the first and the last element of the formula. It is important to note that the first and last element of the formula belong to only one pair, and never change during the reactions.

const STEPS2: usize = 40;

pub fn solve_part2(input: &Vec<String>) -> u64 {
    let formula = input[0].clone();
    let first_char = formula.chars().next().unwrap();
    let last_char = formula.chars().last().unwrap();
    let pairs = rules_pairs(&parse_rules(input));
    
    let mut pair_frequencies = HashMap::new();
    let mut next_pair_frequencies = HashMap::new();

    for w in formula.chars().collect::<Vec<char>>().windows(2) {
        let mut pair = String::new();
        pair.push(*w.first().unwrap());
        pair.push(*w.last().unwrap());
        *pair_frequencies.entry(pair).or_insert(0) += 1;
    }

    for _step in 1..=STEPS2 {
        for (k, v) in pair_frequencies {
            *next_pair_frequencies.entry(pairs.get(&k).unwrap().0.clone()).or_insert(0) += v;
            *next_pair_frequencies.entry(pairs.get(&k).unwrap().1.clone()).or_insert(0) += v;
        }
        pair_frequencies = next_pair_frequencies.clone();
        next_pair_frequencies = HashMap::new();
    }

    let mut frequencies = HashMap::new();
    for (k, v) in &pair_frequencies {
        *frequencies.entry(k.chars().next().unwrap()).or_insert(0) += v;
        *frequencies.entry(k.chars().last().unwrap()).or_insert(0) += v;
    }
    for (k, v) in frequencies.iter_mut() {
        *v /= 2;
        if *k == first_char || *k == last_char {
            *v += 1;
        }
    }

    (frequencies.values().max().unwrap() - frequencies.values().min().unwrap())
}

Performances

Part 1 runs in 1200 µs, and part 2 in only 600 µs. Ironically, the “clever” algorithm takes less time than the “naive” one, even with 4 times as many steps to compute.