Sequences

In my opinion, the most interesting problems are those that appear challenging at first but seem trivial once you simply change your perspective. At the beginning of my journey with competitive programming, the problem “0-1 Sequences,” found on the Kattis Problem Archive, was undoubtedly one of them. In this blog post, I will explain my eventual solution to this problem.

To begin, let’s discuss the problem statement. As input, you are given a sequence consisting of three symbols (?, 0, and 1). Suppose there are exactly \(n\) question marks. Then \(2^n\) sequences are formed by replacing each ? with either a 0 or a 1. The number of inversions of a sequence is defined to be the minimum number of adjacent swaps required to sort it in non-decreasing order. Your task is to calculate the sum of the inversion counts of these sequences, modulo \(10^9 + 7\)

To make the problem statement more tangible, let’s consider an example. Assume we are given the sequence 01?0. Then we can create exactly 2 sequences: 0100 and 0110. Sorting either sequence requires only 2 inversions (010000100001 and 011001010011). Thus, the result is 4.

What will happen if we expand this sequence by adding a fifth element at the end? Will knowing the solution for the initial sequence simplify solving the case of the extended one? Actually, it will. We will consider each of the three possible expansions separately.

First, suppose that a 0 is added. We will be working with the sequence 01?00. Once again, there are 2 possible sequences: 01000 and 01100. Let’s focus on the second one. We already know that its prefix 0110 can be sorted in non-decreasing order using 2 inversions. Hence, converting 01100 into 00110 requires 2 moves. To conclude the sorting process, we only need to shift each 1 to the right, which can be accomplished with 2 inversions. Consequently, sorting this sequence requires 4 inversions in total. Analogously, the first sequence can be sorted with 3 moves.

I hope it is clear that in general, adding a 0 to the end of a 0-1 sequence increases its number of inversions by the number of ones it contains. Hence, the total inversion count will increase by the total number of ones found in all sequences.

Let’s move on to the next case. Suppose that a 1 is added instead. We will be dealing with the sequence 01?01. We have already found that all the sequences derived from 01?0 can be sorted with 4 inversions in total. Since appending an element greater than or equal to all existing elements of a sequence preserves its non-decreasing order, the inversion count will remain constant.

The case where we append a ? can be thought of as a combination of the previous ones. To put it simply, the inversion count of 01?0? is equal to the combined inversion counts of 01?00 and 01?01.

Having thoroughly analyzed a concrete example, I believe we can proceed to the general solution, in which we will iteratively compute the solutions for the prefixes of the provided sequences.

First of all, we will initialize three variables: inversion_count, one_count, and sequence_count. The last will be set to 1 to account for the empty sequence. Then we will load the input sequence.

#include <iostream>

int main() {
    long long inversion_count = 0;
    long long one_count = 0;
    long long sequence_count = 1;

    std::string sequence;
    std::cin >> sequence;

    return 0;
}

Next, we will process it iteratively, considering each case. When we encounter a 0 we need to increase inversion_count by one_count. If the next character is a 1, we have to increase one_count by sequence_count since a 1 will be appended to each sequence. The final case is slightly complex. First of all, we need to assign inversion_count to the sum of inversion_count and inversion_count + one_count to account for the 2 cases. Then we must update one_count. At this stage, every sequence is duplicated, and either a 0 or a 1 is appended to it. Hence, one_count will be doubled and incremented by sequence_count (note that sequence_count has not been modified yet). Finally, sequence_count will be doubled. During each update, it is crucial to remember to take the remainder module \(10^9 + 7\) to avoid integer overflow.

#include <iostream>

int main() {
    long long inversion_count = 0;
    long long one_count = 0;
    long long sequence_count = 1;

    long long mod = 1'000'000'007;

    std::string sequence;
    std::cin >> sequence;

    for (const char& token : sequence) {
        if (token == '0')
            inversion_count = (inversion_count + one_count) % mod;
        else if (token == '1')
            one_count = (one_count + sequence_count) % mod;
        else {
            inversion_count = (2 * inversion_count + one_count) % mod;
            one_count = (2 * one_count + sequence_count) % mod;
            sequence_count = 2 * sequence_count % mod;
        }
    }

    std::cout << inversion_count;

    return 0;
}

The method of solving this problem can be summarized with the following sentence: “Break the problem down into simpler subproblems and then successively combine their solutions to obtain the final one,” which is called dynamic programming.