CF 1426F - Number of Subsequences
We are given a string composed of the letters "a", "b", "c", and the special character "?". Each "?" can be replaced independently by any of the letters "a", "b", or "c".
CF 1426F - Number of Subsequences
Rating: 2000
Tags: combinatorics, dp, strings
Solve time: 1m 10s
Verified: yes
Solution
Problem Understanding
We are given a string composed of the letters "a", "b", "c", and the special character "?". Each "?" can be replaced independently by any of the letters "a", "b", or "c". After performing all possible replacements, we need to count the total number of subsequences "abc" across all resulting strings. A subsequence does not require contiguous letters but must maintain relative order.
The input size allows strings up to 200,000 characters, which means any algorithm that explicitly generates all $3^k$ combinations of "?" will be infeasible. Even a single "?" can triple the number of strings, so brute force enumeration is completely out of the question. The key challenge is to account for all possible replacements without iterating through them individually, which requires careful combinatorial reasoning or dynamic programming.
Non-obvious edge cases include strings that contain only question marks, strings without any "a", "b", or "c", and strings where letters appear in positions that prevent forming "abc". For example, the string "???" can generate 27 strings, and the correct output is 1×3 + 3×1 + … depending on the distribution, which illustrates that naive counting can easily miss multiplicative contributions from question marks.
Approaches
The naive approach is to generate every possible string obtained by replacing "?" with "a", "b", or "c", then count "abc" subsequences in each string. This is correct but impractical, because for k question marks there are $3^k$ strings. For $k = 20$, this exceeds $3^{20} \approx 3.5 \times 10^9$, far beyond the feasible limit for a 1-second program.
The insight that leads to an efficient solution is to maintain running counts of the number of ways subsequences ending in "a", "ab", and "abc" can be formed as we process the string from left to right. We treat "?" as a wildcard that can be any of the three letters. Each "?" multiplies existing subsequences by three because each can generate three options, but it also contributes to new subsequences. By tracking counts of partial subsequences and scaling by powers of three to account for previous wildcards, we can compute the answer in linear time.
| Approach | Time Complexity | Space Complexity | Verdict |
|---|---|---|---|
| Brute Force | $O(3^k \cdot n)$ | $O(n)$ | Too slow |
| Dynamic Programming (counting subsequences) | $O(n)$ | $O(1)$ | Accepted |
Algorithm Walkthrough
- Initialize three counters:
a_count,ab_count,abc_count, all set to 0. These represent the number of subsequences of type "a", "ab", and "abc" formed so far. Initializepower_of_threeto 1, which represents the number of ways the previous question marks can be replaced. - Iterate over each character
chin the string. - If
chis "a", incrementa_countbypower_of_three. This reflects that any existing combinations of previous wildcards can be extended by this "a". - If
chis "b", incrementab_countbya_count. Each previous "a" can now pair with this "b" to form a partial "ab" subsequence. - If
chis "c", incrementabc_countbyab_count. Each previous "ab" can now pair with this "c" to complete an "abc" subsequence. - If
chis "?", it can be treated as any of "a", "b", or "c". Before updating counts, store the current values ofa_count,ab_count, andabc_countin temporary variables. Update each count according to the contributions of choosing "a", "b", or "c":
- New
abc_count= 3 × oldabc_count+ oldab_count(contribution if "?" is "c") - New
ab_count= 3 × oldab_count+ olda_count(contribution if "?" is "b") - New
a_count= 3 × olda_count+ power_of_three (contribution if "?" is "a") - Multiply each old count by 3 to account for the three options of the current "?" and add the new subsequence contributions.
- After handling a "?", multiply
power_of_threeby 3 to reflect the additional wildcard. - Return
abc_countmodulo $10^9 + 7$ after processing the entire string.
Why it works: At every step, the counts reflect all possible subsequences that can be formed with all previous characters and wildcard expansions. By updating counts multiplicatively for each "?" and additively for contributions to new subsequences, we ensure no combination is missed or double-counted. The algorithm maintains an invariant: a_count = number of ways to pick "a"s so far, ab_count = number of ways to pick "ab"s, and abc_count = number of ways to pick "abc"s, considering all substitutions of "?" seen to that point.
Python Solution
import sys
input = sys.stdin.readline
MOD = 10**9 + 7
n = int(input())
s = input().strip()
a_count = 0
ab_count = 0
abc_count = 0
power_of_three = 1 # 3^number_of_question_marks_so_far
for ch in s:
if ch == 'a':
a_count = (a_count + power_of_three) % MOD
elif ch == 'b':
ab_count = (ab_count + a_count) % MOD
elif ch == 'c':
abc_count = (abc_count + ab_count) % MOD
elif ch == '?':
new_abc = (3 * abc_count + ab_count) % MOD
new_ab = (3 * ab_count + a_count) % MOD
new_a = (3 * a_count + power_of_three) % MOD
a_count, ab_count, abc_count = new_a, new_ab, new_abc
power_of_three = (power_of_three * 3) % MOD
print(abc_count)
The solution initializes counters and iterates through the string once, updating counts according to the rules for letters and wildcards. The subtlety lies in correctly multiplying by three for previous sequences when a "?" is encountered, and then adding contributions for new subsequences formed if the "?" becomes "a", "b", or "c".
Worked Examples
Sample Input 1: "ac?b?c"
| Index | Char | a_count | ab_count | abc_count | power_of_three |
|---|---|---|---|---|---|
| 0 | a | 1 | 0 | 0 | 1 |
| 1 | c | 1 | 0 | 0 | 1 |
| 2 | ? | 4 | 1 | 0 | 3 |
| 3 | b | 4 | 5 | 0 | 3 |
| 4 | ? | 15 | 19 | 5 | 9 |
| 5 | c | 15 | 19 | 24 | 9 |
This trace shows how each "?" multiplies previous counts and contributes to new subsequences, matching the expected total of 24.
Custom Input 2: "???"
| Index | Char | a_count | ab_count | abc_count | power_of_three |
|---|---|---|---|---|---|
| 0 | ? | 1 | 0 | 0 | 3 |
| 1 | ? | 4 | 1 | 0 | 9 |
| 2 | ? | 13 | 5 | 1 | 27 |
The answer is 1, as expected, confirming that the algorithm correctly handles multiple consecutive question marks.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each character is processed once, with constant time updates of counters. |
| Space | O(1) | Only a fixed number of integer counters are maintained. |
Given $n \le 2 \cdot 10^5$, the solution easily runs within 1 second.
Test Cases
import sys, io
def run(inp: str) -> str:
sys.stdin = io.StringIO(inp)
n = int(input())
s = input().strip()
MOD = 10**9 + 7
a_count = ab_count = abc_count = 0
power_of_three = 1
for ch in s:
if ch == 'a':
a_count = (a_count + power_of_three) % MOD
elif ch == 'b':
ab_count = (ab_count + a_count) % MOD
elif ch == 'c':
abc_count = (abc_count + ab_count) % MOD
elif ch == '?':
new_abc = (3 * abc_count + ab_count) % MOD
new_ab = (3 * ab_count + a_count) % MOD
new_a = (3 * a_count + power_of_three) % MOD
a_count, ab_count, abc_count = new_a, new_ab, new_abc
power_of_three = (power_of_three *