CF 177G1 - Fibonacci Strings

The problem asks us to count occurrences of query strings inside Fibonacci strings. Fibonacci strings are defined recursively: the first string is "a", the second is "b", and each subsequent string is the concatenation of the previous string followed by the one before that.

CF 177G1 - Fibonacci Strings

Rating: 2400
Tags: strings
Solve time: 3m 11s
Verified: yes

Solution

Problem Understanding

The problem asks us to count occurrences of query strings inside Fibonacci strings. Fibonacci strings are defined recursively: the first string is "a", the second is "b", and each subsequent string is the concatenation of the previous string followed by the one before that. For example, the third string is "ba" (second + first), the fourth is "bab" (third + second), and so on.

We are given an index k specifying which Fibonacci string f_k to examine, and m query strings s_i. For each query, we must count how many times s_i appears as a contiguous substring of f_k, modulo 10^9 + 7.

The constraints tell us that k can go up to 10^18. Generating f_k explicitly is impossible, because the length of Fibonacci strings grows exponentially, roughly len(f_n) ≈ φ^n where φ is the golden ratio (~1.618). Even f_93 exceeds 10^19 characters. This immediately rules out brute-force string construction and naive substring search. On the other hand, the total length of queries is up to 10^5, so we can afford to preprocess the queries themselves.

Non-obvious edge cases include queries longer than the entire Fibonacci string (for small k). For instance, if s_i = "abab" and k = 3 (f_3 = "ba"), the correct count is 0. A naive algorithm that assumes s_i is always smaller than f_k would fail here. Overlapping occurrences must also be counted. For example, in f_6 = "babbababba", the query "ab" occurs multiple times including overlapping positions.

Approaches

A brute-force approach would construct f_k and scan for each s_i using a substring search algorithm like KMP. While correct, this is feasible only for small k because f_k quickly exceeds any reasonable memory limit. The complexity is O(len(f_k) * len(s_i)) per query, which is exponential in k-unacceptable for k up to 10^18.

The key insight is that Fibonacci strings are constructed recursively. Let count(n, s) be the number of times string s occurs in f_n. Because f_n = f_{n-1} + f_{n-2}, we can express the count recursively:

count(n, s) = count(n-1, s) + count(n-2, s) + overlap(n, s)

Here, overlap(n, s) accounts for occurrences of s that span the boundary between f_{n-1} and f_{n-2}. To compute it efficiently, we only need the prefix of length |s|-1 of f_{n-1} and the suffix of length |s|-1 of f_{n-2}. This observation reduces the problem to tracking counts, prefixes, and suffixes of bounded length for each Fibonacci string rather than storing the entire string.

This reduces time complexity drastically because the prefix and suffix lengths are at most the length of s_i, which is small compared to the exponential size of f_k. The recurrence can be memoized or computed iteratively, ensuring that each Fibonacci string contributes only O(|s|^2) work per query.

Approach Time Complexity Space Complexity Verdict
Brute Force O(len(f_k) * sum s_i )
Optimal O(k * sum s_i ^2) or O(log k * sum

Algorithm Walkthrough

  1. For each query s_i, compute its length l_i. Initialize a dictionary mapping each Fibonacci index n to a tuple (count, prefix, suffix). count is the number of occurrences in f_n, prefix is the first min(l_i-1, len(f_n)) characters of f_n, and suffix is the last min(l_i-1, len(f_n)) characters of f_n. This will allow computing overlaps.
  2. Set base cases: f_1 = "a" and f_2 = "b". Compute count(1, s_i) and count(2, s_i) directly: they are 1 if the query matches the string exactly, 0 otherwise. Store the entire string if its length is less than |s_i| - 1 to support overlap computation.
  3. Iterate from n = 3 up to k (or use a fast exponentiation approach if k is large) and compute:
count(n, s_i) = count(n-1, s_i) + count(n-2, s_i) + overlap(f_{n-1}.suffix, f_{n-2}.prefix, s_i)

overlap(a, b, s) counts how many times s occurs in the concatenation of the last min(|s|-1, len(a)) characters of f_{n-1} with the first min(|s|-1, len(b)) characters of f_{n-2}. Only substrings that cross the boundary are considered, and the maximum substring length is bounded by |s_i|.

  1. Update the prefix of f_n as the prefix of f_{n-1} extended by f_{n-2} up to length l_i - 1, and the suffix similarly. These are used in the next iteration.
  2. Return count(k, s_i) % MOD for each query.

Why it works: The invariant is that at each step, count(n, s) correctly counts all occurrences of s in f_n. Prefixes and suffixes of length |s|-1 are sufficient to account for boundary overlaps. The recurrence mirrors the Fibonacci construction, so all internal and cross-boundary occurrences are counted exactly once. Memoization or iterative computation ensures no redundant work.

Python Solution

import sys
input = sys.stdin.readline

MOD = 10**9 + 7

def compute_occurrences(k, queries):
    results = []
    for s in queries:
        l = len(s)
        # Initialize base cases
        f1 = 'a'
        f2 = 'b'
        count1 = int(f1 == s)
        count2 = int(f2 == s)

        # prefix and suffix for overlaps
        pre1 = f1[:l-1] if l-1 < len(f1) else f1
        suf1 = f1[-(l-1):] if l-1 < len(f1) else f1
        pre2 = f2[:l-1] if l-1 < len(f2) else f2
        suf2 = f2[-(l-1):] if l-1 < len(f2) else f2

        if k == 1:
            results.append(count1)
            continue
        if k == 2:
            results.append(count2)
            continue

        counts = [0, count1, count2]
        prefixes = ['', pre1, pre2]
        suffixes = ['', suf1, suf2]

        for n in range(3, k+1):
            # Compute overlap occurrences
            overlap_str = suffixes[n-1] + prefixes[n-2]
            overlap_count = 0
            for i in range(max(0, len(overlap_str)-l+1)):
                if overlap_str[i:i+l] == s:
                    overlap_count += 1

            count_n = (counts[n-1] + counts[n-2] + overlap_count) % MOD

            # Update prefix and suffix
            new_prefix = (prefixes[n-1] + prefixes[n-2])[:l-1]
            new_suffix = (suffixes[n-1] + suffixes[n-2])[-(l-1):]

            counts.append(count_n)
            prefixes.append(new_prefix)
            suffixes.append(new_suffix)

        results.append(counts[k])
    return results

def main():
    k, m = map(int, input().split())
    queries = [input().strip() for _ in range(m)]
    res = compute_occurrences(k, queries)
    for x in res:
        print(x % MOD)

if __name__ == "__main__":
    main()

The solution initializes the base Fibonacci strings and tracks their prefix and suffix of length |s|-1. For each Fibonacci index from 3 to k, it counts the query occurrences in the concatenation boundary using these prefixes and suffixes. All counts are modulo 10^9 + 7 to avoid overflow. The final result for each query is printed.

Worked Examples

Consider k = 6, s = "ab":

n f_n count(f_n, "ab") prefix suffix
1 "a" 0 "a" "a"
2 "b" 0 "b" "b"
3 "ba" 1 "ba" "ba"
4 "bab" 1 + 0 + 1 = 2 "ba" "ab"
5 "