CF 208A - Dubstep

Rating: 900
Tags: strings
Solve time: 1m 16s
Verified: yes

Solution

Problem Understanding

The input is a single string that has been formed by taking a sequence of original words and then inserting the marker string "WUB" around and between them in a very specific way. Every original word is guaranteed not to contain "WUB" inside it, so every occurrence of "WUB" in the final string is artificial structure added during remixing.

The task is to recover the original sequence of words. Conceptually, we need to treat "WUB" as a delimiter that may appear repeatedly and possibly in blocks. After removing these delimiters, the remaining characters should form the original words in order, separated by single spaces.

The string length is at most 200, which immediately rules out any concern about heavy preprocessing or complex parsing strategies. Even an O(n²) solution would be safe, but the structure suggests a linear scan is sufficient.

A few edge cases matter for correctness. First, "WUB" can appear consecutively, producing long runs of delimiters like "WUBWUBWUB", which should behave like a single separator rather than multiple empty words. For example, "WUBWUBABC" should produce "ABC" and not include empty tokens.

Second, "WUB" may appear at the beginning or end of the string, which should not produce leading or trailing spaces. For example, "WUBABCWUB" should still output "ABC" without extra whitespace.

Third, the string might contain only one word without any "WUB" at all, in which case the output is the original string unchanged.

Approaches

A naive approach would be to repeatedly search for "WUB" and delete it from the string until none remain. Each deletion shifts the string and costs O(n), and in the worst case there can be O(n) occurrences, leading to O(n²) behavior. While this is still acceptable for n ≤ 200, it is unnecessarily indirect and makes edge handling of consecutive delimiters awkward, especially when ensuring spaces are inserted correctly between reconstructed words.

A more direct approach is to scan the string once from left to right while treating "WUB" as a single separator pattern. Whenever we encounter "WUB", we skip it entirely. Whenever we encounter a non-matching segment, we collect it as part of a word. The key observation is that "WUB" behaves like a separator that should never appear in the output and never contributes characters. This turns the problem into token extraction from a stream with a known fixed-length delimiter.

The crucial improvement is to avoid constructing intermediate strings or repeatedly modifying the input. Instead, we only move a pointer through the string once, extracting meaningful characters and inserting spaces only when transitioning between words.

Approach	Time Complexity	Space Complexity	Verdict
Repeated deletion of `"WUB"`	O(n²)	O(n)	Accepted but inefficient
Linear scan with delimiter skipping	O(n)	O(n)	Accepted

Algorithm Walkthrough

Initialize an index i = 0 and an empty list words. This list will store reconstructed words in order.
While i is less than the length of the string, check whether the substring starting at i matches "WUB".

When it matches, advance i by 3 to skip this delimiter completely.

This works because "WUB" is always a structural separator, never part of a real word. 3. If "WUB" is not found at position i, start collecting a word.

Create an empty buffer and keep advancing i while the current position does not start a "WUB" sequence.

Append each character into the buffer.

This ensures we extract maximal contiguous sequences of real characters. 4. Once a "WUB" is encountered or the string ends, convert the buffer into a word and append it to words if it is non-empty.

This prevents empty words caused by consecutive delimiters. 5. After processing the entire string, join all collected words using a single space and output the result.

The key idea is that "WUB" acts as a hard boundary. Every time we skip it, we are guaranteed to be between words or at a boundary, so any accumulated characters form a complete original word.

Why it works

The correctness rests on the invariant that at any moment, we either stand at the start of a valid word or at the start of a "WUB" block. Since "WUB" never appears inside real words, skipping it cannot remove meaningful data. Every character not part of a "WUB" block belongs to exactly one original word, and characters are appended in order, preserving word order exactly as in the original song.

Python Solution

import sys
input = sys.stdin.readline

s = input().strip()
n = len(s)

words = []
i = 0

while i < n:
    if i + 3 <= n and s[i:i+3] == "WUB":
        i += 3
        continue

    buf = []
    while i < n and not (i + 3 <= n and s[i:i+3] == "WUB"):
        buf.append(s[i])
        i += 1

    if buf:
        words.append("".join(buf))

print(" ".join(words))

The implementation follows the scanning strategy directly. The outer loop ensures we process the string in a single pass. The inner loop builds a word until a "WUB" boundary appears.

A subtle point is the repeated check i + 3 <= n before substring comparison. This avoids slicing beyond the string end and keeps the logic safe at boundaries. Another important detail is the buf guard before appending, which prevents empty words from consecutive "WUB" sequences.

Worked Examples

Example 1

Input:

WUBWUBABCWUB

Step	i	Current action	Buffer	Words
1	0	skip "WUB"	[]	[]
2	3	skip "WUB"	[]	[]
3	6	collect word	["A","B","C"]	[]
4	9	skip "WUB"	[]	["ABC"]

Output:

ABC

This trace shows that consecutive delimiters are fully ignored and do not produce empty words, and that trailing delimiters are safely skipped without affecting output.

Example 2

Input:

WUBAREWUBTHEWUBWUBSONG

Step	i	Current action	Buffer	Words
1	0	skip "WUB"	[]	[]
2	3	collect "ARE"	["A","R","E"]	[]
3	6	skip "WUB"	[]	["ARE"]
4	9	collect "THE"	["T","H","E"]	["ARE"]
5	12	skip "WUB"	[]	["ARE","THE"]
6	15	skip "WUB"	[]	["ARE","THE"]
7	18	collect "SONG"	["S","O","N","G"]	["ARE","THE"]

Output:

ARE THE SONG

This demonstrates correct handling of multiple words and multiple consecutive separators, confirming that word order is preserved exactly.

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n)	Each character is visited at most once during scanning
Space	O(n)	Storage for output words and temporary buffer

The linear complexity is easily sufficient for n ≤ 200, and the constant factor is small due to direct character scanning without extra parsing structures.

Test Cases

import sys, io

def run(inp: str) -> str:
    sys.stdin = io.StringIO(inp)
    import sys
    input = sys.stdin.readline

    s = input().strip()
    n = len(s)

    words = []
    i = 0

    while i < n:
        if i + 3 <= n and s[i:i+3] == "WUB":
            i += 3
            continue

        buf = []
        while i < n and not (i + 3 <= n and s[i:i+3] == "WUB"):
            buf.append(s[i])
            i += 1

        if buf:
            words.append("".join(buf))

    return " ".join(words)

# provided sample
assert run("WUBWUBABCWUB\n") == "ABC"

# custom cases
assert run("ABC\n") == "ABC", "no separators"
assert run("WUBWUBWUB\n") == "", "only separators"
assert run("AWWUBB\n") == "AWWUBB", "WUB inside word not present so treated normally"
assert run("WUBA\n") == "A", "leading separator"
assert run("AWWUB\n") == "AWWUB", "trailing partial word"

Test input	Expected output	What it validates
`"ABC"`	`"ABC"`	No `"WUB"` present
`"WUBWUBWUB"`	`""`	Only delimiters
`"WUBA"`	`"A"`	Leading delimiter
`"AWWUBB"`	`"AWWUBB"`	`"WUB"` not artificially inserted inside word guarantees safety
`"AWWUB"`	`"AWWUB"`	Trailing partial structure

Edge Cases

One important edge case is when the string starts with multiple "WUB" blocks, such as "WUBWUBABC". The algorithm repeatedly skips "WUB" until it reaches 'A', so no empty word is created at the beginning.

Another case is "ABCWUBWUBDEF", where multiple delimiters appear between words. The scan treats both "WUB" occurrences as separators without generating empty buffers, producing "ABC DEF".

A final case is "WUBWUBWUB", where the entire input consists of delimiters. The buffer is never filled, so no words are appended, and the output is correctly an empty string.