CF 988B - Substrings Sort

We are given a collection of strings, and we are allowed to rearrange them in any order. After rearranging, we want a very specific nesting property: every string must contain all strings that appear before it as substrings.

CF 988B - Substrings Sort

Rating: 1100
Tags: sortings, strings
Solve time: 1m 22s
Verified: yes

Solution

Problem Understanding

We are given a collection of strings, and we are allowed to rearrange them in any order. After rearranging, we want a very specific nesting property: every string must contain all strings that appear before it as substrings.

In other words, if we line up the strings from first to last, then whenever we take a string in position i, every string in positions 1 through i − 1 must appear somewhere inside it as a contiguous block of characters.

The task is to decide whether such an ordering exists, and if it does, output one valid ordering.

The constraints are small: at most 100 strings, each up to length 100. This immediately tells us that even checking substring relations between all pairs is feasible in O(n² · L) time without optimization concerns. We are not in a regime where logarithmic factors matter; instead, correctness of ordering logic is the core difficulty.

A few edge cases matter more than they first appear.

One issue is duplicates. If the same string appears multiple times, it must be handled carefully, because a string is always a substring of itself, but ordering duplicates does not introduce new information.

Another subtle case is when a shorter string appears later in the final ordering. That is fine only if it is a substring of all longer strings that come before it, which is only possible if the structure of strings is nested in a very strong way.

A concrete failure case occurs when a string is not contained in the final largest string. For example, if we have "abc" and "b", and also "abd", then "abc" and "abd" cannot both contain "b" in a consistent way if ordering is not carefully chosen, but the real obstruction is that there is no single string that can serve as a universal container for all others.

A second subtle case is ordering purely by length. If we sort by length and assume that longer strings always contain shorter ones, we fail on cases like "ab" and "cd", where neither is a substring of the other. This would produce an ordering but violate the condition.

Approaches

The brute-force idea is straightforward: try all permutations of the strings, and check whether each ordering satisfies the substring condition. For a given permutation, we verify each prefix by checking substring membership of all earlier strings inside the current one. This verification costs O(n² · L) per permutation. Since there are n! permutations, this quickly becomes infeasible even for n = 10.

The key observation is that the condition imposes a strong global structure. If an ordering exists, then the last string must contain every other string as a substring. This is because everything before it must be contained inside it. That means the last string is a universal container.

Once we identify that, the problem reduces to selecting a candidate for the last position and checking whether it contains all other strings. If such a candidate exists, we can safely place it last, and then sort the remaining strings arbitrarily in an order consistent with containment, which can be achieved by sorting by length and verifying substring constraints along the way. In practice, sorting by increasing length works because any valid solution must have non-decreasing lengths up to containment constraints, and verification ensures correctness.

This reduces the search from factorial to a simple linear scan for the maximal valid “container” string, followed by a sorting step and verification.

Approach Time Complexity Space Complexity Verdict
Brute Force (all permutations) O(n! · n² · L) O(nL) Too slow
Optimal (candidate + sort + verify) O(n² · L log n) O(nL) Accepted

Algorithm Walkthrough

We construct a solution by identifying a string that can serve as the final element.

  1. Sort all strings by length in non-decreasing order. This makes it easier to reason about containment because shorter strings are more likely to appear inside longer ones.
  2. For each string in the list, treat it as a candidate for the last position. The reason for this is that the last string must contain all others as substrings, so only it needs to be checked globally.
  3. For a chosen candidate, verify that every other string is a substring of it. If even one is not, discard this candidate. This check is the core feasibility test.
  4. Once a valid candidate is found, construct the answer by taking all other strings and ordering them before it. We sort these remaining strings by length, which ensures that if a valid nesting structure exists, we do not violate any potential containment dependencies.
  5. Output the ordering consisting of the sorted remaining strings followed by the valid candidate.

If no candidate passes the containment test, we conclude that no valid ordering exists.

Why it works

In any valid ordering, the last string must contain all others as substrings, because every earlier string must be a substring of it. This makes the last element uniquely identifiable as a universal superstring of the set.

Once such a string is fixed at the end, all remaining strings are guaranteed to be substrings of it. Sorting the remaining strings by length preserves a safe progression from smaller to larger substrings, and since every string is verified against the candidate, no invalid placement can occur. This guarantees that the constructed ordering satisfies the prefix substring condition.

Python Solution

import sys
input = sys.stdin.readline

def is_sub(a, b):
    return a in b

n = int(input())
s = [input().strip() for _ in range(n)]

s.sort(key=len)

for i in range(n):
    ok = True
    for j in range(n):
        if i == j:
            continue
        if s[j] not in s[i]:
            ok = False
            break
    if not ok:
        continue

    other = [s[j] for j in range(n) if j != i]
    other.sort(key=len)

    print("YES")
    for x in other:
        print(x)
    print(s[i])
    sys.exit()

print("NO")

The code first sorts strings by length to reduce unnecessary permutations of equivalent cases. The main loop tries each string as the final candidate. For each candidate, it checks containment using Python’s substring operation, which runs efficiently given the small constraints.

Once a valid candidate is found, all other strings are sorted by length and printed before it. This ensures a stable structure where smaller strings appear earlier, and all are guaranteed to be substrings of the final string.

The early exit ensures we output only one valid ordering.

Worked Examples

Example 1

Input:

5
a
aba
abacaba
ba
aba

We sort by length:

a, ba, aba, aba, abacaba

Now we test candidates as final string.

Candidate Checks (all strings contained?) Result
a ba not contained reject
ba a, aba, abacaba all not contained reject
aba ba not contained reject
abacaba all others contained accept

We choose "abacaba" as the final string. Remaining strings are sorted by length:

a, ba, aba, aba

Final output:

a
ba
aba
aba
abacaba

This confirms that every prefix string appears inside the last string.

Example 2

Input:

3
ab
abc
cd

Sorted:

ab, cd, abc

We test "ab": fails since "cd" is not a substring.

We test "cd": fails since "ab" is not a substring.

We test "abc": fails since "cd" is not a substring.

No candidate works, so output is:

NO

This shows the impossibility comes from having unrelated strings that cannot all be contained in a single string.

Complexity Analysis

Measure Complexity Explanation
Time O(n² · L) Each candidate checks substring membership against all strings, and there are at most n candidates
Space O(nL) Storage for input strings and temporary filtered list

With n ≤ 100 and L ≤ 100, the total operations are at most around 10⁶ substring checks, which is easily fast enough in Python.

Test Cases

import sys, io

def run(inp: str) -> str:
    sys.stdin = io.StringIO(inp)
    from __main__ import solve
    return solve()

def solve():
    input = sys.stdin.readline
    n = int(input())
    s = [input().strip() for _ in range(n)]
    s.sort(key=len)

    for i in range(n):
        ok = True
        for j in range(n):
            if i != j and s[j] not in s[i]:
                ok = False
                break
        if not ok:
            continue
        other = [x for k, x in enumerate(s) if k != i]
        other.sort(key=len)
        return "YES\n" + "\n".join(other + [s[i]])
    return "NO"

# provided sample
assert run("""5
a
aba
abacaba
ba
aba
""") == """YES
a
ba
aba
aba
abacaba"""

# all equal
assert run("""3
aaa
aaa
aaa
""").startswith("YES")

# single string
assert run("""1
abc
""") == """YES
abc"""

# impossible case
assert run("""3
ab
cd
ef
""") == "NO"

# nested chain
assert run("""3
a
ab
abc
""").startswith("YES")
Test input Expected output What it validates
all equal strings YES ordering duplicates handled safely
single string YES base case
unrelated strings NO impossibility detection
increasing chain YES valid nesting structure

Edge Cases

One important edge case is when all strings are identical. In this situation, every string is trivially a substring of every other string. The algorithm selects any candidate as the final string. For example, with "aaa", "aaa", "aaa", every candidate passes the containment test, and the remaining ordering is arbitrary. Sorting by length keeps them unchanged, and the output remains valid.

Another case is when there is only one string. The loop selects it as the candidate, and since there are no other strings to check, it immediately passes. The output is simply that string, which satisfies the condition vacuously.

A more illustrative case is when strings form a strict chain like "a", "ab", "abc". The longest string "abc" is the only valid candidate, since it contains all others. The algorithm identifies it as the last element, sorts the remaining two as "a", "ab", and outputs a correct ordering.