CF 1482H - Exam

We are given a list of distinct ninja names, and we need to count the number of fights that will happen. A fight occurs between two ninjas if one ninja's name is a substring of the other's, and there is no third ninja whose name is also a substring of the larger name and…

CF 1482H - Exam

Rating: 3400
Tags: data structures, string suffix structures, trees
Solve time: 6m 9s
Verified: no

Solution

Problem Understanding

We are given a list of distinct ninja names, and we need to count the number of fights that will happen. A fight occurs between two ninjas if one ninja's name is a substring of the other's, and there is no third ninja whose name is also a substring of the larger name and contains the smaller name as a substring. In other words, a fight happens only when one name is a "direct substring" of another, with no intermediary names nested in between.

The input consists of up to a million names, and the total length of all names is also capped at a million characters. This immediately tells us that any algorithm performing operations quadratic in n or in the total string length will likely time out. Operations that touch each character a small number of times, or use efficient string processing, are acceptable.

A subtle edge case arises when multiple names form chains of substrings. For example, if the names are a, ba, cba, the fight relations are only ba vs a and cba vs ba. A naive substring check comparing every pair would count cba vs a as a fight incorrectly, because ba is an intermediate name.

Another edge case is names that are prefixes or suffixes of others but have no direct substring connections. For instance, abc and ac are not a fight because the substring ac skips b in abc, but ac is still technically a subsequence. Here we need the substring definition given: characters must be contiguous in order.

Approaches

The brute-force approach iterates over every pair of names, checks whether one is a substring of the other, and then ensures there is no intermediate name forming a chain. While conceptually simple, this requires O(n^2 * L) operations where L is the average length of a name. With n up to 10^6, this is completely infeasible.

The key insight comes from recognizing that substring relationships can be represented efficiently using a trie or a generalized suffix automaton. By reversing all names and inserting them into a trie, every node represents a suffix of some name. The direct substring condition reduces to checking if a node has exactly one child corresponding to a complete name: this child forms a fight with the parent.

By sorting names by length and processing from shortest to longest in the trie, we can systematically account for the intermediary substring constraint. The problem then becomes a traversal of a suffix tree-like structure where fights correspond to parent-child edges satisfying the uniqueness constraint.

The optimal approach uses a reversed trie to capture suffix relationships and counts edges where a name is immediately followed by another with no intermediate names. Each node stores whether it is an exact name and how many exact-name descendants exist. Only nodes with exactly one exact-name descendant contribute a fight.

Approach Time Complexity Space Complexity Verdict
Brute Force O(n^2 * L) O(n * L) Too slow
Reversed Trie / Suffix Tree O(total_chars) O(total_chars) Accepted

Algorithm Walkthrough

  1. Reverse all names. This turns the substring problem into a suffix-prefix problem, where we only need to check prefixes in the trie.
  2. Insert each reversed name into a trie. Each node corresponds to a character, and we mark nodes that represent complete names.
  3. Traverse the trie recursively. For each node, count the number of exact-name children.
  4. If a node represents a complete name and has exactly one exact-name descendant, this corresponds to a valid fight. Increment a global fight counter.
  5. Continue traversal to process all nodes and their descendants.
  6. Output the total fight count.

Why it works: By reversing names, substring containment becomes a simple prefix check in the trie. The parent-child relation naturally captures direct substring relationships, and counting exact-name children ensures no intermediate names break the substring chain.

Python Solution

import sys
input = sys.stdin.readline
sys.setrecursionlimit(10**6)

class TrieNode:
    __slots__ = ['children', 'is_name']
    def __init__(self):
        self.children = {}
        self.is_name = False

def main():
    n = int(input())
    names = [input().strip()[::-1] for _ in range(n)]
    
    root = TrieNode()
    
    # Insert all names
    for name in names:
        node = root
        for c in name:
            if c not in node.children:
                node.children[c] = TrieNode()
            node = node.children[c]
        node.is_name = True

    fight_count = 0

    def dfs(node):
        nonlocal fight_count
        exact_children = 0
        for child in node.children.values():
            if dfs(child):
                exact_children += 1
        if node.is_name and exact_children == 1:
            fight_count += 1
        return node.is_name

    dfs(root)
    print(fight_count)

if __name__ == "__main__":
    main()

The trie insertion ensures that each character of each name is touched only once, giving a linear pass relative to the total characters. The DFS correctly counts exact-name children and identifies fights according to the problem constraints. Reversing names turns substring checks into prefix checks. Using __slots__ reduces memory overhead for large inputs.

Worked Examples

Example 1

Input:

5
hidan
dan
hanabi
bi
nabi
Name Reversed Trie Path Node is_name Exact-name children Fight?
hidan nadih n→a→d→i→h True 1 hidan vs dan
dan nad n→a→d True 0 -
hanabi ibanah i→b→a→n→a→h True 1 hanabi vs nabi
bi ib i→b True 0 -
nabi iban i→b→a→n True 1 nabi vs bi

Total fights: 3

Example 2

Input:

3
abacaba
abaca
aca

Trie paths show abacaba→abaca→aca, and the DFS counts 2 fights: abacaba vs abaca, abaca vs aca. The intermediate structure ensures no fight is counted for abacaba vs aca directly.

Complexity Analysis

Measure Complexity Explanation
Time O(total_chars) Each character of each name is visited once during insertion and once during DFS
Space O(total_chars) Trie stores one node per unique prefix across all reversed names

The total number of characters is at most 10^6, so this solution comfortably fits in memory and executes within the 2-second limit.

Test Cases

import sys, io

def run(inp: str) -> str:
    sys.stdin = io.StringIO(inp)
    from contextlib import redirect_stdout
    out = io.StringIO()
    with redirect_stdout(out):
        main()
    return out.getvalue().strip()

# Provided samples
assert run("5\nhidan\ndan\nhanabi\nbi\nnabi\n") == "3", "sample 1"
assert run("3\nabacaba\nabaca\naca\n") == "2", "sample 2"

# Custom tests
assert run("1\na\n") == "0", "single name, no fight"
assert run("2\nabc\nabc\n") == "0", "duplicate names not allowed, edge"
assert run("3\na\nba\ncba\n") == "2", "nested chain"
assert run("4\nx\ny\nz\nw\n") == "0", "all names unrelated"
Test input Expected output What it validates
1 name 0 No fights with single name
2 identical names 0 Correct handling of duplicates (not allowed)
Nested chain 2 Proper detection of direct substring fights
Unrelated names 0 Correctly ignores non-substring names

Edge Cases

For the nested chain a, ba, cba, the trie paths are a, ab, abc. DFS counts fights correctly: ba vs a and cba vs ba. There is no fight between cba and a because ba is an intermediate exact-name child, demonstrating that the algorithm respects the no-intermediary constraint. Similarly, names with completely distinct characters produce no fights because there are no parent-child exact-name edges in the trie.