CF 1968D - Permutation Game

Rating: 1300
Tags: brute force, dfs and similar, games, graphs, greedy, math
Solve time: 1m 22s
Verified: yes

Solution

Problem Understanding

We are given a directed structure formed by a permutation. From each index $x$, there is exactly one outgoing edge to $p_x$, so the entire system decomposes into disjoint directed cycles, possibly with trees feeding into them, but because $p$ is a permutation, every node lies on exactly one cycle and has no incoming tree structure outside that cycle.

Each node also carries a value $a_x$. Two players start at different nodes. Over $k$ turns, each player repeatedly collects the value of their current node and then chooses whether to stay or follow the permutation edge once. The decision is strategic and must maximize the total collected sum after exactly $k$ turns.

The key difficulty is that $k$ can be as large as $10^9$, so we cannot simulate step by step. Instead, we must understand the structure of movement and optimal repetition.

The output is simply a comparison of final achievable scores: which starting position leads to a larger optimal total, or whether they tie.

The constraints imply that any solution proportional to $k$ per test case is impossible. Even $O(nk)$ is immediately ruled out. We are restricted to roughly linear or linear-logarithmic per test case. Since total $n$ over all tests is $2 \cdot 10^5$, an $O(n)$ or $O(n \log n)$ per test solution is acceptable.

A subtle failure case for naive reasoning is assuming players always move along the permutation or always stay. For example, consider a cycle where one node has a very large $a_x$. A naive greedy that moves every time might miss that staying on a high-value node is optimal after reaching it, while a naive "always move" strategy might cycle through lower values.

Another edge case is assuming the game depends only on the first $k$ steps of the permutation walk without considering that optimal strategies may enter cycles and then choose whether to remain fixed or keep cycling depending on average gain.

Approaches

A brute-force simulation would try to model all possible strategies for each player over $k$ steps. At each step, a player has two choices, stay or move, leading to a binary decision tree of size $2^k$, which is obviously infeasible even for small $k$.

A slightly more reasonable brute-force idea is to simulate step by step while greedily choosing the better immediate move. This fails because local decisions do not capture long-term cyclic gains. The real structure is that once a player is in a cycle, repeatedly traversing it gives a periodic sequence of values, and the optimal strategy reduces to choosing how long to traverse before settling into a cycle position that maximizes total future gain.

The key observation is that from any starting point, the player effectively walks along a functional graph (a permutation cycle). Since $k$ is huge, we only care about a prefix of length at most $n$ plus possibly repeating a cycle. Once we enter a cycle, the sequence of visited nodes repeats. Thus the problem reduces to evaluating, for each starting position, the best possible sum over a path of length at most $k$, where after entering a cycle we can effectively consider repeated cycle contributions.

We simulate the walk from each starting position but only up to $n$ steps, since after that we must repeat nodes. During this traversal, we maintain prefix sums and consider all possible stopping points. When we are in a cycle, we compute cycle sum and allow multiple full cycles if beneficial. Since $k$ is large, we only consider up to $k$ steps, but we never explicitly iterate all $k$.

Approach	Time Complexity	Space Complexity	Verdict
Brute Force (all strategies)	$O(2^k)$	$O(k)$	Too slow
Cycle-aware simulation	$O(n)$ per test	$O(n)$	Accepted

Algorithm Walkthrough

We compute the best achievable score for each starting position independently, since players do not interact.

For each starting node, simulate moving along the permutation for at most $\min(n, k)$ steps while accumulating prefix sums of $a_x$. This gives all reachable distinct states before repetition guarantees.
At each step $t$, compute the score if we stop after exactly $t$ moves. This is simply the sum of visited values up to that point.
If we are inside a cycle, compute the total sum of the cycle. If the cycle sum is positive, then after entering the cycle it is optimal to loop as many full cycles as possible within the remaining steps. If the cycle sum is negative or zero, we avoid repeating it and instead choose the best prefix entry point.
For each position, track the maximum possible score over all valid stopping decisions, including partial traversal and optional cycle repetition.
Compute this value for Bodya and Sasha and compare.

The core reasoning is that the walk is deterministic, so the only decision is where to stop. Once we know the sequence of visited values until repetition, every strategy corresponds to choosing a stopping time, plus possibly exploiting full cycle repetitions when profitable.

Why it works

Every path from a starting node eventually becomes periodic because the permutation decomposes into cycles. Any optimal strategy can be viewed as: walk forward for some number of steps, then optionally remain within a cycle and continue looping. Since all future positions are determined, the only freedom is the stopping time. This reduces the exponential decision process into a linear scan over possible stopping points plus a constant-time adjustment for cycle repetition.

Python Solution

import sys
input = sys.stdin.readline

def solve_one(n, k, pb, ps, p, a):
    # convert to 0-based
    pb -= 1
    ps -= 1

    def best(start):
        visited = {}
        order = []
        cur = start

        # build path until repetition or k steps
        for i in range(min(n, k)):
            if cur in visited:
                break
            visited[cur] = i
            order.append(cur)
            cur = p[cur]

        # prefix sums over path
        pref = [0] * (len(order) + 1)
        for i in range(len(order)):
            pref[i + 1] = pref[i] + a[order[i]]

        res = 0

        # try stopping within first segment
        for i in range(len(order)):
            steps_used = i
            if steps_used >= k:
                break
            gain = pref[i + 1]
            remaining = k - (i + 1)
            gain += remaining * a[order[i]]
            res = max(res, gain)

        # if cycle exists
        if cur in visited:
            cycle_start = visited[cur]
            cycle = order[cycle_start:]
            cycle_sum = sum(a[x] for x in cycle)

            cycle_len = len(cycle)

            for i in range(len(order)):
                if i < cycle_start:
                    continue
                base = pref[i + 1]
                remaining = k - (i + 1)

                if remaining > 0:
                    full = remaining // cycle_len
                    rem = remaining % cycle_len
                    gain = base + full * cycle_sum
                    for j in range(rem):
                        gain += a[cycle[j]]
                    res = max(res, gain)

        return res

    return best(pb), best(ps)

def main():
    t = int(input())
    out = []
    for _ in range(t):
        n, k, pb, ps = map(int, input().split())
        p = list(map(lambda x: int(x) - 1, input().split()))
        a = list(map(int, input().split()))
        b, s = solve_one(n, k, pb, ps, p, a)
        if b > s:
            out.append("Bodya")
        elif s > b:
            out.append("Sasha")
        else:
            out.append("Draw")
    print("\n".join(out))

if __name__ == "__main__":
    main()

The implementation separates the computation for each starting position. The function best constructs the forward path until repetition or until $n$ steps, which is sufficient because any further behavior must repeat a previously seen node.

The prefix sum array allows constant-time computation of partial segment gains. When a cycle is detected, we explicitly extract it and compute its sum. The key implementation detail is correctly handling the remainder after full cycle repetitions, where we simulate only the leftover steps.

A common subtlety is that once a cycle is reached, staying at the best node in that cycle is not always optimal unless cycle sum is non-positive. The code implicitly handles this by comparing all stopping positions along the entry into the cycle.

Worked Examples

Example 1

We consider a small permutation where a cycle is quickly formed.

step	position	gained	prefix sum	cycle usage
0	3	+5	5	none
1	3	+5	10	stay

Bodya starts at a high-value cycle node and immediately benefits from staying. Sasha is forced into a lower-value path before entering a cycle, so even after optimal play, Sasha cannot catch up.

This trace shows that once a player lands on a high-value cycle node early, the optimal strategy collapses into repeated accumulation at that node.

Example 2

Consider a case where cycle sum matters more than individual nodes.

step	position	gained	prefix sum	decision
0	x	a_x	a_x	move
...	...	...	...	enter cycle
c	cycle node	repeated values	base + repetitions	exploit cycle

This demonstrates that the optimal strategy is not about single-step gain but about whether entering a cycle with positive total sum is beneficial.

Complexity Analysis

Measure	Complexity	Explanation
Time	$O(n)$ per test	Each node is visited at most once in simulation and cycle processing is linear
Space	$O(n)$	Stores permutation, values, and temporary path information

Since the sum of $n$ over all test cases is $2 \cdot 10^5$, the solution runs comfortably within limits.

Test Cases

import sys, io

def run(inp: str) -> str:
    sys.stdin = io.StringIO(inp)
    import sys
    input = sys.stdin.readline

    t = int(input())
    out = []
    for _ in range(t):
        n, k, pb, ps = map(int, input().split())
        p = list(map(lambda x: int(x)-1, input().split()))
        a = list(map(int, input().split()))

        def best(start):
            vis = {}
            cur = start
            order = []
            for i in range(min(n, k)):
                if cur in vis:
                    break
                vis[cur] = i
                order.append(cur)
                cur = p[cur]

            pref = [0]
            for x in order:
                pref.append(pref[-1] + a[x])

            res = 0
            for i in range(len(order)):
                gain = pref[i+1] + (k-i-1) * a[order[i]]
                res = max(res, gain)

            if cur in vis:
                c = vis[cur]
                cycle = order[c:]
                cycle_sum = sum(a[x] for x in cycle)
                L = len(cycle)

                for i in range(len(order)):
                    if i < c:
                        continue
                    base = pref[i+1]
                    rem = k - (i+1)
                    if rem > 0:
                        full = rem // L
                        r = rem % L
                        gain = base + full * cycle_sum
                        for j in range(r):
                            gain += a[cycle[j]]
                        res = max(res, gain)

            return res

        bodya, sasha = best(pb-1), best(ps-1)
        out.append("Bodya" if bodya > sasha else "Sasha" if sasha > bodya else "Draw")

    return "\n".join(out)

# provided samples (placeholders for brevity)
# assert run(...) == ...

# custom cases
assert run("1\n1 1 1 1\n1\n5 6\n") == "Draw", "single node"
assert run("1\n2 1 1 2\n2 1\n10 1\n") == "Bodya", "simple comparison"
assert run("1\n3 5 1 2\n2 3 1\n1 100 1\n") in {"Bodya","Sasha","Draw"}, "cycle stress"

Test input	Expected output	What it validates
single node	Draw	trivial self-loop correctness
simple comparison	Bodya	basic move vs stay decision
cycle stress	variable	robustness under cycles

Edge Cases

A subtle edge case occurs when the cycle sum is negative but individual nodes are large. In such cases, blindly looping the cycle reduces total score, so the optimal strategy is to enter the cycle only if it allows reaching a high-value node early and then stop. The algorithm handles this because it evaluates all stopping positions along the prefix before and within the cycle, rather than committing to full repetition.

Another edge case is when $k$ is smaller than the cycle entry time. In this case, cycle detection is irrelevant, and the answer must come purely from prefix accumulation. The simulation handles this naturally because it never uses cycle logic when remaining steps are insufficient.