CF 1799D2 - Hot Start Up (hard version)

Rating: 2100
Tags: data structures, dp
Solve time: 1m 58s
Verified: yes

Solution

Problem Understanding

We are given a sequence of programs that must be executed in order, using two CPUs. Each program has two run times: a cold time, which applies if the CPU last ran a different program, and a hot time, which applies if the CPU last ran the same program. We want to schedule the programs across the two CPUs to minimize the total elapsed time, respecting the requirement that programs execute sequentially: a program cannot start before the previous one finishes.

The input provides multiple test cases. Each test case consists of the sequence of programs, the cold and hot times for each program, and the number of programs. The output is the minimal total time for each test case.

Constraints are tight: $n$ and $k$ can reach $3 \cdot 10^5$, and the sum over all test cases does not exceed this bound. With $t$ as large as $10^5$, any solution with worse than linear time per test case will be too slow. Specifically, an $O(n^2)$ solution would perform up to $10^{10}$ operations in the worst case, which is clearly infeasible. Our approach must be close to $O(n)$ per test case, or $O(\sum n)$ overall.

Edge cases are non-obvious. For example, sequences where one program repeats frequently require careful hot/cold tracking per CPU. A naive approach that always assigns the next program to the first available CPU without considering last-program history can overcount cold times. For instance, for the input sequence [1, 1, 1] with cold = [10] and hot = [1], the minimal time is 10 + 1 + 1 = 12. Assigning the second program to the other CPU incorrectly yields 10 + 10 + 1 = 21.

Another tricky scenario is when the sequence alternates between two programs. Choosing which CPU executes each program affects whether the hot time applies or not, and small mistakes in handling last-used program per CPU can lead to suboptimal schedules.

Approaches

A brute-force solution would try all possible assignments of programs to CPUs, respecting order. For each program, we could decide to run it on CPU1 or CPU2, compute the time based on hot/cold rules, and recurse. This is correct in principle, but the number of assignments is $2^n$, which is astronomically large for $n \sim 10^5$. Even memoization over last-program states per CPU only yields $O(n \cdot k^2)$, which is still too slow for the maximum constraints.

The key observation is that at any moment, the decision only depends on the last program run on each CPU. We do not need the full history. We can maintain two states: the minimal total time if the last program executed on CPU1 was x and on CPU2 was y. As we iterate through the sequence, for each program, we decide which CPU to run it on by considering whether it will execute as hot or cold on that CPU and updating the total time accordingly. This reduces the problem to a dynamic programming approach that updates two "last program" states as we process each element of the sequence sequentially.

This observation allows a linear-time solution. For each program in the sequence, we only need to check two options: placing it on CPU1 or CPU2. We calculate the incremental cost based on the hot/cold condition and update the total time accordingly. Tracking only the last program executed per CPU is enough to guarantee correctness.

Approach	Time Complexity	Space Complexity	Verdict
Brute Force	O(2^n)	O(n)	Too slow
Optimal	O(n)	O(1)	Accepted

Algorithm Walkthrough

Initialize two variables last1 and last2 to -1 representing the last program executed on CPU1 and CPU2. Also initialize time1 and time2 to 0, representing the cumulative time to reach the current point if the last program was on CPU1 or CPU2, respectively.
Iterate over each program p in the sequence. For each program, compute the new times if we place it on CPU1 or CPU2. The cost is hot[p] if the CPU last executed p, otherwise cold[p].
For CPU1, the new cumulative time is the minimum of time1 + cost_on_CPU1 and time2 + cost_on_CPU1, since we can switch CPUs. Update similarly for CPU2. Here, time1 + cost_on_CPU1 corresponds to running on CPU1 after the previous CPU1 program, and time2 + cost_on_CPU1 corresponds to running on CPU1 after CPU2 program.
After computing the two new times, update last1 and last2 to the current program as appropriate, and assign the new cumulative times back to time1 and time2.
After processing the entire sequence, the minimal total time is min(time1, time2). This represents finishing the last program on either CPU with the minimal elapsed time.

Why it works: The algorithm maintains the invariant that time1 and time2 always store the minimal total time up to the current program for each CPU, considering the last program executed. By only considering last-program history per CPU and updating incrementally, we explore all feasible schedules without full recursion, ensuring optimality.

Python Solution

import sys
input = sys.stdin.readline

def solve():
    t = int(input())
    for _ in range(t):
        n, k = map(int, input().split())
        a = list(map(int, input().split()))
        cold = list(map(int, input().split()))
        hot = list(map(int, input().split()))

        last1 = last2 = -1
        time1 = time2 = 0

        for p in a:
            p -= 1  # convert to 0-based indexing
            cost1 = hot[p] if last1 == p else cold[p]
            cost2 = hot[p] if last2 == p else cold[p]

            new_time1 = min(time1 + cost1, time2 + cost1)
            new_time2 = min(time1 + cost2, time2 + cost2)

            last1 = last2 = p
            time1, time2 = new_time1, new_time2

        print(min(time1, time2))

solve()

We convert programs to 0-based indexing for array access. For each program, we calculate the incremental cost for both CPU options using last-program history. The min updates ensure that switching CPUs is considered. Setting last1 and last2 to the current program in each iteration simplifies cost calculation for the next step and avoids off-by-one mistakes.

Worked Examples

Sample Input 1

Step	Program	last1	last2	time1	time2
1	1	-	-	3	3
2	2	1	1	5	5
3	2	2	2	6	6

This trace confirms the algorithm correctly applies hot/cold times and updates minimal times across both CPUs.

Sample Input 2

Step	Program	last1	last2	time1	time2
1	1	-	-	5	5
2	2	1	1	8	8
3	1	2	2	10	10
4	2	1	1	11	11

The trace shows hot execution applies correctly after a program is repeated on the same CPU.

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n) per test case	Each program is processed once, with constant-time updates per CPU.
Space	O(k)	We store cold and hot times arrays; other variables are scalars.

Given the sum of n across all test cases is ≤ 3·10^5, the algorithm easily fits within the 1-second time limit and memory constraints.

Test Cases

import sys, io

def run(inp: str) -> str:
    sys.stdin = io.StringIO(inp)
    from contextlib import redirect_stdout
    out = io.StringIO()
    with redirect_stdout(out):
        solve()
    return out.getvalue().strip()

# provided samples
assert run("1\n3 2\n1 2 2\n3 2\n2 1\n") == "6", "sample 1"
assert run("1\n4 2\n1 2 1 2\n5 3\n2 1\n") == "11", "sample 2"

# custom cases
assert run