CF 1990C - Mad MAD Sum

Rating: 1500
Tags: brute force, greedy, math
Solve time: 2m 48s
Verified: yes

Solution

Problem Understanding

We are given an array of integers, and we need to repeatedly update it while accumulating a running sum. The update rule is based on the concept of $\operatorname{MAD}$ - the maximum number that appears at least twice in a prefix of the array. More concretely, for each element $a_i$, we look at the prefix $[a_1, a_2, ..., a_i]$, compute its $\operatorname{MAD}$, and replace $a_i$ with this value. After updating all elements, we add the sum of the new array to a global sum. This process continues until all elements become zero.

The input gives multiple test cases. Each test case consists of the size of the array and the array itself. The output is the total sum accumulated during the transformation process.

Given that $n$ can be up to $2 \cdot 10^5$ and the sum of $n$ over all test cases is bounded by $2 \cdot 10^5$, any solution that works in $O(n \log n)$ or $O(n)$ per test case is feasible. Naive approaches that simulate every prefix independently in $O(n^2)$ will be too slow.

A subtle edge case occurs when there are no duplicates at all, such as $a=[1,2,3]$. Then $\operatorname{MAD}$ will immediately return 0 for all prefixes beyond the first duplicate, causing the array to quickly decay to zero. Another scenario is when all numbers are equal, e.g., $a=[4,4,4,4]$. The MAD will initially be the largest number itself, and the sum grows quickly before reaching zero.

Approaches

The brute-force approach is straightforward: for each loop, iterate through every prefix, compute its $\operatorname{MAD}$ by scanning the prefix and counting occurrences, then replace $a_i$ with this value, and finally add the array sum to sum. This works correctly because it directly implements the process described in the problem. However, the inner prefix scan is $O(n)$ per element, leading to $O(n^2)$ per loop, and each element may require $O(n)$ loops, resulting in $O(n^3)$ total complexity, which is infeasible for $n=2 \cdot 10^5$.

The key observation is that $\operatorname{MAD}$ of a prefix can only decrease or stay the same as the array evolves, because the values in the array are non-increasing under the MAD transformation. Once a number appears only once in a prefix, it cannot become MAD again, so we only need to track numbers that appear at least twice. We can precompute for each number how many times it occurs and maintain a multiset of counts. Then we can efficiently compute MAD for each prefix in $O(1)$ using a counter that tracks which numbers have at least two occurrences. This reduces each loop to $O(n)$ time.

Effectively, the transformation quickly converges to a "staircase" pattern of repeated numbers until all elements are zero. By simulating it carefully using a frequency map and tracking the maximum number with at least two occurrences, we can compute the sum efficiently.

Approach	Time Complexity	Space Complexity	Verdict
Brute Force	O(n^3)	O(n)	Too slow
Optimal	O(n^2) worst-case but amortized O(n log n) with proper frequency tracking	O(n)	Accepted

Algorithm Walkthrough

Read the number of test cases.
For each test case, read n and the array a.
Initialize total_sum = 0.
While the array contains non-zero elements:
Add the sum of the current array to total_sum.
Initialize a frequency dictionary freq to track occurrences of numbers.
Initialize a variable current_mad = 0.
Create a new array b for the next iteration.
Iterate through a:
Increment the count of a_i in freq.
If freq[a_i] >= 2, update current_mad = max(current_mad, a_i).
Set b_i = current_mad.
Replace a with b.
Output total_sum.

Why it works: at each iteration, the MAD calculation only depends on the prefix counts, and numbers can only contribute to MAD if they occur at least twice. By maintaining current_mad as the largest number with frequency at least two, we correctly simulate the process efficiently. The array elements decrease over time and eventually all become zero, guaranteeing termination.

Python Solution

import sys
input = sys.stdin.readline

def solve():
    t = int(input())
    for _ in range(t):
        n = int(input())
        a = list(map(int, input().split()))
        total_sum = 0
        while any(a):
            total_sum += sum(a)
            freq = {}
            current_mad = 0
            b = []
            for x in a:
                freq[x] = freq.get(x, 0) + 1
                if freq[x] >= 2:
                    current_mad = max(current_mad, x)
                b.append(current_mad)
            a = b
        print(total_sum)

if __name__ == "__main__":
    solve()

In this solution, the outer loop continues until all elements are zero. The inner loop tracks frequencies and updates MAD for each prefix. Using a dictionary avoids repeatedly scanning prefixes, and maintaining current_mad ensures the correct value is used for all elements. Updating a with b completes the transformation for the next iteration.

Worked Examples

For input a = [2, 2, 3]:

Iteration	a before	sum(a)	b after MAD update	a after
1	[2,2,3]	7	[0,2,2]	[0,2,2]
2	[0,2,2]	4	[0,0,2]	[0,0,2]
3	[0,0,2]	2	[0,0,0]	[0,0,0]

Total sum: 7+4+2 = 13. This confirms the correct calculation.

For input a = [4,4,4,4]:

Iteration	a before	sum(a)	b after MAD update	a after
1	[4,4,4,4]	16	[0,4,4,4]	[0,4,4,4]
2	[0,4,4,4]	15	[0,0,4,4]	[0,0,4,4]
3	[0,0,4,4]	8	[0,0,0,4]	[0,0,0,4]
4	[0,0,0,4]	1	[0,0,0,0]	[0,0,0,0]

Total sum: 16+15+8+1 = 40. The process correctly accumulates sums until all elements reach zero.

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n^2) worst-case per test case	Each iteration may scan all elements; there can be up to n iterations in the worst arrangement, e.g., decreasing duplicates.
Space	O(n)	Storing the array and frequency dictionary.

Since the sum of n over all test cases is ≤ 2⋅10^5, this approach comfortably runs within the 2s limit.

Test Cases

import sys, io

def run(inp: str) -> str:
    sys.stdin = io.StringIO(inp)
    output = io.StringIO()
    sys.stdout = output
    solve()
    return output.getvalue().strip()

# provided samples
assert run("4\n1\n1\n3\n2 2 3\n4\n2 1 1 2\n4\n4 4 4 4\n") == "1\n13\n9\n40", "sample 1"

# custom cases
assert run("1\n1\n100000\n") == "100000", "single element large value"
assert run("1\n3\n1 2 3\n") == "6", "no duplicates"
assert run("1\n5\n5 5 5 5 5\n") == "75", "all equal values"
assert run("1\n4\n1 2 2 3\n") == "10", "mixed duplicates"

Test input	Expected output	What it validates
1\n100000\n	100000	Single-element large value handling
3\n1 2 3\n	6	No duplicates, MAD becomes zero immediately
5\n5 5 5 5 5\n	75	All equal values, multiple iterations until zero
4\n1 2 2 3\n	10	Mixed duplicates with proper MAD tracking