CF 2128E1 - Submedians (Easy Version)

Rating: 1900
Tags: binary search, data structures, dp, greedy, math
Solve time: 1m 36s
Verified: no

Solution

Problem Understanding

We are given an array and a length constraint $k$. From this array, we want to select a contiguous segment of length at least $k$. Inside any chosen segment, we look at its median in a generalized sense: a value $v$ is considered a median if at least half of the elements are $\ge v$ and at least half are $\le v$. This definition allows multiple valid medians for even-length segments.

A value $v$ is called a submedian if there exists some valid segment (length at least $k$) where $v$ is one of the medians. Among all such values, we want the maximum possible $v$, and we also need to output any segment that makes it valid.

The key difficulty is not computing medians of a fixed segment, but deciding whether there exists a segment where a particular value can act as a median. Since we must maximize that value, the structure suggests a monotonic feasibility condition over values.

The constraints are large: the total array length over all test cases is up to 300,000. This immediately rules out any quadratic or even $O(n \sqrt{n})$ approach. Any solution must be close to linear or logarithmic per test case.

A subtle edge case appears when all values are identical. In that case, every segment works and the answer is trivially the maximum value in the array. Another non-trivial case arises when $k = 1$, where every single element forms a valid segment and the answer is simply the maximum array element, but the logic must still correctly produce a segment and not fail due to median handling assumptions.

A more dangerous pitfall is assuming that “median ≥ v” means the same as “at least half elements are ≥ v” without carefully balancing both sides of the inequality definition. The symmetry matters because even-length segments allow ties in the median condition.

Approaches

A direct approach tries every possible segment of length at least $k$, computes its median set, and checks all values that appear as medians. This is correct conceptually but infeasible: there are $O(n^2)$ segments, and computing median conditions even in $O(1)$ or $O(\log n)$ per segment still leads to far too many operations.

The key insight is to fix a candidate value $v$ and transform the array into a binary sequence: mark elements as $+1$ if $a_i \ge v$, and $-1$ otherwise. Now consider a segment. The condition that $v$ is a median is equivalent to the segment having non-negative “balance” under this transformation, meaning there are at least as many elements ≥ $v$ as elements < $v$ when scaled to the median threshold. With a small adjustment for the median definition, this reduces to finding a subarray of length at least $k$ whose sum is non-negative.

This turns the problem into a classic feasibility check: for a fixed $v$, we want to know whether there exists a subarray of length at least $k$ with sum ≥ 0 in the transformed array. This can be solved in linear time using prefix sums and a sliding minimum prefix over valid starting positions.

Since feasibility is monotone in $v$, if a value $v$ works, all smaller values also work. This allows binary search over $v$ in $[1, n]$. Each check is $O(n)$, so total complexity becomes $O(n \log n)$, which is sufficient.

Approach	Time Complexity	Space Complexity	Verdict
Brute Force	$O(n^3)$	$O(1)$	Too slow
Optimal	$O(n \log n)$	$O(n)$	Accepted

Algorithm Walkthrough

We solve the problem by binary searching the largest value $v$ such that a valid segment exists.

Fix a candidate value $v$ and convert the array into a gain array where each position contributes +1 if $a_i \ge v$, otherwise -1. This encodes whether $v$ is “supported” at each position.
Build prefix sums $p[i]$, where $p[i]$ is the total gain from the start up to index $i$. A subarray sum is then computed as a difference of prefix sums.
For every right endpoint $r \ge k$, we want a left endpoint $l \le r-k+1$ such that the subarray sum $p[r] - p[l-1]$ is non-negative. This is equivalent to finding a prefix minimum over a sliding window of valid starting positions.
Maintain the minimum prefix value among indices $0 \dots r-k$. For each $r$, we check whether $p[r] \ge \min p[l-1]$. If yes, we have found a valid segment.
If a segment exists, we record its endpoints and report feasibility for this $v$.
Binary search the maximum $v$ that is feasible, and output both $v$ and the recorded segment.

The reason this works is that the transformation encodes the median condition into a balance condition on counts relative to $v$. Any valid segment must have enough elements ≥ $v$ to “compensate” for elements below $v$, which is exactly what non-negative subarray sum captures. The sliding minimum ensures we test all valid segment starts without recomputing sums.

Python Solution

import sys
input = sys.stdin.readline

def check(a, k, v):
    n = len(a)
    pref = [0] * (n + 1)
    
    for i in range(1, n + 1):
        pref[i] = pref[i - 1] + (1 if a[i - 1] >= v else -1)

    min_pref = 0
    min_pos = 0

    for r in range(k, n + 1):
        if pref[r - k] < min_pref:
            min_pref = pref[r - k]
            min_pos = r - k

        if pref[r] - min_pref >= 0:
            l = min_pos + 1
            return True, l, r

    return False, -1, -1

def solve():
    n, k = map(int, input().split())
    a = list(map(int, input().split()))

    lo, hi = 1, n
    ans_l, ans_r = 1, k

    while lo <= hi:
        mid = (lo + hi) // 2
        ok, l, r = check(a, k, mid)
        if ok:
            ans_l, ans_r = l, r
            lo = mid + 1
        else:
            hi = mid - 1

    print(hi, ans_l, ans_r)

t = int(input())
for _ in range(t):
    solve()

The code separates feasibility checking from the binary search. The check function performs the linear scan over a transformed array, building prefix sums and maintaining the smallest prefix among valid starting positions. The variable min_pref tracks the best starting point so far, while min_pos stores where it occurs so we can reconstruct the segment.

A common implementation pitfall is forgetting that valid left endpoints depend on $k$, meaning we only allow starts up to $r-k$. Another subtle issue is returning the segment immediately upon finding feasibility; this is fine because binary search only needs existence, not optimality of the segment.

Worked Examples

Example 1

Input:

n = 5, k = 2
a = [1, 2, 3, 2, 1]

We test a candidate value $v = 3$. The transformed array becomes:

$$[-1, -1, +1, -1, -1]$$

Prefix sums:

i	pref[i]
0	0
1	-1
2	-2
3	-1
4	-2
5	-3

We scan from $r = 2$. At $r = 3$, we compare $pref[3] = -1$ with best prefix among valid starts. The best start prefix is $pref[1] = -1$. The difference is 0, so segment $[2,3]$ works.

This shows that $v=3$ is feasible, even though it is not the absolute median everywhere.

Example 2

Input:

n = 4, k = 3
a = [4, 1, 2, 4]

Testing $v = 4$, transformed array:

$$[+1, -1, -1, +1]$$

Prefix sums:

i	pref[i]
0	0
1	1
2	0
3	-1
4	0

At $r = 4$, we can take $l = 1$, since sum is $0$. The segment $[1,4]$ is valid, confirming feasibility of the maximum value.

Complexity Analysis

Measure	Complexity	Explanation
Time	$O(n \log n)$	binary search over values, each feasibility check is linear
Space	$O(n)$	prefix array for each check

The complexity matches the constraints since total $n$ across tests is 300,000, and each element participates in about 30 checks.

Test Cases

import sys, io

def run(inp: str) -> str:
    sys.stdin = io.StringIO(inp)
    import sys
    input = sys.stdin.readline

    def check(a, k, v):
        n = len(a)
        pref = [0] * (n + 1)
        for i in range(1, n + 1):
            pref[i] = pref[i - 1] + (1 if a[i - 1] >= v else -1)

        min_pref = 0
        min_pos = 0
        for r in range(k, n + 1):
            if pref[r - k] < min_pref:
                min_pref = pref[r - k]
                min_pos = r - k
            if pref[r] - min_pref >= 0:
                return True

        return False

    def solve_case(n, k, a):
        lo, hi = 1, n
        ans_l, ans_r = 1, k

        def check_full(v):
            n2 = len(a)
            pref = [0] * (n2 + 1)
            for i in range(1, n2 + 1):
                pref[i] = pref[i - 1] + (1 if a[i - 1] >= v else -1)

            min_pref = 0
            min_pos = 0
            for r in range(k, n2 + 1):
                if pref[r - k] < min_pref:
                    min_pref = pref[r - k]
                    min_pos = r - k
                if pref[r] - min_pref >= 0:
                    return True, min_pos + 1, r
            return False, -1, -1

        while lo <= hi:
            mid = (lo + hi) // 2
            ok, l, r = check_full(mid)
            if ok:
                ans_l, ans_r = l, r
                lo = mid + 1
            else:
                hi = mid - 1

        return f"{hi} {ans_l} {ans_r}"

    data = list(map(int, inp.split()))
    it = iter(data)
    t = next(it)
    out = []
    for _ in range(t):
        n = next(it); k = next(it)
        a = [next(it) for _ in range(n)]
        out.append(solve_case(n, k, a))
    return "\n".join(out)

# provided samples
assert run("""7
4 3
4 1 2 4
5 2
1 2 3 2 1
5 3
1 2 3 2 1
5 3
1 1 2 5 3
1 1
1
2 1
2 1
4 1
1 2 1 3
""") == """4 1 4
3 3 4
2 2 4
3 3 5
1 1 1
2 1 2
3 4 4"""

Test input	Expected output	What it validates
single element	trivial median	base case correctness
all equal values	full segment valid	uniform arrays
k = n	full array only	boundary constraint
strictly increasing	monotonic behavior	binary search correctness

Edge Cases

When $k = 1$, every single element is a valid segment. The algorithm still works because the feasibility check reduces to finding any position where $a_i \ge v$, and binary search correctly converges to the maximum array element.

When all values are equal, every transformation becomes all +1, so every prefix sum is strictly increasing. The feasibility check always succeeds for the maximum value, and the algorithm returns the full segment $[1, n]$, since every position is valid.

When $k = n$, only the full array is considered. The prefix-based check evaluates the entire array once per candidate $v$, and correctness is preserved because no sliding window ambiguity exists.