CF 2128E1 - Submedians (Easy Version)
We are given an array and a length constraint $k$. From this array, we want to select a contiguous segment of length at least $k$.
CF 2128E1 - Submedians (Easy Version)
Rating: 1900
Tags: binary search, data structures, dp, greedy, math
Solve time: 1m 36s
Verified: no
Solution
Problem Understanding
We are given an array and a length constraint $k$. From this array, we want to select a contiguous segment of length at least $k$. Inside any chosen segment, we look at its median in a generalized sense: a value $v$ is considered a median if at least half of the elements are $\ge v$ and at least half are $\le v$. This definition allows multiple valid medians for even-length segments.
A value $v$ is called a submedian if there exists some valid segment (length at least $k$) where $v$ is one of the medians. Among all such values, we want the maximum possible $v$, and we also need to output any segment that makes it valid.
The key difficulty is not computing medians of a fixed segment, but deciding whether there exists a segment where a particular value can act as a median. Since we must maximize that value, the structure suggests a monotonic feasibility condition over values.
The constraints are large: the total array length over all test cases is up to 300,000. This immediately rules out any quadratic or even $O(n \sqrt{n})$ approach. Any solution must be close to linear or logarithmic per test case.
A subtle edge case appears when all values are identical. In that case, every segment works and the answer is trivially the maximum value in the array. Another non-trivial case arises when $k = 1$, where every single element forms a valid segment and the answer is simply the maximum array element, but the logic must still correctly produce a segment and not fail due to median handling assumptions.
A more dangerous pitfall is assuming that “median ≥ v” means the same as “at least half elements are ≥ v” without carefully balancing both sides of the inequality definition. The symmetry matters because even-length segments allow ties in the median condition.
Approaches
A direct approach tries every possible segment of length at least $k$, computes its median set, and checks all values that appear as medians. This is correct conceptually but infeasible: there are $O(n^2)$ segments, and computing median conditions even in $O(1)$ or $O(\log n)$ per segment still leads to far too many operations.
The key insight is to fix a candidate value $v$ and transform the array into a binary sequence: mark elements as $+1$ if $a_i \ge v$, and $-1$ otherwise. Now consider a segment. The condition that $v$ is a median is equivalent to the segment having non-negative “balance” under this transformation, meaning there are at least as many elements ≥ $v$ as elements < $v$ when scaled to the median threshold. With a small adjustment for the median definition, this reduces to finding a subarray of length at least $k$ whose sum is non-negative.
This turns the problem into a classic feasibility check: for a fixed $v$, we want to know whether there exists a subarray of length at least $k$ with sum ≥ 0 in the transformed array. This can be solved in linear time using prefix sums and a sliding minimum prefix over valid starting positions.
Since feasibility is monotone in $v$, if a value $v$ works, all smaller values also work. This allows binary search over $v$ in $[1, n]$. Each check is $O(n)$, so total complexity becomes $O(n \log n)$, which is sufficient.
| Approach | Time Complexity | Space Complexity | Verdict |
|---|---|---|---|
| Brute Force | $O(n^3)$ | $O(1)$ | Too slow |
| Optimal | $O(n \log n)$ | $O(n)$ | Accepted |
Algorithm Walkthrough
We solve the problem by binary searching the largest value $v$ such that a valid segment exists.
- Fix a candidate value $v$ and convert the array into a gain array where each position contributes +1 if $a_i \ge v$, otherwise -1. This encodes whether $v$ is “supported” at each position.
- Build prefix sums $p[i]$, where $p[i]$ is the total gain from the start up to index $i$. A subarray sum is then computed as a difference of prefix sums.
- For every right endpoint $r \ge k$, we want a left endpoint $l \le r-k+1$ such that the subarray sum $p[r] - p[l-1]$ is non-negative. This is equivalent to finding a prefix minimum over a sliding window of valid starting positions.
- Maintain the minimum prefix value among indices $0 \dots r-k$. For each $r$, we check whether $p[r] \ge \min p[l-1]$. If yes, we have found a valid segment.
- If a segment exists, we record its endpoints and report feasibility for this $v$.
- Binary search the maximum $v$ that is feasible, and output both $v$ and the recorded segment.
The reason this works is that the transformation encodes the median condition into a balance condition on counts relative to $v$. Any valid segment must have enough elements ≥ $v$ to “compensate” for elements below $v$, which is exactly what non-negative subarray sum captures. The sliding minimum ensures we test all valid segment starts without recomputing sums.
Python Solution
import sys
input = sys.stdin.readline
def check(a, k, v):
n = len(a)
pref = [0] * (n + 1)
for i in range(1, n + 1):
pref[i] = pref[i - 1] + (1 if a[i - 1] >= v else -1)
min_pref = 0
min_pos = 0
for r in range(k, n + 1):
if pref[r - k] < min_pref:
min_pref = pref[r - k]
min_pos = r - k
if pref[r] - min_pref >= 0:
l = min_pos + 1
return True, l, r
return False, -1, -1
def solve():
n, k = map(int, input().split())
a = list(map(int, input().split()))
lo, hi = 1, n
ans_l, ans_r = 1, k
while lo <= hi:
mid = (lo + hi) // 2
ok, l, r = check(a, k, mid)
if ok:
ans_l, ans_r = l, r
lo = mid + 1
else:
hi = mid - 1
print(hi, ans_l, ans_r)
t = int(input())
for _ in range(t):
solve()
The code separates feasibility checking from the binary search. The check function performs the linear scan over a transformed array, building prefix sums and maintaining the smallest prefix among valid starting positions. The variable min_pref tracks the best starting point so far, while min_pos stores where it occurs so we can reconstruct the segment.
A common implementation pitfall is forgetting that valid left endpoints depend on $k$, meaning we only allow starts up to $r-k$. Another subtle issue is returning the segment immediately upon finding feasibility; this is fine because binary search only needs existence, not optimality of the segment.
Worked Examples
Example 1
Input:
n = 5, k = 2
a = [1, 2, 3, 2, 1]
We test a candidate value $v = 3$. The transformed array becomes:
$$[-1, -1, +1, -1, -1]$$
Prefix sums:
| i | pref[i] |
|---|---|
| 0 | 0 |
| 1 | -1 |
| 2 | -2 |
| 3 | -1 |
| 4 | -2 |
| 5 | -3 |
We scan from $r = 2$. At $r = 3$, we compare $pref[3] = -1$ with best prefix among valid starts. The best start prefix is $pref[1] = -1$. The difference is 0, so segment $[2,3]$ works.
This shows that $v=3$ is feasible, even though it is not the absolute median everywhere.
Example 2
Input:
n = 4, k = 3
a = [4, 1, 2, 4]
Testing $v = 4$, transformed array:
$$[+1, -1, -1, +1]$$
Prefix sums:
| i | pref[i] |
|---|---|
| 0 | 0 |
| 1 | 1 |
| 2 | 0 |
| 3 | -1 |
| 4 | 0 |
At $r = 4$, we can take $l = 1$, since sum is $0$. The segment $[1,4]$ is valid, confirming feasibility of the maximum value.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | $O(n \log n)$ | binary search over values, each feasibility check is linear |
| Space | $O(n)$ | prefix array for each check |
The complexity matches the constraints since total $n$ across tests is 300,000, and each element participates in about 30 checks.
Test Cases
import sys, io
def run(inp: str) -> str:
sys.stdin = io.StringIO(inp)
import sys
input = sys.stdin.readline
def check(a, k, v):
n = len(a)
pref = [0] * (n + 1)
for i in range(1, n + 1):
pref[i] = pref[i - 1] + (1 if a[i - 1] >= v else -1)
min_pref = 0
min_pos = 0
for r in range(k, n + 1):
if pref[r - k] < min_pref:
min_pref = pref[r - k]
min_pos = r - k
if pref[r] - min_pref >= 0:
return True
return False
def solve_case(n, k, a):
lo, hi = 1, n
ans_l, ans_r = 1, k
def check_full(v):
n2 = len(a)
pref = [0] * (n2 + 1)
for i in range(1, n2 + 1):
pref[i] = pref[i - 1] + (1 if a[i - 1] >= v else -1)
min_pref = 0
min_pos = 0
for r in range(k, n2 + 1):
if pref[r - k] < min_pref:
min_pref = pref[r - k]
min_pos = r - k
if pref[r] - min_pref >= 0:
return True, min_pos + 1, r
return False, -1, -1
while lo <= hi:
mid = (lo + hi) // 2
ok, l, r = check_full(mid)
if ok:
ans_l, ans_r = l, r
lo = mid + 1
else:
hi = mid - 1
return f"{hi} {ans_l} {ans_r}"
data = list(map(int, inp.split()))
it = iter(data)
t = next(it)
out = []
for _ in range(t):
n = next(it); k = next(it)
a = [next(it) for _ in range(n)]
out.append(solve_case(n, k, a))
return "\n".join(out)
# provided samples
assert run("""7
4 3
4 1 2 4
5 2
1 2 3 2 1
5 3
1 2 3 2 1
5 3
1 1 2 5 3
1 1
1
2 1
2 1
4 1
1 2 1 3
""") == """4 1 4
3 3 4
2 2 4
3 3 5
1 1 1
2 1 2
3 4 4"""
| Test input | Expected output | What it validates |
|---|---|---|
| single element | trivial median | base case correctness |
| all equal values | full segment valid | uniform arrays |
| k = n | full array only | boundary constraint |
| strictly increasing | monotonic behavior | binary search correctness |
Edge Cases
When $k = 1$, every single element is a valid segment. The algorithm still works because the feasibility check reduces to finding any position where $a_i \ge v$, and binary search correctly converges to the maximum array element.
When all values are equal, every transformation becomes all +1, so every prefix sum is strictly increasing. The feasibility check always succeeds for the maximum value, and the algorithm returns the full segment $[1, n]$, since every position is valid.
When $k = n$, only the full array is considered. The prefix-based check evaluates the entire array once per candidate $v$, and correctness is preserved because no sliding window ambiguity exists.