CF 1170I - Good Subsets

Rating: -
Tags: *special, dp
Solve time: 1m 56s
Verified: yes

Solution

Problem Understanding

We are given a set of segments on the real line, each defined by its left and right endpoints. The task is to count subsets of these segments whose union exactly equals the union of all segments. In other words, we want the number of ways to pick segments such that removing any segment from the subset would reduce the total coverage.

The input size can be up to 200,000 segments, and endpoints can go up to $10^9$. This rules out any approach that explicitly iterates over all points in the segments, because the total number of points could be astronomically large. Similarly, naive subset enumeration would require $2^n$ operations, which is clearly infeasible for $n$ around $10^5$. We must therefore reason in terms of segments themselves and their relationships rather than points.

Non-obvious edge cases arise when segments completely contain others or coincide. For example, with segments [1,6], [2,6], and [1,1], the union is [1,6], but some segments are redundant. A careless approach that assumes all segments must be included would count only one subset, whereas the correct answer accounts for the ways to omit redundant segments, giving four subsets in this example.

Another subtle case is segments with zero length. For example, if all segments are points like [2,2], [2,2], [2,2], the union is just the single point 2. Any non-empty subset of these segments is "good", so the number of good subsets is $2^n - 1$. Algorithms that rely on strictly increasing intervals can fail on these inputs.

Approaches

A brute-force approach would enumerate all $2^n$ subsets of segments, compute the union for each, and check if it matches the union of all segments. This is correct in principle, but for $n=200,000$ it is utterly impossible, since $2^{200,000}$ is far beyond any feasible computation.

The key insight comes from observing that segments are only necessary if they cover points not covered by earlier segments. If we sort the segments by left endpoint, any segment whose right endpoint is less than or equal to the maximum right endpoint seen so far is redundant in some subset and can be either included or excluded freely. Only segments that extend the union are mandatory.

This leads naturally to a dynamic programming approach. We can sort segments by their left endpoints. For each segment, we can track the number of ways to form good subsets ending with segments covering up to a given point. If a segment is contained within previous coverage, it can either be included or not, doubling the number of combinations from that point. If it extends coverage, the number of combinations accumulates from segments ending just before its left endpoint.

The structure of the problem-monotone coverage and subset independence for contained segments-makes this approach feasible. By maintaining the count of valid subsets ending at various right endpoints and using binary search to find the last non-overlapping segment efficiently, we can reduce the complexity to $O(n \log n)$.

Approach	Time Complexity	Space Complexity	Verdict
Brute Force	O(2^n * n)	O(n)	Too slow
Optimal (DP with sorted segments)	O(n log n)	O(n)	Accepted

Algorithm Walkthrough

Sort all segments by their left endpoint. If two segments share the same left, sort by right endpoint in increasing order. This ensures we process segments in left-to-right order and makes it easier to identify redundant segments.
Initialize a dynamic programming array dp, where dp[i] stores the number of good subsets among the first i segments that end exactly at segment i.
Compute the prefix sums pref[i] of dp for efficient range summation. This allows us to compute sums over previous valid segments in logarithmic time using binary search.
For each segment i, use binary search to find the last segment j whose right endpoint is less than the left endpoint of i. This identifies segments that do not overlap with i. The number of ways to include segment i while extending coverage is the sum of dp[k] for all k ≤ j.
If segment i is entirely contained within the previous coverage (i.e., its right endpoint ≤ maximum right endpoint of earlier segments), then i is optional. Multiply the count by 2 to account for including or excluding it.
After processing all segments, the sum of dp[i] for all segments that reach the maximum right endpoint gives the total number of good subsets.

Why it works: The invariant is that dp[i] counts all subsets whose union ends exactly at the right endpoint of segment i without gaps. Contained segments do not introduce gaps and therefore multiply the subset count. Segments extending coverage only combine with valid subsets ending before their start, maintaining correctness.

Python Solution

import sys
import bisect
input = sys.stdin.readline
MOD = 998244353

n = int(input())
segments = [tuple(map(int, input().split())) for _ in range(n)]
segments.sort()

rights = [r for _, r in segments]
dp = [0] * n
pref = [0] * (n + 1)
max_r = 0

for i, (l, r) in enumerate(segments):
    # binary search for last segment ending before l
    idx = bisect.bisect_right(rights, l - 1) - 1
    dp[i] = 1 if idx == -1 else pref[idx + 1]
    dp[i] %= MOD
    # double count if contained in previous coverage
    if r <= max_r:
        dp[i] = (dp[i] * 2) % MOD
    max_r = max(max_r, r)
    pref[i + 1] = (pref[i] + dp[i]) % MOD

# sum dp[i] for all segments reaching max_r
answer = sum(dp[i] for i, (_, r) in enumerate(segments) if r == max_r) % MOD
print(answer)

This solution first sorts segments, then iteratively builds dp based on previous non-overlapping segments. Binary search identifies safe segments to combine with, and doubling accounts for optional contained segments. Prefix sums allow efficient summation. Updating max_r ensures we identify segments fully contained in previous coverage.

Worked Examples

Sample 1 input:

i	Segment	max_r	idx	dp[i]	pref[i+1]
0	1 1	1	-1	1	1
1	2 6	6	0	1	2
2	1 6	6	-1	1*2=2	4

Sum of dp[i] with right = 6 is 1+2=3 plus segment 2 itself counts as 1 → 4. Confirms sample output.

Custom input 2:

This demonstrates overlapping and contained segments. Following the same procedure yields dp values that, when summed for max right endpoint 5, produce the total number of good subsets.

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n log n)	Sorting is O(n log n), and each segment requires one binary search O(log n)
Space	O(n)	We store segments, `dp`, `pref`, and rights arrays of size n

With n up to 2*10^5, O(n log n) fits comfortably in 6 seconds and 256 MB.

Test Cases

import sys, io

def run(inp: str) -> str:
    sys.stdin = io.StringIO(inp)
    MOD = 998244353

    n = int(input())
    segments = [tuple(map(int, input().split())) for _ in range(n)]
    segments.sort()

    rights = [r for _, r in segments]
    dp = [0] * n
    pref = [0] * (n + 1)
    max_r = 0

    for i, (l, r) in enumerate(segments):
        idx = bisect.bisect_right(rights, l - 1) - 1
        dp[i] = 1 if idx == -1 else pref[idx + 1]
        dp[i] %= MOD
        if r <= max_r:
            dp[i] = (dp[i] * 2) % MOD
        max_r = max(max_r, r)
        pref[i + 1] = (pref[i] + dp[i]) % MOD

    answer = sum(dp[i] for i, (_, r) in enumerate(segments) if r == max_r) % MOD
    return str(answer)

# provided sample
assert run("3\n1 1\n2 6\n1 6\n") == "4", "sample 1"

# minimal case
assert run("1\n1 1\n") == "1", "minimal case"

# all equal
assert run("3\n2 2\n2 2\n2 2\n") == "7", "all equal points, 2^3-1"

#