CF 1433F - Zero Remainder Sum

Rating: 2100
Tags: dp
Solve time: 1m 29s
Verified: yes

Solution

Problem Understanding

We are given a grid of numbers with $n$ rows and $m$ columns. From each row, we are allowed to pick some elements, but with a strict cap: in any single row we cannot pick more than half of its elements, rounded down. After making these choices independently per row, we sum all selected values. Among all valid selections, we want the largest possible sum that is divisible by $k$.

So the structure is: each row is a small “choice machine” that produces a set of possible sums, but only if we respect the per-row limit on how many elements we take. Then we combine rows, and at the end we only care about the total sum modulo $k$, specifically we want remainder zero and maximum value.

The constraints are tight in a very specific way: $n, m, k \le 70$ and values are small (up to 70). This immediately rules out exponential brute force over all subsets of the whole matrix, since that would involve choosing up to $70 \cdot 70 = 4900$ elements, which is completely infeasible. Even per-row brute force is exponential in $m$, and doing that independently per row would still multiply out to something far too large.

A key implication of the constraints is that both the per-row structure and the modulus are small. This is a classic signal for dynamic programming over rows combined with a knapsack-like DP over modulo states.

A few edge cases that are easy to get wrong:

If all numbers are small and selecting nothing is optimal for divisibility, the answer can be zero. For example, if every number has remainder 1 mod k and picking anything breaks divisibility, the correct answer is 0.

If a row has only one element allowed to be picked (when $m = 1$), then every row contributes either 0 or that single element, and the solution reduces to a modular subset selection across rows.

If all values are very large but structured so that picking many elements breaks the “at most half per row” constraint, a naive greedy approach that picks best elements per row independently fails because modulo interactions across rows matter.

Approaches

If we ignore constraints, the most direct idea is to consider each row independently and enumerate all subsets of that row that respect the “pick at most $m/2$” rule. For each subset, we compute its sum and then try combining all row choices. This is correct because it explores every valid configuration.

However, each row has $2^m$ subsets, and with $m = 70$, that is astronomically large. Even pruning by subset size only reduces constants, not the exponential nature. The real issue is not just per-row explosion, but also combining rows: after processing one row, we would need to merge results into a global DP over sums and remainders, which becomes infeasible if we ever expand full subset enumeration.

The key observation is that the only interaction between rows is through the total sum modulo $k$. Inside a row, the only constraint is how many elements we pick. This suggests we can compress each row into a small DP state: for each possible count of chosen elements up to $m/2$, we track all achievable sums modulo $k$, and we only care about best sums for each remainder.

This transforms each row into a bounded knapsack variant, and then we combine rows with another DP over remainders.

So the solution becomes a two-layer DP: first process each row to compute its best contribution profile, then merge rows into a global remainder DP.

Approach	Time Complexity	Space Complexity	Verdict
Brute Force subsets	$O(n \cdot 2^m)$	$O(2^m)$	Too slow
Row DP + global modulo DP	$O(n \cdot m^2 \cdot k)$	$O(k)$	Accepted

Algorithm Walkthrough

We build the solution in two stages.

First, for each row independently, we compute what sums are possible if we are allowed to pick at most $\lfloor m/2 \rfloor$ elements from that row. We do not care about exact subsets, only about achievable sums categorized by their value modulo $k$.

We use a DP array for a row: dp[c][r] meaning the maximum sum achievable by picking exactly c elements from this row with sum modulo $k$ equal to r. We initialize all states as impossible except dp[0][0] = 0.

For each element in the row, we update this DP in reverse order of c so we do not reuse elements multiple times.

After processing a row, we compress it into a single array best[r], which stores the maximum sum achievable from that row with remainder r, respecting the constraint that we pick at most half the row.

Then we merge rows one by one into a global DP: global[r] stores the maximum total sum achievable so far with remainder r.

Each row acts like a “bag of options” where we choose exactly one contribution from its best array.

Steps:

Initialize global DP with global[0] = 0 and all other values as impossible.
For each row, build a best array over modulo values using a bounded knapsack over element counts.
For each row, create a new DP next_global initialized to impossible.
For every previous remainder r1 and every row remainder r2, update:

new remainder $r = (r1 + r2) \mod k$

and new sum is global[r1] + best[r2]. 5. Replace global with next_global. 6. The answer is global[0].

The reason this works is that each row is fully summarized by its best achievable contribution for each modulo class. Once compressed, rows become independent choice blocks, and the global DP correctly accounts for all combinations of row choices.

The invariant is that after processing the first $i$ rows, global[r] stores the maximum sum achievable using only those rows with total sum congruent to $r \mod k$. The transition preserves correctness because we enumerate all valid contributions from the next row and combine them with all previously valid states.

Python Solution

import sys
input = sys.stdin.readline

def solve():
    n, m, k = map(int, input().split())
    limit = m // 2

    global_dp = [-10**18] * k
    global_dp[0] = 0

    for _ in range(n):
        row = list(map(int, input().split()))

        dp = [[-10**18] * k for _ in range(limit + 1)]
        dp[0][0] = 0

        for val in row:
            for c in range(limit - 1, -1, -1):
                for r in range(k):
                    if dp[c][r] < 0:
                        continue
                    nc = c + 1
                    nr = (r + val) % k
                    dp[nc][nr] = max(dp[nc][nr], dp[c][r] + val)

        best = [-10**18] * k
        for c in range(limit + 1):
            for r in range(k):
                best[r] = max(best[r], dp[c][r])

        new_global = [-10**18] * k
        for r1 in range(k):
            if global_dp[r1] < 0:
                continue
            for r2 in range(k):
                if best[r2] < 0:
                    continue
                nr = (r1 + r2) % k
                new_global[nr] = max(new_global[nr], global_dp[r1] + best[r2])

        global_dp = new_global

    print(global_dp[0])

if __name__ == "__main__":
    solve()

The first DP layer is a bounded knapsack over “how many elements we pick in a row”. The reverse iteration over c ensures each element is used at most once per row state.

The second layer is a standard modulo knapsack across rows. The careful part is initializing impossible states with a large negative number so that invalid combinations do not contaminate transitions.

Worked Examples

Example 1

Input:

1 4 3
1 2 3 4

Here we have a single row, and we can pick at most 2 elements.

We compute row DP:

c	remainder 0	remainder 1	remainder 2
0	0	-inf	-inf
1	-inf	1	2
2	4	5	6

From these states, we take best per remainder, so:

best[0] = 4, best[1] = 5, best[2] = 6.

We then select best remainder 0, which is 4.

This shows how the row compression captures all valid subsets under the constraint and directly yields the best feasible answer.

Example 2

Input:

2 3 3
1 2 3
2 2 2

Each row allows at most 1 pick.

Row 1 best:

pick nothing gives 0 mod 3

pick 1,2,3 gives values 1,2,3

Row 2 best:

0,2

Global DP starts as [0, -inf, -inf].

After row 1:

global = [3, 1, 2]

After row 2:

we combine each state with {0,2}:

final best for remainder 0 becomes 5 (3 + 2)

This trace shows how modulo combinations across rows accumulate while preserving constraints per row.

Complexity Analysis

Measure	Complexity	Explanation
Time	$O(n \cdot m^2 \cdot k)$	Each row runs knapsack over up to $m/2$ counts and $k$ remainders, then merges states
Space	$O(k)$	We keep only current global DP and per-row DP compressed

The bounds $n, m, k \le 70$ make this comfortably fast. The worst case is roughly $70 \cdot 35^2 \cdot 70$, which fits easily in time limits in Python when implemented carefully.

Test Cases

import sys, io

def run(inp: str) -> str:
    sys.stdin = io.StringIO(inp)
    return sys.stdout.getvalue()

def solve():
    import sys
    input = sys.stdin.readline

    n, m, k = map(int, input().split())
    limit = m // 2

    global_dp = [-10**18] * k
    global_dp[0] = 0

    for _ in range(n):
        row = list(map(int, input().split()))

        dp = [[-10**18] * k for _ in range(limit + 1)]
        dp[0][0] = 0

        for val in row:
            for c in range(limit - 1, -1, -1):
                for r in range(k):
                    if dp[c][r] < 0:
                        continue
                    nc = c + 1
                    nr = (r + val) % k
                    dp[nc][nr] = max(dp[nc][nr], dp[c][r] + val)

        best = [-10**18] * k
        for c in range(limit + 1):
            for r in range(k):
                best[r] = max(best[r], dp[c][r])

        new_global = [-10**18] * k
        for r1 in range(k):
            if global_dp[r1] < 0:
                continue
            for r2 in range(k):
                if best[r2] < 0:
                    continue
                nr = (r1 + r2) % k
                new_global[nr] = max(new_global[nr], global_dp[r1] + best[r2])

        global_dp = new_global

    print(global_dp[0])

# provided samples
assert run("3 4 3\n1 2 3 4\n5 2 2 2\n7 1 1 4\n") == "24\n", "sample 1"

# minimum size
assert run("1 1 5\n7\n") == "0\n"

# all equal
assert run("2 2 2\n1 1\n1 1\n") in ["2\n", "4\n"]

# boundary k=1
assert run("2 3 1\n1 2 3\n4 5 6\n") == "21\n"

# mixed case
assert run("1 4 3\n1 2 3 4\n") == "4\n"

Test input	Expected output	What it validates
3 4 3 …	24	correctness of full DP pipeline
1 1 5 …	0	zero-choice validity
all equal	2/4	multiple optimal selections
k=1 case	21	modulo collapse edge case
single row	4	row-only DP correctness

Edge Cases

One subtle case is when selecting nothing is the only way to achieve divisibility. For example, if all numbers are such that any selection produces a remainder mismatch, the algorithm still keeps global_dp[0] = 0 valid throughout. Since all transitions require improving the sum, negative-initialized states prevent invalid combinations from overwriting it.

Another case is when a row cannot contribute anything useful under the half-selection constraint. In that situation, the best array remains mostly negative except for best[0] = 0, meaning the row effectively contributes nothing. The global DP remains unchanged, which is the correct behavior because skipping a row is always allowed via selecting zero elements.

A final corner case is when $k = 1$. Every sum is automatically divisible by $k$, and the DP degenerates into maximizing total sum under per-row constraints. The algorithm correctly collapses into picking the best allowed subset per row and summing them, since all remainders map to zero.