CF 1914E1 - Game with Marbles (Easy Version)

Each color is an independent pile of marbles owned by both players. A move chooses a color that still exists on both sides. Suppose color i is chosen by Alice. Alice throws away exactly one of her marbles of that color, while Bob loses all bi marbles of that color.

CF 1914E1 - Game with Marbles (Easy Version)

Rating: 1400
Tags: brute force, games, greedy, sortings
Solve time: 2m 34s
Verified: yes

Solution

Problem Understanding

Each color is an independent pile of marbles owned by both players. A move chooses a color that still exists on both sides.

Suppose color i is chosen by Alice. Alice throws away exactly one of her marbles of that color, while Bob loses all b_i marbles of that color. After that move, Bob has zero marbles of color i, so that color can never be played again.

Similarly, if Bob chooses color i, Bob loses one marble and Alice loses all a_i marbles of that color.

Because any chosen color immediately becomes unavailable forever, every color is played at most once. The game is really about deciding who gets ownership of each color's final resolution.

The final score is

$$\text{Alice's remaining marbles} - \text{Bob's remaining marbles}.$$

Alice wants this value as large as possible, Bob wants it as small as possible.

The easy version has n ≤ 6. That is extremely small. Even trying every possible move sequence is feasible because the number of colors is tiny. The number of game states grows roughly like n!, which for n = 6 is only 720. With at most 1000 test cases, a complete minimax search is still comfortably fast.

The most common mistake is to think that colors are independent and can be optimized separately. They are not, because turn order determines who gets to claim a color.

Consider:

n = 2
a = [100, 1]
b = [1, 100]

If Alice gets color 1 and Bob gets color 2, the outcome is very different from the reverse assignment. The competition for turns is the entire game.

Another easy mistake is evaluating only the marbles removed during play instead of the final marbles remaining.

For example:

n = 1
a = [5]
b = [7]

If Alice chooses the only color, she loses 1 marble and Bob loses all 7. The final state is:

Alice: 4
Bob: 0

The score is 4, not 7 - 1 = 6. The game is defined by remaining marbles.

A third subtle point is that a color chosen by either player disappears forever.

Example:

a = [10]
b = [10]

After Alice plays the color, Bob cannot later play the same color. Any solution that allows multiple uses of one color is modeling a different game.

Approaches

Because n is at most six, the most direct approach is minimax.

At any state we know which colors are still available. If it is Alice's turn, she tries every playable color and chooses the move producing the largest eventual score. If it is Bob's turn, he chooses the smallest eventual score.

The challenge is evaluating a terminal state efficiently.

A useful observation is that each color contributes independently once we know who claimed it.

Suppose color i is never played. Then its contribution to the final score is

$$a_i - b_i.$$

Suppose Alice plays color i.

Alice keeps a_i-1 marbles and Bob keeps 0, so the contribution becomes

$$(a_i-1)-0 = a_i-1.$$

Compared with the untouched contribution a_i-b_i, the score changes by

$$(a_i-1)-(a_i-b_i)=b_i-1.$$

Similarly, if Bob plays color i, the contribution becomes

$$0-(b_i-1)=-(b_i-1),$$

which differs from the untouched contribution by

$$-(b_i-1)-(a_i-b_i)=-(a_i-1).$$

This lets us express the game entirely as accumulating gains and losses from chosen colors.

For the easy version, however, we do not actually need a sophisticated greedy argument. The state space is tiny, so a memoized minimax over subsets is enough and is very easy to prove correct.

The brute-force works because every move permanently removes exactly one color. After at most six moves the game ends. The total number of states is only

$$2^6 \times 2 = 128.$$

Memoization makes the search essentially instantaneous.

Approach Time Complexity Space Complexity Verdict
Brute Force Minimax without memoization O(n!) O(n) Accepted for easy version, but redundant work
Memoized Minimax on subsets O(n·2^n) O(2^n) Accepted

Algorithm Walkthrough

Let base be the score if no color is ever played:

$$base = \sum (a_i-b_i).$$

We will compute only the additional change caused by the played colors.

For a color i:

$$gainAlice_i = b_i - 1$$

because Alice choosing it increases the score by that amount relative to base.

$$gainBob_i = -(a_i - 1)$$

because Bob choosing it decreases the score by that amount relative to base.

We define a DP state by a bitmask of already-used colors and whose turn it is.

1. Compute the base score

Add up all a_i - b_i.

This is the score before accounting for any played colors.

2. Represent the game state by a bitmask

Bit i is set if color i has already been chosen by someone.

Every move sets exactly one previously unset bit.

3. Define the minimax function

Let

dp(mask, turn)

be the optimal score adjustment from the current state onward.

turn = 0 means Alice moves next.

turn = 1 means Bob moves next.

4. Handle terminal states

If every color is already used, no further adjustment is possible.

Return 0.

5. Alice's transition

Alice tries every unused color i.

Choosing color i immediately contributes

$$b_i-1.$$

After that the game continues from the new mask and Bob's turn.

Alice chooses the maximum value among all options.

6. Bob's transition

Bob tries every unused color i.

Choosing color i immediately contributes

$$-(a_i-1).$$

The game then continues from the new mask and Alice's turn.

Bob chooses the minimum value among all options.

7. Combine with the base score

The answer is

$$base + dp(0,\text{Alice turn}).$$

Why it works

The key property is that every color can be chosen at most once. Once a player selects a color, that color disappears permanently from the game. Because of this, the future state depends only on which colors have already been used and whose turn it is.

The contribution of a move is completely determined at the moment that color is chosen. Nothing that happens later can change that color again. The minimax recurrence exactly matches the rules of optimal play: Alice maximizes the final score, Bob minimizes it. Since every legal continuation is explored and memoized, the computed value equals the true game value.

Python Solution

import sys
from functools import lru_cache

input = sys.stdin.readline

def solve():
    t = int(input())
    ans = []

    for _ in range(t):
        n = int(input())
        a = list(map(int, input().split()))
        b = list(map(int, input().split()))

        base = sum(ai - bi for ai, bi in zip(a, b))
        full = (1 << n) - 1

        @lru_cache(None)
        def dp(mask, turn):
            if mask == full:
                return 0

            if turn == 0:  # Alice
                best = -10**30
                for i in range(n):
                    if not (mask & (1 << i)):
                        best = max(
                            best,
                            (b[i] - 1) + dp(mask | (1 << i), 1)
                        )
                return best

            else:  # Bob
                best = 10**30
                for i in range(n):
                    if not (mask & (1 << i)):
                        best = min(
                            best,
                            -(a[i] - 1) + dp(mask | (1 << i), 0)
                        )
                return best

        ans.append(str(base + dp(0, 0)))

    sys.stdout.write("\n".join(ans))

if __name__ == "__main__":
    solve()

The solution starts by computing the untouched score base. Every move then becomes a simple adjustment relative to that baseline.

The DP state contains only the used-color mask and the player to move. No marble counts need to be stored because a color's entire effect is determined once it is selected.

Alice's transition adds b[i] - 1. Bob's transition adds -(a[i] - 1). These values come directly from comparing the score after a color is played with the score if that color had remained untouched.

The recursion depth never exceeds n, which is at most six, so recursion is completely safe.

Worked Examples

Example 1

n = 3
a = [4, 2, 1]
b = [1, 2, 4]

Base score:

$$(4-1)+(2-2)+(1-4)=0$$

Adjustment values:

Color Alice gain Bob gain
1 0 -3
2 1 -1
3 3 0

Optimal play:

Turn Player Color Immediate Change Running Adjustment
1 Alice 1 0 0
2 Bob 3 0 0
3 Alice 2 +1 1

Final score:

base + adjustment = 0 + 1 = 1

This example shows that a move with zero immediate gain can still be optimal because it prevents the opponent from claiming a more valuable color.

Example 2

n = 4
a = [1, 20, 1, 20]
b = [100, 15, 10, 20]

Base score:

$$(1-100)+(20-15)+(1-10)+(20-20)=-103$$

Adjustment values:

Color Alice gain Bob gain
1 99 0
2 14 -19
3 9 0
4 19 -19

One optimal sequence is:

Turn Player Color Immediate Change Running Adjustment
1 Alice 1 99 99
2 Bob 4 -19 80
3 Alice 2 14 94
4 Bob 3 0 94

Final score:

-103 + 94 = -9

This trace demonstrates how both players are competing for the same pool of colors rather than maximizing local gains independently.

Complexity Analysis

Measure Complexity Explanation
Time O(n · 2^n) Each mask-turn state tries all unused colors
Space O(2^n) Memoization stores one value per state

With n ≤ 6, there are at most 2 × 2^6 = 128 states. Even across 1000 test cases the running time is tiny compared to the limits.

Test Cases

# helper: run solution on input string, return output string
import sys
import io
from functools import lru_cache

def run(inp: str) -> str:
    sys.stdin = io.StringIO(inp)

    input = sys.stdin.readline
    t = int(input())
    out = []

    for _ in range(t):
        n = int(input())
        a = list(map(int, input().split()))
        b = list(map(int, input().split()))

        base = sum(x - y for x, y in zip(a, b))
        full = (1 << n) - 1

        @lru_cache(None)
        def dp(mask, turn):
            if mask == full:
                return 0

            if turn == 0:
                best = -10**30
                for i in range(n):
                    if not (mask & (1 << i)):
                        best = max(
                            best,
                            (b[i] - 1) + dp(mask | (1 << i), 1)
                        )
                return best

            best = 10**30
            for i in range(n):
                if not (mask & (1 << i)):
                    best = min(
                        best,
                        -(a[i] - 1) + dp(mask | (1 << i), 0)
                    )
            return best

        out.append(str(base + dp(0, 0)))

    return "\n".join(out)

# provided sample
assert run(
"""5
3
4 2 1
1 2 4
4
1 20 1 20
100 15 10 20
5
1000000000 1000000000 1000000000 1000000000 1000000000
1 1 1 1 1
3
5 6 5
2 1 7
6
3 2 4 2 5 5
9 4 7 9 2 5
"""
) == """1
-9
2999999997
8
-6"""

# minimum n
assert run(
"""1
2
1 1
1 1
"""
) == "0"

# symmetric large values
assert run(
"""1
2
1000000000 1000000000
1000000000 1000000000
"""
) == "0"

# Alice strongly prefers first color
assert run(
"""1
2
100 1
1 100
"""
) == "198"

# all equal values
assert run(
"""1
4
5 5 5 5
5 5 5 5
"""
) == "0"
Test input Expected output What it validates
n=2, all ones 0 Smallest valid size
All values 10^9 0 Large integer handling
[100,1] vs [1,100] 198 Turn competition over valuable colors
All values equal 0 Symmetric positions

Edge Cases

Consider the symmetric position:

1
2
5 5
5 5

Every color has Alice gain 4 and Bob gain -4. Alice takes one color, Bob takes the other. The adjustments cancel exactly, giving final score 0. The DP explores both possibilities and reaches the same result.

Now consider:

1
2
100 1
1 100

The untouched score is zero. Alice gain values are [0, 99], while Bob gain values are [-99, 0]. Alice immediately claims the second color worth 99. Bob is forced to take the remaining color for -99. The answer becomes 198. Any greedy rule based only on current marble totals can easily miss this interaction, but minimax handles it naturally.

Finally, consider a color where one side has an enormous number of marbles:

1
2
1000000000 1
1 1

The score adjustments involve values close to 10^9. Python integers have arbitrary precision, so no overflow occurs. The DP stores and combines these values exactly, producing the correct optimal score.