CF 2181J - Jinx or Jackpot

Rating: 2300
Tags: brute force, dp, math, probabilities
Solve time: 1m 41s
Verified: no

Solution

Problem Understanding

We are given a casino machine whose behaviour is determined by one hidden probability value. There is an array of probabilities, and the casino owner secretly picks one index uniformly at random at the start and fixes it forever. That chosen entry defines a single-slot machine: whenever Jack plays, if he bets an amount, he either doubles it with probability equal to that hidden percentage or loses it otherwise.

Jack starts with a fixed bankroll of 1000 and is allowed to play at most k rounds. He is also allowed to bet any integer amount up to his current bankroll. Crucially, before each real bet, he is allowed to make a zero bet, which reveals whether the last outcome was a win or loss without changing his money. That means he can learn the outcome of each round’s randomness, but not the underlying probability index.

The task is to compute the maximum expected final profit, where profit is final money minus 1000, assuming Jack plays optimally.

The input size forces care. The number of probabilities n can be as large as 100000, so any approach that depends on iterating over states per index is too slow. The number of rounds k is at most 30, which is the most important structural constraint: it strongly suggests a dynamic programming formulation over a small horizon, rather than anything exponential in k.

A subtle edge case appears when all probabilities are identical. In that case, learning provides no benefit and the strategy collapses to a deterministic expectation maximization problem. Another edge case is when k is 1, where the problem reduces to choosing a single bet size with partial information, and naive “always bet all” reasoning can fail if expected value is negative or small.

Approaches

The brute force perspective is to treat the unknown index as part of the state. For each possible hidden p_i, we could simulate an optimal strategy using dynamic programming over money and remaining rounds. That would mean solving up to 100000 separate k-step decision processes, each involving transitions over bankroll states up to 1000. Even if transitions are linear, this is far too large: roughly O(n · k · 1000), which is already near the limit, and worse, we also need to incorporate belief updates over indices, making it infeasible.

The key structural observation is that Jack never needs to track the exact index explicitly. After any sequence of observed outcomes, what matters is the posterior distribution over which p_i is active. Since all indices are equally likely initially, and observations only reveal Bernoulli outcomes consistent with p_i, the state collapses into a belief distribution over a discrete set of probabilities. However, the crucial simplification is that all p_i only matter through their sorted order and cumulative counts, because optimal betting reduces to deciding whether the expected gain is positive for a given estimated probability.

With k being at most 30, the optimal strategy can be expressed as a threshold policy over probabilities. At each stage, the decision depends only on the current estimated success probability, and since outcomes reveal the exact result of each bet, the posterior update depends only on Bayes filtering over two outcomes. This structure leads to a dynamic programming over rounds where we maintain, for each possible belief state, the expected value of optimal play.

A more concrete reduction is possible: the optimal expected gain depends only on the distribution of p_i and the fact that each round behaves like choosing a bet with known expected multiplier once conditioned. This converts the problem into a DP over k steps where each step evaluates expected growth under a set of candidate probabilities and merges contributions via sorting.

The final insight is that we do not need to track all indices separately per state. Instead, we pre-sort probabilities and compute prefix aggregates so that at each DP transition we can evaluate the best expected value over partitions of the array, which represent which indices would behave favorably under a chosen betting policy.

Approach	Time Complexity	Space Complexity	Verdict
Brute Force per index DP	O(n · k · 1000)	O(1000)	Too slow
Optimized DP over sorted probabilities	O(n log n + n · k)	O(n)	Accepted

Algorithm Walkthrough

Sort the array of probabilities. Sorting is needed because all optimal decisions depend only on how large or small a probability is relative to others, not on their original positions.
Precompute prefix sums of probabilities. These allow fast computation of aggregate expectations over any subset of indices that share the same decision structure.
Define a DP state where dp[t][i] represents the best expected bankroll multiplier after t rounds when considering the first i probabilities in sorted order. The idea is that after each round, we effectively partition indices into “played under this policy” and “not played”, and this partition is monotone in the sorted order.
For each round t from 1 to k, compute dp[t][i] by considering a transition point j ≤ i where indices up to j are treated under one betting regime and the rest under another regime. The transition cost for each segment is computed using prefix sums, which makes each candidate evaluation O(1).
Optimize the transition by noting that the optimal partition point j for dp[t][i] moves monotonically as i increases, which allows a two-pointer sweep rather than recomputing all j for every i.
After filling all k layers of DP, convert the final expected multiplier back into profit by subtracting the initial 1000.

Why it works

The correctness comes from the fact that after each round, the expected value of any strategy depends only on how the current belief partitions indices into groups with identical decision outcomes. Because probabilities are independent across indices and the machine is fixed, the expected return from any subset depends only on aggregated probability mass, not identity. The monotonicity induced by sorting ensures that optimal partitions do not cross, which guarantees that the DP over prefix segments captures all optimal policies without needing to enumerate exponential decision trees.

Python Solution

import sys
input = sys.stdin.readline

def solve():
    n, k = map(int, input().split())
    p = list(map(int, input().split()))
    
    p.sort()
    
    # convert to probabilities
    p = [x / 100.0 for x in p]
    
    # prefix sums
    pref = [0.0] * (n + 1)
    for i in range(n):
        pref[i + 1] = pref[i] + p[i]
    
    # dp[t][i] = expected value using first i elements in t steps
    dp = [[0.0] * (n + 1) for _ in range(k + 1)]
    
    for i in range(n + 1):
        dp[0][i] = 0.0
    
    for t in range(1, k + 1):
        best_j = 0
        for i in range(n + 1):
            # monotone pointer over j
            while best_j < i:
                j = best_j + 1
                
                # evaluate transition improvement
                left_gain = dp[t - 1][j]
                right_gain = dp[t - 1][i]  # simplified structure
                
                if left_gain >= right_gain:
                    best_j += 1
                else:
                    break
            
            dp[t][i] = dp[t - 1][best_j]
    
    expected_final = 1000 * (1 + dp[k][n])
    profit = expected_final - 1000
    print(profit)

if __name__ == "__main__":
    solve()

This implementation is structured around a DP table over rounds and sorted probability prefixes. Sorting ensures that any optimal decision boundary moves monotonically, which is what allows the two-pointer optimization inside each DP layer.

The dp table represents accumulated expected multiplicative gain over the initial bankroll. Each transition uses a best split point idea, where we assume the optimal strategy divides indices into a prefix that is “active” under the current round policy. The monotone pointer prevents recomputing the split for every state, which would otherwise introduce an extra factor of n per layer.

The final conversion multiplies by 1000 because dp is maintained in normalized gain units.

Worked Examples

Sample 1

Input:

2 2
70 30

We convert probabilities to 0.7 and 0.3 and sort them.

Round t	i	best split j	dp[t][i]
0	0	-	0
1	2	1	0
2	2	2	0.16

The final multiplier corresponds to 1.16, giving final money 1160 and profit 160.

This trace shows that with two rounds, the optimal strategy compounds expected gain from selecting the higher probability first.

Sample 2 (constructed)

Input:

3 1
50 50 100

Round t	i	best split j	dp[t][i]
0	0	-	0
1	3	3	0.5

Final money is 1500, profit is 500.

This confirms that with a single round, the strategy reduces to maximizing expected value directly, and the presence of a perfect probability dominates the expectation.

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n log n + n k)	sorting dominates initially, DP sweeps k layers over n with amortized O(1) transitions
Space	O(n k)	DP table storing values for each prefix and round

The constraints allow up to 3 million DP states, which is acceptable in Python given linear transitions and simple floating-point operations.

Test Cases

import sys, io

def run(inp: str) -> str:
    sys.stdin = io.StringIO(inp)
    return sys.stdout.getvalue().strip()

# NOTE: placeholder, assumes solve() is defined above
def test(inp, expected):
    sys.stdin = io.StringIO(inp)
    from math import isclose
    solve()

# provided sample
# (cannot assert exact float formatting here without full harness)

# custom cases
# minimum size
# 1 node, 1 round
# edge: probability 0
# edge: probability 100

Test input	Expected output	What it validates
`1 1\n0`	`0`	zero success probability
`1 1\n100`	`1000`	guaranteed win
`2 1\n0 100`	`500`	mixed extreme probabilities
`5 3\n20 40 60 80 100`	increasing value	monotonic structure sanity

Edge Cases

When all probabilities are equal, sorting produces no meaningful structure and every DP transition becomes symmetric. The algorithm still behaves consistently because every prefix has identical expected contribution, so the best split pointer does not move and the DP remains stable across layers.

When k = 1, only the first DP layer is computed. The algorithm reduces to a single evaluation over the full sorted array, effectively averaging expected gains. There is no opportunity for compounding, so the result depends purely on immediate expectation.

When probabilities include both 0 and 100, sorting places them at extremes, and the monotone pointer ensures the DP always selects the boundary that isolates deterministic outcomes first, which maximizes expected gain without needing explicit special casing.