CF 4C - Registration System
We are building a username registration system. Every incoming request contains a desired username. If that username has never appeared before, registration succeeds immediately and we print OK.
Rating: 1300
Tags: data structures, hashing, implementation
Solve time: 1m 42s
Verified: yes
Solution
Problem Understanding
We are building a username registration system. Every incoming request contains a desired username. If that username has never appeared before, registration succeeds immediately and we print OK.
If the username is already taken, we must generate a new one by attaching the smallest positive integer suffix that produces a name not yet used. For example, if alex already exists, we try alex1, then alex2, and so on until we find a free name. The chosen generated name also becomes occupied.
The input gives a sequence of registration requests, one per line. The output must contain the system's response for each request in the same order.
The constraint of up to $10^5$ requests changes the problem completely. A solution that scans previously used names every time would become far too slow. With $10^5$ operations, we should aim for roughly linear time overall, or close to it. Hash tables are a natural fit because we repeatedly ask the same question: "Has this username already been used?"
The tricky part is that generated names themselves also become occupied. A careless implementation may only track original names and forget generated ones.
Consider this input:
4
bob
bob
bob1
bob
The correct output is:
OK
bob1
OK
bob2
After the second request, bob1 is already occupied because the system generated it. When the third request explicitly asks for bob1, it must already exist.
Another subtle case appears when many suffixes are already occupied:
5
a
a1
a2
a
a
The correct output is:
OK
OK
OK
a3
a4
A naive implementation that always starts checking from 1 would repeatedly test a1, a2, and so on. That still produces correct answers, but it wastes a large amount of time.
One more easy mistake is forgetting that the original name itself also needs tracking:
3
test
test
test
The correct output is:
OK
test1
test2
If we only track generated suffixes, the third request could incorrectly produce test1 again.
Approaches
The brute-force idea follows the problem statement directly. Maintain a set of used usernames. When a new request arrives, check whether it already exists.
If it does not exist, insert it and print OK.
If it already exists, start trying suffixes one by one:
name1
name2
name3
...
The first unused candidate becomes the answer.
This method is correct because it literally simulates the required process. The problem is performance. Suppose the input contains a repeated $10^5$ times. The first duplicate checks a1, the next checks a1 and a2, the next checks a1, a2, a3, and so on. The total number of checks becomes:
$$1 + 2 + 3 + \dots + (n-1)$$
which is $O(n^2)$. With $10^5$ requests, that is around $5 \times 10^9$ operations, far beyond the limit.
The key observation is that once we already know the next available suffix for a base name, we never need to retry smaller suffixes again.
For example, after generating alex7, we already know that alex1 through alex7 are occupied. The next duplicate request for alex should begin directly from suffix 8.
This transforms the problem into maintaining two pieces of information inside a hash map:
- Whether a username already exists.
- For every base name, what suffix should be tried next.
Hash tables give average $O(1)$ insertion and lookup, so each request can be processed in constant expected time.
| Approach | Time Complexity | Space Complexity | Verdict |
|---|---|---|---|
| Brute Force | $O(n^2)$ | $O(n)$ | Too slow |
| Optimal | $O(n)$ average | $O(n)$ | Accepted |
Algorithm Walkthrough
- Create a hash map called
mp.
The map stores usernames as keys. For a base name, its value represents the next suffix to try. 2. Process each incoming username one by one.
We must answer requests in order because every accepted username immediately affects future requests.
3. If the username does not exist in the map, print OK.
This means the name is completely unused, so registration succeeds immediately.
4. Insert this username into the map with value 1.
The value 1 means that if this name appears again later, the first suffix to try should be 1.
5. If the username already exists, read its stored counter.
Suppose mp[name] = k. This means suffixes smaller than k are already occupied, so there is no reason to test them again.
6. Construct the candidate name + str(k).
This is the smallest suffix that has not yet been tried for this base name. 7. Print the generated candidate and insert it into the map.
The generated name now becomes occupied just like any explicitly requested username.
8. Increase mp[name] by one.
The next duplicate request for the same base name should continue from the next suffix.
Why it works
The invariant is:
For every base username x, mp[x] always stores the smallest suffix
that has not yet been assigned for x.
Initially, after inserting x for the first time, the smallest unused suffix is 1, so mp[x] = 1.
Whenever we generate xk, we immediately increment the counter to k + 1. Since all smaller suffixes were already occupied before, and xk has just become occupied now, the next unused suffix is exactly k + 1.
Because the invariant always holds, every generated username is both valid and minimal.
Python Solution
import sys
input = sys.stdin.readline
def solve():
n = int(input())
mp = {}
for _ in range(n):
name = input().strip()
if name not in mp:
print("OK")
mp[name] = 1
else:
suffix = mp[name]
new_name = name + str(suffix)
print(new_name)
mp[name] += 1
mp[new_name] = 1
solve()
The dictionary mp plays two roles at the same time. Its keys represent all occupied usernames, while values for base usernames track the next suffix to use.
When a name appears for the first time, we print OK and initialize its counter to 1. This matches the invariant that the next duplicate should try suffix 1.
For duplicate requests, we directly read the stored suffix instead of scanning from 1 again. That single optimization removes the quadratic behavior.
The line:
mp[new_name] = 1
is extremely important. Generated usernames must also become reserved immediately. Without this line, future requests could accidentally reuse them.
Another subtle detail is updating mp[name] after generating a username. If we forget to increment it, the same suffix would be reused repeatedly.
The implementation uses input().strip() because each username comes on its own line and we do not want trailing newline characters inside the stored strings.
Worked Examples
Example 1
Input:
4
abacaba
acaba
abacaba
acab
| Step | Request | Map Before | Output | Map After |
|---|---|---|---|---|
| 1 | abacaba | {} | OK | {abacaba: 1} |
| 2 | acaba | {abacaba: 1} | OK | {abacaba: 1, acaba: 1} |
| 3 | abacaba | {abacaba: 1, acaba: 1} | abacaba1 | {abacaba: 2, acaba: 1, abacaba1: 1} |
| 4 | acab | {abacaba: 2, acaba: 1, abacaba1: 1} | OK | {abacaba: 2, acaba: 1, abacaba1: 1, acab: 1} |
This trace shows how the counter for abacaba advances from 1 to 2 after generating abacaba1. The algorithm never checks suffix 1 again for that base name.
Example 2
Input:
5
a
a
a
a1
a
| Step | Request | Map Before | Output | Map After |
|---|---|---|---|---|
| 1 | a | {} | OK | {a: 1} |
| 2 | a | {a: 1} | a1 | {a: 2, a1: 1} |
| 3 | a | {a: 2, a1: 1} | a2 | {a: 3, a1: 1, a2: 1} |
| 4 | a1 | {a: 3, a1: 1, a2: 1} | a11 | {a: 3, a1: 2, a2: 1, a11: 1} |
| 5 | a | {a: 3, a1: 2, a2: 1, a11: 1} | a3 | {a: 4, a1: 2, a2: 1, a11: 1, a3: 1} |
This example demonstrates that generated usernames behave exactly like normal usernames. Once a1 exists, requesting a1 again must generate a11.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | $O(n)$ average | Each request performs constant-time hash map operations |
| Space | $O(n)$ | Every accepted username is stored once |
With at most $10^5$ requests, linear complexity easily fits within the limits. Python dictionaries are optimized hash tables, so insertions and lookups are fast enough for this constraint.
Test Cases
# helper: run solution on input string, return output string
import sys
import io
def run(inp: str) -> str:
sys.stdin = io.StringIO(inp)
input = sys.stdin.readline
def solve():
n = int(input())
mp = {}
out = []
for _ in range(n):
name = input().strip()
if name not in mp:
out.append("OK")
mp[name] = 1
else:
suffix = mp[name]
new_name = name + str(suffix)
out.append(new_name)
mp[name] += 1
mp[new_name] = 1
return "\n".join(out)
return solve()
# provided sample
assert run(
"4\nabacaba\nacaba\nabacaba\nacab\n"
) == (
"OK\nOK\nabacaba1\nOK"
), "sample 1"
# minimum size
assert run(
"1\nx\n"
) == (
"OK"
), "single registration"
# all equal values
assert run(
"5\na\na\na\na\na\n"
) == (
"OK\na1\na2\na3\na4"
), "repeated identical names"
# generated names later requested explicitly
assert run(
"4\nbob\nbob\nbob1\nbob\n"
) == (
"OK\nbob1\nbob11\nbob2"
), "generated usernames must also be reserved"
# boundary style chaining
assert run(
"6\na\na1\na\na1\na2\na\n"
) == (
"OK\nOK\na2\na11\na21\na3"
), "nested suffix interactions"
# larger repeated sequence
large_input = "10\n" + "\n".join(["z"] * 10) + "\n"
assert run(large_input) == (
"OK\nz1\nz2\nz3\nz4\nz5\nz6\nz7\nz8\nz9"
), "many duplicates"
| Test input | Expected output | What it validates |
|---|---|---|
| Single username | OK |
Minimum input size |
| Many identical names | Sequential suffixes | Counter increments correctly |
bob, bob, bob1, bob |
bob1 must become occupied |
Generated names are reserved |
| Mixed suffix chains | Correct nested generation | Handles names that already contain digits |
Ten repeated z values |
z1 through z9 |
Large duplicate sequences remain efficient |
Edge Cases
Consider the case where a generated username is later requested explicitly:
4
bob
bob
bob1
bob
Step by step:
bobis new, printOK, storebob -> 1.bobexists, generatebob1, print it, updatebob -> 2, storebob1 -> 1.bob1already exists because it was generated earlier, so we must generatebob11.bobnow starts directly from suffix2, generatingbob2.
The output becomes:
OK
bob1
bob11
bob2
This confirms that generated usernames are treated exactly like normal ones.
Now consider repeated duplicates:
5
a
a
a
a
a
Execution:
- First
aprintsOK. - Second
aprintsa1. - Third
aprintsa2. - Fourth
aprintsa3. - Fifth
aprintsa4.
The algorithm never rechecks old suffixes because mp["a"] always stores the next unused suffix directly.
Finally, consider names that already contain digits:
5
a1
a1
a11
a1
a
Execution:
a1printsOK.- Duplicate
a1generatesa11. a11already exists, so it generatesa111.- Another
a1generatesa12. ais still unused, so it printsOK.
The output is:
OK
a11
a111
a12
OK
The algorithm never tries to parse digits from usernames. Every string is treated independently, which avoids many corner-case bugs.