Thursday, January 28, 2016

Remove adjacent duplicates in Java

Problem

Given a string, we want to repeatedly remove all adjacent duplicate characters until there are no adjacent duplicate characters. For example, "abbbc" would become "ac" and "acbbcd" would become "ad".
An interesting aspect of this problem is that it does not completely define a function: there are several correct answers to a given string, depending on the order in which the deletion happens. For example, "acbbccb" can be reduced to "acb" or just "a".

Algorithm

The algorithm that feels natural is to go through the initial string character by character, and check if that character is equal to the previous one, in which case both should be omitted. However, this is incorrect if a character appears three times in a row: we need a way to process all consecutive duplicates.
Our final algorithm will therefore go through the string and compare any new character with the last character we appended to our result. If they are equal, we will fast-forward as long as the following characters are identical. Once we run out of identical characters, we can safely delete the last character of our result. If the new character is different from the previous character, we append it to the result.
In terms of data structure, using a StringBuilder (or a Stack) is recommended, as it allows efficient append and removal. StringBuilder has the nice advantage of offering a simple toString method.
The final time complexity is linear in the size of the original string. The memory complexity is also linear and we cannot do better with the required API, since Java String objects are immutable. If our function takes a char[] as argument, it is possible to write an in-place version of this algorithm.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
private static String solve(String str){
  // Variable result will hold our solution.
  StringBuilder result = new StringBuilder();
  int i = 0;
  while (i < str.length()) {
    // Check wheter the character at position i is identical
    // to the last character that we have seen.
    if (result.length() > 0 &&
        str.charAt(i) == result.charAt(result.length() - 1)) {
      // We found a duplicate! Let's skip it.
      i++;
      // Are there more of that same character?
      while (i < str.length() &&
          str.charAt(i) == result.charAt(result.length() - 1)) {
        // We skip them as well.
        i++;
      }
      // There are no more characters identical to the last we
      // have seen. We can now delete it.
      result.deleteCharAt(result.length() - 1);
    } else {
      // This character is either the first, or is
      // different from the previous one: we can add it
      // to the result.
      result.append(str.charAt(i));
      i++;
    }
  }
  return result.toString();
}
This problem can be found on career cup.

1 comment:

  1. I don't think it's accurate
    for input: "mississipie" it returns "mipie" while the expected result is "mpie"

    ReplyDelete