Python Beginner Exercise 6: Remove Special Characters in a String

Table of Contents
  • Also posted on DEV

    Question

    Given:

str1 = "/*Jon is @developer & musician"

Expected Output:

"Jon is developer musician"

My attempt

  • The hints say to use translate() and maketrans(), so I looked them up and tried them out.
  • My first attempt builds a translation table that replaces each special character with a space:
    str_1 = "/*Jon is @developer & musician"
    x = "/*@&"   # characters to replace
    y = "    "   # one space for each character in x
    my_table = str.maketrans(x, y)
    print(str_1.translate(my_table))
  • The output is:
      Jon is  developer   musician
  • This is close, but each replaced character leaves a space behind, so the result has two leading spaces and extra spaces between the words. It does not match the expected output.
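
One possible way to tidy up this first attempt is to collapse the leftover runs of spaces afterwards. The sketch below is only an illustration, not part of the exercise hints; the names spaced and cleaned are my own.

```python
str_1 = "/*Jon is @developer & musician"

# Same table as the first attempt: each special character becomes a space.
table = str.maketrans("/*@&", "    ")
spaced = str_1.translate(table)

# split() with no arguments splits on runs of whitespace and drops
# leading/trailing whitespace; join() puts single spaces back.
cleaned = " ".join(spaced.split())

print(repr(spaced))   # repr() makes the extra spaces visible
print(repr(cleaned))
```

The split-and-join step works on any whitespace, so it would also collapse tabs or newlines if the input had them.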

Syntax of translate() and maketrans() method

The maketrans() method returns a mapping table for the translate() method to use.

translate() method syntax

str.translate(mapping table)

maketrans() syntax

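str.maketrans(x, y, z)
  • x is required; y and z are optional

  • One parameter

    str.maketrans(dictionary)
  • The dictionary maps single characters (or their Unicode ordinals) to a replacement string, an ordinal, or None (None deletes the character)
  • Two parameters

    str.maketrans(same length string, same length string)
  • Each character in the first string is mapped to the character at the same position in the second string
  • Three parameters

    str.maketrans(same length string, same length string, another string)
  • The third string can be any length; every character in it is removed from the result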
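
To make the three call forms concrete, here is a small sketch of each one. The sample words and the variable names (table_1, result_1, and so on) are made up for illustration.

```python
# One argument: a dict mapping characters to replacements (None deletes).
table_1 = str.maketrans({"a": "4", "e": None})
result_1 = "banane".translate(table_1)

# Two arguments: two strings of equal length, mapped position by position.
table_2 = str.maketrans("ab", "xy")
result_2 = "abc".translate(table_2)

# Three arguments: same as above, plus a string of characters to delete.
table_3 = str.maketrans("a", "x", "!?")
result_3 = "a!b?".translate(table_3)

print(result_1)
print(result_2)
print(result_3)
```

Passing two empty strings as the first two arguments, as Solution 1 does, means nothing is replaced and only the deletion part of the table is used.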

Recommended solutions

Solution 1: Use translate() and maketrans() method

import string

str_1 = "/*Jon is @developer & musician"
new_str = str_1.translate(str_1.maketrans('', '', string.punctuation))

print(new_str)
  • This solution passes string.punctuation as the third parameter of maketrans(), so translate() removes every ASCII punctuation character from the original string
  • string.punctuation is a constant (not a method) defined in the string module, so you must import string before using it
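
To see what Solution 1 actually strips, a quick sketch like the one below can help. The list name specials is my own and is not part of the exercise.

```python
import string

str_1 = "/*Jon is @developer & musician"

# string.punctuation is a plain str constant of ASCII punctuation characters.
print(string.punctuation)

# Collect the characters of str_1 that appear in that constant.
specials = [ch for ch in str_1 if ch in string.punctuation]
print(specials)

new_str = str_1.translate(str.maketrans("", "", string.punctuation))
# repr() makes the remaining spaces visible.
print(repr(new_str))
```

Note that deleting "&" keeps the spaces on both sides of it, so two spaces remain between "developer" and "musician" in the result.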

Solution 2: Use the re.sub() method

import re

str_1 = "/*Jon is @developer & musician"

# replace special symbols with ''
new_str = re.sub(r'[^\w\s]', '', str_1)
print(new_str)
  • The first step is to import the re module.
  • Second, build a new string with re.sub(), replacing every special character in str_1 with an empty string
    r'[^\w\s]'
  • The expression above matches any character that is neither a word character nor whitespace
  • "\w" matches Unicode word characters; this includes most characters that can be part of a word in any language, as well as digits and the underscore
  • "\s" matches Unicode whitespace characters
  • The r prefix marks a raw string: Python does not process backslash escapes in it, so \w and \s reach the regex engine unchanged
  • [] indicates a set of characters
  • If ^ is the first character inside [], the set is negated: any character that is not in the set is matched
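
Because \w includes the underscore, the two solutions do not behave identically on every input. The sketch below compares them on a made-up sample string; the sample text and variable names are assumptions for illustration only.

```python
import re
import string

# Hypothetical sample text, chosen to include an underscore.
sample = "snake_case & kebab-case!"

# \w includes "_", so the negated set [^\w\s] does not match it.
via_regex = re.sub(r"[^\w\s]", "", sample)

# string.punctuation includes "_", so translate() deletes it.
via_translate = sample.translate(str.maketrans("", "", string.punctuation))

# Adding "_" as an alternative makes the regex remove underscores too.
via_regex_strict = re.sub(r"[^\w\s]|_", "", sample)

print(repr(via_regex))
print(repr(via_translate))
print(repr(via_regex_strict))

# Raw strings keep the backslash: r"\n" is two characters, "\n" is one.
print(len(r"\n"), len("\n"))
```

For the exercise string the difference does not matter, since it contains no underscore.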

My reflection

So I learned the maketrans() and translate() methods as well as regular expressions. Using the re module seems easier to code, but regular expression syntax is not that straightforward to read.

Credit
