regex - 'u' character appears within regular expression results in Python
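I have a few lines of code that extract email addresses from a PDF file:

    import re

    listemail = []
    # `pdf` is the already-opened PDF reader object
    for page in pdf.pages:
        text = page.extractText()  # separate name, so `pdf` is not overwritten
        # print text
        r = re.compile(r'[\w\-][\w\-\.]+@[\w\-][\w\-\.]+[a-zA-Z]{1,4}')
        results = r.findall(text)
        listemail.append(results)
        print(listemail[0:])
    pdf.stream.close()

Unfortunately, after running the code I have noticed that the results are not quite right: a 'u' character appears every time a match is found:

    [[u'testuser1@training.local']]
    [[u'testuser2@training.local']]

Does anyone know how to avoid this character appearing? Thanks in advance.

As others have noted, this is not a bug, it is a feature. The u prefix is how Python 2 displays a Unicode string when it prints the repr of an object (for example, a list); it is not part of the string itself. If you want non-Unicode strings, you can convert the text from Unicode to something more palatable. This Stack Overflow Q/A covers the subject: "Convert a Unicode string to a string in Python (containing extra symbols)".

I've run into this before, and in some use cases it can be problematic, because you encounter issues when a method expects a non-Unicode string and breaks. :)

Example solutions from the link:

    >>> a=u'aaa'
    >>...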
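To illustrate the point that the u is only part of how a list is displayed, here is a minimal sketch. It assumes the page text is already available as plain strings (the `sample_pages` list and the `extract_emails` helper are hypothetical names made up for this example, not part of the original code) and reuses the pattern from the question. It collects the matches into one flat list with `extend` and prints each address on its own, which shows the string contents rather than the repr of a nested list.

```python
import re

# Pattern taken from the question, unchanged apart from restored letter case.
EMAIL_RE = re.compile(r'[\w\-][\w\-\.]+@[\w\-][\w\-\.]+[a-zA-Z]{1,4}')


def extract_emails(text):
    """Return a flat list of e-mail-like strings found in `text`."""
    return EMAIL_RE.findall(text)


# Hypothetical stand-in for the text extracted from each PDF page.
sample_pages = [
    u"Contact: testuser1@training.local",
    u"Contact: testuser2@training.local",
]

emails = []
for page_text in sample_pages:
    # extend() keeps one flat list instead of a list of lists
    emails.extend(extract_emails(page_text))

# Printing each element shows the text itself, with no u prefix,
# because print() on a string does not use repr().
for address in emails:
    print(address)
```

On Python 3 every `str` is already Unicode and the repr carries no u prefix, so the issue in the question only shows up on Python 2.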