Under Python 3, jieba segmentation does not run into UnicodeEncodeError, because the cut function calls strdecode to normalize the input encoding before segmenting; the Python 2 code path did no such handling.
def cut(self, sentence, cut_all=False, HMM=True):
    '''
    The main function that segments an entire sentence that contains
    Chinese characters into separated words.

    Parameter:
        - sentence: The str(unicode) to be segmented.
        - cut_all: Model type. True for full pattern, False for accurate pattern.
        - HMM: Whether to use the Hidden Markov Model.
    '''
    sentence = strdecode(sentence)
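A minimal sketch of what strdecode does (based on jieba/_compat.py; the exact code may differ between versions): bytes input is decoded to str, trying UTF-8 first and falling back to GBK with undecodable bytes ignored, while str input passes through unchanged.

```python
def strdecode(sentence):
    # Sketch of jieba's strdecode helper: ensure the input is unicode.
    if not isinstance(sentence, str):  # in Python 3, str is already unicode
        try:
            sentence = sentence.decode('utf-8')
        except UnicodeDecodeError:
            # Fall back to GBK, dropping bytes that still fail to decode.
            sentence = sentence.decode('gbk', 'ignore')
    return sentence

print(strdecode(b'\xe4\xb8\xad\xe6\x96\x87'))  # UTF-8 bytes decode to '中文'
print(strdecode('中文'))                        # str passes through unchanged
```

Because cut runs every sentence through this helper first, the rest of the segmentation pipeline only ever sees unicode strings, which is why the encoding errors common under Python 2 do not appear.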