This error indicates that your Python installation can handle only 7-bit ASCII strings. There are a couple ways to fix or work around the problem.
If your programs must handle data in arbitrary character set encodings, the environment the application runs in will generally identify the encoding of the data it is handing you. You need to convert the input to Unicode data using that encoding. For example, a program that handles email or web input will typically find character set encoding information in Content-Type headers. This can then be used to properly convert input data to Unicode. Assuming the string referred to by value is encoded as UTF-8:
value = unicode(value, "utf-8")
will return a Unicode object. If the data is not correctly encoded as UTF-8, the above call will raise a UnicodeError exception.
If you only want strings converted to Unicode which have non-ASCII data, you can try converting them first assuming an ASCII encoding, and then generate Unicode objects if that fails:
try:
x = unicode(value, "ascii")
except UnicodeError:
value = unicode(value, "utf-8")
else:
# value was valid ASCII data
pass
It's possible to set a default encoding in a file called sitecustomize.py that's part of the Python library. However, this isn't recommended because changing the Python-wide default encoding may cause third-party extension modules to fail.
Note that on Windows, there is an encoding known as "mbcs", which uses an encoding specific to your current locale. In many cases, and particularly when working with COM, this may be an appropriate default encoding to use.
CATEGORY: programming