No matter how you’ve defined your tables (as containing only UTF-8 data, for example), MySQLdb, the module that handles talking to a MySQL database from Python, simply refuses to return your data as proper UTF-8. It took me nearly an hour (and most of my hair) to figure out how to force it to do so anyway, and I’d like to spare others the same fate, so here’s what I learned:
Make the connection:
import MySQLdb
connection = MySQLdb.connect(host=dbhost, user=dbuser, passwd=dbpass, db=dbname)
And before you do anything else, run a query that tells MySQL to send its results as UTF-8 for the rest of this connection:
cursor = connection.cursor(MySQLdb.cursors.DictCursor) # or some other cursor type
cursor.execute('SET character_set_results="utf8"')
Once you’ve done this, all your subsequent queries will return your data in proper UTF-8 encoding.
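If you open connections in more than one place, the two steps above can be wrapped in a small helper. This is only a sketch: the function name force_utf8_results and its charset parameter are made up for illustration, and it assumes nothing about the connection beyond the standard cursor(), execute() and close() calls.

```python
def force_utf8_results(connection, charset="utf8"):
    """Ask the MySQL server to send result sets in `charset` for this session.

    `connection` is any DB-API style connection, for example the object
    returned by MySQLdb.connect(...). The same connection is returned so the
    call can be chained.
    """
    # The charset name is interpolated into the statement text, so only
    # plain names (letters, digits, underscores) are accepted.
    if not charset.replace("_", "").isalnum():
        raise ValueError("unexpected charset name: %r" % (charset,))

    cursor = connection.cursor()
    try:
        cursor.execute("SET character_set_results = '%s'" % charset)
    finally:
        # Close the throwaway cursor even if the statement fails.
        cursor.close()
    return connection


# Hypothetical usage, with dbhost, dbuser, dbpass and dbname as above:
#   import MySQLdb
#   connection = force_utf8_results(
#       MySQLdb.connect(host=dbhost, user=dbuser, passwd=dbpass, db=dbname))
```

MySQLdb.connect also accepts a charset keyword argument, which may make the separate query unnecessary in some setups; treat that as something to try rather than a guaranteed fix.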
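The fact that you have to run a query suggests that it’s not Python or MySQLdb that’s to blame, by the way: character_set_results is a MySQL session variable, and the server converts query results to that character set before sending them. Not that I particularly care, of course. I just want it to work.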