There comes a time for a developer to face a bug so weird that it leaves him/her speechless and scramble to find an answer wherever possible.
After seemingly innocent deployment to production, I noticed huge spike in errors like the following:
ActionView::Template::Error: "\xEB" from ASCII-8BIT to UTF-8
Offending line was JSON encoding of geo location information provided in request header by our cache provider.
happens to be
character in ASCII, so it was probably some German city name.
Our site’s character type is UTF-8, and Ruby 2.0’s default encoding is UTF-8. I also verified that
and
are both UTF-8. It was our cache provider inserting non-UTF-8 character in request header.
No problem. I will just force UTF-8 encoding on the string.
Then, I was faced with another kind of problems.
ActionView::Template::Error: partial character in source, but hit end
Something is off.
Off to the best debugging method in Ruby – print the string encoding. And it’s ISO-8859-1 (or Latin-1).
So, the final solution was to force ISO-8859-1 encoding and encode it again in UTF-8.
string.to_s.force_encoding("ISO-8859-1").encode("UTF-8")
However, I am not 100% happy with the solution. It just smells. It works for now, but I need to look for better solution when I get a chance.
For bonus, check this out – [The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)][1]