Encoding issues: cucumber
My standalone test runner uses Ruby 1.9.2, with Cucumber driving FireWatir and RSpec matchers for the assertions. One of its tests looks like this:
Scenario: Single logout flash
Given I am logged in with "username"/"password"
When I log out
Then I should see "Du er nå logget ut"
That last bit is in Norwegian. The project this code tests has support for two Norwegian written languages, and English (sort of). Actually, let’s not go there.
The browser is set up like this:
Watir::Browser.default = "firefox"
def browser
@browser ||= Watir::Browser.new
end
The test should be passing, but I get the following error on the line with the Norwegian text in it.
incompatible character encodings: ASCII-8BIT and UTF-8 (Encoding::CompatibilityError)
The step definition for the failing step is:
Then /^I should see "([^"]*)"$/ do |term|
browser.text.should include(term)
end
ASCII-8BIT isn’t really an encoding, as far as I can tell. It’s unencoded bytes (and is therefore aliased to BINARY).
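To convince myself of this outside the browser, here is a minimal sketch that reproduces the same error with two plain strings. The helper name include_or_error is made up for illustration, and the binary string is only a stand-in for whatever the browser hands back. The \u escape keeps the example independent of the source file's own encoding.

```ruby
# Hypothetical helper: returns true/false from include?, or the exception
# class if Ruby refuses to compare the two strings at all.
def include_or_error(haystack, needle)
  haystack.include?(needle)
rescue Encoding::CompatibilityError => e
  e.class
end

# ASCII-8BIT and BINARY are two names for the same Encoding object.
same_encoding = Encoding.find("ASCII-8BIT") == Encoding::BINARY

term = "Du er n\u00E5 logget ut"                  # UTF-8, contains the letter å
page = term.dup.force_encoding(Encoding::BINARY)  # same bytes, tagged as binary

puts same_encoding
puts include_or_error(page, term).inspect
puts include_or_error(page.dup.force_encoding(Encoding::UTF_8), term).inspect
```

The second call is the interesting one: both strings hold identical bytes, but because each contains a non-ASCII character and they carry different encoding tags, Ruby raises Encoding::CompatibilityError instead of comparing them. Strings that contain only ASCII characters stay compatible, which would explain why plain English text doesn't trigger the error.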
There are a few places this can blow up, I think.
1. The file itself could be encoded wrong.
2. The regex could be having trouble with UTF-8, so the captured term could come back unencoded.
3. The browser.text could be unencoded.
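One way to narrow down the three suspects is to ask each string what it thinks it is. The sketch below is only an illustration: encoding_report is a hypothetical helper, and page_text is a stand-in built by hand, since the real value would come from browser.text.

```ruby
# Hypothetical helper: collects what Ruby knows about a string's encoding.
def encoding_report(label, string)
  {
    :label    => label,
    :encoding => string.encoding,        # the tag Ruby attached to the bytes
    :valid    => string.valid_encoding?  # whether the bytes fit that tag
  }
end

term      = "Du er n\u00E5 logget ut"                  # what the regex captures
page_text = term.dup.force_encoding(Encoding::BINARY)  # stand-in for browser.text

puts "source file: #{__ENCODING__}"                    # suspect 1
p encoding_report("term", term)                        # suspect 2
p encoding_report("browser.text", page_text)           # suspect 3
```

Dropping the same three lines into the step definition, with the real term and browser.text, should show which of the values carries the ASCII-8BIT tag.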
My first reaction was to add # encoding: utf-8 to the top of each of my files in the project. I should have thought of it earlier, since I’ve been having to do this on a lot of projects with Norwegian output lately. Unfortunately, that didn’t change the output for the failing scenario.
Next I dug into the documentation for Ruby’s Encoding class and tried setting the default internal and external encodings in the env.rb file:
Encoding.default_internal = Encoding::UTF_8
Encoding.default_external = Encoding::UTF_8
No dice. The output remained the same.
Next, I tried forcing the regex to deal with UTF-8 properly by adding the u flag (/matcher/u):
Then /^I should see "([^"]*)"$/u do |term|
The output was unchanged.
Reluctantly, I tried what feels like a fairly dirty hack:
browser.text.force_encoding('utf-8').should include(term)
This worked.
So it seems that FireWatir’s text comes back as raw bytes tagged ASCII-8BIT rather than as UTF-8.
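If the hack has to stay, it can at least be made a little less blind. The following is a sketch under the assumption that the bytes really are UTF-8; as_utf8 is a name I made up, not something FireWatir or Watir provides. It retags a copy instead of mutating the original string, and it complains loudly if the bytes turn out not to be valid UTF-8.

```ruby
# Hypothetical helper: returns a UTF-8 tagged copy of text, or raises if the
# underlying bytes are not valid UTF-8.
def as_utf8(text)
  return text if text.encoding == Encoding::UTF_8

  utf8 = text.dup.force_encoding(Encoding::UTF_8)  # retag only, bytes unchanged
  raise ArgumentError, "bytes are not valid UTF-8" unless utf8.valid_encoding?
  utf8
end

raw = "Du er n\u00E5 logget ut".dup.force_encoding(Encoding::BINARY)
puts as_utf8(raw).include?("n\u00E5")
```

In the step definition this would read as_utf8(browser.text).should include(term). Note that force_encoding never converts anything; it only changes the label on the bytes, so this helps only when the browser is in fact returning UTF-8.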