Ruby on Rails Tuesday, May 31, 2011

So... the old database used what type of encoding? latin1? And the
new one uses utf8?

Is it a problem if some articles were already cleaned by doing a
search and replace, e.g. swapping all ’ for its corresponding proper
symbol?

Really appreciate the help!

On May 29, 2:19 pm, Frederick Cheung <frederick.che...@gmail.com>
wrote:
> On May 29, 6:45 pm, daze <dmonopol...@gmail.com> wrote:
> > I'm wondering if anyone can give any insight into how I could resolve
> > the problem on this website:
>
> >http://jdrampage.com/
>
> > basically, all the ’ are supposed to be apostrophes ( ' ), and
> > quotes are messed up too...
>
> > Is it possible to run some command in the "rails console production"
> > to fix this?
>
> That does indeed look like an encoding issue. I assume that your '
> were in fact curly quotes. This kind of thing can happen when there is
> a mismatch between the encoding the database is using and what rails
> is using.
>
> For example if rails is using utf8, but the database connection is set
> to CP1252 then in order to save the curly quote character, your ruby
> script would send the bytes E2 80 99 which is the utf8 sequence for
> the unicode right single quotation mark (U+2019).
> If your db connection is set to be latin1 (or any similar single byte
> encoding) then it will happily store that byte sequence as it is.
>
> If now your app were to start doing the right thing and ask the db for
> utf8, then the db converts what it thinks is latin1 (but is actually
> already utf8) into utf8 a second time and so you get garbage (in cp1252
> E2 80 99 is ’ which is what I see on your website). In order to fix this
> you typically want to tell the database to reinterpret the contents of
> text columns as utf8. How exactly depends on your database, but in
> mysql something like
>
> ALTER TABLE foos MODIFY some_column BLOB;
> ALTER TABLE foos MODIFY some_column TEXT CHARACTER SET utf8;
>
> will reinterpret whatever is in some_column as utf8. This might not be
> exactly what you need - experiment with your data to see exactly what
> has happened (I once had a case where text was going through
> this double encoding process twice so I had to repeat the above
> commands twice to straighten out the data). Once you've sorted things
> out, make sure you don't fall into this hole again by making sure that
> all your databases and tables have their default encoding set to utf8.
>
> Fred
>
> > Really appreciate any help!!
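
The double encoding Fred describes can be reproduced in plain Ruby, which is handy for checking what happened to a given string before touching the database. What follows is only a sketch: the method names (double_encode, repair_once, repair) are made up for illustration and are not part of Rails or of anything in this thread, and it assumes Ruby 2.0 or later with strings already tagged as UTF-8. It works on Ruby strings only; the ALTER TABLE statements above remain the way to fix the stored columns. One known limitation: Ruby's Windows-1252 table leaves five bytes undefined (81, 8D, 8F, 90, 9D), so text containing, for example, a garbled right double quote (E2 80 9D) is left unchanged by this sketch.

```ruby
# Hypothetical helpers for illustration; none of these names come from Rails.

# Simulate the bug: UTF-8 bytes are mistaken for CP1252 and then
# converted to UTF-8 a second time.
def double_encode(text)
  text.dup.force_encoding(Encoding::Windows_1252).encode(Encoding::UTF_8)
end

# Undo one round of double encoding. If the text cannot be mapped back to
# CP1252, or the resulting bytes are not valid UTF-8, assume the text was
# not double encoded and return it unchanged.
def repair_once(text)
  candidate = text.encode(Encoding::Windows_1252).force_encoding(Encoding::UTF_8)
  candidate.valid_encoding? ? candidate : text
rescue Encoding::UndefinedConversionError
  text
end

# Repeat the repair until nothing changes, since the text may have gone
# through the double encoding more than once.
def repair(text, max_passes: 3)
  max_passes.times do
    fixed = repair_once(text)
    break if fixed == text
    text = fixed
  end
  text
end

clean   = "it\u2019s"            # "it" + U+2019 + "s", UTF-8 bytes E2 80 99 for the quote
garbled = double_encode(clean)   # the quote becomes U+00E2 U+20AC U+2122
puts garbled
puts repair(garbled)
```

Because repair_once returns its input when the round trip fails, text whose curly quotes are already correct is normally left alone: a correct U+2019 maps to the single CP1252 byte 92, which is not valid UTF-8 on its own. Plain ASCII also passes through unchanged. Clean text that happens to contain a sequence which looks like double-encoded UTF-8 would still be altered, so try it on a copy of the data first.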
