Ruby on Rails Tuesday, May 31, 2011

On May 31, 1:01 pm, daze <dmonopol...@gmail.com> wrote:
> So... the old database used what type of encoding?  latin1?  And the
> new one uses utf8?
>

It's your database, you tell me!

> Is it a problem if some articles were already cleaned by doing a
> search and replace, e.g. swapping all ’ for its corresponding proper
> symbol?

Depends. If you replaced them with pure ASCII then it's not a problem.
If not (i.e. a given column and table now holds a mix of encodings)
then you will have made things worse.

Fred
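
For the mixed case described above (some rows already cleaned, some still garbled), one possible approach is to attempt the repair per value and keep the original whenever the result is not valid UTF-8. This is only a heuristic sketch in plain Ruby; the helper name repair_if_garbled is made up for illustration, and it can still misfire on text that legitimately contains sequences such as "Ã©".

```ruby
# Hypothetical helper: undo one round of cp1252 -> utf8 double encoding,
# but only when the result is valid UTF-8 and actually differs.
def repair_if_garbled(text)
  # Map each character back to the single cp1252 byte it was misread from.
  bytes = text.encode("Windows-1252")
  candidate = bytes.force_encoding("UTF-8")
  # Correctly encoded text (e.g. a real U+2019) becomes invalid UTF-8 here,
  # so it is left alone. Pure ASCII comes back unchanged.
  candidate.valid_encoding? && candidate != text ? candidate : text
rescue EncodingError
  # Characters outside cp1252 cannot have come from this kind of garbling.
  text
end

garbled = "It\u00E2\u20AC\u2122s"   # what the website shows
clean   = "It\u2019s"               # an article that was already fixed

puts repair_if_garbled(garbled) == clean
puts repair_if_garbled(clean) == clean
puts repair_if_garbled("It's") == "It's"
```

Running something like this over a copy of the data first would show which rows it changes before anything is written back.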
>
> Really appreciate the help!
>
> On May 29, 2:19 pm, Frederick Cheung <frederick.che...@gmail.com>
> wrote:
>
>
>
> > On May 29, 6:45 pm, daze <dmonopol...@gmail.com> wrote:
> > > I'm wondering if anyone can give any insight into how I could resolve
> > > the problem on this website:
>
> > >http://jdrampage.com/
>
> > > basically, all the ’ are supposed to be apostrophes ( ' ), and
> > > quotes are messed up too...
>
> > > Is it possible to run some command in the "rails console production"
> > > to fix this?
>
> > That does indeed look like an encoding issue. I assume that your '
> > were in fact curly quotes. This kind of thing can happen when there is
> > a mismatch between the encoding the database is using and what rails
> > is using.
>
> > For example if rails is using utf8, but the database connection is set
> > to CP1252 then in order to save the curly quote character, your ruby
> > script would send the bytes E2 80 99, which is the utf8 sequence for
> > the unicode right single quotation mark (U+2019).
> > If your db connection is set to be latin1 (or any similar single byte
> > encoding) then it will happily store that byte sequence as it is.
>
> > If your app now starts doing the right thing and asks the db for
> > utf8, the database converts what it thinks is latin1 (but is actually
> > already utf8) into utf8 a second time, and so you get garbage (in
> > cp1252, E2 80 99 is ’, which is what I see on your website). In
> > order to fix this you typically want to tell the database to
> > reinterpret the contents of text columns as utf8. How exactly depends
> > on your database, but in mysql something like
>
> > alter table foos MODIFY some_column BLOB;
> > alter table foos MODIFY some_column TEXT CHARACTER SET utf8;
>
> > will reinterpret whatever is in some_column as utf8. This might not be
> > exactly what you need, so experiment with your data to see exactly
> > what has happened (I once had a case where text was going through
> > this double encoding process twice, so I had to repeat the above
> > commands twice to straighten out the data). Once you've sorted things
> > out, make sure you don't fall into this hole again by making sure that
> > all your databases and tables have their default encoding set to utf8.
>
> > Fred
>
> > > Really appreciate any help!!
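
The double encoding described in the quoted message can be reproduced without a database. The following is a minimal sketch in plain Ruby, assuming cp1252 as the single byte encoding; the helper names garble, repair and hex are made up for illustration. One caveat: Ruby's Windows-1252 leaves the bytes 81, 8D, 8F, 90 and 9D undefined, while MySQL's latin1 accepts them, so some real data may raise an error here.

```ruby
# Bytes that are really utf8 get treated as cp1252 and converted to utf8 again.
def garble(text)
  text.dup.force_encoding("Windows-1252").encode("UTF-8")
end

# The reverse trip: back to the cp1252 bytes, then relabel those bytes as utf8.
# This mirrors what the BLOB / TEXT round trip does inside the database.
def repair(text)
  text.encode("Windows-1252").force_encoding("UTF-8")
end

# Show a string's bytes as space-separated hex.
def hex(text)
  text.bytes.map { |b| format("%02X", b) }.join(" ")
end

quote = "\u2019"                  # right single quotation mark
puts hex(quote)                   # E2 80 99
garbled = garble(quote)
puts garbled                      # U+00E2, U+20AC, U+2122
puts repair(garbled) == quote

# Text that went through the process twice needs the repair twice.
twice = garble(garble(quote))
puts repair(repair(twice)) == quote
```

Looking at the hex bytes of a few affected values in this way is one option for working out how many rounds of conversion the data has been through.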

