Ruby on Rails Monday, July 4, 2011

Unicode uses them to tell the application reading the text file which
order the following bytes are in. That matters for UTF-16 and UTF-32,
where every code unit spans two or four bytes and a file may be
little-endian or big-endian; unless you know which order to expect,
you can't decode it reliably. UTF-8 is defined as a sequence of single
bytes, so it has no byte-order ambiguity. There the BOM (the bytes
EF BB BF) only acts as a signature that marks the file as UTF-8, which
is why some editors and spreadsheet exports add it and others don't.
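
As a rough sketch of the stripping Fred suggested, assuming Ruby 2.0 or
later (for String#b) and that the uploaded data is meant to be UTF-8:
strip_utf8_bom is a hypothetical helper name, not something Rails or
the CSV library provides.

```ruby
# Hypothetical helper (not part of Rails or the CSV library).
# The UTF-8 BOM is the three bytes EF BB BF at the very start of the data.
UTF8_BOM = "\xEF\xBB\xBF".b.freeze

def strip_utf8_bom(data)
  bytes = data.b                                   # binary copy, compared byte by byte
  bytes = bytes.byteslice(3..-1) if bytes.start_with?(UTF8_BOM)
  bytes.force_encoding(Encoding::UTF_8)            # assumption: the payload is UTF-8
end

# Sample upload: BOM, a header row, and "kühler" as raw UTF-8 bytes.
raw = "\xEF\xBB\xBFname,city\nk\xC3\xBChler,Berlin\n".b
clean = strip_utf8_bom(raw)
first_row = clean.lines.first.chomp.split(",")
p first_row
```

If the file is opened from a path instead, Ruby's own File.open(path,
"r:BOM|UTF-8") mode skips a leading BOM while reading, which may be
simpler than stripping it by hand.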

Walter

On Jul 4, 2011, at 3:02 AM, Sebastian wrote:

> Thank you for your reply!
>
> Stripping the first chars is possible of course, but I don't
> understand why these chars are there.
>
> It was working before! I could just upload the utf-8 csv and everything
> was working great. I don't really know what I changed so that these
> chars now appear.
>
> Sebastian
>
> On 1 Jul., 15:12, Frederick Cheung <frederick.che...@gmail.com> wrote:
>> On Jul 1, 11:48 am, Sebastian <sebastian.go...@googlemail.com> wrote:
>>
>>> OK,
>>
>>> it was working perfectly when I just made sure that my csv file is in
>>> utf-8 encoding format.
>>
>>> I deleted some of my program, so I had to write a lot of stuff
>>> again.
>>
>>> If I now upload a csv file which is in utf-8 format, the first three
>>> characters of the first row are always: \xEF\xBB\xBF
>>
>> That's a UTF BOM (byte order mark): a magic Unicode character that
>> tells whoever is reading the stream what the endianness is and also
>> lets them tell UTF-8 apart from UTF-16.
>> You can safely strip it from the file.
>>
>>
>>
>>> I read that this has something to do with unicode and ordering, but I
>>> don't know where these hex chars come from.
>>
>>> Also, every German special character is shown as hex codes,
>>> e.g. "k\xC3\xBChler" should be "kühler"
>>
>> That is probably just an output thing if you are seeing this in a
>> terminal window: \xC3\xBC is the UTF-8 byte sequence for ü
>>
>> Fred
>>
>>> If I use files in other encodings, these three chars are not at the
>>> beginning, but every special char is "?"
>>
>>> Has anyone an idea where this comes from?
>>
>>> Cheers,
>>> Sebastian
>>
>>> On 22 Jun., 13:26, Sebastian <sebastian.go...@googlemail.com> wrote:
>>
>>>> file.temp is an object. I have a form where a csv can be uploaded,
>>>> but it is never stored. That's why I use tempfile. That means that I
>>>> probably have no path to use in that method.
>>
>>>> BUT, the open and foreach methods of the CSV class work with an
>>>> object whenever I don't have a German special character in my csv
>>>> file or when my csv file is already in utf-8 encoding format.
>>
>>>> On 22 Jun., 12:05, Chirag Singhal <chirag.sing...@gmail.com> wrote:
>>
>>>>> What does file.tempfile return?
>>>>> If it is a file object, then we have a problem: we need to pass in
>>>>> a file path here.
>>>>> So call path on the file object and pass that as the first
>>>>> argument.
>
> --
> You received this message because you are subscribed to the Google
> Groups "Ruby on Rails: Talk" group.
> To post to this group, send email to rubyonrails-talk@googlegroups.com.
> To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en.
>

