Martin @ Blog

software development and life.

Encoding in Scala interpreter

One of the nice things about Scala is its command-line interpreter, a REPL (read-evaluate-print loop). Last week, for a particular project, I wanted to generate a string containing part of the Unicode character table.
Thanks to Scala’s concise syntax, this should not be very difficult:

(0x20AC until 0x20B6).foreach { x => print(x.toChar + " ") }

This example prints the characters from 0x20AC (the euro symbol) up to, but not including, 0x20B6 (a symbol unknown to me :) ), ten characters in total.
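Since the goal was a string rather than console output, the same range can also be collected into a value first. This is only a minimal sketch; the names `SymbolTable` and `symbols` are illustrative and not taken from the original project.

```scala
object SymbolTable {
  // Collect the characters 0x20AC to 0x20B5 into one space-separated string.
  // `until` excludes the upper bound, so 0x20B6 itself is not included.
  def symbols: String =
    (0x20AC until 0x20B6).map(_.toChar).mkString(" ")
}
```

In the interpreter, `SymbolTable.symbols` then yields ten characters separated by nine spaces, and the string can be inspected or written out independently of how the console encodes it.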

However, the result I got on my system (Mac OS X 10.6.2 using Scala 2.8 nightly) was not really what I expected:

? ? ? ? ? ? ? ? ? ?


Yes, indeed, I only got a list of question marks. Several attempts to work around it (writing the output to a file, printing it using other conversions, etc.) made no difference. I suspected it had something to do with the encoding, but I did not know enough about the subject to find the actual problem. I ended up posting a question on Stack Overflow:

I am not able to print unicode characters correctly. Of course a-z, A-Z, etc. are printed correctly, but for example € or ƒ is printed as a ?.

print(8364.toChar)
results in ? instead of €. Probably I’m doing something wrong. My terminal supports UTF-8 characters, and even when I pipe the output to a separate file and open it in a text editor, ? is displayed.

I got one answer, from someone who could not reproduce the problem:

Euro’s codepoint is 0x20AC (or in decimal 8364), and that appears to work for me (I’m on Linux, on a nightly of 2.8):

scala> print(0x20AC.toChar)
€

So either there was an issue on Mac OS X, or there was a bug in Scala. After a week or so, I still hadn’t found the cause (my original task, generating the string of Unicode characters, had by then been solved in a different way), so I decided to investigate a bit further. As is so often the case, the cause turned out to be pretty obvious: Scala uses the system property file.encoding to determine which encoding it should use. I posted the ‘solution’ on Stack Overflow:

The cause of the problem is the default encoding used by Mac OS X. When you start the `scala` interpreter, it uses the default encoding of the platform. On Mac OS X this is MacRoman; on Windows it is probably CP1252. You can check this by typing the following command in the scala interpreter:

    scala> System.getProperty("file.encoding");
    res3: java.lang.String = MacRoman

According to the scala help text, it is possible to pass Java properties using the -D option. However, this did not work for me. I ended up setting the environment variable

    JAVA_OPTS="-Dfile.encoding=UTF-8"

After starting scala again, the same command gives the following result:

    scala> System.getProperty("file.encoding")
    res0: java.lang.String = UTF-8

Now, printing special characters works as expected:

    print(0x20AC.toChar)
    €

So it is not a bug in Scala, but an issue with default encodings. In my opinion, it would be better if UTF-8 were the default on all platforms. While searching for whether this is being considered, I came across a discussion on the Scala mailing list about this issue. The first message proposes using UTF-8 by default on Mac OS X when file.encoding reports MacRoman, since UTF-8 is the default charset on Mac OS X (which leaves me wondering why file.encoding defaults to MacRoman; perhaps it is a legacy from Mac OS versions before 10?). I don’t think this proposal will make it into Scala 2.8, since Martin Odersky wrote that it is probably best to keep things as they are in Java (i.e. honor the file.encoding property).
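To see where the question marks come from, it helps to encode a character with a charset that cannot represent it. The sketch below is an illustration only: `EncodingDemo` and `roundTrip` are made-up names, and US-ASCII is used merely as a stand-in for a charset that lacks the euro sign. `String.getBytes` replaces every unmappable character with the charset’s replacement byte, which for US-ASCII is `?`.

```scala
import java.nio.charset.Charset

object EncodingDemo {
  // Encode the text with the given charset and decode it again with the
  // same charset. Characters the charset cannot represent are replaced
  // during encoding, so they come back as the replacement character.
  def roundTrip(text: String, charsetName: String): String = {
    val charset = Charset.forName(charsetName)
    new String(text.getBytes(charset), charset)
  }
}
```

With this in place, `EncodingDemo.roundTrip("\u20AC", "US-ASCII")` returns `?`, while `EncodingDemo.roundTrip("\u20AC", "UTF-8")` returns the euro sign unchanged.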

So the best way to prevent this issue is to set file.encoding to UTF-8 through the JAVA_OPTS environment variable, which is read by default on startup.
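When changing the environment is not an option, a program can also choose the output encoding itself. This is a sketch of an alternative that is not part of the original solution: it wraps an output stream in a `java.io.PrintStream` with an explicit encoding, so the result no longer depends on file.encoding. The names `Utf8Output`, `utf8Stream` and `encodedBytes` are hypothetical, and the terminal still has to interpret the bytes as UTF-8.

```scala
import java.io.{ByteArrayOutputStream, OutputStream, PrintStream}

object Utf8Output {
  // Wrap an output stream in a PrintStream that always encodes as UTF-8,
  // whatever the value of the file.encoding system property is.
  def utf8Stream(target: OutputStream): PrintStream =
    new PrintStream(target, true, "UTF-8")

  // Helper that returns the bytes written for a given text,
  // useful for checking the encoding without looking at a terminal.
  def encodedBytes(text: String): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val out = utf8Stream(buffer)
    out.print(text)
    out.flush()
    buffer.toByteArray
  }
}
```

For example, `Utf8Output.utf8Stream(System.out).println(0x20AC.toChar)` writes the three UTF-8 bytes of the euro sign (E2 82 AC) to standard output.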

Responses to “Encoding in Scala interpreter”

  1. May 28th, 2010 at 12:47

    OLIO – A Miscellany » Scala – Random Notes says:

    [...] Character encoding [...]

  2. August 9th, 2010 at 20:22

    Jesper says:

    Thanks for that! I had exactly the same problem. You probably saved me from hours of work looking for the cause.

  4. November 6th, 2010 at 7:16

    Shad Spurr says:

    Gosh, I’ve been looking about this specific topic for about an hour, glad i found it in your website!

  6. January 17th, 2011 at 5:13

    David Mortensen says:

    Thanks for publishing this! I was having the same problem and probably wouldn’t have solved it without your post.

  14. October 15th, 2011 at 21:41

    Hapjes Maken says:

    Thanks for posting this, without your help I wouldn’t have solved this.
