Ruby 1.9 Changes, Cherry Picked

In this post I look at some hand-picked changes in Ruby 1.9, including new features and deprecations. You can always check the full changelog at the main Ruby language site.

Ruby 1.9 brought some very nice and important changes, which were refined in 1.9.1 and 1.9.2. Naturally, upgrades come at a cost: some of the new features can break compatibility with existing code. I’ll show you the most important things to look out for, along with some of the awesomeness of this new version of Ruby.

Some of the most important improvements in Ruby 1.9

  • Much better threading: performance and tools are superior to Ruby 1.8. Read more about threads and fibers at Paul Barry’s blog.
  • Unicode support, finally, along with a brand new encoding engine. There’s a very informative post about this on Yehuda Katz’s blog.
  • The interpreter’s performance has been vastly improved. Check out the (v1.9.0) benchmark.
  • RubyGems is now integrated into Ruby itself.

Let’s look at the main incompatibilities that can break your code

  • The shiny new String class.
  • Literal Hash constructor has changed.
    • {"a", "b"} no longer makes a Hash; it is now a syntax error. Use {"a" => "b"}, {:a => "b"} or {a: "b"} instead.
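Here is a quick sketch of the literal forms that still work, plus Hash[] as one possible stand-in for the old comma form. The helper name legacy_pairs_to_hash is purely illustrative and not part of any library.

```ruby
# The literal forms that Ruby 1.9 accepts.
rocket_string = {"a" => "b"}
rocket_symbol = {:a => "b"}
new_style     = {a: "b"}   # shorthand for {:a => "b"}; symbol keys only

# Hash[] turns a flat key/value list into a Hash, so it can replace
# code that relied on the old {"a", "b"} form.
def legacy_pairs_to_hash(*flat_list)
  Hash[*flat_list]
end

puts rocket_symbol == new_style                          # >> true
puts legacy_pairs_to_hash("a", "b") == rocket_string     # >> true
```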

Now for the other important changes and upgrades

Changes in the Hash class

  • Hash finally preserves insertion order
    • Ruby 1.8
      hash = {:a=> 'A', :b=>'B', :c=>'C', :d=>'D'}
      hash.keys ====>  [:b, :c, :d, :a]
      hash.values ====> ["B", "C", "D", "A"]
            
    • Ruby 1.9
      hash = {:a=> 'A', :b=>'B', :c=>'C', :d=>'D'}
      hash.keys ====>  [:a, :b, :c, :d]
      hash.values ====> ["A", "B", "C", "D"]
            
  • Hash#to_s is much more readable (it now returns the same string as Hash#inspect)
    • Ruby 1.8
      hash = {:a=> 1, :b=>2, :c=>3, :d=>4}
      hash.to_s ====> "b2c3d4a1"
            
    • Ruby 1.9
      hash = {:a=> 1, :b=>2, :c=>3, :d=>4}
      hash.to_s ====> "{:a=>1, :b=>2, :c=>3, :d=>4}"
            
  • Hash#select now returns a hash, not an array
    • Ruby 1.8
      hash = {:a=> 1, :b=>2, :c=>3, :d=>4}
      hash.select{|k,v| k == :c }  ====> [[:c, 3]]
            
    • Ruby 1.9
      hash = {:a=> 1, :b=>2, :c=>3, :d=>4}
      hash.select{|k,v| k == :c } ====> {:c=>3}
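To see the ordering and the new select behaviour together, here is a small sketch. The sample_hash helper is just something I made up for the demo; to_a is one way to get back the 1.8-style array of pairs if old code depends on it.

```ruby
# Build a hash out of alphabetical order, then add a key later.
def sample_hash
  hash = {:d => 4, :a => 1}
  hash[:c] = 3
  hash
end

# Keys come back in insertion order, not sorted and not shuffled.
p sample_hash.keys        # >> [:d, :a, :c]

# select keeps the Hash type (and the order).
selected = sample_hash.select { |_key, value| value > 1 }
p selected.class          # >> Hash
p selected.keys           # >> [:d, :c]

# Old code that expects an array of [key, value] pairs can call to_a.
p selected.to_a           # >> [[:d, 4], [:c, 3]]
```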
            

String changes

  • Single character strings
    • Ruby 1.8
      irb(main):001:0> ?c
      => 99       
      irb(main):001:0> "cat"[1]
      => 97
            
    • Ruby 1.9
      irb(main):001:0> ?c
      => "c"
      irb(main):001:0> "cat"[1]
      => "a"
            
  • Encoding, encoding, encoding
    • All strings have an additional chunk of info attached: Encoding
       
      ruby-1.9.2-p136 :003 > "whatever".encoding
       => #<Encoding:UTF-8>
      ruby-1.9.2-p136 :004 > "whatever".encoding.name
       => "UTF-8" 
            
    • String#size takes encoding into account - returns the encoded character count
       
      puts utf8_string.size    # >> 6
      puts latin1_string.size  # >> 6
            
    • You can get the raw size in bytes
       
      puts utf8_string.bytesize    # >> 8
      puts latin1_string.bytesize  # >> 6
            
    • Indexed access is by encoded data - characters, not bytes
       
      puts utf8_string[2..4]    # >> sum
      puts latin1_string[2..4]  # >> sum
            
    • You can change encoding by force but it doesn’t convert the data
       
      my_string = "Whatever"
      puts my_string.encoding.name  # >> US-ASCII (this source file has no encoding comment)
      my_string.force_encoding("UTF-8")
      puts my_string.encoding.name  # >> UTF-8
      
      # changing the encoding doesn't convert the data though!
      latin1_string.force_encoding("UTF-8")
      puts latin1_string.encoding.name    # >> UTF-8
      puts latin1_string.bytesize         # >> 6
      puts latin1_string.valid_encoding?  # >> false
      latin1_string =~ /AR/  # !> ArgumentError: invalid byte sequence in UTF-8
            
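    • You can transcode a string to ‘fix’ the above ‘error’
       
      # encode converts the bytes; the second argument names the encoding the data is really in
      transcoded_utf8_string = latin1_string.encode("UTF-8", "ISO-8859-1")
      puts transcoded_utf8_string.valid_encoding?  # >> true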
            
    • Iterators have changed because String is no longer Enumerable
      # Use each_byte, each_char, each_line or each_codepoint
      
      utf8_resume.each_byte do |byte|
        puts byte
      end
      # >> 82
      # >> 195
      # >> 169
      # >> 115
      # >> 117
      # >> 109
      # >> 195
      # >> 169
      
      utf8_resume.each_char do |char|
        puts char
      end
      # >> R
      # >> é
      # >> s
      # >> u
      # >> m
      # >> é
      
      utf8_resume.each_codepoint do |codepoint|
        puts codepoint
      end
      # >> 82
      # >> 233
      # >> 115
      # >> 117
      # >> 109
      # >> 233
      
      # if you need custom processing, you can get the enumerators with bytes, chars, lines and codepoints
      
      p utf8_resume.bytes.first(3)
      # >> [82, 195, 169]
      p utf8_resume.chars.find { |char| char.bytesize > 1 }
      # >> "é"
      p utf8_resume.codepoints.to_a
      # >> [82, 233, 115, 117, 109, 233]
      p utf8_resume.lines.map { |line| line.reverse }
      # >> ["émuséR"] 
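The encoding snippets above use utf8_string, latin1_string and utf8_resume without ever defining them. Here is a minimal sketch of how such strings could be built so the examples can be tried out. The helper names and the choice of ISO-8859-1 for the Latin-1 data are my assumptions.

```ruby
# "Résumé" written with \u escapes, so the literal is UTF-8 no matter
# what the source file encoding is.
def utf8_resume
  "R\u00E9sum\u00E9"
end

# The same text transcoded to Latin-1 (one byte per character).
def latin1_resume
  utf8_resume.encode("ISO-8859-1")
end

puts utf8_resume.size          # >> 6
puts utf8_resume.bytesize      # >> 8
puts latin1_resume.size        # >> 6
puts latin1_resume.bytesize    # >> 6
puts utf8_resume[2..4]         # >> sum

# Relabelling Latin-1 bytes as UTF-8 leaves the bytes untouched, and invalid.
relabelled = latin1_resume.force_encoding("UTF-8")
puts relabelled.valid_encoding?   # >> false

# Transcoding from the real source encoding repairs the string.
fixed = relabelled.encode("UTF-8", "ISO-8859-1")
puts fixed.valid_encoding?        # >> true
puts fixed == utf8_resume         # >> true

# The integer values from 1.8 are still available through ord.
puts ?c.ord                       # >> 99
puts "cat"[1].ord                 # >> 97
```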
            

Block variables now shadow local variables

#Ruby 1.9

irb(main):001:0> i=0; [1,2,3].each {|i|}; i
=> 0
irb(main):002:0> i=0; for i in [1,2,3]; end; i
=> 3

#Ruby 1.8.6

irb(main):001:0> i=0; [1,2,3].each {|i|}; i
=> 3
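A slightly longer sketch of the scoping rules, wrapped in methods so each case starts from a clean slate (the method names are made up for the demo). Only block parameters, and names listed after a semicolon in the parameter list, are local to the block; any other outer variable is still shared.

```ruby
# The block parameter i is local to the block, so the outer i survives.
def outer_after_block_param
  i = 0
  [1, 2, 3].each { |i| }
  i
end

# A variable that is not a block parameter is still shared with the block.
def outer_after_assignment
  total = 0
  [1, 2, 3].each { |n| total += n }
  total
end

# Names after the semicolon are declared block-local.
def outer_with_block_local
  total = 0
  [1, 2, 3].each { |n; total| total = n }
  total
end

puts outer_after_block_param   # >> 0
puts outer_after_assignment    # >> 6
puts outer_with_block_local    # >> 0
```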

tr and Regexp are Unicode-compatible
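A small sketch of what that means in practice, assuming a UTF-8 string: tr replaces whole characters, and regular expressions match and count characters rather than bytes. The accented_word helper is only there for the demo.

```ruby
# "Résumé" written with \u escapes so the string is always UTF-8.
def accented_word
  "R\u00E9sum\u00E9"
end

# tr works on characters, even multibyte ones.
puts accented_word.tr("\u00E9", "e")   # >> Resume

# The regexp dot matches one character, not one byte.
puts accented_word.scan(/./).size      # >> 6

# Match positions are character offsets (the byte offset would be 3).
puts(accented_word =~ /sum/)           # >> 2
```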

You can now specify source file encoding

Basic

# coding: utf-8

Emacs

# -*- encoding: utf-8 -*-

Shebang

#!/usr/local/rubybook/bin/ruby
# encoding: utf-8
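To check that the comment was picked up, you can ask the file itself through __ENCODING__. This is only a sketch: the comment has to sit on the first line of the file (or the second, when the first is a shebang), and source_encoding_name is a name I made up.

```ruby
# encoding: utf-8

# __ENCODING__ reports the encoding the parser used for this file.
def source_encoding_name
  __ENCODING__.name
end

puts source_encoding_name            # >> UTF-8

# Plain string literals inherit the source encoding.
puts "plain literal".encoding.name   # >> UTF-8
```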

inject accepts a method name

#Ruby 1.9

[1,2].inject(:+)

#Ruby 1.8.6

[1,2].inject {|a,b| a+b}
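A few more variations, as a sketch: the symbol form also takes an initial value, and the related &:symbol shorthand works with any method that takes a block. The helper names are illustrative only.

```ruby
# Sum a list with the symbol form.
def sum_with_symbol(list)
  list.inject(:+)
end

# Multiply a list, starting from an explicit initial value.
def product_with_initial(list)
  list.inject(1, :*)
end

numbers = [1, 2, 3, 4]
puts sum_with_symbol(numbers)                # >> 10
puts product_with_initial(numbers)           # >> 24
puts numbers.inject { |sum, n| sum + n }     # >> 10

# Symbol#to_proc gives a similar shorthand for other iterators.
puts numbers.map(&:to_s).join("-")           # >> 1-2-3-4
```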

Lambda shorthand syntax

#Ruby 1.9

p = -> a,b,c {a+b+c}
puts p.(1,2,3)
puts p[1,2,3]

#Ruby 1.8.6

p = lambda {|a,b,c| a+b+c}
puts p.call(1,2,3)
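The arrow syntax also allows parentheses around the parameters and default values. Here is a short sketch; ADD is just a name for the demo (I avoid p as a variable name here so that Kernel#p stays easy to read).

```ruby
# A lambda with a default value for its second parameter.
ADD = ->(a, b = 10) { a + b }

puts ADD.call(1)     # >> 11
puts ADD.(1, 2)      # >> 3
puts ADD[5]          # >> 15
puts ADD.lambda?     # >> true

# Lambdas still check their arguments strictly.
begin
  ADD.call
rescue ArgumentError
  puts "ArgumentError"   # >> ArgumentError
end
```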

Complex numbers

#Ruby 1.9

Complex(3,4) == 3 + 4.i   # Numeric#i is in core; 4.im needs the old 'complex' library
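A few things you can do with a Complex value without requiring anything, as a sketch (Z is just a demo constant):

```ruby
Z = Complex(3, 4)

puts Z.real              # >> 3
puts Z.imaginary         # >> 4
puts Z.abs               # >> 5.0
puts Z.conjugate         # >> 3-4i
puts Z * Z               # >> -7+24i

# Numeric#i builds the imaginary part.
puts Z == 3 + 4.i        # >> true
```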

Multi-splat

# This will work on both Ruby 1.8 and 1.9 (the splat is the last item)
a, b, c = 1, *[2, 3]

# This will fail on 1.8, but work on 1.9 (the splat is not the last item)
a, b, c = *[1, 2], 3
 
# Even this will work on 1.9, but not 1.8
a, b, c, d, e, f = *[1, 2], 3, *[4, 5]
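The same freedom applies to method calls, array literals and the left-hand side of an assignment. A minimal sketch, with sum3 and split_ends as made-up helpers:

```ruby
def sum3(a, b, c)
  a + b + c
end

# A splat in the middle of the left-hand side soaks up the leftovers.
def split_ends(list)
  first, *middle, last = list
  [first, middle, last]
end

# Several splats in one method call.
p sum3(*[1], 2, *[3])             # >> 6

# Several splats inside an array literal.
p([*[1, 2], 3, *[4, 5]])          # >> [1, 2, 3, 4, 5]

p split_ends([1, 2, 3, 4, 5])     # >> [1, [2, 3, 4], 5]
```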
   

Blocks can now accept block arguments

define_method(:answer) { |&b| b.call(42) }
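Put into a class, the one-liner can be used like any other method that takes a block. This is just a sketch; the Oracle class is made up for the demo.

```ruby
class Oracle
  # The block passed to define_method declares its own block parameter (&block),
  # so the generated method can receive a block from its caller.
  define_method(:answer) { |&block| block.call(42) }
end

puts Oracle.new.answer { |n| n * 2 }   # >> 84
```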

Copyright © 2013 Csaba Okrona