UTF-8 / GBK lookup in Ruby

GBK and UTF-8 are the two most popular charsets on Chinese websites. UTF-8 is supposed to be the standard, but GBK is still widely used, partly because it is the default on Chinese-language Windows.

Converting a string between the two is not hard, but in my situation I need a lookup between their code values. For example, “探” has the Unicode code point 63a2 (its UTF-8 bytes are E6 8E A2), while its GBK code is CCBD. The usual display formats are \u63a2 for the Unicode side and %CC%BD for GBK.
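To make the two display formats concrete, here is a minimal sketch that prints both for a single character. The helper names unicode_escape and gbk_percent are made up for illustration, and the sketch assumes the source file is saved as UTF-8 (Ruby's default source encoding since 2.0).

```ruby
# Hypothetical helpers, named here only for illustration.

# Unicode code point of a single character, shown as a \uXXXX escape.
def unicode_escape(char)
  format('\u%04x', char.ord)
end

# GBK bytes of a string, shown as a percent-encoded string.
def gbk_percent(str)
  str.encode('GBK').bytes.map { |b| format('%%%02X', b) }.join
end

puts unicode_escape("探")  # \u63a2
puts gbk_percent("探")     # %CC%BD
```

Note that the \u form carries the code point, not the UTF-8 byte sequence, so the lookup is really between Unicode code points and GBK byte pairs.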

Code example:

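# s is the input text containing an escape such as \u63a2
# (a literal backslash, "u", and four hex digits)

# get the four hex digits after the literal "\u"
hex = s[/\\u(\h{4})/, 1]

# turn the code point into a UTF-8 string
utf = [hex.to_i(16)].pack('U')

# convert to GBK (Iconv was removed from the standard library in Ruby 2.0,
# so String#encode is used instead)
gbk = utf.encode('GBK')

# percent-encode the GBK bytes, e.g. "%CC%BD"
# (URI.encode was removed in Ruby 3.0)
gbk_encode = gbk.bytes.map { |b| format('%%%02X', b) }.join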
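
The steps above can be wrapped into a pair of helpers so the lookup works in both directions. This is only a sketch: the method names are hypothetical, and it makes three assumptions. Every \u escape is a four-digit BMP code point (no surrogate pairs), every GBK byte on the reverse side is percent-encoded, and every character exists in GBK (otherwise String#encode raises Encoding::UndefinedConversionError).

```ruby
# Hypothetical helper names; not from the original post.

# Replace every \uXXXX escape in `text` with the percent-encoded GBK bytes
# of that character. All other text is left untouched.
def unicode_escapes_to_gbk(text)
  text.gsub(/\\u(\h{4})/) do
    char = [$1.to_i(16)].pack('U')
    char.encode('GBK').bytes.map { |b| format('%%%02X', b) }.join
  end
end

# Reverse direction: turn percent-encoded GBK bytes back into \uXXXX escapes.
# Only %XX sequences are read; anything else in the input is ignored.
def gbk_to_unicode_escapes(percent)
  bytes = percent.scan(/%(\h{2})/).flatten.map { |h| h.to_i(16) }
  gbk = bytes.pack('C*').force_encoding('GBK')
  gbk.encode('UTF-8').each_char.map { |c| format('\u%04x', c.ord) }.join
end

puts unicode_escapes_to_gbk('\u63a2')  # %CC%BD
puts gbk_to_unicode_escapes('%CC%BD')  # \u63a2
```

Using gsub with a block means a string with several escapes mixed into ordinary text, such as a query string, can be converted in one pass.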



