Every Monday, a radio station in Singapore uploads a new episode of its Movie Review program, one of my favorite podcasts. Unfortunately, they only update the HTML page; the podcast feed usually isn't updated until days later.
I can’t wait that long, so I wrote a Ruby program that scrapes their HTML page, extracts the mp3 URL, and generates an up-to-date podcast feed for my iTunes.
mc = MovieCafe.new
newitem = mc.mp3_list.first
# Only touch the feed when the page lists an episode the feed doesn't have yet.
if newitem.mp3_url != mc.rss.items.first.enclosure.url
  # item = RSS::Rss::Channel::Item.new
  # Reuse the first feed item and overwrite its fields with the new episode.
  item = mc.rss.items.first
  item.title = newitem.title
  item.enclosure.url = newitem.mp3_url
  item.description = newitem.description
  item.pubDate = Time.now
  mc.rss.channel.lastBuildDate = Time.now
end
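The scraping behind mp3_list isn't shown here. As a rough sketch of how that step could work, assuming the page links to each episode with a plain anchor tag: the HTML fragment, the example.com URLs, the Episode struct and the extract_episodes helper are all made up for illustration and are not the station's real markup or my actual MovieCafe code.

```ruby
# Hypothetical HTML fragment standing in for the station's episode page.
html = <<~HTML
  <ul>
    <li><a href="http://example.com/audio/review-02.mp3">Review 02</a></li>
    <li><a href="http://example.com/audio/review-01.mp3">Review 01</a></li>
    <li><a href="/about.html">About</a></li>
  </ul>
HTML

# Hypothetical container for one scraped episode.
Episode = Struct.new(:title, :mp3_url)

# Collect every link whose href ends in .mp3, in page order.
def extract_episodes(html)
  html.scan(/<a\s+href="([^"]+\.mp3)"[^>]*>([^<]*)<\/a>/i).map do |url, title|
    Episode.new(title.strip, url)
  end
end

episodes = extract_episodes(html)
puts episodes.first.mp3_url
```

A regex is good enough for a page whose layout rarely changes; a real HTML parser would be more robust if the markup gets messier.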
The problem I ran into was that the HTML page is encoded in gb2312, while the podcast feed is UTF-8, so I had to convert the charset. Eventually I got it working (gbk is a superset of gb2312, so it handles the same pages):
title = Iconv.new("UTF-8", "gbk").iconv(title)
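Iconv was removed from Ruby's standard library in 2.0, so on a newer Ruby the same conversion can be done with String#encode. A minimal sketch, assuming the scraped title arrives as raw GBK bytes; the sample bytes and the gbk_to_utf8 helper name are mine, for illustration only.

```ruby
# Hypothetical sample: the GBK bytes for the two characters U+4E2D U+6587 ("Chinese").
raw_title = [0xD6, 0xD0, 0xCE, 0xC4].pack("C*")

# Label the bytes as GBK, then transcode to UTF-8.
# Invalid or unmappable bytes are replaced instead of raising an error.
def gbk_to_utf8(bytes)
  bytes.dup.force_encoding("GBK").encode("UTF-8", invalid: :replace, undef: :replace)
end

title = gbk_to_utf8(raw_title)
puts title
```

force_encoding only relabels the bytes; encode is the call that actually rewrites them as UTF-8.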
Here is my podcast feed for the Movie Review channel, which is at least two days ahead of the official one.