HpricotBasics on Hpricot
Perl??/W3C???????? - Walrus, Digit.
Web???? HTML ???????RSS ?????????
Rubyful Soup is a Ruby port of the hit Python HTML/XML parser Beautiful Soup. It's designed to be a useful quick-and-dirty parser for screen-scraping, along the same lines as its parent.
A library for parsing various microformats under Ruby
??????????????HTML??????URL???????????????????????????????????????handle() ? extrac() ??????
Syck is an extension for reading and writing YAML swiftly in popular scripting languages.
A JavaScript YAML dumper/parser.
This is a simple HTML link extractor designed for the person who does not want to deal with the intricacies of HTML::Parser or the de-referencing needed to get links out of HTML::LinkExtor.
REXML 2.4.2???????XPath?????????
???REXML API??????
Unfortunately, it only works with an XML file on the local file system, but if you already use a proxy (for cross-domain Ajax calls, say), working around that is a simple task.
Java??????HTML?????????????????HTML????????????????????HTML?XHTML?????????????????????????
?????????????????????????????????????????????????????????????HTML?????RSS?OPML?????????
HTML::Selector::XPath ?????: blog.bulknews.net
Libxml2 ? C ?????C ???????????????????????? Libxml2 ???????????????????
URLParser - Public code - Trac
FasterCSV ?????? csv ???? 5?6 ????Python???????? ? ruby ????? 245 ????
Clear in the Overhead: Libxml-Ruby 0.4.0 preview 1 released!
Ruby ? JSON ???????????? - WebOS Goodies