Realigning the Engine Yard AppCloud UI »
Created at: 18.11.2010 02:04, source: Engine Yard Blog, tagged: News 960 css design ui ux
My colleague Andrew announced the new Engine Yard AppCloud User Experience team about a month ago. We are very excited about the ideas we have for AppCloud's UI, and are working surely and steadily toward a better user experience. Along the way, one of our priorities was to refactor the UI's HTML/CSS for better maintainability and consistency, which lets us work more efficiently on our design and front-end architecture. Today, you will notice a major layout update: we've moved to a fixed-width layout so that the AppCloud UI can provide a better experience on smaller screens, which is still very important today given the popularity of smaller laptops and devices like the iPad. We opted for a fixed layout over a responsive fluid layout for now in order to maximize maintainability. In general, we believe fluid layouts are better suited to publication websites than to applications. Instead of trying to reinvent the wheel, we opted for 960.gs, a solid grid system that is very easy to integrate into our workflow. Another change you will notice is that AppCloud is tighter and more polished. We're implementing design systems for our typography, colors, forms, tables, and other elements in our UI, which gives everything a consistent visual language. We'll show off our style guide in an upcoming blog post. We would love to know what you think. Please feel free to share your thoughts with us over email or Twitter.
Double Shot #652 »
Created at: 19.02.2010 13:56, source: A Fresh Cup, tagged: Double Shot bundler css git hoptoad jquery nested_attributes query_trace rails
Remember kids, sleep is for the weak and sensible.
- Nathan Myhrvold's Intellectual Ventures Using Over 1,000 Shell Companies To Hide Patent Shakedown - The only reason I call the patent system "broken" is that I can't think of a more appropriate word that's fit for polite company.
- Git v1.7.0 Release Notes - Time to upgrade again. Note that there are some minor behavior changes.
- EZ-CSS - Yes, another CSS framework - but not yet another grid.
- Hoptoad Notifier: Rack and automatic Metal support - Another functionality bump from the Hoptoad folks.
- A Visual Git Reference - With boxes and arrows.
- query_trace - I pushed out a new version of this gem yesterday.
- Rails 3 Beautiful Code - Presentation from Greg Pollack. Personally I am skeptical of "beauty" as a code metric in the working world.
- Bundler and I are breaking up - I've seen quite a bit of Bundler hate recently. Maybe it's just that it's not stable enough yet.
- Is The Singularity Here Yet? - Check the page source.
- Complex Nested Forms with Rails and Unobtrusive jQuery - This worked fine for me after I remembered that _delete had been renamed to _destroy in Rails 2.3.5.
Double Shot #644 »
Created at: 09.02.2010 13:10, source: A Fresh Cup, tagged: auto_timeout aws css devise haml mapreduce rails riak ruby savon sinatra textmate
There's certainly no shortage of goings-on to link to lately.
- Devise 1.0.0 - Major milestone for this Rack-based authentication solution for Rails.
- Cloudfront: no-brainer CDN Support for S3 - I've got a couple of apps out there where I really should implement this.
- Save file and reload Safari from TextMate - Passed on by Alex Heaton.
- Bye Bye Github and GitHub post - Afterthoughts - Elad Meidar discusses expectations and git hosting.
- SendEmail 1.56 - Command line email sending that can handle TLS (and thus Google Mail servers). Requires some perl modules.
- sinatra_more - Gem to boost Sinatra to handling more complex applications.
- Auto timeout sessions in Rails - Useful plugin from Matthew Bass, for the times when you need to keep people from just idling. I ended up forking it to make it use jQuery instead of Prototype.
- MR.Flow - Web-based designer for MapReduce operations.
- EdgeRails.info - Ryan Daigle is moving all his edge content to a fresh domain.
- RubyInstaller Release Candidate 2 - 1.8.6, 1.8.7 and 1.9.1 - Useful tool for people who want to run Ruby on Windows without resorting to a virtual machine.
- Savon - Ruby SOAP client that includes WSSE authentication.
- Wrap your SQL head around Riak's Map-Reduce - Personally my SQL head says "no! enough already!" but I expect I'll have to learn this stuff sooner or later.
- Haml Sucks for Content - Actually a good post on some advanced HAML techniques rather than another entry in the holy war.
- Less.app - Autocompiler for your LESS CSS files. Well, if you have any.
Double Shot #643 »
Created at: 08.02.2010 13:03, source: A Fresh Cup, tagged: Double Shot bundler css javscript jquery mockups rails riak whois
Whew, busy weekend. It's scary when you look forward to Monday as a day of rest.
- Rapid App RC1 (Rapp) & Rails 3.0b Support - Webbynode is early to the Rails 3.0 hosting party.
- The easiest way to test Rails 3 - Instructions from BitNami to get up and running with their RubyStack.
- The Path to Rails 3: Greenfielding new apps with the Rails 3 beta - Walkthrough from Jeremy McAnally.
- Signed and Permanent cookies in Rails 3 - Pratik covers one of the new Rails 3 features.
- 52Framework - Claims to be the first HTML5/CSS3 framework, and as far as I know, it is.
- Standard Browser Security Features - Surely the most comprehensive writeup of same origin policy in existence.
- Rails Plugins - Site that's tracking whether plugins are ready for Rails 3. Not much uptake yet.
- The Rails Initialization Process - Ryan Bigg goes digging into the guts of Rails 3.
- Why Riak should power your next Rails app - Deep dive into another NoSQL alternative.
- Auditing Rails Projects - Quite old, but still workable technique for adding rcov builds to cerberus.
- D'Note - Automatically extract TODO and FIXME and so on from your source code. This could come in handy as part of a build process.
- mocklinkr - Make your mockups linked & clickable.
- Facebooker tips 1 - SessionExpired, Cucumber and Default environment - Elad Meidar is starting to document his battles with Rails + FaceBook.
- RoboDomain - Domain management tool for people with a lot of domains.
- Ruby Whois 1.0 is here! - The code behind RoboDomain.
- rake db:size - Rake tasks to get database and table sizes. A little collaboration between Elad and myself.
- jDiv - jQuery rollover dropdown panel.
- jquery-twit - And, yes, Twitter display via jQuery.
- Bundle me some Rails - An introduction to the way Rails 3 does library management. I remain deeply skeptical.
- MegaMutex: A Distributed Mutex for Ruby - Lock things cross-machine using memcached.
- Javascript unpacker and beautifier - Makes dealing with minified js a bit easier.
Getting Started with Nokogiri »
Created at: 14.01.2010 20:00, source: Engine Yard Blog, tagged: Technology css html Nokogiri ParseTree XML XPath
Nokogiri is a library for dealing with XML and HTML documents. I wrote Nokogiri along with my (more attractive) partner in crime, Mike Dalessio. We both use and enjoy working with Nokogiri for dealing with HTML and XML on a daily basis, and I’d like to share it with you! In this post, we’ll be covering:
- Getting Nokogiri installed
- Basic document parsing
- Basic data extraction
Hopefully, by the end of this article, you too will be able to use and enjoy Nokogiri on a day-to-day basis!
Installation
Nokogiri is actually a wrapper around Daniel Veillard’s excellent HTML/XML parsing library, libxml2. Since Nokogiri simply wraps and builds upon this existing library, installing libxml2 is a prerequisite for installing Nokogiri. Fortunately, libxml2 has been ported to most systems, so the installation is pretty easy.
OS X
OS X ships with libxml2 installed, but the version in MacPorts is more up to date, so I recommend installing libxml2 from MacPorts instead.
To install libxml2 from macports:
$ sudo port install libxml2 libxslt
Then to install nokogiri:
$ sudo gem install nokogiri
And that should be it!
Linux
On Linux, we still need to install libxml2. The command for installing libxml2 depends on the package manager and Linux distribution you’re using, but we’ll cover Fedora and Ubuntu here.
On Fedora:
$ sudo yum install libxml2-devel libxslt-devel
$ gem install nokogiri
On Ubuntu:
$ sudo apt-get install libxml2 libxml2-dev libxslt libxslt-dev
$ gem install nokogiri
Windows
Dealing with libxml2 on Windows is so much work that we built libxml2 for you, and now ship it along with Nokogiri. To install on Windows, simply run gem install nokogiri.
Oh Noes! Something Went Wrong!
Nokogiri ships with some basic intelligence for finding your installation of libxml2, but clever developers can easily fool it! If you have problems, first check that the libxml2 and libxslt development packages are installed. If everything seems OK, and Nokogiri still won’t install, send an email to the Nokogiri mailing list. We’re here to help!
Basic Parsing
Now that we have installation out of the way, it’s time to get Nokogiri to do some work for us. Nokogiri lets you parse an HTML or XML document using a few different strategies:
- DOM
- SAX
- Reader
- Pull
Each of these strategies has different advantages and disadvantages. We won’t go through all the differences in this post; the DOM interface is the most common, and generally regarded as the easiest to use, so that’s what we’ll focus on here.
There are two main entry points to Nokogiri depending on the kind of document you wish to parse: one for HTML documents and one for XML documents. Parsing HTML documents looks like this:
doc = Nokogiri::HTML(html_document)
Parsing XML documents looks like this:
doc = Nokogiri::XML(xml_document)
Both of these functions will take an IO object or a String object. Since both forms accept IO objects, we can even feed open-uri straight in to Nokogiri like this:
doc = Nokogiri::HTML(open("http://www.google.com/search?q=doughnuts"))
Feeding Nokogiri an IO object is slightly more efficient than using a String, but you should choose the one that is most convenient.
Data Structures
To become data extraction Zen Masters, we first need to understand the data structure returned by Nokogiri. Notably, we need to understand that Nokogiri converts HTML and XML documents into a tree data structure.
For example, an HTML document that looks like this:
<html>
<head>
<title>Hello!</title>
</head>
<body id="uniq">
<h1>Hello World!</h1>
</body>
</html>
…will be represented in memory with a tree that looks like this:
html
  head
    title
      "Hello!"
  body (id="uniq")
    h1
      "Hello World!"
Any data extraction technique used is simply a way for traversing this in-memory tree. If we keep this structure in mind while trying to do data extraction, we can enter data extraction nirvana!
Data Extraction
We’ve seen how to turn an HTML or XML document into an in-memory tree. Now we’re going to try to do something useful with this tree: extract some data. Let’s take a look at a few different strategies for unlocking the data in our tree.
There are three different ways to traverse our in-memory tree. The first two, XPath and CSS, are small languages built specifically for tree traversal. The last one we’ll examine is the Nokogiri API for manual tree traversal.
Basic XPath
The XPath language was written to easily traverse an XML tree structure, but we can use it with HTML trees as well. Here’s a sample program for extracting search result links from a Google search. We’ll use XPath to find the data we want, and then pick apart the XPath syntax:
require 'open-uri'
require 'nokogiri'

doc = Nokogiri::HTML(open("http://www.google.com/search?q=doughnuts"))
doc.xpath('//h3/a').each do |node|
  puts node.text
end
The XPath used in this program is:
//h3/a
In English, this XPath says:
Find all “a” tags with a parent tag whose name is “h3”
Thus, our program finds all “a” tags with “h3” parents, loops over them, and prints out the text content.
XPath works like a directory structure where the leading “/” indicates the root of the tree. Slashes separate the tag matching information. When there’s nothing between two slashes (“//”), it acts as a sort of wildcard, meaning “any tags may appear here”, so the match can start at any depth in the tree. The “h3” and “a” are tag name matchers, and only match when the tag name matches.
Finding tag names is great, but if you run the previous program, you might find that it returns more “a” tags than we actually want. We need to narrow down our search based on some attributes of the tags, specifically the “class” values. To match attribute values in XPath, we use brackets. Now let’s look at a couple of examples.
To match “h3” tags that have a class attribute, we write:
h3[@class]
To match “h3” tags whose class attribute is equal to the string “r”, we write:
h3[@class = "r"]
Using the attribute matching construct, we can modify our previous query to:
//h3[@class = "r"]/a[@class = "l"]
which in English terms is:
Find all “a” tags with a class attribute equal to “l” and an immediate parent tag “h3” that has a class attribute equal to “r”
If we substitute that XPath back in to our original program, we’ll get the expected results.
For more information on doing XPath queries, I recommend checking out the tutorial at W3Schools as well as the W3C recommendation.
For more information on using XPath within Nokogiri, check out the Nokogiri tutorials as well as the RDoc.
Next, let’s look at CSS syntax.
Basic CSS
CSS is similar to XPath in that it’s another language for searching a tree data structure. In this section, we’ll perform the same task as the XPath section, but we’ll examine the CSS syntax.
CSS does not separate tag matching patterns by slashes, but rather by whitespace or “greater than” characters (actually, there are more, but we’re just going to talk about those two for now). Let’s rewrite our previous XPath as CSS and examine the syntax.
//h3/a
…can be written in CSS as:
h3 > a
The “>” character indicates that the “a” tag must be a direct child of the “h3” tag. Most CSS that I see uses space separators like this:
h3 a
Using a space indicates that there could be any number of tags between the “h3” tag and the “a” tag. The space is similar to “//” in XPath, and this CSS query could be written in XPath like this:
//h3//a
Similar to XPath, CSS can use brackets for matching attributes. Let’s do a couple more XPath to CSS translations. On the left is XPath, on the right is CSS:
h3[@class] => h3[class]
h3[@class = "r"] => h3[class = "r"]
This syntax works, but CSS provides us with a shorthand for matching the “class” attribute. To find all h3 tags whose class attribute contains “r”, we can say:
h3.r
There’s a subtle difference between the two previous examples. The bracket form, h3[class = "r"] in CSS or h3[@class = "r"] in XPath, must be an exact match; the class value must exactly equal the string r. The selector h3.r, by contrast, means “the class attribute must contain r as one of its whitespace-separated values”. That means h3.r will match the following tag, but the bracket form will not:
<h3 class="r foo">Hi!</h3>
The XPath selector and our translated CSS selector would not match this tag, but the “h3.r” selector would. Most of the time, the CSS class selectors do what we want. Only when I need something very specific do I use the bracket form in my CSS selectors.
With this knowledge in hand, we can rewrite our original program using CSS selectors:
doc = Nokogiri::HTML(open("http://www.google.com/search?q=doughnuts"))
doc.css('h3.r > a.l').each do |node|
  puts node.text
end
I think the CSS selectors usually result in more concise and clear queries than XPath, so I usually stick to CSS queries in my code. There are some tasks which CSS cannot accomplish that XPath can though, so it’s nice to be able to fall back to XPath queries when I need to.
Next, let’s look at some basic node APIs provided by Nokogiri.
Basic Node API
Since we’re dealing with a tree data structure, Nokogiri provides methods for navigating that tree. In fact, all of the tree traversal we’ve seen so far using XPath and CSS can be accomplished manually via Ruby. Manual tree traversal is, however, cumbersome and verbose, which is why languages like XPath and CSS exist. Sometimes a combination of XPath or CSS plus manual tree traversal is easiest, so it is still important to know the API.
Every tag in a document is represented by a class called Node. Each node in the tree has 0 or more children, 0 or 1 parent, 0 or more siblings, and 0 or more attributes. Nokogiri provides methods for accessing all of these things on any particular node. We can access any of those relative nodes like so:
node.parent           #=> parent node
node.children         #=> children nodes
node.next_sibling     #=> next sibling node
node.previous_sibling #=> previous sibling node
These node access methods can be used for manually traversing a tree, but I tend to leave the hard work to XPath or CSS queries and only use manual tree access when I have to.
When it comes to accessing attributes of a tag, the node may be treated like a normal Ruby Hash. We can get and set attributes on a node like so:
node['class']         #=> the value of the class attribute
node['class'] = 'foo'
We can even get a list of attributes or values of attributes like so:
node.keys   #=> list of attribute names
node.values #=> list of attribute values
For more information on things you can do with Nodes, check out the Node Documentation and also the Nokogiri tutorials section.
Conclusion
I hope this article has you on your way to HTML and XML parsing nirvana. Remember the tree data structure, and remember that XPath and CSS queries can be run against both HTML documents and XML documents.
Make sure to check out our documentation, and if you have any problems, join the mailing list!
