Episode #131: Rake Rules

Episode Script

In the last two episodes we constructed a Rakefile for building Markdown files into HTML. The file works as it stands, but it contains some duplication. There are two nearly identical rules for building HTML files: one that looks for source files with a .md extension, and one that looks for source files with a .markdown extension. It would be nice if we could combine them into a single, more generic rule.

source_files = Rake::FileList.new("**/*.md", "**/*.markdown") do |fl|
  fl.exclude("~*")
  fl.exclude(/^scratch\//)
  fl.exclude do |f|
    `git ls-files #{f}`.empty?
  end
end

task :default => :html
task :html => source_files.ext(".html")

rule ".html" => ".md" do |t|
  sh "pandoc -o #{t.name} #{t.source}"
end

rule ".html" => ".markdown" do |t|
  sh "pandoc -o #{t.name} #{t.source}"
end

We'll start by removing the second rule. If we run rake now, it fails:

source_files = Rake::FileList.new("**/*.md", "**/*.markdown") do |fl|
  fl.exclude("~*")
  fl.exclude(/^scratch\//)
  fl.exclude do |f|
    `git ls-files #{f}`.empty?
  end
end

task :default => :html
task :html => source_files.ext(".html")

rule ".html" => ".md" do |t|
  sh "pandoc -o #{t.name} #{t.source}"
end

$ rake
rake aborted!
Don't know how to build task 'ch4.html'

Tasks: TOP => default => html
(See full trace by running task with --trace)

Before we go any further, let's talk about this error message a little bit. It says: “Don't know how to build task ‘ch4.html’”. This doesn't tell us a lot. It's also a little bit confusing, because it's talking about a task called ch4.html. But ch4.html is a file we want to build, not a task, right?

As it turns out, Rake thinks about all of the things it is asked to build as tasks. The only difference between plain tasks and file tasks is that with a file task, Rake knows that if there is a file matching the name of the task, and that file is newer than all of its prerequisites, it needn't bother running the task at all.
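
The difference can be seen by asking tasks directly whether they need to run. What follows is a minimal sketch rather than part of this episode's Rakefile. It assumes the rake gem is installed, and the temporary ch1.md and ch1.html files, the needed_report method, and the :example_plain_task name are all made up for illustration.

```ruby
require "rake"
require "tmpdir"

# Returns three booleans: whether the file task is needed before its target
# exists, whether it is needed once an up-to-date target exists, and whether
# a plain task is needed.
def needed_report
  Dir.mktmpdir do |dir|
    source = File.join(dir, "ch1.md")
    target = File.join(dir, "ch1.html")
    File.write(source, "# Chapter 1\n")

    # Backdate the source so the target written below is clearly newer.
    an_hour_ago = Time.now - 3600
    File.utime(an_hour_ago, an_hour_ago, source)

    file_task  = Rake::FileTask.define_task(target => source)
    plain_task = Rake::Task.define_task(:example_plain_task)

    before = file_task.needed?   # the target file does not exist yet
    File.write(target, "<h1>Chapter 1</h1>\n")
    after = file_task.needed?    # the target exists and is newer than the source

    [before, after, plain_task.needed?]
  end
end

p needed_report   # [true, false, true]
```

A plain task always reports that it is needed, while the file task stops asking to be run once its target is newer than its prerequisite.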

In this case we know why Rake can't build this file, because we just removed the rule telling it how. But what if we didn't know? This message doesn't give us a whole lot to work with.

To get a better understanding of what Rake is up to, we can pass the --trace flag to it. This time, Rake leaves a breadcrumb trail to follow, telling us what it tried to do. First, it invoked the default task. We've made the default task dependent on the “html” task, so it invoked that one next.

The next step after that is an abrupt notification that rake has aborted because it didn't know how to build ch4.html, followed by a Ruby stack trace.

$ rake --trace
** Invoke default (first_time)
** Invoke html (first_time)
rake aborted!
Don't know how to build task 'ch4.html'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task_manager.rb:49:in `[]'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:53:in `lookup_prerequisite'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:49:in `block in prerequisite_tasks'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:49:in `map'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:49:in `prerequisite_tasks'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:195:in `invoke_prerequisites'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:174:in `block in invoke_with_call_chain'
/usr/lib/ruby/1.9.1/monitor.rb:211:in `mon_synchronize'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:168:in `invoke_with_call_chain'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:197:in `block in invoke_prerequisites'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:195:in `each'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:195:in `invoke_prerequisites'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:174:in `block in invoke_with_call_chain'
/usr/lib/ruby/1.9.1/monitor.rb:211:in `mon_synchronize'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:168:in `invoke_with_call_chain'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:161:in `invoke'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/application.rb:149:in `invoke_task'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/application.rb:106:in `block (2 levels) in top_level'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/application.rb:106:in `each'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/application.rb:106:in `block in top_level'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/application.rb:115:in `run_with_threads'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/application.rb:100:in `top_level'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/application.rb:78:in `block in run'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/application.rb:165:in `standard_exception_handling'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/application.rb:75:in `run'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/bin/rake:33:in `<top (required)>'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/bin/rake:23:in `load'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/bin/rake:23:in `<main>'
Tasks: TOP => default => html

 

Let's ask Rake why it was trying to build ch4.html in the first place. We can do this by running Rake with the -P flag, which tells it to dump a list of prerequisites.

$ rake -P
rake default
    html
rake html
    ch1.html
    ch2.html
    ch3.html
    subdir/appendix.html
    ch4.html

This output makes it clear that the html task depends on a list of files, including ch4.html.

Remember, we're still pretending we don't know what the problem is with this Rakefile. We've gathered a lot of insight into Rake's thinking, but so far we're none the wiser about the connection between HTML files and Markdown files, something we would need to understand in order to fix this problem.

In order to peer even deeper into Rake's thought process, we next set an option called Rake.application.options.trace_rules to true in our Rakefile. Enabling this option does exactly what its name suggests: it tells Rake to give us tracing information about rules defined in the Rakefile.

Rake.application.options.trace_rules = true

source_files = Rake::FileList.new("**/*.md", "**/*.markdown") do |fl|
  fl.exclude("~*")
  fl.exclude(/^scratch\//)
  fl.exclude do |f|
    `git ls-files #{f}`.empty?
  end
end

task :default => :html
task :html => source_files.ext(".html")

rule ".html" => ".md" do |t|
  sh "pandoc -o #{t.name} #{t.source}"
end

We run rake --trace again, and this time, in addition to the breadcrumb trail of task invocations, we see some new information. For each file it is asked to build, Rake tells us that it is attempting to use a rule in which a .html file depends on a corresponding .md file. When it gets to ch4.html, it fails. It doesn't explicitly say that the prerequisite file ch4.md couldn't be found, but the FAIL marker on that line, together with the rest of the information now in front of us, lets us reasonably deduce what the problem is.

$ rake --trace
** Invoke default (first_time)
** Invoke html (first_time)
Attempting Rule ch1.html => ch1.md
(ch1.html => ch1.md ... EXIST)
Attempting Rule ch2.html => ch2.md
(ch2.html => ch2.md ... EXIST)
Attempting Rule ch3.html => ch3.md
(ch3.html => ch3.md ... EXIST)
Attempting Rule subdir/appendix.html => subdir/appendix.md
(subdir/appendix.html => subdir/appendix.md ... EXIST)
Attempting Rule ch4.html => ch4.md
(ch4.html => ch4.md ... FAIL)
rake aborted!
Don't know how to build task 'ch4.html'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task_manager.rb:49:in `[]'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:53:in `lookup_prerequisite'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:49:in `block in prerequisite_tasks'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:49:in `map'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:49:in `prerequisite_tasks'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:195:in `invoke_prerequisites'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:174:in `block in invoke_with_call_chain'
/usr/lib/ruby/1.9.1/monitor.rb:211:in `mon_synchronize'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:168:in `invoke_with_call_chain'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:197:in `block in invoke_prerequisites'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:195:in `each'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:195:in `invoke_prerequisites'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:174:in `block in invoke_with_call_chain'
/usr/lib/ruby/1.9.1/monitor.rb:211:in `mon_synchronize'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:168:in `invoke_with_call_chain'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/task.rb:161:in `invoke'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/application.rb:149:in `invoke_task'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/application.rb:106:in `block (2 levels) in top_level'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/application.rb:106:in `each'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/application.rb:106:in `block in top_level'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/application.rb:115:in `run_with_threads'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/application.rb:100:in `top_level'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/application.rb:78:in `block in run'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/application.rb:165:in `standard_exception_handling'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/lib/rake/application.rb:75:in `run'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/gems/rake-10.1.0/bin/rake:33:in `<top (required)>'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/bin/rake:23:in `load'
/home/avdi/.rvm/gems/ruby-1.9.3-p327/bin/rake:23:in `<main>'
Tasks: TOP => default => html

Now it's time to make the rule work again. We start by defining a method, source_for_html. It will take the name of an HTML file and return the name of the corresponding Markdown file. In order to do so, it needs access to the source files list. Right now the list is a local variable, which won't be accessible inside this method. We change it to a constant.

We then search the source files list for the first source file whose base name (its path with the extension removed) matches the base name of the given HTML file name. In order to compare just the base names, we use the #ext method again. You might remember that we used this method on the source file list in order to derive a list of HTML output file names. This time we pass an empty string to #ext in order to remove the file extension entirely.

“Wait just a darn minute!” I can hear you saying. “We sent the #ext message to a FileList before. But here we're sending it to individual file name strings! How does that work?” As it turns out, Rake modifies Ruby's String class to support some of the same methods that FileList supports, so that we can do the same operations on FileLists and individual file names interchangeably.

Rake.application.options.trace_rules = true

SOURCE_FILES = Rake::FileList.new("**/*.md", "**/*.markdown") do |fl|
  fl.exclude("~*")
  fl.exclude(/^scratch\//)
  fl.exclude do |f|
    `git ls-files #{f}`.empty?
  end
end

task :default => :html
task :html => SOURCE_FILES.ext(".html")

rule ".html" => ->(f){source_for_html(f)} do |t|
  sh "pandoc -o #{t.name} #{t.source}"
end

def source_for_html(html_file)
  SOURCE_FILES.detect{|f| f.ext('') == html_file.ext('')}
end
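
To see #ext behaving the same way on strings and on a FileList, here is a short sketch that can be run outside of any Rakefile, assuming the rake gem is installed. The file names are literals borrowed from this episode's examples, so nothing needs to exist on disk.

```ruby
require "rake"

# String#ext, added by Rake, replaces a file name's extension.
puts "ch4.markdown".ext(".html")     # ch4.html

# Passing an empty string strips the extension entirely.
puts "ch4.markdown".ext("")          # ch4
puts "subdir/appendix.md".ext("")    # subdir/appendix

# FileList#ext applies the same operation to every entry in the list.
outputs = Rake::FileList["ch1.md", "ch4.markdown"].ext(".html")
puts outputs.to_a.inspect            # ["ch1.html", "ch4.html"]
```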

Now we have a method which, given an HTML file name, can search back to find the source Markdown file needed to generate it. Next we need to make the .html rule use this method. We do this by replacing the .md dependency in the rule with a lambda. Inside the lambda, we take the sole argument and pass it into our #source_for_html method.

When Rake tries to build a .html file, it will now pass the name of the target file to the lambda we provided as a prerequisite. It will take the return value of this lambda and see if it matches an existing file. If so, it considers the rule a match and proceeds to execute the associated code.

We still have rule tracing enabled, so we get a window into how Rake reasons using our updated rule. When it comes to the ch4.html target, it correctly determines that the prerequisite is ch4.markdown, not ch4.md. Finding that file, it goes ahead and builds the ch4.html file.

$ rake
Attempting Rule ch1.html => ch1.md
(ch1.html => ch1.md ... EXIST)
Attempting Rule ch2.html => ch2.md
(ch2.html => ch2.md ... EXIST)
Attempting Rule ch3.html => ch3.md
(ch3.html => ch3.md ... EXIST)
Attempting Rule subdir/appendix.html => subdir/appendix.md
(subdir/appendix.html => subdir/appendix.md ... EXIST)
Attempting Rule ch4.html => ch4.markdown
(ch4.html => ch4.markdown ... EXIST)
pandoc -o ch4.html ch4.markdown
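
The lookup that the rule performs can also be tried in isolation. This is a rough sketch that assumes the rake gem is installed. The names example_sources, example_source_for, and example_prerequisite are hypothetical stand-ins for SOURCE_FILES, source_for_html, and the lambda in the rule, and the list is passed in as an argument so that the example does not depend on a constant or on any files on disk.

```ruby
require "rake"

# A literal stand-in for SOURCE_FILES; nothing is read from disk.
def example_sources
  Rake::FileList["ch1.md", "subdir/appendix.md", "ch4.markdown"]
end

# The same lookup as source_for_html, with the list passed in explicitly.
def example_source_for(html_file, sources)
  sources.detect { |f| f.ext("") == html_file.ext("") }
end

# Stands in for the lambda given to the rule. Rake would call it with the
# name of the target it is trying to build.
def example_prerequisite
  ->(target) { example_source_for(target, example_sources) }
end

puts example_prerequisite.call("ch4.html")              # ch4.markdown
puts example_prerequisite.call("subdir/appendix.html")  # subdir/appendix.md
p example_prerequisite.call("missing.html")             # nil
```

The last call shows that the lookup returns nil when no source file shares the target's base name.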

We now have a generic rule for building HTML files from Markdown files with either long or short extensions. But more importantly, we have a lot more insight into how Rake works with rules to determine what to build, and how. Happy hacking!