Episode #134: Rake Clean

Episode Script

In the last episode we defined a Rake task to clean up the products of our build script. It did this through the simple expedient of recursively removing the “outputs” directory.

task :clean do
  rm_rf "outputs"
end

Sometimes cleanup isn't that simple. Today, we're going to start with a somewhat modified version of the Rakefile we've been developing.

As before, we're converting Markdown files to HTML. Unlike in the immediately preceding episode, the HTML files will be generated next to their source files—there are no separate source and output directories.

In addition to the rule for building Markdown files to HTML, we've added some new rules. We now have a rule for concatenating all the HTML fragment files into a single book.html. Then there's a rule to convert the book.html into an EPUB format ebook using the ebook-convert command from the Calibre ebook package. Finally, there's a rule to take the EPUB file and convert it into a Kindle-compatible .mobi file using Amazon's KindleGen.

As a last tweak, we've updated the :default rule to depend on these .epub and .mobi targets.

SOURCE_FILES = Rake::FileList.new("**/*.md", "**/*.markdown") do |fl|
  fl.exclude("~*")
  fl.exclude(/^scratch\//)
  fl.exclude do |f|
    `git ls-files #{f}`.empty?
  end
end

task :default => ["book.epub", "book.mobi"]
task :html => SOURCE_FILES.ext(".html")

rule ".html" => ->(f){source_for_html(f)} do |t|
  sh "pandoc -o #{t.name} #{t.source}"
end

file "book.html" => SOURCE_FILES.ext(".html") do |t|
  chapters   = FileList["**/ch*.html"]
  backmatter = FileList["backmatter/*.html"]
  sh "cat #{chapters} #{backmatter} > #{t.name}"
end

file "book.epub" => "book.html" do |t|
  sh "ebook-convert book.html #{t.name}"
end

file "book.mobi" => "book.epub" do |t|
  sh "kindlegen book.epub -o #{t.name}"
end

def source_for_html(html_file)
  SOURCE_FILES.detect{|f| f.ext('') == html_file.ext('')}
end

This build script produces two different categories of file:

  • It generates intermediate files. All of the HTML files fall into this category. Another name for these files is temporary files, since they aren't needed once the full build process has finished.
  • It also generates deliverable ebook files. These files are the end goal of the whole process.

When it comes to automatically cleaning up our project directory, we'd like to treat these two categories of files differently. At times we may want to clean up just the intermediate files, leaving the ebook products intact. At other times we may want to blow away every generated file and start with a clean slate.

We could write our own tasks to handle these two types of cleanup. Or, we could make use of Rake's optional rake/clean library.
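
To see what the library saves us, here is a rough sketch of what the hand-written version might look like. The task names :tidy and :scrub and the two list constants are hypothetical, and the sketch assumes that every .html file in the project is generated output.

```ruby
require 'rake'

# Hypothetical lists; these names are illustrative only.
# Assumption: every .html file under the project is build output.
INTERMEDIATE_FILES = Rake::FileList["**/*.html"]
PRODUCT_FILES      = Rake::FileList["book.epub", "book.mobi"]

# Remove only the intermediate HTML files.
task :tidy do
  rm_f INTERMEDIATE_FILES
end

# Remove the intermediates first (via the prerequisite), then the ebooks.
task :scrub => :tidy do
  rm_f PRODUCT_FILES
end
```

With this in place, rake tidy would leave the ebooks alone, while rake scrub would remove everything generated. It works, but it is the same pair of tasks we would end up rewriting in every project.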

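To use rake/clean, we first require it. Once we do this, a new global constant called CLEAN is available to us. This constant is a FileList. Rake seeds it with a few patterns for editor backup and core files, but none of them match anything in our project, so for us it starts out empty.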

require 'rake/clean'

CLEAN                           # => []
CLEAN.class                     # => Rake::FileList

We can use the CLEAN list to tell Rake which files are intermediate files. First, let's add the list of HTML files generated from Markdown files:

CLEAN.include(SOURCE_FILES.ext(".html"))
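
If the ext call is unfamiliar, this small sketch shows what it produces. It uses a hand-written stand-in list (the sources variable is hypothetical) instead of the real SOURCE_FILES, so it runs without any files on disk.

```ruby
require 'rake'

# Stand-in for SOURCE_FILES; plain names (no glob characters) are added as-is.
sources = Rake::FileList["ch1.md", "ch4.markdown", "backmatter/appendix.md"]

# ext returns a new FileList with each file's extension swapped for the given one.
html_files = sources.ext(".html")
html_files  # => ["ch1.html", "ch4.html", "backmatter/appendix.html"]
```

Because the result is itself a FileList, it can be handed straight to CLEAN.include.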

Then we'll add the concatenated book.html to the list.

file "book.html" => SOURCE_FILES.ext(".html") do |t|
  chapters   = FileList["**/ch*.html"]
  backmatter = FileList["backmatter/*.html"]
  sh "cat #{chapters} #{backmatter} > #{t.name}"
end
CLEAN.include("book.html")

Next, we'll add files to another list called CLOBBER. This list tells Rake which files are considered final products. To the CLOBBER list, we add the .epub and .mobi ebook files.

file "book.epub" => "book.html" do |t|
  sh "ebook-convert book.html #{t.name}"
end
CLOBBER << "book.epub"

file "book.mobi" => "book.epub" do |t|
  sh "kindlegen book.epub -o #{t.name}"
end
CLOBBER << "book.mobi"

We could have added these files to the CLEAN and CLOBBER lists all in one place in the Rakefile. But by making each addition next to the rule for building that file, we make it more likely that if we ever remove or change those rules, we'll remember to update the associated entry in the CLEAN or CLOBBER list.
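
Both lists are FileLists, so they also accept glob patterns and exclusions rather than just explicit names. The following is a sketch of that alternative, not what our Rakefile does. It assumes every .html file is generated, apart from a hypothetical hand-written static/about.html that we want to keep.

```ruby
require 'rake/clean'

# Assumption: every .html file in the project is build output...
CLEAN.include("**/*.html")
# ...except this hypothetical hand-written page, which we protect.
CLEAN.exclude("static/about.html")

# Glob patterns only match files that exist when the list is resolved.
CLOBBER.include("book.{epub,mobi}")
```

The trade-off is that a broad pattern can sweep up files we didn't mean to delete, which is one reason to prefer explicit entries next to each rule.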

When we go to the command line and tell Rake to list the available tasks with the rake -T command, we can see that there are two tasks available that we didn't define: clean, and clobber.

We run rake without any arguments first, to build our ebook files. When we list the files in the project, we can see various .html intermediate files as well as the final product .epub and .mobi files.

$ rake
$ tree
.
├── backmatter
│   ├── appendix.html
│   └── appendix.md
├── book.epub
├── book.html
├── book.mobi
├── ch1.html
├── ~ch1.md
├── ch1.md
├── ch2.html
├── ch2.md
├── ch3.html
├── ch3.md
├── ch4.html
├── ch4.markdown
├── Rakefile
├── scratch
│   └── test.md
└── temp.md

If we then run rake clean, we don't see any output. But when we list the project contents again, we can see that all the .html files have disappeared:

$ rake clean
$ tree
.
├── backmatter
│   └── appendix.md
├── book.epub
├── book.mobi
├── ~ch1.md
├── ch1.md
├── ch2.md
├── ch3.md
├── ch4.markdown
├── Rakefile
├── scratch
│   └── test.md
└── temp.md

If we then run rake clobber, we see a bunch of warnings about files that could not be found. That's because clobber first executes the clean task, which we already ran. That task is trying to remove a bunch of files which are already gone. Don't worry though; these warnings are harmless and can be safely ignored.
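
We can confirm the relationship between the two tasks from Ruby. This sketch only inspects the tasks that rake/clean defines; it doesn't delete anything.

```ruby
require 'rake/clean'

# Both tasks exist as soon as the library is required.
Rake::Task.task_defined?("clean")    # => true
Rake::Task.task_defined?("clobber")  # => true

# clobber lists clean as a prerequisite, so clean always runs first.
Rake::Task["clobber"].prerequisites  # => ["clean"]
```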

When we look at the project contents after running clobber, we can see that the ebook files have vanished along with the intermediate files.

And that's all there is to it—with rake/clean, we should never again need to write our own project cleanup tasks to remove build files. We can just add the appropriate files or file patterns to the CLEAN and CLOBBER lists, and let Rake do the rest. Happy hacking!
