Removing unused CSS with Purgecss/UnCSS

#1

This seems to be a common request here, so I thought I would demonstrate two very simple methods that require very little modification to Sage.

Recently, I’ve been looking for a quicker alternative to UnCSS, since it is slow and often flaky. I thought PurifyCSS would be the answer since it has a nice webpack plugin; however, it doesn’t remove similarly named selectors, which can leave a lot behind if you use something like Tachyons. As of this writing, the main repo seems to be unmaintained as well.

Purgecss

Purgecss was originally thought of as the v2 of purifycss.

Purgecss is a tool that works well and is actively maintained. It also has a handy webpack plugin. And though it’s empty at the time of writing, this repo looks promising as well.

Using Purgecss with Sage

Add the webpack plugin (and glob-all) to your Sage project:

yarn add --dev purgecss-webpack-plugin glob-all

Then drop the plugin in the webpack optimize config:

// resources/assets/build/webpack.config.optimize.js

// ...
const glob = require('glob-all');
const PurgecssPlugin = require('purgecss-webpack-plugin');

module.exports = {
  plugins: [
    // ... 
    new PurgecssPlugin({
      // Scan these files for the selectors that are actually in use
      paths: glob.sync([
        'app/**/*.php',
        'resources/views/**/*.php',
        'resources/assets/scripts/**/*.js',
      ]),
      // Selectors to always keep (only if you need it!)
      whitelist: [
        'pr3', 'pv2', 'ph3',
        'mb1',
        'input',
        'tracked-mega',
      ],
    }),
  ],
};

Keep in mind that, since this lives in the optimize config file, it will only run with yarn build:production. You could load it in the main config instead; however, removing unused CSS is best treated like minifying your CSS, which is reserved for the production build script.

As you may have noticed above, a small drawback is the need to whitelist any selectors that aren’t found in the specified paths, which makes using this on a site with a plugin like HTML Forms a bit painful.
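
One way to soften that: the plugin also accepts whitelistPatterns and whitelistPatternsChildren, so if a plugin’s classes share a common prefix you can keep them by regex instead of listing each one. A minimal sketch (the hf- prefix is a hypothetical example; check the class names your plugin actually outputs):

// resources/assets/build/webpack.config.optimize.js (sketch)
const glob = require('glob-all');
const PurgecssPlugin = require('purgecss-webpack-plugin');

module.exports = {
  plugins: [
    new PurgecssPlugin({
      paths: glob.sync([
        'app/**/*.php',
        'resources/views/**/*.php',
        'resources/assets/scripts/**/*.js',
      ]),
      // Keep any selector matching these patterns even though it never
      // appears in the scanned paths (the hf- prefix is hypothetical)
      whitelistPatterns: [/^hf-/],
      // Also keep children of matching selectors, e.g. `.hf-form input`
      whitelistPatternsChildren: [/^hf-/],
    }),
  ],
};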

Using UnCSS

Though UnCSS is more accurate since it loads the actual pages to figure out which selectors are being used, this means every possible view must be rendered and you’ll need to manually provide a sitemap.

yarn add --dev uncss postcss-uncss

Then in the PostCSS config:

// resources/assets/build/postcss.config.js

// ...
const uncssConfig = {
  html: [
    'http://example.test',
    // Your entire sitemap added manually
    // or some other way if you’re clever (wget is handy for this).
  ]
};

// ...

module.exports = ({ file, options }) => {
  return {
    parser: options.enabled.optimize ? 'postcss-safe-parser' : undefined,
    plugins: {
      'postcss-uncss': options.enabled.optimize ? uncssConfig : false, // ← Add the plugin
      cssnano: options.enabled.optimize ? cssnanoConfig : false,
      autoprefixer: true,
    },
  };
};

#2

What about using a WordPress plugin like The SEO Framework to autogenerate a sitemap.xml? Of course, non-public and hidden areas would have to be collected in a separate sitemap.
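
For what it’s worth, pulling the URLs out of a standard sitemap.xml wouldn’t strictly require an XML-to-JSON dependency. A minimal sketch, assuming curl is available and a flat, single-file sitemap (a sitemap index would need another pass):

// Hypothetical sketch, not part of the Sage build:
const { execSync } = require('child_process');

// Fetch the sitemap and grab everything between <loc> tags
const xml = execSync('curl -s http://example.test/sitemap.xml').toString();
const urls = (xml.match(/<loc>[^<]+<\/loc>/g) || [])
  .map((loc) => loc.replace(/<\/?loc>/g, ''));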


#3

In writing this, I tried to use The SEO Framework’s sitemap.xml; however, since the goal for this post was to do it in a way that was quick and easy to set up, I didn’t want to include a ton of extra dependencies (e.g. XML-to-JSON converters, sitemap generators, etc.).

An idea I had was to add a pre-build script that uses wget to spider the entire site, ignoring robots.txt (i.e. -e robots=off), but it was getting too convoluted for this post.

I like the idea of having a tool check your actual codebase (Purgecss) vs. the generated version of your website (UnCSS), as I do want all of my template code in my theme directory (not in the database). Of course, there are pitfalls, but if you don’t need to theme any plugins, it’s gold.


#4

@knowler thanks for taking the time to put this together!

I’m intrigued by the idea of automating the process of creating the sitemap array for UnCSS using wget. I’ve done some research and would love some feedback to see if I’m on the right track!

To get a list of all the URLs on a website, I was able to use this command:

wget --spider --recursive --level=inf --no-verbose --accept html --output-file=./test.txt http://example.test/

I then filtered the output with this pipeline:

grep -i URL ./test.txt |           # keep only the log lines that mention a URL
  awk -F 'URL:' '{print $2}' |     # take everything after "URL:"
  awk '{$1=$1};1' |                # trim surrounding whitespace
  awk '{print $1}' |               # keep just the URL itself
  sort -u |                        # sort and de-duplicate
  sed '/^$/d' > ./sortedurls.txt   # drop empty lines and write the list

Which gave a list like this:

http://example.test/
http://example.test/page1/
http://example.test/page2/
...

The next step would be to create a JavaScript array from this file. I figure this can be done using Node’s built-in fs module. Something like this:

const fs = require('fs');
const text = fs.readFileSync('./sortedurls.txt', 'utf-8');
// Split on newlines, dropping the empty entry left by the trailing newline
const arr = text.split('\n').filter(Boolean);

Which would create an array like this:

[ 'http://example.test/',
  'http://example.test/page1/',
  'http://example.test/page2/',
  ... ]

However, this does rely on being able to use the Node File System (fs) module out of the box with webpack, which I haven’t tried (and I can’t say I’m an expert on Node or webpack).
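
A minimal sketch of how that could look in the PostCSS config, assuming sortedurls.txt is generated into the project root (which is where yarn runs the build from):

// resources/assets/build/postcss.config.js (sketch)
const fs = require('fs');

// Read the wget-generated URL list at build time; the relative path
// resolves against the project root, where the build is run from
const sitemap = fs
  .readFileSync('./sortedurls.txt', 'utf-8')
  .split('\n')
  .filter(Boolean); // drop empty lines

const uncssConfig = { html: sitemap };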

You would also need to be able to execute a bash script to run wget, grep, and awk, but it looks like that isn’t too hard to do with this plugin. Depending on the size of the site, though, this whole thing could take a long time!
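
If that plugin is something like webpack-shell-plugin (a hypothetical choice; the linked plugin may differ), the wiring could look roughly like this:

// resources/assets/build/webpack.config.optimize.js (sketch)
const WebpackShellPlugin = require('webpack-shell-plugin');

module.exports = {
  plugins: [
    // Run the wget spider + grep/awk pipeline before the build starts
    new WebpackShellPlugin({
      onBuildStart: ['bash ./generate-sitemap.sh'], // hypothetical script name
    }),
  ],
};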

It’s very possible I’ve completely overcomplicated this though. Would love feedback, especially if there is an easier way to go about it!


#5

This looks awesome. I’ll have to take a look at it later, but first, one word of warning regarding crawling with wget: you need to whitelist your IP so you don’t get blocked (edit: on remote servers, of course).


#6

That’s a good point!

I also saw someone suggest adding --wait=1 to the wget command so the crawl doesn’t degrade your site’s performance while generating the sitemap.


#7

@knowler thanks for sharing these methods!

I use your “UnCSS as a PostCSS plugin” method. For the sitemap I use the WordPress JSON Sitemap Generator plugin I made. It generates a JSON sitemap on non-production environments and is available via http://yourdomain.com?json_sitemap. (See the sketch after the lists below for wiring it into the PostCSS config.)

The JSON sitemap includes:

  • Single posts
  • Pages
  • Single CPTs (custom post types)
  • Author archives
  • Term archives
  • Custom taxonomy term archives
  • Monthly archives

Plus, these special pages:

  • Empty search results page
  • Search results page with no results
  • Search results page with pagination
  • 404 page
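
Here’s a minimal sketch of consuming the endpoint in the PostCSS config, assuming curl is available and the endpoint returns a plain JSON array of URLs (check the plugin’s actual response shape):

// resources/assets/build/postcss.config.js (sketch)
const { execSync } = require('child_process');

// Fetch the JSON sitemap at build time (URL and response shape assumed)
const sitemap = JSON.parse(
  execSync('curl -s "http://example.test/?json_sitemap"').toString()
);

const uncssConfig = { html: sitemap };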

I hope my plugin comes in handy for others as well. 😀


#8

Wow, this is awesome!


#9

@knowler @Henk thank you for this, works beautifully!


#10

Hey all - would this be a viable option if you have 3,000+ articles and growing?


#11

Using PurgeCSS is always a safe option if you have the paths and whitelist set correctly since it just scans for classes that are being used in your source PHP and JS. There is no need for a sitemap.

Using UnCSS is the less safe option (but the most accurate when it works). UnCSS uses a headless browser (it used to be PhantomJS, but is jsdom as of uncss 0.15.0) to process the actual generated HTML (that’s why we need a sitemap) and work out which styles are being used.

The reason I call this option “less safe” is that it will only process what you feed it, which means you have to give it an accurate sitemap, and you need parity between your development and production databases for that. Also, the larger your sitemap, the longer UnCSS will take to process it, which is part of why I called it “slow and flaky” in the OP. In the past I have found UnCSS to break for no apparent reason during a build. If you choose UnCSS, you will need to be more strategic in your approach, especially if the site in question has “3,000+ articles and growing.”

Personally, I have completely switched from UnCSS to PurgeCSS for removing unused CSS. I find it a lot more trustworthy and I think it’s very important to be able to run yarn build:production twice in a row and expect the same results (if my code hasn’t changed).


#12

If you choose UnCSS, you will need to be more strategic in your approach, especially if the site in question has “3000+ articles and growing.”

For instance, you could limit the sitemap to a number of representative posts. In my experience this is doable and can be accurate.
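
For example, one representative URL per template type is often enough (all URLs below are hypothetical placeholders):

// resources/assets/build/postcss.config.js (sketch)
const uncssConfig = {
  html: [
    'http://example.test/',                       // front page
    'http://example.test/sample-page/',           // one representative page
    'http://example.test/2018/05/sample-post/',   // one representative post
    'http://example.test/category/news/',         // one archive
    'http://example.test/definitely-not-a-page/', // the 404 template
  ],
};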