Removing unused CSS with Purgecss/UnCSS

@knowler thanks for taking the time to put this together!

I’m intrigued by the idea of automating the process of creating the sitemap array for UnCSS using wget. I’ve done some research and would love some feedback to see if I’m on the right track!

To get a list of all the URLs on a website, I was able to use this command:

wget --spider --recursive --level=inf --no-verbose --accept html --output-file=./test.txt http://example.test/

I then edited the file using this script:

grep -i URL ./test.txt | awk -F 'URL:' '{print $2}' | awk '{$1=$1};1' | awk '{print $1}' | sort -u | sed '/^$/d' > ./sortedurls.txt

Which gave a list like this:

http://example.test/
http://example.test/page1/
http://example.test/page2/
...
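
For anyone who would rather stay in Node, here is a rough sketch of the same extraction step written in JavaScript. It assumes each relevant log line contains "URL:" followed by the address, which is the same thing the grep/awk pipeline relies on. The function name and the sample log lines are made up for illustration.

```javascript
// Extract unique, sorted URLs from the text of a wget log file.
// Assumption: every relevant line contains "URL:" followed by the address.
function extractUrls(logText) {
  const urls = new Set();
  for (const line of logText.split(/\r?\n/)) {
    const match = line.match(/URL:\s*(\S+)/i);
    if (match) {
      urls.add(match[1]);
    }
  }
  // A Set removes duplicates; sorting mirrors `sort -u`.
  return Array.from(urls).sort();
}

// Hypothetical log excerpt, used only to exercise the function.
const sampleLog = [
  "2018-05-01 10:00:00 URL: http://example.test/page2/ 200 OK",
  "2018-05-01 10:00:01 URL: http://example.test/ 200 OK",
  "2018-05-01 10:00:02 URL: http://example.test/page2/ 200 OK",
  "a line without any address",
].join("\n");

console.log(extractUrls(sampleLog));
```

In a real build, the log text would come from reading the wget output file rather than from a hard-coded sample.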

The next step would be to create a JavaScript array from this file. I figure this can be done with Node’s fs module. Something like this:

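const fs = require('fs');
// Read the sorted URL list produced by the previous step.
const text = fs.readFileSync('./sortedurls.txt', 'utf-8');
// One URL per line; drop the empty entry left by the trailing newline.
const arr = text.split('\n').filter(Boolean);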

Which would create an array like this:

[ 'http://example.test/',
  'http://example.test/page1/',
  'http://example.test/page2/',
  ... ]

However, this does rely on the ability to use the Node File System module out of the box with Webpack, which I haven’t tried (and I can’t say I’m an expert on Node or Webpack).

You would also need to be able to execute a bash script to use wget, grep, and awk. But it looks like that isn’t too hard to do with this plugin. Depending on the size of a site though, this whole thing could take a long time!

It’s very possible I’ve completely overcomplicated this though. Would love feedback, especially if there is an easier way to go about it!

This looks awesome. I’ll have to take a look at it later. First, one word of warning about crawling with wget: you need to whitelist your IP so you don’t get blocked (edit: for remote servers, of course).

That’s a good point!

I also saw someone suggest adding --wait=1 to the wget command in order to not decrease your site’s performance while generating the sitemap.
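
If the crawl ends up being triggered from Node, one way to keep the flags in one place is a small helper like the sketch below. The helper name and its parameters are hypothetical. The flags themselves are the ones already used in this thread, plus --wait.

```javascript
// Build the argument list for the wget crawl discussed above.
// `waitSeconds` maps to wget's --wait flag; 0 leaves the flag out.
function buildWgetArgs(siteUrl, logFile, waitSeconds) {
  const args = [
    "--spider",
    "--recursive",
    "--level=inf",
    "--no-verbose",
    "--accept=html",
    `--output-file=${logFile}`,
  ];
  if (waitSeconds > 0) {
    args.push(`--wait=${waitSeconds}`);
  }
  // The site to crawl goes last.
  args.push(siteUrl);
  return args;
}

const args = buildWgetArgs("http://example.test/", "./test.txt", 1);
console.log(["wget", ...args].join(" "));
```

The resulting array could be handed to something like Node's child_process.execFile, although I have not tried that inside a Webpack build.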

@knowler thanks for sharing these methods!

I use your “UnCSS as a PostCSS plugin” method. For the sitemap I use the WordPress JSON Sitemap Generator plugin I made. It generates a JSON sitemap on non-production environments and is available via: http://yourdomain.com?json_sitemap.

The JSON sitemap includes:

  • Single posts
  • Pages
  • Single CPTs
  • Author archives
  • Term archives
  • CT term archives
  • Monthly archives

Plus, these special pages:

  • Empty search results page
  • Search results page with no results
  • Search results page with pagination
  • 404 page

I hope my plugin comes in handy for others as well. :grinning:
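
As a rough idea of how a JSON sitemap could be fed to UnCSS, here is a minimal sketch. It assumes the sitemap is a flat JSON array of URL strings (the actual shape of the plugin's output may differ), and the function name is made up. The `html` key is the UnCSS option that takes the list of pages to process.

```javascript
// Turn the text of a JSON sitemap into an options object for UnCSS.
// Assumption: the sitemap is a flat JSON array of URL strings.
function sitemapToUncssOptions(jsonText) {
  const parsed = JSON.parse(jsonText);
  if (!Array.isArray(parsed)) {
    throw new TypeError("Expected the sitemap to be a JSON array of URLs");
  }
  // Keep non-empty strings only and drop duplicates.
  const urls = parsed.filter((u) => typeof u === "string" && u.trim() !== "");
  return { html: Array.from(new Set(urls)) };
}

// Hypothetical sitemap contents, used only to exercise the function.
const sampleSitemap =
  '["http://example.test/", "http://example.test/page1/", "http://example.test/"]';

console.log(sitemapToUncssOptions(sampleSitemap));
```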

Wow, this is awesome!

@knowler @Henk thank you for this, works beautifully!

Hey all - would this be a viable option if you have 3,000+ articles and growing?

Using PurgeCSS is always a safe option if you have the paths and whitelist set correctly since it just scans for classes that are being used in your source PHP and JS. There is no need for a sitemap.

Using UnCSS is the less safe option (but the most accurate when it works). UnCSS uses a headless browser (it used to be PhantomJS, but since UnCSS 0.15.0 it is jsdom) to process the actual generated HTML (that’s why we need a sitemap) and find out which styles are being used. The reason I call this option “less safe” is that it will only process what you feed it, which means you have to give it an accurate sitemap. You need parity between your development and production databases for this. Also, the larger your sitemap, the longer it will take UnCSS to process it, which is why I call it “slow and flaky” in the OP. In the past, I have found UnCSS to break for no apparent reason during a build. If you choose UnCSS, you will need to be more strategic in your approach, especially if the site in question has “3000+ articles and growing.”

Personally, I have completely switched from UnCSS to PurgeCSS for removing unused CSS. I find it a lot more trustworthy and I think it’s very important to be able to run yarn build:production twice in a row and expect the same results (if my code hasn’t changed).

If you choose UnCSS, you will need to be more strategic in your approach, especially if the site in question has “3000+ articles and growing.”

For instance, you could limit the sitemap to a number of representative posts. In my experience this is doable and can be accurate.
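
One possible way to pick representative posts is sketched below. It keeps only a few URLs per first path segment, on the assumption that URLs sharing a first segment (for example /blog/...) are rendered by the same template. That assumption will not hold for every site, and the function name is made up.

```javascript
// Keep at most `perGroup` URLs for each first path segment.
// Assumption: URLs that share a first path segment share a template.
function sampleSitemap(urls, perGroup) {
  const counts = new Map();
  const kept = [];
  for (const url of urls) {
    const segments = new URL(url).pathname.split("/").filter(Boolean);
    // The home page has no segments, so it gets its own group.
    const group = segments.length > 0 ? segments[0] : "/";
    const seen = counts.get(group) || 0;
    if (seen < perGroup) {
      kept.push(url);
      counts.set(group, seen + 1);
    }
  }
  return kept;
}

// Hypothetical URLs, used only to exercise the function.
const allUrls = [
  "http://example.test/",
  "http://example.test/blog/post-1/",
  "http://example.test/blog/post-2/",
  "http://example.test/blog/post-3/",
  "http://example.test/about/",
];

console.log(sampleSitemap(allUrls, 2));
```

With 3,000+ articles this would reduce the article URLs to a handful while still covering the other page types.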

If anyone else is getting this error with Webpack 3 and PurgeCSS:
TypeError: Cannot read property 'compilation' of undefined

Then you need to use a version of the PurgeCSS plugin that is compatible with Webpack 3:
yarn add purgecss-webpack-plugin@0.23.0 -D

Another tip:
cssnanoConfig inside postcss.config.js is set to remove all comments, which prevents whitelisting directly within SCSS files. So just set it to use the default preset:

const cssnanoConfig = { preset: ['default'] };

Now just mark comments as important /*! purgecss ignore */ and cssnano will leave them be.

Good to know! I reflected this in the original post.

Thanks a lot for this useful information @knowler. I have a question regarding this:

I’m not sure I understand correctly what you mean by “not in the specified paths”. Do you mean that only the files in the specified paths get compared with the CSS, and everything else, e.g. HTML generated dynamically by WordPress plugins, does not?

That would then include all plugins that do not use custom template classes, correct? Also WooCommerce (if the default templates are not overwritten), etc.

Yes. Purgecss looks for class names used in the files you specify with the paths option and removes anything that is unused from the stylesheet(s) that it is building (any stylesheet that the Webpack process is building). Purgecss does not remove styles from other stylesheets that are enqueued by plugins.

If you have custom styles defined that are for a plugin that you have not created a custom template for, then you will need to whitelist them. If you do have the custom templates included then you do not need to whitelist the classes since they would be a part of the files that Purgecss analyzes.

Example: Gravity Forms

@mmirus shared this helpful snippet with me for using Purgecss with Gravity Forms:

    new PurgecssPlugin({
       paths: glob.sync([
         "app/**/*.php",
         "resources/views/**/*.php",
         "resources/assets/scripts/**/*.js",
       ]),
       whitelist: [],
+      whitelistPatternsChildren: [/^gfield/, /^gform/, /^ginput/],
    }),
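
To get a feel for which class names those three patterns would cover, here is a simplified check. It only tests class names against the regular expressions and is not how Purgecss itself walks selectors. The helper name and the sample class names are just illustrative.

```javascript
// The patterns from the Gravity Forms snippet in this thread.
const patterns = [/^gfield/, /^gform/, /^ginput/];

// Return true if a class name matches at least one whitelist pattern.
// Simplified illustration only; Purgecss applies the patterns to
// selectors and their children, which this sketch does not model.
function matchesWhitelist(className) {
  return patterns.some((pattern) => pattern.test(className));
}

const classNames = [
  "gform_wrapper",
  "gfield_label",
  "ginput_container",
  "btn-primary",
];

console.log(classNames.filter(matchesWhitelist));
```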

Potentially Bad Idea: Add Plugin Template Paths

Technically, if you knew where the templates were located in a plugin, you could add that path to the paths option. However, I would be hesitant to do it this way since I don’t have control over what those templates look like or what the path is. An update to the plugin could break the next build of your stylesheet.

Just made a pull request (#2078) that will fix this issue when (and if) merged.

You will still need to leave the comment as important, but other tools like Autoprefixer recognize that this is necessary when using Sass or Less. It works without the important comment in normal (Post)CSS.

Here’s a :fire: tip for whitelisting the styles from an entire CSS or SCSS file easily.

In this case, we’re going to whitelist the selectors in the TailwindCSS Preflight file (which basically contains normalize.css and a few additional resets). To whitelist something else, just insert your path or paths.

We’ll use this package: https://github.com/qodesmith/purgecss-whitelister. Refer to the GitHub page for additional docs.

Install purgecss-whitelister:

yarn add --dev purgecss-whitelister

Edit webpack.config.optimize.js:

You will need to import the whitelister. Add this with the other imports at the top of the file:

const whitelister = require("purgecss-whitelister");

And then edit the PurgeCSS section of your config:

new PurgecssPlugin({
  paths: glob.sync([
    "app/**/*.php",
    "resources/views/**/*.php",
    "resources/assets/scripts/**/*.js",
  ]),
  whitelist: [
    "my-whitelisted-selector",
    ...whitelister("node_modules/tailwindcss/css/preflight.css"),
  ],
  whitelistPatternsChildren: [/^gfield/, /^gform/, /^ginput/],
  extractors: [
    {
      extractor: TailwindExtractor,
      extensions: ["js", "php"],
    },
  ],
}),

This is the important bit:

  whitelist: [
    "my-whitelisted-selector",
    ...whitelister("node_modules/tailwindcss/css/preflight.css"),
  ],

It shows how to whitelist both an arbitrary selector of your choice (like you normally would with PurgeCSS) and the results of the whitelister plugin.

whitelister('mypath.css') uses the whitelister package to get the selectors from the file or files you specify.

The ... before whitelister() is the JavaScript spread operator, which takes the items in the array that whitelister() returns and adds them to the whitelist array along with any other items in it (my-whitelisted-selector, in this case).
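
Here is a tiny standalone illustration of the spread step. The stand-in function below is not the real purgecss-whitelister. It just returns a fixed array, on the assumption that the real function returns an array of selector strings.

```javascript
// Stand-in for purgecss-whitelister; the real package is not used here.
// Assumption: the real function returns an array of selector strings.
function fakeWhitelister() {
  return ["html", "body", "button"];
}

// The spread operator unpacks the returned array into the outer array.
const whitelist = ["my-whitelisted-selector", ...fakeWhitelister()];

console.log(whitelist);
```

Without the ... the result would be an array nested inside the whitelist array instead of one flat list of selectors.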

Thanks to @knowler for helping with this!

Awesome stuff. Thanks @mmirus this solved the issue for me. Was having problems with Swiper styles when running yarn build:production but the whitelister with the spread operator solved it!

Hi,
For those who choose the postcss-uncss way, here is a way to automate the sitemap.json creation on build:production, in combination with Henk’s plugin or a custom json_sitemap function you would add to WordPress.

First install the Webpack Shell Plugin:

yarn add --dev webpack-shell-plugin

In your webpack.config.optimize.js add this:

const WebpackShellPlugin = require("webpack-shell-plugin");

module.exports = {
  plugins: [
    new ImageminPlugin({
      .....
    }),
    new WebpackShellPlugin({
      onBuildStart: [
        "curl --silent --output sitemap.json " +
          config.devUrl +
          "/?json_sitemap"
      ],
      onBuildEnd: []
    })
  ]
};

I personally chose to reuse the function from Henk’s plugin and modified it slightly to allow better control over the final size of sitemap.json. If someone is interested, I can share the updated function.

Hi all,
I’ve been fighting with this for years. I used to use UnCSS with a wget-spidered sitemap, and later Purgecss with elaborate regexes… until I realized that I pretty much always want to:

  • whitelist everything in resources/assets/styles/
  • purgify everything from node_modules (usually a CSS framework)

I haven’t found a clean option or plugin to do just this, but I did find a workaround using purgecss start ignore comments. It needs more setup, which I detailed in this gist.

(I hope it’s OK to just link it here – this is my first post. HMU if I should unwrap it here & feel free to do so yourself!)

I had a similar setup to yours with purgecss start ignore comments but sometimes they weren’t reliable and styles got purged anyway.

I switched to whitelister and it works perfectly so far. Also, it keeps stylesheets cleaner and config is where it should be, that is in JS not CSS.

Here is an example of how to whitelist the contents of a folder:

whitelist: [
  "fade",
  "collapse",
  "collapsing",
  "in",
  "show",
  "modal-open",
  "modal-backdrop",
  "lazyloaded",
  "is-active",
  ...whitelister([
    "resources/assets/styles/common/*.scss",
    "resources/assets/styles/components/*.scss",
    "resources/assets/styles/layouts/*.scss",
    "resources/assets/styles/objects/*.scss",
  ]),
],

Thanks @mmirus for the tip!

Hi, does anyone know how to use the rejected option of Purgecss with Sage? I’m trying to see which CSS is stripped out, but I can’t get it working.

new Purgecss({
....
    rejected: true
})

https://www.purgecss.com/configuration#options
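
In case it helps with debugging, here is a sketch of how the rejected selectors could be listed once you have a result to work with. It assumes the result is an array of objects shaped like { file, css, rejected }, which is my reading of the linked docs rather than something I have verified with Sage. The function name and the sample data are made up.

```javascript
// List rejected selectors from a Purgecss-style result, one per line.
// Assumption: `results` looks like [{ file, css, rejected: [...] }].
function listRejected(results) {
  const lines = [];
  for (const result of results) {
    // Entries without a `rejected` array are skipped.
    for (const selector of result.rejected || []) {
      lines.push(`${result.file}: ${selector}`);
    }
  }
  return lines;
}

// Hypothetical result data, used only to exercise the function.
const sampleResults = [
  { file: "main.css", css: ".used{color:red}", rejected: [".unused", ".also-unused"] },
  { file: "other.css", css: "" },
];

console.log(listRejected(sampleResults).join("\n"));
```

Whether the Webpack plugin exposes this data the same way as the standalone class is something I am not sure about.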