Using Puppeteer and Squoosh to fix the web performance of embedded tweets

First published: Feb 6, 2021 Last updated: Feb 15, 2021 Read time: 11 mins

In January I was investigating the web performance of some of the UK Government’s blog posts using lighthouse-parade, and I stumbled across a recurring theme: some of the worst-performing pages were ones that included Twitter embeds. Investigating further led me to write the following tweet (very ironic, huh!):

If you are wanting to reference a tweet on a page, don't embed, just use a simple image + alt text + link. Your users will thank you! • LCP: 600ms slower • 2.7MB more JS! • 25 more requests • LH score dropped 50% Test: real Moto G4 with 3G Fast. #perfmatters #webperf - 11:01 PM · Jan 16, 2021

You are reading that correctly. As well as reducing page performance metrics and the Lighthouse score, a single embed adds an extra 2.7 MB of JavaScript to a page! That’s a crazy amount of code for what is really just a simple bit of text and (maybe) a few images. I’m not the first to notice this issue either.

The problem

So the problem I’m looking to fix is that I have many tweets embedded across lots of WordPress blog posts, and I’d like to deal with them quickly. I’d also like others to be able to follow the guidance I write to fix these issues too. So I’m looking for a fairly simple, semi-automated solution.

Now WordPress comes with a really useful feature (for content authors). Simply paste a tweet into a blog post then let the WordPress oEmbed feature take over. Jetpack will pull in the tweet data as a <blockquote>, then pull in the widgets.js JavaScript and hydrate the tweet to make it more “twittery”.

Image of a tweet dropped into a wordpress blog post that then gets expanded.

This is great for an author as it’s very “set it and forget it”, but it’s pretty poor from a web performance point of view. As far as I can tell you can’t control how this link is expanded and hydrated once it hits the page in WordPress (please do correct me if this assumption is wrong!). If we could, we could simply stop the widgets.js JavaScript from being added to the page: no 2.7 MB JavaScript download, so that would be progress. The downside is that the tweet no longer looks like it does on Twitter (see the image below):

Image of a tweet when the JavaScript has been blocked.

The above image is taken from Firefox when the Enhanced Tracking Protection setting is set to strict, as the browser will block all social media trackers by default. Personally I’m fine with how it looks. I can still read the content, and if I want to see the actual tweet I can click on the date and it will take me to Twitter. But others will probably disagree. So how do we keep the “look” of a tweet, minimise the web performance impact it has on the page and still allow it to be accessible to screenreaders?

The solution

So the solution I’ve picked is actually very simple: grab a screenshot of the tweet, then use the following HTML:

<figure id="custom_ID">
    <a href="https://twitter.com/...." rel="noreferrer noopener">
        <img src="/images/screenshot-of-tweet.jpg" alt="Tweet by [name] on [date/hour] (tweet content below)" width="123" height="321" />
    </a>
    <figcaption>"Text contained within the tweet goes here"</figcaption>
</figure>

The code above is simplified, but hopefully you get the idea. You’d probably want to indicate that it links to an external page, and you could provide different image formats too (e.g. WebP, AVIF). The alt text contains information about the person who tweeted, and the date it happened. This means that even multiple tweets from the same author on a page will be seen as unique to assistive technologies.
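If you have lots of tweets to convert, the markup above can be generated rather than hand-written. Here is a minimal sketch of that idea, assuming you already have the tweet details to hand. The function names and the shape of the `tweet` object are my own invention for illustration, and the image path and dimensions in the example are placeholders:

```javascript
// Escape characters that are significant in HTML text and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build the <figure> markup for one captured tweet.
// The property names on `tweet` are hypothetical, chosen for this sketch.
function tweetFigure(tweet) {
  const alt = `Tweet by ${tweet.author} on ${tweet.date} (tweet content below)`;
  return [
    `<figure id="${escapeHtml(tweet.id)}">`,
    `    <a href="${escapeHtml(tweet.url)}" rel="noreferrer noopener">`,
    `        <img src="${escapeHtml(tweet.image)}" alt="${escapeHtml(alt)}" ` +
      `width="${Number(tweet.width)}" height="${Number(tweet.height)}" />`,
    `    </a>`,
    `    <figcaption>${escapeHtml(tweet.text)}</figcaption>`,
    `</figure>`,
  ].join('\n');
}

// Example call: image path, dimensions and text are placeholder values.
console.log(tweetFigure({
  id: 'tweet-1350578919389470721',
  url: 'https://twitter.com/TheRealNooshu/status/1350578919389470721',
  image: '/images/screenshot-of-tweet.jpg',
  width: 123,
  height: 321,
  author: '[name]',
  date: '[date/hour]',
  text: 'Text contained within the tweet goes here',
}));
```

Escaping matters here because tweet text regularly contains characters like `&`, `<` and `"`, which would otherwise break the alt attribute or the caption.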

There’s an interesting point you should consider before using this method. What if the author of the original tweet deletes it? Your screenshot then becomes a permanent record of a tweet they wanted removed. I think this is beyond the scope of this blog post, but it’s something Terence Eden discusses in his How to preserve deleted Tweets in WordPress? blog post.

So quickly skipping over the morality question of preserving tweets, we run into the next issue. How do you screenshot lots of tweets and grab the tweet text without it taking hours? There are ways to screenshot tweets very easily: Tweet Cyborg is a service that allows you to input any tweet and it will convert it to an image to download. Or you can even do it directly in both Chrome and Firefox Devtools:

Chrome Devtools gives you the option to capture a node screenshot when you right click on an element.

Both these methods work, but if you had to do this for hundreds of tweets, it’s going to take a while. And that isn’t even taking into account capturing the text for the image’s alt attribute!

Puppeteer to the rescue!

This is where we can use the fantastic Puppeteer to help us out. So what is Puppeteer? Quoting from the documentation:

Puppeteer is a Node library which provides a high-level API to control Chrome or Chromium over the DevTools Protocol. Puppeteer runs headless by default, but can be configured to run full (non-headless) Chrome or Chromium.

So it allows you to programmatically open a browser, load a page, manipulate the page, capture data, then close the browser down and repeat. All without any input from us once started! And it just so happens to be able to capture screenshots too (since it uses the DevTools API after all, which we’ve already seen has the ability to capture node screenshots above).
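To make that open, load, capture, close cycle concrete, here is a rough sketch of what capturing a single tweet could look like. This is not the actual tweet-grabber code. The function names, the viewport size and (most importantly) the `article` selector are assumptions on my part, and Twitter’s markup may well differ or change:

```javascript
const path = require('path');
const fs = require('fs/promises');

// Pull the numeric status ID out of a tweet URL; returns null if none found.
function tweetIdFromUrl(url) {
  const match = /\/status\/(\d+)/.exec(url);
  return match ? match[1] : null;
}

// Capture one tweet as <id>.png plus its text as <id>.txt in outDir.
async function captureTweet(url, outDir) {
  // Required lazily so the helper above works without Puppeteer installed.
  const puppeteer = require('puppeteer'); // npm install puppeteer
  const id = tweetIdFromUrl(url);
  if (!id) throw new Error(`Not a tweet URL: ${url}`);

  const browser = await puppeteer.launch(); // headless by default
  try {
    const page = await browser.newPage();
    // ASSUMPTION: viewport values picked for a crisp screenshot.
    await page.setViewport({ width: 800, height: 1200, deviceScaleFactor: 2 });
    await page.goto(url, { waitUntil: 'networkidle2' });

    // ASSUMPTION: the tweet is rendered as the first <article> on the page.
    const tweet = await page.waitForSelector('article');
    await fs.mkdir(outDir, { recursive: true });
    await tweet.screenshot({ path: path.join(outDir, `${id}.png`) });

    // innerText returns all visible text in the node, not just the tweet body.
    const text = await tweet.evaluate((node) => node.innerText);
    await fs.writeFile(path.join(outDir, `${id}.txt`), text, 'utf8');
  } finally {
    await browser.close(); // always shut the browser down, even on errors
  }
}
```

Calling `captureTweet(url, 'tweets')` for each URL in turn gives you the “repeat” part. The `try` / `finally` is there so a tweet that fails to load doesn’t leave a stray browser process running.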

Presenting `tweet-grabber`

I know, I’m terrible with names… but let’s look beyond that. It’s a Puppeteer script I’ve created that allows you to quickly capture screenshots of multiple tweets. Functionality includes:

Ability to capture a single tweet, multiple (comma separated) tweets, or pass in a file with a list of tweets to be captured.
The uncompressed PNG of the tweet will be captured as well as the text from the tweet (in an accompanying txt file).
Automatic compression of the tweets using the Squoosh CLI.
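The three input styles in the first point can all be handled by one small function. The following is only a sketch of how that might work, assuming the file contains one tweet URL per line; `resolveTweetUrls` and `splitUrls` are hypothetical names, not taken from the script itself:

```javascript
const fs = require('fs');

// Split a string on commas and whitespace (including newlines), dropping blanks.
function splitUrls(raw) {
  return raw.split(/[\s,]+/).filter(Boolean);
}

// Turn one command line argument into a list of tweet URLs.
// If the argument is the path of an existing file, read the URLs from it,
// otherwise treat it as a single URL or a comma separated list of URLs.
function resolveTweetUrls(arg) {
  if (fs.existsSync(arg) && fs.statSync(arg).isFile()) {
    return splitUrls(fs.readFileSync(arg, 'utf8'));
  }
  return splitUrls(arg);
}
```

In a script this would be called with something like `resolveTweetUrls(process.argv[2])`, and the resulting array looped over one tweet at a time.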

So for example, the tweet you can see at the top of this blog post was captured using tweet-grabber like so:

$ node index.js https://twitter.com/TheRealNooshu/status/1350578919389470721

The final compressed image as well as the tweet text is output to the compressed_tweets directory.
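For the compression step, one way to drive the Squoosh CLI from Node is to shell out to it with `npx`. The sketch below is an assumption about how that could be wired up rather than a copy of what tweet-grabber does: the function names, the choice of MozJPEG and the quality value of 75 are all mine, and the flags are the ones I understand the CLI to document (`--mozjpeg` with a JSON config, `-d` for the output directory):

```javascript
const { execFile } = require('child_process');

// Build the argument list for `npx @squoosh/cli`.
// ASSUMPTION: MozJPEG at quality 75; swap in --webp or --avif as needed.
function squooshArgs(inputPng, outDir, quality = 75) {
  return [
    '@squoosh/cli',
    '--mozjpeg', JSON.stringify({ quality }),
    '-d', outDir,
    inputPng,
  ];
}

// Run the CLI and resolve with its output once compression has finished.
function compressTweet(inputPng, outDir) {
  return new Promise((resolve, reject) => {
    execFile('npx', squooshArgs(inputPng, outDir), (error, stdout) => {
      if (error) reject(error);
      else resolve(stdout);
    });
  });
}
```

Keeping the argument building separate from the process call makes it easy to check the exact command before pointing it at a directory full of screenshots.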

It’s not a complicated script, all fairly straightforward. I’ve already been putting it to good use to run through 10-15 tweets at a time, all output in less than 40 seconds. Much quicker than manually capturing these tweets!

What about copy / pasting tweet text?

So Thomas Steiner raised an excellent point after I’d published this post, one which I hadn’t considered. When embedding the tweet as an image you are preventing a user from easily copying and pasting the tweet text should they want to do this. A user could view the image alt text and / or visit the original tweet on Twitter, but neither of these methods is as simple as copying directly from a page. He also came up with a suggestion about using Puppeteer to capture the CSS for a tweet, while still keeping the original HTML (and discarding widgets.js). This sounds like an interesting idea I’d like to investigate further, although I still need to consider the fact I will be writing guidance and maybe training a few “non-technical” folks on how to do this within the limitations of WordPress.

So keep this limitation in mind when using this method. There are a number of alternative solutions listed below that you may want to consider too.

Update: A simple solution I’ve found to this issue is to use the <figure> and <figcaption> elements. The text is shown below the image, where it can be read by screen readers and search engines, and copied and pasted by users if required.

Other alternatives

There are others who have looked into this problem too and have different solutions. I’m linking to them here for completeness:

Jane Manchun Wong has worked on a solution using graphql fragments, dataloader, relay and nexus. Follow up tweet here with more info.
Luis Alvarez has built a Static Tweet Next.js Demo.
Johan Janssens has been working on a very work in progress Cloudflare Worker solution, code seen here.
Lucas Pardue uses a simple bash script to strip out the unwanted (and heavy) resources.
Terence Eden is currently working on a really interesting and clever solution to convert a tweet to an SVG. Thread with info can be found here. A writeup of this method can be found on his blog: “Minimum Viable Tweet to Semantic SVG”
Terence Eden investigated a similar image capture method back in 2016 using the Screenshotlayer API and mogrify when Making a Twitter Collage.
Bruno Quaresma has created TweetPik, a tool to easily convert tweets to images (SVG, PNG and JPG).
Kyle Mitofsky has created the eleventy-plugin-embed-tweet plugin that allows you to embed a tweet during the static build process.
Umar Hansa has created better-twitter-embed that he uses on his Dev Tips website.

So there are a few options around to use if you want to include tweets in your site without the web performance hit.

Improvements from team Twitter

I briefly discussed this issue with Addy Osmani who pointed me in the direction of a tweet from Charlie Croom, a web developer at Twitter. It sounds like the team is fully aware of this issue and a fix is on their roadmap! So with any luck these workarounds won’t be needed in the future.

Summary

Embedded tweets are bad for performance, but there are ways to avoid the performance impact. There are downsides to the image capture solution I presented above, but I feel they are fairly minor compared to the 2.7 MB of JavaScript and 25 extra requests that a single embedded tweet inflicts upon a user visiting the page. If it doesn’t work for you there are other solutions available too, thanks to the wonderful open source community.

As always please let me know if you have any feedback and comments. I’m especially interested in how this script could be improved!

Post changelog:

06/02/21: Initial post published. Added a link to Terence Eden’s “Making a Twitter Collage” blog post (Thanks Terence). Added link to TweetPik (thanks Alfredo Lopez for the link).
07/02/21: Added two other ways of embedding tweets (eleventy-plugin-embed-tweet & better-twitter-embed). Thanks to Alex Russell & Umar Hansa for pointing them out. Added feedback from Thomas Steiner about the copy & pasting limitation if using an image.
11/02/21: Updated the HTML sample in the page to make use of <figure> and the <figcaption>. This replicates the HTML output by WordPress when using a caption.
15/02/21: Added a link to Terence Eden blog post on his semantic SVG version of a tweet. Updated the HTML to improve overall accessibility, as the initial version failed the 4.1.2 Name, Role, Value WCAG Success criterion. Many thanks to Arnaud Delafosse for his feedback, and Šime Vidas for flagging it.