Syntax highlighting in React with highlight.js and Web Worker

2018-09

One of the most common complaints I heard about my blog was the lack of syntax highlighting in posts, especially those that contain a lot of code. Okay, it's almost 2019, and I'm a software engineer working mostly with front-end these days, so I finally agreed that it should be added. So I added it. Along the way, I also learned a little bit about Web Workers. Hence this post, in which I describe this little adventure.

So, how do you do syntax highlighting? I have no clue! It's the open source world: we don't bother with such questions, except in the rare case when there is no library for a given purpose. Luckily, for syntax highlighting there is a handful of them. With so many implementations out there, I could pick one tailored specifically to my needs.

My initial idea was that I didn't want the syntax highlighting mechanism to block rendering of my post. It's a cool feature, but it's not more important than the content of the article itself. More often than not, by the time a user scrolls down to some piece of code in my post, at least a few seconds will have passed: code is never at the very top of the view, there are always a few paragraphs of text first, and I assume that my readers actually read a few words from me. Therefore, I decided to use a Web Worker for the job.

That's why I needed a standalone library that provides a syntax highlighting mechanism. I didn't want it to be part of my project as one of my dependencies; I wanted to keep it separate and be able to use it conveniently from my Web Worker. I decided to go for highlight.js, also because it let me build a minified version containing only the scripts for the languages that I use in my posts.

Once I established what I wanted to do and how I wanted to achieve it, I was ready to proceed to the actual implementation. I started with modifications to my EntryContent component. Up until now, it was a dumb (purely presentational) component that displayed the HTML content of the entry, passed to it through props:

import * as React from 'react';

interface EntryContentPropsInterface {
  content: string;
}

export const EntryContent = (props: EntryContentPropsInterface): JSX.Element => (
  <div dangerouslySetInnerHTML={{ __html: props.content }} />
);

export default EntryContent;

With the introduction of syntax highlighting, I wanted EntryContent to trigger this mechanism whenever it's mounted. My plan was to select all elements matching the pre code selector from the entry's content; then pass the content of each of those elements, along with its index, to a Web Worker, which I expected to return the same content, but highlighted; and finally substitute it in the view. So that's pretty much what I did:

import * as React from 'react';

import highlightWorker from './highlight.worker';
import WebWorker from 'common/WebWorker';

interface EntryContentPropsInterface {
  content: string;
}

export class EntryContent extends React.Component<EntryContentPropsInterface> {

  componentDidMount(): void {
    this.highlightCodeBlocksAsync();
  }

  highlightCodeBlocksAsync(): void {
    const codeBlocks = document.querySelectorAll('pre code');
    const worker = new WebWorker(highlightWorker);

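    worker.addEventListener('message', (event) => {
      const { code, codeBlockIndex } = event.data;
      // Ignore error messages and unknown indexes instead of throwing.
      if (typeof code === 'string' && codeBlocks[codeBlockIndex]) {
        codeBlocks[codeBlockIndex].innerHTML = code;
      }
    });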

    codeBlocks.forEach((codeBlock, index) => {
      worker.postMessage([codeBlock.textContent, index]);
    });
  }

  render(): JSX.Element {
    const { content } = this.props;

    return (
      <div dangerouslySetInnerHTML={{ __html: content }} />
    );
  }

}

export default EntryContent;

Two things in this code remain unexplained: the highlightWorker and WebWorker scripts, both imported at the top of the file. Together they make up the implementation of my Web Worker, where all the magic happens. Let's start with highlightWorker:

export default () => {
  self.addEventListener('message', (event) => {
    try {
      importScripts('https://soofka.pl/scripts/highlight/highlight.pack.js');
      const result = self.hljs.highlightAuto(event.data[0]);
      postMessage({ code: result.value, codeBlockIndex: event.data[1] });
    } catch (e) {
      // Error fields are non-enumerable, so `{ ...e }` would post an empty object.
      postMessage({ error: String(e) });
    }
  });
};

It uses the highlightAuto method from the highlight.js library. This and other methods provided by the library are explained in detail in its documentation. TL;DR: it takes text as a parameter and, with automatic language detection, returns a result whose value property holds the same text as highlighted HTML. As you can see in the code of EntryContent, the content of each code block is passed as the first element of the array given to worker.postMessage. The second element is the index of that code block, so that I know exactly where to place the highlighted result in my view (keep in mind that this happens asynchronously).
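
The message contract described above can be sketched with explicit types. This is only an illustration, not the code running on the blog: HighlightRequest, HighlightResponse, handleRequest, applyResponse and the fakeHighlight stand-in are hypothetical names, and a plain string array plays the role of the DOM nodes so that the snippet runs without a browser or highlight.js.

```typescript
// Hypothetical message shapes. The original code sends a [text, index] array
// and receives a { code, codeBlockIndex } object.
type HighlightRequest = [string, number];

interface HighlightResponse {
  code: string;
  codeBlockIndex: number;
}

// Stand-in for hljs.highlightAuto(text).value, injected so the sketch
// does not depend on the library.
type Highlighter = (text: string) => string;

// Worker side: turn one request into one response.
function handleRequest(request: HighlightRequest, highlight: Highlighter): HighlightResponse {
  const [text, codeBlockIndex] = request;
  return { code: highlight(text), codeBlockIndex };
}

// Page side: place a response by its index and ignore indexes that do not exist.
function applyResponse(blocks: string[], response: HighlightResponse): void {
  if (response.codeBlockIndex >= 0 && response.codeBlockIndex < blocks.length) {
    blocks[response.codeBlockIndex] = response.code;
  }
}

// Usage: a fake highlighter and two "code blocks".
const fakeHighlight: Highlighter = (text) => `<span class="hl">${text}</span>`;
const blocks = ['const a = 1;', 'let b = 2;'];
const responses = blocks.map((text, index): HighlightResponse =>
  handleRequest([text, index], fakeHighlight));

// Simulate responses arriving in reverse order.
responses.reverse().forEach((response) => applyResponse(blocks, response));
```

Because every response carries its own index, each result lands in the right block regardless of the order in which the responses arrive.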

The other piece of the puzzle is WebWorker. It's actually a piece of code I found on the internet, most likely on Stack Overflow, if memory serves. It's a very clever way to help Webpack deal with a Web Worker by creating it on the fly from an imported script. Without it, I'd have to keep my Web Worker separate from the source code of my blog, similarly to what I did with the highlight.js library. Or I'd have to use some other workaround, such as worker-loader for Webpack. But instead, I did this:

export class WebWorker {
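  constructor(worker) {
    // Serialize the function and wrap it in an IIFE so it runs when the worker starts.
    const code = worker.toString();
    const blob = new Blob(['(' + code + ')()'], { type: 'application/javascript' });
    // Returning an object from a constructor makes `new WebWorker(fn)` evaluate to the Worker.
    return new Worker(URL.createObjectURL(blob));
  }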
}

export default WebWorker;
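
One consequence of this trick is worth spelling out: only the text of the function reaches the worker, so the function cannot rely on variables or imports from the module it was written in. That fits with highlightWorker loading highlight.js through importScripts instead of a regular import. Below is a minimal sketch of that limitation. The helpers buildWorkerSource and runsInFreshScope are hypothetical, and new Function is used here only as a rough stand-in for the worker's separate global scope.

```typescript
// Hypothetical helper: builds the same string the WebWorker class puts into the Blob.
function buildWorkerSource(workerFn: () => void): string {
  return '(' + workerFn.toString() + ')()';
}

// Self-contained: everything it needs is defined inside its own body.
const selfContained = (): void => {
  const local = 'hello';
  void local.toUpperCase();
};

// Not self-contained: the returned function closes over `greeting`,
// which exists only in the scope of makeClosure.
function makeClosure(): () => void {
  const greeting = 'hello';
  return () => {
    void greeting.toUpperCase();
  };
}

// Evaluate the source with no access to the scope it was defined in,
// roughly the way a freshly created worker would see it.
function runsInFreshScope(source: string): boolean {
  try {
    new Function(source)();
    return true;
  } catch (e) {
    return false;
  }
}

const selfContainedRuns = runsInFreshScope(buildWorkerSource(selfContained));
const closureRuns = runsInFreshScope(buildWorkerSource(makeClosure()));
```

Here selfContainedRuns ends up true, while closureRuns ends up false, because greeting is no longer defined once the function has been reduced to its source text.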

And that could be it! But I decided to do one more thing.

Even though Web Workers are very widely supported by browsers nowadays, I still wanted to practice handling the case in which they're not available. Thus, I added one last method to EntryContent, which is meant to be triggered when Web Workers aren't supported, and which highlights code synchronously. This time I used the highlightBlock method from highlight.js, which conveniently handles DOM nodes. It allowed me to avoid setting innerHTML explicitly, which is never the preferred option.

// ...

componentDidMount(): void {
  if (typeof Worker !== 'undefined') {
    this.highlightCodeBlocksAsync();
  } else {
    this.highlightCodeBlocks();
  }
}

// ...

highlightCodeBlocks(): void {
  document.querySelectorAll('pre code').forEach((codeBlock: Node) => {
    hljs.highlightBlock(codeBlock);
  });
}

// ...
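
The choice between the two paths can also be isolated in a small function, which makes it easy to check without a browser. This is a sketch under assumptions: pickHighlightStrategy and HighlightStrategy are hypothetical names, and the objects passed in stand in for the global object. Reading a property never throws, whereas referring to a bare Worker identifier raises a ReferenceError in an environment where it is not defined.

```typescript
// Hypothetical helper: decides which highlighting path to take.
type HighlightStrategy = 'worker' | 'sync';

function pickHighlightStrategy(scope: { Worker?: unknown }): HighlightStrategy {
  // A missing property reads as undefined, so this check cannot throw.
  return typeof scope.Worker === 'function' ? 'worker' : 'sync';
}

// Usage with stand-in global objects.
const withWorkers = pickHighlightStrategy({ Worker: class {} }); // 'worker'
const withoutWorkers = pickHighlightStrategy({});                // 'sync'
```

In a browser the argument would be window or self, and componentDidMount would call highlightCodeBlocksAsync or highlightCodeBlocks depending on the result.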

I must admit, the synchronous way was much easier than the one with a Web Worker, but much less fun too. Either way, I got what I wanted: syntax highlighting in my posts, and a little bit of new knowledge and experience.

As usual, you can find all the changes I described in this post in my blog's repository on GitHub.
