Posted by the Google Fonts team
The Google Fonts catalog now includes Japanese web fonts. Since shipping Korean in February, we have been working to optimize the font slicing system and extend it to support Japanese. The optimization efforts proved fruitful: Korean users now transfer, on average, over 30% fewer bytes than they did with our previous best solution. This kind of ongoing optimization is a major goal of Google Fonts.
Japanese presents many of the same core challenges as Korean:
- Very large character set
- Visually complex letterforms
- A complex writing system: Japanese uses several distinct scripts (explained well by Wikipedia)
- More character interactions: Line layout features (e.g. kerning, positioning, substitution) break when they involve characters that are split across different slices
The impact of the large character set made up of complex glyph contours is multiplicative, resulting in very large font files. Meanwhile, the complex writing system and character interactions forced us to refine our analysis process.
To begin supporting Japanese, we gathered character frequency data from millions of Japanese webpages and analyzed it to decide how to slice the fonts. Users download only the slices they need for a page, typically avoiding the majority of the font. Over time, as they visit more pages and cache more slices, their experience becomes ever faster. This approach is compatible with many scripts because it is based on observations of real-world usage.
[Figure: Frequency of the most popular Japanese and Korean characters on the web]
As shown above, Korean and Japanese have a relatively small set of characters that are used extremely frequently, and a very long tail of rarely used characters. On any given page most of the characters will be from the high frequency part, often with a few rarer characters mixed in.
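To make the idea of a frequency ranking concrete, here is a minimal sketch of how character counts could be gathered and how much of the text the top characters cover. It is not our production pipeline: the function names and the tiny `pages` corpus are made up for illustration, and a real analysis would also need to filter by script and handle markup.

```python
from collections import Counter

def rank_characters(pages):
    """Return (character, count) pairs, most frequent first."""
    counts = Counter()
    for text in pages:
        # Skip whitespace; everything else is counted as-is in this sketch.
        counts.update(ch for ch in text if not ch.isspace())
    return counts.most_common()

def coverage_of_top(ranked, n):
    """Fraction of all character occurrences covered by the n most frequent."""
    total = sum(count for _, count in ranked)
    covered = sum(count for _, count in ranked[:n])
    return covered / total if total else 0.0

# Toy corpus standing in for crawled pages (hypothetical data).
pages = ["ああいう", "あいえ", "あお"]
ranked = rank_characters(pages)
top2_coverage = coverage_of_top(ranked, 2)
```

Even in this toy corpus, two of the five distinct characters account for two thirds of all occurrences, which is the head-plus-long-tail shape described above in miniature.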
We tried fancier segmentation strategies, but the most performant method for Korean turned out to be simple:
- Put the 2,000 most popular characters in a slice
- Put the next 1,000 most popular characters in another slice
- Sort the remaining characters by Unicode codepoint number and divide them into 100 equally sized slices
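The three steps above can be sketched in a few lines. This is an illustration under assumptions, not our actual tooling: `slice_coarse` and `split_evenly` are hypothetical names, the input is assumed to be a list of integer codepoints already ordered from most to least frequent, and "equally sized" is interpreted as chunk sizes that differ by at most one.

```python
def split_evenly(items, parts):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks

def slice_coarse(ranked_codepoints, head=2000, second=1000, tail_slices=100):
    """Two popularity-based slices, then the rest split by codepoint order."""
    slices = [
        ranked_codepoints[:head],               # the most popular characters
        ranked_codepoints[head:head + second],  # the next most popular
    ]
    rest = sorted(ranked_codepoints[head + second:])  # codepoint order
    slices.extend(split_evenly(rest, tail_slices))
    return [s for s in slices if s]  # drop empty slices for small inputs

# Hypothetical ranking: the 11,172 Hangul syllables, with higher codepoints
# pretending to be more popular. Real rankings come from measured frequency.
ranked = list(range(0xAC00, 0xAC00 + 11172))[::-1]
slices = slice_coarse(ranked)
```

With this input the result is 102 slices: one of 2,000 characters, one of 1,000, and 100 tail slices of 81 or 82 characters each.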
A user of Google Fonts viewing a webpage will download only the slices needed for the characters on the page. This yielded great results: clients downloaded 88% fewer bytes than they would under a naive strategy of sending the whole font. While brainstorming how to make things even faster, we had a bit of a eureka moment, realizing that:
- The core features we rely on to efficiently deliver sliced fonts are unicode-range and woff2
- Browsers that support unicode-range and woff2 also support HTTP/2
- HTTP/2 enables the concurrent delivery of many small files
In combination, these features mean we no longer have to worry about queuing delays as we would have under HTTP/1.1, and therefore we can do much more fine-grained slicing.
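For readers unfamiliar with unicode-range, the sketch below shows how one slice of codepoints could be turned into an `@font-face` rule, so that the browser fetches that slice's woff2 file only when the page uses one of its characters. The helper names, the font family, and the URL are placeholders invented for this example; the CSS we actually serve may be formatted differently.

```python
def _fmt(start, end):
    """Format a single codepoint or an inclusive run in unicode-range syntax."""
    return f"U+{start:X}" if start == end else f"U+{start:X}-{end:X}"

def unicode_range(codepoints):
    """Build a unicode-range value, merging consecutive codepoints into runs."""
    parts = []
    run_start = prev = None
    for cp in sorted(set(codepoints)):
        if prev is not None and cp == prev + 1:
            prev = cp  # extend the current run
            continue
        if run_start is not None:
            parts.append(_fmt(run_start, prev))  # close the previous run
        run_start = prev = cp
    if run_start is not None:
        parts.append(_fmt(run_start, prev))
    return ", ".join(parts)

def font_face(family, url, codepoints):
    """Render one @font-face rule for one slice (family and url are placeholders)."""
    return (
        "@font-face {\n"
        f"  font-family: '{family}';\n"
        f"  src: url({url}) format('woff2');\n"
        f"  unicode-range: {unicode_range(codepoints)};\n"
        "}"
    )

example_range = unicode_range([0x3042, 0x3043, 0x3044, 0x4E00])
example_rule = font_face("Example Sans JP", "https://example.com/slice-0.woff2",
                         [0x3042, 0x3043, 0x3044, 0x4E00])
```

A stylesheet for a sliced font is then one such rule per slice, all sharing the same font-family name. Slices sorted by frequency rather than by codepoint tend to produce many short runs, which makes each rule longer.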
Our analyses of the Japanese and Korean web show that most pages use mostly common characters, plus a few rarer ones. To optimize for this, we tested a variety of finer-grained strategies on the common characters for both languages.
We concluded that the following is the best strategy for Korean, with clients downloading 38% fewer bytes than our previous best strategy:
- Take the 2,000 most popular Korean characters, sort by frequency, and put them into 20 equally sized slices
- Sort the remaining characters by Unicode codepoint number, and divide them into 100 equally sized slices
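As a rough sketch of this finer-grained strategy, the code below builds the slices and then works out which of them a given page would need. The names `slice_fine`, `build_index`, and `slices_for_page` are hypothetical, the ranking is synthetic, and in practice the browser makes this selection itself from the unicode-range declarations rather than through a lookup like this one.

```python
def chunk(items, parts):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    out, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        out.append(items[start:start + size])
        start += size
    return out

def slice_fine(ranked_codepoints, head=2000, head_slices=20, tail_slices=100):
    """Popular characters in frequency-ordered slices, the rest in codepoint order."""
    head_part = ranked_codepoints[:head]          # already sorted by frequency
    tail_part = sorted(ranked_codepoints[head:])  # sorted by codepoint
    slices = chunk(head_part, head_slices) + chunk(tail_part, tail_slices)
    return [s for s in slices if s]  # drop empty slices for small inputs

def build_index(slices):
    """Map each codepoint to the index of the slice that contains it."""
    return {cp: i for i, s in enumerate(slices) for cp in s}

def slices_for_page(text, index):
    """Indices of the slices needed to render `text`, in ascending order."""
    return sorted({index[ord(ch)] for ch in text if ord(ch) in index})

# Synthetic ranking: Hangul syllables in codepoint order stand in for
# a measured popularity order.
ranked = list(range(0xAC00, 0xAC00 + 11172))
slices = slice_fine(ranked)
index = build_index(slices)
page = chr(0xAC00) + chr(0xAC01) + chr(0xAC00 + 150) + chr(0xAC00 + 11171)
needed = slices_for_page(page, index)
```

Here a page with three common characters and one rare one touches only 3 of the 120 slices. Each of the 20 popular slices holds 100 characters, so a page that uses only a handful of common characters fetches far less than it would from a single 2,000-character slice.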
For Japanese, we found that segmenting the first 3,000 characters into 20 slices was best, resulting in clients downloading 80% fewer bytes than they would if we just sent the whole font. Having sufficiently reduced transfer sizes, we now feel confident in offering Japanese web fonts for the first time!
Now that both Japanese and Korean are live on Google Fonts, we have even more ideas for further optimization—and we will continue to ship updates to make things faster for our users. We are also looking forward to future collaborations with the W3C to develop new web standards and go beyond what is possible with today's technologies (learn more here).
PS - Google Fonts is hiring :)