So the webmaster could fix this with a META tag (the easiest way), which was most definitely a thing back in 2001 when that page was created:
Code:
<html lang="ja">
<head>
<meta charset="Shift_JIS">
</head>
... rest of page
</html>
Alternatively, the web server can send a response header indicating the charset:
Code:
Content-Type: text/html; charset=Shift_JIS
Of note, this MUST be done, because HTTP/1.1 (the protocol used), as originally specified in RFC 2616, explicitly sets the default charset for text types to ISO-8859-1 (Western European, Latin-1).
Browsers have long departed slightly from that hard default: it's been good practice to fall back to the user's preferred encoding, as set in the browser options. But it still remains that there can only be one default.
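To make the fallback order concrete, here's a rough Python sketch of how a client could pick the charset for a response. The helper name `resolve_charset` and its `preferred` parameter are made up for illustration, and real browsers do a lot more than this (BOM and META sniffing, for instance), so treat it as a simplification rather than how any actual browser works.

```python
from email.message import Message

HTTP_DEFAULT = "iso-8859-1"  # default for text/* in the original HTTP/1.1 spec


def resolve_charset(content_type, preferred=None):
    """Return the declared charset, else the reader's preference, else the HTTP default.

    Hypothetical helper for illustration only; not any browser's actual algorithm.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    declared = msg.get_content_charset()  # None when there is no charset parameter
    return declared or preferred or HTTP_DEFAULT


page_bytes = "日本語".encode("shift_jis")  # bytes as a Shift_JIS page would send them

labeled = resolve_charset("text/html; charset=Shift_JIS")
unlabeled = resolve_charset("text/html")

text_ok = page_bytes.decode(labeled)         # the original Japanese text
text_garbled = page_bytes.decode(unlabeled)  # decodes without error, but as mojibake
```

The unlabeled case is the nasty one: decoding as ISO-8859-1 never raises an error, so the reader just gets garbage characters instead of a warning.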
From the W3C page about this:
Documents transmitted with HTTP that are of type text, such as text/html, text/plain, etc., can send a charset parameter in the HTTP header to specify the character encoding of the document.
It is very important to always label Web documents explicitly. HTTP 1.1 says that the default charset is ISO-8859-1. But there are too many unlabeled documents in other encodings, so browsers use the reader's preferred encoding when there is no explicit charset parameter.
Alternatively, the webmaster can just convert everything to UTF-8 (there are handy scripts available for this everywhere) and avoid this altogether. It's still recommended to declare the charset in the page/headers to be explicit, but UTF-8 is very broadly accepted and can represent all languages.
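For what such a conversion script boils down to, here's a minimal Python sketch. It assumes the source files really are Shift_JIS; the function names `to_utf8` and `convert_file` and the paths passed in are placeholders of mine, not from any particular tool.

```python
def to_utf8(raw, source_encoding="shift_jis"):
    """Re-encode bytes from source_encoding to UTF-8.

    Raises UnicodeDecodeError if the input is not valid in source_encoding,
    which is a useful hint that the assumed encoding was wrong.
    """
    return raw.decode(source_encoding).encode("utf-8")


def convert_file(src_path, dst_path, source_encoding="shift_jis"):
    """Read src_path as bytes, convert to UTF-8, and write the result to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(to_utf8(data, source_encoding))
```

After converting, any existing charset declaration in the page or in the server headers has to be changed to UTF-8 as well, otherwise the page ends up mislabeled in the other direction.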