What is meta charset?

 

A charset or character set in full is essentially a set of characters recognized by the computer the same way the calculator can identify numbers. Each of these characters is represented by a number known as code point and this creates a communication channel for encoding and decoding content.

 

A character set, therefore, contains characters that serve a specific or particular purpose. The computer stores the characters as one or more bytes. An example is the ASCII character set which represents all English characters and special control characters with numbers from 0-127.

 

However, most character sets only work for specific languages and recognize limited characters and this makes the coding and encoding difficult or impossible. In modern times, however, the Unicode is the most reliable and universally accepted character set due to its ability to translate codes and numbers easily.

 

You can see the meta charset in the header of your html code

 

<meta charset="utf-8>

 

How does it work?

 

Meta Charset is what determines how text is transmitted and stored. This text data is usually converted to binary first and then there needs to be a kind of cipher that connects characters with their correct binary equivalents.

 

When this data is eventually decoded, the character encoding must be known beforehand or there could be complications. An example of these can be seen in browsers when you’re looking at a webpage. Information about the kind of character set used comes from the server or is written directly by the developer. Unfortunately, there is a myriad of character sets and this means diverse ways of matching binary codes to characters and bytes.

 

For content developers and authors, choosing the UTF-8 character set for your content means that you can use a single character set to multiple characters needs thereby simplifying things greatly without the need to track and convert multiple times. This means it would be easier to surf through your content without getting confusing characters and garbage

 

AddType 'text/html; charset=UTF-8' html

 

Why is it important?

 

When you think of the fact that every single time text is transmitted, it needs to be encoded in a specific charset and decoded on the other side, the importance of charset is quite obvious. This means that without proper character coding, a browser will display garbage text because it simply does not understand what is being put into it and has to make a quick uninformed guess.

 

It is also important in html forms because when you input text into text boxes on sites or social media platforms, it has to be encoded carefully. If this information is unavailable for any reason, the incorrect mapping could lead to the loss of vital information.

Charset code example

 

What a character set does is to provide a key to unlock and crack a code that passes between the user and the website.

 

It is a set of structured mappings between the bytes in the computer and the characters in the character set. If this key is missing, the data looks like written garbage. This means that when you input text through a keyboard, the character set links the characters you choose to specific bytes in computer memory, and then to display the text it reads the bytes back into the characters.

 

Is it a ranking factor for SEO?

 

The character set is not a ranking factor for search engine optimization. Most search engines focus on the important goal of delivering relevant, useful content to those who seek it and as such does not consider other outside factors that do not contribute to that goal.

 

So your character set matters because of how you transmit information but search engines are not interested in it. Using other charsets apart from Utf-8 will not decrease your SEO ranking because to a large extent it doesn’t matter what character encoding you use as long as the search engine is able to get information to the end users.

 

How can I add it if I need to?

 

You can add a character set to your website by using the following code

 

<?phpheader( Content-Type: text/html; charset=iso-8859-1’)

 

For this to work, you should include this in the PHP that includes your html file. It is important to note that it might not work on all web pages as the code above is not a function but a statement so you should include your page html. This is bearing in mind that the php web page uses Utf-8 character set on its header.

 

Different types of charset

Most charsets came about from individual manufacturers catering to the needs of their clients. Most charsets are incompatible with each other (with a few exceptions). The three most common charsets are, ASCII (1968), ISO 8859-1 (1987) and UTF-8 (1996).

ASCII

Charset for the English language. Contains 7-bits that are mapped to 128 characters. Each letter is assigned a number from 0 to 127. This code set is quite restricted, but being one of the pioneers sparked the creation of a character set for each of the other languages.  Most computers use ASCII codes to represent text.

ascii charset table

Unicode

Unicode was created to unify 135 modern and historic languages under one standard. Unicode is a standard and not a charset itself. As of May 2019, version 12.1, Unicode contains 137,994 characters including symbols and emojis. The Unicode standard defines UTF-8, UTF-16, and UTF-32

UTF-8

Now the dominant code of the internet. UTF-8 is used in 94% of websites. It encodes the most common characters,  basic numbers, and English with 8-bits. UTF-8 uses a minimum of 1 byte. UTF-8 is also identical to ASCII for English. This means that any ASCII text is also a UTF-8 text.

UTF-8 charset code example

Image Source 

UTF-16

Unicode with 16 bits. While used originally with systems such as Windows and Java, it never really took off with Linux and macOS. Today UTF-16 is used with 0.01% of webpages. UTF-16 uses a minimum of 2 bytes.

UTF-16 Unicode charset example

Image Source

UTF-32

Unicode with 32 bits. The advantage of UTF-32 is that the Unicode points are directly indexed. The disadvantage is that it is not efficient with its use of space as it always uses 4 bytes. This means up to twice the size of UTF-16 and four times to that of UTF-8.

 

In conclusion

So what are the SEO benefits of charset? While not a direct ranking factor you will need to be aware of your charset. If you accidentally display 2 different standards in your meta charset or do not follow the standard's rules correctly, then you will have a decoding issue, and your content will not be displayed correctly. This will negatively impact your SEO.

 

If you implement your charset correctly then you will help prevent a high bounce rate, not give people a reason to not link to you and search engines cannot mistakenly interpret your content which is going to help your SEO efforts.