To improve cost efficiency for customers who use the Webext Interact SMS API, a subset of Unicode characters is automatically converted to standard GSM 03.38 equivalents.
The conversion process is conditional. To ensure message integrity and predictability, the system applies the following rule:
The system will only perform conversions if every non-standard Unicode character in your message has a defined GSM 03.38 equivalent.
If your message contains any character that cannot be converted (i.e., a character outside the supported GSM 03.38 set for which no conversion rule exists), the system will not perform any conversions at all. The message will be sent in its original format.
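The all-or-nothing rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the service's implementation: `CONVERSIONS` is a small hypothetical excerpt of the conversion tables in this document, `GSM_CHARS` is the GSM 03.38 basic set plus its extension table (form feed omitted), and `convert_message` is a made-up name.

```python
# Hypothetical excerpt of the conversion tables; the real list is much longer.
CONVERSIONS = {
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201C": '"',   # left double quotation mark
    "\u201D": '"',   # right double quotation mark
    "\u2013": "-",   # en dash
    "\u2014": "-",   # em dash
    "\u00A0": " ",   # non-breaking space
}

# GSM 03.38 basic character set plus extension table (form feed omitted).
GSM_CHARS = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
    "^{}\\[~]|€"
)


def convert_message(text: str) -> str:
    """Convert only if every non-GSM character has a conversion rule."""
    for ch in text:
        if ch not in GSM_CHARS and ch not in CONVERSIONS:
            # One unconvertible character blocks all conversions.
            return text
    return "".join(CONVERSIONS.get(ch, ch) for ch in text)


# Every non-GSM character is convertible, so the message is converted.
print(convert_message("It\u2019s working \u2014 great!"))
# The CJK characters have no rule, so the curly apostrophe is left alone too.
print(convert_message("It\u2019s \u4f60\u597d") == "It\u2019s \u4f60\u597d")
```

The check runs over the whole message before any replacement is made, which is what keeps a partially convertible message byte-for-byte identical to the input.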
| Original Character | Unicode | Converted To | Description |
|---|---|---|---|
| ‘ | U+2018 | ' | Left single quotation mark |
| ’ | U+2019 | ' | Right single quotation mark |
| ‚ | U+201A | ' | Single low-9 quotation mark |
| ‛ | U+201B | ' | Single high-reversed-9 quotation mark |
| ′ | U+2032 | ' | Prime |
| ‵ | U+2035 | ' | Reversed prime |
| ʹ | U+02B9 | ' | Modifier letter prime |
| ʼ | U+02BC | ' | Modifier letter apostrophe |
| ʻ | U+02BB | ' | Modifier letter turned comma |
| ` | U+0060 | ' | Grave accent |
| ´ | U+00B4 | ' | Acute accent |
| “ | U+201C | " | Left double quotation mark |
| ” | U+201D | " | Right double quotation mark |
| „ | U+201E | " | Double low-9 quotation mark |
| ‟ | U+201F | " | Double high-reversed-9 quotation mark |
| ″ | U+2033 | " | Double prime |
| ‶ | U+2036 | " | Reversed double prime |
| « | U+00AB | < | Left-pointing double angle quotation mark |
| » | U+00BB | > | Right-pointing double angle quotation mark |
| ‹ | U+2039 | < | Single left-pointing angle quotation mark |
| › | U+203A | > | Single right-pointing angle quotation mark |
| Original Character | Unicode | Converted To | Description |
|---|---|---|---|
| – | U+2013 | - | En dash |
| — | U+2014 | - | Em dash |
| ― | U+2015 | - | Horizontal bar |
| ‐ | U+2010 | - | Hyphen |
| ‑ | U+2011 | - | Non-breaking hyphen |
| ‒ | U+2012 | - | Figure dash |
| − | U+2212 | - | Minus sign |
| ﹘ | U+FE58 | - | Small em dash |
| ﹣ | U+FE63 | - | Small hyphen-minus |
| － | U+FF0D | - | Fullwidth hyphen-minus |
| Original Character | Unicode | Converted To | Description |
|---|---|---|---|
| (tab) | U+0009 | (space) | Tab character |
| (nbsp) | U+00A0 | (space) | Non-breaking space |
| (en quad) | U+2000 | (space) | En quad |
| (em quad) | U+2001 | (space) | Em quad |
| (en space) | U+2002 | (space) | En space |
| (em space) | U+2003 | (space) | Em space |
| (3/em space) | U+2004 | (space) | Three-per-em space |
| (4/em space) | U+2005 | (space) | Four-per-em space |
| (6/em space) | U+2006 | (space) | Six-per-em space |
| (fig space) | U+2007 | (space) | Figure space |
| (punc space) | U+2008 | (space) | Punctuation space |
| (thin space) | U+2009 | (space) | Thin space |
| (hair space) | U+200A | (space) | Hair space |
| (nnbsp) | U+202F | (space) | Narrow no-break space |
| (mmsp) | U+205F | (space) | Medium mathematical space |
| (ideographic) | U+3000 | (space) | Ideographic space |
| Original Character | Unicode | Converted To | Description |
|---|---|---|---|
| · | U+00B7 | . | Middle dot |
| • | U+2022 | . | Bullet |
| ‧ | U+2027 | . | Hyphenation point |
| ∙ | U+2219 | . | Bullet operator |
| ⋅ | U+22C5 | . | Dot operator |
| ․ | U+2024 | . | One dot leader |
| Original Character | Unicode | Converted To | Description |
|---|---|---|---|
| ⁄ | U+2044 | / | Fraction slash |
| ∕ | U+2215 | / | Division slash |
| ⧸ | U+29F8 | / | Big solidus |
| ／ | U+FF0F | / | Fullwidth solidus |
| ÷ | U+00F7 | / | Division sign |
| Original Character | Unicode | Converted To | Description |
|---|---|---|---|
| × | U+00D7 | * | Multiplication sign |
| ∗ | U+2217 | * | Asterisk operator |
| ✱ | U+2731 | * | Heavy asterisk |
| ＊ | U+FF0A | * | Fullwidth asterisk |
| Original | Unicode | Converted To | Description |
|---|---|---|---|
| À | U+00C0 | A | Latin A with grave |
| Á | U+00C1 | A | Latin A with acute |
| Â | U+00C2 | A | Latin A with circumflex |
| Ã | U+00C3 | A | Latin A with tilde |
| Ā | U+0100 | A | Latin A with macron |
| Ă | U+0102 | A | Latin A with breve |
| Ą | U+0104 | A | Latin A with ogonek |
| Ǎ | U+01CD | A | Latin A with caron |
| á | U+00E1 | a | Latin a with acute |
| â | U+00E2 | a | Latin a with circumflex |
| ã | U+00E3 | a | Latin a with tilde |
| ā | U+0101 | a | Latin a with macron |
| ă | U+0103 | a | Latin a with breve |
| ą | U+0105 | a | Latin a with ogonek |
| ǎ | U+01CE | a | Latin a with caron |
Note: The character à (U+00E0) is natively supported in SMS and is preserved as-is.
| Original | Unicode | Converted To | Description |
|---|---|---|---|
| È | U+00C8 | E | Latin E with grave |
| Ê | U+00CA | E | Latin E with circumflex |
| Ë | U+00CB | E | Latin E with diaeresis |
| Ē | U+0112 | E | Latin E with macron |
| Ĕ | U+0114 | E | Latin E with breve |
| Ė | U+0116 | E | Latin E with dot above |
| Ę | U+0118 | E | Latin E with ogonek |
| Ě | U+011A | E | Latin E with caron |
| ê | U+00EA | e | Latin e with circumflex |
| ë | U+00EB | e | Latin e with diaeresis |
| ē | U+0113 | e | Latin e with macron |
| ĕ | U+0115 | e | Latin e with breve |
| ė | U+0117 | e | Latin e with dot above |
| ę | U+0119 | e | Latin e with ogonek |
| ě | U+011B | e | Latin e with caron |
Note: The characters É (U+00C9), é (U+00E9), and è (U+00E8) are natively supported in SMS.
| Original | Unicode | Converted To | Description |
|---|---|---|---|
| Ì | U+00CC | I | Latin I with grave |
| Í | U+00CD | I | Latin I with acute |
| Î | U+00CE | I | Latin I with circumflex |
| Ï | U+00CF | I | Latin I with diaeresis |
| Ĩ | U+0128 | I | Latin I with tilde |
| Ī | U+012A | I | Latin I with macron |
| Ĭ | U+012C | I | Latin I with breve |
| Į | U+012E | I | Latin I with ogonek |
| İ | U+0130 | I | Latin I with dot above |
| í | U+00ED | i | Latin i with acute |
| î | U+00EE | i | Latin i with circumflex |
| ï | U+00EF | i | Latin i with diaeresis |
| ĩ | U+0129 | i | Latin i with tilde |
| ī | U+012B | i | Latin i with macron |
| ĭ | U+012D | i | Latin i with breve |
| į | U+012F | i | Latin i with ogonek |
| ı | U+0131 | i | Latin dotless i |
Note: The character ì (U+00EC) is natively supported in SMS.
| Original | Unicode | Converted To | Description |
|---|---|---|---|
| Ò | U+00D2 | O | Latin O with grave |
| Ó | U+00D3 | O | Latin O with acute |
| Ô | U+00D4 | O | Latin O with circumflex |
| Õ | U+00D5 | O | Latin O with tilde |
| Ō | U+014C | O | Latin O with macron |
| Ŏ | U+014E | O | Latin O with breve |
| Ő | U+0150 | O | Latin O with double acute |
| Ǒ | U+01D1 | O | Latin O with caron |
| ó | U+00F3 | o | Latin o with acute |
| ô | U+00F4 | o | Latin o with circumflex |
| õ | U+00F5 | o | Latin o with tilde |
| ō | U+014D | o | Latin o with macron |
| ŏ | U+014F | o | Latin o with breve |
| ő | U+0151 | o | Latin o with double acute |
| ǒ | U+01D2 | o | Latin o with caron |
Note: The characters Ö (U+00D6), ö (U+00F6), Ø (U+00D8), ø (U+00F8), and ò (U+00F2) are natively supported in SMS.
| Original | Unicode | Converted To | Description |
|---|---|---|---|
| Ù | U+00D9 | U | Latin U with grave |
| Ú | U+00DA | U | Latin U with acute |
| Û | U+00DB | U | Latin U with circumflex |
| Ũ | U+0168 | U | Latin U with tilde |
| Ū | U+016A | U | Latin U with macron |
| Ŭ | U+016C | U | Latin U with breve |
| Ů | U+016E | U | Latin U with ring above |
| Ű | U+0170 | U | Latin U with double acute |
| Ų | U+0172 | U | Latin U with ogonek |
| Ǔ | U+01D3 | U | Latin U with caron |
| ú | U+00FA | u | Latin u with acute |
| û | U+00FB | u | Latin u with circumflex |
| ũ | U+0169 | u | Latin u with tilde |
| ū | U+016B | u | Latin u with macron |
| ŭ | U+016D | u | Latin u with breve |
| ů | U+016F | u | Latin u with ring above |
| ű | U+0171 | u | Latin u with double acute |
| ų | U+0173 | u | Latin u with ogonek |
| ǔ | U+01D4 | u | Latin u with caron |
Note: The characters Ü (U+00DC), ü (U+00FC), and ù (U+00F9) are natively supported in SMS.
| Original | Unicode | Converted To | Description |
|---|---|---|---|
| Ý | U+00DD | Y | Latin Y with acute |
| Ŷ | U+0176 | Y | Latin Y with circumflex |
| Ÿ | U+0178 | Y | Latin Y with diaeresis |
| ý | U+00FD | y | Latin y with acute |
| ÿ | U+00FF | y | Latin y with diaeresis |
| ŷ | U+0177 | y | Latin y with circumflex |
| Original | Unicode | Converted To | Description |
|---|---|---|---|
| Ć | U+0106 | C | Latin C with acute |
| Ĉ | U+0108 | C | Latin C with circumflex |
| Ċ | U+010A | C | Latin C with dot above |
| Č | U+010C | C | Latin C with caron |
| ç | U+00E7 | c | Latin c with cedilla |
| ć | U+0107 | c | Latin c with acute |
| ĉ | U+0109 | c | Latin c with circumflex |
| ċ | U+010B | c | Latin c with dot above |
| č | U+010D | c | Latin c with caron |
Note: The character Ç (U+00C7) is natively supported in SMS.
| Original | Unicode | Converted To | Description |
|---|---|---|---|
| Ń | U+0143 | N | Latin N with acute |
| Ņ | U+0145 | N | Latin N with cedilla |
| Ň | U+0147 | N | Latin N with caron |
| ń | U+0144 | n | Latin n with acute |
| ņ | U+0146 | n | Latin n with cedilla |
| ň | U+0148 | n | Latin n with caron |
Note: The characters Ñ (U+00D1) and ñ (U+00F1) are natively supported in SMS.
| Original | Unicode | Converted To | Description |
|---|---|---|---|
| Ś | U+015A | S | Latin S with acute |
| Ŝ | U+015C | S | Latin S with circumflex |
| Ş | U+015E | S | Latin S with cedilla |
| Š | U+0160 | S | Latin S with caron |
| ś | U+015B | s | Latin s with acute |
| ŝ | U+015D | s | Latin s with circumflex |
| ş | U+015F | s | Latin s with cedilla |
| š | U+0161 | s | Latin s with caron |
| Original | Unicode | Converted To | Description |
|---|---|---|---|
| Ź | U+0179 | Z | Latin Z with acute |
| Ż | U+017B | Z | Latin Z with dot above |
| Ž | U+017D | Z | Latin Z with caron |
| ź | U+017A | z | Latin z with acute |
| ż | U+017C | z | Latin z with dot above |
| ž | U+017E | z | Latin z with caron |
| Original | Unicode | Converted To | Description |
|---|---|---|---|
| Đ | U+0110 | D | Latin D with stroke |
| đ | U+0111 | d | Latin d with stroke |
| Ð | U+00D0 | D | Latin Eth |
| ð | U+00F0 | d | Latin eth |
| Ğ | U+011E | G | Latin G with breve |
| ğ | U+011F | g | Latin g with breve |
| Ģ | U+0122 | G | Latin G with cedilla |
| ģ | U+0123 | g | Latin g with cedilla |
| Ķ | U+0136 | K | Latin K with cedilla |
| ķ | U+0137 | k | Latin k with cedilla |
| Ĺ | U+0139 | L | Latin L with acute |
| ĺ | U+013A | l | Latin l with acute |
| Ļ | U+013B | L | Latin L with cedilla |
| ļ | U+013C | l | Latin l with cedilla |
| Ľ | U+013D | L | Latin L with caron |
| ľ | U+013E | l | Latin l with caron |
| Ł | U+0141 | L | Latin L with stroke |
| ł | U+0142 | l | Latin l with stroke |
| Ŕ | U+0154 | R | Latin R with acute |
| ŕ | U+0155 | r | Latin r with acute |
| Ŗ | U+0156 | R | Latin R with cedilla |
| ŗ | U+0157 | r | Latin r with cedilla |
| Ř | U+0158 | R | Latin R with caron |
| ř | U+0159 | r | Latin r with caron |
| Ţ | U+0162 | T | Latin T with cedilla |
| ţ | U+0163 | t | Latin t with cedilla |
| Ť | U+0164 | T | Latin T with caron |
| ť | U+0165 | t | Latin t with caron |
| Ŧ | U+0166 | T | Latin T with stroke |
| ŧ | U+0167 | t | Latin t with stroke |
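One way to apply letter mappings like those in the tables above is Python's built-in `str.translate`. The sketch below is an assumption about how a client could mirror the behaviour locally, not a description of the service itself; `LETTER_MAP` is a hypothetical excerpt of the tables and `fold_letters` is an illustrative name. It assumes precomposed (NFC) input, because a base letter followed by a combining mark would not match these single-code-point keys.

```python
# Hypothetical excerpt of the letter tables; keys are precomposed code points.
LETTER_MAP = str.maketrans({
    "\u0141": "L", "\u0142": "l",   # L with stroke
    "\u00F3": "o",                   # o with acute
    "\u0179": "Z", "\u017A": "z",   # Z with acute
    "\u00E7": "c",                   # c with cedilla
    "\u0160": "S", "\u0161": "s",   # S with caron
    "\u011E": "G", "\u011F": "g",   # G with breve
})


def fold_letters(text: str) -> str:
    """Replace mapped letters; characters without an entry pass through."""
    return text.translate(LETTER_MAP)


# Natively supported letters such as e-acute and u-diaeresis have no entry,
# so they are preserved unchanged.
print(fold_letters("\u0141\u00F3d\u017A, Poland"))
print(fold_letters("Caf\u00E9 M\u00FCnchen"))
```

Using an explicit table rather than stripping all diacritics keeps the natively supported letters (for example é, ü, ñ) intact, which matches the notes under each table.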
| Original Character | Unicode | Converted To | Description |
|---|---|---|---|
| ≈ | U+2248 | ~ | Almost equal to |
| Original Character | Unicode | Converted To | Description |
|---|---|---|---|
| （ | U+FF08 | ( | Fullwidth left parenthesis |
| ） | U+FF09 | ) | Fullwidth right parenthesis |
| ［ | U+FF3B | [ | Fullwidth left square bracket |
| ］ | U+FF3D | ] | Fullwidth right square bracket |
| ｛ | U+FF5B | { | Fullwidth left curly bracket |
| ｝ | U+FF5D | } | Fullwidth right curly bracket |
| 〈 | U+2329 | < | Left-pointing angle bracket |
| 〉 | U+232A | > | Right-pointing angle bracket |
| 〈 | U+3008 | < | Left angle bracket |
| 〉 | U+3009 | > | Right angle bracket |
| Original Character | Unicode | Converted To | Description |
|---|---|---|---|
| ‗ | U+2017 | _ | Double low line |
| ‾ | U+203E | - | Overline |
| ⁃ | U+2043 | - | Hyphen bullet |
| ¦ | U+00A6 | \| | Broken bar |
| ！ | U+FF01 | ! | Fullwidth exclamation mark |
| ？ | U+FF1F | ? | Fullwidth question mark |
| ， | U+FF0C | , | Fullwidth comma |
| ． | U+FF0E | . | Fullwidth full stop |
| ： | U+FF1A | : | Fullwidth colon |
| ； | U+FF1B | ; | Fullwidth semicolon |
All fullwidth digits (0-9) and letters (A-Z, a-z) are converted to their standard ASCII equivalents.
| Range | Unicode Range | Converted To |
|---|---|---|
| Fullwidth digits | U+FF10 - U+FF19 | 0-9 |
| Fullwidth uppercase | U+FF21 - U+FF3A | A-Z |
| Fullwidth lowercase | U+FF41 - U+FF5A | a-z |
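The three fullwidth ranges above sit at a fixed distance of 0xFEE0 from their ASCII counterparts, so the mapping can be expressed arithmetically. The following is a minimal sketch of that idea; `fold_fullwidth` and the constant names are illustrative, and nothing here is claimed about how the service implements it.

```python
# Fullwidth forms are offset from ASCII by a constant: U+FF10 - 0xFEE0 = U+0030.
FULLWIDTH_OFFSET = 0xFEE0

# Inclusive code point ranges taken from the table above.
FULLWIDTH_RANGES = [
    (0xFF10, 0xFF19),  # digits 0-9
    (0xFF21, 0xFF3A),  # uppercase A-Z
    (0xFF41, 0xFF5A),  # lowercase a-z
]


def fold_fullwidth(text: str) -> str:
    """Map fullwidth digits and letters to ASCII; leave everything else."""
    out = []
    for ch in text:
        cp = ord(ch)
        if any(lo <= cp <= hi for lo, hi in FULLWIDTH_RANGES):
            out.append(chr(cp - FULLWIDTH_OFFSET))
        else:
            out.append(ch)
    return "".join(out)


print(fold_fullwidth("\uFF21\uFF22\uFF23\uFF11\uFF12\uFF13"))  # prints "ABC123"
```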
| Original Character | Unicode | Converted To | Description |
|---|
| ✗ | U+2717 | x | Ballot X |
| ✘ | U+2718 | x | Heavy ballot X |
The following invisible/control characters are automatically removed from messages:
| Character | Unicode | Description |
|---|---|---|
| (invisible) | U+200B | Zero Width Space |
| (invisible) | U+200C | Zero Width Non-Joiner |
| (invisible) | U+200D | Zero Width Joiner |
| (invisible) | U+2060 | Word Joiner |
| (invisible) | U+FEFF | Byte Order Mark / Zero Width No-Break Space |
| (invisible) | U+00AD | Soft Hyphen |
| (invisible) | U+E0000 - U+E007F | Tags block (128 code points, originally defined for language tagging) |
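Removing the characters listed above amounts to deleting a fixed set of code points. The sketch below shows one way to do that with a regular expression character class; `strip_invisible` is a hypothetical name. Whether the service strips these characters before or after checking the all-or-nothing conversion rule is not stated in this document, so the sketch makes no assumption about ordering.

```python
import re

# Character class covering the code points in the table above,
# including the whole Tags block U+E0000 to U+E007F.
INVISIBLE = re.compile(
    "[\u200B\u200C\u200D\u2060\uFEFF\u00AD\U000E0000-\U000E007F]"
)


def strip_invisible(text: str) -> str:
    """Delete zero-width and other invisible characters from the text."""
    return INVISIBLE.sub("", text)


print(strip_invisible("Hel\u200Blo\uFEFF wor\u00ADld"))  # prints "Hello world"
```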
The following special characters are natively supported by SMS and are preserved as-is:
- $ (Dollar)
- £ (Pound)
- ¥ (Yen)
- € (Euro)
- à (a with grave)
- è, é, É (e variants)
- ì (i with grave)
- ò (o with grave)
- ù (u with grave)
- Ç (C with cedilla)
- Ä, ä, Ö, ö, Ü, ü (German umlauts)
- Ñ, ñ (Spanish n with tilde)
- Å, å (Swedish a with ring)
- Ø, ø (Danish/Norwegian o with stroke)
- Æ, æ (Danish/Norwegian ligature)
- ß (German sharp s)
- Δ (Delta), Φ (Phi), Γ (Gamma), Λ (Lambda)
- Ω (Omega), Π (Pi), Ψ (Psi), Σ (Sigma)
- Θ (Theta), Ξ (Xi)
- @ (At sign)
- _ (Underscore)
- ¡ (Inverted exclamation)
- ¿ (Inverted question mark)
- § (Section sign)
- ¤ (Currency sign)
| Original Text | Converted Text |
|---|---|
| “Smart quotes” | "Smart quotes" |
| It’s working — great! | It's working - great! |
| Price: €50 × 2 | Price: €50 * 2 |
| Café München | Café München |
| Łódź, Poland | Lodz, Poland |
| Cancel: ✗ | Cancel: x |
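To see why conversions like the examples above affect cost, it helps to count message segments before and after. The sketch below uses the standard SMS limits (160 septets for a single GSM 03.38 message, 153 per concatenated part; 70 UTF-16 code units for a single UCS-2 message, 67 per concatenated part). `segment_count` is a hypothetical helper, and the multi-part GSM figure is approximate because a two-septet extension character cannot be split across parts. How the service actually bills segments is not covered here.

```python
import math

GSM_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension-table characters cost two septets each (escape + character).
GSM_EXTENSION = set("^{}\\[~]|€\f")


def segment_count(text: str) -> int:
    """Estimate how many SMS segments a message needs."""
    if all(ch in GSM_BASIC or ch in GSM_EXTENSION for ch in text):
        septets = sum(2 if ch in GSM_EXTENSION else 1 for ch in text)
        # Approximate for multi-part messages; see the note above.
        return 1 if septets <= 160 else math.ceil(septets / 153)
    # Any other character forces UCS-2, counted in UTF-16 code units.
    units = len(text.encode("utf-16-be")) // 2
    return 1 if units <= 70 else math.ceil(units / 67)


original = "It\u2019s working \u2014 great! " * 5   # 110 characters
converted = "It's working - great! " * 5            # 110 characters
print(segment_count(original), segment_count(converted))  # prints "2 1"
```

In this example the curly apostrophe and em dash push the original into UCS-2, so 110 characters need two segments, while the converted text fits in one.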