Call btoa('Hello 🌍') and the console answers with a red InvalidCharacterError. Plain ASCII works, but an emoji stops it cold — and an accented letter slips through only to decode back as é. The cause is one old assumption baked into a single function.
Why btoa() stops at the first non-Latin character
The browser’s built-in btoa and atob treat every character as a single byte — a value from 0 to 255. That range only holds Latin-1 characters, and the one assumption produces two different failures.
A globe emoji 🌍 sits at code point U+1F30D, far outside the byte range, so btoa refuses it outright with InvalidCharacterError. An accented é (U+00E9) is sneakier: it fits inside 0–255, so btoa accepts it without complaint — but it stores the raw Latin-1 byte instead of the UTF-8 bytes the rest of the web expects. Decode that result as UTF-8 and you get é. The MDN reference for btoa() spells out the same Latin-1 limitation.
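Both failures can be reproduced in a few lines. This is a minimal sketch, assuming a runtime that exposes btoa, atob, and TextDecoder as globals (current browsers and recent Node.js versions); the helper names tryBtoa and decodeAsUtf8 are hypothetical and exist only for this illustration.

```javascript
// Hypothetical helper: reports whether btoa accepted the input.
function tryBtoa(text) {
  try {
    return { ok: true, value: btoa(text) };
  } catch (err) {
    // btoa signals out-of-range characters with an exception.
    return { ok: false, value: err.name };
  }
}

// Hypothetical helper: Base64 -> raw bytes -> text, reading the bytes as UTF-8.
function decodeAsUtf8(base64) {
  const binary = atob(base64);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder('utf-8').decode(bytes);
}

// Failure 1: the emoji is outside 0-255, so btoa throws.
const emojiResult = tryBtoa('Hello 🌍');

// Failure 2: é fits in one byte, so btoa succeeds with the Latin-1 byte 0xE9.
const accentedResult = tryBtoa('café');

// Reading that byte back as UTF-8 yields U+FFFD, the replacement character.
const roundTrip = decodeAsUtf8(accentedResult.value);

console.log(emojiResult.ok);       // false
console.log(accentedResult.value); // "Y2Fm6Q=="
console.log(roundTrip);            // "caf\uFFFD"
```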
Latin-1 directly vs going through UTF-8
The same input takes two different paths depending on how you feed it in.
| Input | btoa directly | Via UTF-8 bytes |
|---|---|---|
Hello | works | works |
café | no error, but decodes to é | works |
🌍 (emoji) | InvalidCharacterError | works |
Pure ASCII is identical in Latin-1 and UTF-8, so both paths agree. Above U+007F the paths split: characters up to U+00FF are quietly stored as the wrong bytes, and anything beyond U+00FF throws.
The fix: turn the string into UTF-8 bytes first
The fix is small. Instead of handing the raw string to btoa, run it through TextEncoder to get a UTF-8 byte array, turn each byte into a one-character string in the 0–255 range, and pass that string to btoa. To decode, reverse the steps: atob, back to a byte array, then new TextDecoder('utf-8'). Every character is now split into single bytes, which is exactly what btoa expects.
Drop Hello 🌍 into the PiPi Worlds Base64 tool and you get SGVsbG8g8J+MjQ==. Paste that value straight back, decode it, and Hello 🌍 returns without losing a single byte — because the tool routes through UTF-8 internally. Apply the same principle in your own code and the error disappears.
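In your own code, the UTF-8 route can be sketched roughly as follows. The function names encodeUtf8Base64 and decodeUtf8Base64 are hypothetical, and this is one plain way to wire up TextEncoder and TextDecoder, not a description of how any particular tool implements it.

```javascript
// Hypothetical helper: text -> UTF-8 bytes -> binary string -> Base64.
function encodeUtf8Base64(text) {
  const bytes = new TextEncoder().encode(text); // Uint8Array of UTF-8 bytes
  let binary = '';
  for (const byte of bytes) {
    // Each byte becomes one character in the 0-255 range btoa accepts.
    binary += String.fromCharCode(byte);
  }
  return btoa(binary);
}

// Hypothetical helper: Base64 -> binary string -> bytes -> UTF-8 text.
function decodeUtf8Base64(base64) {
  const binary = atob(base64);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder('utf-8').decode(bytes);
}

const encoded = encodeUtf8Base64('Hello 🌍');
const decoded = decodeUtf8Base64(encoded);

console.log(encoded); // "SGVsbG8g8J+MjQ=="
console.log(decoded); // "Hello 🌍"
```

Building the binary string in a loop avoids spreading a large byte array into String.fromCharCode, which can exceed argument limits on long inputs.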
Standard vs URL-safe Base64
Standard Base64 uses +, /, and a trailing = for padding. The catch is that those characters mean something else inside a URL or a filename.
That is what the URL-safe variant is for. It swaps + for -, / for _, and strips the = padding. The URL-safe version of the example above is SGVsbG8g8J-MjQ, with the padding gone. The header and payload segments of a JWT are exactly this base64url form, so when you want to read a token, the JWT decoder is the quicker stop. If you need to handle a full value destined for a query string, the URL encoder covers that case.
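The character swap is mechanical enough to sketch in a few lines. The helpers toBase64Url and fromBase64Url below are hypothetical names; they only rewrite the alphabet and padding and assume the input is already well-formed Base64.

```javascript
// Hypothetical helper: standard Base64 -> URL-safe form without padding.
function toBase64Url(base64) {
  return base64
    .replace(/\+/g, '-') // + becomes -
    .replace(/\//g, '_') // / becomes _
    .replace(/=+$/, ''); // drop trailing padding
}

// Hypothetical helper: URL-safe form -> standard Base64 with padding restored.
function fromBase64Url(base64url) {
  const standard = base64url.replace(/-/g, '+').replace(/_/g, '/');
  // Pad back up to a multiple of four characters.
  const missing = (4 - (standard.length % 4)) % 4;
  return standard + '='.repeat(missing);
}

const urlSafe = toBase64Url('SGVsbG8g8J+MjQ==');
const restored = fromBase64Url(urlSafe);

console.log(urlSafe);  // "SGVsbG8g8J-MjQ"
console.log(restored); // "SGVsbG8g8J+MjQ=="
```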
Before you paste: where does that token go?
Base64 is not encryption. It is a reversible encoding that anyone can undo. So when you decode a production access token or a credential, any online tool that ships your input to an unknown server becomes a leak path in its own right.
The PiPi Worlds Base64 tool runs encoding and decoding entirely in your browser, so the value you paste never leaves the page — safe for tokens and private snippets. Decoding auto-detects both standard and URL-safe input and quietly cleans up stray whitespace or line breaks. Paste the value you received and read it back, with no broken characters to chase.
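A forgiving decoder of this kind can be approximated locally. The sketch below is an assumption about one reasonable approach, not the tool's actual code: the hypothetical decodeLenient strips whitespace, maps the URL-safe alphabet back to the standard one, restores padding, and reads the bytes as UTF-8. Nothing in it touches the network.

```javascript
// Hypothetical helper: accepts standard or URL-safe Base64, with or without
// padding, and tolerates whitespace or line breaks inside the value.
function decodeLenient(input) {
  const compact = input.replace(/\s+/g, ''); // remove spaces and line breaks
  const standard = compact
    .replace(/-/g, '+')
    .replace(/_/g, '/')
    .replace(/=+$/, ''); // normalize by dropping any existing padding
  const padded = standard + '='.repeat((4 - (standard.length % 4)) % 4);

  const binary = atob(padded);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder('utf-8').decode(bytes);
}

const fromStandard = decodeLenient('  SGVsbG8g8J+MjQ==  ');
const fromUrlSafe = decodeLenient('SGVsbG8g\n8J-MjQ');

console.log(fromStandard); // "Hello 🌍"
console.log(fromUrlSafe);  // "Hello 🌍"
```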