What is Base64 Encoding?

Last updated: February 2026

A comprehensive guide to the Base64 encoding format, algorithm, and practical applications

What Is Base64 Encoding?

Base64 is a binary-to-text encoding scheme defined in RFC 4648 that represents binary data using 64 printable ASCII characters. The algorithm converts every 3 input bytes (24 bits) into 4 output characters, producing a 33.3% size overhead compared to the original binary data.

The name "Base64" refers to the 64-character alphabet used for encoding. Each character represents exactly 6 bits of data (2^6 = 64). This encoding exists because many transport protocols and storage systems only support text data. Email (SMTP), JSON, XML, HTML, and URL query strings all require text-safe representations of binary content. Base64 bridges this gap by transforming arbitrary bytes into a string that passes through text-only channels without corruption.

The encoding process is stateless and deterministic: the same input always produces the same output. Base64 does not compress data, does not encrypt data, and does not add error-checking capabilities. Its sole purpose is format conversion from binary to text.

Use the Base64 text encoder to convert any string to Base64, or the Base64 image encoder to encode image files directly in your browser.

How Does the Base64 Algorithm Work?

The Base64 algorithm processes input data in groups of 3 bytes (24 bits), splitting each group into 4 segments of 6 bits. Each 6-bit segment maps to one of 64 characters in the Base64 alphabet. When the input length is not a multiple of 3, padding characters (=) fill the remaining positions.

Step-by-Step Encoding Process

The algorithm follows 5 steps for each 3-byte group:

  1. Read 3 bytes from the input stream (24 bits total).
  2. Concatenate the 3 bytes into a single 24-bit binary number.
  3. Split the 24-bit number into 4 groups of 6 bits each.
  4. Map each 6-bit value (0-63) to the corresponding character in the Base64 alphabet.
  5. Append padding (=) if the input had fewer than 3 bytes remaining.
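The five steps above can be sketched as a small Python function. This is an illustrative, unoptimized version; the names ALPHABET and encode_base64 are chosen for this example, and production code would normally call the standard library's base64 module instead.

```python
import base64

# Standard RFC 4648 alphabet: index 0-63 maps to one character.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_base64(data: bytes) -> str:
    """Encode bytes as standard Base64, following the 5 steps literally."""
    out = []
    for i in range(0, len(data), 3):
        group = data[i:i + 3]                      # step 1: read up to 3 bytes
        missing = 3 - len(group)                   # 0, 1, or 2 bytes short
        # step 2: concatenate into one 24-bit number (zero-fill a short group)
        number = int.from_bytes(group + b"\x00" * missing, "big")
        # steps 3 and 4: split into four 6-bit values and map each to a character
        chars = [ALPHABET[(number >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # step 5: characters that carry no input bits are replaced with "="
        if missing:
            chars[-missing:] = "=" * missing
        out.extend(chars)
    return "".join(out)


# Cross-check against the standard library implementation.
matches_stdlib = encode_base64(b"Hello") == base64.b64encode(b"Hello").decode("ascii")
```

Zero-filling the final group keeps the bit arithmetic identical for every group; only the padding step distinguishes a short group from a full one.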

Encoding Example: "Man"

The word "Man" consists of 3 ASCII bytes: M (77), a (97), n (110). The table below traces each step of the encoding process.

| Step | Data | Value |
| --- | --- | --- |
| Input characters | M, a, n | 3 ASCII characters |
| Decimal values | 77, 97, 110 | 3 bytes |
| Binary (8-bit each) | 01001101 01100001 01101110 | 24 bits |
| Split into 6-bit groups | 010011 010110 000101 101110 | 4 groups |
| Decimal index values | 19, 22, 5, 46 | 4 indices (0-63) |
| Base64 characters | T, W, F, u | 4 output characters |
| Result | TWFu | |

The 3-byte input "Man" becomes the 4-character output "TWFu". This 3:4 ratio applies to every group, producing the characteristic 33.3% size increase. For a detailed walkthrough of the algorithm with additional examples, see the Base64 algorithm reference.
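The trace for "Man" can be reproduced in a few lines of Python. This is only a sketch for inspecting the intermediate values; the variable names are illustrative.

```python
import base64

data = b"Man"

# 8-bit binary form of each byte, joined into one 24-bit string.
bits = "".join(f"{byte:08b}" for byte in data)

# Four 6-bit groups and their decimal index values.
groups = [bits[i:i + 6] for i in range(0, len(bits), 6)]
indices = [int(group, 2) for group in groups]

# The standard library produces the final characters.
encoded = base64.b64encode(data).decode("ascii")

print(bits)     # 010011010110000101101110
print(groups)   # ['010011', '010110', '000101', '101110']
print(indices)  # [19, 22, 5, 46]
print(encoded)  # TWFu
```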

What Characters Does Base64 Use?

The standard Base64 alphabet defined in RFC 4648 Section 4 contains 64 characters: 26 uppercase letters (A-Z), 26 lowercase letters (a-z), 10 digits (0-9), plus sign (+), and forward slash (/). The equals sign (=) serves as the padding character. Each character maps to a specific 6-bit index value from 0 to 63.

| Index Range | Characters | Count |
| --- | --- | --- |
| 0 - 25 | A B C D E F G H I J K L M N O P Q R S T U V W X Y Z | 26 |
| 26 - 51 | a b c d e f g h i j k l m n o p q r s t u v w x y z | 26 |
| 52 - 61 | 0 1 2 3 4 5 6 7 8 9 | 10 |
| 62 | + | 1 |
| 63 | / | 1 |
| Padding | = | 1 |

For the complete index-to-character mapping with binary values, see the Base64 character table.
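As a rough illustration, the alphabet can be assembled from Python's string constants, which makes the index ranges easy to verify. The names STANDARD_ALPHABET and INDEX_OF are invented for this sketch.

```python
import string

# Uppercase (0-25), lowercase (26-51), digits (52-61), then "+" (62) and "/" (63).
STANDARD_ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
)

# Reverse lookup used when decoding: character -> 6-bit index.
INDEX_OF = {char: index for index, char in enumerate(STANDARD_ALPHABET)}

print(len(STANDARD_ALPHABET))           # 64
print(INDEX_OF["A"], INDEX_OF["z"])     # 0 51
print(INDEX_OF["+"], INDEX_OF["/"])     # 62 63
```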

URL-Safe Base64 Variant

RFC 4648 Section 5 defines a URL-safe variant that replaces 2 characters from the standard alphabet. The plus sign (+) becomes a hyphen (-), and the forward slash (/) becomes an underscore (_). These substitutions prevent conflicts with URL encoding, where + represents a space and / is a path separator. The URL-safe variant is used in JWT tokens, filename encoding, and URL query parameters.

| Index | Standard (RFC 4648 §4) | URL-Safe (RFC 4648 §5) |
| --- | --- | --- |
| 62 | + | - |
| 63 | / | _ |
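The difference between the two alphabets is easiest to see with input that encodes only to indices 62 and 63. The sketch below uses Python's base64 module; the three sample bytes were picked for that purpose and have no other meaning.

```python
import base64

# These 3 bytes split into the 6-bit values 62, 63, 62, 63.
raw = bytes([0xFB, 0xFF, 0xBF])

standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")

print(standard)  # +/+/
print(url_safe)  # -_-_

# The variants differ only in those two characters, so a plain
# character translation converts one form into the other.
converted = standard.translate(str.maketrans("+/", "-_"))
```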

Use the URL-safe Base64 encoder to generate URL-compatible Base64 strings.

How Does Base64 Padding Work?

Base64 padding uses the = character to ensure the encoded output length is always a multiple of 4 characters. Padding is required when the input byte count is not evenly divisible by 3. The number of padding characters depends on the remainder of the input length divided by 3.

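| Input Bytes mod 3 | Remaining Bytes | Padding Added | Output Characters (n = number of complete 3-byte groups) |
| --- | --- | --- | --- |
| 0 | 0 (exact multiple of 3) | None | 4n characters |
| 1 | 1 byte (8 bits) | == | 4n + 4 characters |
| 2 | 2 bytes (16 bits) | = | 4n + 4 characters |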
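The padding rule reduces to two small formulas. The helper names below are illustrative, and the sketch assumes padded, unwrapped output (no MIME line breaks).

```python
import base64


def padding_count(n_bytes: int) -> int:
    """Number of '=' characters appended for an input of n_bytes."""
    return (3 - n_bytes % 3) % 3


def encoded_length(n_bytes: int) -> int:
    """Length of the padded Base64 output: 4 characters per started 3-byte group."""
    return 4 * ((n_bytes + 2) // 3)


# (padding, output length) for a few sample inputs.
examples = {
    text: (padding_count(len(text)), encoded_length(len(text)))
    for text in (b"Man", b"Ma", b"M", b"Hello")
}
print(examples)
# {b'Man': (0, 4), b'Ma': (1, 4), b'M': (2, 4), b'Hello': (1, 8)}
```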

Padding Examples

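| Input | Byte Count | mod 3 | Base64 Output | Padding |
| --- | --- | --- | --- | --- |
| Man | 3 | 0 | TWFu | None |
| Ma | 2 | 2 | TWE= | 1 pad |
| M | 1 | 1 | TQ== | 2 pads |
| Base64 | 6 | 0 | QmFzZTY0 | None |
| Hello | 5 | 2 | SGVsbG8= | 1 pad |
| A | 1 | 1 | QQ== | 2 pads |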

When 1 input byte remains, the algorithm produces 2 Base64 characters plus ==. When 2 input bytes remain, it produces 3 Base64 characters plus =. Padding allows the decoder to determine the exact number of original bytes without external metadata. Use the Base64 validator to check whether a string has correct padding.
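A padding check can be approximated with Python's strict decoding mode. This is a minimal sketch, not the validator tool's actual logic: is_valid_base64 is a hypothetical helper, it accepts only the standard alphabet, and it does not reject every non-canonical encoding (for example, stray non-zero bits before the padding).

```python
import base64
import binascii


def is_valid_base64(text: str) -> bool:
    """Return True if text decodes as padded, standard-alphabet Base64."""
    try:
        # validate=True rejects characters outside the alphabet instead of
        # silently skipping them; incorrect padding raises binascii.Error.
        base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


print(is_valid_base64("TWE="))   # True
print(is_valid_base64("TWE"))    # False (missing padding)
print(is_valid_base64("TW E="))  # False (space is not in the alphabet)
```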

What Are the Common Use Cases for Base64?

Base64 encoding is used in 6 primary contexts where binary data must travel through text-based systems. Each use case exploits the same property: Base64 output consists entirely of printable ASCII characters that survive text processing without corruption.

Data URIs for Web Images

Data URIs embed Base64-encoded images directly in HTML and CSS, eliminating separate HTTP requests. The format is data:image/png;base64,[encoded data]. This technique reduces latency for small images (under 10KB) by avoiding network round-trips. Larger images should remain as external files because Base64 adds 33% overhead and an embedded image cannot be cached independently of the document that contains it.
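Building a data URI takes one line once the bytes are available. In this sketch, to_data_uri is an illustrative helper, and the 8-byte PNG signature stands in for a real image file.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Wrap raw bytes in a Base64 data URI with the given MIME type."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Only the 8-byte PNG file signature, used here as placeholder image data.
png_signature = b"\x89PNG\r\n\x1a\n"

print(to_data_uri(png_signature, "image/png"))
# data:image/png;base64,iVBORw0KGgo=
```

For a real image, the bytes would come from reading the file in binary mode, for example with pathlib.Path.read_bytes().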

Convert images to data URIs using the Base64 image encoder or generate ready-to-use HTML and CSS embed code with the Base64 embed code generator.

Email Attachments (MIME)

MIME (Multipurpose Internet Mail Extensions), defined in RFC 2045, uses Base64 to encode email attachments. SMTP (Simple Mail Transfer Protocol) was designed for 7-bit ASCII text and cannot transport raw binary data. MIME Base64 wraps encoded output at 76 characters per line with CRLF line endings. This format allows binary files (images, PDFs, archives) to travel through email infrastructure without corruption.
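The 76-character wrapping can be sketched by encoding first and then splitting the result. The helper name mime_base64 is made up for this example; real mail code would typically rely on Python's email package rather than wrapping by hand.

```python
import base64


def mime_base64(data: bytes, line_length: int = 76) -> str:
    """Base64-encode data and wrap it at line_length characters with CRLF."""
    encoded = base64.b64encode(data).decode("ascii")
    lines = [
        encoded[i:i + line_length] for i in range(0, len(encoded), line_length)
    ]
    return "\r\n".join(lines)


attachment = bytes(range(256))          # placeholder binary content
wrapped = mime_base64(attachment)

# 256 bytes encode to 344 characters: four full lines plus one of 40.
line_lengths = [len(line) for line in wrapped.split("\r\n")]
print(line_lengths)  # [76, 76, 76, 76, 40]

# The default (non-strict) decoder skips the line breaks.
restored = base64.b64decode(wrapped)
```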

JWT Tokens

JSON Web Tokens (JWT), defined in RFC 7519, use URL-safe Base64 (base64url) to encode the header and payload segments. A JWT consists of 3 base64url-encoded parts separated by periods: header.payload.signature. The URL-safe variant avoids conflicts with URL special characters. Convert between standard and URL-safe formats using the URL-safe Base64 tool.
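The sketch below assembles a token-shaped string and reads its header and payload back. The claims and the signature bytes are placeholders, so this is not a valid signed JWT; it only shows the base64url handling, including restoring the padding that JWTs omit.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    """base64url without trailing '=' padding, as used in JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Restore the stripped padding, then decode."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


header = {"alg": "HS256", "typ": "JWT"}
payload = {"sub": "1234567890"}          # hypothetical claim

token = ".".join([
    b64url_encode(json.dumps(header).encode("utf-8")),
    b64url_encode(json.dumps(payload).encode("utf-8")),
    b64url_encode(b"not-a-real-signature"),   # placeholder, not a real MAC
])

# Anyone holding the token can read the first two segments without a key.
header_segment, payload_segment, _signature = token.split(".")
decoded_header = json.loads(b64url_decode(header_segment))
decoded_payload = json.loads(b64url_decode(payload_segment))
```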

API Payloads

REST APIs frequently transmit binary data (images, documents, certificates) as Base64-encoded strings within JSON payloads. JSON does not support raw binary data, so Base64 provides the text representation. In MIME-style multipart bodies, the Content-Transfer-Encoding: base64 header signals that a part contains Base64 data. OpenAPI 3.0 (Swagger) schemas describe Base64-encoded fields as type: string with format: byte.
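A round trip through JSON might look like the following. The field names filename and content are hypothetical and not taken from any particular API.

```python
import base64
import json

# Bytes that are not valid UTF-8 text and so cannot be placed in JSON directly.
document = bytes([0x00, 0xFF, 0x10, 0x80])

# Sender: Base64-encode the bytes and place the string in the JSON body.
body = json.dumps({
    "filename": "example.bin",                              # hypothetical field
    "content": base64.b64encode(document).decode("ascii"),  # hypothetical field
})

# Receiver: parse the JSON, then decode the string back to bytes.
received = json.loads(body)
restored = base64.b64decode(received["content"])
```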

CSS Embedding

CSS files embed small images (icons, backgrounds, patterns) as Base64 data URIs in background-image properties. This bundles image data with the stylesheet, reducing the total number of HTTP requests. The embed code generator produces ready-to-use CSS snippets with data URIs.

Database and File Storage

Databases that lack binary column types store binary values as Base64 text strings. Configuration files (JSON, YAML, XML) embed binary content as Base64 values. Source code embeds small binary resources (certificates, keys, icons) as Base64 string literals to avoid external file dependencies.

What Is the Difference Between Base64, Base32, and Base16?

Base64, Base32, and Base16 are all binary-to-text encoding schemes defined in RFC 4648. They differ in alphabet size, bits per character, size overhead, and intended use cases. Base64 provides the most compact output, Base16 provides the most human-readable output, and Base32 balances readability with compactness.

| Property | Base64 | Base32 | Base16 (Hex) |
| --- | --- | --- | --- |
| Alphabet size | 64 characters | 32 characters | 16 characters |
| Bits per character | 6 bits | 5 bits | 4 bits |
| Size overhead | 33% (4:3 ratio) | 60% (8:5 ratio) | 100% (2:1 ratio) |
| RFC | RFC 4648 §4 | RFC 4648 §6 | RFC 4648 §8 |
| Padding character | = | = | None |
| Case sensitive | Yes | No (A-Z, 2-7) | No (0-9, A-F) |
| Common use | Email, data URIs, APIs | TOTP secret keys, Crockford-style IDs | Hex dumps, checksums, colors |
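The overhead figures can be checked with Python's base64 module, which implements all three encodings. A 15-byte input is used here because it is a multiple of both 3 and 5, so neither Base64 nor Base32 needs padding and the ratios come out exact.

```python
import base64

data = bytes(range(15))  # 15 arbitrary bytes

sizes = {
    "base64": len(base64.b64encode(data)),
    "base32": len(base64.b32encode(data)),
    "base16": len(base64.b16encode(data)),
}

# Overhead relative to the original byte count, in percent.
overhead = {
    name: round((size - len(data)) / len(data) * 100, 1)
    for name, size in sizes.items()
}

print(sizes)     # {'base64': 20, 'base32': 24, 'base16': 30}
print(overhead)  # {'base64': 33.3, 'base32': 60.0, 'base16': 100.0}
```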

For converting between Base64 and hexadecimal (Base16), use the Base64 to HEX converter.

What Are Data URIs and How Do They Use Base64?

A data URI is an inline data scheme defined in RFC 2397 that embeds file content directly in HTML, CSS, or JavaScript using the format data:[mediatype][;base64],<data>. When the ;base64 token is present, the data portion contains Base64-encoded binary content.

Data URI Structure

data:[<MIME type>][;base64],<encoded data>

Examples:
data:image/png;base64,iVBORw0KGgo...
data:image/svg+xml;base64,PHN2ZyB4...
data:text/plain;base64,SGVsbG8gV29ybGQ=
data:application/pdf;base64,JVBERi0x...

Data URI Components

| Component | Description | Example |
| --- | --- | --- |
| data: | URI scheme identifier | data: |
| MIME type | Media type of the encoded data | image/png |
| ;base64 | Encoding declaration | ;base64 |
| , | Separator between metadata and data | , |
| Encoded data | Base64-encoded binary content | iVBORw0KGgo... |

Data URIs eliminate HTTP requests for small resources but increase the HTML/CSS file size by the original resource size plus the 33% Base64 overhead. Images under 10KB typically benefit from data URI embedding. Images above 10KB should remain as external files for browser caching and CDN delivery. For a complete guide, see the data URI reference. Generate data URIs using the image encoder or the embed code generator.
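Going the other way, the components of a data URI can be separated with a few string operations. This is a simplified sketch: parse_data_uri is an illustrative helper, it treats everything before ;base64 as the media type, and it falls back to the RFC 2397 default when the media type is omitted.

```python
import base64
from urllib.parse import unquote_to_bytes

DEFAULT_MEDIA_TYPE = "text/plain;charset=US-ASCII"  # default named in RFC 2397


def parse_data_uri(uri: str) -> tuple[str, bytes]:
    """Split a data URI into (media type, decoded bytes)."""
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    # Everything before the first comma is metadata, the rest is data.
    header, separator, payload = uri[len("data:"):].partition(",")
    if not separator:
        raise ValueError("missing ',' separator")
    if header.endswith(";base64"):
        media_type = header[:-len(";base64")]
        data = base64.b64decode(payload)
    else:
        media_type = header
        data = unquote_to_bytes(payload)   # percent-encoded text form
    return media_type or DEFAULT_MEDIA_TYPE, data


print(parse_data_uri("data:text/plain;base64,SGVsbG8gV29ybGQ="))
# ('text/plain', b'Hello World')
```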

What Are the Limitations of Base64 Encoding?

Base64 encoding has 4 primary limitations: size overhead, lack of security, absence of compression, and increased memory consumption. Understanding these constraints prevents misuse in production systems.

Is Base64 Encoding Secure?

Base64 is not encryption and provides zero security. It is a reversible encoding scheme that anyone can decode without a key, password, or secret. Treating Base64 as a security measure is a common and dangerous mistake.

The Base64 algorithm is deterministic and public. Nearly every mainstream programming language includes built-in Base64 decoding functions (atob() in JavaScript, base64.b64decode() in Python, Base64.getDecoder() in Java). An attacker who intercepts a short Base64-encoded string can decode it in under 1 millisecond, with no key or secret required.
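A two-line Python example makes the point concrete. The "secret" below is a made-up placeholder.

```python
import base64

# A value someone might mistakenly believe is protected by encoding it.
secret = b"api_key=hypothetical-value"
encoded = base64.b64encode(secret).decode("ascii")

# Recovering the original needs no key, password, or other secret input.
recovered = base64.b64decode(encoded)
print(recovered == secret)  # True
```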

What Base64 Does Not Provide

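Base64 provides no confidentiality, no integrity protection, and no authentication: decoding requires no key, and the output carries no checksum or signature. To demonstrate how easily Base64 decodes, paste any encoded string into the Base64 text decoder. The original content appears instantly.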

When Base64 Appears in Security Contexts

Base64 appears in JWT tokens, TLS certificates (PEM format), and SSH keys. In these cases, Base64 is the transport encoding, not the security layer. The actual security comes from cryptographic signatures and message authentication codes (RSA, ECDSA, HMAC) computed over the data before Base64 encoding. Decoding the Base64 layer exposes whatever was encoded: the readable header and payload of a signed JWT, or ciphertext if the data was encrypted before encoding.
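The division of labor can be sketched with Python's hmac module: the HMAC provides the integrity check, and Base64 only makes the resulting bytes safe to transmit as text. The key, message, and helper name verify are hypothetical, and this is not a complete JWT implementation.

```python
import base64
import hashlib
import hmac

key = b"hypothetical-shared-secret"
message = b'{"user":"alice"}'

# Security layer: a keyed HMAC-SHA256 over the message.
signature = hmac.new(key, message, hashlib.sha256).digest()

# Transport layer: the raw 32-byte MAC rendered as URL-safe text.
encoded_signature = base64.urlsafe_b64encode(signature).decode("ascii")


def verify(message: bytes, encoded_signature: str, key: bytes) -> bool:
    """Recompute the MAC and compare it with the decoded one in constant time."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, base64.urlsafe_b64decode(encoded_signature))


print(verify(message, encoded_signature, key))               # True
print(verify(b'{"user":"mallory"}', encoded_signature, key)) # False
```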

What RFC Standards Define Base64?

Four RFC documents define Base64 encoding and its applications. RFC 4648 is the primary specification. The other 3 RFCs define Base64 usage within specific protocols: email (MIME), JSON Web Signatures, and OpenPGP.

| RFC | Title | Year | Scope |
| --- | --- | --- | --- |
| RFC 4648 | The Base16, Base32, and Base64 Data Encodings | 2006 | Primary specification. Defines standard Base64 (Section 4), URL-safe Base64 (Section 5), Base32 (Section 6), and Base16 (Section 8). |
| RFC 2045 | MIME Part One: Format of Internet Message Bodies | 1996 | Defines Base64 as a Content-Transfer-Encoding for email. Specifies 76-character line wrapping with CRLF line endings. |
| RFC 7515 | JSON Web Signature (JWS) | 2015 | Uses base64url encoding (RFC 4648 Section 5) for JWT header and payload segments. Omits padding by default. |
| RFC 4880 | OpenPGP Message Format | 2007 | Uses Radix-64 encoding (a Base64 variant) with a 24-bit CRC checksum appended after the encoded data. |

RFC 4648 obsoletes RFC 3548 (2003) and is the authoritative general-purpose reference for Base64 encoding. RFC 2045 (1996) is not superseded; it remains the governing specification for Base64 within MIME. The URL-safe variant (Section 5), carried over from RFC 3548, addresses the incompatibility of + and / with URL encoding.

Frequently Asked Questions

Is Base64 encoding the same as encryption?

No. Base64 is a reversible encoding scheme, not encryption. Any person or program can decode a Base64 string back to the original binary data without a key. Base64 provides zero confidentiality. For data protection, use encryption algorithms such as AES-256 or RSA.

Why does Base64 increase file size by 33%?

Base64 represents every 3 input bytes (24 bits) as 4 output characters (6 bits each). The ratio 4/3 equals approximately 1.333, producing a 33.3% size increase. Additional overhead comes from padding characters and, in MIME encoding (RFC 2045), line breaks every 76 characters.

Can Base64 encode any type of file?

Yes. Base64 operates on raw bytes, so it encodes any binary data: images (PNG, JPEG, GIF, WebP), PDFs, audio files, video files, executables, ZIP archives, and plain text. The encoded output is always a printable ASCII string regardless of the input format. Encode files using the image encoder or text encoder.

What is the maximum size for Base64 encoding?

No theoretical maximum exists in the Base64 specification (RFC 4648). Practical limits depend on the application: browser JavaScript engines typically handle strings up to 512MB, data URIs in CSS have browser-specific limits (Chrome allows approximately 2MB), and email systems using MIME often cap attachments at 25MB before encoding.

Is Base64 encoding reversible?

Yes. Base64 encoding is fully reversible and lossless. Decoding a Base64 string always produces the exact original binary data, byte for byte. The process is deterministic: the same input always produces the same output, and decoding always recovers the same input. Test this using the Base64 text decoder.