What Is URL Encoding? A Complete Guide to Percent-Encoding

Published 2026-02-18

What Is URL Encoding?

URL encoding, also known as percent-encoding, is the process of converting characters into a format that can be safely transmitted within a Uniform Resource Locator (URL). When data in a URL contains characters that have special meaning in the URL syntax, or characters that are not allowed in URLs at all, those characters must be replaced with an encoded representation. Each byte of the character's UTF-8 representation is written as a percent sign (%) followed by two hexadecimal digits. For example, a space character becomes %20, the ampersand & becomes %26, and the equals sign = becomes %3D. Percent-encoding is defined in RFC 3986 and is fundamental to how the web works: every time you submit a form, make an API call, or share a link, URL encoding helps your data arrive intact.

Why URL Encoding Is Necessary

URLs were originally designed to carry only a limited set of ASCII characters: the 26 letters of the English alphabet (uppercase and lowercase), the digits 0–9, and a handful of special characters. This limitation exists because URLs travel as plain text (for example in HTTP request lines and headers), and certain characters carry specific semantic meaning within the URL structure. The colon (:) separates the scheme from the rest of the URL. The slash (/) divides path segments. The question mark (?) signals the start of a query string. The ampersand (&) separates query parameters. The hash (#) marks a fragment identifier. If your data contains any of these characters, the URL parser will interpret them as structural delimiters rather than data. Without encoding, a URL like https://example.com/search?q=rock & roll would break: the ampersand would be read as a parameter separator rather than as part of the search query, and a literal space is not valid in a URL at all. URL encoding solves this by replacing problematic characters with their percent-encoded equivalents, giving https://example.com/search?q=rock%20%26%20roll.
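
To make the rock & roll example concrete, here is a minimal Python sketch using the standard library's urllib.parse module. The query strings are only illustrations; the point is what a query-string parser recovers from the unencoded and the encoded form.

```python
from urllib.parse import parse_qs, quote

# Unencoded: the '&' is read as a parameter separator, so " roll" is lost.
raw_query = "q=rock & roll"

# Encoded: the space and the ampersand travel as data.
encoded_query = "q=" + quote("rock & roll", safe="")

print(parse_qs(raw_query))      # {'q': ['rock ']}
print(encoded_query)            # q=rock%20%26%20roll
print(parse_qs(encoded_query))  # {'q': ['rock & roll']}
```

The unencoded query silently drops half of the search term, which is why this class of bug is often noticed late.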

How Percent-Encoding Works: The Mechanics

The percent-encoding process follows a straightforward algorithm. For any character that needs to be encoded, the browser or application: (1) determines the character's UTF-8 byte representation, (2) converts each byte to its two-digit hexadecimal value, and (3) prefixes each hexadecimal value with a percent sign. For ASCII characters (those with code points 0–127), this is simple because each character is exactly one byte. The letter A has ASCII value 65, which is 41 in hexadecimal, so it would encode as %41 (though letters don't need encoding). The space character has ASCII value 32, which is 20 in hex, giving us %20. Characters outside the ASCII range, such as accented letters or emoji, occupy several bytes in UTF-8 and therefore produce several percent-encoded triplets. The character é (e with acute, U+00E9) encodes to two UTF-8 bytes, 0xC3 and 0xA9, resulting in %C3%A9. Understanding this byte-level process helps you debug encoding issues and work with international text in URLs.
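
The three steps can be sketched in a few lines of Python. This is an illustration of the algorithm, not a replacement for library functions; the helper name percent_encode is made up for this example, and it encodes everything except the unreserved characters.

```python
from urllib.parse import quote

# Characters that never need encoding (letters, digits, and - _ . ~).
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_.~"
)

def percent_encode(text: str) -> str:
    """Hypothetical helper: encode every character outside UNRESERVED."""
    out = []
    for ch in text:
        if ch in UNRESERVED:
            out.append(ch)
        else:
            # Steps 1-3: UTF-8 bytes -> two hex digits per byte -> '%' prefix.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)

print(percent_encode(" "))                    # %20
print(percent_encode("\u00e9"))               # %C3%A9  (e with acute)
print(percent_encode("caf\u00e9 au lait"))    # caf%C3%A9%20au%20lait
```

In real code you would normally call urllib.parse.quote(text, safe="") instead; on current Python 3 versions it should give the same result as this sketch.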

Reserved vs. Unreserved Characters

RFC 3986 defines two sets of characters that may appear literally in a URL. Unreserved characters never need encoding and can appear in any URL without modification. These are: uppercase letters A–Z, lowercase letters a–z, digits 0–9, and the four symbols hyphen (-), underscore (_), period (.), and tilde (~). Reserved characters have special meaning in URL syntax and should only appear unencoded when serving their structural purpose. These include: ! # $ & ' ( ) * + , / : ; = ? @ [ ]. When a reserved character appears in data (not as a structural delimiter), it must be percent-encoded. Everything else, including characters outside the ASCII range, control characters, spaces, and most other punctuation, must always be encoded. This distinction explains the difference between encodeURI() and encodeURIComponent() in JavaScript: the former leaves most reserved characters such as / ? & = # untouched (useful for encoding a complete URL), while the latter encodes nearly all of them, leaving only ! ' ( ) * unescaped (essential for encoding individual parameter values).
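
The same distinction can be explored in Python, where urllib.parse.quote takes a safe argument listing characters to leave alone. The sketch below is only loosely analogous to the two JavaScript functions; the constant names and the sample value are made up for illustration.

```python
from urllib.parse import quote

RESERVED = "!#$&'()*+,/:;=?@[]"   # the reserved set listed above
UNRESERVED_SYMBOLS = "-_.~"

value = "a/b?c=d&e"

# Treat the string as one component: every reserved character is data.
as_component = quote(value, safe="")

# Treat the string as URL structure: reserved characters are kept.
as_structure = quote(value, safe=RESERVED)

print(as_component)                        # a%2Fb%3Fc%3Dd%26e
print(as_structure)                        # a/b?c=d&e
print(quote(UNRESERVED_SYMBOLS, safe=""))  # -_.~
```

Passing safe="" is the conservative choice for a single parameter value, because nothing in the value can then be mistaken for a delimiter.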

Common Percent-Encoded Values You'll See

Certain percent-encoded sequences appear so frequently in web development that they're worth memorizing. %20 is the space character; you'll see this in file download URLs and search queries. %2B is the plus sign; note that a literal + in a query string is decoded as a space by parsers that follow the application/x-www-form-urlencoded convention, so a plus sign that is data must be sent as %2B. %3D is the equals sign, often encoded within parameter values. %26 is the ampersand, critical to encode when it appears in values. %2F is the forward slash, which must be encoded in path segments when the slash is data rather than a path separator. %3A is the colon. %3F is the question mark. %40 is the at symbol. %23 is the hash/pound sign. %25 is the percent sign itself: you must encode a literal percent as %25 to distinguish it from the encoding prefix. Double-encoding mistakes (encoding an already-encoded string) turn the % of an existing escape into %25 and produce sequences like %2520 instead of %20, which is a common bug to watch for.
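
Both pitfalls mentioned above, double-encoding and the plus sign, are easy to reproduce. The following Python sketch uses only standard-library functions; the sample strings are arbitrary.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

original = "100% free"
once = quote(original, safe="")   # correct: encoded one time
twice = quote(once, safe="")      # bug: the '%' signs get encoded again

print(once)    # 100%25%20free
print(twice)   # 100%2525%2520free

# Form-style handling of '+': quote_plus writes spaces as '+',
# so a literal plus sign has to become %2B to survive the round trip.
form_encoded = quote_plus("a b+c")
print(form_encoded)                # a+b%2Bc
print(unquote_plus(form_encoded))  # a b+c
print(unquote(form_encoded))       # a+b+c  (plain unquote leaves '+' alone)
```

If you see %25 followed by two hex digits in a log, a double-encoding step somewhere upstream is a likely culprit.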

URL Encoding vs. HTML Encoding vs. Base64

URL encoding is often confused with other encoding techniques. HTML entity encoding (like &amp; for & or &lt; for <) is used to safely embed special characters within HTML markup; it is different from percent-encoding and serves a different purpose. Base64 encoding converts binary data to a text-safe string using 64 printable ASCII characters, and it's commonly used for embedding images in CSS or encoding binary data in email attachments. Base64 is not URL-safe by default (it uses +, /, and =, which need escaping in URLs); there's a URL-safe Base64 variant that replaces + with - and / with _, and typically omits the = padding. JavaScript string escaping (backslash sequences like \n for newline) is completely unrelated to URL encoding. Understanding which encoding to apply in which context prevents security vulnerabilities like injection attacks and ensures data integrity across system boundaries.
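
To see how differently these encodings treat the same input, here is a short Python comparison using the standard library. The sample text and the two-byte payload are arbitrary choices for illustration.

```python
import base64
import html
from urllib.parse import quote

text = "a<b & c>d"
print(quote(text, safe=""))   # a%3Cb%20%26%20c%3Ed   (for URLs)
print(html.escape(text))      # a&lt;b &amp; c&gt;d   (for HTML markup)

# Two bytes chosen so that standard Base64 emits '+', '/', and '=' padding.
payload = bytes([0xFB, 0xFF])
standard = base64.b64encode(payload).decode("ascii")
url_safe = base64.urlsafe_b64encode(payload).decode("ascii")
url_safe_unpadded = url_safe.rstrip("=")

print(standard)           # +/8=
print(url_safe)           # -_8=
print(url_safe_unpadded)  # -_8
```

Note that Python's Base64 decoder expects the padding, so a receiver of the unpadded form has to add the = characters back before decoding.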

When and Where to Apply URL Encoding

URL encoding must be applied in specific contexts:

Query parameters: always encode parameter names and values separately before joining them with = and &.
Path segments: encode each segment of a URL path, but do not encode the slash separators between segments.
Form submissions: HTML forms submitted with method="GET" automatically URL-encode field values; the browser handles this for you. With method="POST" and enctype="application/x-www-form-urlencoded", the body is similarly encoded.
API calls: always encode dynamic parameter values in API URLs, especially any user-provided input.
Redirect URLs: when embedding one URL as a parameter within another URL, the inner URL must be fully encoded.

Failing to encode at the right level is a frequent source of bugs. Use language-specific encoding functions rather than manual string replacement; they handle edge cases and character encodings correctly.
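
As a rough illustration of encoding at the right level, the Python sketch below builds a URL from separately encoded path segments and query parameters, including a nested redirect URL. The host, segment names, and parameter names are placeholders invented for this example.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

base = "https://example.com"               # placeholder host
segments = ["files", "Q1/Q2 report.pdf"]   # the '/' in the second segment is data
params = {
    "q": "rock & roll",
    "next": "https://example.com/a?b=c",   # inner URL embedded as a value
}

# Encode each path segment on its own, then join with literal slashes.
path = "/".join(quote(segment, safe="") for segment in segments)

# urlencode encodes names and values separately and joins them with = and &.
# quote_via=quote writes spaces as %20 instead of the form-style '+'.
query = urlencode(params, quote_via=quote)

url = f"{base}/{path}?{query}"
print(url)

# Round trip: the receiver gets the original values back after one decode.
print(parse_qs(urlsplit(url).query))
```

Because the inner URL is fully encoded, its own ? and = cannot be confused with the delimiters of the outer URL.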

Decoding URL-Encoded Strings

URL decoding is the reverse process: reading percent-encoded sequences and converting them back to their original characters. Server-side frameworks and HTTP libraries typically handle decoding automatically — when you read a query parameter in Express, Django, Laravel, or Spring, the value has already been decoded. However, understanding the decoding process matters for debugging, log analysis, and manual URL construction. To decode manually: scan the string for % signs, read the two hexadecimal digits that follow, convert the pair to a byte value, then interpret the bytes as UTF-8. Be cautious with double-decoding: a common security mistake is to decode a URL, find that the result still contains percent-sequences, and decode again. This can allow attackers to bypass security filters by encoding their payloads multiple times. Always decode exactly once at the appropriate layer of your application. Our free URL encoder/decoder tool handles all of this correctly — try it to encode or decode any string instantly in your browser.
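
The manual decoding steps, and the double-decoding risk, can be sketched in Python as follows. The function name percent_decode and the sample payload are hypothetical; in practice urllib.parse.unquote does this job.

```python
from urllib.parse import unquote

HEX_DIGITS = set("0123456789abcdefABCDEF")

def percent_decode(encoded: str) -> str:
    """Hypothetical helper: decode once, %XX -> byte, then read bytes as UTF-8."""
    out = bytearray()
    i = 0
    while i < len(encoded):
        pair = encoded[i + 1:i + 3]
        if encoded[i] == "%" and len(pair) == 2 and set(pair) <= HEX_DIGITS:
            out.append(int(pair, 16))   # two hex digits -> one byte
            i += 3
        else:
            # Not a valid escape: keep the character unchanged.
            out.extend(encoded[i].encode("utf-8"))
            i += 1
    return out.decode("utf-8", errors="replace")

print(percent_decode("caf%C3%A9%20au%20lait"))   # café au lait

# Double-decoding: a filter that checks after one decode sees nothing
# suspicious, but a second decode elsewhere reveals a path traversal.
payload = "%252e%252e%252f"
once = percent_decode(payload)
twice = percent_decode(once)
print(once)    # %2e%2e%2f
print(twice)   # ../
```

The second half shows why the number of decode passes has to be the same at the layer that validates input and the layer that uses it.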
