URL Encoder & Decoder

Reading time: 7 min | Last updated: February 10, 2026 | Algorithm: RFC 3986 percent-encoding
Quick Summary: Paste a URL or text, click Encode or Decode, and get the result instantly. Auto-detect mode identifies if your input is already encoded. The URL component breakdown shows each part of the URL parsed and labeled. All processing is client-side.

RFC 3986 Character Reference

Reserved characters have special meaning in URL syntax. Unreserved characters do not need encoding.

Table of Contents
  1. RFC 3986 URI Syntax
  2. Percent-Encoding Explained
  3. Reserved Characters Table
  4. URL vs URI
  5. Punycode and IDN
  6. Practical Applications
  7. Frequently Asked Questions

RFC 3986 URI Syntax

RFC 3986, published in 2005, defines the generic syntax for Uniform Resource Identifiers (URIs). It obsoleted the earlier RFC 2396 and consolidated the rules for how URIs are structured, parsed, and resolved. The URLs you type in a browser, the API endpoints you call, and the links on a webpage all build on the syntax defined in this specification.

A hierarchical URI has the general structure scheme://authority/path?query#fragment. The scheme identifies the protocol (http, https, ftp, mailto). The authority, which is optional and absent in schemes such as mailto, contains the host (domain name or IP address), optionally preceded by userinfo and followed by a port number. The path identifies the specific resource within the host. The query carries additional data, by convention as key-value pairs. The fragment identifies a secondary resource within the primary resource.
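As a rough illustration of the component breakdown described above, the following sketch uses the standard URL constructor available in browsers and Node.js. The helper name splitUrl and the example URL are made up for this illustration, and this is not necessarily how the tool itself parses input.

```javascript
// Split a URL into the components named in RFC 3986 using the built-in URL API.
// splitUrl is a hypothetical helper, not part of any library.
function splitUrl(input) {
  const u = new URL(input);
  let userinfo = u.username;
  if (u.password) {
    userinfo += ":" + u.password;
  }
  return {
    scheme: u.protocol.replace(/:$/, ""), // "https:" -> "https"
    userinfo: userinfo,                   // empty string when absent
    host: u.hostname,
    port: u.port,                         // empty string when the default port is used
    path: u.pathname,
    query: u.search.replace(/^\?/, ""),   // strip the leading "?"
    fragment: u.hash.replace(/^#/, ""),   // strip the leading "#"
  };
}

const parts = splitUrl("https://user@example.com:8443/docs/page?lang=en&q=test#intro");
console.log(parts);
// scheme "https", userinfo "user", host "example.com", port "8443",
// path "/docs/page", query "lang=en&q=test", fragment "intro"
```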

Each component has specific rules about which characters are allowed literally and which must be percent-encoded. The scheme must start with a letter and can contain letters, digits, plus, hyphen, and period. The host can contain letters, digits, hyphens, and periods. The path can contain unreserved characters plus certain delimiters. The query and fragment have the most permissive character sets but still require encoding for characters outside their allowed sets.
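The scheme rule is simple enough to express as a regular expression. The sketch below is a minimal check under that rule only; the function name isValidScheme is an assumption for illustration, and the check says nothing about whether a scheme is actually registered.

```javascript
// A scheme starts with a letter, followed by letters, digits, "+", "-", or ".".
// isValidScheme is a hypothetical helper name.
function isValidScheme(scheme) {
  return /^[A-Za-z][A-Za-z0-9+.-]*$/.test(scheme);
}

console.log(isValidScheme("https"));   // true
console.log(isValidScheme("svn+ssh")); // true
console.log(isValidScheme("1http"));   // false: starts with a digit
console.log(isValidScheme("my_app"));  // false: underscore is not allowed
```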

Percent-Encoding Explained

Percent-encoding (also called URL encoding) is the mechanism defined by RFC 3986 for representing characters that are not allowed in a particular URI component. The encoding replaces each disallowed byte with a percent sign (%) followed by two hexadecimal digits representing the byte value. For example, a space (ASCII 32, hex 20) becomes %20, and an ampersand (ASCII 38, hex 26) becomes %26.

For multi-byte characters (like non-ASCII Unicode characters), each byte of the UTF-8 representation is individually percent-encoded. The emoji 😀 (U+1F600), for instance, has the UTF-8 encoding F0 9F 98 80 and becomes %F0%9F%98%80 when percent-encoded. This ensures that any Unicode character can be represented in a URI while maintaining compatibility with systems that only support ASCII.
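To make the byte-level mechanics concrete, here is a small sketch that converts text to UTF-8 with TextEncoder and writes every byte as %XX. The function percentEncodeAll is a made-up name, and it deliberately encodes every byte, including unreserved ones, purely to show the mapping; a real encoder would leave unreserved characters alone.

```javascript
// Percent-encode every UTF-8 byte of the input, for illustration only.
// percentEncodeAll is a hypothetical helper; real encoders skip unreserved characters.
function percentEncodeAll(text) {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes
  return Array.from(
    bytes,
    (b) => "%" + b.toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

console.log(percentEncodeAll(" "));         // %20
console.log(percentEncodeAll("&"));         // %26
console.log(percentEncodeAll("\u{1F600}")); // %F0%9F%98%80
```

For characters that need encoding, the built-in encodeURIComponent() produces the same byte sequences.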

A common source of confusion is the difference between encoding an entire URL and encoding a URL component. When encoding a full URL, you want to preserve the structural characters (://?#&=) that give the URL its meaning. When encoding a single component — such as a query parameter value — you must encode all characters that have special meaning in URLs, including those structural characters. JavaScript provides two functions for these distinct purposes: encodeURI() for full URLs and encodeURIComponent() for components.
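The difference is easiest to see side by side. The URL and search text below are invented for the example.

```javascript
const fullUrl = "https://example.com/search?q=cats & dogs";
const value = "cats & dogs";

// encodeURI keeps structural characters such as : / ? & = intact.
const encodedUrl = encodeURI(fullUrl);

// encodeURIComponent encodes them, which is what a parameter value needs.
const encodedValue = encodeURIComponent(value);

// Applying encodeURIComponent to a whole URL destroys its structure.
const mangledUrl = encodeURIComponent(fullUrl);

console.log(encodedUrl);   // https://example.com/search?q=cats%20&%20dogs
console.log(encodedValue); // cats%20%26%20dogs
console.log(mangledUrl);   // https%3A%2F%2Fexample.com%2Fsearch%3Fq%3Dcats%20%26%20dogs
```

Note that encodeURI() leaves the ampersand in the search text untouched, so the first result still splits into two parameters when parsed. Encoding the value on its own avoids that.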

Double encoding is a frequent mistake that occurs when an already-encoded string is encoded again. The percent sign (%) in %20 becomes %25, turning %20 into %2520. This produces a URL that, when decoded, yields %20 rather than the intended space character. This tool detects potential double encoding and displays a warning to help you avoid this pitfall.
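One simple way to flag likely double encoding is to look for %25 followed by two hexadecimal digits. The sketch below is a heuristic under that assumption, with the made-up name looksDoubleEncoded; it is not necessarily the check this tool performs, and it can produce false positives for text that legitimately contains such sequences.

```javascript
// Heuristic: "%25" followed by two hex digits suggests an encoded "%XX" sequence.
// looksDoubleEncoded is a hypothetical helper name.
function looksDoubleEncoded(text) {
  return /%25[0-9A-Fa-f]{2}/.test(text);
}

const once = encodeURIComponent("a b"); // "a%20b"
const twice = encodeURIComponent(once); // "a%2520b"

console.log(looksDoubleEncoded(once));  // false
console.log(looksDoubleEncoded(twice)); // true
console.log(decodeURIComponent(twice)); // a%20b, not "a b"
```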

Reserved Characters Table

RFC 3986 divides characters into three categories: reserved, unreserved, and all others. Reserved characters have special meaning within URI syntax: : / ? # [ ] @ ! $ & ' ( ) * + , ; =. These characters are used as delimiters between URI components and must be percent-encoded when used as data within a component.

Unreserved characters are always safe to use literally in any URI component: A-Z a-z 0-9 - . _ ~. These characters never need to be encoded, though encoding them (e.g., using %41 for the letter A) is allowed and must be treated as equivalent by URI processors.
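JavaScript's encodeURIComponent() leaves five characters unencoded that are outside the RFC 3986 unreserved set: ! ' ( ) and *. If a strict encoding is needed, a common workaround is to post-process the result, as in this sketch. The name encodeRfc3986 is chosen for illustration.

```javascript
// Encode everything except the RFC 3986 unreserved set (A-Z a-z 0-9 - . _ ~).
// encodeURIComponent leaves ! ' ( ) * alone, so those are handled separately.
function encodeRfc3986(value) {
  return encodeURIComponent(value).replace(
    /[!'()*]/g,
    (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase()
  );
}

console.log(encodeURIComponent("it's (ok)!*")); // it's%20(ok)!*
console.log(encodeRfc3986("it's (ok)!*"));      // it%27s%20%28ok%29%21%2A
console.log(encodeRfc3986("Az09-._~"));         // Az09-._~ (unreserved, unchanged)
```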

All other characters, including spaces, non-ASCII Unicode characters, and control characters, must always be percent-encoded. The space character deserves special mention because it has two common encodings: %20 (the standard percent-encoding) and + (the application/x-www-form-urlencoded format used in HTML forms). The + form means a space only in form-encoded query strings and request bodies; in a path, + is a literal plus sign. %20 is the encoding defined by RFC 3986 and is valid in every component.
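The two space encodings are decoded by different APIs in JavaScript, which is a frequent source of bugs. A brief sketch with invented sample values:

```javascript
// decodeURIComponent follows percent-encoding only: "+" stays a plus sign.
const percentDecoded = decodeURIComponent("cats+%26+dogs");

// URLSearchParams follows application/x-www-form-urlencoded: "+" becomes a space.
const formDecoded = new URLSearchParams("q=cats+%26+dogs").get("q");

// "%20" is decoded as a space by both.
const percentSpace = decodeURIComponent("cats%20%26%20dogs");

console.log(percentDecoded); // cats+&+dogs
console.log(formDecoded);    // cats & dogs
console.log(percentSpace);   // cats & dogs
```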

URL vs URI

The terms URL (Uniform Resource Locator) and URI (Uniform Resource Identifier) are often used interchangeably, but they have a subtle technical distinction. A URI is any string that identifies a resource. A URL is a URI that also specifies how to locate that resource by including an access mechanism (protocol).

In practice, the distinction rarely matters for web developers. Every HTTP(S) URL is a URI. The term URI is used in technical specifications (like RFC 3986) because it is more general. The term URL is used in everyday conversation because it is more familiar. The WHATWG URL Standard, which defines how browsers parse URLs, simply uses the term URL and does not draw the distinction.

The third related term, URN (Uniform Resource Name), identifies a resource by name rather than location. Examples include ISBN numbers (urn:isbn:0451450523) and UUIDs (urn:uuid:6e8bc430-9c3a-11d9-9669-0800200c9a66). URNs are URIs but not URLs because they do not specify a retrieval mechanism.

Punycode and IDN

Internationalized Domain Names (IDN) allow domain names to contain non-ASCII characters, such as Chinese characters, Arabic script, or accented Latin letters. Since hostnames in the DNS are restricted to ASCII, IDN uses an encoding called Punycode (defined in RFC 3492) to convert each non-ASCII label of a domain name into an ASCII string prefixed with xn--.

For example, the German domain münchen.de (with the umlaut ü) is encoded as xn--mnchen-3ya.de in Punycode. Browsers display the Unicode form to users but use the Punycode form for DNS resolution. This dual representation is transparent to most users but important for developers who need to validate, compare, or process URLs containing international domain names.
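In JavaScript, the built-in URL parser performs this conversion when it parses a hostname, so the ASCII form can be obtained without a separate Punycode library. The helper name toAsciiHostname is an assumption for this sketch, and it relies on a runtime with standard internationalization support (browsers and default Node.js builds).

```javascript
// Let the URL parser convert an internationalized hostname to its ASCII form.
// toAsciiHostname is a hypothetical helper name.
function toAsciiHostname(hostname) {
  return new URL("https://" + hostname + "/").hostname;
}

console.log(toAsciiHostname("münchen.de"));  // xn--mnchen-3ya.de
console.log(toAsciiHostname("example.com")); // example.com (already ASCII)
```

Comparing hostnames in their ASCII form avoids mismatches between the Unicode and Punycode spellings of the same domain.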

IDN also introduces security considerations. Visually similar characters from different scripts (called homoglyphs) can be used to create domain names that look identical to legitimate domains, a technique known as an IDN homograph attack. Modern browsers mitigate this by displaying the Punycode form instead of the Unicode form when a domain looks suspicious, for example when a label mixes characters from multiple scripts.
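A very rough version of a mixed-script check can be written with Unicode property escapes in regular expressions. The sketch below only looks for Latin and Cyrillic letters in the same label and uses the made-up name mixesLatinAndCyrillic; real browser policies are considerably more elaborate, and this check will not catch a label written entirely in one look-alike script.

```javascript
// Flag a label that contains both Latin and Cyrillic letters.
// mixesLatinAndCyrillic is a hypothetical helper; it is a simplification,
// not a reimplementation of any browser's IDN display policy.
function mixesLatinAndCyrillic(label) {
  const hasLatin = /\p{Script=Latin}/u.test(label);
  const hasCyrillic = /\p{Script=Cyrillic}/u.test(label);
  return hasLatin && hasCyrillic;
}

// "\u0430" is CYRILLIC SMALL LETTER A, which looks like the Latin "a".
console.log(mixesLatinAndCyrillic("\u0430pple")); // true
console.log(mixesLatinAndCyrillic("apple"));      // false
```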

Practical Applications

URL encoding is essential in numerous everyday development scenarios. When constructing API request URLs with query parameters, each parameter value must be encoded to prevent special characters from breaking the URL structure. A search query like "cats & dogs" must be encoded as "cats%20%26%20dogs" in the query string to prevent the ampersand from being interpreted as a parameter separator.
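A minimal sketch of building such a request URL by hand follows. The function buildSearchUrl and the API endpoint are hypothetical and exist only for this example.

```javascript
// Build a query string by encoding every key and value separately.
// buildSearchUrl and the endpoint below are made up for illustration.
function buildSearchUrl(base, params) {
  const query = Object.entries(params)
    .map(([key, val]) => encodeURIComponent(key) + "=" + encodeURIComponent(val))
    .join("&");
  return query ? base + "?" + query : base;
}

const searchUrl = buildSearchUrl("https://api.example.com/search", {
  q: "cats & dogs",
  limit: 10,
});
console.log(searchUrl);
// https://api.example.com/search?q=cats%20%26%20dogs&limit=10
```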

HTML form submissions using the GET method encode form data as URL query parameters using the application/x-www-form-urlencoded format. This format uses + for spaces and percent-encoding for other special characters. Understanding this encoding is important for debugging form submissions and constructing URLs programmatically.
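The built-in URLSearchParams class serializes in this form-encoded style, which makes it a convenient way to reproduce what a GET form would send. The field names and values here are invented.

```javascript
// URLSearchParams serializes as application/x-www-form-urlencoded:
// spaces become "+", other special characters are percent-encoded.
const form = new URLSearchParams({ q: "cats & dogs", page: "1" });
const formQuery = form.toString();

console.log(formQuery); // q=cats+%26+dogs&page=1

// Parsing the same string recovers the original value.
console.log(new URLSearchParams(formQuery).get("q")); // cats & dogs
```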

Redirect URLs often contain encoded URLs as parameter values. For example, an OAuth authorization URL might include redirect_uri=https%3A%2F%2Fexample.com%2Fcallback. The nested encoding ensures that the inner URL is treated as a single parameter value rather than being parsed as part of the outer URL structure.
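The nested encoding can be sketched as follows. The authorization endpoint and client_id are placeholders, not values from any real provider.

```javascript
// Embed one URL inside another as a single query parameter value.
// The authorization endpoint and client_id are placeholders.
const callback = "https://example.com/callback";
const authUrl =
  "https://auth.example.org/authorize?client_id=demo&redirect_uri=" +
  encodeURIComponent(callback);

console.log(authUrl);
// https://auth.example.org/authorize?client_id=demo&redirect_uri=https%3A%2F%2Fexample.com%2Fcallback

// The receiving side gets the inner URL back intact.
const recovered = new URL(authUrl).searchParams.get("redirect_uri");
console.log(recovered); // https://example.com/callback
```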

Frequently Asked Questions

What is the difference between encodeURI and encodeURIComponent?

encodeURI() encodes a complete URL, preserving characters that have special meaning in URLs like : / ? # & =. encodeURIComponent() encodes a single URL component, encoding all special characters. Use encodeURIComponent() for query parameter values and encodeURI() for full URLs.

What is percent-encoding?

Percent-encoding replaces unsafe or reserved characters with a percent sign followed by two hexadecimal digits for each byte of the character's UTF-8 encoding. For example, a space becomes %20, an ampersand becomes %26, and a forward slash becomes %2F.

Which characters need to be encoded in URLs?

Reserved characters (: / ? # [ ] @ ! $ & ' ( ) * + , ; =) and all characters outside the unreserved set (A-Z a-z 0-9 - . _ ~) must be percent-encoded when used as data within a URL component. The exact set depends on which component of the URL you are constructing.

What is the difference between a URL and a URI?

A URI (Uniform Resource Identifier) is a general identifier for a resource. A URL (Uniform Resource Locator) is a URI that includes the access mechanism (protocol). All URLs are URIs, but not all URIs are URLs. In practice, the terms are often used interchangeably.

What is double encoding?

Double encoding occurs when an already percent-encoded string is encoded again, turning %20 into %2520. This produces incorrect URLs that decode to percent-encoded strings instead of the intended characters. This tool detects and warns about double encoding.

Is my data sent to a server?

No. All encoding and decoding happens entirely in your browser using JavaScript's built-in encodeURIComponent() and decodeURIComponent() functions. No data is transmitted to any server.

Can I encode multiple URLs at once?

Yes. Use batch mode by entering one URL per line. All URLs will be encoded or decoded simultaneously. Each URL is processed independently, so an error in one URL does not affect the others.
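Per-line error isolation of this kind can be sketched with a try/catch around each line. This is an illustration under assumed names (decodeBatch and the result shape are made up), not the tool's actual batch implementation.

```javascript
// Decode each line independently so one malformed line does not stop the rest.
// decodeBatch and its result objects are hypothetical.
function decodeBatch(text) {
  return text.split(/\r?\n/).map((line) => {
    try {
      return { input: line, ok: true, output: decodeURIComponent(line) };
    } catch (err) {
      // decodeURIComponent throws URIError on malformed sequences such as a lone "%".
      return { input: line, ok: false, error: err.message };
    }
  });
}

const results = decodeBatch("a%20b\n%\nc%26d");
console.log(results.map((r) => (r.ok ? r.output : "error")));
// [ 'a b', 'error', 'c&d' ]
```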

Thibault Besson-Magdelain
Developer and technical writer focused on building practical web tools. Creator of TextToBinary.net.