HTML escaping is the process of converting special HTML characters (such as <, >, &, ") into HTML entities (such as &lt;, &gt;, &amp;, &quot;).
Unescaping is the reverse process: it converts HTML entities back to their original characters.
Why Do We Need HTML Escaping?
- Prevent XSS (Cross-Site Scripting): Escaping is the first line of defense against XSS attacks. It prevents malicious HTML/JavaScript injection and is essential when displaying user-generated content.
- Display HTML Code as Text: Show HTML examples in documentation, render code snippets on web pages, or debug template engines.
- Data Serialization: Safely embed HTML in JSON/XML, prepare content for API responses, or store HTML in databases.
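- Email & Header Safety: Escape special characters in HTML email bodies and in form data that is echoed back into a page. Email subjects and HTTP headers use their own encodings (MIME encoded-words, percent-encoding), so HTML entities are not the right tool there.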
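To make the XSS point concrete, here is a minimal Python sketch of escaping user-generated content before it is placed into markup. The `render_comment` helper and the comment layout are hypothetical, made up for illustration; only `html.escape` is a real standard-library call.

```python
import html


def render_comment(author: str, body: str) -> str:
    """Build an HTML fragment from two untrusted strings (hypothetical helper)."""
    # html.escape converts &, < and >, plus both quote characters by default,
    # so the values are safe in text content and in quoted attributes.
    return (
        '<div class="comment">'
        f"<strong>{html.escape(author)}</strong>: {html.escape(body)}"
        "</div>"
    )


malicious = '<script>alert("XSS")</script>'
fragment = render_comment("Mallory", malicious)
print(fragment)
```

The browser renders the payload as visible text instead of running it, because no literal `<script>` tag survives in the fragment.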
Entity Format Comparison
| Character | Named Entity | Decimal Entity | Hex Entity |
|---|---|---|---|
| < | &lt; | &#60; | &#x3C; |
| > | &gt; | &#62; | &#x3E; |
| & | &amp; | &#38; | &#x26; |
| " | &quot; | &#34; | &#x22; |
| ' | &apos; | &#39; | &#x27; |
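The decimal and hexadecimal columns are simply the character's Unicode code point written in base 10 and base 16. The sketch below derives both forms in Python and checks that every form decodes back to the same character. `numeric_forms` is a hypothetical helper name, and the named entities are listed by hand rather than looked up.

```python
import html


def numeric_forms(ch: str) -> tuple:
    """Return the (decimal, hexadecimal) entity for one character."""
    code_point = ord(ch)
    return f"&#{code_point};", f"&#x{code_point:X};"


# Named entities for the five characters in the table, written out by hand.
named = {"<": "&lt;", ">": "&gt;", "&": "&amp;", '"': "&quot;", "'": "&apos;"}

for ch, name in named.items():
    decimal, hexadecimal = numeric_forms(ch)
    # All three spellings must decode to the same character.
    assert html.unescape(name) == html.unescape(decimal) == html.unescape(hexadecimal) == ch
    print(ch, name, decimal, hexadecimal)
```

Note that `&apos;` is defined in HTML5 and XML but not in HTML 4, which is one reason many escapers emit `&#39;` for the apostrophe instead.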
Code Examples
JavaScript (Browser)
// Escape HTML
function escapeHtml(text) {
  const map = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;'
  };
  return text.replace(/[&<>"']/g, m => map[m]);
}
// Unescape HTML
function unescapeHtml(html) {
  const doc = new DOMParser().parseFromString(html, 'text/html');
  return doc.documentElement.textContent;
}
// Usage
const escaped = escapeHtml('<script>alert("XSS")</script>');
console.log(escaped);
// &lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;
Node.js (using 'he' library)
const he = require('he');
// Encode
const encoded = he.encode('<div class="test">Hello & goodbye</div>');
console.log(encoded);
// &#x3C;div class=&#x22;test&#x22;&#x3E;Hello &#x26; goodbye&#x3C;/div&#x3E; (he.encode defaults to hexadecimal escapes)
// Decode
const decoded = he.decode('&lt;div&gt;Hello&lt;/div&gt;');
console.log(decoded); // <div>Hello</div>
// Different formats
he.encode('<tag>', { useNamedReferences: true }); // &lt;tag&gt;
he.encode('<tag>', { decimal: true }); // &#60;tag&#62;
he.encode('<tag>'); // &#x3C;tag&#x3E; (hexadecimal is the default)
Python
import html
# Escape
escaped = html.escape('<script>alert("XSS")</script>')
print(escaped)
# &lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;
# Unescape
unescaped = html.unescape('&lt;div&gt;Hello&lt;/div&gt;')
print(unescaped) # <div>Hello</div>
Security Best Practices
⚠️ Important Security Notes:
- Escaping is NOT Enough: Combine with Content Security Policy (CSP) and backend validation.
- Context Matters: Use appropriate escaping for HTML content, JavaScript strings, URLs, and SQL.
- Don't Trust User Input: Always validate and sanitize on both client and server side.
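- Use Framework Built-ins: React, Vue, and Angular escape interpolated values by default. Rely on that behavior and avoid bypassing it with raw-HTML APIs such as dangerouslySetInnerHTML or v-html.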
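The "Context Matters" note is easier to see with one untrusted value pushed through several output contexts. This is a rough Python sketch using only the standard library. The sample string and table name are made up, and the choice of encoder per context is one reasonable option rather than the only correct one.

```python
import html
import json
import sqlite3
from urllib.parse import quote

untrusted = 'Tom & "Jerry" <b>'

# HTML text or quoted-attribute context: entity-escape.
html_safe = html.escape(untrusted)

# URL query-parameter context: percent-encode every reserved character.
url_safe = quote(untrusted, safe="")

# JavaScript string context: JSON encoding gives a quoted literal; "<" is
# additionally escaped so the value cannot close an inline <script> element.
js_literal = json.dumps(untrusted).replace("<", "\\u003c")

# SQL context: do not escape by hand; bind the value as a parameter instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")
conn.execute("INSERT INTO notes (body) VALUES (?)", (untrusted,))
stored = conn.execute("SELECT body FROM notes").fetchone()[0]

print(html_safe)
print(url_safe)
print(js_literal)
```

Each context produces a different encoding of the same input, which is why a single HTML-escape pass cannot cover URLs, scripts, or queries.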
Common Use Cases
- XSS Prevention: Escape user input before displaying: <script>alert('XSS')</script> → &lt;script&gt;alert(&#39;XSS&#39;)&lt;/script&gt;
- Displaying Code: Show HTML code examples on web pages by escaping tags.
- JSON with HTML: Prepare HTML fragments for JSON serialization.
- Email Content: Escape special characters in HTML email bodies; subjects and headers need header-specific encoding rather than HTML entities.
- Template Debugging: Unescape rendered template output to see actual HTML.
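HTML escaping is essential for web security and data display. I built this tool after dealing with XSS vulnerabilities in security audits and needing to display user-generated content safely. Whether you are sanitizing form input, debugging a template engine, preparing data for JSON/XML serialization, or teaching web security concepts, correct HTML entity encoding is indispensable. The tool supports three entity formats: named entities (&lt;, &gt;), decimal entities (&#60;, &#62;), and hexadecimal entities (&#x3C;, &#x3E;), each useful in different situations. All processing happens in the browser, so sensitive content is handled safely without sending any data to a server.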
How to Use
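Escape: Paste HTML or text containing special characters into the input panel. The tool automatically converts characters such as <, >, &, ", and ' into HTML entities in real time. Choose your preferred entity format: named (most readable, e.g. &amp;), decimal (widely compatible, e.g. &#38;), or hexadecimal (compact, e.g. &#x26;). Toggle "Preserve line breaks" to keep or remove line breaks. Unescape: Paste HTML containing entities and the tool decodes it back to the original characters. The "Encode all characters" option escapes even non-special characters, which is useful in edge cases that need maximum compatibility.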
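The options described above can be approximated in a few lines of Python. This is only a sketch of how such an encoder might behave, not the tool's actual implementation: the function name `encode` and its parameters `fmt`, `encode_all`, and `keep_newlines` are hypothetical stand-ins for the format selector and the two toggles.

```python
from html.entities import codepoint2name

SPECIAL = "&<>\"'"


def encode(text, fmt="named", encode_all=False, keep_newlines=True):
    """Hypothetical encoder mirroring the tool's options (an assumption)."""
    out = []
    for ch in text:
        if ch in "\r\n":
            # "Preserve line breaks": keep the break, or drop it entirely.
            if keep_newlines:
                out.append(ch)
            continue
        if not encode_all and ch not in SPECIAL:
            out.append(ch)  # ordinary characters pass through unchanged
            continue
        code_point = ord(ch)
        name = codepoint2name.get(code_point)  # HTML 4 names only
        if fmt == "named" and name:
            out.append(f"&{name};")
        elif fmt == "hex":
            out.append(f"&#x{code_point:X};")
        else:
            # Decimal format, and the fallback when no name is known.
            out.append(f"&#{code_point};")
    return "".join(out)


print(encode('<a href="x">'))
print(encode("a<b", fmt="hex"))
print(encode("Hi", fmt="decimal", encode_all=True))
```

In this sketch a character with no HTML 4 name, such as the apostrophe, falls back to its decimal form even when the named format is selected.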
Limitations & Important Notes
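This tool handles the HTML entities defined in the HTML5 specification. It does not remove or sanitize potentially dangerous HTML attributes (such as onclick or onerror) or JavaScript code; it only escapes character representations. For real XSS protection, combine escaping with a Content Security Policy (CSP) and input validation. The tool assumes UTF-8 encoding; other encodings may produce unexpected results. For very large documents (>5MB), browser memory limits may degrade performance. Entity encoding increases text size, so it is not well suited to bandwidth-constrained scenarios.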
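Two of these limitations can be seen directly with Python's `html.escape`, used here as a stand-in for the tool's own encoder (an assumption, since the tool runs in the browser). Escaping leaves an event-handler attribute in the text and only makes the markup inert, and the escaped string is longer than the input.

```python
import html

payload = '<img src=x onerror="alert(1)">'
escaped = html.escape(payload)

# Escaping does not strip anything: the attribute name is still there,
# it is just no longer part of a real tag.
still_present = "onerror" in escaped

# Every escaped character grows from 1 byte to 4-6 bytes.
original_size = len(payload.encode("utf-8"))
escaped_size = len(escaped.encode("utf-8"))

print(escaped)
print(still_present, original_size, escaped_size)
```

If the goal is to accept some HTML while removing dangerous attributes, that calls for a dedicated sanitizer, which is a separate step from escaping.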