HTML escaping is the process of converting special HTML characters (like <, >, &, ") into HTML entities (like &lt;, &gt;, &amp;, &quot;).
Unescaping is the reverse process: converting HTML entities back to their original characters.
Why Do We Need HTML Escaping?
- Prevent XSS (Cross-Site Scripting): Escaping is the first line of defense against XSS attacks. It prevents malicious HTML/JavaScript injection and is essential when displaying user-generated content.
- Display HTML Code as Text: Show HTML examples in documentation, render code snippets on web pages, or debug template engines.
- Data Serialization: Safely embed HTML in JSON/XML, prepare content for API responses, or store HTML in databases.
- Email Safety: Escape special characters in HTML email bodies and handle form data correctly. Email subjects and HTTP headers are not HTML contexts and need their own encodings rather than HTML entities.
Entity Format Comparison
| Character | Named Entity | Decimal Entity | Hex Entity |
|---|---|---|---|
| < | &lt; | &#60; | &#x3C; |
| > | &gt; | &#62; | &#x3E; |
| & | &amp; | &#38; | &#x26; |
| " | &quot; | &#34; | &#x22; |
| ' | &apos; | &#39; | &#x27; |
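The decimal and hex forms in the table follow directly from each character's code point, and the named form comes from a lookup table. The sketch below illustrates this; `entity_forms` is a hypothetical helper written for this example, not a library function. It uses Python's standard `html.entities.codepoint2name` table, which covers HTML 4 names only, so it reports no named form for the apostrophe (`&apos;` comes from XML and HTML5).

```python
from html.entities import codepoint2name


def entity_forms(ch):
    """Return (named, decimal, hex) entity strings for one character.

    Hypothetical helper for illustration. `named` is None when the
    HTML 4 name table has no entry for the character.
    """
    cp = ord(ch)
    name = codepoint2name.get(cp)
    named = f"&{name};" if name else None
    return named, f"&#{cp};", f"&#x{cp:X};"


# Print the three forms for the five characters in the table.
for ch in "<>&\"'":
    print(repr(ch), entity_forms(ch))
```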
Code Examples
JavaScript (Browser)
// Escape HTML
function escapeHtml(text) {
  const map = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;'
  };
  return text.replace(/[&<>"']/g, m => map[m]);
}
// Unescape HTML
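function unescapeHtml(html) {
  // Parse the input as an HTML document and read back the decoded text.
  // Note: real tags in the input are parsed away; only text content is returned.
  const doc = new DOMParser().parseFromString(html, 'text/html');
  return doc.documentElement.textContent;
}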
// Usage
const escaped = escapeHtml('<script>alert("XSS")</script>');
console.log(escaped);
// &lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;
Node.js (using 'he' library)
const he = require('he');
// Encode
const encoded = he.encode('<div class="test">Hello & goodbye</div>');
console.log(encoded);
// &#x3C;div class=&#x22;test&#x22;&#x3E;Hello &#x26; goodbye&#x3C;/div&#x3E; (he emits hexadecimal references by default)
// Decode
const decoded = he.decode('&lt;div&gt;Hello&lt;/div&gt;');
console.log(decoded); // <div>Hello</div>
// Different formats
he.encode('<tag>', { useNamedReferences: true }); // &lt;tag&gt;
he.encode('<tag>', { decimal: true }); // &#60;tag&#62;
he.encode('<tag>'); // &#x3C;tag&#x3E; (hexadecimal is the default, so no option is needed)
Python
import html
# Escape
escaped = html.escape('<script>alert("XSS")</script>')
print(escaped)
# &lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;
# Unescape
unescaped = html.unescape('&lt;div&gt;Hello&lt;/div&gt;')
print(unescaped) # <div>Hello</div>
Security Best Practices
⚠️ Important Security Notes:
- Escaping is NOT Enough: Combine with Content Security Policy (CSP) and backend validation.
- Context Matters: Use appropriate escaping for HTML content, JavaScript strings, URLs, and SQL.
- Don't Trust User Input: Always validate and sanitize on both client and server side.
- Use Framework Built-ins: React, Vue, and Angular escape interpolated values by default, so rely on that rather than hand-rolled escaping.
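To make the "Context Matters" point concrete, here is a minimal sketch using only the Python standard library. The sample input and the `comments` table are assumptions made up for this example. Each output context gets its own encoding, and HTML entities are only correct for the first one.

```python
import html
import json
import sqlite3
from urllib.parse import quote

# Illustrative untrusted input.
user_input = '<a href="x">Tom & Jerry\'s</a>'

# HTML text or attribute context: entity-escape the markup characters.
html_safe = html.escape(user_input, quote=True)

# URL component context: percent-encode instead of entity-encoding.
url_safe = quote(user_input, safe="")

# JSON / JavaScript string-literal context: serialize as a JSON string.
# Caveat: json.dumps leaves "<" as-is, so a "</script>" sequence inside an
# inline <script> block still needs separate handling.
js_safe = json.dumps(user_input)

# SQL context: bind parameters rather than escaping by hand.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (body TEXT)")  # hypothetical table
conn.execute("INSERT INTO comments (body) VALUES (?)", (user_input,))
stored = conn.execute("SELECT body FROM comments").fetchone()[0]

print(html_safe)
print(url_safe)
print(js_safe)
print(stored)
```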
Common Use Cases
- XSS Prevention: Escape user input before displaying: <script>alert('XSS')</script> → &lt;script&gt;alert(&#39;XSS&#39;)&lt;/script&gt;
- Displaying Code: Show HTML code examples on web pages by escaping tags.
- JSON with HTML: Prepare HTML fragments for JSON serialization.
- Email Content: Escape special characters in HTML email bodies; subjects and headers need their own encodings, not HTML entities.
- Template Debugging: Unescape rendered template output to see actual HTML.
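As a rough illustration of the "JSON with HTML" and "Template Debugging" cases, the sketch below carries one fragment through JSON in both raw and escaped form, then unescapes it again. The field names `raw_html` and `display_text` are hypothetical. JSON serialization does its own quoting, so HTML escaping is only needed when the consumer will insert the value into a page as text.

```python
import html
import json

fragment = '<p class="note">Fish & chips</p>'

# json.dumps applies JSON's own quoting, so the raw fragment survives as-is.
# Both field names are hypothetical.
payload = json.dumps({
    "raw_html": fragment,                   # for consumers that render HTML
    "display_text": html.escape(fragment),  # for consumers that show it as text
})

decoded = json.loads(payload)
print(decoded["display_text"])

# Template debugging: unescape the escaped form to recover the real markup.
print(html.unescape(decoded["display_text"]))
```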
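HTML escaping is essential for web security and data display. I built this tool after dealing with XSS vulnerabilities in security audits and needing to display user-generated content safely. Whether you are sanitizing form input, debugging a template engine, preparing data for JSON/XML serialization, or teaching web security concepts, correct HTML entity encoding is indispensable. The tool supports three entity formats: named entities (&lt;, &gt;), decimal entities (&#60;, &#62;), and hexadecimal entities (&#x3C;, &#x3E;), each useful in different contexts. All processing happens in the browser, so sensitive content is handled without sending any data to a server.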
How to Use
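- Escape: Paste HTML or text containing special characters into the input panel. The tool converts characters such as <, >, &, " and ' into HTML entities automatically and in real time. Choose your preferred entity format: named (most readable, e.g. &amp;), decimal (widely compatible, e.g. &#38;), or hexadecimal (compact, e.g. &#x26;). Toggle "Preserve line breaks" to keep or remove line breaks.
- Unescape: Paste HTML containing entities and the tool decodes it back to the original characters.
- Encode all characters: This option escapes even non-special characters, which is useful in edge cases that need maximum compatibility.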
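The options described above can be approximated in a few lines. This is a hypothetical re-creation, not the tool's actual code: the function name `escape_text`, its parameter names, and the choice to fall back to hex references for characters without a name in "named" mode are all assumptions made for this sketch.

```python
import re

# Named references for the five special characters.
NAMED = {"&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&apos;"}


def escape_text(text, fmt="named", preserve_newlines=True, encode_all=False):
    """Hypothetical sketch of the tool's options; not the tool's actual code."""
    if fmt not in ("named", "decimal", "hex"):
        raise ValueError(f"unknown format: {fmt}")
    if not preserve_newlines:
        # Remove line breaks entirely (CRLF, CR, or LF).
        text = re.sub(r"\r\n?|\n", "", text)
    out = []
    for ch in text:
        if ch in "\r\n":
            out.append(ch)  # line breaks pass through unencoded
        elif ch in NAMED and fmt == "named":
            out.append(NAMED[ch])
        elif ch in NAMED or encode_all:
            cp = ord(ch)
            # Assumption: in "named" mode, characters without a name in the
            # small table above fall back to hex references.
            out.append(f"&#{cp};" if fmt == "decimal" else f"&#x{cp:X};")
        else:
            out.append(ch)
    return "".join(out)


print(escape_text('<a href="x">'))
print(escape_text("<a>", fmt="decimal"))
print(escape_text("a<", fmt="hex", encode_all=True))
```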
Limitations & Important Notes
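This tool handles the HTML entities defined in the HTML5 specification. It does not remove or sanitize potentially dangerous HTML attributes (such as onclick or onerror) or JavaScript code; it only escapes how characters are represented. For real XSS protection, combine escaping with a Content Security Policy (CSP) and input validation. The tool assumes UTF-8 encoding; other encodings may produce unexpected results. For very large documents (>5MB), browser memory limits may degrade performance. Entity encoding increases text size, so it is not well suited to bandwidth-constrained scenarios.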
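Two of these limitations are easy to see in a short sketch. It uses Python's standard `html` module as a stand-in for the tool, and the sample payload is an assumption chosen for illustration. Escaping leaves an event-handler attribute in place, merely re-encoded, and the escaped text is longer than the input.

```python
import html

# Sample markup with an inline event handler (illustrative input).
payload = '<img src="x" onerror="alert(1)">'
escaped = html.escape(payload)

# Escaping only rewrites the markup characters; the attribute text is untouched.
print("onerror" in escaped)

# Unescaping restores the original markup, event handler included.
print(html.unescape(escaped) == payload)

# Entity encoding makes the text longer.
print(len(payload), "->", len(escaped))
```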