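Regular Expressions

Question

What syntax do JavaScript regular expressions commonly use, and how is it applied in real-world development?

Quick Interview Answer

What syntax do JavaScript regular expressions commonly use? Remember it by function: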

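Character classes: \d digit, \w word character, \s whitespace, . any character (except line terminators); the uppercase forms (\D, \W, \S) negate them.
Character sets and ranges: [abc] any one of the listed characters, [a-z] a range, [^abc] anything except the listed characters.
Quantifiers: * (0 or more), + (1 or more), ? (0 or 1), {n,m}; greedy by default, append ? to make them lazy (e.g. .*?).
Anchors and groups: ^ and $ match the start and end of the input (of each line with the m flag), \b a word boundary, (...) a capturing group, (?:...) a non-capturing group, (?<name>...) a named group.
Assertions: (?=...) positive lookahead, (?!...) negative lookahead, (?<=...) and (?<!...) lookbehind.
Flags: g global, i ignore case, m multiline, s lets . match line terminators, u Unicode, y sticky.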

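How is it applied in real-world development? Common scenarios and the methods that go with them:

Validation: phone numbers, email addresses, national ID numbers, password strength; use regex.test(str).
Extraction: str.match(regex), or str.matchAll(regex), which requires the g flag and yields every match together with its capture groups.
Replacement: str.replace(regex, replacer), where replacer is either a string (which can reference capture groups as $1) or a function.
Splitting: str.split(regex) splits on complex delimiters.
Pitfalls: a RegExp instance with the g (or y) flag carries lastIndex, so repeated test/exec calls have side effects; nested quantifiers such as (a+)+ can trigger catastrophic backtracking and freeze the browser, so work out where alternatives can overlap before writing them.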
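
The lastIndex pitfall is easy to reproduce. The following is only an illustrative sketch; the pattern and the helper name hasDigit are made up for the demonstration.

```javascript
// A shared regex with the g flag remembers where the last match ended.
const digitGlobal = /\d/g;

const first = digitGlobal.test('a1');   // true, lastIndex becomes 2
const second = digitGlobal.test('a1');  // false: the search starts at index 2, then lastIndex resets to 0
const third = digitGlobal.test('a1');   // true again

// One way to avoid the surprise: drop the g flag when only a yes/no answer is needed.
const digitPlain = /\d/;
function hasDigit(text) {
  return digitPlain.test(text); // no g or y flag, so lastIndex is never consulted
}

console.log(first, second, third);            // true false true
console.log(hasDigit('a1'), hasDigit('a1'));  // true true
```

Creating a fresh regex per call, or resetting lastIndex to 0 before each use, would work as well.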

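Answer

A regular expression is a tool for matching patterns in strings. JavaScript supports regular expressions through the RegExp object and a literal syntax.

Creating a Regular Expression

// Literal syntax, /pattern/flags (preferred)
const regex1 = /pattern/g;

// Constructor (for patterns built at runtime)
const regex2 = new RegExp('pattern', 'g');

// Examples
const pattern = /hello/i;  // matches "hello" case-insensitively
const userId = 42;         // example value
const dynamic = new RegExp(`user_${userId}`, 'g');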
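
When the dynamic part of a pattern comes from user input, any regex metacharacters in it need escaping before they reach the RegExp constructor. A minimal sketch, assuming a hypothetical helper named escapeRegExp (it is defined here, not taken from the text above):

```javascript
// Hypothetical helper: backslash-escape every character that has a special meaning in a pattern.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const userInput = 'a.b';

// Unescaped: the dot is a wildcard, so 'axb' matches too.
const loose = new RegExp(`^${userInput}$`);
// Escaped: the dot only matches a literal dot.
const strict = new RegExp(`^${escapeRegExp(userInput)}$`);

console.log(loose.test('axb'));   // true
console.log(strict.test('axb'));  // false
console.log(strict.test('a.b'));  // true
```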

Basic Syntax

Character Classes

// . matches any character except line terminators
/a.c/.test('abc');  // true
/a.c/.test('a\nc'); // false

// \d digit [0-9]
/\d+/.test('123');  // true

// \w word character [a-zA-Z0-9_]
/\w+/.test('hello_123');  // true

// \s whitespace
/\s/.test(' ');  // true
/\s/.test('\t'); // true
/\s/.test('\n'); // true

// The uppercase forms negate the class
// \D non-digit
// \W non-word character
// \S non-whitespace

Character Sets

// [abc] matches a, b, or c
/[abc]/.test('a');  // true
/[abc]/.test('d');  // false

// [a-z] range
/[a-z]/.test('x');  // true
/[0-9]/.test('5');  // true

// [^abc] negation
/[^abc]/.test('d'); // true
/[^abc]/.test('a'); // false

// Combined ranges
/[a-zA-Z0-9_]/.test('A');  // true

Quantifiers

// * zero or more
/ab*c/.test('ac');     // true
/ab*c/.test('abbc');   // true

// + one or more
/ab+c/.test('ac');     // false
/ab+c/.test('abc');    // true

// ? zero or one
/ab?c/.test('ac');     // true
/ab?c/.test('abc');    // true
/ab?c/.test('abbc');   // false

// {n} exactly n
/a{3}/.test('aaa');    // true
/a{3}/.test('aa');     // false

// {n,} at least n
/a{2,}/.test('aa');    // true
/a{2,}/.test('aaa');   // true

// {n,m} between n and m
/a{2,4}/.test('aa');   // true
/a{2,4}/.test('aaaaa'); // true (matches the first 4; the pattern is not anchored)

Anchors and Boundaries

// ^ start of input (start of each line with the m flag)
/^hello/.test('hello world');  // true
/^hello/.test('say hello');    // false

// $ end of input (end of each line with the m flag)
/world$/.test('hello world');  // true
/world$/.test('world hello');  // false

// \b word boundary
/\bcat\b/.test('cat');         // true
/\bcat\b/.test('category');    // false

// \B non-word-boundary
/\Bcat\B/.test('concatenate'); // true

Groups and Backreferences

// () capturing group
const match = 'hello world'.match(/(hello) (world)/);
// ['hello world', 'hello', 'world']

// (?:) non-capturing group
const match2 = 'hello world'.match(/(?:hello) (world)/);
// ['hello world', 'world']

// \1 backreference
/(\w+)\s+\1/.test('hello hello');  // true
/(\w+)\s+\1/.test('hello world');  // false

// Named capture groups
const match3 = 'hello world'.match(/(?<greeting>hello) (?<target>world)/);
// match3.groups = { greeting: 'hello', target: 'world' }
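
Named groups can also be used in replacement strings and in backreferences. The date pattern below is a made-up example for illustration:

```javascript
const datePattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;

// Read the groups by name instead of by position.
const result = '2024-03-15'.match(datePattern);
const { year, month, day } = result.groups;
console.log(year, month, day);  // 2024 03 15

// In a replacement string, $<name> refers to a named group.
const european = '2024-03-15'.replace(datePattern, '$<day>/$<month>/$<year>');
console.log(european);  // 15/03/2024

// \k<name> is the named counterpart of \1 inside the pattern itself.
const repeatedWord = /\b(?<word>\w+)\s+\k<word>\b/;
console.log(repeatedWord.test('hello hello'));  // true
console.log(repeatedWord.test('hello world'));  // false
```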

Alternation

// | alternation (or)
/cat|dog/.test('cat');   // true
/cat|dog/.test('dog');   // true
/cat|dog/.test('bird');  // false

// Combined with a group
/(red|blue) car/.test('red car');   // true
/(red|blue) car/.test('blue car');  // true

Flags

Flag	Description
`g`	Global match
`i`	Case-insensitive
`m`	Multiline mode (^ and $ match at line breaks)
`s`	dotAll mode (. matches line terminators)
`u`	Unicode mode
`y`	Sticky match (must start at lastIndex)

// g - global match
'hello hello'.match(/hello/);   // ['hello']
'hello hello'.match(/hello/g);  // ['hello', 'hello']

// i - ignore case
/hello/i.test('HELLO');  // true

// m - multiline mode
const multiline = `hello
world`;
/^world/m.test(multiline);  // true (matches at the start of the second line)

// s - dotAll
/hello.world/s.test('hello\nworld');  // true

// u - Unicode
/\u{1F600}/u.test('😀');  // true
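
The y flag has no example in the list above. A small sketch of how it behaves, followed by a toy tokenizer (the tokenize function and its token pattern are assumptions made for illustration):

```javascript
// With y, a match must begin exactly at lastIndex; the engine does not scan ahead.
const sticky = /\d+/y;

sticky.lastIndex = 0;
const atStart = sticky.test('abc123');   // false: index 0 is 'a', and lastIndex stays 0

sticky.lastIndex = 3;
const atThree = sticky.test('abc123');   // true: '123' starts exactly at index 3
const endIndex = sticky.lastIndex;       // 6, just past the match

console.log(atStart, atThree, endIndex);  // false true 6

// A tiny tokenizer sketch: each token must start where the previous one ended.
function tokenize(source) {
  const token = /\s*(\d+|[-+*\/()])/y;
  const tokens = [];
  let match;
  while ((match = token.exec(source)) !== null) {
    tokens.push(match[1]);
  }
  return tokens;
}

console.log(tokenize('12 + (3*4)'));  // ['12', '+', '(', '3', '*', '4', ')']
```

Because the pattern is sticky, the loop stops at the first character it does not recognize instead of silently skipping over it.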

Common Methods

RegExp Methods

const regex = /hello/g;
const str = 'hello world hello';

// test - checks whether the pattern matches
regex.test(str);  // true (and, because of the g flag, lastIndex is now 5)

// exec - runs a match and returns detailed information
regex.lastIndex = 0;  // reset so exec starts from the beginning
regex.exec(str);
// ['hello', index: 0, input: 'hello world hello', groups: undefined]
regex.exec(str);
// ['hello', index: 12, input: 'hello world hello', groups: undefined]
regex.exec(str);
// null

String Methods

const str = 'hello world';

// match - returns the match result
str.match(/o/g);   // ['o', 'o']
str.match(/(\w+) (\w+)/);  // ['hello world', 'hello', 'world']

// matchAll - returns an iterator over all matches (requires the g flag)
const matches = str.matchAll(/\w+/g);
for (const match of matches) {
  console.log(match[0], match.index);
}
// hello 0
// world 6

// search - returns the index of the first match
str.search(/world/);  // 6
str.search(/xyz/);    // -1

// replace - replaces matches
str.replace(/world/, 'everyone');  // 'hello everyone'
str.replace(/o/g, '0');            // 'hell0 w0rld'

// replaceAll - replaces every occurrence (ES2021)
str.replaceAll('o', '0');  // 'hell0 w0rld'

// split - splits a string
'a,b;c|d'.split(/[,;|]/);  // ['a', 'b', 'c', 'd']
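
The difference between match with the g flag and matchAll shows up once the pattern has capture groups. A sketch using a made-up log line and group names:

```javascript
const log = 'GET /home 200, POST /login 302, GET /missing 404';
const entry = /(?<method>GET|POST) (?<path>\/\S*) (?<status>\d{3})/g;

// match with g: only the full match strings, capture groups are dropped.
const flat = log.match(entry);
console.log(flat);  // ['GET /home 200', 'POST /login 302', 'GET /missing 404']

// matchAll: one full match object per hit, so groups and index are available.
const requests = Array.from(log.matchAll(entry), (m) => ({
  method: m.groups.method,
  path: m.groups.path,
  status: Number(m.groups.status),
  index: m.index,
}));
console.log(requests[1]);  // { method: 'POST', path: '/login', status: 302, index: 15 }
```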

Special Replacement Patterns

const str = 'hello world';

// $& - the matched substring
str.replace(/\w+/g, '[$&]');  // '[hello] [world]'

// $` - the text before the match
str.replace(/world/, '[$`]');  // 'hello [hello ]'

// $' - the text after the match
str.replace(/hello/, "[$']");  // '[ world] world'

// $1, $2 - capture groups
'hello world'.replace(/(\w+) (\w+)/, '$2 $1');  // 'world hello'

// Replacer function (called once per match)
'hello world'.replace(/\w+/g, (match, offset) => {
  return match.toUpperCase();
});  // 'HELLO WORLD'

Practical Applications

Form Validation

// Email (simplified; does not cover every valid address)
const emailRegex = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
emailRegex.test('user@example.com');  // true

// Mobile phone number (mainland China)
const phoneRegex = /^1[3-9]\d{9}$/;
phoneRegex.test('13812345678');  // true

// Password strength (at least 8 characters, with a lowercase letter, an uppercase letter, and a digit)
const passwordRegex = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$/;
passwordRegex.test('Password123');  // true

// URL
const urlRegex = /^https?:\/\/[\w.-]+(?:\/[\w./?%&=-]*)?$/;
urlRegex.test('https://example.com/path?query=1');  // true

// IPv4 address
const ipRegex = /^(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)$/;
ipRegex.test('192.168.1.1');  // true

String Processing

// Extract numbers
'price: $99.99'.match(/\d+\.?\d*/g);  // ['99.99']

// camelCase to kebab-case
'camelCase'.replace(/([A-Z])/g, '-$1').toLowerCase();  // 'camel-case'

// kebab-case to camelCase
'kebab-case'.replace(/-(\w)/g, (_, c) => c.toUpperCase());  // 'kebabCase'

// Collapse extra whitespace
'  hello   world  '.replace(/\s+/g, ' ').trim();  // 'hello world'

// Extract parenthesized content (the matches include the parentheses)
'hello (world) and (universe)'.match(/\(([^)]+)\)/g);  // ['(world)', '(universe)']

// Template interpolation
const template = 'Hello, {{name}}! You have {{count}} messages.';
const data = { name: 'Alice', count: 5 };
template.replace(/\{\{(\w+)\}\}/g, (_, key) => String(data[key as keyof typeof data]));
// 'Hello, Alice! You have 5 messages.'

Advanced Usage

// Thousands separators (integers only)
function formatNumber(num: number): string {
  return num.toString().replace(/\B(?=(\d{3})+(?!\d))/g, ',');
}
formatNumber(1234567);  // '1,234,567'

// HTML escaping
function escapeHtml(str: string): string {
  const escapeMap: Record<string, string> = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;'
  };
  return str.replace(/[&<>"']/g, char => escapeMap[char]);
}

// Parse URL query parameters
function parseQuery(url: string): Record<string, string> {
  const result: Record<string, string> = {};
  const regex = /[?&]([^=&]+)=([^&]*)/g;
  let match;
  while ((match = regex.exec(url)) !== null) {
    result[decodeURIComponent(match[1])] = decodeURIComponent(match[2]);
  }
  return result;
}
parseQuery('?name=alice&age=25');  // { name: 'alice', age: '25' }

// Highlight a search keyword
function highlight(text: string, keyword: string): string {
  // Escape regex metacharacters so the keyword is matched literally
  const escaped = keyword.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const regex = new RegExp(`(${escaped})`, 'gi');
  return text.replace(regex, '<mark>$1</mark>');
}
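
The thousands-separator pattern above assumes an integer: applied to a number with a fractional part, it also groups the digits after the decimal point. One possible workaround is sketched below; formatDecimal is a hypothetical name, not part of the original examples.

```javascript
// Hypothetical variant of formatNumber that leaves the fractional part untouched.
function formatDecimal(num) {
  const [intPart, fracPart] = String(num).split('.');
  const grouped = intPart.replace(/\B(?=(\d{3})+(?!\d))/g, ',');
  return fracPart === undefined ? grouped : `${grouped}.${fracPart}`;
}

// The one-line version also groups digits after the decimal point.
const naive = String(1234.5678).replace(/\B(?=(\d{3})+(?!\d))/g, ',');

console.log(naive);                      // '1,234.5,678'
console.log(formatDecimal(1234.5678));   // '1,234.5678'
console.log(formatDecimal(-1234567));    // '-1,234,567'
console.log(formatDecimal(999));         // '999'
```

For locale-aware output, Intl.NumberFormat or Number.prototype.toLocaleString is usually a better fit than a regex.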

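Performance Tips

// 1. Hoist the regex out of hot paths
// Less ideal: the literal is evaluated on every call, creating a new RegExp object each time
function validate(email: string): boolean {
  return /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/.test(email);
}

// Better: create it once and reuse it (safe to share because it has no g or y flag)
const emailRegex = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
function validate2(email: string): boolean {
  return emailRegex.test(email);
}

// 2. Avoid patterns that backtrack catastrophically
// Bad: nested quantifiers can split a run of 'a's in exponentially many ways
const badRegex = /(a+)+b/;  // very slow on a long run of 'a's with no 'b' after it

// Better: rewrite the pattern so each input can match in only one way.
// Lazy quantifiers do not help here, and JavaScript has no atomic groups.
const goodRegex = /a+b/;

// 3. Use test rather than match when only a boolean is needed
// Less ideal: match builds a result array that is then discarded
if (str.match(/pattern/)) {}
// Better
if (/pattern/.test(str)) {}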
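
The effect of nested quantifiers can be observed with a rough timing comparison. This is a sketch only: the timeTest helper is hypothetical, absolute timings depend on the engine, and the input is kept short so the slow case still finishes quickly.

```javascript
// Time a single test() call in milliseconds.
function timeTest(regex, input) {
  const start = Date.now();
  const matched = regex.test(input);
  return { matched, ms: Date.now() - start };
}

const nested = /^(a+)+b$/;  // ambiguous: many ways to split a run of 'a's
const flat = /^a+b$/;       // accepts the same strings, with only one way to match

// Both patterns accept and reject the same strings.
for (const sample of ['ab', 'aaab', 'b', 'aac', '']) {
  console.log(JSON.stringify(sample), nested.test(sample), flat.test(sample));
}

// On a failing input, the work for the nested form roughly doubles with each extra 'a'.
const almost = 'a'.repeat(20) + 'c';
console.log(timeTest(nested, almost));
console.log(timeTest(flat, almost));
```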

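Common Interview Questions

Q1: What is the difference between greedy and non-greedy matching?

Answer:

Type	Syntax	Behavior
Greedy	`*`, `+`, `?`, `{n,m}`	Matches as much as possible
Non-greedy	`*?`, `+?`, `??`, `{n,m}?`	Matches as little as possible

const str = '<div>hello</div>';

// Greedy: longest match
str.match(/<.*>/);   // ['<div>hello</div>']

// Non-greedy: shortest match
str.match(/<.*?>/);  // ['<div>']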
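
With the g flag, the same contrast applies to every match. A negated character class is a common alternative to a lazy quantifier; the sample string below is made up for illustration.

```javascript
const html = '<b>bold</b> and <i>italic</i>';

// Greedy: .* runs to the end of the line, then backs up to the last '>'.
const greedy = html.match(/<.*>/g);
// Lazy: .*? stops at the first '>' it can.
const lazy = html.match(/<.*?>/g);
// Negated class: [^>]* can never cross a '>', so the match ends at the first '>' by construction.
const negated = html.match(/<[^>]*>/g);

console.log(greedy);   // ['<b>bold</b> and <i>italic</i>']
console.log(lazy);     // ['<b>', '</b>', '<i>', '</i>']
console.log(negated);  // ['<b>', '</b>', '<i>', '</i>']
```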

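Q2: What are lookahead and lookbehind assertions?

Answer:

Assertion	Syntax	Meaning
Positive lookahead	`(?=...)`	Must be followed by ...
Negative lookahead	`(?!...)`	Must not be followed by ...
Positive lookbehind	`(?<=...)`	Must be preceded by ...
Negative lookbehind	`(?<!...)`	Must not be preceded by ...

// Positive lookahead: letters followed by a digit
'a1b2c3'.match(/[a-z](?=\d)/g);  // ['a', 'b', 'c']

// Negative lookahead: letters not followed by a digit
'a1bc3'.match(/[a-z](?!\d)/g);  // ['b']

// Positive lookbehind: digits preceded by $
'$100 €200'.match(/(?<=\$)\d+/);  // ['100']

// Negative lookbehind: a number not preceded by $
// (the \d in the lookbehind stops the match from starting mid-number, e.g. at the '00' of '$100')
'$100 200'.match(/(?<![$\d])\d+/);  // ['200']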
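
Because lookarounds are zero-width, they can pin down context without becoming part of the match or the replacement. A sketch with a hypothetical maskPhone helper and made-up sample data:

```javascript
// Hide the middle four digits of an 11-digit number without capturing anything.
function maskPhone(phone) {
  return phone.replace(/(?<=^\d{3})\d{4}(?=\d{4}$)/, '****');
}

console.log(maskPhone('13812345678'));  // '138****5678'
console.log(maskPhone('12345'));        // '12345' (wrong length, left unchanged)

// The '$' is required by the lookbehind but is not part of the match.
const prices = 'apple $12, pear $7, 30 items';
const amounts = prices.match(/(?<=\$)\d+/g);
console.log(amounts);  // ['12', '7']
```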

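Q3: How do you match Chinese characters?

Answer:

// Unicode range (covers the commonly used CJK ideographs)
const chineseRegex = /[\u4e00-\u9fa5]+/g;
'hello 你好 world 世界'.match(chineseRegex);  // ['你好', '世界']

// Unicode property escape (ES2018, requires the u flag)
/\p{Script=Han}+/gu.test('你好');  // true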
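
The two approaches differ for characters outside the \u4e00-\u9fa5 range. A short comparison, using U+20BB7 as an example of a Han character stored as a surrogate pair:

```javascript
const bmpRange = /^[\u4e00-\u9fa5]+$/;
const hanScript = /^\p{Script=Han}+$/u;

// Common characters: both approaches agree.
console.log(bmpRange.test('你好'), hanScript.test('你好'));  // true true

// U+20BB7 lies outside the \u4e00-\u9fa5 range, so only the property escape matches it.
const rare = '\u{20BB7}';
console.log(bmpRange.test(rare), hanScript.test(rare));      // false true

// Extracting runs of Han characters from mixed text.
const runs = 'hello 你好 world 世界'.match(/\p{Script=Han}+/gu);
console.log(runs);  // ['你好', '世界']
```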

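Q4: What is the difference between exec and match?

Answer:

Feature	exec	match
Called on	RegExp	String
Without g	Returns the first match with capture groups	Same as exec
With g	Returns one match per call, starting at lastIndex; loop to get them all	Returns all matches, without capture groups
lastIndex	Advanced on each call (with g or y)	Not carried over (with g, the search always starts from 0)

const str = 'hello world';

// Without g: same result
/(\w+)/.exec(str);   // ['hello', 'hello', ...]
str.match(/(\w+)/);  // ['hello', 'hello', ...]

// With g: different results
const regex = /\w+/g;
regex.exec(str);      // ['hello', ...], loop to get the rest
str.match(/\w+/g);    // ['hello', 'world'], everything in one call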
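
The exec loop from the table can be written out as follows. This is a sketch; findAll and the key=value sample are assumptions made for the example.

```javascript
// Collect every match with its capture groups and position by looping exec.
function findAll(regex, text) {
  if (!regex.global) {
    throw new Error('findAll needs the g flag, otherwise exec never advances');
  }
  regex.lastIndex = 0;  // start from the beginning regardless of earlier use
  const found = [];
  let match;
  while ((match = regex.exec(text)) !== null) {
    found.push({ key: match[1], value: match[2], index: match.index });
    // A zero-length match does not move lastIndex, so nudge it to avoid an endless loop.
    if (match[0] === '') regex.lastIndex++;
  }
  return found;
}

const pairs = findAll(/(\w+)=(\w+)/g, 'a=1 b=22 c=333');
console.log(pairs);
// [ { key: 'a', value: '1', index: 0 },
//   { key: 'b', value: '22', index: 4 },
//   { key: 'c', value: '333', index: 9 } ]
```

In newer code, str.matchAll(regex) yields the same match objects without the manual loop.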

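Q5: How do you write a regex that validates password strength?

Answer:

// Requirement: 8-20 characters, must contain lowercase letters, uppercase letters, and digits
const strongPassword = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)[a-zA-Z\d]{8,20}$/;

// Breakdown:
// ^              - start of input
// (?=.*[a-z])    - must contain a lowercase letter
// (?=.*[A-Z])    - must contain an uppercase letter
// (?=.*\d)       - must contain a digit
// [a-zA-Z\d]     - only letters and digits are allowed
// {8,20}         - length 8 to 20
// $              - end of input

strongPassword.test('Password123');   // true
strongPassword.test('password');      // false (no uppercase letter, no digit)
strongPassword.test('PASSWORD123');   // false (no lowercase letter)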
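
A single regex only answers yes or no. When a form needs to say which rule failed, the same requirements can be split into separate patterns. A minimal sketch; passwordRules, checkPassword, and the messages are hypothetical:

```javascript
// Hypothetical helper: report which rules a candidate password violates.
const passwordRules = [
  { test: /^.{8,20}$/,      message: 'must be 8-20 characters long' },
  { test: /[a-z]/,          message: 'needs a lowercase letter' },
  { test: /[A-Z]/,          message: 'needs an uppercase letter' },
  { test: /\d/,             message: 'needs a digit' },
  { test: /^[a-zA-Z\d]*$/,  message: 'may contain only letters and digits' },
];

function checkPassword(password) {
  return passwordRules
    .filter((rule) => !rule.test.test(password))
    .map((rule) => rule.message);
}

console.log(checkPassword('Password123'));  // []
console.log(checkPassword('password'));     // ['needs an uppercase letter', 'needs a digit']
console.log(checkPassword('Ab1!'));         // ['must be 8-20 characters long', 'may contain only letters and digits']
```

None of the rule patterns use the g flag, so sharing them across calls has no lastIndex side effects.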
