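What is a reliable way in Java to detect whether a Chinese Unicode string contains Simplified or Traditional Chinese characters? Assume that, by default, characters common to both the Simplified and Traditional ranges are treated as Simplified.
Ideally this would be a regular-expression match that checks specific Unicode character ranges. Are these ranges documented and defined, and is this approach reliable?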
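public class ChineseCharacterDetector {
    // True if any code point lies in the basic CJK Unified Ideographs block.
    public static boolean containsSimplifiedChinese(String input) {
        // Iterate code points, not chars, so supplementary characters are seen whole.
        return input.codePoints().anyMatch(ChineseCharacterDetector::isSimplifiedChinese);
    }

    // True if any code point lies in the basic block, Extension A, or Extension B.
    public static boolean containsTraditionalChinese(String input) {
        return input.codePoints().anyMatch(ChineseCharacterDetector::isTraditionalChinese);
    }

    private static boolean isSimplifiedChinese(int codePoint) {
        // CJK Unified Ideographs (U+4E00..U+9FFF). This block holds both
        // Simplified and Traditional forms; shared characters count as Simplified here.
        return codePoint >= 0x4E00 && codePoint <= 0x9FFF;
    }

    private static boolean isTraditionalChinese(int codePoint) {
        // Extension B lies above U+FFFF, so it cannot be written as a char
        // literal such as '\u20000'; compare int code points instead.
        return (codePoint >= 0x4E00 && codePoint <= 0x9FFF)       // Common characters
                || (codePoint >= 0x3400 && codePoint <= 0x4DBF)   // Extension A
                || (codePoint >= 0x20000 && codePoint <= 0x2A6DF); // Extension B
    }

    public static void main(String[] args) {
        String input = "你好,世界!Hello, 世界!";
        // The Simplified check runs first, so the Traditional branch is reached
        // only when every Chinese character comes from Extension A or B.
        if (containsSimplifiedChinese(input)) {
            System.out.println("Contains Simplified Chinese characters");
        } else if (containsTraditionalChinese(input)) {
            System.out.println("Contains Traditional Chinese characters");
        } else {
            System.out.println("Contains neither Simplified nor Traditional Chinese characters");
        }
    }
}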
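The isSimplifiedChinese function accepts characters in the common CJK range (U+4E00 to U+9FFF), while isTraditionalChinese accepts that same range plus Extension A and Extension B. The containsSimplifiedChinese and containsTraditionalChinese functions iterate over the input text looking for characters in those ranges. Because both checks share U+4E00 to U+9FFF, and that Unicode block holds Simplified and Traditional forms side by side, a range test alone can confirm that a string contains Chinese characters but cannot reliably tell the two scripts apart.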
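Since the question asks about a regex-based check, here is a minimal sketch of one way to combine it with a lookup of characters that exist in only one script. The `\p{IsHan}` script class is part of `java.util.regex.Pattern`; everything else is an assumption made for illustration: the class name `HanScriptSketch`, the `classify` method and its return labels, and above all the two six-character sample tables, which are hypothetical and far too small for real use. A usable table would need thousands of entries taken from a maintained source of Simplified/Traditional variant data.

```java
import java.util.regex.Pattern;

public class HanScriptSketch {
    // \p{IsHan} matches any code point whose Unicode script property is Han,
    // including characters outside the Basic Multilingual Plane.
    private static final Pattern HAN = Pattern.compile("\\p{IsHan}");

    // Hypothetical, deliberately tiny sample tables (assumption, not a complete list).
    // Simplified-only forms: U+56FD U+6C49 U+8BED U+4E66 U+8F66 U+95E8
    private static final String SIMPLIFIED_ONLY = "\u56FD\u6C49\u8BED\u4E66\u8F66\u95E8";
    // Their Traditional counterparts: U+570B U+6F22 U+8A9E U+66F8 U+8ECA U+9580
    private static final String TRADITIONAL_ONLY = "\u570B\u6F22\u8A9E\u66F8\u8ECA\u9580";

    // True if the input contains at least one Han character of any kind.
    public static boolean containsHan(String input) {
        return HAN.matcher(input).find();
    }

    // Returns "NONE", "SIMPLIFIED", "TRADITIONAL", or "MIXED".
    public static String classify(String input) {
        if (!containsHan(input)) {
            return "NONE";
        }
        boolean simplified = input.codePoints()
                .anyMatch(cp -> SIMPLIFIED_ONLY.indexOf(cp) >= 0);
        boolean traditional = input.codePoints()
                .anyMatch(cp -> TRADITIONAL_ONLY.indexOf(cp) >= 0);
        if (simplified && traditional) {
            return "MIXED";
        }
        if (traditional) {
            return "TRADITIONAL";
        }
        // Following the question's rule, text made up only of characters
        // shared by both scripts is reported as Simplified.
        return "SIMPLIFIED";
    }
}
```

With these sample tables, `HanScriptSketch.classify` reports a string as Traditional only when it contains one of the listed Traditional-only characters, so its accuracy depends entirely on how complete the tables are.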