Java: How to check whether a character belongs to a specific Unicode block?

2023-04-06 Java development questions


Problem description

I need to identify which natural language my input belongs to. The goal is to distinguish Arabic words from English words in mixed input, where the input is Unicode text extracted from XML text nodes. I noticed the class Character.UnicodeBlock. Is it relevant to my problem, and how can I get it to work?
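For reference, a minimal sketch of how Character.UnicodeBlock can be queried. The class name UnicodeBlockCheck and the helpers isInBlock and containsArabic are hypothetical names chosen for illustration; only Character.UnicodeBlock.of and the block constants come from the JDK. The sketch checks just the main ARABIC block (U+0600 to U+06FF), so text that uses other Arabic blocks, such as the presentation forms, would need additional checks.

```java
public class UnicodeBlockCheck {

    // Hypothetical helper: true if the code point lies in the given Unicode block.
    public static boolean isInBlock(int codePoint, Character.UnicodeBlock block) {
        // UnicodeBlock.of returns null for code points outside any defined block.
        return Character.UnicodeBlock.of(codePoint) == block;
    }

    // Hypothetical helper: true if any code point of the text lies in the
    // main ARABIC block (U+0600..U+06FF). Other Arabic blocks are not checked.
    public static boolean containsArabic(String text) {
        return text.codePoints()
                   .anyMatch(cp -> isInBlock(cp, Character.UnicodeBlock.ARABIC));
    }

    public static void main(String[] args) {
        // U+0628 is the Arabic letter beh.
        System.out.println(isInBlock('\u0628', Character.UnicodeBlock.ARABIC));
        System.out.println(isInBlock('A', Character.UnicodeBlock.BASIC_LATIN));
        // Punctuation such as '!' is also part of BASIC_LATIN.
        System.out.println(isInBlock('!', Character.UnicodeBlock.BASIC_LATIN));
        System.out.println(containsArabic("hello \u0645\u0631\u062D\u0628\u0627"));
    }
}
```

Iterating over code points rather than char values keeps the check correct for supplementary characters, which occupy two char positions in a Java String.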

The Character.UnicodeBlock approach worked for Arabic, but it apparently does not work for English (or other European languages), because the BASIC_LATIN Unicode block covers symbols and non-printable characters as well as letters. For now I am using the matches() method of String with the regular expression "[A-Za-z]+" instead. I can live with it, but perhaps someone can suggest a nicer or faster way.
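One possible alternative, offered as a sketch rather than a definitive answer, is to classify by script instead of by block using Character.UnicodeScript (available since Java 7). Punctuation and most symbols are assigned to the COMMON script rather than LATIN, which sidesteps the BASIC_LATIN problem. The class ScriptClassifier, the enum Kind, and the method classify are hypothetical names introduced here. Note that the LATIN script also covers accented letters such as é, so this is broader than "[A-Za-z]+".

```java
public class ScriptClassifier {

    // Hypothetical result type for this sketch.
    public enum Kind { ARABIC, LATIN, OTHER }

    // Classifies a word by the script of its letters. Non-letters and
    // characters in the COMMON or INHERITED scripts are ignored.
    // Mixed-script words, other scripts, and words without letters yield OTHER.
    public static Kind classify(String word) {
        boolean sawArabic = false;
        boolean sawLatin = false;
        for (int i = 0; i < word.length(); ) {
            int cp = word.codePointAt(i);
            i += Character.charCount(cp);
            if (!Character.isLetter(cp)) {
                continue; // skip digits, punctuation, whitespace
            }
            Character.UnicodeScript script = Character.UnicodeScript.of(cp);
            if (script == Character.UnicodeScript.COMMON
                    || script == Character.UnicodeScript.INHERITED) {
                continue; // shared characters carry no script information
            }
            if (script == Character.UnicodeScript.ARABIC) {
                sawArabic = true;
            } else if (script == Character.UnicodeScript.LATIN) {
                sawLatin = true;
            } else {
                return Kind.OTHER; // some third script
            }
        }
        if (sawArabic && !sawLatin) return Kind.ARABIC;
        if (sawLatin && !sawArabic) return Kind.LATIN;
        return Kind.OTHER;
    }

    public static void main(String[] args) {
        System.out.println(classify("hello"));
        System.out.println(classify("\u0645\u0631\u062D\u0628\u0627"));
        System.out.println(classify("a\u0628"));
    }
}
```

A single pass over the code points avoids compiling and running a regular expression per word. Whether that is measurably faster than matches() for a given input would need to be benchmarked.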

