忽略字符串比较中的重音字母

1

本文介绍了忽略字符串比较中的重音字母的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着跟版网的小编来一起学习吧!

问题描述

我需要比较 C# 中的 2 个字符串,并将重音字母与非重音字母相同.例如:

I need to compare 2 strings in C# and treat accented letters the same as non-accented letters. For example:

string s1 = "hello";
string s2 = "héllo";

s1.Equals(s2, StringComparison.InvariantCultureIgnoreCase);
s1.Equals(s2, StringComparison.OrdinalIgnoreCase);

这两个字符串必须相同(就我的应用程序而言),但是这两个语句的计算结果都为 false.C# 中有没有办法做到这一点?

These 2 strings need to be the same (as far as my application is concerned), but both of these statements evaluate to false. Is there a way in C# to do this?

推荐答案

EDIT 2012-01-20: Oh boy!解决方案要简单得多,并且几乎永远存在于框架中.正如 knightpfhor 所指出的:

EDIT 2012-01-20: Oh boy! The solution was so much simpler and has been in the framework nearly forever. As pointed out by knightpfhor :

string.Compare(s1, s2, CultureInfo.CurrentCulture, CompareOptions.IgnoreNonSpace);

<小时>

这是一个从字符串中去除变音符号的函数:


Here's a function that strips diacritics from a string:

static string RemoveDiacritics(string text)
{
  string formD = text.Normalize(NormalizationForm.FormD);
  StringBuilder sb = new StringBuilder();

  foreach (char ch in formD)
  {
    UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(ch);
    if (uc != UnicodeCategory.NonSpacingMark)
    {
      sb.Append(ch);
    }
  }

  return sb.ToString().Normalize(NormalizationForm.FormC);
}

更多详情在 MichKap 的博客上 (RIP...).

原理是将'é'变成2个连续的字符'e',锐角.然后它遍历字符并跳过变音符号.

The principle is that is it turns 'é' into 2 successive chars 'e', acute. It then iterates through the chars and skips the diacritics.

你好"变成他<acute>llo",而后者又变成你好".

"héllo" becomes "he<acute>llo", which in turn becomes "hello".

Debug.Assert("hello"==RemoveDiacritics("héllo"));

<小时>

注意:以下是相同功能的更紧凑的 .NET4+ 友好版本:


Note: Here's a more compact .NET4+ friendly version of the same function:

static string RemoveDiacritics(string text)
{
  return string.Concat( 
      text.Normalize(NormalizationForm.FormD)
      .Where(ch => CharUnicodeInfo.GetUnicodeCategory(ch)!=
                                    UnicodeCategory.NonSpacingMark)
    ).Normalize(NormalizationForm.FormC);
}

这篇关于忽略字符串比较中的重音字母的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持跟版网!

The End

相关推荐

C# 中的多播委托奇怪行为?
Multicast delegate weird behavior in C#?(C# 中的多播委托奇怪行为?)...
2023-11-11 C#/.NET开发问题
6

参数计数与调用不匹配?
Parameter count mismatch with Invoke?(参数计数与调用不匹配?)...
2023-11-11 C#/.NET开发问题
26

如何将代表存储在列表中
How to store delegates in a List(如何将代表存储在列表中)...
2023-11-11 C#/.NET开发问题
6

代表如何工作(在后台)?
How delegates work (in the background)?(代表如何工作(在后台)?)...
2023-11-11 C#/.NET开发问题
5

没有 EndInvoke 的 C# 异步调用?
C# Asynchronous call without EndInvoke?(没有 EndInvoke 的 C# 异步调用?)...
2023-11-11 C#/.NET开发问题
2

Delegate.CreateDelegate() 和泛型:错误绑定到目标方法
Delegate.CreateDelegate() and generics: Error binding to target method(Delegate.CreateDelegate() 和泛型:错误绑定到目标方法)...
2023-11-11 C#/.NET开发问题
14