str_word_count() for non-Latin words?

2023-07-16 PHP development questions


Question

I'm trying to count the number of words in a variable written in a non-Latin language (Bulgarian), but str_word_count() does not seem to count non-Latin words. The PHP file is encoded as UTF-8.

$str = "текст на кирилица";
echo 'Number of words: '.str_word_count($str);
//this returns 0

Recommended answer

You can do it with a regular expression:

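$str = "текст на кирилица";
// Split on runs of whitespace; PREG_SPLIT_NO_EMPTY drops empty pieces caused by leading or trailing spaces
echo 'Number of words: '.count(preg_split('/\s+/', $str, -1, PREG_SPLIT_NO_EMPTY));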

Here the word delimiter is defined as whitespace characters. If anything else should be treated as a word delimiter, you'll need to add it to the regex.

Also note that because the regex pattern itself contains no UTF-8 characters (they appear only in the subject string), the /u modifier isn't required. If you want some UTF-8 characters to act as delimiters, you'll need to add this modifier.
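As a rough illustration of the two points above (extra delimiters, and /u once the pattern contains UTF-8 characters), here is a minimal sketch. The function name countWordsByDelimiters and the particular delimiter set (whitespace, guillemets, and a few punctuation marks) are assumptions made for this example and are not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): counts words by splitting
// on whitespace plus an assumed set of punctuation delimiters.
function countWordsByDelimiters(string $text): int
{
    // /u is needed here because the character class contains the UTF-8 characters « and ».
    $parts = preg_split('/[\s«»,.!?]+/u', $text, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split() returns false on failure (for example, invalid UTF-8 input).
    return $parts === false ? 0 : count($parts);
}

echo 'Number of words: ' . countWordsByDelimiters('«текст» на кирилица, да!'); // prints "Number of words: 4"
```

Adjust the character class to whatever your text actually uses as separators; the source file must itself be saved as UTF-8 for the literal « and » to be interpreted correctly.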

Update:

If you want only Cyrillic letters to count as parts of words, you can use:

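$str = "текст 
на 12453
кирилица";
// Split on runs of non-Cyrillic characters; /u is required because the pattern contains UTF-8 characters
echo 'Number of words: '.count(preg_split('/[^А-Яа-яЁё]+/u', $str, -1, PREG_SPLIT_NO_EMPTY));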
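The range А-Яа-яЁё covers the Russian and Bulgarian alphabets but not every Cyrillic letter (for example Ukrainian ї or Serbian ђ). If those should count too, one possible alternative, which is a swapped-in technique and not what the original answer uses, is to match runs of the Unicode Cyrillic script with preg_match_all(). The function name countCyrillicWords is hypothetical.

```php
<?php
// Hypothetical helper (not from the original answer): counts maximal runs of
// characters belonging to the Unicode Cyrillic script.
function countCyrillicWords(string $text): int
{
    // \p{Cyrillic} is a Unicode script property; it only works with the /u modifier.
    $count = preg_match_all('/\p{Cyrillic}+/u', $text);

    // preg_match_all() returns false on failure (for example, invalid UTF-8 input).
    return $count === false ? 0 : $count;
}

echo 'Number of words: ' . countCyrillicWords("текст \nна 12453\nкирилица"); // prints "Number of words: 3"
```

Counting matches instead of splitting avoids empty array elements when the string starts or ends with non-Cyrillic characters.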
