如何从 Java 中的输入文本中删除标点符号?

How can I remove punctuation from input text in Java?(如何从 Java 中的输入文本中删除标点符号?)
本文介绍了如何从 Java 中的输入文本中删除标点符号?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着跟版网的小编来一起学习吧!

问题描述

我正在尝试使用 Java 中的用户输入来获取一个句子,我需要将其设为小写并删除所有标点符号.这是我的代码:

I am trying to get a sentence using input from the user in Java, and i need to make it lowercase and remove all punctuation. Here is my code:

    String[] words = instring.split("\s+");
    for (int i = 0; i < words.length; i++) {
        words[i] = words[i].toLowerCase();
    }
    String[] wordsout = new String[50];
    Arrays.fill(wordsout,"");
    int e = 0;
    for (int i = 0; i < words.length; i++) {
        if (words[i] != "") {
            wordsout[e] = words[e];
            wordsout[e] = wordsout[e].replaceAll(" ", "");
            e++;
        }
    }
    return wordsout;

我似乎找不到任何方法来删除所有非字母字符.我尝试过使用正则表达式和迭代器,但没有成功.感谢您的帮助.

I cant seem to find any way to remove all non-letter characters. I have tried using regexes and iterators with no luck. Thanks for any help.

推荐答案

这首先删除所有非字母字符,折叠为小写,然后拆分输入,在一行中完成所有工作:

This first removes all non-letter characters, folds to lowercase, then splits the input, doing all the work in a single line:

String[] words = instring.replaceAll("[^a-zA-Z ]", "").toLowerCase().split("\s+");

空格最初留在输入中,因此拆分仍然有效.

Spaces are initially left in the input so the split will still work.

通过在拆分之前删除垃圾字符,您可以避免遍历元素.

By removing the rubbish characters before splitting, you avoid having to loop through the elements.

这篇关于如何从 Java 中的输入文本中删除标点符号?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持跟版网!

本站部分内容来源互联网,如果有图片或者内容侵犯了您的权益,请联系我们,我们会在确认后第一时间进行删除!

相关文档推荐

How to send data to COM PORT using JAVA?(如何使用 JAVA 向 COM PORT 发送数据?)
How to make a report page direction to change to quot;rtlquot;?(如何使报表页面方向更改为“rtl?)
Use cyrillic .properties file in eclipse project(在 Eclipse 项目中使用西里尔文 .properties 文件)
Is there any way to detect an RTL language in Java?(有没有办法在 Java 中检测 RTL 语言?)
How to load resource bundle messages from DB in Java?(如何在 Java 中从 DB 加载资源包消息?)
How do I change the default locale settings in Java to make them consistent?(如何更改 Java 中的默认语言环境设置以使其保持一致?)