Python: How to count the number of words in a sentence, ignoring numbers, punctuation and whitespace?

How do I count the words in a sentence? I'm using Python.

For example, I might have the string:

string = "I     am having  a   very  nice  23!@$      day."

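That would be 7 words. I'm having trouble with the random amount of whitespace before and after each word, as well as when numbers or symbols are involved.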


str.split() without any arguments splits on runs of whitespace characters:

>>> s = 'I am having a very nice day.'
>>>
>>> len(s.split())
7

From the linked documentation:

If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.
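To see the difference the quoted paragraph describes, the sketch below runs the question's string through both `split()` and `split(" ")`. Note that `split()` on its own still treats `23!@$` as a token, so it reports 8 rather than 7 for that input. The variable names are only for illustration.

```python
s = "I     am having  a   very  nice  23!@$      day."

# No argument: runs of whitespace act as a single separator
tokens = s.split()
print(tokens)
print(len(tokens))  # 8, because '23!@$' is still a token

# Explicit single-space separator: every extra space yields an empty string
naive = s.split(" ")
print(len(naive) > len(tokens))  # True
```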


You can use re.findall():

import re
line = " I am having a very nice day."
count = len(re.findall(r'\w+', line))
print(count)
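One caveat, as a hedged sketch: `\w` also matches digits and underscores, so on the question's original string `\w+` counts `23` as a word. Assuming only letters should count, a character class such as `[^\W\d_]+` (word characters minus digits and the underscore) is one possible alternative. That pattern is an assumption added here, not part of the answer above.

```python
import re

line = "I     am having  a   very  nice  23!@$      day."

with_digits = re.findall(r"\w+", line)         # includes '23'
letters_only = re.findall(r"[^\W\d_]+", line)  # letters only

print(len(with_digits))   # 8
print(len(letters_only))  # 7
```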


Here is a simple word counter using regex. The script includes a loop that you can terminate when you're done.

# Word counter using regex
import re

while True:
    string = input("Enter the string: ")  # use raw_input() on Python 2
    if string == "Done":  # command to terminate the loop
        break
    count = len(re.findall("[a-zA-Z_]+", string))
    print(count)
print("Terminated")


import string

s = "I     am having  a   very  nice  23!@$      day."
print(sum(i.strip(string.punctuation).isalpha() for i in s.split()))  # 7

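The statement above goes through each chunk of text and removes punctuation from its ends before verifying whether the chunk is really a string of alphabetic characters.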
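Wrapped in a function, the same idea might look like the sketch below. The function name `count_alpha_words` is hypothetical. One thing to be aware of: `strip` only removes punctuation from the ends of a chunk, so a chunk with punctuation inside it (such as `don't`) is not counted.

```python
import string

def count_alpha_words(text):
    """Count whitespace-separated chunks that are purely alphabetic
    once leading and trailing punctuation is stripped."""
    return sum(chunk.strip(string.punctuation).isalpha() for chunk in text.split())

print(count_alpha_words("I     am having  a   very  nice  23!@$      day."))  # 7
print(count_alpha_words("don't stop"))  # 1, because "don't" contains an apostrophe
```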


How about using a simple loop to count the number of occurrences of spaces?

txt = "Just an example here move along"
count = 1  # one more word than there are spaces
for i in txt:
    if i == " ":
        count += 1
print(count)
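This loop assumes exactly one space between words. As a quick check under that assumption, the sketch below wraps the loop in a hypothetical helper `count_by_spaces` and compares it with `len(text.split())` on single-spaced and multi-spaced input.

```python
def count_by_spaces(text):
    # One more word than there are space characters
    count = 1
    for ch in text:
        if ch == " ":
            count += 1
    return count

single = "Just an example here move along"
multi = "I     am having  a   very  nice  23!@$      day."

print(count_by_spaces(single), len(single.split()))  # 6 6
print(count_by_spaces(multi) == len(multi.split()))  # False
```

On the question's string the space count overshoots, because every extra space is counted as a word boundary.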


OK, here is my version. I noticed that you want your output to be 7, which means you don't want to count special characters and numbers. So here is the regex pattern:

re.findall("[a-zA-Z_]+", string)

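[a-zA-Z_] means the pattern matches any character from a-z (lowercase) or A-Z (uppercase), plus the underscore; the trailing + makes it match one or more such characters in a row.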
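Applied to the question's string, the pattern can be checked as follows. This is only a sketch; note that the class also contains `_`, so a chunk like `snake_case` is matched as one word, while digits split a chunk.

```python
import re

string = "I     am having  a   very  nice  23!@$      day."
words = re.findall("[a-zA-Z_]+", string)
print(words)       # ['I', 'am', 'having', 'a', 'very', 'nice', 'day']
print(len(words))  # 7

print(re.findall("[a-zA-Z_]+", "snake_case x2"))  # ['snake_case', 'x']
```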

About the spaces: if you want to remove all the extra spaces, just do:

string = string.strip()  # Remove all extra spaces at the start and at the end of the string
while "  " in string:  # While there are 2 consecutive spaces between words in our string...
    string = string.replace("  ", " ")  # ... replace them with one space!
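The same normalization can also be written without a loop. The sketch below shows two common alternatives, `" ".join(s.split())` and `re.sub`, and checks that they agree with the loop version on a sample string; which one to prefer is a matter of taste.

```python
import re

s = "  I     am having  a   very  nice  23!@$      day.  "

# Loop version: trim the ends, then collapse double spaces until none remain
looped = s.strip()
while "  " in looped:
    looped = looped.replace("  ", " ")

joined = " ".join(s.split())                  # split on whitespace runs, rejoin
substituted = re.sub(r"\s+", " ", s).strip()  # regex equivalent

print(joined)
print(looped == joined == substituted)  # True
```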


def wordCount(mystring):
    tempcount = 0  # length of the current run of spaces
    count = 1

    try:
        for character in mystring:
            if character == " ":
                tempcount += 1
                # Only the first space of a run starts a new word
                if tempcount == 1:
                    count += 1
            else:
                tempcount = 0

        return count

    except Exception:
        error = "Not a string"
        return error

mystring = "I   am having   a    very nice 23!@$      day."

print(wordCount(mystring))

The output is 8.


import string

sentence = "I     am having  a   very  nice  23!@$      day."
# Remove all punctuation
sentence = sentence.translate(str.maketrans('', '', string.punctuation))
# Remove all digits
sentence = ''.join(ch for ch in sentence if not ch.isdigit())
count = 0
# Count every word that is followed by whitespace
for index in range(len(sentence) - 1):
    if sentence[index + 1].isspace() and not sentence[index].isspace():
        count += 1
# The last word has no trailing whitespace, so count it separately
if sentence and not sentence[-1].isspace():
    count += 1
print(count)