How to count the number of words in a sentence, ignoring numbers, punctuation and whitespace?
How do I count the words in a sentence? I am using Python.
For example, I might have the string:

    string = "I am having a very nice 23!@$ day."

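That is 7 words. I am having trouble with the random amount of whitespace before and after each word, and with cases where numbers or symbols are involved.

str.split() without any arguments splits on runs of whitespace characters:

    >>> s = 'I am having a very nice day.'
    >>> len(s.split())
    7
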
From the relevant documentation:

"If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace."
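
To make the quoted behavior concrete, here is a minimal sketch comparing split() with no argument against split(" "). The sample string and variable names are made up for illustration.

```python
# Sample input with leading, trailing and repeated spaces (illustrative only).
s = "  I am   having a very nice day.  "

# No argument: runs of whitespace act as one separator, no empty strings.
default_parts = s.split()

# Explicit single-space separator: every space splits, leaving empty strings.
explicit_parts = s.split(" ")

print(len(default_parts))   # 7
print(explicit_parts[:3])   # ['', '', 'I']
```

With the explicit separator, the empty strings would inflate the count, which is why the argument-free form suits word counting.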
You can use re.findall():

    import re

    line = " I am having a very nice day."
    count = len(re.findall(r'\w+', line))
    print(count)

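
One caveat worth checking: \w also matches digits, so on the string from the question the pattern above counts 23 as a word. The sketch below compares it with a letters-only character class; the variable names are illustrative.

```python
import re

sentence = "I am having a very nice 23!@$ day."

# \w matches letters, digits and the underscore, so "23" is kept as a token.
with_digits = re.findall(r"\w+", sentence)

# Restricting the class to ASCII letters skips the digits.
letters_only = re.findall(r"[A-Za-z]+", sentence)

print(len(with_digits))   # 8
print(len(letters_only))  # 7
```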
This is a simple word counter using regex. The script includes a loop, which you can terminate by entering Done when you are finished.

    # Word counter using regex
    import re

    while True:
        string = input("Enter the string: ")  # use raw_input() on Python 2
        if string == "Done":  # command to terminate the loop
            break
        count = len(re.findall("[a-zA-Z_]+", string))
        print(count)
    print("Terminated")

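    import string

    s = "I am having a very nice 23!@$ day."
    sum([i.strip(string.punctuation).isalpha() for i in s.split()])

The statement above goes through each chunk of text and strips the punctuation from its ends before checking whether the chunk really is a string of alphabetic characters.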
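
The check above can be wrapped in a small helper to try it on other inputs. This is only a sketch: count_alpha_words is a hypothetical name, and the extra test strings are made up.

```python
import string

def count_alpha_words(sentence):
    """Count whitespace-separated chunks that are purely alphabetic
    once leading and trailing punctuation has been stripped."""
    return sum(
        chunk.strip(string.punctuation).isalpha()
        for chunk in sentence.split()
    )

print(count_alpha_words("I am having a very nice 23!@$ day."))  # 7
print(count_alpha_words("  lots   of   spaces  "))              # 3
print(count_alpha_words("don't stop"))                          # 1
```

The last call shows a limitation: strip() only removes punctuation at the ends of a chunk, so a word with an internal apostrophe fails isalpha() and is not counted.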
How about using a simple loop to count the number of spaces?

    txt = "Just an example here move along"
    count = 1
    for i in txt:
        if i == " ":
            count += 1
    print(count)

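OK, here is my version. I noticed that you want the output to be 7, so numbers and special characters should not be counted. This pattern does that:

    re.findall("[a-zA-Z_]+", string)

Here [a-zA-Z_] matches the lowercase letters a-z, the uppercase letters A-Z and the underscore, and + extends the match over a run of them.

As for the spaces: if you want to remove all the extra spaces, do this: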

    string = string.strip()  # Remove all extra spaces at the start and at the end of the string
    while "  " in string:  # While there are 2 spaces between words in our string...
        string = string.replace("  ", " ")  # ... replace them by one space!

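
A shorter way to get the same result as the loop above is to split on whitespace and join the pieces back with single spaces. This is a sketch with a hypothetical helper name, squeeze_spaces. Unlike the loop, it also collapses tabs and newlines.

```python
def squeeze_spaces(text):
    """Collapse runs of whitespace to single spaces and trim both ends."""
    return " ".join(text.split())

print(repr(squeeze_spaces("  I am   having a  very nice day.  ")))
# 'I am having a very nice day.'
```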
    def wordCount(mystring):
        tempcount = 0  # length of the current run of spaces
        count = 1
        try:
            for character in mystring:
                if character == " ":
                    tempcount += 1
                    if tempcount == 1:  # first space of a run starts a new word
                        count += 1
                    else:
                        tempcount += 1
                else:
                    tempcount = 0
            return count
        except Exception:
            error = "Not a string"
            return error

    mystring = "I am having a very nice 23!@$ day."
    print(wordCount(mystring))

The output is 8, because every space-separated chunk is counted, including 23!@$.

    import string

    sentence = "I am having a very nice 23!@$ day."
    # Remove all punctuation
    sentence = sentence.translate(str.maketrans('', '', string.punctuation))
    # Remove all digits
    sentence = ''.join([ch for ch in sentence if not ch.isdigit()])
    count = 0
    for index in range(len(sentence)):
        # A word ends at a non-space character that is either the last
        # character or is followed by whitespace.
        at_end = index == len(sentence) - 1
        if not sentence[index].isspace() and (at_end or sentence[index + 1].isspace()):
            count += 1
    print(count)  # 7