bash: grep for multiple strings in a file on different lines (i.e. whole file, not line-based search)?

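I want to grep for files that contain the words Dansk, Svenska and Norsk, each on any line, and get a usable return code (I really only need to know whether the strings are present; my one-liner goes a little further than that).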

I have many files that contain lines like this:

Disc Title: unknown
Title: 01, Length: 01:33:37.000 Chapters: 33, Cells: 31, Audio streams: 04, Subpictures: 20
Subtitle: 01, Language: ar - Arabic, Content: Undefined, Stream id: 0x20,
Subtitle: 02, Language: bg - Bulgarian, Content: Undefined, Stream id: 0x21,
Subtitle: 03, Language: cs - Czech, Content: Undefined, Stream id: 0x22,
Subtitle: 04, Language: da - Dansk, Content: Undefined, Stream id: 0x23,
Subtitle: 05, Language: de - Deutsch, Content: Undefined, Stream id: 0x24,
(...)

Here is pseudocode for what I want:

for all files in directory;
    if file contains "Dansk" AND "Norsk" AND "Svenska"
    then echo the filename
end

What is the best way to do this? Can it be done in one line?

You can use:

grep -l Dansk * | xargs grep -l Norsk | xargs grep -l Svenska

If you also want to search hidden files:

grep -l Dansk .* | xargs grep -l Norsk | xargs grep -l Svenska
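Since the question asks for something usable in a test, here is a minimal sketch of how the chain's result could be captured and checked. The demo directory and its file names (all.txt, some.txt, none.txt) are made up for illustration, and it assumes file names without spaces or newlines, as the chain itself does.

```shell
#!/usr/bin/env bash
# Hypothetical sample files in a throwaway directory.
demo=$(mktemp -d)
printf 'da - Dansk\nno - Norsk\nsv - Svenska\n' > "$demo/all.txt"
printf 'da - Dansk\nde - Deutsch\n' > "$demo/some.txt"
printf 'en - English\n' > "$demo/none.txt"

# Run the chain inside the demo directory and keep the resulting file list.
# Each stage only searches the files that survived the previous stage.
matches=$(cd "$demo" && grep -l Dansk * | xargs grep -l Norsk | xargs grep -l Svenska)

# A non-empty list means at least one file contains all three words.
if [ -n "$matches" ]; then
    printf 'contains all three: %s\n' "$matches"
else
    echo "no file contains all three"
fi
rm -rf "$demo"
```

Testing the captured list rather than the pipeline's exit status sidesteps the differences between xargs implementations when a stage receives no input.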

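Yet another way, using just bash and grep:

For a single file 'test.txt':

grep -q Dansk test.txt && grep -q Norsk test.txt && grep -l Svenska test.txt

This prints test.txt if the file contains all three words (in any order). The first two greps print nothing (-q), and the last one prints the file name only if the other two have passed.

If you want to do this for every file in the directory:

for f in *; do grep -q Dansk "$f" && grep -q Norsk "$f" && grep -l Svenska "$f"; done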
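The same idea can be wrapped so that it takes any number of words and gives a plain exit status. This is only a sketch: has_all is a hypothetical helper name, and it treats the words as fixed strings (-F) rather than regular expressions.

```shell
#!/usr/bin/env bash
# has_all FILE WORD...  (hypothetical helper)
# Succeeds only if FILE contains every WORD as a fixed string.
has_all() {
    local file=$1 word
    shift
    for word in "$@"; do
        # Stop at the first word that is missing.
        grep -qF -e "$word" -- "$file" || return 1
    done
    return 0
}

# Print every regular file in the current directory that has all three words.
for f in *; do
    [ -f "$f" ] || continue
    if has_all "$f" Dansk Norsk Svenska; then
        printf '%s\n' "$f"
    fi
done
```

Because the file name is always quoted, names containing spaces are handled, and the function's exit status can be used directly in an if.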

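grep -irl word1 * | grep -il word2 `cat -` | grep -il word3 `cat -`

-i makes the search case insensitive
-r does a recursive file search through folders
-l lists the files in which the word was found
cat - makes the next grep look through the file list passed to it.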

How to grep for multiple strings in a file on different lines (use the pipe symbol):

for file in *; do
    test "$(grep -E 'Dansk|Norsk|Svenska' "$file" | wc -l)" -ge 3 && echo "$file"
done

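Notes:

With -E (or egrep) the alternation is a plain |, whether the pattern is in single or double quotes. Only without -E, in GNU grep's basic regular expressions, does it have to be written \| to search for Dansk, Norsk and Svenska.

This assumes each language appears on only one line; three lines that all mention Dansk would also reach the count of 3.

Walkthrough: http://www.cyberciti.biz/faq/howto-use-grep-command-in-linux-unix/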
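If the one-language-per-line assumption does not hold, a variation is to count distinct matched words instead of matching lines. The sketch below assumes a grep that supports -o (GNU and BSD grep do, but it is not in POSIX); count_distinct is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# count_distinct FILE  (hypothetical helper)
# Prints how many different words out of the three occur in FILE.
count_distinct() {
    # -o prints every match on its own line; sort -u collapses repeats.
    grep -oE 'Dansk|Norsk|Svenska' -- "$1" | sort -u | wc -l
}

for file in *; do
    [ -f "$file" ] || continue
    if [ "$(count_distinct "$file")" -eq 3 ]; then
        printf '%s\n' "$file"
    fi
done
```

A file with three Dansk lines now counts as 1, so only files with all three words reach 3.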

This searches multiple files for lines containing any of several words:

grep -E 'abc|xyz' file1 file2 ..filen

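You can do this really easily with ack:

ack -l 'cats' | ack -xl 'dogs'

-l: return a list of files
-x: take the files from stdin (the previous search) and only search those files

You can keep piping until you are left with just the files you want.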

awk '/Dansk/{a=1}/Norsk/{b=1}/Svenska/{c=1}END{ if (a && b && c) print "0" }'

The shell can then capture the printed 0 as the result.
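If a real exit status is preferred over printed output, one possible variation is to let awk exit with 0 only when all three flags are set. This is a sketch, not the answer's original command; has_all_awk is a hypothetical name.

```shell
#!/usr/bin/env bash
# has_all_awk FILE  (hypothetical helper)
# Exit status 0 if FILE contains Dansk, Norsk and Svenska, 1 otherwise.
has_all_awk() {
    awk '/Dansk/{a=1} /Norsk/{b=1} /Svenska/{c=1}
         END { exit !(a && b && c) }' "$1"
}

# Check the file given as the first script argument (defaults to /dev/null).
if has_all_awk "${1:-/dev/null}"; then
    echo "all words found"
else
    echo "not all words found"
fi
```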

If you have Ruby (1.9+):

ruby -0777 -ne 'print if /Dansk/ and /Norsk/ and /Svenska/' file

Simply:

grep 'word1\|word2\|word3' *

This prints lines that contain any of the words. See this post for details.

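This is a blend of glenn jackman's and kurumi's answers. It allows an arbitrary number of regexes, rather than an arbitrary number of fixed words or a fixed set of regexes.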

#!/usr/bin/awk -f
# by Dennis Williamson - 2011-01-25

BEGIN {
    for (i=ARGC-2; i>=1; i--) {
        patterns[ARGV[i]] = 0;
        delete ARGV[i];
    }
}

{
    for (p in patterns)
        if ($0 ~ p)
            matches[p] = 1
    # print    # the matching line could be printed
}

END {
    for (p in patterns) {
        if (matches[p] != 1)
            exit 1
    }
}

Run it like this:

./multigrep.awk Dansk Norsk Svenska 'Language: .. - A.*c' dvdfile.dat

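I did it in two steps: first list the csv files in one file, then search them. With the help of the comments on this page, I got what I needed in two steps without a script. Just type into the terminal:

$ find /csv/file/dir -name '*.csv' > csv_list.txt
$ grep -q Svenska `cat csv_list.txt` && grep -q Norsk `cat csv_list.txt` && grep -l Dansk `cat csv_list.txt`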

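It did what I needed and printed the file names. Note that the two grep -q calls only confirm that Svenska and Norsk occur somewhere in the listed files, so what gets printed is every file containing Dansk; a file is guaranteed to contain all three words only if each file is checked separately.

Also watch out for symbols like `, ' and ".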
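A per-file check over such a list could look roughly like this. It is a sketch that assumes the list holds one path per line (so no newlines inside file names); all_three_from_list is a hypothetical name.

```shell
#!/usr/bin/env bash
# all_three_from_list LIST  (hypothetical helper)
# LIST holds one file path per line, e.g. as written by the find step.
all_three_from_list() {
    local f
    while IFS= read -r f; do
        # All three words must be found in the same file.
        if grep -q Svenska -- "$f" && grep -q Norsk -- "$f" && grep -q Dansk -- "$f"; then
            printf '%s\n' "$f"
        fi
    done < "$1"
    return 0
}
```

It would be called as all_three_from_list csv_list.txt. Reading the list line by line also keeps paths with spaces intact, which the backtick form does not.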

Here is what worked well for me:

find . -path '*/.svn' -prune -o -type f -exec gawk '/Dansk/{a=1}/Norsk/{b=1}/Svenska/{c=1}END{ if (a && b && c) print FILENAME }' {} \;
./path/to/file1.sh
./another/path/to/file2.txt
./blah/foo.php

If I only wanted to find .sh files containing these three words, I could have used:

find . -path '*/.svn' -prune -o -type f -name "*.sh" -exec gawk '/Dansk/{a=1}/Norsk/{b=1}/Svenska/{c=1}END{ if (a && b && c) print FILENAME }' {} \;
./path/to/file1.sh
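With many files, starting one awk per file can be slow. A possible variation, sketched here with plain awk rather than gawk, passes batches of files to a single awk via -exec ... {} + and resets the flags whenever a new file starts. find_all_three and its directory argument are hypothetical names.

```shell
#!/usr/bin/env bash
# find_all_three [DIR]  (hypothetical helper, DIR defaults to .)
# Prints every file under DIR that contains all three words.
find_all_three() {
    find "${1:-.}" -path '*/.svn' -prune -o -type f -exec awk '
        # Print the previous file name if all three flags were set.
        function report() { if (a && b && c) print name }
        # First line of a file: report on the previous file, then reset.
        FNR == 1 { if (NR > 1) report(); a = b = c = 0; name = FILENAME }
        /Dansk/   { a = 1 }
        /Norsk/   { b = 1 }
        /Svenska/ { c = 1 }
        # The last file of the batch still needs its report.
        END { if (NR > 0) report() }
    ' {} +
}
```

Empty files never trigger the FNR == 1 rule, so they are simply skipped.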

Expanding on @kurumi's awk answer, here is a bash function:

all_word_search() {
    gawk '
        BEGIN {
            # every argument except the last is a search term; the last is the file
            for (i=ARGC-2; i>=1; i--) {
                search_terms[ARGV[i]] = 0;
                ARGV[i] = ARGV[i+1];
                delete ARGV[i+1];
            }
        }
        {
            # mark each field that is one of the search terms
            for (i=1; i<=NF; i++)
                if ($i in search_terms)
                    search_terms[$i] = 1
        }
        END {
            for (word in search_terms)
                if (search_terms[word] == 0)
                    exit 1
        }
    ' "$@"
    return $?
}

Usage:

if all_word_search Dansk Norsk Svenska filename; then
    echo "all words found"
else
    echo "not all words found"
fi

I ran into this problem today, and all the one-liners here failed for me because the file names contained spaces.

This is what I came up with:

grep -ril <WORD1> | sed 's/.*/"&"/' | xargs grep -il <WORD2>
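An alternative to quoting the names with sed is to pass them NUL-delimited, which also copes with quotes inside file names. The sketch below assumes a grep with --null and an xargs with -0 (both GNU and BSD tools have them, but neither is in POSIX); files_with_both is a hypothetical name.

```shell
#!/usr/bin/env bash
# files_with_both DIR WORD1 WORD2  (hypothetical helper)
# Lists files under DIR containing both words, case-insensitively.
files_with_both() {
    # --null ends each file name with a zero byte; xargs -0 splits on it,
    # so spaces and quotes in names are passed through untouched.
    grep -ril --null -e "$2" -- "$1" | xargs -0 grep -il -e "$3" --
}
```

For a third word, the second grep would also need --null, followed by another xargs -0 stage.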

If you only need two search terms, arguably the most readable approach is to run each search and intersect the results:

comm -12 <(grep -rl word1 . | sort) <(grep -rl word2 . | sort)
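For three terms, the same intersect-the-lists idea can be sketched with sort | uniq -c in place of comm: each grep -l lists a file at most once, so a file that shows up three times matched all three words. files_with_all3 is a hypothetical name, and the sketch assumes file names without newlines.

```shell
#!/usr/bin/env bash
# files_with_all3 DIR WORD1 WORD2 WORD3  (hypothetical helper)
files_with_all3() {
    {
        grep -rl -e "$2" -- "$1"
        grep -rl -e "$3" -- "$1"
        grep -rl -e "$4" -- "$1"
    } | sort | uniq -c |
        # Keep only the names counted exactly three times, minus the count.
        sed -n 's/^ *3 //p'
}
```

This swaps comm for a counting step, so it extends to more words by adding a grep and changing the 3 accordingly.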