ack misses results (vs. grep)

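I'm sure I'm misunderstanding something about ack's default file/directory ignore rules, but perhaps someone could shed some light on this for me: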

mbuck$ grep logout -R app/views/
Binary file app/views/shared/._header.html.erb.bak.swp matches
Binary file app/views/shared/._header.html.erb.swp matches
app/views/shared/_header.html.erb.bak: <%= link_to logout_text, logout_path, { :title => logout_text, :class => 'login-menuitem' } %>
mbuck$ ack logout app/views/
mbuck$

Whereas...

mbuck$ ack -u logout app/views/
Binary file app/views/shared/._header.html.erb.bak.swp matches
Binary file app/views/shared/._header.html.erb.swp matches
app/views/shared/_header.html.erb.bak
98:<%= link_to logout_text, logout_path, { :title => logout_text, :class => 'login-menuitem' } %>

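Calling ack with no options fails to find the match in the .bak file, but calling it with the --unrestricted (-u) option does find it. As far as I can tell, though, ack does not ignore .bak files by default.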

UPDATE

Thanks to the helpful comments below, here are the new contents of my ~/.ackrc:

--type-add=ruby=.haml,.rake
--type-add=css=.less

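ack is unusual in that it does not keep a blacklist of file types to ignore. Instead, it keeps a whitelist of file types that it will search.

To quote the man page: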

With no file selections, ack-grep only searches files of types that it recognizes. If you have a file called foo.wango, and ack-grep doesn't know what a .wango file is, ack-grep won't search it.

(Note that I'm on Ubuntu, where the binary is called ack-grep because of a naming conflict.)

ack --help-types will show the list of types your ack installation supports.
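The difference between the two strategies can be sketched with plain grep, using --include as a stand-in for a whitelist and --exclude as a stand-in for a blacklist. This is only an illustration of the idea, not how ack is implemented; the sandbox directory and its file names are made up for the demo (foo.wango borrows the man page's example name).

```shell
# Hypothetical sandbox: three files that all contain the search term.
sandbox=$(mktemp -d)
echo 'logout here' > "$sandbox/foo.rb"
echo 'logout here' > "$sandbox/foo.wango"
echo 'logout here' > "$sandbox/foo.swp"

# Whitelist: search only a known extension, so foo.wango is silently missed.
whitelist_hits=$(cd "$sandbox" && grep -rl --include='*.rb' logout . | LC_ALL=C sort)

# Blacklist: search everything except a named pattern, so foo.wango is found.
blacklist_hits=$(cd "$sandbox" && grep -rl --exclude='*.swp' logout . | LC_ALL=C sort)

printf 'whitelist:\n%s\nblacklist:\n%s\n' "$whitelist_hits" "$blacklist_hits"
rm -rf "$sandbox"
```

With the whitelist, an unknown extension produces no match and no warning, which is the behaviour the question ran into.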


If you are ever unsure which files ack will search, just add the -f option. It lists every file ack considers searchable.
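As a rough sketch, -f can be combined with grep to check whether one particular file is on ack's list. The path is the one from the question; the variable names are made up here, and the result depends on your ack version and configuration, so no particular output is implied. On Ubuntu installs like the one mentioned above, the command may be ack-grep instead of ack.

```shell
# Path taken from the question; adjust it to the file you are wondering about.
target='app/views/shared/_header.html.erb.bak'

if command -v ack >/dev/null 2>&1; then
    # ack -f prints the files it would search, without searching them.
    if ack -f app/views/ 2>/dev/null | grep -qF "$target"; then
        ack_check="searched"
    else
        ack_check="skipped by ack"
    fi
else
    ack_check="ack not installed"
fi

echo "$target: $ack_check"
```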


ack --man says:

If you want ack to search every file,
even ones that it always ignores like
coredumps and backup files, use the
"-u" switch.

Why does ack ignore unknown files by
default? ack is designed by a
programmer, for programmers, for
searching large trees of code. Most
codebases have a lot files in them
which aren’t source files (like
compiled object files, source control
metadata, etc), and grep wastes a lot
of time searching through all of those
as well and returning matches from
those files.

That’s why ack’s behavior of not
searching things it doesn’t recognize
is one of its greatest strengths: the
speed you get from only searching the
things that you want to be looking at.

EDIT: Also, if you look at the source code, bak files are ignored.


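Rather than wrestling with ack, you can use plain old grep, which dates from 1973. Because it works from an explicit blacklist of files rather than a whitelist of file types, it never misses correct results. Given a couple of lines of config (which I created in my home directory 'dotfiles' repo in the 1990s), grep actually matches or beats many of ack's claimed advantages, speed in particular: when searching the same set of files, grep is faster than ack.

The grep config that makes me happy looks like this in my .bashrc: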

# Custom 'grep' behaviour
# Search recursively
# Ignore binary files
# Output in pretty colors
# Exclude a bunch of files and directories by name
# (this both prevents false positives, and speeds it up)
function grp {
    grep -rI --color --exclude-dir=node_modules --exclude-dir=\.bzr --exclude-dir=\.git --exclude-dir=\.hg --exclude-dir=\.svn --exclude-dir=build --exclude-dir=dist --exclude-dir=.tox --exclude=tags "$@"
}

function grpy {
    grp --include='*.py' "$@"
}

The exact list of files and directories to ignore will probably differ for you: I'm mainly a Python developer, and these settings work for me.
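To see the effect of the excludes, here is a small sketch that runs a trimmed-down copy of grp in a throwaway directory. The sandbox layout and file names are invented for the demo, the function is written in the portable `name() { ... }` form, and colour is switched off so the captured output is plain text.

```shell
# Trimmed stand-in for the grp function above (fewer excludes, no colour).
grp() {
    grep -rI --color=never --exclude-dir=node_modules --exclude-dir=.git --exclude=tags "$@"
}

# Hypothetical sandbox: every path below is invented for this demo.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/src" "$sandbox/node_modules/pkg"
echo 'def logout(): pass'   > "$sandbox/src/auth.py"
echo 'link_to logout_path'  > "$sandbox/src/_header.html.erb.bak"
echo 'function logout() {}' > "$sandbox/node_modules/pkg/index.js"

# node_modules is skipped by name; the .bak file is still searched.
grp_hits=$(cd "$sandbox" && grp -l logout . | LC_ALL=C sort)

echo "$grp_hits"
rm -rf "$sandbox"
```

The match under node_modules never shows up, while the .bak file that ack skipped in the question is reported.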

It's also easy to add sub-customizations, as I've shown with my 'grpy', which I use to grep Python source.

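Defining bash functions like this is preferable to setting GREP_OPTIONS, which would change the behaviour of every grep run from your login shell, including those invoked by programs you run. Those programs may well choke on grep behaving in ways they don't expect.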

My new functions 'grp' and 'grpy' deliberately don't shadow 'grep', so I can still use the original behaviour whenever I need it.
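The point about child processes can be checked with a short sketch. The names grp_demo and DEMO_GREP_OPTIONS are made up for this example; the variable merely stands in for GREP_OPTIONS, so nothing here changes how grep itself behaves.

```shell
# grp_demo is a hypothetical stand-in for the grp function above.
grp_demo() { grep -rI "$@"; }

# A shell function lives only in the current shell, so a child process
# does not pick it up by accident (unless it is exported with `export -f`).
child_sees_function=$(bash -c 'type -t grp_demo || echo none')

# An exported variable is inherited by every child process.
# DEMO_GREP_OPTIONS is a made-up name standing in for GREP_OPTIONS.
export DEMO_GREP_OPTIONS='--color=always'
child_sees_variable=$(bash -c 'echo "${DEMO_GREP_OPTIONS:-none}"')

echo "function visible in child: $child_sees_function"
echo "variable visible in child: $child_sees_variable"
```

The function stays private to the interactive shell, while the exported variable reaches every program started from it, which is why the functions are the safer choice.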