Reading a file line by line in C++

Read file line by line using ifstream in C++

The contents of file.txt are:

5 3
6 4
7 1
10 5
11 6
12 3
12 4

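Here, 5 3 is a coordinate pair. How do I process this data line by line in C++?

I am able to get the first line, but how do I get the next line of the file?

    ifstream myfile;
    myfile.open ("text.txt");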

First, make an ifstream:

    #include <fstream>
    std::ifstream infile("thefile.txt");

The two standard methods are:

1. Assume that every line consists of two numbers and read token by token:

    int a, b;
    while (infile >> a >> b)
    {
        // process pair (a,b)
    }

2. Line-based parsing, using string streams:

    #include <sstream>
    #include <string>

    std::string line;
    while (std::getline(infile, line))
    {
        std::istringstream iss(line);
        int a, b;
        if (!(iss >> a >> b)) { break; } // error

        // process pair (a,b)
    }

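You shouldn't mix (1) and (2): token-based parsing doesn't consume newlines, so if you call getline() after a token-based extraction has already taken you to the end of a line, you may end up with spurious empty lines.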
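
To make the pitfall concrete, here is a minimal sketch that reads a leading integer with >> and the remaining lines with getline(). The function names read_naive and read_fixed are made up for this illustration, and discarding the rest of the line with ignore() is only one possible remedy, not necessarily what the answer's author had in mind.

```cpp
#include <istream>
#include <limits>
#include <sstream>
#include <string>
#include <vector>

// Reads one integer with >>, then the remaining lines with getline().
// The newline after the integer is still in the stream, so the first
// getline() call returns an empty string.
std::vector<std::string> read_naive(std::istream& in)
{
    std::vector<std::string> lines;
    int count = 0;
    in >> count;
    for (std::string line; std::getline(in, line); )
        lines.push_back(line);
    return lines;
}

// Same as above, but discards the rest of the current line (including
// the newline) before switching from >> to getline().
std::vector<std::string> read_fixed(std::istream& in)
{
    std::vector<std::string> lines;
    int count = 0;
    in >> count;
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    for (std::string line; std::getline(in, line); )
        lines.push_back(line);
    return lines;
}
```

Both functions accept any std::istream, so they can be tried with a std::istringstream holding "2\n5 3\n6 4\n" as well as with an std::ifstream.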

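Comments

Does solution 1 treat a comma as a token?
@EdwardKarak: I don't understand what "comma as a token" means. Commas don't represent integers.
The OP used a space to delimit the two integers. I wanted to know whether while (infile >> a >> b) would work if the OP had used a comma as the delimiter, because that is the scenario in my own program.
@Edward: Ah, so when you say "token" you mean "delimiter". Right. With a comma, you'd say: int a, b; char c; while ((infile >> a >> c >> b) && (c == ',')).
@KerrekSB: That only works if the comma is surrounded by whitespace, i.e. "1 , 2". If the line contains "1,2", the code will try to convert "1,2" to an integer (stored in a), and c and b will get their token/delimiter from the next line. For delimiters other than whitespace you need to use std::getline() and parse the line.
@Mark: Are you sure?
@Kerrek: Huh. I was wrong. I didn't know it could do that. I may have some code of my own to rewrite.
For an explanation of the while (getline(f, line)) { } construct and its error handling, please have a look at this article (of mine): gehrcke.de/2011/06/… (I don't think I need to have a bad conscience about posting this here; it even slightly predates this answer).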
Peasant task command
@Galois: Thatched roof :-)
Enjoy your cold drink
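What is the best way to skip comment lines, with the first method or the second? Thanks.
@Elgnoh: You can't do that with the first method; it assumes you are parsing tokens and has no notion of what a "line" is. With the second method it's easy: just check the first character of the line string (possibly skipping whitespace).
@KerrekSB: I understand. To clarify the first method: >> returns a reference to the stream object. So the question is, what is returned when the stream reaches EOF that makes the while loop break?
@vivekmaran: The stream reaches EOF while reading the digits that form the last element. On the next round there are no digits left, and trying to read past the end of the stream makes the stream "fail", which is the exit condition of the loop. Here is a demo.
@Vivekmaran: If you read in a way that doesn't "read ahead", such as extracting single characters, you never trigger EOF until you actually step past the end of the stream: wandbox.org/permlink/ofayftnefucamv
@KerrekSB: Thanks, got the EOF part. I just found out that using the ifstream directly in a condition invokes its bool operator, which returns false once EOF is hit and so breaks the loop.
@Vivekmaran: No, the bool conversion checks !fail(), not good(), which makes a difference when handling EOF (see here).
@KerrekSB, what if I need to read the input item by item after reading the whole line with getline()? I mean, in Python I can read the input line by line into a list (a.k.a. array), then iterate over that list, picking one item at a time and doing whatever I want with it. How should I handle this in C++? Any suggestions?
Sure, you can store each line in a container (e.g. std::vector) and then process that container once the loop is done. Of course this means you need the whole file to be available before you continue; for example, you can't read lines interactively. (But I expect the same is true in Python.) If performance is a concern, it may be better to read the entire file into memory in one step and then store only the positions of the newlines, e.g. as a std::vector, but I'd only do that if this were the performance bottleneck.
@Anu: Maybe ask a new question?
@KerrekSB, this is the problem I am trying to solve. I used the post above but didn't get a clue how to do it. Any suggestions?
Works perfectly for me, thanks!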
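
Several points from the comments (comma delimiters, skipping comment lines, collecting the data in a container) can be combined in one line-based loop. The following is only a sketch: the function name read_pairs is hypothetical, and treating '#' as the comment marker is an assumption made for this example, since the thread does not say which marker was meant.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Parses lines of the form "a,b" (whitespace around the comma is allowed).
// Blank lines and lines whose first non-blank character is '#' are skipped.
// '#' as the comment marker is an assumption made for this example.
std::vector<std::pair<int, int>> read_pairs(std::istream& in)
{
    std::vector<std::pair<int, int>> pairs;
    for (std::string line; std::getline(in, line); ) {
        const std::size_t first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos || line[first] == '#')
            continue;                     // blank line or comment line

        std::istringstream iss(line);
        int a = 0, b = 0;
        char sep = 0;
        if (iss >> a >> sep >> b && sep == ',')
            pairs.emplace_back(a, b);     // malformed lines are silently ignored
    }
    return pairs;
}
```

Because the pairs end up in a std::vector, they can be iterated over afterwards one item at a time, much like the Python list mentioned in the comments.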

Use ifstream to read data from a file:

    std::ifstream input("filename.ext");

If you really need to read line by line, then do this:

    for (std::string line; std::getline(input, line); )
    {
        // ... process each line of input ...
    }

But you probably just need to extract coordinate pairs:

    int x, y;
    input >> x >> y;

Update:

Your code uses ofstream myfile;, but the o in ofstream stands for output. If you want to read from a file (input), use ifstream. If you want to both read and write, use fstream.
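
A single input >> x >> y reads one pair. To read every pair in the input, the extraction itself can serve as the loop condition, as in the sketch below. The function name read_all_pairs is hypothetical, and the function takes a std::istream& so that it works with an std::ifstream or an in-memory stream alike.

```cpp
#include <istream>
#include <sstream>
#include <utility>
#include <vector>

// Reads whitespace-separated integer pairs until an extraction fails.
// The stream's bool conversion tests !fail(), so the loop ends when a
// read attempt fails (end of input or malformed data), not merely
// because eofbit was set while reading the last number.
std::vector<std::pair<int, int>> read_all_pairs(std::istream& input)
{
    std::vector<std::pair<int, int>> result;
    int x = 0, y = 0;
    while (input >> x >> y)
        result.emplace_back(x, y);
    return result;
}
```

With this loop the last pair is still stored when the input does not end in a newline, because reaching the end of the stream while reading a number does not by itself make the extraction fail.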

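There are different ways to read a file line by line in C++.

[Fast] Loop with std::getline()

The simplest approach is to open a std::ifstream and loop using std::getline() calls. The code is clean and easy to understand.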

    #include <cstdio>
    #include <fstream>
    #include <string>

    // FILENAME is assumed to be defined elsewhere as the path of the input file.
    std::ifstream file(FILENAME);
    if (file.is_open()) {
        std::string line;
        while (std::getline(file, line)) {
            // using printf() in all tests for consistency
            printf("%s", line.c_str());
        }
        file.close();
    }

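[Fast] Use Boost's file_descriptor_source

Another possibility is to use the Boost library, but the code gets a bit more verbose. The performance is quite similar to that of the code above (loop with std::getline()).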

    #include <boost/iostreams/device/file_descriptor.hpp>
    #include <boost/iostreams/stream.hpp>
    #include <cstdio>
    #include <fcntl.h>
    #include <string>

    namespace io = boost::iostreams;

    void readLineByLineBoost() {
        // FILENAME is assumed to be defined elsewhere as the path of the input file.
        int fdr = open(FILENAME, O_RDONLY);
        if (fdr >= 0) {
            io::file_descriptor_source fdDevice(fdr, io::file_descriptor_flags::close_handle);
            io::stream<io::file_descriptor_source> in(fdDevice);
            if (fdDevice.is_open()) {
                std::string line;
                while (std::getline(in, line)) {
                    // using printf() in all tests for consistency
                    printf("%s", line.c_str());
                }
                fdDevice.close();
            }
        }
    }

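[Fastest] Use C code

If performance is critical for your software, you may consider using the C language. This code can be 4-5 times faster than the C++ versions above; see the benchmark below.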

    #include <stdio.h>
    #include <stdlib.h>

    // getline() here is the POSIX function, not std::getline().
    FILE* fp = fopen(FILENAME, "r");
    if (fp == NULL)
        exit(EXIT_FAILURE);

    char* line = NULL;
    size_t len = 0;
    while ((getline(&line, &len, fp)) != -1) {
        // using printf() in all tests for consistency
        printf("%s", line);
    }
    fclose(fp);
    if (line)
        free(line);
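Benchmark: which one is faster?

I ran some performance benchmarks with the code above, and the results are interesting. I tested the code with ASCII files containing 100,000 lines, 1,000,000 lines and 10,000,000 lines of text. Each line of text contains 10 words on average. The program was compiled with -O3 optimization, and its output was redirected to /dev/null to remove the logging time from the measurement. Last but not least, each piece of code logs every line with the printf() function for consistency.

The results show the time (in ms) that each piece of code took to read the files.

The performance difference between the two C++ approaches is minimal and shouldn't matter in practice. The performance of the C code is what makes the benchmark impressive, and it can be a game changer in terms of speed.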

                                10K lines    100K lines    1000K lines
    Loop with std::getline()    105ms        894ms         9773ms
    Boost code                  106ms        968ms         9561ms
    C code                      23ms         243ms         2397ms
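
The answer does not show how the times were taken. As a rough sketch of one way to time a read loop, the following measures the std::getline() loop with std::chrono::steady_clock. The names TimedCount and time_getline_loop are hypothetical, and the choice of clock is an assumption; it is not claimed to be the method behind the numbers above.

```cpp
#include <chrono>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Result of one timed run (hypothetical helper type for this example).
struct TimedCount {
    std::size_t lines;                  // number of lines read
    std::chrono::milliseconds elapsed;  // wall-clock time of the loop
};

// Counts the lines in a stream and measures how long the loop takes.
TimedCount time_getline_loop(std::istream& in)
{
    const auto start = std::chrono::steady_clock::now();
    std::size_t lines = 0;
    for (std::string line; std::getline(in, line); )
        ++lines;
    const auto stop = std::chrono::steady_clock::now();
    return {lines,
            std::chrono::duration_cast<std::chrono::milliseconds>(stop - start)};
}
```

Passing an std::ifstream measures file reading; passing a std::istringstream measures only the parsing loop, which can help separate disk effects from the cost of the loop itself.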

Since your coordinates belong together as pairs, why not write a struct for them?

    struct CoordinatePair
    {
        int x;
        int y;
    };

Then you can write an overloaded extraction operator for istreams:

    std::istream& operator>>(std::istream& is, CoordinatePair& coordinates)
    {
        is >> coordinates.x >> coordinates.y;

        return is;
    }

And then you can read a file of coordinates straight into a vector like this:

    #include <algorithm>
    #include <fstream>
    #include <iostream>
    #include <iterator>
    #include <vector>

    int main()
    {
        char filename[] = "coordinates.txt";
        std::vector<CoordinatePair> v;
        std::ifstream ifs(filename);
        if (ifs) {
            std::copy(std::istream_iterator<CoordinatePair>(ifs),
                      std::istream_iterator<CoordinatePair>(),
                      std::back_inserter(v));
        }
        else {
            std::cerr << "Couldn't open " << filename << " for reading\n";
        }
        // Now you can work with the contents of v
    }
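
Because the extraction operator is written against std::istream, the same approach works for any input stream. The sketch below repeats the struct and operator so that it compiles on its own, and wraps the reading in a function; the name read_coordinates is hypothetical and not part of the answer.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <vector>

struct CoordinatePair
{
    int x;
    int y;
};

std::istream& operator>>(std::istream& is, CoordinatePair& coordinates)
{
    return is >> coordinates.x >> coordinates.y;
}

// Reads every pair from any input stream (a file or an in-memory stream)
// by constructing the vector directly from a pair of stream iterators.
std::vector<CoordinatePair> read_coordinates(std::istream& is)
{
    return std::vector<CoordinatePair>(std::istream_iterator<CoordinatePair>{is},
                                       std::istream_iterator<CoordinatePair>{});
}
```

For a quick check without a file, the function can be called on a std::istringstream that holds a few lines such as "5 3\n6 4\n7 1\n".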

If your input is:

1,NYC
2,ABQ
...

You can still apply the same logic, like this:

    #include <fstream>
    #include <iostream>
    #include <string>

    std::ifstream infile("thefile.txt");
    if (infile.is_open()) {
        int number;
        std::string str;
        char c;
        while (infile >> number >> c >> str && c == ',')
            std::cout << number << " " << str << "\n";
    }
    infile.close();
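
One caveat: >> into a std::string stops at the first whitespace, so a text field such as "New York" would be cut off. A possible variation, sketched below, reads each line with std::getline() and takes everything after the comma as the text field. The Record type and the read_records function are hypothetical names introduced for this example.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct Record          // hypothetical type for this example
{
    int number;
    std::string name;
};

// Parses lines of the form "<number>,<name>"; the name may contain spaces.
// Lines that do not match this shape are skipped.
std::vector<Record> read_records(std::istream& in)
{
    std::vector<Record> records;
    for (std::string line; std::getline(in, line); ) {
        std::istringstream iss(line);
        Record r{};
        char sep = 0;
        // Read the number and the comma, then the rest of the line as the name.
        if (iss >> r.number >> sep && sep == ',' && std::getline(iss, r.name))
            records.push_back(r);
    }
    return records;
}
```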

This is a general solution for loading data into a C++ program using the getline function. It could be modified for CSV files, but here the delimiter is a space.

    #include <cctype>
    #include <cstddef>
    #include <fstream>
    #include <string>

    constexpr int n = 5, p = 2; // fixed array dimensions: rows, columns
    int X[n][p] = {};

    std::ifstream myfile;
    myfile.open("data.txt");

    std::string line;
    std::string temp = "";
    int a = 0; // row index

    while (a < n && std::getline(myfile, line)) { // while there is a line and room for it
        int b = 0; // column index
        for (std::size_t i = 0; i < line.size(); i++) { // for each character in the row string
            if (!std::isblank(static_cast<unsigned char>(line[i]))) { // if it is not blank, do this
                temp += line[i]; // append the character to the current number
            } else if (!temp.empty()) {
                if (b < p) X[a][b++] = std::stoi(temp); // convert string to int, move to next column
                temp = ""; // reset the capture
            }
        }

        if (!temp.empty() && b < p)
            X[a][b] = std::stoi(temp); // last number on the line
        temp = "";
        a++; // on to the next row
    }
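
When the number of rows and columns is not known in advance, a growable container avoids the fixed-size array. The following sketch swaps the character-by-character scan for a std::istringstream per line and stores the values in a vector of vectors; the function name read_table is hypothetical, and this is an alternative technique, not the code of the answer above.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads a whitespace-separated table of integers.
// Each non-empty input line becomes one row of the result.
std::vector<std::vector<int>> read_table(std::istream& in)
{
    std::vector<std::vector<int>> table;
    for (std::string line; std::getline(in, line); ) {
        std::istringstream iss(line);
        std::vector<int> row;
        for (int value; iss >> value; )
            row.push_back(value);
        if (!row.empty())
            table.push_back(row);
    }
    return table;
}
```

Rows may have different lengths here, so code that needs a rectangular table would have to check row sizes itself.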

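This answer is for Visual Studio 2017, for the case where you want to read a text file from a location relative to your compiled console application.

First put your text file (test.txt in this case) into your solution folder. After compiling, keep the text file in the same folder as applicationName.exe:

C:\Users\"username"\source\repos\"solutionname"\"solutionname"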

    #include <cstdlib>
    #include <fstream>
    #include <iostream>
    #include <string>

    using namespace std;
    int main()
    {
        ifstream inFile;
        // open the file stream
        inFile.open(".\\test.txt");
        // check if opening a file failed
        if (inFile.fail()) {
            cerr << "Error opening a file" << endl;
            inFile.close();
            exit(1);
        }
        string line;
        while (getline(inFile, line))
        {
            cout << line << endl;
        }
        // close the file stream
        inFile.close();
    }

Although there is no need to close the file manually, it is a good idea to do so if the scope of the file variable is larger:

    ifstream infile(szFilePath);

    for (string line = ""; getline(infile, line); )
    {
        // do something with the line
    }

    if (infile.is_open())
        infile.close();