Why select() sometimes times out when the client is busy receiving data

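I wrote a simple client/server application to test the behavior of non-blocking sockets. Here is some brief information about the server and the client: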

// On Linux, the server thread sends a file to the client
// over a non-blocking socket.
void *SendFileThread(void *param){
    CFile* theFile = (CFile*) param;
    int sockfd = theFile->GetSocket();
    set_non_blocking(sockfd);
    set_sock_sndbuf(sockfd, 1024 * 64); // set the send buffer to 64K

    // get the total packet count of the target file
    int PacketCount = theFile->GetFilePacketsCount();
    int CurrPacket = 0;
    while (CurrPacket < PacketCount){
        char buffer[512];
        int len = 0;

        // get packet data by packet no.
        GetPacketData(CurrPacket, buffer, len);

        // send_non_blocking_sock_data will loop and send
        // data into the buffer of sockfd until there is an error
        int ret = send_non_blocking_sock_data(sockfd, buffer, len);
        if (ret < 0 && errno == EAGAIN){
            continue; // send buffer is full: retry the same packet immediately
        } else if (ret < 0 || ret == 0 ){
            break;
        } else {
            CurrPacket++;
        }

        ......
    }
    return NULL;
}

// On Windows, the client thread does something like the following
// to receive the file data sent by the server over a blocking socket.
void *RecvFileThread(void *param){
    int sockfd = (int) param; // blocking socket
    set_sock_rcvbuf(sockfd, 1024 * 256); // set the receive buffer to 256K

    while (1){
        struct timeval timeout;
        timeout.tv_sec = 1;
        timeout.tv_usec = 0;

        fd_set rds;
        FD_ZERO(&rds);
        FD_SET(sockfd, &rds);

        // actually, the first parameter of select() is
        // ignored on Windows, though on Linux this parameter
        // should be (maximum socket value + 1)
        int ret = select(sockfd + 1, &rds, NULL, NULL, &timeout );
        if (ret == 0){
            // log that the timer expired
            CLogger::log("RecvFileThread---Calling select() timeouts\n");
        } else if (ret > 0) {
            // log the amount of data received
            char buffer[1024 * 256];
            int len = recv(sockfd, buffer, sizeof(buffer), 0);
            // handle error
            process_tcp_data(buffer, len);
        } else {
            // handle and break;
            break;
        }

    }
    return NULL;
}

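To my surprise, sends in the server thread fail very often because the socket buffer is full. For example, sending a 14 MB file reports about 50,000 failures with errno = EAGAIN. From the logs, however, I also observed dozens of timeouts during the transfer, in a pattern like this:

In loop N, select() succeeds and 256K of data is read successfully.

In loop (N+1), select() fails with a timeout.

In loop (N+2), select() succeeds and 256K of data is read successfully.

Why do timeouts interleave with successful reads during receiving? Can anyone explain this behavior?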

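[Update]
1. Uploading a 14 MB file to the server takes only 8 seconds.
2. With the same file as in 1, the server needs nearly 30 seconds to send all the data to the client.
3. All sockets used by the client are blocking. All sockets used by the server are non-blocking.

Regarding #2, I think the timeouts are the reason #2 takes longer than #1, and I would like to know why there are so many timeouts while the client is busy receiving data.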
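
As a rough check on that guess, the figures in the post can be put into numbers. This is only a sketch: it assumes 14M means 14 MiB and that every select() timeout costs the full one-second `tv_sec`, and the function names are made up for illustration.

```c
/* Hypothetical helper: number of fixed-size packets needed to carry
 * file_bytes, rounding the last partial packet up. */
static long packets_needed(long file_bytes, long packet_bytes) {
    return (file_bytes + packet_bytes - 1) / packet_bytes;
}

/* Hypothetical helper: how many full select() timeouts of timeout_sec
 * seconds fit into the gap between the slow and the fast transfer. */
static long idle_timeouts_for_gap(long slow_sec, long fast_sec, long timeout_sec) {
    return (slow_sec - fast_sec) / timeout_sec;
}
```

With the post's numbers, `packets_needed(14L * 1024 * 1024, 512)` gives 28,672 packets and `idle_timeouts_for_gap(30, 8, 1)` gives 22, which is in line with the dozens of timeouts seen in the log. This shows the numbers are consistent with the guess, not that the timeouts are the only cause.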

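[Update 2]
Thanks to @Duck, @ebrobe, @EJP, and @ja_mesa for the comments. I will investigate further today and then update this post.
As for why I send 512 bytes per loop in the server thread: I found that the server thread sends data much faster than the client thread can receive it. I am puzzled about why timeouts occur in the client thread.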

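Related discussion

You should call recv() first, and then call select() only if recv() tells you to. Do not call select() first; that is wasted processing. recv() knows whether data is available immediately or whether it has to wait for data to arrive: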

void *RecvFileThread(void *param){
    // the socket needs to be in non-blocking mode for recv()
    // to report WSAEWOULDBLOCK instead of blocking
    int sockfd = (int) param;
    set_sock_rcvbuf(sockfd, 1024 * 256); // set the receive buffer to 256K

    char buffer[1024 * 256];

    while (1){

        int len = recv(sockfd, buffer, sizeof(buffer), 0);
        if (len == -1) {
            if (WSAGetLastError() != WSAEWOULDBLOCK) {
                // handle error
                break;
            }

            struct timeval timeout;
            timeout.tv_sec = 1;
            timeout.tv_usec = 0;

            fd_set rds;
            FD_ZERO(&rds);
            FD_SET(sockfd, &rds);

            // actually, the first parameter of select() is
            // ignored on Windows, though on Linux this parameter
            // should be (maximum socket value + 1)
            int ret = select(sockfd + 1, &rds, NULL, NULL, &timeout );
            if (ret == -1) {
                // handle error
                break;
            }

            if (ret == 0) {
                // log that the timer expired
                break;
            }

            // socket is readable, so try to read again
            continue;
        }

        if (len == 0) {
            // handle graceful disconnect
            break;
        }

        // log the amount of data received
        process_tcp_data(buffer, len);
    }
    return NULL;
}

Do something similar on the sending side. Call send() first, and then call select() to wait for writability only if send() tells you to.
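
The sending-side advice could be sketched roughly as follows for the Linux server. This is a minimal illustration and not the poster's `send_non_blocking_sock_data`: the helper names `make_non_blocking` and `send_all_waiting` and the `timeout_sec` parameter are assumptions made for this example, and it uses only standard POSIX calls (`fcntl`, `send`, `select`).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: switch fd to non-blocking mode. Returns 0 or -1. */
static int make_non_blocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: send all len bytes on a non-blocking socket.
 * send() is tried first; select() is used only after send() reports
 * that the send buffer is full. Returns 0 on success, -1 on error or
 * if the socket does not become writable within timeout_sec seconds. */
static int send_all_waiting(int fd, const char *buf, size_t len, int timeout_sec) {
    assert(buf != NULL);
    size_t sent = 0;
    while (sent < len) {
        ssize_t n = send(fd, buf + sent, len - sent, 0);
        if (n > 0) {
            sent += (size_t)n;          /* partial sends are normal */
            continue;
        }
        if (n == -1 && errno == EINTR) continue;
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            /* Buffer full: sleep in select() instead of spinning. */
            fd_set wds;
            FD_ZERO(&wds);
            FD_SET(fd, &wds);
            struct timeval tv;
            tv.tv_sec = timeout_sec;
            tv.tv_usec = 0;
            int ret = select(fd + 1, NULL, &wds, NULL, &tv);
            if (ret > 0) continue;      /* writable again: retry send() */
            if (ret == -1 && errno == EINTR) continue;
            return -1;                  /* timeout or select() error */
        }
        return -1;                      /* any other send() failure */
    }
    return 0;
}
```

Compared with retrying immediately on every EAGAIN, the thread sleeps in select() until the kernel has room in the send buffer, so a full buffer shows up as one wait rather than as thousands of failed send() calls.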