sscanf rejecting leading whitespace in integer reads
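I want to use `sscanf` to parse a date in the following format:
```
"dd/mm/yyyy"
```
The "dd" and "mm" fields can each be up to 2 digits long (e.g. 0, 6 or 11, but not 123). The "year" field can be either 0 or a four-digit number. A value of 0 in any of the three fields means the system's day, month or year must be used instead.
The format must be strict, so if the input does not fit the pattern, the user must be notified.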
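The "0 means use the system value" rule could be handled after parsing, roughly as below. This is only a sketch: `Date` and `fill_from_system` are hypothetical names that do not appear in the question, and it assumes the local time zone is the one that matters.

```cpp
#include <ctime>

// Hypothetical container for the three parsed fields.
struct Date { unsigned d, m, y; };

// Replaces every zero field with the matching field of today's local date.
Date fill_from_system(Date in) {
    std::time_t now = std::time(nullptr);
    std::tm const* local = std::localtime(&now);  // shared static buffer, not thread-safe
    if (local == nullptr) return in;              // conversion failed, leave input untouched
    if (in.d == 0) in.d = static_cast<unsigned>(local->tm_mday);
    if (in.m == 0) in.m = static_cast<unsigned>(local->tm_mon + 1);      // tm_mon is 0-based
    if (in.y == 0) in.y = static_cast<unsigned>(local->tm_year + 1900);  // years since 1900
    return in;
}
```

Non-zero fields pass through unchanged, so the call can be made unconditionally once the format check has succeeded.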
My attempt is:
```cpp
unsigned d, m, y;  // %u expects unsigned int
char const* input = "23/7/1990";
int n = sscanf(input, "%2u/%2u/%4u", &d, &m, &y);
if (n != 3) throw InvalidDate("Invalid format");

// Fill 0 values with system date.
// Check date correctness with `mktime` and `localtime`.
```
The problem is that this format also accepts inputs such as:
```cpp
char const* invalid1 = "23/ 12/ 1990";
char const* invalid2 = "23/12/1990/123whatever.......";
```
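So, is there any trick or modifier that rejects leading whitespace before an integer, marks the end of the string, or causes a detectable failure when more input is left to parse?
For the last case (invalid2; a failure detectable at the end of the string), a possible solution is: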
```cpp
unsigned d, m, y;
char trick;
char const* input = "23/7/1990";
int n = sscanf(input, "%2u/%2u/%4u%c", &d, &m, &y, &trick);

// If it fills four fields, the input was too long.
if (n != 3) throw InvalidDate("Invalid format");

// Fill 0 values with system date.
```
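But I don't know whether there is a better way to detect the end of the string.
So, without using regular expressions or …
By the way, are there other ways to "trick" `sscanf`?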
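On the end-of-string part of the question, one commonly used alternative to the extra `%c` is the standard `%n` conversion, which stores how many characters have been consumed so far and is not counted in the return value. The sketch below uses a hypothetical helper name, `scan_whole`, and it still skips whitespace before each number, so it does not solve the `invalid1` case.

```cpp
#include <cstdio>

// Hypothetical helper: true if sscanf converts three fields and stops exactly
// at the terminating '\0'. Leading blanks before a number are still accepted.
bool scan_whole(char const* input, unsigned& d, unsigned& m, unsigned& y) {
    int consumed = -1;  // stays -1 if sscanf stops before reaching %n
    int n = std::sscanf(input, "%2u/%2u/%4u%n", &d, &m, &y, &consumed);
    return n == 3 && consumed >= 0 && input[consumed] == '\0';
}
```

Compared with the dummy `char`, this leaves the expected return value at 3 in every valid case and also tells you where parsing stopped.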
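How about this? You can capture each separator with a scanset conversion (`%2[^0-9]`) and then check that it is exactly "/":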
```c
#include <stdio.h>
#include <string.h>

void process_date(const char* input){
    unsigned d, m, y;  /* %u expects unsigned int */
    char sep1[3], sep2[3], trick;
    int n;

    /* %2[^0-9] stores up to two non-digit characters, so "/ " is captured too. */
    n = sscanf(
        input, "%2u%2[^0-9]%2u%2[^0-9]%4u%c",
        &d, sep1, &m, sep2, &y, &trick);

    /* Five conversions expected; a sixth (%c) means trailing input. */
    if(!(n == 5 && strcmp(sep1, "/") == 0 && strcmp(sep2, "/") == 0)){
        fprintf(stderr, "Invalid format (input = %s).\n", input);
        return;
    }

    printf("d = %u, m = %u, y = %u.\n", d, m, y);
}

int main(){
    process_date("23/7/1990");
    process_date("23/12/1990");
    process_date("23/7/0");
    process_date("23/0/1990");
    process_date("0/7/1990");
    process_date("23/ 12/ 1990");
    process_date("23/12/1990/123whatever.......");
    process_date("123/7/1990");
    process_date("23/12/19a90");
    process_date("2a/1");
    process_date("a23/12/1990");
    process_date("23/12/199000");
    return 0;
}
```
Output:
```
d = 23, m = 7, y = 1990.
d = 23, m = 12, y = 1990.
d = 23, m = 7, y = 0.
d = 23, m = 0, y = 1990.
d = 0, m = 7, y = 1990.
Invalid format (input = 23/ 12/ 1990).
Invalid format (input = 23/12/1990/123whatever.......).
Invalid format (input = 123/7/1990).
Invalid format (input = 23/12/19a90).
Invalid format (input = 2a/1).
Invalid format (input = a23/12/1990).
Invalid format (input = 23/12/199000).
```
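I modified your code and got it working:
```c
void parseDate(const char *date)
{
    char trick;
    unsigned d, m, y;  /* %u expects unsigned int */
    int n = sscanf(date, "%2u/%2u/%4u%c", &d, &m, &y, &trick);

    /* Exactly three conversions, and a year in the range 1000 to 9999. */
    if (n != 3 || y < 1000)
        puts("Invalid format!");
    else
        printf("%u %u %u\n", d, m, y);
}
```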
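You mentioned that the "year" can be either zero or four digits, so I modified your code to accept only 1000 to 9999. Otherwise, …
I tested this and wrote the output to a file.
Result: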
```
Sample date: 23/7/1990
Output: 23 7 1990

Sample date: 23/12/1990/123whatever.......
Output: Invalid format!

Sample date: 23/ 12/ 1990
Output: 23 12 1990

Sample date: 23/12/19a90
Output: Invalid format!

Sample date: 2a/1
Output: Invalid format!

Sample date: a23/12/1990
Output: Invalid format!

Sample date: 23/12/199000
Output: Invalid format!
```
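You can refer to the following thread: "How to parse and validate a date in std::string in C++?" One of the answers there suggests using …
You can use the Boost regex library, which can do a lot of this kind of work. Check the following code: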
```cpp
#include <boost/regex.hpp>
#include <iostream>
#include <string>

int main()
{
    // Expression to match
    boost::regex e("^(\\d{1,2})/(\\d{1,2})/(\\d{4})$");

    // Results are here
    boost::match_results<std::string::const_iterator> results;

    std::string val_to_match = "1/11/1990";
    if (boost::regex_search(val_to_match, results, e) && results.size() == 4) {
        std::cout << "Matched " << results[0] << std::endl;
        int i = 1;
        while (i < 4) {
            std::cout << "Value: " << i << " " << results[i] << std::endl;
            i++;
        }
    } else {
        std::cout << "Couldn't match\n";
    }

    return 0;
}
```
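If Boost is not an option, much the same check can probably be done with the standard `<regex>` header (C++11). The sketch below is not from the thread: `match_date` is a hypothetical helper, and the pattern is adjusted so that the year may be either a single 0 or exactly four digits, as the question asks.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: validates the shape only, not calendar correctness.
bool match_date(std::string const& s, unsigned& d, unsigned& m, unsigned& y) {
    // Day and month: one or two digits. Year: a single 0 or exactly four digits.
    static std::regex const re(R"((\d{1,2})/(\d{1,2})/(0|\d{4}))");
    std::smatch parts;
    // regex_match succeeds only if the whole string matches, so no ^ or $ is needed.
    if (!std::regex_match(s, parts, re)) return false;
    d = static_cast<unsigned>(std::stoul(parts[1].str()));
    m = static_cast<unsigned>(std::stoul(parts[2].str()));
    y = static_cast<unsigned>(std::stoul(parts[3].str()));
    return true;
}
```

Because the pattern has no whitespace class, inputs like "23/ 12/ 1990" are rejected without a separate blank scan.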
So. Since everyone seems to agree that there is no way to make …
```cpp
// Requires <cctype> (isspace) and <cstdio> (sscanf).
char const* input = "23/7/1990";
unsigned d, m, y;  // %u expects unsigned int

{ // Search blanks due to `sscanf` limitations.
    for (unsigned i = 0; i < 10 and input[i] != '\0'; ++i)
        if (isspace(static_cast<unsigned char>(input[i])))
            throw InvalidDate("Invalid format");
}

{ // Check format (with extra input detection).
    char trick;
    int n = sscanf(input, "%2u/%2u/%4u%c", &d, &m, &y, &trick);

    if (n != 3 or (y != 0 and y < 1000))
        throw InvalidDate("Invalid format");
}

// Fill 0 values with system date.
// Check date correctness with `mktime` and `localtime`.
```
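Edit: previously, I used …
Performance could be improved by throwing an exception if '\0' is found too early, but in …
Any other "complaints" about this solution are very welcome.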
How about something like this? It doesn't use …
```cpp
// `input` must be a modifiable char array: strtok overwrites each delimiter with '\0'.
int d, m, y;
int date[3]; // holds day/month/year in its cells
int tokenCount = 0;
char* pc;
int result = 0;

char* pch = strtok(input, "/");

while (pch != NULL)
{
    if (strlen(pch) == 0)
    {
        throw InvalidDate("Invalid format");
    }

    // atoi is stupid, there's no way to tell whether the string didn't
    // contain a valid integer or if it contained a zero
    result = strtol(pch, &pc, 10);
    if (*pc != 0)
    {
        throw InvalidDate("Invalid format");
    }

    if (tokenCount > 2) // we got too many tokens
    {
        throw InvalidDate("Invalid format");
    }

    date[tokenCount] = result;
    tokenCount++;

    pch = strtok(NULL, "/");
}

if (tokenCount != 3)
{
    // not enough tokens were supplied
    throw InvalidDate("Invalid format");
}

d = date[0];
m = date[1];
y = date[2];
```
Then you can do further checks, such as whether the month is between 1 and 12.
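For those follow-up checks, one option is the `mktime` round trip the question already hints at: `mktime` normalizes out-of-range fields, so a date is real only if it comes back unchanged. This is a rough sketch with a hypothetical helper name, `is_real_date`. It assumes zero fields have already been replaced with system values and that the year is within the range the platform's `time_t` can represent.

```cpp
#include <ctime>

// Hypothetical helper: true if d/m/y survives mktime normalization unchanged.
bool is_real_date(int d, int m, int y) {
    std::tm t = {};
    t.tm_mday = d;
    t.tm_mon = m - 1;       // tm_mon is 0-based
    t.tm_year = y - 1900;   // years since 1900
    t.tm_hour = 12;         // midday keeps DST shifts from changing the day
    t.tm_isdst = -1;        // let the library work out daylight saving time
    std::tm const asked = t;
    if (std::mktime(&t) == static_cast<std::time_t>(-1)) return false;
    // 31/2 becomes 3/3, month 13 rolls into the next year, day 0 into the previous month.
    return t.tm_mday == asked.tm_mday
        && t.tm_mon == asked.tm_mon
        && t.tm_year == asked.tm_year;
}
```

This covers the month range, days per month and leap years in one comparison, at the cost of depending on what the local C library accepts.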
One thing to keep in mind is that …