-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Fix #958: warn when feof() is used as a while loop condition #8422
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
905393d
49f0467
6161e99
4aa6ce6
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -481,6 +481,73 @@ void CheckIO::invalidScanfError(const Token *tok) | |
| CWE119, Certainty::normal); | ||
| } | ||
|
|
||
| static const Token* findFileReadCall(const Token *start, const Token *end, int varid) | ||
| { | ||
| const Token* found = Token::findmatch(start, "fgets|fgetc|getc|fread|fscanf (", end); | ||
| while (found) { | ||
| const std::vector<const Token*> args = getArguments(found); | ||
| if (!args.empty()) { | ||
| const bool match = (found->str() == "fscanf") | ||
| ? args.front()->varId() == varid | ||
| : args.back()->varId() == varid; | ||
| if (match) | ||
| return found; | ||
| } | ||
| found = Token::findmatch(found->next(), "fgets|fgetc|getc|fread|fscanf (", end); | ||
| } | ||
| return nullptr; | ||
| } | ||
|
|
||
| void CheckIO::checkWrongfeofUsage() | ||
| { | ||
| const SymbolDatabase *symbolDatabase = mTokenizer->getSymbolDatabase(); | ||
|
|
||
| logChecker("CheckIO::checkWrongfeofUsage"); | ||
|
|
||
| for (const Scope * scope : symbolDatabase->functionScopes) { | ||
| for (const Token *tok = scope->bodyStart->next(); tok != scope->bodyEnd; tok = tok->next()) { | ||
| // TODO: Handle do-while and for loops | ||
| if (!Token::simpleMatch(tok, "while ( ! feof (")) | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. your code expects that the argument is just a variable. I like to make match patterns explicit.. |
||
| continue; | ||
|
|
||
| // Bail out if we reach a do-while loop | ||
| if (Token::simpleMatch(tok->previous(), "}") && Token::simpleMatch(tok->linkAt(-1)->previous(), "do")) | ||
| continue; | ||
|
|
||
| // Bail out if we cannot identify file pointer | ||
| const int fpVarId = tok->tokAt(5)->varId(); | ||
| if (fpVarId == 0) | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this does not identify the file pointer very well. varid can be nonzero in some such patterns: |
||
| continue; | ||
|
|
||
| // Usage of feof is correct if a read happens before and within the loop. | ||
| // However, if we find a control flow statement in between the fileReadCall | ||
| // token and the while loop condition, then we bail out. | ||
| const Token *endCond = tok->linkAt(1); | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would prefer an explicit check somewhere that the code is as expected. |
||
| const Token *endBody = endCond->linkAt(1); | ||
|
|
||
| const Token *prevFileReadCallTok = findFileReadCall(scope->bodyStart, tok, fpVarId); | ||
| const Token *loopFileReadCallTok = findFileReadCall(tok, endBody, fpVarId); | ||
| const Token *prevControlFlowTok = Token::findmatch(prevFileReadCallTok, "return|break|goto|continue|throw", tok); | ||
| const Token *loopControlFlowTok = Token::findmatch(tok, "return|break|goto|continue|throw", loopFileReadCallTok); | ||
|
|
||
| if (prevFileReadCallTok && loopFileReadCallTok && !prevControlFlowTok && !loopControlFlowTok) | ||
| continue; | ||
|
|
||
| wrongfeofUsage(getCondTok(tok)); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| void CheckIO::wrongfeofUsage(const Token * tok) | ||
| { | ||
| reportError(tok, Severity::warning, | ||
| "wrongfeofUsage", | ||
| "Using feof() as a loop condition causes the last line to be processed twice.\n" | ||
| "feof() returns true only after a read has failed due to end-of-file, so the loop " | ||
| "body executes once more after the last successful read. Check the return value of " | ||
| "the read function instead (e.g. fgets, fread, fscanf)."); | ||
| } | ||
|
|
||
| //--------------------------------------------------------------------------- | ||
| // printf("%u", "xyz"); // Wrong argument type | ||
| // printf("%u%s", 1); // Too few arguments | ||
|
|
@@ -2030,6 +2097,7 @@ void CheckIO::runChecks(const Tokenizer &tokenizer, ErrorLogger *errorLogger) | |
| checkIO.checkWrongPrintfScanfArguments(); | ||
| checkIO.checkCoutCerrMisusage(); | ||
| checkIO.checkFileUsage(); | ||
| checkIO.checkWrongfeofUsage(); | ||
| checkIO.invalidScanf(); | ||
| } | ||
|
|
||
|
|
@@ -2044,6 +2112,7 @@ void CheckIO::getErrorMessages(ErrorLogger *errorLogger, const Settings *setting | |
| c.useClosedFileError(nullptr); | ||
| c.seekOnAppendedFileError(nullptr); | ||
| c.incompatibleFileOpenError(nullptr, "tmp"); | ||
| c.wrongfeofUsage(nullptr); | ||
| c.invalidScanfError(nullptr); | ||
| c.wrongPrintfScanfArgumentsError(nullptr, "printf",3,2); | ||
| c.invalidScanfArgTypeError_s(nullptr, 1, "s", nullptr); | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,39 @@ | ||
| # wrongfeofUsage | ||
|
|
||
| **Message**: Using feof() as a loop condition causes the last line to be processed twice.<br/> | ||
| **Category**: Correctness<br/> | ||
| **Severity**: Warning<br/> | ||
| **Language**: C/C++ | ||
|
|
||
| ## Description | ||
|
|
||
| `feof()` returns non-zero only after a read operation has failed because the end of file was reached. When used as the sole condition of a loop, the loop body executes one extra time after the last successful read: the read fails silently (or returns partial data), and only then does `feof()` return true and terminate the loop. | ||
|
|
||
| This checker warns when it finds feof in the loop condition and either: | ||
| - no file-read call (e.g. `fgets`, `fread`, `fscanf`) precedes the loop and is also present inside the loop body | ||
| - a control-flow statement (`return`, `break`, `goto`, `continue`, `throw`) is present in the the loop body. | ||
|
|
||
| ## How to fix | ||
|
|
||
| Check the return value of the read function directly in the loop condition. | ||
|
|
||
| Before: | ||
| ```c | ||
| void process(FILE *fp) { | ||
| char line[256]; | ||
| while (!feof(fp)) { /* wrong: processes last line twice */ | ||
| fgets(line, sizeof(line), fp); | ||
| puts(line); | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| After: | ||
| ```c | ||
| void process(FILE *fp) { | ||
| char line[256]; | ||
| while (fgets(line, sizeof(line), fp) != NULL) { | ||
| puts(line); | ||
| } | ||
| } | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imagine such function call:
If skip_empty_lines is not defined in the current TU (translation unit) then you need to bailout somehow so there are not false positives.
If skip_empty_lines is defined in the current TU you could look into that and determine if there is file read but I am not against that we bailout for that case also.
If you pass f to a function it's highly likely it is to read it.. why else would it be passed to the function since you use feof on f also..