Protect against crashes during header write #66
| } | ||
|
|
||
| /// Returns the next encodable message, seeking to the beginning of the next message. | ||
| func nextEncodedMessage(previousReadWasEmpty: Bool = false) throws -> Data? { |
The case we previously caught with previousReadWasEmpty is now caught by our new checks below!
|
|
||
| /// Calculates the offset in the file where the header should end. | ||
| static var expectedEndOfHeaderInFile = Field(rawValue: Field.allCases.endIndex)!.expectedEndOfFieldInFile | ||
| static let expectedEndOfHeaderInFile = Field(rawValue: Field.allCases.endIndex)!.expectedEndOfFieldInFile |
This change is unrelated to the overall PR but I saw it and couldn't unsee it.
| maximumBytes: header.maximumBytes, | ||
| overwritesOldMessages: header.overwritesOldMessages), | ||
| maximumBytes: randomHighValue), | ||
| header: header, |
oops. we were previously re-creating the header rather than using the one we already had.
If we crash while writing a header, we can end up in a state where our offsetInFileAtEndOfNewestMessage is invalid. Our reading code did not protect against an invalid offsetInFileAtEndOfNewestMessage prior to this change, which could result in an infinite loop while reading messages.
a34648d to
05feae3
| XCTAssertEqual(messages, []) | ||
| } | ||
|
|
||
| func test_messages_throwsFileCorruptedWhenOffsetInFileAtEndOfNewsetMessageOutOfSync() throws { |
This test was passing before I deleted it in 4d3be01, but it executed the same code path as test_messages_throwsFileCorruptedWhenOffsetInFileAtEndOfNewestMessageIsBeyondEndOfNewestMessageButBeforeEndOfFile, and I liked the new test better since it didn't rely on random large numbers.
What I'm understanding is this deleted test is basically a special case of test_messages_throwsFileCorruptedWhenOffsetInFileAtEndOfNewestMessageIsBeyondEndOfNewestMessageButBeforeEndOfFile where we had no messages. Is that correct?
fec5525 to
31e6caa
d5f950e to
4d6b753
| case .emptyRead: | ||
| guard !previousReadWasEmpty else { | ||
| // If the previous read was also empty, then the file has been corrupted. | ||
| guard !(startingOffset < offsetInFileAtEndOfNewestMessage) else { |
| guard !(startingOffset < offsetInFileAtEndOfNewestMessage) else { | |
| guard startingOffset >= offsetInFileAtEndOfNewestMessage else { |
So in this condition we will seek to FileHeader.expectedEndOfHeaderInFile when startingOffset >= offsetInFileAtEndOfNewestMessage. Why do we need to go back? Is it related to overwritesOldMessages == true?
I went back and forth on whether to use ! here – I appreciate your weighing in. I'll change it 🙂
We check startingOffset >= offsetInFileAtEndOfNewestMessage here because if startingOffset < offsetInFileAtEndOfNewestMessage, we expect to be reading a message! If we read empty data under this condition, we know that our offsetInFileAtEndOfNewestMessage is incorrect, or that a message we read earlier wrote an incorrect message length – i.e. we know that the file is corrupted.
This check is implicitly related to overwritesOldMessages == true. As we discussed in #64, when overwritesOldMessages == false, we should never seek back to the beginning of the file. On this PR, when overwritesOldMessages == false we will not seek back due to a corrupted file: either this check will catch a corrupted file, or this check will.
You can prove the above to yourself by setting overwritesOldMessages: false in the unit tests I wrote on this PR.
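To make the reasoning above concrete, here is a minimal sketch of the empty-read decision in isolation. It is not the code from CacheReader.swift: the names SketchError, EmptyReadAction, and actionAfterEmptyRead are hypothetical and exist only for illustration, and the sketch assumes both offsets are plain UInt64 values.

```swift
/// Hypothetical error type for this sketch (not the library's error type).
enum SketchError: Error, Equatable {
    case fileCorrupted
}

/// Hypothetical description of what a reader does after an empty read.
enum EmptyReadAction: Equatable {
    /// Seek back to the first byte after the header and keep reading.
    case seekToEndOfHeader
}

/// Decides what to do after reading empty data at `startingOffset`.
/// If we have not yet reached the end of the newest message, we expected
/// to read a message, so an empty read means the file is corrupted.
func actionAfterEmptyRead(
    startingOffset: UInt64,
    offsetInFileAtEndOfNewestMessage: UInt64
) throws -> EmptyReadAction {
    guard startingOffset >= offsetInFileAtEndOfNewestMessage else {
        throw SketchError.fileCorrupted
    }
    return .seekToEndOfHeader
}
```

Because the corrupted case throws instead of seeking back, a reader built this way cannot keep cycling through the file on an invalid offsetInFileAtEndOfNewestMessage.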
Got it. Now I fully understand the overwritesOldMessages == true case : )
The link in "this check will" I think is missing a line number. Can we update for clarity?
Oops! Updated above, and linking here:
CacheAdvance/Sources/CacheAdvance/CacheReader.swift
Lines 58 to 68 in fd03b6d
Codecov Report
@@ Coverage Diff @@
## main #66 +/- ##
==========================================
+ Coverage 96.26% 96.36% +0.10%
==========================================
Files 14 14
Lines 562 578 +16
==========================================
+ Hits 541 557 +16
Misses 21 21
|
f104ed6 to
c1ce688
|
|
||
| // Make the file corrupted by setting the offset at end of newest message to be further in the file. | ||
| // This could happen if a crash occurred during a write of `header.offsetInFileAtEndOfNewestMessage` on a big-endian device. | ||
| // Big-endian devices write the most significant digits first, meaning that if we were updating offsetInFileAtEndOfNewestMessage from 00001010 to 00010000, it would be possible to crash with the following bytes written to disk: 00011010.
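The torn write described in that comment can be reproduced with a small sketch. It assumes the offset is stored as a UInt64 in big-endian byte order and treats each digit in the comment as one byte. The helper tornBigEndianWrite is hypothetical and is not part of CacheAdvance.

```swift
/// Simulates overwriting `old` with `new` in big-endian byte order when the
/// process crashes after only `bytesWritten` bytes have reached disk.
/// Hypothetical helper for illustration only.
func tornBigEndianWrite(old: UInt64, new: UInt64, bytesWritten: Int) -> UInt64 {
    // `bigEndian` lays the value out most-significant byte first in memory.
    let oldBytes = withUnsafeBytes(of: old.bigEndian) { Array($0) }
    let newBytes = withUnsafeBytes(of: new.bigEndian) { Array($0) }

    // The leading bytes come from the new value, the rest are stale.
    let torn = Array(newBytes.prefix(bytesWritten)) + Array(oldBytes.dropFirst(bytesWritten))

    // Reassemble the on-disk bytes into the value a reader would decode.
    return torn.reduce(UInt64(0)) { ($0 << 8) | UInt64($1) }
}

// Bytes 00 00 00 00 01 00 01 00 being replaced by 00 00 00 01 00 00 00 00.
let old: UInt64 = 0x0000_0000_0100_0100
let new: UInt64 = 0x0000_0001_0000_0000
let onDisk = tornBigEndianWrite(old: old, new: new, bytesWritten: 4)
```

Here onDisk decodes to bytes 00 00 00 01 01 00 01 00, which is larger than both the old and the new offset. That is the "further in the file" corruption the test sets up.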
This reverts commit 69881ef.
… > offsetInFileAtEndOfNewestMessage to avoid possible loop (#72)
* Revert "Protect against crashes during header write (#66)" This reverts commit 69881ef.
* Split two ranges to read messages if offsetInFileOfOldestMessage > offsetInFileAtEndOfNewestMessage to avoid possible loop
* bump version
* keep tests
* add comment
* add parameter comments
* support all the way back to Xcode 11
* Apply suggestions from code review Co-authored-by: Dan Federman <dfed@me.com>
* Handle cache file as empty when offsetInFileOfOldestMessage == offsetInFileAtEndOfNewestMessage
* set func nextEncodedMessage private
* handle offsetInFile > endOffset in func encodedMessagesFromOffset
Co-authored-by: Dan Federman <dfed@me.com>
This PR was inspired by #64 and #64 (comment).