Protect against crashes during header write by dfed · Pull Request #66 · dfed/CacheAdvance

dfed · 2022-10-11T18:14:57Z

If we crash while writing a header, we can end up in a state where our offsetInFileAtEndOfNewestMessage is invalid. Our reading code did not protect against an invalid offsetInFileAtEndOfNewestMessage prior to this change, which could result in an infinite loop while reading messages.

This PR was inspired by #64 and #64 (comment).

dfed · 2022-10-11T18:15:28Z

    }

    /// Returns the next encodable message, seeking to the beginning of the next message.
-    func nextEncodedMessage(previousReadWasEmpty: Bool = false) throws -> Data? {


The case we previously caught with previousReadWasEmpty is now caught by our new checks below!

dfed · 2022-10-11T18:16:13Z


    /// Calculates the offset in the file where the header should end.
-    static var expectedEndOfHeaderInFile = Field(rawValue: Field.allCases.endIndex)!.expectedEndOfFieldInFile
+    static let expectedEndOfHeaderInFile = Field(rawValue: Field.allCases.endIndex)!.expectedEndOfFieldInFile


This change is unrelated to the overall PR but I saw it and couldn't unsee it.

dfed · 2022-10-11T18:17:17Z

-                maximumBytes: header.maximumBytes,
-                overwritesOldMessages: header.overwritesOldMessages),
+                maximumBytes: randomHighValue),
+            header: header,


oops. we were previously re-creating the header rather than using the one we already had.

If we crash while writing a header, we can end up in a state where our offsetInFileAtEndOfNewestMessage is invalid. Our reading code did not protect against an invalid offsetInFileAtEndOfNewestMessage prior to this change, which could result in an infinite loop while reading messages.

dfed · 2022-10-11T18:23:03Z

        XCTAssertEqual(messages, [])
    }

-    func test_messages_throwsFileCorruptedWhenOffsetInFileAtEndOfNewsetMessageOutOfSync() throws {


This test was passing before I deleted it in 4d3be01, but it executed the same code path as test_messages_throwsFileCorruptedWhenOffsetInFileAtEndOfNewestMessageIsBeyondEndOfNewestMessageButBeforeEndOfFile, and I liked the new test better since it didn't rely on random large numbers.

What I'm understanding is this deleted test is basically a special case of test_messages_throwsFileCorruptedWhenOffsetInFileAtEndOfNewestMessageIsBeyondEndOfNewestMessageButBeforeEndOfFile where we had no messages. Is that correct?

bachand

I took a first pass over this @dfed . It takes me awhile to load all of this back in my brain 😅 . Looking real good. I plan to return tomorrow AM if it's ready for another review then.

bachand · 2022-10-12T01:21:32Z

        XCTAssertEqual(messages, [])
    }

-    func test_messages_throwsFileCorruptedWhenOffsetInFileAtEndOfNewsetMessageOutOfSync() throws {


What I'm understanding is this deleted test is basically a special case of test_messages_throwsFileCorruptedWhenOffsetInFileAtEndOfNewestMessageIsBeyondEndOfNewestMessageButBeforeEndOfFile where we had no messages. Is that correct?

jianjunwoo · 2022-10-12T08:18:40Z

        case .emptyRead:
-            guard !previousReadWasEmpty else {
-                // If the previous read was also empty, then the file has been corrupted.
+            guard !(startingOffset < offsetInFileAtEndOfNewestMessage) else {


Suggested change

guard !(startingOffset < offsetInFileAtEndOfNewestMessage) else {

guard startingOffset >= offsetInFileAtEndOfNewestMessage else {

In this condition, so we will seek to FileHeader.expectedEndOfHeaderInFile when startingOffset >= offsetInFileAtEndOfNewestMessage. Why do we need to go back, is it related with overwritesOldMessages == true?

I went back and forth on whether to use ! here – I appreciate your weighing in. I'll change it 🙂

We check startingOffset >= offsetInFileAtEndOfNewestMessage here because if startingOffset < offsetInFileAtEndOfNewestMessage, we expect to be reading a message! If we read empty data under this condition, we know that our offsetInFileAtEndOfNewestMessage is incorrect, or that a message we read earlier wrote an incorrect message length – i.e. we know that the file is corrupted.

This check is implicitly related to overwritesOldMessages == true. As we discussed in #64, when overwritesOldMessages == false, we should never seek back to the beginning of the file. On this PR, when overwritesOldMessages == false we will not seek back due to a corrupted file: either this check will catch a corrupted file, or this check will.

You can prove the above to yourself by setting overwritesOldMessages: false in the unit tests I wrote on this PR.

Updated the guard: c1ce688

We avoid the need for the = check since we check for == above

Got it. Now I got fully understanding for overwritesOldMessages == true : )

The link in "this check will" I think is missing a line number. Can we update for clarity?

Oops! Updated above, and linking here:

CacheAdvance/Sources/CacheAdvance/CacheReader.swift

Lines 58 to 68 in fd03b6d

// Check our assumptions before we try to read the message.

let endOfMessage = startingOffset + UInt64(MessageSpan.storageLength) + UInt64(messageLength)

let startingOffsetIsBeforeEndOfNewestMessageAndDoesNotExceedEndOfNewestMessage = startingOffset < offsetInFileAtEndOfNewestMessage && endOfMessage <= offsetInFileAtEndOfNewestMessage

let startingOffsetIsAfterEndOfNewestMessageAndDoesNotExceedEndOfFile = offsetInFileAtEndOfNewestMessage < startingOffset && endOfMessage <= maximumBytes

guard

startingOffsetIsBeforeEndOfNewestMessageAndDoesNotExceedEndOfNewestMessage

|| startingOffsetIsAfterEndOfNewestMessageAndDoesNotExceedEndOfFile

else {

// The offsetInFileAtEndOfNewestMessage is incorrect. This likely occured due to a crash when writing our header file.

throw CacheAdvanceError.fileCorrupted

}

codecov · 2022-10-12T14:35:25Z

Codecov Report

Merging #66 (f56d321) into main (16e6563) will increase coverage by 0.10%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main      #66      +/-   ##
==========================================
+ Coverage   96.26%   96.36%   +0.10%     
==========================================
  Files          14       14              
  Lines         562      578      +16     
==========================================
+ Hits          541      557      +16     
  Misses         21       21

Impacted Files	Coverage Δ
Sources/CacheAdvance/FileHeader.swift	`100.00% <ø> (ø)`
Sources/CacheAdvance/CacheAdvance.swift	`100.00% <100.00%> (ø)`
Sources/CacheAdvance/CacheReader.swift	`94.25% <100.00%> (+1.10%)`	⬆️

bachand

Thanks @dfed ! Great work.

bachand · 2022-10-12T17:56:49Z

+
+        // Make the file corrupted by setting the offset at end of newest message to be further in the file.
+        // This could happen if a crash occurred during a write of `header.offsetInFileAtEndOfNewestMessage` on a big-endian device.
+        // Big-endian devices write the most significant digits first, meaning that if we were offsetInFileAtEndOfNewestMessage from 00001010 to 00010000, it would be possible to crash with the following bytes written to disk: 00011010.


This reverts commit 69881ef.

… > offsetInFileAtEndOfNewestMessage to avoid possible loop (#72) * Revert "Protect against crashes during header write (#66)" This reverts commit 69881ef. * Split two ranges to read messages if offsetInFileOfOldestMessage > offsetInFileAtEndOfNewestMessage to avoid possible loop * bump version * keep tests * add comment * add parameter comments * support all the way back to Xcode 11 * Apply suggestions from code review Co-authored-by: Dan Federman <dfed@me.com> * Hanle cache file as empty when offsetInFileOfOldestMessage == offsetInFileAtEndOfNewestMessage * set func nextEncodedMessage private * handle offsetInFile > endOffset in func encodedMessagesFromOffset Co-authored-by: Dan Federman <dfed@me.com>

dfed commented Oct 11, 2022

View reviewed changes

dfed force-pushed the dfed--protect-against-crash-during-header-write branch from a34648d to 05feae3 Compare October 11, 2022 18:18

Delete duplicative test

4d3be01

dfed commented Oct 11, 2022

View reviewed changes

Get code compiling on older Xcode versions

31e6caa

dfed force-pushed the dfed--protect-against-crash-during-header-write branch from fec5525 to 31e6caa Compare October 11, 2022 18:31

dfed added 2 commits October 11, 2022 14:33

Bump podspec

3e51da8

Merge branch 'main' into dfed--protect-against-crash-during-header-write

0b63b1e

dfed mentioned this pull request Oct 11, 2022

Add readerForwardOnly to CacheAdvance init parameter to avoid possible infinite loop #64

Closed

dfed requested a review from bachand October 11, 2022 20:48

dfed marked this pull request as ready for review October 11, 2022 20:48

bachand reviewed Oct 12, 2022

View reviewed changes

Bachand feedback first pass

4d6b753

dfed force-pushed the dfed--protect-against-crash-during-header-write branch from d5f950e to 4d6b753 Compare October 12, 2022 04:08

jianjunwoo reviewed Oct 12, 2022

View reviewed changes

Simplify guard

c1ce688

dfed force-pushed the dfed--protect-against-crash-during-header-write branch from f104ed6 to c1ce688 Compare October 12, 2022 14:48

Merge branch 'main' into dfed--protect-against-crash-during-header-write

f56d321

bachand approved these changes Oct 12, 2022

View reviewed changes

dfed merged commit 69881ef into main Oct 12, 2022

dfed deleted the dfed--protect-against-crash-during-header-write branch October 12, 2022 18:42

jianjunwoo added a commit to jianjunwoo/CacheAdvance that referenced this pull request Nov 4, 2022

Revert "Protect against crashes during header write (dfed#66)"

b267dbd

This reverts commit 69881ef.

dfed mentioned this pull request Nov 4, 2022

Split into two ranges to read messages if offsetInFileOfOldestMessage > offsetInFileAtEndOfNewestMessage to avoid possible loop #72

Merged

dfed mentioned this pull request Nov 19, 2022

Make ObjectiveCError type Sendable without warnings #73

Merged

	guard !(startingOffset < offsetInFileAtEndOfNewestMessage) else {
	guard startingOffset >= offsetInFileAtEndOfNewestMessage else {

	// Check our assumptions before we try to read the message.
	let endOfMessage = startingOffset + UInt64(MessageSpan.storageLength) + UInt64(messageLength)
	let startingOffsetIsBeforeEndOfNewestMessageAndDoesNotExceedEndOfNewestMessage = startingOffset < offsetInFileAtEndOfNewestMessage && endOfMessage <= offsetInFileAtEndOfNewestMessage
	let startingOffsetIsAfterEndOfNewestMessageAndDoesNotExceedEndOfFile = offsetInFileAtEndOfNewestMessage < startingOffset && endOfMessage <= maximumBytes
	guard
	startingOffsetIsBeforeEndOfNewestMessageAndDoesNotExceedEndOfNewestMessage
	\|\| startingOffsetIsAfterEndOfNewestMessageAndDoesNotExceedEndOfFile
	else {
	// The offsetInFileAtEndOfNewestMessage is incorrect. This likely occured due to a crash when writing our header file.
	throw CacheAdvanceError.fileCorrupted
	}

Uh oh!

Conversation

dfed commented Oct 11, 2022

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bachand left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jianjunwoo Oct 12, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

dfed Oct 12, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

codecov Bot commented Oct 12, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

bachand left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

jianjunwoo Oct 12, 2022 •

edited

Loading

dfed Oct 12, 2022 •

edited

Loading

codecov Bot commented Oct 12, 2022 •

edited

Loading