Skip to content

Protect against crashes during header write#66

Merged
dfed merged 8 commits into
mainfrom
dfed--protect-against-crash-during-header-write
Oct 12, 2022
Merged

Protect against crashes during header write#66
dfed merged 8 commits into
mainfrom
dfed--protect-against-crash-during-header-write

Conversation

@dfed

@dfed dfed commented Oct 11, 2022

Copy link
Copy Markdown
Owner

If we crash while writing a header, we can end up in a state where our offsetInFileAtEndOfNewestMessage is invalid. Our reading code did not protect against an invalid offsetInFileAtEndOfNewestMessage prior to this change, which could result in an infinite loop while reading messages.

This PR was inspired by #64 and #64 (comment).

}

/// Returns the next encodable message, seeking to the beginning of the next message.
func nextEncodedMessage(previousReadWasEmpty: Bool = false) throws -> Data? {

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The case we previously caught with previousReadWasEmpty is now caught by our new checks below!


/// Calculates the offset in the file where the header should end.
static var expectedEndOfHeaderInFile = Field(rawValue: Field.allCases.endIndex)!.expectedEndOfFieldInFile
static let expectedEndOfHeaderInFile = Field(rawValue: Field.allCases.endIndex)!.expectedEndOfFieldInFile

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is unrelated to the overall PR but I saw it and couldn't unsee it.

maximumBytes: header.maximumBytes,
overwritesOldMessages: header.overwritesOldMessages),
maximumBytes: randomHighValue),
header: header,

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops. we were previously re-creating the header rather than using the one we already had.

If we crash while writing a header, we can end up in a state where our
offsetInFileAtEndOfNewestMessage is invalid. Our reading code did not
protect against an invalid offsetInFileAtEndOfNewestMessage prior to
this change, which could result in an infinite loop while reading
messages.
@dfed dfed force-pushed the dfed--protect-against-crash-during-header-write branch from a34648d to 05feae3 Compare October 11, 2022 18:18
XCTAssertEqual(messages, [])
}

func test_messages_throwsFileCorruptedWhenOffsetInFileAtEndOfNewsetMessageOutOfSync() throws {

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was passing before I deleted it in 4d3be01, but it executed the same code path as test_messages_throwsFileCorruptedWhenOffsetInFileAtEndOfNewestMessageIsBeyondEndOfNewestMessageButBeforeEndOfFile, and I liked the new test better since it didn't rely on random large numbers.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I'm understanding is this deleted test is basically a special case of test_messages_throwsFileCorruptedWhenOffsetInFileAtEndOfNewestMessageIsBeyondEndOfNewestMessageButBeforeEndOfFile where we had no messages. Is that correct?

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct!

@dfed dfed force-pushed the dfed--protect-against-crash-during-header-write branch from fec5525 to 31e6caa Compare October 11, 2022 18:31

@bachand bachand left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a first pass over this @dfed . It takes me awhile to load all of this back in my brain 😅 . Looking real good. I plan to return tomorrow AM if it's ready for another review then.

Comment thread Sources/CacheAdvance/CacheReader.swift Outdated
Comment thread Sources/CacheAdvance/CacheReader.swift Outdated
Comment thread Sources/CacheAdvance/CacheReader.swift Outdated
Comment thread Sources/CacheAdvance/CacheReader.swift Outdated
Comment thread Tests/CacheAdvanceTests/CacheAdvanceTests.swift
Comment thread Tests/CacheAdvanceTests/CacheAdvanceTests.swift
XCTAssertEqual(messages, [])
}

func test_messages_throwsFileCorruptedWhenOffsetInFileAtEndOfNewsetMessageOutOfSync() throws {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I'm understanding is this deleted test is basically a special case of test_messages_throwsFileCorruptedWhenOffsetInFileAtEndOfNewestMessageIsBeyondEndOfNewestMessageButBeforeEndOfFile where we had no messages. Is that correct?

Comment thread Tests/CacheAdvanceTests/CacheAdvanceTests.swift
Comment thread Sources/CacheAdvance/CacheReader.swift Outdated
Comment thread Sources/CacheAdvance/CacheReader.swift
@dfed dfed force-pushed the dfed--protect-against-crash-during-header-write branch from d5f950e to 4d6b753 Compare October 12, 2022 04:08
Comment thread Sources/CacheAdvance/CacheReader.swift Outdated
case .emptyRead:
guard !previousReadWasEmpty else {
// If the previous read was also empty, then the file has been corrupted.
guard !(startingOffset < offsetInFileAtEndOfNewestMessage) else {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
guard !(startingOffset < offsetInFileAtEndOfNewestMessage) else {
guard startingOffset >= offsetInFileAtEndOfNewestMessage else {

@jianjunwoo jianjunwoo Oct 12, 2022

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this condition, so we will seek to FileHeader.expectedEndOfHeaderInFile when startingOffset >= offsetInFileAtEndOfNewestMessage. Why do we need to go back, is it related with overwritesOldMessages == true?

@dfed dfed Oct 12, 2022

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went back and forth on whether to use ! here – I appreciate your weighing in. I'll change it 🙂

We check startingOffset >= offsetInFileAtEndOfNewestMessage here because if startingOffset < offsetInFileAtEndOfNewestMessage, we expect to be reading a message! If we read empty data under this condition, we know that our offsetInFileAtEndOfNewestMessage is incorrect, or that a message we read earlier wrote an incorrect message length – i.e. we know that the file is corrupted.

This check is implicitly related to overwritesOldMessages == true. As we discussed in #64, when overwritesOldMessages == false, we should never seek back to the beginning of the file. On this PR, when overwritesOldMessages == false we will not seek back due to a corrupted file: either this check will catch a corrupted file, or this check will.

You can prove the above to yourself by setting overwritesOldMessages: false in the unit tests I wrote on this PR.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the guard: c1ce688

We avoid the need for the = check since we check for == above

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Now I got fully understanding for overwritesOldMessages == true : )

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The link in "this check will" I think is missing a line number. Can we update for clarity?

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops! Updated above, and linking here:

// Check our assumptions before we try to read the message.
let endOfMessage = startingOffset + UInt64(MessageSpan.storageLength) + UInt64(messageLength)
let startingOffsetIsBeforeEndOfNewestMessageAndDoesNotExceedEndOfNewestMessage = startingOffset < offsetInFileAtEndOfNewestMessage && endOfMessage <= offsetInFileAtEndOfNewestMessage
let startingOffsetIsAfterEndOfNewestMessageAndDoesNotExceedEndOfFile = offsetInFileAtEndOfNewestMessage < startingOffset && endOfMessage <= maximumBytes
guard
startingOffsetIsBeforeEndOfNewestMessageAndDoesNotExceedEndOfNewestMessage
|| startingOffsetIsAfterEndOfNewestMessageAndDoesNotExceedEndOfFile
else {
// The offsetInFileAtEndOfNewestMessage is incorrect. This likely occured due to a crash when writing our header file.
throw CacheAdvanceError.fileCorrupted
}

@codecov

codecov Bot commented Oct 12, 2022

Copy link
Copy Markdown

Codecov Report

Merging #66 (f56d321) into main (16e6563) will increase coverage by 0.10%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #66      +/-   ##
==========================================
+ Coverage   96.26%   96.36%   +0.10%     
==========================================
  Files          14       14              
  Lines         562      578      +16     
==========================================
+ Hits          541      557      +16     
  Misses         21       21              
Impacted Files Coverage Δ
Sources/CacheAdvance/FileHeader.swift 100.00% <ø> (ø)
Sources/CacheAdvance/CacheAdvance.swift 100.00% <100.00%> (ø)
Sources/CacheAdvance/CacheReader.swift 94.25% <100.00%> (+1.10%) ⬆️

@dfed dfed force-pushed the dfed--protect-against-crash-during-header-write branch from f104ed6 to c1ce688 Compare October 12, 2022 14:48

@bachand bachand left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @dfed ! Great work.


// Make the file corrupted by setting the offset at end of newest message to be further in the file.
// This could happen if a crash occurred during a write of `header.offsetInFileAtEndOfNewestMessage` on a big-endian device.
// Big-endian devices write the most significant digits first, meaning that if we were offsetInFileAtEndOfNewestMessage from 00001010 to 00010000, it would be possible to crash with the following bytes written to disk: 00011010.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it.

Comment thread Sources/CacheAdvance/CacheReader.swift
@dfed dfed merged commit 69881ef into main Oct 12, 2022
@dfed dfed deleted the dfed--protect-against-crash-during-header-write branch October 12, 2022 18:42
jianjunwoo added a commit to jianjunwoo/CacheAdvance that referenced this pull request Nov 4, 2022
dfed added a commit that referenced this pull request Nov 10, 2022
… > offsetInFileAtEndOfNewestMessage to avoid possible loop (#72)

* Revert "Protect against crashes during header write (#66)"

This reverts commit 69881ef.

* Split two ranges to read messages if offsetInFileOfOldestMessage > offsetInFileAtEndOfNewestMessage to avoid possible loop

* bump version

* keep tests

* add comment

* add parameter comments

* support all the way back to Xcode 11

* Apply suggestions from code review

Co-authored-by: Dan Federman <dfed@me.com>

* Hanle cache file as empty when offsetInFileOfOldestMessage == offsetInFileAtEndOfNewestMessage

* set func nextEncodedMessage private

* handle offsetInFile > endOffset in func encodedMessagesFromOffset

Co-authored-by: Dan Federman <dfed@me.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants