-
-
Notifications
You must be signed in to change notification settings - Fork 6
Prevent infinite recursion loop while reading messages #52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
b728029
45b66b5
41e32b5
4ae0a03
81ac1d6
8780b71
0bb76ff
a56f206
2470113
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -36,20 +36,39 @@ public final class CacheAdvance<T: Codable> { | |
| /// - Warning: `shouldOverwriteOldMessages` must be consistent for the life of a cache. Changing this value after logs have been persisted to a cache will prevent appending new messages to this cache. | ||
| /// - Warning: `decoder` must have a consistent implementation for the life of a cache. Changing this value after logs have been persisted to a cache may prevent reading messages from this cache. | ||
| /// - Warning: `encoder` must have a consistent implementation for the life of a cache. Changing this value after logs have been persisted to a cache may prevent reading messages from this cache. | ||
| public init( | ||
| public convenience init( | ||
| fileURL: URL, | ||
| maximumBytes: Bytes, | ||
| shouldOverwriteOldMessages: Bool, | ||
| decoder: MessageDecoder = JSONDecoder(), | ||
| encoder: MessageEncoder = JSONEncoder()) | ||
| throws | ||
| { | ||
| self.fileURL = fileURL | ||
|
|
||
| writer = try FileHandle(forWritingTo: fileURL) | ||
| reader = try CacheReader(forReadingFrom: fileURL) | ||
| header = try CacheHeaderHandle(forReadingFrom: fileURL, maximumBytes: maximumBytes, overwritesOldMessages: shouldOverwriteOldMessages) | ||
| self.init( | ||
| fileURL: fileURL, | ||
| writer: try FileHandle(forWritingTo: fileURL), | ||
| reader: try CacheReader(forReadingFrom: fileURL), | ||
| header: try CacheHeaderHandle( | ||
| forReadingFrom: fileURL, | ||
| maximumBytes: maximumBytes, | ||
| overwritesOldMessages: shouldOverwriteOldMessages), | ||
| decoder: decoder, | ||
| encoder: encoder) | ||
| } | ||
|
|
||
| /// An internal initializer with no logic. Can be used to create pathological test cases. | ||
| required init( | ||
| fileURL: URL, | ||
| writer: FileHandle, | ||
| reader: CacheReader, | ||
| header: CacheHeaderHandle, | ||
| decoder: MessageDecoder, | ||
| encoder: MessageEncoder) | ||
| { | ||
| self.fileURL = fileURL | ||
| self.writer = writer | ||
| self.reader = reader | ||
| self.header = header | ||
| self.decoder = decoder | ||
| self.encoder = encoder | ||
| } | ||
|
|
@@ -93,12 +112,16 @@ public final class CacheAdvance<T: Codable> { | |
|
|
||
| let cacheHasSpaceForNewMessageBeforeEndOfFile = writer.offsetInFile + bytesNeededToStoreMessage <= header.maximumBytes | ||
| if header.overwritesOldMessages { | ||
| if !cacheHasSpaceForNewMessageBeforeEndOfFile { | ||
| let truncateAtOffset: UInt64? | ||
| if cacheHasSpaceForNewMessageBeforeEndOfFile { | ||
| // We have room for this message. No need to truncate. | ||
| truncateAtOffset = nil | ||
| } else { | ||
| // This message can't be written without exceeding our maximum file length. | ||
| // We'll need to start writing the file from the beginning of the file. | ||
|
|
||
| // Trim the file to the current position to remove soon-to-be-abandoned data from the file. | ||
| try writer.truncate(at: writer.offsetInFile) | ||
|
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I believe the way we managed to get into the state the previous commit is testing is that we were truncating the file before we updated the header. We should always update our header before making a write, not after. I inspected other places where we write data and we were already careful in the other spots to update our header first. |
||
| // Trim the file to the current writer position to remove soon-to-be-abandoned data from the file. | ||
| truncateAtOffset = writer.offsetInFile | ||
|
|
||
| // Set the offset back to the beginning of the file. | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The next comment says:
I don't think that's accurate anymore since we haven't truncated yet. |
||
| try writer.seek(to: FileHeader.expectedEndOfHeaderInFile) | ||
|
|
@@ -121,6 +144,12 @@ public final class CacheAdvance<T: Codable> { | |
| // If the application crashes between writing the header and writing the message data, we'll have lost the messages between the previous offsetInFileOfOldestMessage and the new offsetInFileOfOldestMessage. | ||
| try header.updateOffsetInFileOfOldestMessage(to: offsetInFileOfOldestMessage) | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm trying to load back into my mind how this code works. If the cache file allows overwriting we start to write from the beginning of the file again. It seems from the code that when we start overwriting, we seek to the next oldest message. What happens if the new message we are trying to write is so long that we need to overwrite multiple older messages?
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We handle that in the following code just above the code you highlighted here: This code does the following: We then are willing to write up to where the reader is located after this method returns. |
||
|
|
||
| // Truncate the file if it needs truncation before we write the next message, and after we update our header. | ||
| // If the application crashes between truncating this message data and writing the next message, our file will still be consistent. | ||
| if let truncateAtOffset = truncateAtOffset { | ||
| try writer.truncate(at: truncateAtOffset) | ||
| } | ||
|
Comment on lines
+147
to
+151
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm having trouble reminding myself why we need to truncate the end of the file. What would really help me in reading this code is a reminder of how the file is formatted. Do we have any graphic or ASCII art anywhere that shows what the format of the file is? If not, I think that would be really helpful to add as it would allow me to validate this code more confidently.
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We do not have any ascii art, though I agree that would help a ton. High level, the format is: where Truncating the file prevents us from reading data that should have been deleted when we try to read all messages. Since the only data we have about ordering is the start of first message and end of last message in the header, we'll just try to read all data between these points. If we haven't deleted the data at the end of the file, we would read these supposedly-deleted messages in when trying to read all messages. |
||
|
|
||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Putting here so we have a thread. My understanding is that we intend for the reader to always be at the start of the oldest message in the file. Is that understanding correct?
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yes your understanding is correct. |
||
| // Let the reader know where the oldest message begins. | ||
| reader.offsetInFileOfOldestMessage = offsetInFileOfOldestMessage | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -42,7 +42,7 @@ final class CacheReader { | |
| } | ||
|
|
||
| /// Returns the next encodable message, seeking to the beginning of the next message. | ||
| func nextEncodedMessage() throws -> Data? { | ||
| func nextEncodedMessage(previousReadWasEmpty: Bool = false) throws -> Data? { | ||
| let startingOffset = offsetInFile | ||
|
|
||
| guard startingOffset != offsetInFileAtEndOfNewestMessage else { | ||
|
|
@@ -60,11 +60,17 @@ final class CacheReader { | |
| return message | ||
|
|
||
| case .emptyRead: | ||
| guard !previousReadWasEmpty else { | ||
| // If the previous read was also empty, then the file has been corrupted. | ||
| // Two empty reads in a row means that offsetInFileAtEndOfNewestMessage is incorrect. | ||
| // This inconsistency likely is likely due to a crash occurring during a message write. | ||
| throw CacheAdvanceError.fileCorrupted | ||
| } | ||
|
Comment on lines
+63
to
+68
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. To make sure I follow: we never expect to be in this state anymore, right? It could be useful to make clear that this is a safety net.
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yeah let me update this comment 👍 |
||
| // We know the next message is at the end of the file header. Let's seek to it. | ||
| try reader.seek(to: FileHeader.expectedEndOfHeaderInFile) | ||
|
|
||
| // We know there's a message to read now that we're at the start of the file. | ||
| return try nextEncodedMessage() | ||
| return try nextEncodedMessage(previousReadWasEmpty: true) | ||
|
|
||
| case .invalidFormat: | ||
| throw CacheAdvanceError.fileCorrupted | ||
|
|
||
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -96,6 +96,35 @@ final class CacheAdvanceTests: XCTestCase { | |||||||||||||||||||||||||||||||||||||||
| XCTAssertEqual(messages, []) | ||||||||||||||||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||
| func test_messages_throwsFileCorruptedWhenOffsetInFileAtEndOfNewsetMessageOutOfSync() throws { | ||||||||||||||||||||||||||||||||||||||||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am struggling to connect how the code that we changed in The premise of this test case is that we have an empty file. The offset of the start of the oldest message is at some point beyond the end of the header. And the offset of the end of the newest message is at the end of the header. We created a fallback fix in Before our change in I cannot figure out a sequence of messages where the previous code in CacheAdvance/Sources/CacheAdvance/CacheAdvance.swift Lines 104 to 111 in 2470113
As far as I can tell, the previous code in
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I agree with this analysis!
I disagree! If we were writing a message of length If
At which point, I think we've proved that our original analysis was correct. Am I missing something? To dig in a bit further, I believe this issue occurred because:
We only update CacheAdvance/Sources/CacheAdvance/CacheAdvance.swift Lines 238 to 246 in ed9ece4
Which means that the only way for Now, this analysis (thank you for making me write this out!) indicates that my fix to prevent this data corruption in the future was incorrect. The issue isn't when we truncate the file, but rather when we update
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This explanation makes sense. It seems like I was off by one. Thank you very much for writing this out.
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I thought more about this and I've circled back to my original conclusion. Yes, I agree that
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We talked on Zoom and agreed with where I landed in my last comment.
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nit: the indentation on this method is off.
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll update this in my next PR
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||||||||||||||||||||||||||||||||||||||||
| let randomHighValue: UInt64 = 10_1000 | ||||||||||||||||||||||||||||||||||||||||
| let header = try CacheHeaderHandle( | ||||||||||||||||||||||||||||||||||||||||
| forReadingFrom: testFileLocation, | ||||||||||||||||||||||||||||||||||||||||
| maximumBytes: randomHighValue, | ||||||||||||||||||||||||||||||||||||||||
| overwritesOldMessages: true) | ||||||||||||||||||||||||||||||||||||||||
| let cache = CacheAdvance<TestableMessage>( | ||||||||||||||||||||||||||||||||||||||||
| fileURL: testFileLocation, | ||||||||||||||||||||||||||||||||||||||||
| writer: try FileHandle(forWritingTo: testFileLocation), | ||||||||||||||||||||||||||||||||||||||||
| reader: try CacheReader(forReadingFrom: testFileLocation), | ||||||||||||||||||||||||||||||||||||||||
| header: try CacheHeaderHandle( | ||||||||||||||||||||||||||||||||||||||||
| forReadingFrom: testFileLocation, | ||||||||||||||||||||||||||||||||||||||||
| maximumBytes: header.maximumBytes, | ||||||||||||||||||||||||||||||||||||||||
| overwritesOldMessages: header.overwritesOldMessages), | ||||||||||||||||||||||||||||||||||||||||
| decoder: JSONDecoder(), | ||||||||||||||||||||||||||||||||||||||||
| encoder: JSONEncoder()) | ||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||
| // Make sure the header data is persisted before we read it as part of the `messages()` call below. | ||||||||||||||||||||||||||||||||||||||||
| try header.synchronizeHeaderData() | ||||||||||||||||||||||||||||||||||||||||
| // Our file is empty. Make the file corrupted by setting the offset at end of newest message to be further in the file. | ||||||||||||||||||||||||||||||||||||||||
| // This should never happen, but past versions of this repo could lead to a file having this kind of inconsistency if a crash occurred at the wrong time. | ||||||||||||||||||||||||||||||||||||||||
| try header.updateOffsetInFileAtEndOfNewestMessage( | ||||||||||||||||||||||||||||||||||||||||
| to: FileHeader.expectedEndOfHeaderInFile + UInt64(MessageSpan.storageLength) + 1) | ||||||||||||||||||||||||||||||||||||||||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If this file is empty would
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I'm happy to change this to just be |
||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||
| XCTAssertThrowsError(try cache.messages()) { | ||||||||||||||||||||||||||||||||||||||||
|
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Before the next commit, this test will infinite loop and crash. |
||||||||||||||||||||||||||||||||||||||||
| XCTAssertEqual($0 as? CacheAdvanceError, CacheAdvanceError.fileCorrupted) | ||||||||||||||||||||||||||||||||||||||||
|
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We should be able to detect this case and throw an appropriate error. We should also be able to prevent this case from happening in production. We solve both of these "shoulds" in the next few commits. |
||||||||||||||||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||
| func test_isWritable_returnsTrueWhenStaticHeaderMetadataMatches() throws { | ||||||||||||||||||||||||||||||||||||||||
| let originalCache = try createCache(overwritesOldMessages: false) | ||||||||||||||||||||||||||||||||||||||||
| XCTAssertTrue(try originalCache.isWritable()) | ||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need a way to inject these private properties so that we can mangle them prior to executing our test.