Refactor Excel writers and readers for performance and limit handling #13
Merged
…date benchmarks to include SpreadCheetah
…writers; enhance tests for round-trip validation
…ord buffering; implement FlushRecords method for improved performance
…ng; update FlushThreshold to SpillThreshold for better performance
… refactor XlsbSheetWriter to support writing rows with XlsbCell; enhance benchmarks and tests for round-trip validation
…st ExcelReader comparisons with MiniExcel, Sylvan, and SpreadCheetah
… method for RowWriter and implement row flushing based on a configurable threshold in SheetWriter
…ment GetSpan and Advance methods to reduce memory allocations during value formatting
…ate value writing methods into a single WriteValue method for improved efficiency and reduced memory allocations
- Introduced `ExcelLimitExceededException` to handle limit violations.
- Created `ExcelReaderOptions` to configure maximum limits for decompressed bytes and shared strings.
- Implemented `LimitChecks` to validate limits during reading operations.
- Added `DecompressedByteCounter` to track total decompressed bytes.
- Updated `LimitedReadStream` to enforce limits on read operations.
- Refactored `XlsReader`, `XlsxReader`, and `XlsbReader` to utilize new limit checks and options.
- Enhanced shared string parsing to respect configured limits.
- Added unit tests to verify limit enforcement for decompressed bytes and shared strings.
…nts; add tests for XML dialect compliance
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #13 +/- ##
==========================================
- Coverage 90.79% 85.99% -4.81%
==========================================
Files 54 60 +6
Lines 3336 3862 +526
Branches 603 692 +89
==========================================
+ Hits 3029 3321 +292
- Misses 172 382 +210
- Partials 135 159 +24
Benchmark Results
Measured on
ExcelReader.Benchmarks.ParseBenchmark
ExcelReader.Benchmarks.ReadBenchmark
ExcelReader.Benchmarks.WriteBenchmark
ExcelReader.Benchmarks.XlsReadBenchmark
ExcelReader.Benchmarks.XlsWriteBenchmark
This pull request introduces configurable input size limits for Excel file reading and writing, enhancing security and robustness by bounding resource usage and preventing excessive memory allocations. It adds an `ExcelReaderOptions` class for specifying limits, updates all reader factory methods to accept these options, and throws a new `ExcelLimitExceededException` when limits are exceeded. The documentation is updated to explain the new defaults and how to tune or disable them.

Configurable Input Limits and Exception Handling
- Added the `ExcelReaderOptions` class, allowing users to configure maximum total decompressed bytes, maximum cell/row value buffer size, and maximum shared string size. Defaults are set to safe values but can be overridden or disabled. (src/ExcelReader.Core/Reader/ExcelReaderOptions.cs, R1-R11)
- Added the `ExcelLimitExceededException` exception, which is thrown when any configured limit is exceeded and reports both the limit and the actual usage. (src/ExcelReader.Core/Reader/ExcelLimitExceededException.cs, R1-R31)
- Added the `LimitChecks` helper class to centralize the logic for checking and enforcing the configured limits. (src/ExcelReader.Core/Reader/LimitChecks.cs, R1-R33)

API Changes
- Updated the `Excel` factory methods (sync and async) to accept an optional `ExcelReaderOptions` parameter, which is passed to the underlying readers. This change is source-compatible and allows consumers to tune or disable limits as needed. (src/ExcelReader.Core/Reader/Excel.cs [1] [2] [3] [4])

Documentation and Benchmark Updates
- Updated `README.md` to document the new size limits, show how to tune or disable them, and refresh benchmark results to reflect recent performance and allocation improvements. (README.md [1] [2] [3])

These changes make the library safer by default and provide flexibility for advanced users to adjust limits based on their requirements.
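For readers who want a concrete picture of what bounding decompressed bytes can look like, here is a minimal sketch of the general technique: a read-only stream wrapper that charges every byte it returns against a shared budget and throws once the budget is exceeded. It is not the code from this pull request. `ByteBudget`, `BudgetedReadStream`, `SketchLimitExceededException`, and the convention that a non-positive limit disables the check are all hypothetical names and assumptions made for illustration; the actual `DecompressedByteCounter`, `LimitedReadStream`, and `ExcelReaderOptions` may be shaped quite differently.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for a limit exception; not the library's actual type.
public sealed class SketchLimitExceededException : Exception
{
    public long Limit { get; }
    public long Actual { get; }

    public SketchLimitExceededException(string limitName, long limit, long actual)
        : base($"{limitName} limit of {limit} exceeded (observed {actual}).")
    {
        Limit = limit;
        Actual = actual;
    }
}

// Shared running total so several wrapped streams can draw on one budget.
public sealed class ByteBudget
{
    private readonly long _max; // assumption of this sketch: values <= 0 disable the check

    public long Total { get; private set; }

    public ByteBudget(long max) => _max = max;

    public void Add(int count)
    {
        Total += count;
        if (_max > 0 && Total > _max)
            throw new SketchLimitExceededException("Decompressed bytes", _max, Total);
    }
}

// Read-only wrapper that charges every byte it returns against the budget.
public sealed class BudgetedReadStream : Stream
{
    private readonly Stream _inner;
    private readonly ByteBudget _budget;

    public BudgetedReadStream(Stream inner, ByteBudget budget)
    {
        _inner = inner;
        _budget = budget;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int n = _inner.Read(buffer, offset, count);
        _budget.Add(n); // throws as soon as the running total passes the limit
        return n;
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();

    protected override void Dispose(bool disposing)
    {
        if (disposing) _inner.Dispose();
        base.Dispose(disposing);
    }
}

public static class Demo
{
    // Returns true when draining `payloadSize` bytes under `limit` trips the budget.
    public static bool Trips(int payloadSize, long limit)
    {
        using var source = new BudgetedReadStream(
            new MemoryStream(new byte[payloadSize]), new ByteBudget(limit));
        try
        {
            source.CopyTo(Stream.Null);
            return false;
        }
        catch (SketchLimitExceededException)
        {
            return true;
        }
    }
}
```

In a real reader the wrapper would presumably sit around a decompression stream, such as the one returned by `ZipArchiveEntry.Open()`, rather than a `MemoryStream`. Sharing one budget object across all entries is what lets the limit apply to the total decompressed size instead of to each entry separately.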