In computer science, the log-structured merge-tree (or LSM tree) is a data structure with performance characteristics that make it attractive for providing indexed access to files with high insert volume, such as transactional log data. LSM trees, like other search trees, maintain key-value pairs. They keep data in two or more separate structures, each optimized for its underlying storage medium; data is synchronized between these structures efficiently, in batches.
Type: Hybrid (two tree-like components)
Invented by: Patrick O'Neil, Edward Cheng, Dieter Gawlick, Elizabeth O'Neil
One simple version of the LSM tree is a two-level LSM tree. As described by Patrick O'Neil, a two-level LSM tree comprises two tree-like structures, called C0 and C1. C0 is smaller and entirely resident in memory, whereas C1 is resident on disk. New records are inserted into the memory-resident C0 component. If the insertion causes the C0 component to exceed a certain size threshold, a contiguous segment of entries is removed from C0 and merged into C1 on disk. The performance characteristics of LSM trees stem from the fact that each component is tuned to the characteristics of its underlying storage medium, and that data is efficiently migrated across media in rolling batches, using an algorithm reminiscent of merge sort.
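The two-level scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not O'Neil's actual algorithm: the names `TwoLevelLSM`, `merge_sorted`, `c0_limit` and `batch` are hypothetical, C0 is a plain dictionary rather than a tree, and C1 is an in-memory sorted list standing in for the on-disk component.

```python
import bisect


def merge_sorted(newer, older):
    """Merge two key-sorted lists of (key, value); on equal keys the newer entry wins."""
    out, i, j = [], 0, 0
    while i < len(newer) and j < len(older):
        if newer[i][0] < older[j][0]:
            out.append(newer[i])
            i += 1
        elif newer[i][0] > older[j][0]:
            out.append(older[j])
            j += 1
        else:  # same key in both: keep the newer value, drop the older one
            out.append(newer[i])
            i += 1
            j += 1
    out.extend(newer[i:])
    out.extend(older[j:])
    return out


class TwoLevelLSM:
    """Toy two-level LSM tree (hypothetical names, assumptions as stated above)."""

    def __init__(self, c0_limit=4, batch=2):
        self.c0 = {}              # memory-resident component
        self.c1 = []              # stand-in for the disk-resident component, sorted by key
        self.c0_limit = c0_limit  # size threshold for C0
        self.batch = batch        # number of entries migrated per merge

    def put(self, key, value):
        self.c0[key] = value
        if len(self.c0) > self.c0_limit:
            # Remove a contiguous (in key order) segment from C0 and merge it into C1.
            segment = [(k, self.c0.pop(k)) for k in sorted(self.c0)[:self.batch]]
            self.c1 = merge_sorted(segment, self.c1)

    def get(self, key, default=None):
        if key in self.c0:        # C0 always holds the most recent value
            return self.c0[key]
        # (key,) sorts just before (key, value), so this finds the entry if present.
        i = bisect.bisect_left(self.c1, (key,))
        if i < len(self.c1) and self.c1[i][0] == key:
            return self.c1[i][1]
        return default
```

A real implementation would write C1 to disk in large sequential blocks; the linear merge pass is what gives the structure its merge-sort flavor.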
Most LSM trees used in practice employ multiple levels. Level 0 is kept in main memory, and might be represented using a tree. The on-disk data is organized into sorted runs of data. Each run contains data sorted by the index key. A run can be represented on disk as a single file, or alternatively as a collection of files with non-overlapping key ranges. To look up the value associated with a particular key, a query must search the Level 0 tree and each run.
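As a rough sketch of such a lookup, the following Python code searches Level 0 and then each run in turn. It makes several assumptions not stated in the text: Level 0 is a dictionary, each run is a pair of parallel sorted lists, the runs are ordered newest first, and the application only wants the most recent value. The function names `search_run` and `lookup` are hypothetical.

```python
import bisect


def search_run(run, key):
    """Binary-search one sorted run; return (found, value)."""
    keys, values = run
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return True, values[i]
    return False, None


def lookup(level0, runs, key):
    """Return the newest value stored for key, or None if it is absent."""
    if key in level0:              # Level 0 holds the most recent writes
        return level0[key]
    for run in runs:               # assumption: runs are ordered newest first
        found, value = search_run(run, key)
        if found:
            return value           # stop at the first (newest) hit
    return None


level0 = {"k3": "c"}
runs = [
    (["k1", "k2"], ["new1", "b"]),   # newer run
    (["k1", "k4"], ["old1", "d"]),   # older run
]
newest_k1 = lookup(level0, runs, "k1")
```

In the worst case (a key that is absent, or present only in the oldest run) every run is searched, which is why the number of runs matters for query cost.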
A particular key may appear in several runs, and what that means for a query depends on the application. Some applications simply want the newest key-value pair with a given key. Some applications must combine the values in some way to get the proper aggregate value to return. For example, in Apache Cassandra, each value represents a row in a database, and different versions of the row may have different sets of columns.
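The second case, where values from several runs must be combined, can be illustrated with a simplified row merge. This is only a sketch loosely modeled on the description above and does not reproduce Cassandra's actual reconciliation logic. It assumes each component is a dictionary mapping a key to a row (itself a dictionary of columns), that components are ordered newest first, and that the newest version of each column wins. The name `read_row` is hypothetical.

```python
def read_row(level0, runs, key):
    """Combine all stored versions of a row; for each column the newest version wins."""
    components = [level0, *runs]           # assumption: ordered newest first
    versions = [c[key] for c in components if key in c]
    if not versions:
        return None
    merged = {}
    for row in reversed(versions):         # apply oldest first so newer columns overwrite
        merged.update(row)
    return merged


level0 = {"user1": {"email": "ada@example.org"}}
runs = [
    {"user1": {"city": "Paris"}},                   # newer run
    {"user1": {"name": "Ada", "city": "London"}},   # older run
]
row = read_row(level0, runs, "user1")
```

Here the three partial versions are folded into one row: `name` comes from the oldest run, `city` from the newer run, and `email` from Level 0.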
To keep the cost of queries down, the system must avoid accumulating too many runs, since every additional run is another structure that a lookup may have to search.
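One common way to reduce the number of runs is to merge several of them into a single sorted run. The sketch below shows such a merge under stated assumptions: each run is a list of (key, value) pairs sorted by key with unique keys, the runs are passed newest first, and only the newest value per key is kept. The function name `compact` is hypothetical, and real systems apply more elaborate policies for choosing which runs to merge.

```python
import heapq


def compact(runs):
    """Merge sorted runs (newest first) into one sorted run, keeping the newest value per key."""
    # Tag every entry with its run's age so that equal keys come out newest first.
    tagged = (
        [(key, age, value) for key, value in run]
        for age, run in enumerate(runs)
    )
    merged = []
    for key, _age, value in heapq.merge(*tagged, key=lambda e: (e[0], e[1])):
        if not merged or merged[-1][0] != key:
            merged.append((key, value))    # first occurrence of a key is the newest
        # later occurrences of the same key are older versions and are dropped
    return merged


runs = [
    [("a", 1), ("c", 3)],   # newest run
    [("a", 0), ("b", 2)],
    [("b", 9), ("d", 4)],   # oldest run
]
single_run = compact(runs)
```

After the merge, a lookup needs to search one run instead of three, and obsolete versions of `a` and `b` no longer take up space.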
Extensions to the 'leveled' method that incorporate B+ tree structures have been suggested, for example bLSM and Diff-Index.
- O'Neil 1996, p. 4
- "Leveled Compaction in Apache Cassandra : DataStax". web.archive.org. February 13, 2014.
- "SQLite4 with LSM Wiki". SQLite.
- "An application server together with a database manager". Retrieved April 3, 2018. "Tarantool’s disk-based storage engine is a fusion of ideas from modern filesystems, log-structured merge trees and classical B-trees."
- "GitHub - wiredtiger/wiredtiger: WiredTiger's source tree". December 4, 2019 – via GitHub.
- Dix, Paul (October 7, 2015). "[New] InfluxDB Storage Engine | Time Structured Merge Tree".
- Valialkin, Aliaksandr (May 23, 2019). "How VictoriaMetrics makes instant snapshots for multi-terabyte time series data". Medium.
- Huang, Gui; Cheng, Xuntao; Wang, Jianying; Wang, Yujie; He, Dengcheng; Zhang, Tieying; Li, Feifei; Wang, Sheng; Cao, Wei (2019). "X-Engine: An Optimized Storage Engine for Large-scale E-commerce Transaction Processing". Proceedings of the 2019 International Conference on Management of Data. SIGMOD '19. New York, NY, USA: ACM: 651–665. doi:10.1145/3299869.3314041. ISBN 9781450356435.
- O'Neil, Patrick E.; Cheng, Edward; Gawlick, Dieter; O'Neil, Elizabeth (June 1996). "The log-structured merge-tree (LSM-tree)". Acta Informatica. 33 (4): 351–385. CiteSeerX 10.1.1.44.2782. doi:10.1007/s002360050048.
- Li, Yinan; He, Bingsheng; Luo, Qiong; Yi, Ke (2009). "Tree Indexing on Flash Disks". 2009 IEEE 25th International Conference on Data Engineering. pp. 1303–6. CiteSeerX 10.1.1.144.6961. doi:10.1109/ICDE.2009.226. ISBN 978-1-4244-3422-0.
- Luo, Chen; Carey, Michael J. (July 2019). "LSM-based storage techniques: a survey". The VLDB Journal. doi:10.1007/s00778-019-00555-y.