Desktop version

Home arrow Computer Science arrow Designing Data-Intensive Applications. The Big Ideas Behind Reliable, Scalable and Maintainable Systems


[1] Alfred V. Aho, John E. Hopcroft, and Jeffrey D. Ullman: Data Structures and Algorithms. Addison-Wesley, 1983. ISBN: 978-0-201-00023-8

[2] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein: Introduction to Algorithms, 3rd edition. MIT Press, 2009. ISBN: 978-0-262-53305-8

[3] Justin Sheehy and David Smith: “Bitcask: A Log-Structured Hash Table for Fast Key/Value Data,” Basho Technologies, April 2010.

[4] Yinan Li, Bingsheng He, Robin Jun Yang, et al.: “Tree Indexing on Solid State Drives,” Proceedings of the VLDB Endowment, volume 3, number 1, pages 1195-1206, September 2010.

[5] Goetz Graefe: “Modern B-Tree Techniques,” Foundations and Trends in Databases, volume 3, number 4, pages 203-402, August 2011. doi:10.1561/1900000028

[6] Jeffrey Dean and Sanjay Ghemawat: “LevelDB Implementation Notes,” lev-

[7] Dhruba Borthakur: “The History of RocksDB,”, November 24, 2013.

[8] Matteo Bertozzi: “Apache HBase I/O - HFile,”, June, 29 2012.

[9] Fay Chang, Jeffrey Dean, Sanjay Ghemawat, et al.: “Bigtable: A Distributed Storage System for Structured Data,” at 7th USENIX Symposium on Operating System Design and Implementation (OSDI), November 2006.

[10] Patrick O’Neil, Edward Cheng, Dieter Gawlick, and Elizabeth O’Neil: “The Log- Structured Merge-Tree (LSM-Tree),” Acta Informatica, volume 33, number 4, pages 351-385, June 1996. doi:10.1007/s002360050048

[11] Mendel Rosenblum and John K. Ousterhout: “The Design and Implementation of a Log-Structured File System,” ACM Transactions on Computer Systems, volume 10, number 1, pages 26-52, February 1992. doi:10.1145/146941.146943

[12] Adrien Grand: “What Is in a Lucene Index?,” at Lucene/Solr Revolution, November 14, 2013.

[13] Deepak Kandepet: “Hacking Lucene—The Index Format,”, October 1, 2011.

[14] Michael McCandless: “Visualizing Lucene’s Segment Merges,” blog.mikemccand-, February 11, 2011.

[15] Burton H. Bloom: “Space/Time Trade-offs in Hash Coding with Allowable Errors,” Communications of the ACM, volume 13, number 7, pages 422-426, July 1970. doi:10.1145/362686.362692

[16] “Operating Cassandra: Compaction,” Apache Cassandra Documentation v4.0, 2016.

[17] Rudolf Bayer and Edward M. McCreight: “Organization and Maintenance of Large Ordered Indices,” Boeing Scientific Research Laboratories, Mathematical and Information Sciences Laboratory, report no. 20, July 1970.

[18] Douglas Comer: “The Ubiquitous B-Tree,” ACM Computing Surveys, volume 11, number 2, pages 121-137, June 1979. doi:10.1145/356770.356776

[19] Emmanuel Goossaert: “Coding for SSDs,”, February 12, 2014.

[20] C. Mohan and Frank Levine: “ARIES/IM: An Efficient and High Concurrency Index Management Method Using Write-Ahead Logging,” at ACM International Conference on Management of Data (SIGMOD), June 1992. doi: 10.1145/130283.130338

[21] Howard Chu: “LDAP at Lightning Speed,” at Build Stuff’14, November 2014.

[22] Bradley C. Kuszmaul: “A Comparison of Fractal Trees to Log-Structured Merge (LSM) Trees,”, April 22, 2014.

[23] Manos Athanassoulis, Michael S. Kester, Lukas M. Maas, et al.: “Designing Access Methods: The RUM Conjecture,” at 19th International Conference on Extending Database Technology (EDBT), March 2016. doi:10.5441/002/edbt.2016.42

[24] Peter Zaitsev: “Innodb Double Write,”, August 4, 2006.

[25] Tomas Vondra: “On the Impact of Full-Page Writes,”, November 23, 2016.

[26] Mark Callaghan: “The Advantages of an LSM vs a B-Tree,”, January 19, 2016.

[27] Mark Callaghan: “Choosing Between Efficiency and Performance with RocksDB,” at Code Mesh, November 4, 2016.

[28] Michi Mutsuzaki: “MySQL vs. LevelDB,”, August 2011.

[29] Benjamin Coverston, Jonathan Ellis, et al.: “CASSANDRA-1608: Redesigned Compaction,, July 2011.

[30] Igor Canadi, Siying Dong, and Mark Callaghan: “RocksDB Tuning Guide,” git-, 2016.

[31] MySQL 5.7 Reference Manual. Oracle, 2014.

[32] Books Online for SQL Server 2012. Microsoft, 2012.

[33] Joe Webb: “Using Covering Indexes to Improve Query Performance,” simple-, 29 September 2008.

[34] Frank Ramsak, Volker Markl, Robert Fenk, et al.: “Integrating the UB-Tree into a Database System Kernel,” at 26th International Conference on Very Large Data Bases (VLDB), September 2000.

[35] The PostGIS Development Group: “PostGIS 2.1.2dev Manual,”, 2014.

[36] Robert Escriva, Bernard Wong, and Emin Gun Sirer: “HyperDex: A Distributed, Searchable Key-Value Store,” at ACM SIGCOMM Conference, August 2012. doi: 10.1145/2377677.2377681

[37] Michael McCandless: “Lucene’s FuzzyQuery Is 100 Times Faster in 4.0,”, March 24, 2011.

[38] Steffen Heinz, Justin Zobel, and Hugh E. Williams: “Burst Tries: A Fast, Efficient Data Structure for String Keys,” ACM Transactions on Information Systems, volume 20, number 2, pages 192-223, April 2002. doi:10.1145/506309.506312

[39] Klaus U. Schulz and Stoyan Mihov: “Fast String Correction with Levenshtein Automata,” International Journal on Document Analysis and Recognition, volume 5, number 1, pages 67-85, November 2002. doi:10.1007/s10032-002-0082-8

[40] Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schutze: Introduction to Information Retrieval. Cambridge University Press, 2008. ISBN: 978-0-521-86571-5, available online at

[41] Michael Stoneb raker, Samuel Madden, Daniel J. Abadi, et al.: “The End of an Architectural Era (It’s Time for a Complete Rewrite),” at 33rd International Conference on Very Large Data Bases (VLDB), September 2007.

[42] “VoltDB Technical Overview White Paper,” VoltDB, 2014.

[43] Stephen M. Rumble, Ankita Kejriwal, and John K. Ousterhout: “Log-Structured Memory for DRAM-Based Storage,” at 12th USENIX Conference on File and Storage Technologies (FAST), February 2014.

[44] Stavros Harizopoulos, Daniel J. Abadi, Samuel Madden, and Michael Stone- braker: “OLTP Through the Looking Glass, and What We Found There,” at ACM International Conference on Management of Data (SIGMOD), June 2008. doi: 10.1145/1376616.1376713

[45] Justin DeBrabant, Andrew Pavlo, Stephen Tu, et al.: “Anti-Caching: A New Approach to Database Management System Architecture,” Proceedings of the VLDB Endowment, volume 6, number 14, pages 1942-1953, September 2013.

[46] Joy Arulraj, Andrew Pavlo, and Subramanya R. Dulloor: “Let’s Talk About Storage & Recovery Methods for Non-Volatile Memory Database Systems,” at ACM International Conference on Management of Data (SIGMOD), June 2015. doi: 10.1145/2723372.2749441

[47] Edgar F. Codd, S. B. Codd, and C. T. Salley: “Providing OLAP to User-Analysts: An IT Mandate,” E. F. Codd Associates, 1993.

[48] Surajit Chaudhuri and Umeshwar Dayal: “An Overview of Data Warehousing and OLAP Technology,” ACM SIGMOD Record, volume 26, number 1, pages 65-74, March 1997. doi:10.1145/248603.248616

[49] Per-Ake Larson, Cipri Clinciu, Campbell Fraser, et al.: “Enhancements to SQL Server Column Stores,” at ACM International Conference on Management of Data (SIGMOD), June 2013.

[50] Franz Farber, Norman May, Wolfgang Lehner, et al.: “The SAP HANA Database - An Architecture Overview,” IEEE Data Engineering Bulletin, volume 35, number 1, pages 28-33, March 2012.

[51] Michael Stonebraker: “The Traditional RDBMS Wisdom Is (Almost Certainly) All Wrong,” presentation at EPFL, May 2013.

[52] Daniel J. Abadi: “Classifying the SQL-on-Hadoop Solutions,”, October 2, 2013.

[53] Marcel Kornacker, Alexander Behm, Victor Bittorf, et al.: “Impala: A Modern, Open-Source SQL Engine for Hadoop,” at 7th Biennial Conference on Innovative Data Systems Research (CIDR), January 2015.

[54] Sergey Melnik, Andrey Gubarev, Jing Jing Long, et al.: “Dremel: Interactive Analysis of Web-Scale Datasets,” at 36th International Conference on Very Large Data Bases (VLDB), pages 330-339, September 2010.

[55] Ralph Kimball and Margy Ross: The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd edition. John Wiley & Sons, July 2013. ISBN: 978-1-118-53080-1

[56] Derrick Harris: “Why Apple, eBay, and Walmart Have Some of the Biggest Data Warehouses You’ve Ever Seen,”, March 27, 2013.

[57] Julien Le Dem: “Dremel Made Simple with Parquet,”, September 11, 2013.

[58] Daniel J. Abadi, Peter Boncz, Stavros Harizopoulos, et al.: “The Design and Implementation of Modern Column-Oriented Database Systems,” Foundations and Trends in Databases, volume 5, number 3, pages 197-280, December 2013. doi: 10.1561/1900000024

[59] Peter Boncz, Marcin Zukowski, and Niels Nes: “MonetDB/X100: HyperPipelining Query Execution,” at 2nd Biennial Conference on Innovative Data Systems Research (CIDR), January 2005.

[60] Jingren Zhou and Kenneth A. Ross: “Implementing Database Operations Using SIMD Instructions,” at ACM International Conference on Management of Data (SIG- MOD), pages 145-156, June 2002. doi:10.1145/564691.564709

[61] Michael Stonebraker, Daniel J. Abadi, Adam Batkin, et al.: “C-Store: A Column- oriented DBMS,” at 31st International Conference on Very Large Data Bases (VLDB), pages 553-564, September 2005.

[62] Andrew Lamb, Matt Fuller, Ramakrishna Varadarajan, et al.: “The Vertica Analytic Database: C-Store 7 Years Later,” Proceedings of the VLDB Endowment, volume 5, number 12, pages 1790-1801, August 2012.

[63] Julien Le Dem and Nong Li: “Efficient Data Storage for Analytics with Apache Parquet 2.0,” at Hadoop Summit, San Jose, June 2014.

[64] Jim Gray, Surajit Chaudhuri, Adam Bosworth, et al.: “Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals,” Data Mining and Knowledge Discovery, volume 1, number 1, pages 29-53, March 2007. doi:10.1023/A:1009726021843

< Prev   CONTENTS   Source   Next >