I was looking for a detailed introduction to distributed data systems in terms of their function, design, capabilities, limitations and workarounds, and performance, and in this book I found exactly what I was looking for. Note that the discussion of measurable metrics is qualitative: you won't find graphs or other quantitative measures of performance, but rather a general commentary on what type or level of performance can be expected. This helps keep the text very accessible, yet thorough and detailed. The organization of topics, and the frequent calling out of parallels among different systems, problems, and solutions, also holds this sizeable volume together as one coherent piece.
A very good book for professionals, students, and developers who want to learn about distributed systems and data systems but are new to the area. Readers need basic knowledge of operating systems (especially file systems) and computer networks.
I bought this book to learn more about data-intensive applications. I was a bit skeptical at first, but I must say Martin Kleppmann did an amazing job crafting this book. I highly recommend it to beginners and even to people who are already knowledgeable in the field.
A very nice, condensed account of the storage principles of different databases and the algorithms used behind the scenes. This is information you would otherwise find only by reading multiple papers and blogs, which is far less accessible. The book does a pretty good job of collecting all that information and putting it in one place.
This is a fantastic book. It gets into the details of storing, searching, and managing data at scale. It lays out the pros and cons of different approaches, and gives specifics about many commonly used tools. I am having trouble putting the book down.
As a professional data infrastructure developer, I have found self-education to be one of the hardest challenges: there is a dizzying amount of technology-specific content on the internet, but it is extremely difficult to find something that gives you an in-depth, technology-agnostic, academic understanding of data engineering and distributed data systems. I highly recommend this book IF you have a background in software development. It is a real gem.