

Summary

Big Data teaches you to build big data systems using an architecture that takes advantage of clustered hardware along with new tools designed specifically to capture and analyze web-scale data. It describes a scalable, easy-to-understand approach to big data systems that can be built and run by a small team. Following a realistic example, this book guides readers through the theory of big data systems, how to implement them in practice, and how to deploy and operate them once they're built. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the Book

Web-scale applications like social networks, real-time analytics, or e-commerce sites deal with a lot of data, whose volume and velocity exceed the limits of traditional database systems. These applications require architectures built around clusters of machines to store and process data of any size, or speed. Fortunately, scale and simplicity are not mutually exclusive.

Big Data teaches you to build big data systems using an architecture designed specifically to capture and analyze web-scale data. This book presents the Lambda Architecture, a scalable, easy-to-understand approach that can be built and run by a small team. You'll explore the theory of big data systems and how to implement them in practice. In addition to discovering a general framework for processing big data, you'll learn specific technologies like Hadoop, Storm, and NoSQL databases.

This book requires no previous exposure to large-scale data analysis or NoSQL tools. Familiarity with traditional databases is helpful.

What's Inside

- Introduction to big data systems
- Real-time processing of web-scale data
- Tools like Hadoop, Cassandra, and Storm
- Extensions to traditional database skills

About the Authors

Nathan Marz is the creator of Apache Storm and the originator of the Lambda Architecture for big data systems. James Warren is an analytics architect with a background in machine learning and scientific computing.

Table of Contents

- A new paradigm for Big Data
- PART 1 BATCH LAYER
- Data model for Big Data
- Data model for Big Data: Illustration
- Data storage on the batch layer
- Data storage on the batch layer: Illustration
- Batch layer
- Batch layer: Illustration
- An example batch layer: Architecture and algorithms
- An example batch layer: Implementation
- PART 2 SERVING LAYER
- Serving layer
- Serving layer: Illustration
- PART 3 SPEED LAYER
- Realtime views
- Realtime views: Illustration
- Queuing and stream processing
- Queuing and stream processing: Illustration
- Micro-batch stream processing
- Micro-batch stream processing: Illustration
- Lambda Architecture in depth

| Best Sellers Rank | #1,727,176 in Books; #487 in Data Mining (Books); #1,916 in Software Development (Books); #2,234 in Internet & Telecommunications |
| Customer Reviews | 4.2 out of 5 stars 103 Reviews |
H**U
I bought this book because someone online recommended it. I went non-stop for a few chapters ...
I don't have much background in large-scale systems and big data. I bought this book because someone online recommended it. I went non-stop for a few chapters and will definitely continue reading until I am done. Even with my limited background, I had no problems grasping the concepts in the book at all. It's a very good book, with theory clearly conveyed and examples beautifully demonstrated. I really hope all books in the distributed systems area can do such a good job of making the learning experience a lot smoother.
Z**I
Lambda Architecture FTW
Great explanation of both the theory and practice of the lambda architecture. While the practice chapters are nice, it's the theory chapters that really shine. The book explains down to the byte level why components are implemented the way they are. For example, there's an immense amount of detail as to why using a db that doesn't support random writes allows an application to query the batch layer's results without locking. The only downside to the book is that the architecture and ecosystem are so new that there aren't really a lot of pragmatic solutions. For example, the theory describes a query layer that can merge the results of batch and real-time processing for client applications. However, in real life there are no pragmatic solutions for doing this, so you'd have to write your own. It'll be interesting to see how the lambda architecture matures and to see future editions of this book. Hopefully, future editions will be as well written and have a better ecosystem for the practice chapters.
A**R
The perfect book to understand big data concepts
In all honesty, the book has simplified big data architecture and its general premise in an eye-opening way. Starting from the batch layer and spending a good amount of time on its different aspects taught me, as a developer, a valuable lesson about the complexity and the necessity of evaluating my data entries and their impact on the future production of worthwhile analytics and results. My girlfriend and I enjoyed every chapter in this book. I guarantee that you won't regret buying it. I am looking forward to another book from you guys on the topic, because it's the first time I couldn't wait to pick up a book and get to the end of it.
A**R
typo
If you are looking for a survey of different approaches to handling big data, you want to read "ELEMENTS OF SCALE: COMPOSING AND SCALING DATA PLATFORMS". ([...]) This book is dedicated to the Lambda Architecture (one of the approaches surveyed in the above article). The book is very well organized. The introduction in chapter 1 serves as the road map for the whole book. Motivating the discussion with a simple web application based on an RDBMS, the author shows how the usual approach to scaling it becomes undesirable. After enumerating a list of desired properties, he proposes the Lambda Architecture, an approach that contrasts with a fully incremental architecture (with an RDBMS). The Lambda Architecture is partitioned into three layers:
1. a batch layer that computes different views on big data;
2. a serving layer that answers user queries using views from the batch layer and the speed layer;
3. a speed layer that compensates with an approximate answer for the period of time during which the batch layer is still working on the complete answers.
In the remaining chapters, the author dives deep into the rationale and requirements of all the different pieces of the Lambda Architecture. To understand the context of the Lambda Architecture, also refer to Wikipedia for criticism.
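The three layers described in this review can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the book: the pageview-counting use case, the `(timestamp, url)` event shape, and the names `batch_view`, `SpeedLayer`, and `query` are all hypothetical.

```python
from collections import Counter

# Hypothetical master dataset: an append-only list of (timestamp, url) events.
MASTER = [(1, "/a"), (2, "/b"), (3, "/a"), (4, "/a")]


def batch_view(master, cutoff):
    """Batch layer: recompute pageview counts from scratch for ts < cutoff."""
    return Counter(url for ts, url in master if ts < cutoff)


class SpeedLayer:
    """Speed layer: incrementally counts only events the batch view has not absorbed."""

    def __init__(self):
        self.counts = Counter()

    def update(self, event):
        _, url = event
        self.counts[url] += 1

    def reset(self):
        # The realtime state is disposable: the next batch run covers these events.
        self.counts.clear()


def query(url, batch, speed):
    """Query side: merge the batch view with the realtime view."""
    return batch[url] + speed.counts[url]


cutoff = 3
batch = batch_view(MASTER, cutoff)
speed = SpeedLayer()
for event in MASTER:
    if event[0] >= cutoff:
        speed.update(event)

total_a = query("/a", batch, speed)
```

The design choice illustrated here is that the batch view is always recomputable from the raw events, so the speed layer only has to cover the recent window and can be discarded when a newer batch view arrives.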
Y**N
Everything looks good until page 20 ...
I feel really sorry for those who gave this book 5 stars. I purchased the book and started reading it eagerly as soon as I received it. It held my attention until I got to page 20, with a statement saying "...... If anything ever goes wrong, you can discard the state for the entire speed layer, and everything will be back to normal within a few hours." Within a few hours? No high-traffic production site can afford a few hours of downtime. At that point, I decided to return the book, which I did. I did scan through the rest of the book, though. First, the so-called lambda architecture might sound like a new term, but many high-concurrency websites already work that way. For a high-concurrency website, the first layer would be memcached-based, which gives O(1) low latency on all queries. The second layer would be a clustered app-server layer. The third layer could be a high-concurrency, extremely low-latency layer like a NoSQL cluster. The far backend could be Hadoop- or Spark-based for batch jobs. This is the known production architecture for high-traffic websites that need to support millions of concurrent users. Second, the bulk of the book is actually about Hadoop in the so-called batch layer. Hadoop once generated some excitement, but it has lost steam to the new kid on the block, Spark, which can do whatever Hadoop can do, but 10x - 20x faster and at a fraction of the cost.
D**G
Insight into the lambda architecture
This book serves as a guide to building the lambda architecture from scratch, processing and serving big data. It is not a book to teach you about big data technologies, though. If you are looking to learn about and implement a system with batch and real-time layers, this is the best book there is.
R**D
Binding of the book is BAD!
The book delivered is already falling apart. The binding on this item is awful! Will not buy from this seller again.
A**.
Great content. Bad structure/assembling quality
I just received this book. The content is great, but as I started to turn the pages, they started to fall out. I buy a lot of books, and it has been a very long time since I saw such poor quality in a book's physical construction. 5 stars for the content. 0 stars for the book's physical construction.
B**M
Delivers what it promises
This book makes you understand in a systematic way all the big data technologies and where to use them. The proposed lambda architecture is a nice framework for understanding and building complex big data systems.
M**M
Enriched me with several concepts and knowledge applicable to a wide variety of problems
I think this book is very helpful for understanding how a Lambda Architecture can help in tackling problems related to storing, processing, and querying large amounts of data. What I enjoyed most about this book is that it's organized into theoretical and illustration (practical) chapters, so the theoretical concepts are outlined and explained first, and then the next chapter guides you through a practical application with an example technology that supports the requirements and the use case. This makes the book really worthwhile in my opinion, also because the overall organization is well structured and guides you through the different parts of the Lambda Architecture really well. I also think the last chapter illustrates really well the further trade-offs that may be applied to a Lambda Architecture, helping the reader realize that this is a general blueprint whose layers and details can be tailored to the actual problem at hand. I think reading this book has enriched me with several concepts and knowledge applicable to a wide variety of problems related to storing, processing, and querying large amounts of data.
E**D
A classic of Big Data architecture
If you are designing architectures for big data, or even if you think your application might someday reach big data scale, this is your book. The concepts are essentially CQRS and Event Sourcing, but at large scale and aimed at giving answers in real time.
F**L
Excellent
An excellent, precise, and detailed work on a big data case. It is a didactic work, but one that requires a certain amount of concentration because of the technological complexity it describes. It is aimed at an informed audience of developers (numerous illustrations with Java code samples).
A**G
Five Stars
It is a good book.