Samza uses YARN for resource negotiation.
That has generated more interest in streaming solutions such as Spark, Storm, Samza, and others.
The Samza team is working to address these.
Then there's a stream processing layer, which might include Storm, Spark Streaming, or Samza.
Samza has made huge progress with release 1.0.
Go beyond MapReduce and process data in real time with Samza and iteratively with Spark.
Apache Samza is a distributed stream processing framework.
It's a set of distributed streaming machine learning algorithms that run on top of S4, Storm, and Samza.
Samza provides this ability to easily connect to different systems.
That's not all for Samza 1.0 though, as it comes with some more important new features: SQL and DevOps improvements.
Another area of improvement for Samza 1.0 is cluster manager independence.
Shetty said there is a group of engineers dedicated to stream processing at LinkedIn, and Samza is an integral part of that work.
Samza was built to provide a lightweight framework for continuous data processing.
He was among the original authors of a number of open source projects in the big data space, including Voldemort, Azkaban, Kafka, and Samza.
Samza 1.0 addresses this by offering a standalone mode of running applications.
In addition, multiple Samza jobs needed to be wired together using Kafka topics, which made building applications time-consuming and error-prone.
Samza greatly simplifies many parts of stream processing and offers low-latency performance.
Technological extensibility: use only vendor-provided tools, or mix in any newly introduced open source tools (Spark, Samza, Tachyon, etc.).
But the Samza team went one step further, making their API compatible with Apache Beam.
A number of important open source projects have been impacted by Facebook's license, including Samza, Flink, Marmotta, Kafka, and Bahir.
Samza 1.0 also brings support for joining streams with tables, and improves testability of Samza applications.
Frameworks that are designed for real-time stream analytics include Apache Storm and Apache Samza (usually used with Kafka and Hadoop YARN).
Prior to Samza 1.0, Samza required YARN for resource management and distributed execution of applications.
While Kafka can be used by many stream processing systems, Samza is designed specifically to take advantage of Kafka's unique architecture and guarantees.
The Samza code runs as a YARN job, and we can implement the StreamTask interface, which defines a process() call.
The Samza team realized at some point that while stability and performance have been Samza's core strengths, its programming API was fairly low-level.
Samza integrates tightly with YARN and Kafka in order to provide flexibility, easy multi-team usage, and straightforward replication and state management.
Apache Beam is an open-source project that provides a unified API, allowing pipelines to be ported across execution engines including Samza, Spark, or Flink.
Samza itself is a good fit for organizations with multiple teams using (but not necessarily tightly coordinating around) data streams at various stages of processing.