Apache Flume is a tool/service/data-ingestion mechanism for collecting, aggregating, and transporting large amounts of streaming data, such as log files and events, from various sources to a centralized data store. Flume is a highly reliable, distributed, and configurable tool. It is principally designed to copy streaming data (log data) from various web servers into HDFS.

## Applications of Flume

Assume an e-commerce web application wants to analyze customer behavior from a particular region. To do so, it would need to move the available log data into Hadoop for analysis. Here, Apache Flume comes to our rescue: Flume is used to move the log data generated by application servers into HDFS at high speed.

## Advantages of Flume

- Using Apache Flume we can store the data into any of the centralized stores (HBase, HDFS).
- When the rate of incoming data exceeds the rate at which data can be written to the destination, Flume acts as a mediator between the data producers and the centralized stores, providing a steady flow of data between them.
- Flume provides the feature of contextual routing.
- Transactions in Flume are channel-based: two transactions (one for the sender and one for the receiver) are maintained for each message. This guarantees reliable message delivery.
- Flume is reliable, fault-tolerant, scalable, manageable, and customizable.

## Features of Flume

Some of the notable features of Flume are as follows:

- Flume ingests log data from multiple web servers into a centralized store (HDFS, HBase) efficiently.
- Using Flume, we can get data from multiple servers into Hadoop immediately.
- Along with log files, Flume is also used to import the huge volumes of event data produced by social networking sites like Facebook and Twitter, and e-commerce websites like Amazon and Flipkart.
- Flume supports a large set of source and destination types.
- Flume supports multi-hop flows, fan-in and fan-out flows, contextual routing, etc.
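To make the source → channel → sink architecture concrete, here is a minimal sketch of a Flume agent configuration that tails a web-server log into HDFS. The agent name (`agent1`), log path, and HDFS URL are illustrative assumptions, not values from this article:

```
# agent1.conf -- minimal single-agent pipeline (names and paths are examples)
agent1.sources = src1
agent1.channels = ch1
agent1.sinks = sink1

# Source: tail an assumed web-server access log using the exec source
agent1.sources.src1.type = exec
agent1.sources.src1.command = tail -F /var/log/httpd/access_log
agent1.sources.src1.channels = ch1

# Channel: in-memory buffer between source and sink
agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 10000
agent1.channels.ch1.transactionCapacity = 1000

# Sink: write events into HDFS (hypothetical namenode address and path)
agent1.sinks.sink1.type = hdfs
agent1.sinks.sink1.hdfs.path = hdfs://namenode:8020/flume/weblogs
agent1.sinks.sink1.hdfs.fileType = DataStream
agent1.sinks.sink1.channel = ch1
```

Such a configuration would typically be started with `flume-ng agent --conf conf --conf-file agent1.conf --name agent1`, where the agent name must match the prefix used in the file.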
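The contextual routing mentioned above is commonly implemented with Flume's multiplexing channel selector, which routes each event to a channel based on an event header. A rough sketch, assuming a hypothetical `region` header set upstream (e.g. by an interceptor) and example channel names:

```
# Fan-out with contextual routing: pick a channel by the "region" header
agent1.sources.src1.channels = chUS chOther
agent1.sources.src1.selector.type = multiplexing
agent1.sources.src1.selector.header = region

# Events with region=US go to chUS; everything else falls through to chOther
agent1.sources.src1.selector.mapping.US = chUS
agent1.sources.src1.selector.default = chOther
```

Each channel can then feed its own sink (for example, separate HDFS directories per region), which is one way the e-commerce scenario above could isolate log data for a particular region.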