mapreducepatterns | MapReduce Design Patterns | Architecture library
- Submit parallel jobs
- Submit a parallel job
- Main entry point
- Demonstrates how to delete bloom filtering
- Run a DistributedGrep
- Demonstrates how to submit a DistinctUserCount
- The main method for testing
- Main entry point for testing
- Runs the average driver
- Runs a job
- Entry point to the JobChaining driver
- Command-line tool
- Runs a Bloom filtering algorithm
- Main command line entry point
- Test program
- Entry point for the Replicated join driver
- Entry point for the join
- Entry point for the composite join driver
- Main method
- Creates a new bloom filter
- Main command line
Trending Discussions on mapreducepatterns
QUESTION
I'm new to Hadoop and am currently learning MapReduce design patterns from the book MapReduce Design Patterns by Donald Miner and Adam Shook. The book describes a Cartesian Product pattern. My questions are:
- When does the record reader send data to the mapper?
- Where is the code that sends the data to the mapper?
From what I can see, the next function in the CartesianRecordReader class reads both splits without sending the data anywhere.
Here is the source code: https://github.com/adamjshook/mapreducepatterns/blob/master/MRDP/src/main/java/mrdp/ch5/CartesianProduct.java
That's all, thanks in advance :)
ANSWER
Answered 2018-Dec-04 at 07:27
When does the record reader send data to the mapper?
Let me answer by giving you an idea of how the mapper and the RecordReader are related. This is the Hadoop code that sends data to the mapper:
RecordReader input;
K1 key = input.createKey();
V1 value = input.createValue();
while (input.next(key, value)) {
    // map pair to output
    mapper.map(key, value, output, reporter);
    if (incrProcCount) {
        reporter.incrCounter(SkipBadRecords.COUNTER_GROUP,
            SkipBadRecords.COUNTER_MAP_PROCESSED_RECORDS, 1);
    }
}
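Basically, Hadoop calls next until it returns false, and on every call key and value are filled with new contents. In other words, the record reader never pushes data to the mapper. The framework pulls each record from the reader and passes it to map. For plain text input, the key is normally the byte offset of the line within the file and the value is the line itself.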
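To make the pull model concrete, here is a minimal, self-contained sketch that imitates the loop above without depending on Hadoop. The names PullLoopSketch, SimpleRecordReader, SimpleMapper and LineReader are hypothetical stand-ins invented for illustration and are not Hadoop classes. The offset calculation assumes single-byte characters and a one-byte line terminator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PullLoopSketch {

    /** Hypothetical stand-in for a record reader: fills reusable key/value holders. */
    interface SimpleRecordReader {
        boolean next(long[] key, StringBuilder value);
    }

    /** Hypothetical stand-in for a mapper. */
    interface SimpleMapper {
        void map(long key, String value, List<String> output);
    }

    /** Serves lines from memory; the key is the offset at which each line starts. */
    static class LineReader implements SimpleRecordReader {
        private final List<String> lines;
        private int index = 0;
        private long offset = 0;

        LineReader(List<String> lines) {
            this.lines = lines;
        }

        @Override
        public boolean next(long[] key, StringBuilder value) {
            if (index >= lines.size()) {
                return false; // no more records: the driving loop stops
            }
            String line = lines.get(index++);
            key[0] = offset;
            value.setLength(0);
            value.append(line);
            // Assumption: one byte per character plus one byte for the newline.
            offset += line.length() + 1;
            return true;
        }
    }

    /** The driving loop: pull a record from the reader, then hand it to the mapper. */
    static List<String> run(SimpleRecordReader input, SimpleMapper mapper) {
        List<String> output = new ArrayList<>();
        long[] key = new long[1];                  // reused for every record
        StringBuilder value = new StringBuilder(); // reused for every record
        while (input.next(key, value)) {
            mapper.map(key[0], value.toString(), output);
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> out = run(
            new LineReader(Arrays.asList("alpha", "beta", "gamma")),
            (k, v, o) -> o.add(k + ":" + v.toUpperCase()));
        System.out.println(out); // prints [0:ALPHA, 6:BETA, 11:GAMMA]
    }
}
```

The reader only answers "is there another record?" and fills the holders. The call to map happens in the loop that owns both the reader and the mapper, which is why a reader's own next method contains no code that sends data to the mapper.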
Where is the code that sends the data to the mapper?
That code is in the Hadoop source (probably in the MapContextImpl class), and it resembles what I wrote in the code snippet above.
EDIT: The source code is in MapRunner.
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install mapreducepatterns
You can use mapreducepatterns like any standard Java library. Include the jar files in your classpath. You can also use any IDE to run and debug the mapreducepatterns component as you would any other Java program. Best practice is to use a build tool that supports dependency management, such as Maven or Gradle. For Maven installation, refer to maven.apache.org. For Gradle installation, refer to gradle.org.