org.apache.hadoop.examples
Class RandomTextWriter
java.lang.Object
org.apache.hadoop.conf.Configured
org.apache.hadoop.examples.RandomTextWriter
- All Implemented Interfaces:
- org.apache.hadoop.conf.Configurable, org.apache.hadoop.util.Tool
public class RandomTextWriter
- extends org.apache.hadoop.conf.Configured
- implements org.apache.hadoop.util.Tool
This program uses map/reduce to run a distributed job in which the tasks
do not interact and each task writes a large, unsorted, random sequence
of words.
To generate data for terasort with 5-10 words per key and 20-100 words
per value, use the following configuration:
mapreduce.randomtextwriter.minwordskey = 5
mapreduce.randomtextwriter.maxwordskey = 10
mapreduce.randomtextwriter.minwordsvalue = 20
mapreduce.randomtextwriter.maxwordsvalue = 100
mapreduce.randomtextwriter.totalbytes = 1099511627776
Equivalently, RandomTextWriter also supports all of the above options, as
well as those supported by Tool, via the command line.
To run: bin/hadoop jar hadoop-${version}-examples.jar randomtextwriter [-outFormat <output format class>] <output>
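As an illustration of passing the options above on the command line, the following is a minimal sketch that assembles the argument list using the generic `-D name=value` form that Tool-based programs accept. The class RandomTextWriterCommand and its methods are hypothetical helpers, not part of Hadoop, and the jar name and output directory are placeholders taken from the usage line above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of Hadoop) that assembles a randomtextwriter command line. */
final class RandomTextWriterCommand {

    /** Returns the terasort-style settings listed in the class description, in order. */
    static Map<String, String> terasortOptions() {
        Map<String, String> opts = new LinkedHashMap<>();
        opts.put("mapreduce.randomtextwriter.minwordskey", "5");
        opts.put("mapreduce.randomtextwriter.maxwordskey", "10");
        opts.put("mapreduce.randomtextwriter.minwordsvalue", "20");
        opts.put("mapreduce.randomtextwriter.maxwordsvalue", "100");
        opts.put("mapreduce.randomtextwriter.totalbytes", "1099511627776");
        return opts;
    }

    /** Builds the argument list; generic -D options go before the output directory. */
    static List<String> build(String examplesJar, Map<String, String> options, String outputDir) {
        List<String> cmd = new ArrayList<>();
        cmd.add("bin/hadoop");
        cmd.add("jar");
        cmd.add(examplesJar);
        cmd.add("randomtextwriter");
        for (Map.Entry<String, String> e : options.entrySet()) {
            cmd.add("-D");
            cmd.add(e.getKey() + "=" + e.getValue());
        }
        cmd.add(outputDir);
        return cmd;
    }

    public static void main(String[] args) {
        // Placeholder jar name and output directory; substitute real values before running.
        List<String> cmd = build("hadoop-${version}-examples.jar", terasortOptions(), "output");
        System.out.println(String.join(" ", cmd));
    }
}
```

The sketch only prints the command. Whether to launch it with ProcessBuilder or paste it into a shell is left to the caller.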
Method Summary
static void  main(String[] args)
int          run(String[] args)
             This is the main routine for launching a distributed random write job.
Methods inherited from class org.apache.hadoop.conf.Configured
  getConf, setConf
Methods inherited from class java.lang.Object
  clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.hadoop.conf.Configurable
  getConf, setConf
TOTAL_BYTES
public static final String TOTAL_BYTES
- See Also:
- Constant Field Values
BYTES_PER_MAP
public static final String BYTES_PER_MAP
- See Also:
- Constant Field Values
MAPS_PER_HOST
public static final String MAPS_PER_HOST
- See Also:
- Constant Field Values
MAX_VALUE
public static final String MAX_VALUE
- See Also:
- Constant Field Values
MIN_VALUE
public static final String MIN_VALUE
- See Also:
- Constant Field Values
MIN_KEY
public static final String MIN_KEY
- See Also:
- Constant Field Values
MAX_KEY
public static final String MAX_KEY
- See Also:
- Constant Field Values
RandomTextWriter
public RandomTextWriter()
run
public int run(String[] args)
throws Exception
- This is the main routine for launching a distributed random write job.
By default it runs 10 maps per node, and each map writes 1 GB of data to a DFS file.
The reduce phase does nothing.
- Specified by:
run
in interface org.apache.hadoop.util.Tool
- Throws:
IOException
Exception
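The sizing described for run can be sketched with plain arithmetic. This is a rough illustration only: it treats the 1 GB figure as a per-map budget and the 10 maps per node as a default, and the class RandomTextJobSizing and its method names are hypothetical, not Hadoop API.

```java
/** Hypothetical sizing sketch; names and defaults are assumptions, not Hadoop API. */
final class RandomTextJobSizing {

    /** 1 GB expressed as 2^30 bytes (assumed per-map budget). */
    static final long ONE_GIB = 1L << 30;

    /** Total bytes when no explicit total is given: maps per node * bytes per map * nodes. */
    static long defaultTotalBytes(int mapsPerHost, long bytesPerMap, int nodes) {
        return Math.multiplyExact(Math.multiplyExact((long) mapsPerHost, bytesPerMap), (long) nodes);
    }

    /** Number of map tasks for a given total; at least one map if there is anything to write. */
    static int numMaps(long totalBytes, long bytesPerMap) {
        if (bytesPerMap <= 0) {
            throw new IllegalArgumentException("bytesPerMap must be positive");
        }
        if (totalBytes < 0) {
            throw new IllegalArgumentException("totalBytes must not be negative");
        }
        long maps = totalBytes / bytesPerMap;
        if (maps == 0 && totalBytes > 0) {
            maps = 1;
        }
        return Math.toIntExact(maps);
    }

    public static void main(String[] args) {
        // totalbytes from the class description: 1099511627776 = 2^40 bytes.
        System.out.println(numMaps(1099511627776L, ONE_GIB)); // 1024
        // Four nodes is an arbitrary example cluster size.
        System.out.println(defaultTotalBytes(10, ONE_GIB, 4));
    }
}
```

Under these assumptions, the totalbytes value from the class description corresponds to 1024 map tasks of 1 GB each.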
main
public static void main(String[] args)
throws Exception
- Throws:
Exception
Copyright © 2009 The Apache Software Foundation