Excerpted from the "elephant book" (Hadoop: The Definitive Guide).
A single Job can read data from several input sources, whether they share a format or not, and give each source its own Mapper:
MultipleInputs.addInputPath(conf, ncdcInputPath,
    TextInputFormat.class, MaxTemperatureMapper.class);
MultipleInputs.addInputPath(conf, metOfficeInputPath,
    TextInputFormat.class, MetOfficeMaxTemperatureMapper.class);
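To make the idea concrete without a cluster, here is a minimal plain-Java sketch of what the two calls above set up: each source is paired with its own parsing function, and all mapped records flow into one shared reduce step. It does not use Hadoop at all. The class `MultipleInputsSketch`, the `Source` holder, and both line formats are assumptions made up for illustration, not the book's mappers.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Plain-Java simulation of "one mapper per input source"; not Hadoop code.
class MultipleInputsSketch {

    // Pairs the lines of one input source with the mapper that understands them.
    static final class Source {
        final List<String> lines;
        final Function<String, Map.Entry<String, Integer>> mapper;

        Source(List<String> lines, Function<String, Map.Entry<String, Integer>> mapper) {
            this.lines = lines;
            this.mapper = mapper;
        }
    }

    // Hypothetical format A: "year,temperature"
    static Map.Entry<String, Integer> mapCsv(String line) {
        String[] f = line.split(",");
        return new SimpleEntry<>(f[0], Integer.parseInt(f[1]));
    }

    // Hypothetical format B: "temperature year" (whitespace separated)
    static Map.Entry<String, Integer> mapSpaced(String line) {
        String[] f = line.trim().split("\\s+");
        return new SimpleEntry<>(f[1], Integer.parseInt(f[0]));
    }

    // Runs every source through its own mapper, then reduces to max per year.
    static Map<String, Integer> maxByYear(List<Source> sources) {
        Map<String, Integer> max = new TreeMap<>();
        for (Source source : sources) {
            for (String line : source.lines) {
                Map.Entry<String, Integer> kv = source.mapper.apply(line);
                max.merge(kv.getKey(), kv.getValue(), Math::max);
            }
        }
        return max;
    }
}
```

The point of the design is that the reduce side never needs to know which format a record came from: every mapper emits the same (year, temperature) pair type.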
MultipleOutputFormat lets you name and split the reduce output files according to a rule of your own, for example:
...
static class StationNameMultipleTextOutputFormat
    extends MultipleTextOutputFormat<NullWritable, Text> {
  private NcdcRecordParser parser = new NcdcRecordParser();

  // The returned string becomes the output file name: one file per station ID.
  protected String generateFileNameForKeyValue(NullWritable key, Text value,
      String name) {
    parser.parse(value);
    return parser.getStationId();
  }
}
...
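The effect of overriding generateFileNameForKeyValue can be sketched in plain Java, again without Hadoop: each record is asked for a file name, and records that give the same name end up in the same output. The class `StationFileNameSketch` and the record layout "stationId,year,temperature" are assumptions for illustration only; real NCDC records are parsed by NcdcRecordParser instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of per-record output file naming; not Hadoop code.
class StationFileNameSketch {

    // Plays the role of generateFileNameForKeyValue for a hypothetical
    // "stationId,year,temperature" record: the station ID is the file name.
    static String fileNameFor(String record) {
        return record.substring(0, record.indexOf(','));
    }

    // Groups records by the file name each one maps to.
    static Map<String, List<String>> partition(List<String> records) {
        Map<String, List<String>> files = new TreeMap<>();
        for (String record : records) {
            files.computeIfAbsent(fileNameFor(record), k -> new ArrayList<>())
                 .add(record);
        }
        return files;
    }
}
```

Each key of the returned map stands for one output file, and its list holds the records that would be written to that file.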
There is also the MultipleOutputs class, which is not covered here.