Problem description
I'm having some difficulty understanding how data gets passed into the map and reduce functions in Hadoop. I know that we can define the input format and output format, and then the key and value types for the input and output. But, for example, if we want an object to be the input type, how does Hadoop handle that internally?
Thanks.
Recommended answer
You can use Hadoop's InputFormat and OutputFormat APIs to create your own custom formats. One example is formatting the output of your MapReduce job as JSON, something like this:
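import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class JsonOutputFormat extends TextOutputFormat<Text, IntWritable> {

    @Override
    public RecordWriter<Text, IntWritable> getRecordWriter(TaskAttemptContext context)
            throws IOException, InterruptedException {
        Configuration conf = context.getConfiguration();
        Path path = getOutputPath(context);
        FileSystem fs = path.getFileSystem(conf);
        // The file is named after the job, so every task targets the same path;
        // this is only safe with a single reducer.
        FSDataOutputStream out = fs.create(new Path(path, context.getJobName()));
        return new JsonRecordWriter(out);
    }

    private static class JsonRecordWriter extends LineRecordWriter<Text, IntWritable> {
        private boolean firstRecord = true;

        public JsonRecordWriter(DataOutputStream out) throws IOException {
            super(out);
            out.writeBytes("{"); // open the JSON object before the first record
        }

        @Override
        public synchronized void write(Text key, IntWritable value) throws IOException {
            if (!firstRecord) {
                out.writeBytes(",\n"); // separator between records, not before the first
            }
            firstRecord = false;
            // Keys are written as-is; they are not JSON-escaped here.
            String entry = "\"" + key.toString() + "\":\"" + value.toString() + "\"";
            out.write(entry.getBytes(StandardCharsets.UTF_8));
        }

        @Override
        public synchronized void close(TaskAttemptContext context) throws IOException {
            out.writeBytes("}"); // close the JSON object after the last record
            super.close(context);
        }
    }
}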
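On the original question of using an object as a key or value type: Hadoop moves keys and values between tasks as serialized bytes, and the types used in map and reduce implement its `Writable` interface (`write(DataOutput)` and `readFields(DataInput)`); key types additionally implement `WritableComparable` so they can be sorted. Below is a minimal sketch of that contract in plain Java, with no Hadoop dependency so it runs on its own. The nested `Writable` interface is a local stand-in that mirrors the two methods of `org.apache.hadoop.io.Writable`, and `Point` and `roundTrip` are hypothetical names used only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class WritableSketch {

    // Local stand-in (assumption): same two methods as org.apache.hadoop.io.Writable.
    interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Hypothetical value type carrying two integer fields.
    static class Point implements Writable {
        int x;
        int y;

        // No-arg constructor: an empty instance is created first,
        // then populated by readFields.
        Point() { }

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeInt(x);
            out.writeInt(y);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            // Fields must be read in the same order they were written.
            x = in.readInt();
            y = in.readInt();
        }
    }

    // Serializes a Point to bytes and reads it back into a fresh instance.
    static Point roundTrip(Point original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                original.write(out);
            }
            Point copy = new Point();
            try (DataInputStream in =
                    new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                copy.readFields(in);
            }
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Point copy = roundTrip(new Point(3, 4));
        System.out.println(copy.x + "," + copy.y); // prints 3,4
    }
}
```

In a real job the class would implement Hadoop's own `Writable` (or `WritableComparable` for keys) instead of the stand-in, and would be declared as the job's key or value class in the driver.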