Question
I thought that job.setOutputKeyClass and job.setOutputValueClass refer to the Reducer, but in my program I have
public static class MyMapper extends Mapper
and
public static class MyReducer extends Reducer<Text, Text, NullWritable, Text>
If I have
job.setOutputKeyClass(NullWritable.class);
job.setOutputValueClass(Text.class);
I get the following exception
Type mismatch in key from map: expected org.apache.hadoop.io.NullWritable, received org.apache.hadoop.io.Text
But if I have
job.setOutputKeyClass(Text.class);
there is no problem.
Is there something wrong with my code, or does this happen because of NullWritable or something else?
Do I also have to call job.setInputFormatClass and job.setOutputFormatClass? My programs run correctly without them.
Recommended answer
Calling job.setOutputKeyClass(NullWritable.class); sets the key type expected as output from both the map and reduce phases, unless the map output type is set separately. Your Mapper emits Text keys while the job expects NullWritable, hence the exception.
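To make the fallback concrete, here is a small plain-Java model of the rule. It is a sketch, not Hadoop's code: `OutputTypeSketch`, its methods, and the nested `Text` and `NullWritable` stand-in classes are all hypothetical names made up for illustration. The only behaviour it models is that the expected map output key type defaults to the job output key type, and that an emitted key of a different class is rejected.

```java
// Simplified, hypothetical model of the type check; this is not Hadoop's code.
class OutputTypeSketch {
    // Stand-ins for org.apache.hadoop.io.Text and NullWritable.
    static final class Text {}
    static final class NullWritable {}

    private Class<?> outputKeyClass;     // models job.setOutputKeyClass(...)
    private Class<?> mapOutputKeyClass;  // models job.setMapOutputKeyClass(...); null if unset

    void setOutputKeyClass(Class<?> c) { outputKeyClass = c; }
    void setMapOutputKeyClass(Class<?> c) { mapOutputKeyClass = c; }

    // The map-side key type falls back to the job output key type when unset.
    Class<?> expectedMapKeyClass() {
        return mapOutputKeyClass != null ? mapOutputKeyClass : outputKeyClass;
    }

    // Rejects a map output key whose class differs from the expected one.
    void checkMapKey(Object key) {
        Class<?> expected = expectedMapKeyClass();
        if (key.getClass() != expected) {
            throw new IllegalStateException("Type mismatch in key from map: expected "
                    + expected.getName() + ", received " + key.getClass().getName());
        }
    }

    public static void main(String[] args) {
        OutputTypeSketch job = new OutputTypeSketch();
        job.setOutputKeyClass(NullWritable.class);
        try {
            job.checkMapKey(new Text());      // mapper emits Text keys: rejected
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        job.setMapOutputKeyClass(Text.class); // declare the mapper's key type
        job.checkMapKey(new Text());          // now accepted
        System.out.println("map key accepted");
    }
}
```

This also suggests why setting the output key class to Text made the error disappear in the question: the mapper's Text keys then match the expected type, even though that setting no longer describes what the Reducer writes.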
If your Mapper emits different types than the Reducer, you can set the types emitted by the mapper with the JobConf's setMapOutputKeyClass() and setMapOutputValueClass() methods. These implicitly set the input types expected by the Reducer.
(Source: Yahoo Developer Tutorial)
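Applied to the question's job, a driver might look roughly like the sketch below. It assumes the newer org.apache.hadoop.mapreduce API, where the Job class offers setMapOutputKeyClass() and setMapOutputValueClass() under the same names as JobConf. The class name `MyDriver`, the job name, the mapper's input types (LongWritable and Text, as the default TextInputFormat would supply) and the map and reduce bodies are assumptions made for illustration; the question does not show them. Compiling it requires the Hadoop libraries on the classpath.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MyDriver {

    // Assumed input types: LongWritable offset and Text line (default TextInputFormat).
    public static class MyMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            // Hypothetical logic: key each line by its first tab-separated field.
            String firstField = line.toString().split("\t", 2)[0];
            context.write(new Text(firstField), line);
        }
    }

    public static class MyReducer extends Reducer<Text, Text, NullWritable, Text> {
        @Override
        protected void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            // Hypothetical logic: drop the key and write every value.
            for (Text value : values) {
                context.write(NullWritable.get(), value);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "my job");
        job.setJarByClass(MyDriver.class);
        job.setMapperClass(MyMapper.class);
        job.setReducerClass(MyReducer.class);

        // Types emitted by the Mapper (and therefore consumed by the Reducer).
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);

        // Types emitted by the Reducer, i.e. the final job output.
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

With the map output types declared separately, the NullWritable output key applies only to what the Reducer writes, so the mapper's Text keys should no longer trigger the type mismatch.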
Regarding your second question, the default InputFormat is TextInputFormat. It treats each line of each input file as a separate record and performs no parsing. You only need to call these methods if you want to process your input in a different format. Here are some examples:
InputFormat | Description | Key | Value
--------------------------------------------------------------------------------------------------------------------------------------------------------
TextInputFormat | Default format; reads lines of text files | The byte offset of the line | The line contents
KeyValueTextInputFormat | Parses lines into key, val pairs | Everything up to the first tab character | The remainder of the line
SequenceFileInputFormat | A Hadoop-specific high-performance binary format | user-defined | user-defined
The default OutputFormat is TextOutputFormat, which writes (key, value) pairs on individual lines of a text file. Some examples:
OutputFormat | Description
---------------------------------------------------------------------------------------------------------
TextOutputFormat | Default; writes lines in "key value" form
SequenceFileOutputFormat | Writes binary files suitable for reading into subsequent MapReduce jobs
NullOutputFormat | Disregards its inputs
(Source: another Yahoo Developer Tutorial)
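If you do want non-default formats, the calls could look like the following sketch. It assumes the newer org.apache.hadoop.mapreduce API; the class name `FormatConfigSketch`, the helper `newJob` and the job name are hypothetical. KeyValueTextInputFormat is the class in org.apache.hadoop.mapreduce.lib.input that splits each line at the first tab. Compiling it requires the Hadoop libraries on the classpath.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class FormatConfigSketch {

    // Hypothetical helper: builds a job that overrides both default formats.
    static Job newJob(Configuration conf) throws IOException {
        Job job = Job.getInstance(conf, "format sketch");

        // Input: each line is split at the first tab into a (Text key, Text value) pair.
        job.setInputFormatClass(KeyValueTextInputFormat.class);

        // Output: binary sequence files instead of plain text lines.
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        return job;
    }
}
```

Note that changing the input format changes the types your Mapper receives: with KeyValueTextInputFormat the input key is a Text, not the LongWritable byte offset that TextInputFormat supplies. Leaving both calls out, as in the question, simply keeps TextInputFormat and TextOutputFormat.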