Advantages of using NullWritable in Hadoop
Question
What are the advantages of using NullWritable for null keys/values over using null texts (i.e. new Text(null))? I read the following in the book «Hadoop: The Definitive Guide».
NullWritable is a special type of Writable, as it has a zero-length serialization. No bytes are written to, or read from, the stream. It is used as a placeholder; for example, in MapReduce, a key or a value can be declared as a NullWritable when you don't need to use that position; it effectively stores a constant empty value. NullWritable can also be useful as a key in SequenceFile when you want to store a list of values, as opposed to key-value pairs. It is an immutable singleton: the instance can be retrieved by calling NullWritable.get().
I do not clearly understand how the output is written when NullWritable is used. Will there be a single constant value at the beginning of the output file indicating that the keys or values of this file are null, so that the MapReduce framework can skip reading the null keys/values (whichever is null)? Also, how are null texts actually serialized?
Thanks,
Venkat
Recommended answer
The key/value types must be given at runtime, so anything writing or reading NullWritables will know ahead of time that it will be dealing with that type; there is no marker or anything in the file. And technically the NullWritables are "read"; it's just that "reading" a NullWritable is actually a no-op. You can see for yourself that nothing at all is written or read:
import java.io.*;
import java.util.Arrays;
import org.apache.hadoop.io.NullWritable;

NullWritable nw = NullWritable.get();
ByteArrayOutputStream out = new ByteArrayOutputStream();
nw.write(new DataOutputStream(out)); // writes zero bytes
System.out.println(Arrays.toString(out.toByteArray())); // prints "[]"
ByteArrayInputStream in = new ByteArrayInputStream(new byte[0]);
nw.readFields(new DataInputStream(in)); // works just fine: nothing is read from the empty stream
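To make the "no marker in the file" point more concrete, here is a rough sketch that uses only java.io, so it runs without Hadoop on the classpath. MiniWritable, NullKey and IntValue are hypothetical stand-ins I made up for illustration; they are not Hadoop classes, and a real SequenceFile adds its own header and record framing around the key/value bytes. The sketch only shows that when writer and reader agree on the types ahead of time, a zero-length key contributes nothing to the stream and the values simply sit back to back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ZeroLengthKeySketch {

    // Hypothetical stand-in for Hadoop's Writable interface.
    interface MiniWritable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Stand-in for NullWritable: a singleton whose write and read are no-ops.
    static final class NullKey implements MiniWritable {
        static final NullKey INSTANCE = new NullKey();
        private NullKey() {}
        @Override public void write(DataOutput out) {}
        @Override public void readFields(DataInput in) {}
    }

    // Stand-in for a simple value type; serializes as exactly 4 bytes.
    static final class IntValue implements MiniWritable {
        int value;
        IntValue() {}
        IntValue(int value) { this.value = value; }
        @Override public void write(DataOutput out) throws IOException { out.writeInt(value); }
        @Override public void readFields(DataInput in) throws IOException { value = in.readInt(); }
    }

    // Writes (key, value) records back to back, with no type marker anywhere.
    public static byte[] writeRecords(int... values) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        for (int v : values) {
            NullKey.INSTANCE.write(out);  // contributes zero bytes
            new IntValue(v).write(out);   // contributes four bytes
        }
        return bytes.toByteArray();
    }

    // Reads the records back; the reader already knows the key and value types.
    public static List<Integer> readRecords(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        List<Integer> result = new ArrayList<>();
        IntValue value = new IntValue();
        while (in.available() > 0) {
            NullKey.INSTANCE.readFields(in); // "reads" the key, which is a no-op
            value.readFields(in);
            result.add(value.value);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = writeRecords(7, 8, 9);
        System.out.println(data.length);       // prints 12: three 4-byte values, nothing for the keys
        System.out.println(readRecords(data)); // prints [7, 8, 9]
    }
}
```

The reader never has to "skip" anything: it calls readFields on the key type it was told about, and for the null stand-in that call consumes no bytes.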
And as for your question about new Text(null), again, you can try it out:
Text text = new Text((String)null);
ByteArrayOutputStream out = new ByteArrayOutputStream();
text.write(new DataOutputStream(out)); // throws NullPointerException
System.out.println(Arrays.toString(out.toByteArray()));
Text will not work at all with a null String.
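For comparison, the closest thing to a "null" Text is an empty one, and that is not free. As far as I know, Text serializes as a variable-length integer holding the byte length, followed by the UTF-8 bytes, so even an empty Text should cost one byte per record. The sketch below imitates that layout with plain java.io under a simplifying assumption: writeText is a hypothetical helper, not Hadoop's code, and it only handles encodings of up to 127 bytes, where a single length byte is enough.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class NullTextSketch {

    // Simplified imitation of a length-prefixed text format (hypothetical helper).
    // Assumption: the UTF-8 encoding is at most 127 bytes, so one length byte suffices.
    public static byte[] writeText(String s) throws IOException {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8); // NullPointerException if s is null
        if (utf8.length > 127) {
            throw new IllegalArgumentException("sketch supports at most 127 bytes");
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeByte(utf8.length); // length prefix
        out.write(utf8);            // payload
        return bytes.toByteArray();
    }

    // Returns true if encoding a null string fails with a NullPointerException.
    public static boolean failsOnNull() throws IOException {
        try {
            writeText(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writeText("").length);   // prints 1: just the length byte
        System.out.println(writeText("ab").length); // prints 3: length byte plus two payload bytes
        System.out.println(failsOnNull());          // prints true
    }
}
```

So with this kind of layout there is no encoding for "no string at all": a null has no bytes to measure, and an empty string still writes its length prefix, whereas a NullWritable position writes nothing.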