Problem description
I have a HashMap<String, ArrayList<Integer>>. I want to serialize my HashMap object (hmap) to an HDFS location and later deserialize it in the Mappers and Reducers to use it.
To serialize my HashMap object to HDFS I used ordinary Java object serialization code, shown below, but got an error (permission denied):
try
{
    FileOutputStream fileOut = new FileOutputStream("hashmap.ser");
    ObjectOutputStream out = new ObjectOutputStream(fileOut);
    out.writeObject(hm);
    out.close();
}
catch (Exception e)
{
    e.printStackTrace();
}
I got the following exception:
java.io.FileNotFoundException: hashmap.ser (Permission denied)
    at java.io.FileOutputStream.open(Native Method)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:110)
    at KMerIndex.createIndex(KMerIndex.java:121)
    at MyDriverClass.formRefIndex(MyDriverClass.java:717)
    at MyDriverClass.main(MyDriverClass.java:768)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Can someone please suggest or share sample code showing how to serialize an object to HDFS in Hadoop?
Recommended answer
Please try using SerializationUtils from Apache Commons Lang.
Its methods are listed below:
static Object clone(Serializable object) //Deep clone an Object using serialization.
static Object deserialize(byte[] objectData) //Deserializes a single Object from an array of bytes.
static Object deserialize(InputStream inputStream) //Deserializes an Object from the specified stream.
static byte[] serialize(Serializable obj) //Serializes an Object to a byte array for storage/serialization.
static void serialize(Serializable obj, OutputStream outputStream) //Serializes an Object to the specified stream.
When storing to HDFS, you can store the byte[] returned by serialize. When reading the object back, cast the result of deserialize to the corresponding type (for example, a File object) to get it back.
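As a rough sketch of the store-and-read-back idea above, the following uses only plain java.io so it runs without Hadoop or Commons Lang on the classpath. The class name HashMapStreamCodec and the sample key are made up for illustration. The point is that the write side accepts any OutputStream: FileOutputStream in the question writes to the local filesystem (which is where the permission error comes from), whereas on a cluster the stream would presumably come from Hadoop's FileSystem.create(Path) and the input stream from FileSystem.open(Path). That Hadoop wiring is an assumption about your setup and is not shown in the code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;

// Hypothetical helper class; not part of Hadoop or Apache Commons Lang.
final class HashMapStreamCodec {

    // Serializes the map to the given stream and closes the stream.
    // Assumption: on HDFS, 'target' would be the stream returned by FileSystem.create(Path).
    static void write(HashMap<String, ArrayList<Integer>> map, OutputStream target)
            throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(target)) {
            out.writeObject(map);
        }
    }

    // Deserializes a map from the given stream and closes the stream.
    // Assumption: on HDFS, 'source' would be the stream returned by FileSystem.open(Path).
    @SuppressWarnings("unchecked")
    static HashMap<String, ArrayList<Integer>> read(InputStream source)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(source)) {
            return (HashMap<String, ArrayList<Integer>>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        HashMap<String, ArrayList<Integer>> hmap = new HashMap<>();
        hmap.put("ACGT", new ArrayList<>(Arrays.asList(3, 17, 42)));

        // An in-memory buffer stands in for the HDFS file in this sketch.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        write(hmap, buffer);

        HashMap<String, ArrayList<Integer>> copy =
                read(new ByteArrayInputStream(buffer.toByteArray()));
        System.out.println(copy.equals(hmap)); // prints "true"
    }
}
```

In a Mapper or Reducer, the read call would typically go in setup() so the map is loaded once per task rather than once per record.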
In my case, I stored a HashMap in an HBase column and retrieved it in my mapper method as a HashMap, and it worked.
Surely you can do the same thing in the same way.
Another option is Apache Commons IO (org.apache.commons.io.FileUtils), which can write the byte array to a local file. You then need to copy that file to HDFS afterwards, since you want HDFS as the data store.
FileUtils.writeByteArrayToFile(new File("pathname"), myByteArray);
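For the local-file route, here is a minimal sketch using only java.nio so it runs without Commons IO. The class name LocalStaging and the file prefix are hypothetical. It writes to the system temp directory rather than the job's working directory, on the assumption that the "Permission denied" in the question came from a working directory the user could not write to. The copy to HDFS is left as a comment because it needs a Hadoop installation, for example the `hdfs dfs -put` command or FileSystem.copyFromLocalFile(Path, Path).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class for staging serialized bytes on local disk.
final class LocalStaging {

    // Writes the bytes to a fresh file in the system temp directory and returns its path.
    static Path stage(byte[] serialized) throws IOException {
        Path tmp = Files.createTempFile("hashmap", ".ser");
        Files.write(tmp, serialized);
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = {1, 2, 3}; // stand-in for the byte[] returned by serialize
        Path staged = stage(payload);
        System.out.println("staged " + Files.size(staged) + " bytes at " + staged);

        // Next step (not shown, requires Hadoop): copy the staged file to HDFS,
        // e.g. with `hdfs dfs -put` or FileSystem.copyFromLocalFile(Path, Path).

        Files.delete(staged); // clean up the local copy
    }
}
```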
Note: both the Apache Commons IO and Apache Commons Lang jars are always available in a Hadoop cluster.