UTF-16 to UTF-8 conversion in JavaScript

2023-10-01 Front-end development questions

Question

I have Base64-encoded data that is in UTF-16. I am trying to decode the data, but most libraries only support UTF-8. I believe I have to drop the null bytes, but I am unsure how.

Currently I am using David Chambers's polyfill for Base64, but I have also tried other libraries such as phpjs.org, none of which support UTF-16.

One thing to point out: in Chrome the atob method works without problems, in Firefox I get the results described here, and in IE only the first character is returned.

Any help is greatly appreciated.

Answer

You want to decode UTF-16, not convert to UTF-8. Decoding means that the result is a string of abstract characters. Of course there is an internal encoding for strings as well, UTF-16 or UCS-2 in JavaScript, but that's an implementation detail.
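To illustrate the difference between a decoded string and its encodings, here is a minimal sketch. It assumes an environment that provides the standard TextEncoder (modern browsers and Node.js); the variable names are only for illustration. If UTF-8 bytes really are needed, they can be produced from the decoded string as a separate step:

```javascript
// A decoded string: characters, with no byte encoding attached to it.
var text = "\u201Chello\u201D"; // left quote, "hello", right quote

// The string itself is addressed in UTF-16 code units.
var firstUnit = text.charCodeAt(0); // 0x201C

// Encoding to UTF-8 is an explicit, separate step that yields bytes.
var utf8Bytes = new TextEncoder().encode(text);

// Each smart quote takes 3 bytes in UTF-8, each ASCII letter takes 1.
var byteCount = utf8Bytes.length; // 3 + 5 + 3 = 11
```

The same seven characters are 14 bytes in UTF-16 and 11 bytes in UTF-8, while the string value is identical in both cases.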

With strings the goal is that you don't have to worry about encodings but just about manipulating characters "as they are". So you can write string methods that don't need to decode input at all. Of course there are many edge cases where this falls apart.

You cannot decode UTF-16 just by removing nulls. This will work fine for the first 256 code points of Unicode, whose high byte is zero, but you will get garbage when any of the other ~110000 characters in Unicode are used. You cannot even get the most popular non-ASCII characters, like the em dash or any smart quotes, working.
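A small sketch of why null stripping breaks down. The helper stripNulls is hypothetical and exists only for this illustration; the byte values are the UTF-16LE encodings of "Hi" and of U+201C (left double quotation mark):

```javascript
// Hypothetical helper: the naive "drop the null bytes" approach.
function stripNulls(binaryStr) {
    return binaryStr.replace(/\0/g, "");
}

// UTF-16LE bytes for "Hi": 0x48 0x00 0x69 0x00, as a binary string.
var asciiBytes = "\x48\x00\x69\x00";

// UTF-16LE bytes for U+201C: 0x1C 0x20. Neither byte is a null.
var quoteBytes = "\x1C\x20";

// Works, but only because every high byte happens to be zero.
var asciiResult = stripNulls(asciiBytes); // "Hi"

// Nothing is stripped: the two bytes stay a control character and a space
// instead of being combined into the single code unit 0x201C.
var quoteResult = stripNulls(quoteBytes); // "\x1C "
```

The second case is the garbage described above: the two bytes of one character are treated as two unrelated characters.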

Also, looking at your example, it looks like UTF-16LE.

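// Braindead decoder that assumes fully valid input
// (an even number of bytes, one byte per character of the binary string).
function decodeUTF16LE( binaryStr ) {
    var cp = [];
    for( var i = 0; i < binaryStr.length; i += 2 ) {
        // Little-endian: the low byte comes first, the high byte second.
        cp.push(
            binaryStr.charCodeAt(i) |
            ( binaryStr.charCodeAt(i + 1) << 8 )
        );
    }

    // Each value is a UTF-16 code unit, so surrogate pairs pass through intact.
    return String.fromCharCode.apply( String, cp );
}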

var base64decode = atob; // In Chrome and Firefox, atob is a native method for Base64 decoding

var base64 = "VABlAHMAdABpAG4AZwA=";
var binaryStr = base64decode(base64);
var result = decodeUTF16LE(binaryStr);
// "Testing"

Now you can even get smart quotes working:

var base64 = "HCBoAGUAbABsAG8AHSA=";
var binaryStr = base64decode(base64);
var result = decodeUTF16LE(binaryStr);
// “hello” (U+201C and U+201D around "hello")
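As an alternative sketch, environments that provide the standard TextDecoder (modern browsers and Node.js) can do the UTF-16LE step natively. The function name decodeBase64UTF16LE is an assumption made for this example, not part of any library. This route also avoids passing every code unit as a separate argument to String.fromCharCode.apply, which can exceed engine argument limits on large inputs:

```javascript
// Sketch: decode a Base64 string that holds UTF-16LE text.
// Assumes global atob and TextDecoder are available.
function decodeBase64UTF16LE(base64) {
    // atob returns a binary string: one character per decoded byte.
    var binaryStr = atob(base64);

    // Copy the byte values into a typed array for TextDecoder.
    var bytes = new Uint8Array(binaryStr.length);
    for (var i = 0; i < binaryStr.length; i++) {
        bytes[i] = binaryStr.charCodeAt(i);
    }

    // "utf-16le" is a standard TextDecoder label.
    return new TextDecoder("utf-16le").decode(bytes);
}

var plain = decodeBase64UTF16LE("VABlAHMAdABpAG4AZwA=");  // "Testing"
var quoted = decodeBase64UTF16LE("HCBoAGUAbABsAG8AHSA="); // “hello”
```

Unlike the hand-written decoder, TextDecoder replaces malformed sequences with U+FFFD by default rather than passing them through, so the two can differ on invalid input.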
