Web Scraping in a Google Chrome Extension (JavaScript + Chrome APIs)

Problem Description

What are the best options for performing web scraping of a tab that is not currently open, from within a Google Chrome extension, using JavaScript and whatever other technologies are available? Other JavaScript libraries are also accepted.

The important thing is to mask the scraping so that it behaves like a normal web request. There should be no indications of AJAX or XMLHttpRequest, such as the X-Requested-With: XMLHttpRequest or Origin headers.

The scraped content must be accessible from JavaScript for further manipulation and presentation within the extension, most probably as a string.

Are there any hooks in any WebKit/Chrome-specific APIs that can be used to make a normal web request and get the results for manipulation?

var pageContent = getPageContent(url); // TODO: Implement
var items = $(pageContent).find('.item');
// Display items with further selections
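
One possible shape for the getPageContent placeholder above, offered only as a sketch and not as an answer from the original thread. It assumes the extension's manifest grants host permissions for the target URL, and it uses the standard fetch API, which is asynchronous and so differs from the synchronous call in the snippet. The fetchImpl parameter is a hypothetical addition so the function can be exercised outside the browser. fetch does not itself add X-Requested-With (that header is set by libraries such as jQuery); whether an Origin header is sent depends on the browser and the request, and has not been verified here.

```javascript
// Hypothetical implementation of the getPageContent placeholder.
// Assumption: the extension has host permissions for `url`.
// `fetchImpl` is an illustrative parameter, defaulting to the global fetch.
async function getPageContent(url, fetchImpl = fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    // Surface HTTP errors instead of returning an error page as content.
    throw new Error('Request failed with status ' + response.status);
  }
  // Resolves to the response body as a string.
  return response.text();
}
```

Because the call returns a promise, the jQuery line from the snippet would move into a then callback (or follow an await), for example getPageContent(url).then(function (pageContent) { var items = $(pageContent).find('.item'); }).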

Bonus points for making this work from a local file on disk, for initial debugging. But if that is the only thing blocking a solution, disregard the bonus points.

