What's the safest way to save password for web scraping using Puppeteer?

I'm trying to scrape a website that requires login. The code snippet below works by reading the username and password from a config JSON file. If someone gets access to the config file, the login information will be leaked. Is there a better way to enhance security, say by encrypting the username and password? Thanks in advance!

await page.goto("https://www.facebook.com/", { waitUntil: "networkidle0" });
await page.type("#email", config.username, { delay: 30 });
await page.type("#pass", config.password, { delay: 30 });

Option 1: JWT

You can use something like JSON Web Token (JWT) to encode the data and store it. The "problem" with JWT is that you need a secret to sign and verify the data, and that secret has to be kept in a safe location (your own head, maybe). So, every time you start the server, you need to give it the secret. Note that jwt.sign only signs the payload; it does not encrypt it. The payload is just base64url-encoded and can be read by anyone who has the stored token, so on its own this obfuscates the credentials rather than protecting them.

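const jwt = require('jsonwebtoken');

// Sign (the payload is encoded and signed, not encrypted)
const data = {username: 'ciro-gomes', password: 'Dá bilhão?'};
const encoded = jwt.sign(data, 'my random token');
// Store the encoded token
// Verify the signature and decode
const decoded = jwt.verify(encoded, 'my random token');
// decoded = {username: 'ciro-gomes', password: 'Dá bilhão?', iat: <issued-at timestamp>}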
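One caveat is worth checking before relying on Option 1: a signed JWT is readable without the secret. The sketch below builds an HS256 token by hand with Node's built-in crypto module (the same header.payload.signature shape that jsonwebtoken emits by default) and then reads the password back without ever using the secret. The helper names signHs256 and readPayloadWithoutSecret are made up for illustration.

```javascript
const crypto = require('crypto');

// Base64url-encode a JSON-serializable value, as the JWT format does.
function b64url(value) {
  return Buffer.from(JSON.stringify(value)).toString('base64url');
}

// Build an HS256-signed token by hand: header.payload.signature
function signHs256(payload, secret) {
  const head = b64url({ alg: 'HS256', typ: 'JWT' });
  const body = b64url(payload);
  const sig = crypto
    .createHmac('sha256', secret)
    .update(`${head}.${body}`)
    .digest('base64url');
  return `${head}.${body}.${sig}`;
}

// Read the payload back WITHOUT knowing the secret:
// the middle segment is plain base64url-encoded JSON.
function readPayloadWithoutSecret(token) {
  const body = token.split('.')[1];
  return JSON.parse(Buffer.from(body, 'base64url').toString('utf8'));
}

const token = signHs256(
  { username: 'ciro-gomes', password: 'Dá bilhão?' },
  'my random token'
);
console.log(readPayloadWithoutSecret(token).password);
```

Running this prints the original password, which shows that the secret protects the token against tampering, not against reading.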

Option 2: Use tokens provided by the website

Normally, after login, the website gives you some cookies that are used to authenticate the user on the next visit (without asking for the password). You should store these cookies and, when needed, restore them in Puppeteer.
Of course, someone could use this data to access the website, but that is the website's issue. Normally, when the user tries to access something important, the website asks for the password again to be safe.
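Storing and restoring the cookies could look roughly like this. It is a minimal sketch, assuming `page` is a Puppeteer Page whose `page.cookies()` and `page.setCookie()` methods are available in your Puppeteer version. The helper names saveCookies and restoreCookies and the cookie file path are hypothetical.

```javascript
const fs = require('fs/promises');

// Save the session cookies of a logged-in page to disk.
// Mode 0o600 asks the OS to restrict a newly created file to its owner.
async function saveCookies(page, cookiePath) {
  const cookies = await page.cookies();
  await fs.writeFile(cookiePath, JSON.stringify(cookies, null, 2), { mode: 0o600 });
  return cookies.length;
}

// Restore previously saved cookies. Returns false when there is nothing
// to restore, so the caller can fall back to a normal login.
async function restoreCookies(page, cookiePath) {
  let raw;
  try {
    raw = await fs.readFile(cookiePath, 'utf8');
  } catch (err) {
    if (err.code === 'ENOENT') return false; // no cookie file yet
    throw err;
  }
  const cookies = JSON.parse(raw);
  if (cookies.length > 0) await page.setCookie(...cookies);
  return cookies.length > 0;
}
```

A typical flow would call restoreCookies(page, './cookies.json') before page.goto(...), log in with the password only if it returns false or the site still shows the login form, and call saveCookies(page, './cookies.json') after a successful login. The cookie file is still a credential, so it deserves the same care as the config file.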

The very fact that you store the password locally and send it out later means that you either assume the machine this is done on is "100% secure" (or secure enough for your purposes), or that you don't care much about that password. Encrypting the config file only helps if you encrypt it using something that isn't stored on that machine, but then you'll have to provide that something every time you run the script.
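As a rough illustration of encrypting with "something that isn't stored on that machine", the sketch below derives a key from a passphrase with scrypt and encrypts the credentials with AES-256-GCM, using only Node's built-in crypto module. The function names encryptConfig and decryptConfig and the dot-separated blob layout are assumptions made for this example; the passphrase would have to be supplied at each run (typed in, or passed through an environment variable, for instance).

```javascript
const crypto = require('crypto');

// Encrypt a credentials object with a key derived from a passphrase.
// Only salt, IV, auth tag and ciphertext end up in the returned string;
// the passphrase itself is never stored.
function encryptConfig(config, passphrase) {
  const salt = crypto.randomBytes(16);
  const iv = crypto.randomBytes(12);
  const key = crypto.scryptSync(passphrase, salt, 32);
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const data = Buffer.concat([
    cipher.update(JSON.stringify(config), 'utf8'),
    cipher.final(),
  ]);
  const tag = cipher.getAuthTag();
  return [salt, iv, tag, data].map((b) => b.toString('base64')).join('.');
}

// Decrypt a blob produced by encryptConfig.
// Throws if the passphrase is wrong or the blob was modified.
function decryptConfig(blob, passphrase) {
  const [salt, iv, tag, data] = blob
    .split('.')
    .map((s) => Buffer.from(s, 'base64'));
  const key = crypto.scryptSync(passphrase, salt, 32);
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  const plain = Buffer.concat([decipher.update(data), decipher.final()]);
  return JSON.parse(plain.toString('utf8'));
}
```

The encrypted string can be written to the config file in place of the plain username and password. While the script runs, the decrypted values still sit in memory, so this protects the file at rest and nothing more.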

You can, obviously, obfuscate it one way or another, but it will NEVER be secure.
