
file_get_contents and encoding

I have an external webpage with some content like this:

<script>
str = "hello";
fun('\202' + str + '\203\303\287');
</script>

In my PHP page, I am trying to retrieve (a part of) the argument of fun() in the following way:

$html=file_get_contents("webpage.html");
$regex_pattern = "/fun\(\'(.*)\'(.*)\'(.*)\'\)/";
preg_match_all($regex_pattern,$html,$matches);

$p1=$matches[1][0];
$p2=$matches[3][0];

echo "p1: ".$p1.", length: ".strlen($p1)."<br>";

What I get is that $p1 is equal to \202 and its length is 4. However, I would like to retrieve the character that \202 represents (and likewise for the sequence of characters represented by $p2).

I browsed past questions on similar matters, but I could not get the proposed solutions to work.

Any hints?

Thanks

// stripcslashes() returns the decoded string; it does not modify its argument
$p1 = stripcslashes($p1);
$p2 = stripcslashes($p2);

From: http://www.php.net/manual/en/function.stripcslashes.php

string stripcslashes ( string $str )

Returns a string with backslashes stripped off. Recognizes C-like \n, \r ..., octal and hexadecimal representation.
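
To tie this back to the question, here is a minimal sketch of the whole flow. It assumes the page content shown in the question; the nowdoc string $html is only a stand-in for the result of file_get_contents("webpage.html"), and bin2hex() is used just to make the raw bytes visible.

```php
<?php
// Stand-in for file_get_contents("webpage.html"); a nowdoc keeps the
// backslashes literal, exactly as they would appear in the fetched file.
$html = <<<'HTML'
<script>
str = "hello";
fun('\202' + str + '\203\303\287');
</script>
HTML;

// Same pattern as in the question.
$regex_pattern = "/fun\(\'(.*)\'(.*)\'(.*)\'\)/";
preg_match_all($regex_pattern, $html, $matches);

// Before decoding, the first piece is the 4 literal characters \, 2, 0, 2.
$raw1 = $matches[1][0];
echo "raw p1 length: " . strlen($raw1) . "\n"; // raw p1 length: 4

// stripcslashes() turns the octal escape into the single byte it denotes.
$p1 = stripcslashes($raw1);
$p2 = stripcslashes($matches[3][0]);

echo "p1: " . bin2hex($p1) . ", length: " . strlen($p1) . "\n"; // p1: 82, length: 1
echo "p2: " . bin2hex($p2) . ", length: " . strlen($p2) . "\n";
```

The decoded values are raw bytes, not text in any particular encoding. If they need to be displayed, converting them from the page's actual character set (which the question does not state) would be a separate step.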
