How can I do special character conversion in PHP and SQL?

I am learning cURL to fetch data from a site. Everything works fine with cURL except for special characters. When I look at the source of the site, it contains the following items.

<li class="page_item page-item"><a href="../categories/mens-health/">Men&#8217;s Health</a></li>
<li class="page_item page-item"><a href="../categories/nails-hair-skin/">Nails, Hair &#038; Skin</a></li>
<li class="page_item page-item"><a href="../categories/womens-health/">Women’s Health</a></li>  

When I put the data into an array and echo it in the browser, I get the following result:

Men&#8217;s Health  
Nails, Hair &#038; Skin  
Women’s Health

which I got by executing the following code:

$search = array('&#146;');
$replace = array("'");  
$category_names[] = htmlentities(str_replace($search, $replace, $word), ENT_QUOTES);

$word being each of the 3 array items above. Now I am not able to convert them to proper characters while inserting into the database. This is how they appear in my db:

Men&amp;#8217;s Health
Nails, Hair &amp;#038; Skin
Women&rsquo;s Health

How can I insert it in the proper format, as follows?
Men's Health
Nails, Hair & Skin
Women's Health

I checked some of the solutions for handling apostrophes, but they mostly cover single insert statements, whereas I am inserting in a loop.

Way to insert text having ' (apostrophe) into a SQL table
How do I escape a single quote in SQL Server?

I did html_entity_decode($category_names[$i]); and now I get the following result in my database:
Men’s Health
Nails, Hair & Skin
Women’s Health

html_entity_decode will decode HTML entities, including numeric character references (NCRs). For example, &#8217; will become ’.

<?php
$in = 'Men&#8217;s Health  
Nails, Hair &#038; Skin  
Women’s Health';

echo html_entity_decode($in);

will print

Men’s Health  
Nails, Hair & Skin  
Women’s Health

The code above is hosted here: http://ideone.com/1rWL45
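Applied to the loop from the question, the same idea might look like the sketch below. It assumes the scraped page is UTF-8 and passes the flags and charset explicitly so the result does not depend on the PHP version's defaults. The variable names follow the question. Note that htmlentities() is not called, since it would re-encode the "&" of each entity.

```php
<?php
// Scraped values, copied from the question.
$words = [
    'Men&#8217;s Health',
    'Nails, Hair &#038; Skin',
    'Women’s Health',
];

$category_names = [];
foreach ($words as $word) {
    // Decode named and numeric entities into real UTF-8 characters.
    $category_names[] = html_entity_decode($word, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

print_r($category_names);
```

The decoded strings contain real apostrophes and ampersands, so they should be stored as-is and only escaped again when they are written back into HTML.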

EDIT

Your DB table might be in Latin1, and inserting Unicode characters (e.g. ’) into it will result in such mangled characters. Simply replacing a few Unicode characters with ASCII may mitigate part of your encoding problem. However, I recommend altering the table's character set to UTF-8.
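To see where the mangled characters come from, the following sketch reinterprets the UTF-8 bytes of ’ as Windows-1252, which is essentially what MySQL calls latin1. It assumes the iconv extension is available. The sample string is taken from the question.

```php
<?php
// U+2019 (right single quotation mark) is three bytes in UTF-8: E2 80 99.
$utf8 = "Men\u{2019}s Health";

// Treat those bytes as Windows-1252 and convert the result to UTF-8
// so that the misread characters can be displayed.
$mangled = iconv('Windows-1252', 'UTF-8', $utf8);

echo bin2hex("\u{2019}"), PHP_EOL; // e28099
echo $mangled, PHP_EOL;            // Men’s Health
```

Each of the three bytes is read as a separate single-byte character, which is why one apostrophe turns into three symbols.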

<?php

$map = [ '’' => "'", "..." => "..." ]; // from->to pairs
$normalized = str_replace(array_keys($map), array_values($map), $string);
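A runnable variant of the mapping above is sketched here. The map entries and the sample string are illustrative assumptions, not a complete list of characters that may need replacing.

```php
<?php
// Hypothetical map: typographic punctuation => ASCII equivalents.
$map = [
    "\u{2018}" => "'",  // left single quotation mark
    "\u{2019}" => "'",  // right single quotation mark
    "\u{201C}" => '"',  // left double quotation mark
    "\u{201D}" => '"',  // right double quotation mark
];

$string = "Men\u{2019}s Health";

// strtr() with an array replaces every key with its value in one pass.
$normalized = strtr($string, $map);

echo $normalized, PHP_EOL; // Men's Health
```

strtr() is used here instead of str_replace() so that the keys and values stay together in a single array; for this map the result is the same.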

Maybe jQuery's .html() and .text() functions can help you, for example:

HTML

<div id="test">&lt;&lt;</div>

jQuery

var t = $('#test');
t.html(t.text());

Maybe this can help you (JS Fiddle link).

Certain characters have special significance in HTML, and should be represented by HTML entities if they are to preserve their meanings. This function returns a string with some of these conversions made; the translations made are those most useful for everyday web programming. If you require all HTML character entities to be translated, use htmlentities() instead.

htmlspecialchars: Convert special characters to HTML entities

string htmlspecialchars ( string $string [, int $flags = ENT_COMPAT | ENT_HTML401 [, string $encoding = ini_get("default_charset") [, bool $double_encode = true ]]] )

If the input string passed to this function and the final document share the same character set, this function is sufficient to prepare input for inclusion in most contexts of an HTML document. If, however, the input can represent characters that are not coded in the final document character set and you wish to retain those characters (as numeric or named entities), both this function and htmlentities() (which only encodes substrings that have named entity equivalents) may be insufficient. You may have to use mb_encode_numericentity() instead.
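As a minimal sketch of that last suggestion, the snippet below turns every non-ASCII character into a numeric entity. It assumes the mbstring extension is loaded and the input is UTF-8. The conversion map and the sample string are illustrative choices.

```php
<?php
$text = "Men\u{2019}s Health & more";

// Conversion map: [first code point, last code point, offset, mask].
// This range covers everything above ASCII.
$convmap = [0x80, 0x10FFFF, 0, 0x1FFFFF];

$encoded = mb_encode_numericentity($text, $convmap, 'UTF-8');

echo $encoded, PHP_EOL; // Men&#8217;s Health & more
```

ASCII characters such as & are outside the mapped range and are left untouched, so the result still needs htmlspecialchars() if it is going into an HTML document.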

The translations performed are:

'&' (ampersand) becomes '&amp;'
'"' (double quote) becomes '&quot;' when ENT_NOQUOTES is not set.
"'" (single quote) becomes '&#039;' (or &apos;) only when ENT_QUOTES is set.
'<' (less than) becomes '&lt;'
'>' (greater than) becomes '&gt;'
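The translations listed above can be checked with a short sketch. It also shows why the values in the question ended up as &amp;#8217; in the database: escaping a string that already contains entities encodes the "&" a second time unless $double_encode is set to false. The sample strings are illustrative.

```php
<?php
$raw = '<a href="x">Tom & Jerry\'s</a>';

// ENT_QUOTES converts both double and single quotes.
$escaped = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
echo $escaped, PHP_EOL;
// &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#039;s&lt;/a&gt;

// A string that already contains an entity.
$already = 'Men&#8217;s Health';

// Default behaviour: the existing "&" is encoded again.
$twice = htmlspecialchars($already, ENT_QUOTES, 'UTF-8');
echo $twice, PHP_EOL; // Men&amp;#8217;s Health

// With $double_encode = false, existing entities are left alone.
$once = htmlspecialchars($already, ENT_QUOTES, 'UTF-8', false);
echo $once, PHP_EOL; // Men&#8217;s Health
```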
