
Is there a reason why sites like Facebook/Digg/Reddit would not parse the proper meta tags on a page for title/description?

Any given article on our site has the meta tags for title, description, image, and keywords in the head element, but for some reason none of the news aggregator sites will pull any of it.

http://darthhater.com/2010/06/25/friday-update-preview
http://darthhater.com/2010/06/24/official-bioware-stance-on-game-testing-leaks

Not trying to post an advertisement. We really do have a problem. The share link is in the bottom right of the article with links to Facebook, Digg, and Reddit. It's too bad none of them provide debugging systems to figure out why stuff is improperly pulled into their system.

I'm thinking it might have something to do with the gzip compression of the site, or maybe with the PHP XSL parser outputting the site as XML (I remove the start tag programmatically, but even if I set the XSL to 'html' the problem persists). I thought maybe it had to do with stripped whitespace, or the order of the meta tags (ridiculous, I know). It's a little annoying, and if I put our URLs into SEO checkers like seocentro.com, they find all of the meta tags just fine, so the page itself is evidently parseable.
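One way to narrow down the gzip theory is to look at the raw bytes a client receives when it has not advertised gzip support. If the body still starts with the gzip magic number, a simple crawler would see compressed data instead of HTML. The sketch below uses only the Python standard library; `looks_compressed` and `decode_body` are hypothetical helper names, and nothing here reflects how Facebook, Digg, or Reddit actually fetch pages.

```python
import gzip


def looks_compressed(raw: bytes) -> bool:
    """Return True if the bytes start with the gzip magic number (0x1f 0x8b)."""
    return raw[:2] == b"\x1f\x8b"


def decode_body(raw: bytes, content_encoding: str = "") -> str:
    """Return the body as text, undoing gzip only if the header announces it.

    content_encoding is the value of the Content-Encoding response header
    (an empty string if the header was absent).
    """
    if content_encoding.strip().lower() == "gzip":
        raw = gzip.decompress(raw)
    return raw.decode("utf-8", errors="replace")
```

If `looks_compressed` returns True for a response to a request that never asked for gzip, the server is compressing unconditionally, and that would be worth fixing before looking at anything else.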

My shot in the dark is that this is because you have the head part in one huge line:

<html xmlns="http://www.w3.org/1999/xhtml" xmlns:magasi="http://www.magasi-php.com/" xmlns:php="http://www.w3.org/1999/XSL/Transform"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><meta name="title" content="Friday Update Preview" /><meta name="description" content="Sean Dahlberg, Star Wars: The Old Republic Community Manager, informs the community that tomorrow's update will be a late one:  Just wanted to let everyone kno..." /><link rel="image_src" href="http://darthhater.com/images/fbimage.jpg" /><meta name="keywords" content="Friday Preview,Sean Dahlberg" /><link rel="alternate" type="application/rss+xml" title="Darth Hater - A Star Wars: The Old Republic Community RSS Feed" href="http://darthhater.com/feed/" /><link type="text/css" rel="stylesheet" href="/styles/DarthHater/style/main.css" /><script type="text/javascript" language="javascript">

It's probably valid HTML, but I wouldn't be surprised if a parser choked on it.
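This guess can at least be checked against one lenient parser. The sketch below feeds a shortened copy of the one-line head to Python's standard `html.parser` and collects the named meta tags and `link rel` values. `HeadTagCollector` and `collect_head_tags` are hypothetical names, and the result only shows how this particular parser behaves; the aggregators may use something stricter or entirely different.

```python
from html.parser import HTMLParser


class HeadTagCollector(HTMLParser):
    """Collect <meta name=...> and <link rel=...> tags from a document."""

    def __init__(self):
        super().__init__()
        self.meta = {}   # meta name -> content
        self.links = {}  # link rel  -> href

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also end up here, because
        # the default handle_startendtag delegates to handle_starttag.
        attrs = dict(attrs)
        if tag == "meta" and "name" in attrs:
            self.meta[attrs["name"]] = attrs.get("content", "")
        elif tag == "link" and "rel" in attrs:
            self.links[attrs["rel"]] = attrs.get("href", "")


def collect_head_tags(html):
    """Parse the markup and return (meta, links) dictionaries."""
    parser = HeadTagCollector()
    parser.feed(html)
    parser.close()
    return parser.meta, parser.links


# Shortened version of the head from the page, kept on a single line.
one_line_head = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><head>'
    '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />'
    '<meta name="title" content="Friday Update Preview" />'
    '<link rel="image_src" href="http://darthhater.com/images/fbimage.jpg" />'
    '<meta name="keywords" content="Friday Preview,Sean Dahlberg" />'
    '</head></html>'
)

meta, links = collect_head_tags(one_line_head)
print(meta)
print(links)
```

If the tags come through here but still not on the aggregator sites, the single-line layout alone is less likely to be the cause, at least for parsers this forgiving.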

Also, you have 438 validation errors. This is probably not your problem, as it's mostly minor things and parsers should be able to deal with invalid HTML, but one never knows.


 