How do I remove Google tracking parameters (UTM) from a URL?

Question

I have a bunch of URLs that I would like to clean. They all contain UTM parameters, which are unnecessary, or even harmful, in this case. Example:

http://houseofbuttons.tumblr.com/post/22326009438?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+HouseOfButtons+%28House+of+Buttons%29

All of the parameters in question begin with utm_. How can I easily remove them with a Ruby script without destroying other, potentially "good" URL parameters?

Answer 1

This uses the URI library to take the query string apart and rebuild it (no regex):

require 'uri'
str = 'http://houseofbuttons.tumblr.com/post/22326009438?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+HouseOfButtons+%28House+of+Buttons%29&normal_param=1'

uri = URI.parse(str)
# Split the query into [key, value] pairs and drop every key that starts with "utm_"
clean_key_vals = URI.decode_www_form(uri.query).reject { |k, _| k.start_with?('utm_') }
# Re-encode the remaining pairs and put them back on the URI
uri.query = URI.encode_www_form(clean_key_vals)
p uri.to_s #=> "http://houseofbuttons.tumblr.com/post/22326009438?normal_param=1"
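
The snippet above assumes the URL has a query string: `uri.query` is `nil` when it does not, and when every parameter is a utm_ one the result keeps a trailing `?`. A minimal sketch of a helper that covers both cases follows; the method name `strip_utm` is an assumption for illustration, not part of the original answer.

```ruby
require 'uri'

# Hypothetical helper (the name strip_utm is an assumption, not from the answer).
def strip_utm(url)
  uri = URI.parse(url)
  return url if uri.query.nil? # no query string, nothing to clean

  kept = URI.decode_www_form(uri.query).reject { |key, _| key.start_with?('utm_') }
  # Assigning nil removes the "?" entirely when no parameters remain.
  uri.query = kept.empty? ? nil : URI.encode_www_form(kept)
  uri.to_s
end
```

Because the remaining parameters are decoded and re-encoded, their escaping may be normalized along the way (for example, `%20` in a value comes back as `+`).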

Answer 2

You can apply a regex to the URLs to clean them up. Something like this should do the trick:

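url = 'http://houseofbuttons.tumblr.com/post/22326009438?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+HouseOfButtons+%28House+of+Buttons%29&normal_param=1'
# Caveat: a utm_ parameter sitting between two other parameters also loses
# the separator that follows it (a=1&utm_x=2&b=3 becomes a=1b=3).
url.gsub(/&?utm_.+?(&|$)/, '') #=> "http://houseofbuttons.tumblr.com/post/22326009438?normal_param=1"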
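
If a regex is preferred, the pattern can be anchored so that it only matches utm_ keys that directly follow `?` or `&`, which leaves path segments and neighbouring parameters alone. This is a sketch under that assumption; the method name `strip_utm_regex` and the exact pattern are illustrative and not from the original answer.

```ruby
# Hypothetical helper; the name and the pattern are illustrative assumptions.
def strip_utm_regex(url)
  # Drop each utm_* pair that directly follows "?" or "&".
  cleaned = url.gsub(/(?<=[?&])utm_[^&#]*&?/, '')
  # Tidy a dangling "?" or "&" left before a fragment or the end of the string.
  cleaned.sub(/[?&](?=#|\z)/, '')
end
```

Unlike the URI-based approach, this treats the URL as a plain string, so it does not validate the URL or decode parameter names before matching.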


Answer 1: score 11, posted 2012-10-10 15:56:55

Answer 2: score 8, accepted, posted 2012-10-10 15:01:06
