Implement a unique page view counter?

I want to implement a user-facing view counter (similar to what SO has for question views) that tracks the number of unique views of a page. There are a couple of similar questions here, but none seems to answer my question fully.

What would be the best setup for this (in terms of database tables etc.)? Would it be good to add a 'views' column to the 'questions' table and simply increment it on every page view? And if I want the views to be unique, I guess I could have another table with question IDs and IP addresses and only increment the 'views' column if there isn't already an entry for the current IP. However, this 'ip-view' table would get enormous really quickly... Mainly I am concerned with the overhead of having to store every page view and every IP in a table.

How could this be optimized so that it doesn't become a performance bottleneck? Is there a better approach than what I described? Please note that it is very important for me that only unique views are counted.

Update: in addition to suggesting implementation methods, I'd also like to further understand where the performance issues come into play, assuming the naive approach of simply checking if the IP exists and updating the 'views' column on every page view. Is the main issue the vast number of insertions occurring (assuming heavy traffic), or is it more the size of the object-to-IP mapping table (which could be huge, since a new row will be inserted per question for each new unique visitor)? Should race conditions be considered (I just assumed that an update/increment SQL statement was atomic)? Sorry for all the questions, but I am just lost as to how I should approach this.

If you need to track unique views specifically, there are probably two ways to do this... unless you're operating with internal users that you can identify. Now, in order to do this you need to keep track of every user that has visited the page.

Tracking can be done either server-side or client-side.

Server-side tracking will need to use IP addresses, unless you're dealing with internal users that you can identify. And whenever you deal with IP addresses, all the usual caveats about using them to identify people apply (there could be multiple users per IP, or multiple IPs per user), and you can't do anything about that.

You should also consider that the "huge IP table of death" isn't that bad a solution. Performance will only become an issue if you have hundreds of thousands of users... assuming it's indexed properly, of course.
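As a rough illustration of what "indexed properly" could look like, here is a minimal sketch using SQLite through Python's `sqlite3` module. The table names `questions` and `question_views` and the helper names `init_db` and `record_view` are made up for this example, not taken from the question. The composite primary key doubles as the index, so the "has this IP already viewed this question?" check is an index lookup, and the insert and the increment run in one transaction.

```python
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: one counter per question plus one row per (question, IP).
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS questions (
            id    INTEGER PRIMARY KEY,
            views INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE IF NOT EXISTS question_views (
            question_id INTEGER NOT NULL REFERENCES questions(id),
            ip          TEXT    NOT NULL,
            PRIMARY KEY (question_id, ip)  -- also serves as the lookup index
        );
    """)


def record_view(conn: sqlite3.Connection, question_id: int, ip: str) -> bool:
    """Count the view only if this IP has not been seen for this question.

    Returns True if the counter was incremented.
    """
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO question_views (question_id, ip) VALUES (?, ?)",
            (question_id, ip),
        )
        if cur.rowcount == 0:
            return False  # the (question, IP) pair already existed
        # The increment is a single UPDATE statement, not read-then-write in app code.
        conn.execute(
            "UPDATE questions SET views = views + 1 WHERE id = ?",
            (question_id,),
        )
        return True
```

Letting the primary key reject duplicates, rather than running a SELECT first and then deciding whether to insert, avoids the check-then-act race the question worries about. Other databases have their own equivalents of `INSERT OR IGNORE`.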

Client-side probably involves you leaving an "I've visited!" cookie. If the cookie is NOT present, then increment your user count. If the cookie cannot be created, you'll have to live with an inflated view count. And all the caveats about dealing with cookies apply... which is to say, they'll go bad eventually and disappear.
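A minimal sketch of the cookie check, using only Python's standard `http.cookies` module. The cookie name pattern `visited_<page_id>`, the function `handle_view`, and the in-memory `views` dict are assumptions for illustration; a real application would use its web framework's cookie handling and a persistent counter.

```python
from http.cookies import SimpleCookie
from typing import Dict, Optional


def handle_view(cookie_header: str, page_id: int, views: Dict[int, int]) -> Optional[str]:
    """Increment the counter unless the "I've visited!" cookie is present.

    cookie_header is the raw value of the request's Cookie header.
    Returns the value for a Set-Cookie response header, or None if the
    visitor was already counted.
    """
    name = f"visited_{page_id}"  # hypothetical cookie name, one per page
    incoming = SimpleCookie()
    incoming.load(cookie_header)
    if name in incoming:
        return None  # cookie present: not a new view

    views[page_id] = views.get(page_id, 0) + 1

    outgoing = SimpleCookie()
    outgoing[name] = "1"
    outgoing[name]["max-age"] = 365 * 24 * 60 * 60  # expires after about a year
    outgoing[name]["path"] = "/"
    return outgoing[name].OutputString()
```

If the browser never sends the cookie back (cookies disabled or cleared), every request looks like a first visit, which is the inflated count mentioned above.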

There seems to be a revolutionary approach (off the top of my head), though I'm not yet sure myself whether it is scalable, or even feasible.

If you really wish to store the IPs in the DB and want to avoid clogging it up, you should think of storing them in a hierarchical order.

<ID, IP_PART, LEVEL, PARENT_PART, VIEWS>

So, when a user visits your website from IP 212.121.139.54, the rows in your table would be:

<1, 212, 1, 0, 0>
<2, 121, 2, 1, 0>
<3, 139, 3, 2, 0>
<4, 54, 4, 3, 1>

Points to note:

  1. Only rows with LEVEL = 4 will have a view count.
  2. To avoid the redundancy of storing VIEWS = 0 for LEVEL = 1, 2, 3, you can think of storing those rows in a different table.
  3. The idea, as conceived, doesn't seem suitable for a small set of IPs.
  4. This may neglect the case of a public proxy IP sitting in front of a private network, where more than one machine accesses your website through it. But that doesn't seem to be your question, I guess.
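The hierarchical layout above could be sketched roughly as follows, again with SQLite via Python's `sqlite3`. Several things here are assumptions rather than part of the answer: the table name `ip_parts`, the function names, the reading of PARENT_PART as the parent row's ID (which is what the example rows suggest), and the `UNIQUE (parent_id, ip_part)` constraint used to find existing nodes. Note that, as in the answer's schema, there is no page ID, so this counts visits per IP for a single page or site.

```python
import sqlite3


def init_ip_tree(conn: sqlite3.Connection) -> None:
    # Columns mirror <ID, IP_PART, LEVEL, PARENT_PART, VIEWS>.
    conn.execute("""
        CREATE TABLE IF NOT EXISTS ip_parts (
            id        INTEGER PRIMARY KEY,
            ip_part   INTEGER NOT NULL,
            level     INTEGER NOT NULL,
            parent_id INTEGER NOT NULL,           -- 0 for level-1 rows
            views     INTEGER NOT NULL DEFAULT 0, -- only used at level 4
            UNIQUE (parent_id, ip_part)
        )
    """)


def record_ip(conn: sqlite3.Connection, ip: str) -> int:
    """Find or create one row per octet, then bump VIEWS on the level-4 row.

    Returns the view count stored for this IP after the update.
    """
    parent_id = 0
    with conn:  # single transaction for the whole walk
        for level, part in enumerate(ip.split("."), start=1):
            conn.execute(
                "INSERT OR IGNORE INTO ip_parts (ip_part, level, parent_id) "
                "VALUES (?, ?, ?)",
                (int(part), level, parent_id),
            )
            parent_id = conn.execute(
                "SELECT id FROM ip_parts WHERE parent_id = ? AND ip_part = ?",
                (parent_id, int(part)),
            ).fetchone()[0]
        # After the loop, parent_id is the ID of the level-4 (leaf) row.
        conn.execute("UPDATE ip_parts SET views = views + 1 WHERE id = ?", (parent_id,))
        return conn.execute(
            "SELECT views FROM ip_parts WHERE id = ?", (parent_id,)
        ).fetchone()[0]
```

With this layout, IPs that share leading octets share rows: after 212.121.139.54, recording 212.121.139.60 adds only one new row. The number of unique IPs is the number of level-4 rows. Whether this actually saves space compared with one indexed row per IP is not something this sketch demonstrates.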

So, chao, let me know what you ended up implementing.

 