Randomly skip 'X' percentage of words from Java Iterator

I have some Java code like this:

   String line = value.toString();
   StringTokenizer tokenizer = new StringTokenizer(line);

   while (tokenizer.hasMoreTokens()) {
         String token = tokenizer.nextToken();
         // do something with token
   }

However, I want the code to randomly skip X percent of the tokens.

Example: if the tokens are [a, b, c, d] and the skip percentage is 50%, a valid execution could print any two tokens, say [b, c] or [a, d].

How can I implement this in the simplest manner?

First calculate the number of tokens to skip, i.e. 0.50 * tokens.length (note that this is pseudocode).

Then I would create an array of length tokens.length and fill it with that many 1's and the rest 0's,

i.e. for 50% of 10: [1,1,1,1,1,0,0,0,0,0]

Then apply a simple shuffle algorithm (see "Random shuffling of an array")

to get something like [0,1,1,0,0,1,0,1,1,0].

Then, as you run through your tokenizer loop, walk through this array in step and check each entry. The 1's mark the tokens to skip, so print only where the entry is 0:

if (thisArray[i] == 0) {
  print(token);
}
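The shuffled-mask idea above could be sketched in Java roughly as follows. This is only an illustration under some assumptions: the class name ShuffledSkipMask, the method keepTokens, and the choice to round the skip count with Math.round are all mine, not part of the answer. The tokens are collected into a list first because the mask needs the total count.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.StringTokenizer;

class ShuffledSkipMask {

    // Returns the tokens of `line` that remain after skipping
    // round(skipFraction * n) randomly chosen tokens (hypothetical helper).
    static List<String> keepTokens(String line, double skipFraction, Random rnd) {
        // Collect the tokens first, because the mask needs the total count.
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }

        int toSkip = (int) Math.round(skipFraction * tokens.size());

        // Build a mask with `toSkip` entries set to true (skip), the rest false (keep).
        List<Boolean> skip = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            skip.add(i < toSkip);
        }
        Collections.shuffle(skip, rnd);  // random permutation of the mask

        // Walk tokens and mask in step, keeping tokens whose mask entry is false.
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            if (!skip.get(i)) {
                kept.add(tokens.get(i));
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Prints two of the four tokens, in their original order.
        System.out.println(keepTokens("a b c d", 0.5, new Random()));
    }
}
```

Passing the Random in as a parameter makes the result reproducible with a fixed seed, which is handy for testing.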

First solution:

double percentage = 0.5;  // fraction of tokens to skip
int max = (int) (percentage * token.length);

Random rnd = new Random();  // java.util.Random
int[] skip = new int[token.length];
int count = 0;
while (count < max)
{
    int rand = rnd.nextInt(token.length);
    if (skip[rand] == 0) {  // not chosen yet
        skip[rand] = 1;
        count++;
    }
}

// Use a for loop to print each token whose skip entry is 0, and skip those whose entry is 1.

You may consider this: create a 1D array of switches (it can be boolean too) with the same length as the token array, and switch on a random selection of its entries. Then print a token only if the switch at its index allows it; with the skip array above, that means printing where the entry is 0 and skipping where it is 1.
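A boolean version of the switch array might look like the sketch below. The names RandomSwitches and buildSkipSwitches are made up for illustration, and clamping the skip count to the range 0..n is my own addition so that an out-of-range fraction cannot make the loop spin forever.

```java
import java.util.Random;

class RandomSwitches {

    // Builds a switch array of length n in which round(skipFraction * n)
    // randomly chosen entries are true (hypothetical helper).
    static boolean[] buildSkipSwitches(int n, double skipFraction, Random rnd) {
        int toSkip = (int) Math.round(skipFraction * n);
        toSkip = Math.max(0, Math.min(n, toSkip));  // assumption: clamp to a valid count

        boolean[] skip = new boolean[n];
        int count = 0;
        while (count < toSkip) {
            int i = rnd.nextInt(n);  // pick a random index
            if (!skip[i]) {          // count it only if it was not already chosen
                skip[i] = true;
                count++;
            }
        }
        return skip;
    }

    public static void main(String[] args) {
        String[] token = {"a", "b", "c", "d"};
        boolean[] skip = buildSkipSwitches(token.length, 0.5, new Random());
        for (int i = 0; i < token.length; i++) {
            if (!skip[i]) {  // print only the tokens that are not switched to "skip"
                System.out.println(token[i]);
            }
        }
    }
}
```

Retrying on already-chosen indices is simple, but it slows down as the skip fraction approaches 100%, since most random picks then hit entries that are already set.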


Second solution:

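// Convert your token array to an ArrayList (called list here) first.
// rnd is a java.util.Random; max is the number of elements to print (max <= list.size()).
int printed = 0, x = 0;

while (printed < max) {

    int rand = rnd.nextInt(2);  // generates 0 or 1: 50% chance

    if (rand == 0) {
        System.out.println(list.get(x));
        list.remove(x);  // the next element shifts into index x
        printed++;
    } else {
        x++;
    }
    if (x >= list.size()) {
        x = 0;  // wrap around and keep rolling for the remaining elements
    }
}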

For every iteration, roll a probability (e.g. a 50% chance) to decide whether to print the current element. Once an element is printed, remove it from the list so you won't print duplicates.


Third solution:

Randomly remove a percentage (e.g. 50%) of the elements from your tokens, then just print the rest. This is probably one of the most straightforward ways I can think of.
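As a rough sketch of this third idea, the tokens can be copied into an ArrayList and random positions removed until the desired share is gone. RemoveRandomTokens and removeFraction are names I made up, and the fraction is assumed to lie between 0 and 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

class RemoveRandomTokens {

    // Returns a copy of `tokens` with round(skipFraction * n) randomly chosen
    // elements removed; assumes 0 <= skipFraction <= 1 (hypothetical helper).
    static List<String> removeFraction(List<String> tokens, double skipFraction, Random rnd) {
        List<String> remaining = new ArrayList<>(tokens);  // leave the caller's list untouched
        int toRemove = (int) Math.round(skipFraction * remaining.size());
        for (int i = 0; i < toRemove; i++) {
            // remove(int index) drops the element at a random position
            remaining.remove(rnd.nextInt(remaining.size()));
        }
        return remaining;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("a", "b", "c", "d");
        // Prints the two surviving tokens in their original order.
        System.out.println(removeFraction(tokens, 0.5, new Random()));
    }
}
```

Each removal from an ArrayList shifts the later elements, so this is fine for short lines of text but may be slow for very long token lists.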

The following uses Floyd's subset selection algorithm to select a random subset of a specified size. This may be overkill for a small number of tokens, but it is quite efficient for larger sets.

import java.util.HashSet;

public class FloydsSubsetSelection {

   /*
    * Floyd's algorithm to choose a random subset of m integers
    * from a set of n, outcomes are zero-based.
    */
   public static HashSet<Integer> generateMfromN(int m, int n) {
      HashSet<Integer> s = new HashSet<Integer>();
      for (int j = n-m; j < n; ++j) {
         if(! s.add((int)((j+1) * Math.random()))) {
            s.add(j);
         }
      }
      return s;
   }

   public static void main(String[] args) {
      // Stuff the tokens into an array.  I've used chars,
      // but these could be anything you want.  You can also
      // store them in any container which is indexable.
      char[] tokens = {'a', 'b', 'c', 'd', 'e', 'f'};
      int desired_percent = 50;     // change as desired

      // Convert desired percent to a count.  I added 1/2 to cause rounding
      // rather than truncation, change if different behavior is desired.
      int m = (int) ((desired_percent * tokens.length) / 100.0 + 0.5);
      HashSet<Integer> results = generateMfromN(m, tokens.length);
      for (int i: results) {                 // iterate through the generated subset
         System.out.print(tokens[i] + " ");  // to print the selected tokens
      }
      System.out.println();
   }
}
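
   // Deterministic alternative: keeps every Nth token rather than a random selection.
   String line = value.toString();
   StringTokenizer tokenizer = new StringTokenizer(line);
   double percentage = 1.0 / 0.5; // replace 0.5 with the fraction of tokens to keep
   int x = 0;
   while (tokenizer.hasMoreTokens()) {
         String token = tokenizer.nextToken();
         ++x;
         if (x >= percentage) {
              // print token here
              x = 0;
         }
   }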
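The counter-based loop keeps tokens at fixed positions, so it is not random. If a random choice is needed while still reading the tokenizer in a single pass, one option is selection sampling (Knuth's Algorithm S), which is a different technique from the loop above: each token is skipped with probability (skips still owed) / (tokens still unread). The sketch below relies on StringTokenizer.countTokens() to get the total up front; the class and method names are my own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.StringTokenizer;

class SelectionSamplingSkip {

    // Single pass over the tokenizer that skips round(skipFraction * n)
    // tokens chosen at random (hypothetical helper).
    static List<String> keepTokens(String line, double skipFraction, Random rnd) {
        StringTokenizer tokenizer = new StringTokenizer(line);
        int remaining = tokenizer.countTokens();                  // tokens not yet visited
        int toSkip = (int) Math.round(skipFraction * remaining);  // skips still owed

        List<String> kept = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            String token = tokenizer.nextToken();
            // Skip this token with probability toSkip / remaining.
            if (rnd.nextInt(remaining) < toSkip) {
                toSkip--;
            } else {
                kept.add(token);
            }
            remaining--;
        }
        return kept;
    }

    public static void main(String[] args) {
        // Prints two of the four tokens, in their original order.
        System.out.println(keepTokens("a b c d", 0.5, new Random()));
    }
}
```

Because the skip probability is adjusted after every token, the number of skipped tokens comes out equal to the rounded target count rather than merely matching it on average.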
