
How to make a variable available to all of the TaskManagers in Apache Flink?

I need to set a list of values in my program and access them in all of the task managers. Currently, I declare a public static field in my main class and set the values there. Later in my program, which runs on a remote cluster, I would like to access this variable in all task managers. Here is my sample code. There is a problem, however: although there is no compile-time or run-time error, the values are not available to the task managers.

public class myMainClass {
  public static ArrayList<String> mykey = new ArrayList<String>();

  public static void main(String[] args) throws Exception {
    // assign value to the variable
    partitionedData = partitionedData.partitionCustom(new MyPartitioner(mykey), 2);
  }
}

public static class MyPartitioner implements Partitioner<String> {
  public String[] partitionKeys;
  public static ArrayList<String> mykey;

  public MyPartitioner(ArrayList<String> mykey) {
    this.mykey = mykey;
  }

  @Override
  public int partition(String key, int numPartitions) {
    for (int i = 0; i < numPartitions - 1; i++) {
      if (mykey.get(i).compareToIgnoreCase(key) > 0)
        return i;
    }

    return numPartitions - 1;
  }
}

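I would pass the mykey list as a constructor argument to the MyPartitioner class and keep it in a non-static instance field. The partitioner object is serialized when it is shipped to the task managers, and Java serialization does not include static fields, so only instance fields travel with it.

Your code would look like this: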

public class myMainClass {
    public static void main(String[] args) throws Exception {
        ArrayList<String> mykey = new ArrayList<String>();
        // assign value to the variable
        partitionedData = partitionedData.partitionCustom(new MyPartitioner(mykey), 2);
    }
}

public static class MyPartitioner implements Partitioner<String> {
    // Instance field (not static), so it is serialized together with the partitioner.
    private final ArrayList<String> mykey;
    public String[] partitionKeys;

    public MyPartitioner(ArrayList<String> mykey) {
        this.mykey = mykey;
    }

    @Override
    public int partition(String key, int numPartitions) {
        for (int i = 0; i < numPartitions - 1; i++) {
            if (mykey.get(i).compareToIgnoreCase(key) > 0)
                return i;
        }

        return numPartitions - 1;
    }
}
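
To see why the instance field matters, here is a minimal sketch in plain Java with no Flink dependency. It assumes the partitioner reaches the task managers through standard Java serialization. SerializationSketch, RangePartitioner, staticKeys, serialize and deserialize are hypothetical names made up for this illustration, and RangePartitioner is only a stand-in that does not implement Flink's Partitioner interface. Resetting the static field to null simulates a task manager JVM in which main() never ran.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SerializationSketch {

    // Hypothetical stand-in for the partitioner; not Flink's Partitioner interface.
    static class RangePartitioner implements Serializable {
        private static final long serialVersionUID = 1L;

        // Static state belongs to the class, so Java serialization skips it.
        static List<String> staticKeys;

        // Instance state is written out together with the object.
        private final ArrayList<String> keys;

        RangePartitioner(ArrayList<String> keys) {
            this.keys = keys;
        }

        int partition(String key, int numPartitions) {
            // Guard added for this sketch: never read past the end of the key list.
            int limit = Math.min(numPartitions - 1, keys.size());
            for (int i = 0; i < limit; i++) {
                if (keys.get(i).compareToIgnoreCase(key) > 0) {
                    return i;
                }
            }
            return numPartitions - 1;
        }
    }

    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    static Object deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ArrayList<String> keys = new ArrayList<>(Arrays.asList("h", "p"));
        RangePartitioner.staticKeys = keys;
        byte[] bytes = serialize(new RangePartitioner(keys));

        // Simulate a different JVM in which main() never assigned the static field.
        RangePartitioner.staticKeys = null;

        RangePartitioner copy = (RangePartitioner) deserialize(bytes);
        System.out.println("static field after round trip: " + RangePartitioner.staticKeys);
        System.out.println("partition of apple: " + copy.partition("apple", 3));
        System.out.println("partition of zebra: " + copy.partition("zebra", 3));
    }
}
```

In this sketch the static field is still null after deserialization, while the copy partitions keys using the list it carried in its instance field.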

I am not sure what you want to accomplish. If you want to pre-compute a (non-changing) value and distribute it to all task managers (I assume you need to access those values in some operators), you can simply pass the values to your UDFs as constructor parameters, or use Flink's broadcast variables: https://ci.apache.org/projects/flink/flink-docs-release-0.8/programming_guide.html#broadcast-variables
