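How to use custom JDBC jar file from GCS in Apache Beam Java SDK
I have a use case where I read files from GCS and write them to our own data warehouse product through Apache Beam. We have a custom JDBC driver (.jar) for connecting to the warehouse, and I am trying to use Apache Beam's JdbcIO to perform the ETL, with a Maven POM to manage dependencies. Can someone help me understand how to make use of this custom jar file in Apache Beam?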
p.apply(JdbcIO.<KV<Integer, String>>read()
.withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
"MYDRIVERCLASS", "DATABASE_URL")
.withUsername("username")
.withPassword("password"))
.withQuery("select id,name from Person")
.withCoder(KvCoder.of(BigEndianIntegerCoder.of(), StringUtf8Coder.of()))
.withRowMapper(new JdbcIO.RowMapper<KV<Integer, String>>() {
public KV<Integer, String> mapRow(ResultSet resultSet) throws Exception {
return KV.of(resultSet.getInt(1), resultSet.getString(2));
}
})
);
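Because `DataSourceConfiguration.create` only receives the driver class name as a string, a missing jar tends to surface late, when a connection is first opened. A minimal sketch of a pre-flight check follows, using only the JDK. `DriverCheck` and `isOnClasspath` are hypothetical names introduced for illustration, not part of Beam, and `MYDRIVERCLASS` is the placeholder from the snippet above.

```java
// Hypothetical helper (not part of Beam): checks whether a JDBC driver class
// can be located by the class loader that loaded this class.
final class DriverCheck {

    static boolean isOnClasspath(String driverClassName) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(driverClassName, false, DriverCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // The class, or something it links against, is not available.
            return false;
        }
    }

    public static void main(String[] args) {
        // "MYDRIVERCLASS" is the placeholder used in the question; pass the real name as an argument.
        String driver = args.length > 0 ? args[0] : "MYDRIVERCLASS";
        System.out.println(driver + " loadable: " + isOnClasspath(driver));
    }
}
```

This check only covers the JVM that launches the pipeline. Whether the workers can load the class depends on the jar being staged with the job.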
@Experimental(Experimental.Kind.SOURCE_SINK)
public class JdbcIO {
/**
* Read data from a JDBC datasource.
*
* @param <T> Type of the data to be read.
*/
public static <T> Read<T> read() {
return new AutoValue_JdbcIO_Read.Builder<T>().build();
}
/**
* Like {@link #read}, but executes multiple instances of the query substituting each element
* of a {@link PCollection} as query parameters.
*
* @param <ParameterT> Type of the data representing query parameters.
* @param <OutputT> Type of the data to be read.
*/
public static <ParameterT, OutputT> ReadAll<ParameterT, OutputT> readAll() {
return new AutoValue_JdbcIO_ReadAll.Builder<ParameterT, OutputT>().build();
}
/**
* Write data to a JDBC datasource.
*
* @param <T> Type of the data to be written.
*/
public static <T> Write<T> write() {
return new AutoValue_JdbcIO_Write.Builder<T>().build();
}
private JdbcIO() {}
/**
* An interface used by {@link JdbcIO.Read} for converting each row of the {@link ResultSet} into
* an element of the resulting {@link PCollection}.
*/
@FunctionalInterface
public interface RowMapper<T> extends Serializable {
T mapRow(ResultSet resultSet) throws Exception;
}
/**
* A POJO describing a {@link DataSource}, either providing directly a {@link DataSource} or all
* properties allowing to create a {@link DataSource}.
*/
@AutoValue
public abstract static class DataSourceConfiguration implements Serializable {
@Nullable abstract String getDriverClassName();
@Nullable abstract String getUrl();
@Nullable abstract String getUsername();
@Nullable abstract String getPassword();
@Nullable abstract String getConnectionProperties();
@Nullable abstract DataSource getDataSource();
abstract Builder builder();
@AutoValue.Builder
abstract static class Builder {
abstract Builder setDriverClassName(String driverClassName);
abstract Builder setUrl(String url);
abstract Builder setUsername(String username);
abstract Builder setPassword(String password);
abstract Builder setConnectionProperties(String connectionProperties);
abstract Builder setDataSource(DataSource dataSource);
abstract DataSourceConfiguration build();
}
public static DataSourceConfiguration create(DataSource dataSource) {
checkArgument(dataSource != null, "dataSource can not be null");
checkArgument(dataSource instanceof Serializable, "dataSource must be Serializable");
return new AutoValue_JdbcIO_DataSourceConfiguration.Builder()
.setDataSource(dataSource)
.build();
}
public static DataSourceConfiguration create(String driverClassName, String url) {
checkArgument(driverClassName != null, "driverClassName can not be null");
checkArgument(url != null, "url can not be null");
return new AutoValue_JdbcIO_DataSourceConfiguration.Builder()
.setDriverClassName(driverClassName)
.setUrl(url)
.build();
}
public DataSourceConfiguration withUsername(String username) {
return builder().setUsername(username).build();
}
public DataSourceConfiguration withPassword(String password) {
return builder().setPassword(password).build();
}
/**
You can build and run your pipeline by following this example. More documentation is also available.
# Build the project.
gradle('build')
# Check the generated build files.
run('ls -lh build/libs/')
# Run the shadow (fat jar) build.
gradle('runShadow')
# Sample the first 20 results, remember there are no ordering guarantees.
run('head -n 20 outputs/part-00000-of-*')
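To use additional dependency jars, you can simply add them to the CLASSPATH when running your Beam Java pipeline. All jars on the CLASSPATH should be staged by the Beam runner.
You can also use this PipelineOption to specify dependencies.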
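To see which jars the runner would pick up from the classpath, it can help to list them before launching. The sketch below uses only the JDK: it filters the jar entries out of a classpath string and formats them as a comma-separated argument. It assumes the option referred to above is Beam's `filesToStage`; `ClasspathJars`, `jarEntries`, and `filesToStageArg` are hypothetical names introduced for illustration.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Beam): inspects a classpath string.
final class ClasspathJars {

    // Returns the classpath entries that end in ".jar" (case-insensitive), in order.
    static List<String> jarEntries(String classpath) {
        List<String> jars = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            if (entry.toLowerCase(Locale.ROOT).endsWith(".jar")) {
                jars.add(entry);
            }
        }
        return jars;
    }

    // Formats the jars as a command-line argument.
    // Assumption: the pipeline option meant in the answer is `filesToStage`.
    static String filesToStageArg(List<String> jars) {
        return "--filesToStage=" + String.join(",", jars);
    }

    public static void main(String[] args) {
        List<String> jars = jarEntries(System.getProperty("java.class.path", ""));
        jars.forEach(System.out::println);
        System.out.println(filesToStageArg(jars));
    }
}
```

If the custom driver jar does not appear in the printed list, it is not on the launching JVM's classpath and would need to be added there or named explicitly in the staging option.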