
DSpace jspui on Google Kubernetes deployment fails after GKE upgrade

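We deployed DSpace 6.3 to Google Kubernetes Engine (GKE), and the deployment had been running well. However, when we upgraded GKE from v1.12.7-gke.24 to 1.14.10-gke.50, the container suddenly started failing. The change in k8s version is the only difference between the working and the failing k8s nodes. The same Docker container, built locally, runs fine. We deploy the other DSpace modules (e.g. solr) in separate containers and they work fine; only the jspui module fails.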

DSpace branch "dspace-6_x", tag "dspace-6.3"

Docker image: tomcat:8-alpine

Deployed via a GitLab CI/CD pipeline

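The failure is caused by the Spring loader failing while it eagerly loads the various DSpace factory-service singleton beans. This results in a 404 error when loading the site, because the web application fails to initialize.

Error message in /usr/local/tomcat/log/localhost.YYYY-MM-dd.log: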

28-Oct-2020 23:47:18.668 SEVERE [localhost-startStop-1] org.apache.catalina.core.StandardContext.listenerStart 
Exception sending context initialized event to listener instance of class 
[org.dspace.servicemanager.servlet.DSpaceKernelServletContextListener]
        java.lang.RuntimeException: Failure during filter init: Failed to startup the DSpace Service 
Manager: failure starting up spring service manager: Error creating bean with name 
'org.dspace.app.sherpa.submit.SHERPASubmitService' defined in URL 
[jar:file:/dspace/webapps/jspui/WEB-INF/lib/dspace-api-6.3.jar!/spring/spring-dspace-addon-sherpa-services.xml]: 
Cannot resolve reference to bean 'org.dspace.app.sherpa.submit.SHERPASubmitConfigurationService' while setting 
bean property 'configuration'; nested exception is org.springframework.beans.factory.BeanCreationException: 
Error creating bean with name 'org.dspace.app.sherpa.submit.SHERPASubmitConfigurationService' defined in 
file [/dspace/config/spring/api/sherpa.xml]: Cannot create inner bean 
'org.dspace.app.sherpa.submit.MetadataValueISSNExtractor#1b511285' of type 
[org.dspace.app.sherpa.submit.MetadataValueISSNExtractor] while setting bean property 
'issnItemExtractors' with key [0]; nested exception is 
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 
'org.dspace.app.sherpa.submit.MetadataValueISSNExtractor#1b511285': Injection of autowired dependencies 
failed; nested exception is org.springframework.beans.factory.BeanCreationException: Could not autowire 
field: public org.dspace.content.service.ItemService 
org.dspace.app.sherpa.submit.MetadataValueISSNExtractor.itemService; nested exception is 
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 
'org.dspace.content.ItemServiceImpl#0': Injection of autowired dependencies failed; nested exception is
 org.springframework.beans.factory.BeanCreationException: Could not autowire field: protected 
org.dspace.handle.service.HandleService org.dspace.content.DSpaceObjectServiceImpl.handleService; ...
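
A chain like the one above is easier to read when each "nested exception is" level is on its own line, since the innermost level is normally closest to the root cause. The following is only a minimal sketch of that idea; the class and method names are hypothetical and not part of DSpace or Spring. Because the excerpt above is truncated ("..."), its last level is not the true root cause.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for reading Spring-style nested exception messages.
final class NestedExceptionSplitter {

    /** Splits a message into one entry per nesting level, outermost first. */
    static List<String> levels(String message) {
        // Log output is hard-wrapped, so collapse all whitespace runs first.
        String flat = message.replaceAll("\\s+", " ").trim();
        // Each level is introduced by "; nested exception is ".
        return Arrays.asList(flat.split("; ?nested ?exception is "));
    }

    /** Returns the innermost level of the chain. */
    static String innermost(String message) {
        List<String> all = levels(message);
        return all.get(all.size() - 1);
    }

    public static void main(String[] args) {
        String sample = "Error creating bean with name 'a'; nested exception is\n"
                + " Error creating bean with name 'b'; nested exception is"
                + " Could not autowire field: c";
        for (String level : levels(sample)) {
            System.out.println(level);
        }
    }
}
```

Running `main` prints the three levels of the sample message, one per line.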

The "failure starting up spring service manager" error message is thrown at:

org.dspace.servicemanager.DSpaceServiceManager (\\dspace-services\\src\\main\\java\\org\\dspace\\servicemanager\\DSpaceServiceManager.java, line 215)

which is in the catch statement for the call on line 212 to:

org.dspace.servicemanager.spring.SpringServiceManager.startup() (\\dspace-services\\src\\main\\java\\org\\dspace\\servicemanager\\spring\\SpringServiceManager.java, line 177)

which uses the Spring framework to eagerly load the factory beans.

Our first thought was that the new k8s version might need more memory, so we increased the Tomcat memory from 1.5GB to 4GB. That did not fix the problem.
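
One way to confirm that a memory change actually reached the JVM inside the pod is to print what the runtime itself reports. This is a minimal sketch using only the standard `java.lang.Runtime` API; the class name `HeapReport` is hypothetical, and it would have to be run with the same `CATALINA_OPTS` or `JAVA_OPTS` as the failing container for the numbers to be meaningful.

```java
// Hypothetical diagnostic: reports the heap limits the running JVM sees.
final class HeapReport {

    static String describe() {
        Runtime rt = Runtime.getRuntime();
        long mib = 1024L * 1024L;
        return String.format(
                "maxHeap=%dMiB totalHeap=%dMiB freeHeap=%dMiB processors=%d",
                rt.maxMemory() / mib,    // upper bound, governed by -Xmx when set
                rt.totalMemory() / mib,  // heap currently reserved by the JVM
                rt.freeMemory() / mib,   // unused part of the reserved heap
                rt.availableProcessors());
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If `maxHeap` is close to the container memory limit, there is little room left for non-heap memory, which is worth ruling out before looking elsewhere.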

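We went through the release notes for the intermediate GKE versions between the two releases, but found nothing that helped.

We tried other Tomcat Docker images, to no avail, so we do not think this is an operating-system issue.

Remote debugging cannot attach to Tomcat quickly enough to catch the exception. We tried Google Cloud Debugger for Java, but Alpine Linux lacks some of the required libraries. In any case, I doubt we would find anything more helpful than the logged error message.

Any ideas would be greatly appreciated.

Our production k8s configuration YAML file: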

ingress:
  hosts:
    - our.url.uts.edu.au

database:
  secret: our_password
  name: our_db_name
  host: "our.db.instance.url"
  port: "5432"

dspace:
  env:
    - name: DSPACE_HOSTNAME
      value: our.url.uts.edu.au
    - name: SOLR_PORT
      value: "8080"
      # Include colon if port is specified
    - name: DSPACE_PORT
      value: ""
    - name: MAX_DB_CONNECTIONS
      value: "50"
    - name: "MAX_IDLE_DB_CONNECTIONS"
      value: "30"
    - name: INITIAL_DB_CONNECTIONS
      value: "20"
    - name: S3_ASSETSTORE_SUBFOLDER
      value: "our_folder"
    - name: S3_CONNECTION_TTL
      value: "120000"
    - name: S3_MAX_CONNECTIONS
      value: "50"
    - name: REST_EVENT_WEBHOOK_URL
      value: http://our.rest.service.url/dspace/v2/webhook
    - name: UTSLIB_FRAMEWORK_DSPACE_TOKEN
      value: OUR_TOKEN
    - name: CATALINA_OPTS
      value: "-Xms1512m -Xmx1512m"

  resources:
    requests:
      memory: "1640Mi"
      cpu: 100m
    limits:
      memory: "1896Mi"
      cpu: "450m"

solr:
  pvc:
    accessModes:
      - ReadWriteOnce
    annotations: {}
    size: 35Gi

  env:
    - name: CATALINA_OPTS
      value: "-Xms3904m -Xmx3904m -XX:+UseG1GC"

  resources:
    requests:
      memory: "4032Mi"
      cpu: 50m
    limits:
      memory: "4096Mi"
      cpu: "800m"

cron:
  env:
    - name: SOLR_PORT
      value: "8080"
    - name: MAX_DB_CONNECTIONS
      value: "3"
    - name: MAX_IDLE_DB_CONNECTIONS
      value: "1"
    - name: INITIAL_DB_CONNECTIONS
      value: "0"
    - name: S3_ASSETSTORE_SUBFOLDER
      value: "our_folder"
    - name: S3_CONNECTION_TTL
      value: "120000"
    - name: S3_MAX_CONNECTIONS
      value: "50"
    - name: JAVA_OPTS
      value: "-Xms32m -Xmx384m"
    - name: REST_EVENT_WEBHOOK_URL
      value: http://our.rest.service.url/dspace/v2/webhook
    - name: UTSLIB_FRAMEWORK_DSPACE_TOKEN
      value: OUR_TOKEN

Our Dockerfile is split into a build stage and a runtime stage. Dockerfile.build:

FROM maven:3-jdk-8

# Modules that should be excluded from dependency resolution
ARG EXCLUDE_MODULES=!dspace-rdf,!dspace-sword,!dspace-xmlui,!dspace-xmlui-mirage2

ENV DSPACE_VERSION=6.3 \
    DSPACE_SHA1=e60db8dee2726933fcc7b7949c16757a510a79c5

ENV ANT_VERSION=1.10.8
ENV ANT_HOME=/opt/ant-$ANT_VERSION
ENV PATH=$ANT_HOME/bin:$PATH \
    ANT_SHA1=20658b765bed8a7c3d18daa71a108e15d1937da2

WORKDIR /dspace-src

# Download DSpace source and install Ant
RUN curl -fSL "https://github.com/DSpace/DSpace/releases/download/dspace-${DSPACE_VERSION}/dspace-${DSPACE_VERSION}-src-release.tar.gz" -o dspace.tar.gz && \
    echo "${DSPACE_SHA1} *dspace.tar.gz" | sha1sum -c - && \
    tar -xz -f dspace.tar.gz --strip-components=1 && \
    rm -f dspace.tar.gz && \
    curl -fSL "https://archive.apache.org/dist/ant/binaries/apache-ant-${ANT_VERSION}-bin.tar.gz" -o ant.tar.gz && \
    echo "${ANT_SHA1} *ant.tar.gz" | sha1sum -c - && \
    mkdir ${ANT_HOME} && \
    tar -xz -f ant.tar.gz -C ${ANT_HOME} --strip-components=1 && \
    rm -rf ant.tar.gz

# Copy in custom artifacts
COPY ./src/artifacts/ ./artifacts

# Copy in pom.xml files
COPY ./src/dspace/pom.xml                          ./dspace/
COPY ./src/dspace/modules/pom.xml                  ./dspace/modules/
COPY ./src/dspace/modules/jspui/pom.xml            ./dspace/modules/jspui/
COPY ./src/dspace/modules/utslib-copyright/pom.xml ./dspace/modules/utslib-copyright/
COPY ./src/dspace/modules/utslib-taglib/pom.xml    ./dspace/modules/utslib-taglib/

# Install custom artifacts and prime the Maven repository 
RUN mvn clean install --batch-mode --fail-never -f ./artifacts/JRis-master && \
    mvn install -P ${EXCLUDE_MODULES} --batch-mode --fail-never -T 5

Dockerfile.runtime:

ARG BUILD_IMAGE=our.git.url/dspace/build:latest

FROM ${BUILD_IMAGE} as build

# Copy in our source changes
COPY ./src/dspace ./dspace

# We don't use these modules, but they'll be built anyway if not excluded
ARG EXCLUDE_MODULES=!dspace-rdf,!dspace-xmlui,!dspace-sword

# Unzip the MaxMind GeoLite database (IP location stuff for Solr).
# (MaxMind changed their privacy policy so you now have to login to download,
# which makes it fail for the standard DSpace installation)
# Build dspace with our source changes and move it to the installation directory
# Build only our customisations (skip building the specified modules)
# Could multithread the maven build, but there's dependency resolution problems
RUN tar -zxf ./dspace/config/GeoLite2-City_20191224.tar.gz --strip-components=1 -C ./dspace/config && \
    rm ./dspace/config/GeoLite2-City_20191224.tar.gz && \
    mvn package --batch-mode -P ${EXCLUDE_MODULES} -f ./dspace/pom.xml && \
    cd ./dspace/target/dspace-installer && \
    ant copy_webapps install_code

FROM tomcat:8-alpine
#FROM tomcat:8-jre8

ARG DSPACE_INSTALL_DIR=/dspace

ENV DSPACE_HOME=${DSPACE_INSTALL_DIR}

# Copy built source into this image
COPY --from=build ${DSPACE_INSTALL_DIR} ${DSPACE_INSTALL_DIR}

# Copy in our config overrides
# (These are not used in compilation, but are applied at runtime)
COPY ./src/local.cfg ${DSPACE_INSTALL_DIR}/config/

# Symlink all webapps and create temp upload directory
RUN ln -s ${DSPACE_INSTALL_DIR}/webapps/* ./webapps/

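After enabling the most verbose logging level in both DSpace and Tomcat, we got more information about the source of the Spring error.

The problem was in one of our custom factory classes. Error log excerpt: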

org.springframework.beans.factory.BeanCreationException: Error creating bean with name 
'org.dspace.storage.bitstore.BitstreamStorageService' defined in file 
[/dspace/config/spring/api/bitstore.xml]: Cannot resolve reference to bean 's3Store' while 
setting bean property 'stores' with key [TypedStringValue: value [0], target type [null]]; 
nested exception is org.springframework.beans.factory.BeanCreationException: Error 
creating bean with name 's3Store' defined in file 
[/dspace/config/spring/api/bitstore.xml]: Error setting property values; nested exception 
is org.springframework.beans.NotWritablePropertyException: Invalid property 
's3ConnectionTTL' of bean class [org.dspace.storage.bitstore.S3BitStoreService]: Bean 
property 's3ConnectionTTL' is not writable or has an invalid setter method. Does the 
parameter type of the setter match the return type of the getter?

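The offending property is writable, has a valid getter and setter, and both the getter and the setter use type long. I removed the code that sets the property, leaving it at its default value. The deployment then worked.

We are at a loss as to how simply bumping the k8s version could cause this error. Exactly the same code runs fine in pods on the previous GKE version.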
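
For anyone wanting to check this kind of claim against the class that is actually loaded at runtime, the standard `java.beans.Introspector` can report which getter and setter it resolves for a property. The sketch below is illustrative only: `DemoStore` is a hypothetical stand-in, not our real S3BitStoreService, and Spring's own property resolution may differ in detail from the plain JDK introspection shown here.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Method;

// Hypothetical check of how JavaBeans introspection sees one property.
final class SetterCheck {

    // Stand-in bean: getter and setter both use the primitive type long.
    public static class DemoStore {
        private long s3ConnectionTTL = 120000L;
        public long getS3ConnectionTTL() { return s3ConnectionTTL; }
        public void setS3ConnectionTTL(long ttl) { this.s3ConnectionTTL = ttl; }
    }

    /** Describes the getter and setter types resolved for the named property. */
    static String describe(Class<?> beanClass, String property) {
        try {
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getName().equals(property)) {
                    Method getter = pd.getReadMethod();
                    Method setter = pd.getWriteMethod();
                    String g = getter == null ? "none"
                            : getter.getReturnType().getSimpleName();
                    String s = setter == null ? "none"
                            : setter.getParameterTypes()[0].getSimpleName();
                    return property + ": getter=" + g + ", setter=" + s;
                }
            }
            return property + ": not found";
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Introspection failed for " + beanClass, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(DemoStore.class, "s3ConnectionTTL"));
    }
}
```

A result of "setter=none" or "not found" for the deployed class would suggest that the class on the pod's classpath is not the one in the source tree.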
