
Yocto in Docker yields pseudo inode errors

We are currently building an embedded Linux OS using Yocto inside a Docker container. All persistent directories are mounted as volumes.

This is accomplished by generating a conf/site.conf that sets those directories:

DL_DIR="/artifacts/downloads"
TMPDIR="/artifacts/tmp"
SSTATE_DIR="/artifacts/sstate_cache"
PERSISTENT_DIR="/artifacts/persistent"
DEPLOY_DIR_IMAGE="/images"
DEPLOY_DIR_IPK="/ipk"

The container is then run with:

docker run --rm \
  -v /opt/yocto/projectname:/artifacts \
  -v /opt/deploy/projectname/ipk:/ipk \
  -v /opt/deploy/projectname/images:/images \
  -it <container>

All of this works fine, and the output is deployed as expected.

However, when recipes are rebuilt after updates, we frequently see Yocto's pseudo build environment abort()ing. Most of the time it is an rm or a tar command being killed.

Almost all errors are inode path mismatches such as

path mismatch [1 link]: ino 23204894 db '/ipk/aarch64/glibc-charmap-jis-c6229-1984-a_2.35-r0_aarch64.ipk' req '/ipk/aarch64/locale-base-is-is_2.35-r0.1_aarch64.ipk'.
dir err : 107508467 ['/artifacts/tmp/work/aarch64-agl-linux/glibc-locale/2.35-r0/packages-split/glibc-binary-localedata-nb-no.iso-8859-1/CONTROL'] (db '/artifacts/tmp/work/aarch64-agl-linux/glibc-locale/2.35-r0/packages-split/glibc-binary-localedata-sgs-lt/CONTROL/control') db mode 0100644, header mode 040755 (unlinking db)
Child process exit status 4: lock_held
Couldn't obtain lock: Resource temporarily unavailable.
lock already held by existing pid 3365057.

(I appended the full error logs for this example at the end of this post.)

or just plain

path mismatch [1 link]: ino 23200106 db '/ipk/aarch64/libcap-ng-doc_0.8.2-r0_aarch64.ipk' req '/ipk/aarch64/libcap-ng-doc_0.8.2-r0.1_aarch64.ipk'.
Setup complete, sending SIGUSR1 to pid 2167709.

When we built the same project natively without Docker, we never saw errors like this, so we assume there is some compatibility issue between Docker and pseudo. We already tried Docker's devicemapper and overlay2 storage drivers. The current workaround is to delete the affected files manually, but this mostly leads to other problems down the line.

We are out of ideas about where to look to solve the problem. None of the Yocto resources on pseudo errors were of any help.

Is there any way to debug these errors in a meaningful way, or do we have to refactor the Docker builds somehow to prevent them?


Logs

Bitbake output

DEBUG: Hardlink test failed with [Errno 18] Invalid cross-device link: '/artifacts/tmp/work/aarch64-agl-linux/glibc-locale/2.35-r0/deploy-ipks/aarch64/glibc-localedata-tk-tm_2.35-r0.1_aarch64.ipk' -> '/ipk/testfile'
ERROR: Error executing a python function in exec_func_python() autogenerated:

The stack trace of python calls that resulted in this exception/failure was:
File: 'exec_func_python() autogenerated', lineno: 2, function: <module>
     0001:
 *** 0002:sstate_task_postfunc(d)
     0003:
File: '/yoctoagl/external/poky/meta/classes/sstate.bbclass', lineno: 822, function: sstate_task_postfunc
     0818:
     0819:    sstateinst = d.getVar("SSTATE_INSTDIR")
     0820:    d.setVar('SSTATE_FIXMEDIR', shared_state['fixmedir'])
     0821:
 *** 0822:    sstate_installpkgdir(shared_state, d)
     0823:
     0824:    bb.utils.remove(d.getVar("SSTATE_BUILDDIR"), recurse=True)
     0825:}
     0826:sstate_task_postfunc[dirs] = "${WORKDIR}"
File: '/yoctoagl/external/poky/meta/classes/sstate.bbclass', lineno: 418, function: sstate_installpkgdir
     0414:
     0415:    for state in ss['dirs']:
     0416:        prepdir(state[1])
     0417:        bb.utils.rename(sstateinst + state[0], state[1])
 *** 0418:    sstate_install(ss, d)
     0419:
     0420:    for plain in ss['plaindirs']:
     0421:        workdir = d.getVar('WORKDIR')
     0422:        sharedworkdir = os.path.join(d.getVar('TMPDIR'), "work-shared")
File: '/yoctoagl/external/poky/meta/classes/sstate.bbclass', lineno: 343, function: sstate_install
     0339:
     0340:    # Run the actual file install
     0341:    for state in ss['dirs']:
     0342:        if os.path.exists(state[1]):
 *** 0343:            oe.path.copyhardlinktree(state[1], state[2])
     0344:
     0345:    for postinst in (d.getVar('SSTATEPOSTINSTFUNCS') or '').split():
     0346:        # All hooks should run in the SSTATE_INSTDIR
     0347:        bb.build.exec_func(postinst, d, (sstateinst,))
File: '/yoctoagl/external/poky/meta/lib/oe/path.py', lineno: 134, function: copyhardlinktree
     0130:            s_dir = os.getcwd()
     0131:        cmd = 'cp -afl --preserve=xattr %s %s' % (source, os.path.realpath(dst))
     0132:        subprocess.check_output(cmd, shell=True, cwd=s_dir, stderr=subprocess.STDOUT)
     0133:    else:
 *** 0134:        copytree(src, dst)
     0135:
     0136:def copyhardlink(src, dst):
     0137:    """Make a hard link when possible, otherwise copy."""
     0138:
File: '/yoctoagl/external/poky/meta/lib/oe/path.py', lineno: 94, function: copytree
     0090:    # This way we also preserve hardlinks between files in the tree.
     0091:
     0092:    bb.utils.mkdirhier(dst)
     0093:    cmd = "tar --xattrs --xattrs-include='*' -cf - -S -C %s -p . | tar --xattrs --xattrs-include='*' -xf - -C %s" % (src, dst)
 *** 0094:    subprocess.check_output(cmd, shell=True, stderr=subprocess.STDOUT)
     0095:
     0096:def copyhardlinktree(src, dst):
     0097:    """Make a tree of hard links when possible, otherwise copy."""
     0098:    bb.utils.mkdirhier(dst)
File: '/usr/lib/python3.9/subprocess.py', lineno: 424, function: check_output
     0420:        else:
     0421:            empty = b''
     0422:        kwargs['input'] = empty
     0423:
 *** 0424:    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
     0425:               **kwargs).stdout
     0426:
     0427:
     0428:class CompletedProcess(object):
File: '/usr/lib/python3.9/subprocess.py', lineno: 528, function: run
     0524:            # We don't call process.wait() as .__exit__ does that for us.
     0525:            raise
     0526:        retcode = process.poll()
     0527:        if check and retcode:
 *** 0528:            raise CalledProcessError(retcode, process.args,
     0529:                                     output=stdout, stderr=stderr)
     0530:    return CompletedProcess(process.args, retcode, stdout, stderr)
     0531:
     0532:
Exception: subprocess.CalledProcessError: Command 'tar --xattrs --xattrs-include='*' -cf - -S -C /artifacts/tmp/work/aarch64-agl-linux/glibc-locale/2.35-r0/deploy-ipks -p . | tar --xattrs --xattrs-include='*' -xf - -C /ipk' returned non-zero exit status 134.

Subprocess output:
abort()ing pseudo client by server request. See https://wiki.yoctoproject.org/wiki/Pseudo_Abort for more details on this.
Check logfile: /artifacts/tmp/work/aarch64-agl-linux/glibc-locale/2.35-r0/pseudo//pseudo.log
Aborted (core dumped)

DEBUG: Python function sstate_task_postfunc finished
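
The first DEBUG line of the Bitbake output shows the hard-link test between the work directory and /ipk failing with Errno 18 (EXDEV), after which copyhardlinktree falls back to the tar-based copytree, and that tar pipeline is the command pseudo aborts. A minimal Python sketch, not taken from Bitbake, that repeats such a probe for any pair of directories might look like the following. The function name can_hardlink and the scratch-file naming are hypothetical, and whether the cross-mount copy is actually related to the pseudo aborts is an assumption that still needs to be verified.

```python
import errno
import os
import tempfile


def can_hardlink(src_dir, dst_dir):
    """Try to hard-link a scratch file from src_dir into dst_dir.

    Returns True if the link succeeds and False if the kernel reports
    EXDEV (cross-device link). Any other error is re-raised.
    """
    # Create an empty scratch file in the source directory.
    fd, src = tempfile.mkstemp(dir=src_dir)
    os.close(fd)
    dst = os.path.join(dst_dir, os.path.basename(src) + ".linktest")
    try:
        os.link(src, dst)
    except OSError as exc:
        if exc.errno == errno.EXDEV:
            return False
        raise
    else:
        # Link worked; remove it again so nothing is left behind.
        os.unlink(dst)
        return True
    finally:
        # Always remove the scratch file in the source directory.
        os.unlink(src)


if __name__ == "__main__":
    # Paths from the site.conf in the question; adjust as needed.
    print(can_hardlink("/artifacts/tmp", "/ipk"))
```

An actual link attempt is used rather than comparing st_dev values, because two bind mounts backed by the same host filesystem can report the same device number and still refuse cross-mount links. With the volume layout from the question, the probe from /artifacts/tmp to /ipk would be expected to return False, matching the DEBUG line.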

pseudo.log

debug_logfile: fd 2
pid 1234878 [parent 1234771], doing new pid setup and server start
Setup complete, sending SIGUSR1 to pid 1234771.
db cleanup for server shutdown, 16:26:33.548
memory-to-file backup complete, 16:26:33.548.
db cleanup finished, 16:26:33.548
debug_logfile: fd 2
pid 997357 [parent 997320], doing new pid setup and server start
Setup complete, sending SIGUSR1 to pid 997320.
db cleanup for server shutdown, 16:52:05.287
memory-to-file backup complete, 16:52:05.288.
db cleanup finished, 16:52:05.288
debug_logfile: fd 2
pid 30407 [parent 30405], doing new pid setup and server start
Setup complete, sending SIGUSR1 to pid 30405.
dir err : 20362030 ['/artifacts/tmp/work/aarch64-agl-linux/glibc-locale/2.35-r0/packages-split/locale-base-kab-dz/CONTROL'] (db '/artifacts/tmp/work/aarch64-agl-linux/glibc-locale/2.35-r0/packages-split/locale-base-fr-ca.iso-8859-1/CONTROL/control') db mode 0100644, header mode 040755 (unlinking db)
debug_logfile: fd 2
pid 1901634 [parent 1901625], doing new pid setup and server start
Setup complete, sending SIGUSR1 to pid 1901625.
db cleanup for server shutdown, 10:40:12.401
memory-to-file backup complete, 10:40:12.402.
db cleanup finished, 10:40:12.402
debug_logfile: fd 2
pid 3365057 [parent 3364988], doing new pid setup and server start
Setup complete, sending SIGUSR1 to pid 3364988.
debug_logfile: fd 2
pid 3365111 [parent 3365055], doing new pid setup and server start
lock already held by existing pid 3365057.
Couldn't obtain lock: Resource temporarily unavailable.
Child process exit status 4: lock_held
dir err : 107508467 ['/artifacts/tmp/work/aarch64-agl-linux/glibc-locale/2.35-r0/packages-split/glibc-binary-localedata-nb-no.iso-8859-1/CONTROL'] (db '/artifacts/tmp/work/aarch64-agl-linux/glibc-locale/2.35-r0/packages-split/glibc-binary-localedata-sgs-lt/CONTROL/control') db mode 0100644, header mode 040755 (unlinking db)
path mismatch [1 link]: ino 23204894 db '/ipk/aarch64/glibc-charmap-jis-c6229-1984-a_2.35-r0_aarch64.ipk' req '/ipk/aarch64/locale-base-is-is_2.35-r0.1_aarch64.ipk'.
db cleanup for server shutdown, 10:41:32.164
memory-to-file backup complete, 10:41:32.164.
db cleanup finished, 10:41:32.164
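
To see which inodes and paths are involved across many builds, the "path mismatch" and "dir err" lines can be pulled out of pseudo.log with a small script. The sketch below is only an illustration: the regular expressions are derived from the log lines quoted in this post, not from pseudo's source, so other message variants may not match, and the names Mismatch and parse_pseudo_log are hypothetical. It assumes "db" is the path pseudo has recorded for the inode and "req" (or the bracketed path in "dir err") is the path being requested.

```python
import re
from typing import NamedTuple

# Patterns modelled on the log lines shown above (an assumption,
# not an official description of pseudo's log format).
PATH_MISMATCH = re.compile(
    r"path mismatch \[(?P<links>\d+) links?\]: ino (?P<ino>\d+) "
    r"db '(?P<db>[^']*)' req '(?P<req>[^']*)'"
)
DIR_ERR = re.compile(
    r"dir err : (?P<ino>\d+) \['(?P<req>[^']*)'\] \(db '(?P<db>[^']*)'\)"
)


class Mismatch(NamedTuple):
    kind: str      # "path mismatch" or "dir err"
    ino: int       # inode number reported by pseudo
    db_path: str   # path recorded in pseudo's database
    req_path: str  # path of the current request


def parse_pseudo_log(text):
    """Return every mismatch entry found in the given log text."""
    found = []
    for line in text.splitlines():
        for kind, pattern in (("path mismatch", PATH_MISMATCH),
                              ("dir err", DIR_ERR)):
            m = pattern.search(line)
            if m:
                found.append(
                    Mismatch(kind, int(m["ino"]), m["db"], m["req"])
                )
                break
    return found


if __name__ == "__main__":
    import sys

    # Usage: python parse_pseudo_log.py /path/to/pseudo.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for entry in parse_pseudo_log(fh.read()):
            print(entry.kind, entry.ino, entry.db_path, "->", entry.req_path)
```

Listing the entries this way makes it easier to check whether the same inode number keeps reappearing under different file names, for example after files in /ipk were deleted manually.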

Did you make any progress on resolving this issue? We are facing the same problem and have found no solution so far other than modifying the pseudo code. Thanks.
