
How to delete old artifacts in the Maven local repository in Jenkins

I have a .m2/repository directory that contains a lot of old artifacts. Is there a way to clean up the .m2/repository folder with a script or a plugin?

I want to delete the artifacts that are older than 14 days. The .m2/repository folder has a number of subfolders.

Any lead would be highly appreciated.

Something like this would be your answer:

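// Script-level state shared by the functions below
now = new Date()
configuration = new Configuration()
cleanedSize = 0
details = []
// DirectoryFilter and NonSnapshotDirectoryFilter are helper classes not shown in this excerpt
directoryFilter = new DirectoryFilter()
nonSnapshotDirectoryFilter = new NonSnapshotDirectoryFilter()

class Configuration {
    def homeFolder = System.getProperty("user.home")
    def path = homeFolder + "/.m2/repository"
    def dryRun = true
    def printDetails = true
    def maxAgeSnapshotsInDays = 60        // SNAPSHOT directories older than this are removed
    def maxAgeInDays = 14                 // release directories older than this are candidates for removal
    def versionsToKeep = ["3.1.0.M1"]     // version directories that are never removed
    def snapshotsOnly = true              // when true, release directories are left untouched
}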


// removeDirAndReturnFreedKBytes and obtainHighestVersionOfArtifact are helpers not shown in this excerpt
private def cleanMavenRepository(File file) {
    def lastModified = new Date(file.lastModified());
    def ageInDays = now - lastModified;   // Groovy Date subtraction yields whole days
    def directories = file.listFiles(directoryFilter);

    if (directories.length > 0) {
        // Not a leaf yet: recurse into every subdirectory
        directories.each {
            cleanMavenRepository(it);
        }
    } else {
        // Leaf directory, i.e. a single artifact version
        if (ageInDays > configuration.maxAgeSnapshotsInDays && file.canonicalPath.endsWith("-SNAPSHOT")) {
            int size = removeDirAndReturnFreedKBytes(file)
            details.add("About to remove directory $file.canonicalPath with total size $size and $ageInDays days old");
        } else if (ageInDays > configuration.maxAgeInDays && !file.canonicalPath.endsWith("-SNAPSHOT") && !configuration.snapshotsOnly) {
            // Old release: keep it if it is the highest version or explicitly protected
            String highest = obtainHighestVersionOfArtifact(file)
            if (file.name != highest && !configuration.versionsToKeep.contains(file.name)) {
                int size = removeDirAndReturnFreedKBytes(file)
                details.add("About to remove directory $file.canonicalPath with total size $size and $ageInDays days old and not highest version $highest");
            }
        }
    }
}

In this answer, the author removes files that have not been accessed for a period of time. This is better than removing files based on their modification time, because some files are not modified for a long time but are still needed by your build (e.g. stable dependencies).

For your requirements, I would adapt it slightly, like this:

find ~jenkins/.m2/repository -atime +14 -iname '*.pom' | \
while IFS= read -r pom; \
    do parent=$(dirname "$pom"); \
    rm -rf "$parent"; \
done

Paraphrasing the author:

This will find all *.pom files which have last been accessed more than [14 days] ago [...] and delete their directories.
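Since rm -rf cannot be undone, it may be worth previewing what the pipeline would delete first. The following is a minimal sketch of such a dry run, not part of the quoted answer; the function name list_stale_artifact_dirs is hypothetical, and it assumes a find that supports -atime and -iname (GNU and BSD find both do).

```shell
# Hypothetical helper: print (but do not delete) the directories that the
# find/rm pipeline above would remove.
#   $1 = repository path
#   $2 = minimum number of days since last access
list_stale_artifact_dirs() {
    find "$1" -type f -atime +"$2" -iname '*.pom' |
    while IFS= read -r pom; do
        # Each stale POM marks its containing version directory as stale
        dirname "$pom"
    done | sort -u
}
```

For example, list_stale_artifact_dirs ~jenkins/.m2/repository 14 prints one directory per line; once the list looks right, the same find expression can be fed to the deleting loop.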

For our use case, we are using a similar command in a separate Jenkins job with a last_access build parameter.

  • This project is parameterized

    • String Parameter
      • Name: last_access
      • Default Value: 30
      • Description

        Remove files with a last access date older than the specified number of days in the past.

  • Build:

    • Execute shell, command:

       find $JENKINS_HOME/.m2/repository -atime +$last_access -iname '*.pom' | \
       while read pom; \
           do parent=`dirname "$pom"`; \
           rm -rf "$parent"; \
       done

  • Build Triggers:

    • Build periodically, schedule:

       H 22 * * *

      (every day)

Note: This could just be added to cron, but I prefer it in Jenkins.
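Because last_access is a free-form string parameter, an empty or malformed value would at best make find fail and could otherwise change what the command matches. Below is a minimal sketch of a guard that could run at the top of the shell step; the function name is_valid_day_count is hypothetical and not part of the original job.

```shell
# Hypothetical guard: succeed only if $1 is a non-empty string of digits.
is_valid_day_count() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *)           return 0 ;;
    esac
}

# Possible use in the job's shell step (last_access is the build parameter):
#   is_valid_day_count "$last_access" || { echo "bad last_access" >&2; exit 1; }
```

With this in place the job stops before find is invoked whenever the parameter is not a plain non-negative integer.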

I spent some hours looking at this problem and at the answers. Many of them rely on the atime (the last access time on UNIX systems), which is unreliable for two reasons:

  1. Most UNIX systems (including Linux and macOS) update the atime irregularly at best, and for good reason: a complete implementation of atime would slow down the whole file system, because the atime would have to be updated (i.e. written to disk) every time a file is read. Such an extreme number of updates would also rapidly wear out modern, high-performance SSDs.
  2. In a CI/CD environment, the VM used to build your Maven project will have its Maven repository restored from shared storage, which in turn sets the atime to a "recent" value.
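To see which access-time policy applies to a given repository, one can inspect the mount options of the file system that holds it. The sketch below only classifies an option string; the function name classify_atime_policy is hypothetical, and obtaining the string with findmnt assumes a Linux system with util-linux installed.

```shell
# Hypothetical helper: classify a comma-separated mount-option string, such
# as the one printed by `findmnt -no OPTIONS -T <path>` on Linux.
classify_atime_policy() {
    # Surrounding commas make sure only whole option names match
    case ",$1," in
        *,noatime,*)     echo "noatime" ;;      # atime is never updated on reads
        *,relatime,*)    echo "relatime" ;;     # atime is updated only occasionally
        *,strictatime,*) echo "strictatime" ;;  # atime is updated on every read
        *)               echo "unspecified" ;;
    esac
}
```

A possible invocation is classify_atime_policy "$(findmnt -no OPTIONS -T "$HOME/.m2/repository")". If it reports noatime, an -atime based cleanup compares against timestamps that reads never refresh.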

I therefore created a Maven repository cleaner and made it available at https://github.com/alitokmen/maven-repository-cleaner/. The bash script maven-repository-cleaner.sh has one function, cleanDirectory, which recursively walks through ~/.m2/repository/ and does the following:

  • When a subdirectory is not a version number, it descends into that subdirectory for analysis
  • When a directory has subdirectories that appear to be version numbers, it deletes all of them except the highest version

In practice, if you have a hierarchy such as:

  • artifact-group
    • artifact-name
      • 1.8
      • 1.10
      • 1.2

... the maven-repository-cleaner.sh script will:

  1. Navigate to artifact-group
  2. In artifact-group, navigate to artifact-name
  3. In artifact-name, delete the subfolders 1.8 and 1.2, as 1.10 is higher than both 1.2 and 1.8
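The example hinges on comparing versions numerically rather than as plain strings: a lexical sort would rank 1.10 below 1.2. The sketch below illustrates that idea with the version sort of GNU sort (the -V option); it is not taken from maven-repository-cleaner.sh, and the function names are hypothetical.

```shell
# Hypothetical helpers: both read version directory names on stdin, one per line.

# Print the highest version according to `sort -V` (assumes GNU sort).
highest_version() {
    sort -V | tail -n 1
}

# Print every version except the highest, i.e. the deletion candidates.
versions_to_delete() {
    sort -V | sed '$d'
}
```

Feeding the three names from the example (1.8, 1.10, 1.2) to highest_version yields 1.10, and versions_to_delete yields 1.2 and 1.8.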

To run the tool on your CI/CD platform (or any other UNIX system), use the three lines below, either at the beginning or at the end of the build:

wget https://raw.githubusercontent.com/alitokmen/maven-repository-cleaner/main/maven-repository-cleaner.sh
chmod +x maven-repository-cleaner.sh
./maven-repository-cleaner.sh


 