Create import file for a dataset with single-label images in Google Cloud Vertex AI
I have a bucket in GCS with the following hierarchy:
dataset/class1/image1.png
dataset/class1/image2.png
..
dataset/class2/image1.png
dataset/class2/image2.png
..
dataset/class3/image1.png
dataset/class3/image2.png
..
So all the examples of a given class are in the same folder.
I would like to create an import file that contains one line per image, with the image's URI and its class. It would look like this:
gs://dataset/class1/image1.png, class1
gs://dataset/class1/image2.png, class1
..
gs://dataset/class2/image1.png, class2
gs://dataset/class2/image2.png, class2
..
gs://dataset/class3/image1.png, class3
gs://dataset/class3/image2.png, class3
..
I am trying this, but it doesn't work:
export BUCKET=<bucket name>
export IMPORT_DATA=<import file>
gsutil ls -r gs://$BUCKET/** > $IMPORT_DATA
sed -i '1d' $IMPORT_DATA
sed -e 's/$/$(basename $)/' -i filename
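The last line likely fails because sed does not pass its replacement text through a shell: `$(basename $)` is appended to each line literally, and the single quotes stop the shell from expanding it beforehand. A sed-only alternative can capture the parent folder with a group instead. The sketch below is a hedged illustration; `append_label` is a hypothetical helper name and the sample URIs are made up.

```shell
# Hypothetical helper: reads object URIs on stdin and appends ", <parent folder>".
# \1 is the whole URI, \2 is the path component just before the file name.
append_label() {
  sed -E 's|^(.*/([^/]+)/[^/]+)$|\1, \2|'
}

# Illustrative input standing in for the gsutil listing.
printf '%s\n' \
  'gs://dataset/class1/image1.png' \
  'gs://dataset/class2/image2.png' | append_label
```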
I might have found one way to do it:
export BUCKET=<bucket name>
export IMPORT_DATA=<import file>
gsutil ls -r "gs://$BUCKET/**" > tmp.csv
sed -i '1d' tmp.csv # the first line is not a file
# append the parent folder name (the class) to each URI
while read -r line ; do echo "$line, $(basename "$(dirname "$line")")" ; done < tmp.csv > "$IMPORT_DATA"
wc -l "$IMPORT_DATA" # sanity check: number of lines written
rm tmp.csv
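The same idea can be written without the temporary file by piping the listing through awk. This is only a sketch under assumptions: `to_import_lines` is a hypothetical helper name, the class is assumed to be the name of the folder directly containing the image, and entries ending in `/` are assumed to be folder placeholders rather than images.

```shell
# Hypothetical helper: reads object URIs on stdin, writes "URI, label" lines.
# With "/" as the separator, gs://bucket/class/file has 5 fields
# and the label is the second-to-last one.
to_import_lines() {
  awk -F/ '
    /\/$/  { next }   # skip folder placeholder entries (URIs ending in "/")
    NF < 5 { next }   # skip blank lines and objects directly in the bucket root
    { print $0 ", " $(NF-1) }
  '
}

# Intended usage (assumes gsutil is installed and BUCKET / IMPORT_DATA are set):
#   gsutil ls "gs://$BUCKET/**" | to_import_lines > "$IMPORT_DATA"
```

Filtering on the shape of each line, rather than always deleting the first line, keeps the result correct whether or not the listing starts with a non-file entry.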