
Pick up lines from a file based on line numbers in another file

I have two files: one contains the addresses (line numbers) and the other contains the data, like this:

address file:

2
4
6
7
1
3
5

data file:

1.000451451
2.000589214
3.117892278
4.479511994
5.484514874
6.784499874
7.021239396

I want to reorder the data file according to the line numbers in the address file, so I get:

2.000589214
4.479511994
6.784499874
7.021239396
1.000451451
3.117892278
5.484514874

I want to do it in either Python or bash, but haven't found a solution yet.

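If you don't mind sed, process substitution (available in bash, zsh and ksh) can build the print commands from addr.txt:

sed -nf <(sed 's/$/p/' addr.txt) data.txt
  • -n suppresses the default printing
  • -f makes sed read commands from the process substitution <(...)
  • <(sed 's/$/p/' addr.txt) creates sed print commands (2p, 4p, ...) from the line numbers in addr.txt

Note that sed reads data.txt from top to bottom and applies the whole script to each line, so the selected lines come out in data.txt order, not in addr.txt order. This approach selects the listed lines but does not reorder them. For the files above, the output is:

1.000451451
2.000589214
3.117892278
4.479511994
5.484514874
6.784499874
7.021239396

Use the awk or Python approaches below if the order given in addr.txt must be preserved.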

With awk:

awk 'NR==FNR {a[NR]=$0; next} {print a[$0]}' data.txt addr.txt
  • NR==FNR {a[NR]=$0; next} builds an associative array a whose keys are the record (line) numbers and whose values are the whole records. This block runs only for the first file (where NR==FNR), which is data.txt.
  • next makes awk move on to the next record without processing the current one any further.
  • {print a[$0]} runs for the second file (addr.txt): it uses the content of each line, which is a line number, as the key and prints the matching record from the array.

Example:

% cat addr.txt 
2
4
6
7
1
3
5

% cat data.txt 
1.000451451
2.000589214
3.117892278
4.479511994
5.484514874
6.784499874
7.021239396

% awk 'NR==FNR {a[NR]=$0; next} {print a[$0]}' data.txt addr.txt
2.000589214
4.479511994
6.784499874
7.021239396
1.000451451
3.117892278
5.484514874
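
For readers more at home in Python, the lookup-table idea behind the awk one-liner can be sketched roughly as follows. The function name reorder_like_awk is made up for illustration. Keys are kept as strings, since awk array subscripts are strings, and a missing key yields an empty line, as an unset awk array element would.

```python
def reorder_like_awk(data_lines, addr_lines):
    """Hypothetical helper mirroring the two-pass awk script above."""
    # First pass (awk's NR==FNR block): map "1", "2", ... to each data line.
    table = {str(number): line
             for number, line in enumerate(data_lines, start=1)}
    # Second pass: use each address line as the key; unknown keys give "".
    return [table.get(addr, "") for addr in addr_lines]


data = ["1.000451451", "2.000589214", "3.117892278", "4.479511994",
        "5.484514874", "6.784499874", "7.021239396"]
addr = ["2", "4", "6", "7", "1", "3", "5"]
print("\n".join(reorder_like_awk(data, addr)))
```

Because the whole data file is held in the table, this sketch, like the awk version, assumes the data fits in memory.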

You can also do it in Python, as in this example:

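# Read the line numbers and the data lines into two lists
with open("address_file", "r") as f1, open("data_file", "r") as f2:
    data1 = f1.read().splitlines()
    data2 = f2.read().splitlines()

for k in data1:
    try:
        # Line numbers are 1-based, list indices are 0-based
        print(data2[int(k) - 1])
    except (ValueError, IndexError):
        # Skip entries that are not integers or point past the end of the data
        pass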

Edit: As suggested by @heemayl, here is another solution using only one list:

with open("file1", "r") as f1, open("file2", "r") as f2:
    # file2 holds the data; file1 holds the 1-based line numbers
    data = f2.read().splitlines()

    for k in f1.read().splitlines():
        print(data[int(k) - 1])

Both will output:

2.000589214
4.479511994
6.784499874
7.021239396
1.000451451
3.117892278
5.484514874
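
One caveat with indexing by int(k) - 1: an address of 0 becomes index -1, which Python treats as the last element, so the last data line is returned without any error. A minimal sketch with explicit range checks might look like this. The function name pick_lines and the choice to skip blank address lines are assumptions for illustration, not part of the answers above.

```python
def pick_lines(data_lines, addresses):
    """Return data_lines reordered by the 1-based numbers in addresses."""
    picked = []
    for raw in addresses:
        text = raw.strip()
        if not text:
            continue  # assumption: blank lines in the address file are ignored
        number = int(text)  # raises ValueError on non-numeric input
        if not 1 <= number <= len(data_lines):
            raise IndexError(
                f"line number {number} is outside 1..{len(data_lines)}"
            )
        picked.append(data_lines[number - 1])
    return picked


print(pick_lines(["a", "b", "c"], ["3", "1", "", "2"]))  # ['c', 'a', 'b']
```

Raising on out-of-range numbers makes a bad address file visible instead of silently producing the wrong line; whether to raise or skip is a design choice that depends on how much the input is trusted.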
