How do I export dicom datasets to excel?

I am still quite new to coding and have a couple of questions. I am working with a few MRI images that have the file extension '.dcm'. I import the 'dicom' module, which lets me extract specific parameters (such as the patient's name, age, type of scan, etc.) from each file. These values are then written to a tab-separated text file in Notepad and exported to Excel.

The first feature I want to add to the script is the ability to search subfolders for files with the '.dcm' extension, open each of them, and extract the information I need. Right now the script only looks for '.dcm' files in the current directory. With the code below I can get all the file names from the subfolders, but when I try to open them with 'dicom.read_file()', I get an error saying the file could not be found, because only the bare file name is stored. Is there a way around that?

my_List = []
for root, dirs, files in os.walk(path):
    for names in files:
        if names.endswith(".dcm"):
            my_List.append(names)

Secondly, how could I improve the efficiency of my code? I have a lot of repeated statements, especially where I write the values to the text file. Is there a better or faster way to do it? What else can I improve?

Lastly, instead of exporting the values to a text file and then to Excel, is there a way to export them directly to Excel?

for i in range(len(my_List)):
    ds = dicom.read_file(my_List[i])
    if ds.SeriesDescription not in Series:

        info = {}
        info['PatientName']=ds.PatientName

        info['SeriesDescription']=ds.SeriesDescription
        Series.append(ds.SeriesDescription)
        getRepetitionTime(ds)
        getEchoTime(ds)
        getInversionTime(ds)
        getNumberOfAverages(ds)
        getSpacingBetweenSlices(ds)
        getPercentSampling(ds)
        getPercentPhaseFieldOfView(ds)
        getAcquisitionMatrix(ds)
        getFlipAngle(ds)
        getImagesInAcquisition(ds)
        getPixelSpacing(ds)
        f.write(info['PatientName'])
        f.write("\t")
        f.write(info['SeriesDescription'])
        f.write("\t")
        f.write(info['RepetitionTime'])
        f.write("\t")
        f.write(info['EchoTime'])
        f.write("\t")
        f.write(info['InversionTime'])
        f.write("\t")
        f.write(info['NumberOfAverages'])
        f.write("\t")
        f.write(info['SpacingBetweenSlices'])
        f.write("\t")
        f.write(info['PercentSampling'])
        f.write("\t")
        f.write(info['PercentPhaseFieldOfView'])
        f.write("\t")
        f.write(info['AcquisitionMatrix'])
        f.write("\t")
        f.write(info['FlipAngle'])
        f.write("\t")
        f.write(info['ImagesInAcquisition'])
        f.write("\t")     
        f.write(info['PixelSpacing'])
        f.write("\n")

As I am a beginner myself and the answer about searching subdirectories has already been posted, I would like to offer some other suggestions for the code.

First of all, I would advise you to put the information-collection process into a function for readability and reusability, like this:

def collect_info(filename):
    ds = dicom.read_file(filename)
    if ds.SeriesDescription in Series:
        return  # this series has already been written, skip it
    info = {}

    info['PatientName']=ds.PatientName

    info['SeriesDescription']=ds.SeriesDescription
    Series.append(ds.SeriesDescription)
    getRepetitionTime(ds)
    getEchoTime(ds)
    getInversionTime(ds)
    getNumberOfAverages(ds)
    getSpacingBetweenSlices(ds)
    getPercentSampling(ds)
    getPercentPhaseFieldOfView(ds)
    getAcquisitionMatrix(ds)
    getFlipAngle(ds)
    getImagesInAcquisition(ds)
    getPixelSpacing(ds)
    f.write(info['PatientName'])
    f.write("\t")
    f.write(info['SeriesDescription'])
    f.write("\t")
    f.write(info['RepetitionTime'])
    f.write("\t")
    f.write(info['EchoTime'])
    f.write("\t")
    f.write(info['InversionTime'])
    f.write("\t")
    f.write(info['NumberOfAverages'])
    f.write("\t")
    f.write(info['SpacingBetweenSlices'])
    f.write("\t")
    f.write(info['PercentSampling'])
    f.write("\t")
    f.write(info['PercentPhaseFieldOfView'])
    f.write("\t")
    f.write(info['AcquisitionMatrix'])
    f.write("\t")
    f.write(info['FlipAngle'])
    f.write("\t")
    f.write(info['ImagesInAcquisition'])
    f.write("\t")     
    f.write(info['PixelSpacing'])
    f.write("\n")
    f.close()

Secondly, does this program even work? If I am correct, you open f only once but close it every time you collect the information, so any write after the first file would fail on a closed file. You have to move the f.close() call to the very end of the program, outside the for loop. Your program would then look like this:

# ...stuff...
for i in range(len(my_List)):
    collect_info(my_List[i])
f.close()
print 'It took', time.time()-start, 'seconds.'
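
A with block is another way to make sure the file is closed exactly once, after the last write. The following is only a minimal sketch: the write_lines helper, the sample lines, and the temporary output file are made up for the illustration and are not part of the original script.

```python
import os
import tempfile


def write_lines(filename, lines):
    """Write every line to filename; the with block closes the file at the end."""
    with open(filename, "w") as f:
        for line in lines:
            f.write(line + "\n")
    # Leaving the with block has already closed the file.
    return f.closed


# Write two tab-separated sample lines to a throwaway file and read them back.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "out.txt")
    closed = write_lines(target, ["a\tb", "c\td"])
    with open(target) as check:
        content = check.read()
```

With this pattern there is no separate f.close() call to misplace, and the file is closed even if an exception interrupts the loop.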

Thirdly, you can shorten the code by writing:

f.write(info['EchoTime'] + '\t')

instead of

f.write(info['EchoTime'])
f.write('\t')

Remember that the number of bugs tends to grow with the number of lines of code, whatever the language, so keep it short. Long code is also harder to navigate.

Fourth, you could put all your getters into a single get_info function that returns a tuple of information. Then you could just do:

for token in get_info():
    f.write(token + '\t')
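
One possible shape for such a get_info function, assuming the values can be read as plain attributes of the dataset. The FIELDS list, the choice to pass the dataset in as an argument, the format_row helper, and the SimpleNamespace object standing in for a real dataset are all assumptions made for this sketch.

```python
from types import SimpleNamespace

# Hypothetical list of attribute names to export; extend it as needed.
FIELDS = ["PatientName", "SeriesDescription", "RepetitionTime", "EchoTime"]


def get_info(ds, fields=FIELDS):
    """Return the requested attributes of ds as a tuple of strings.

    An attribute that is missing from ds becomes an empty string.
    """
    return tuple(str(getattr(ds, name, "")) for name in fields)


def format_row(ds, fields=FIELDS):
    """Build one tab-separated line that ends with a newline."""
    return "\t".join(get_info(ds, fields)) + "\n"


# Stand-in for a dataset returned by dicom.read_file(); the values are made up.
fake_ds = SimpleNamespace(PatientName="Doe^John", SeriesDescription="T1 AX",
                          RepetitionTime=500.0)
row = format_row(fake_ds)
```

Converting each value with str() matters because f.write() only accepts strings, and building the line with "\t".join() avoids the trailing tab that the loop above leaves at the end of each line.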

For part one, try the following code:

my_List= []
for root, dirs, files in os.walk(path):
    for names in files:
        if names.endswith(".dcm"):
            my_List.append(os.path.join(root, names))
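
The join is what makes the difference: os.walk yields bare file names, which only resolve relative to the current directory, while os.path.join(root, names) gives a path that can be opened from anywhere. A small self-contained check of this, using a throwaway directory tree invented for the illustration (the find_dcm_files name and the case-insensitive extension match are assumptions of the sketch):

```python
import os
import tempfile


def find_dcm_files(path):
    """Return the full paths of all .dcm files under path, recursively."""
    found = []
    for root, dirs, files in os.walk(path):
        for name in files:
            # lower() also matches '.DCM'; drop it for a case-sensitive match.
            if name.lower().endswith(".dcm"):
                found.append(os.path.join(root, name))
    return sorted(found)


# Build a throwaway tree: a.dcm, sub/b.dcm and sub/notes.txt
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    for rel in ("a.dcm", os.path.join("sub", "b.dcm"),
                os.path.join("sub", "notes.txt")):
        with open(os.path.join(tmp, rel), "w") as fh:
            fh.write("placeholder")
    paths = find_dcm_files(tmp)
    relative = [os.path.relpath(p, tmp) for p in paths]
    all_exist = all(os.path.isfile(p) for p in paths)
```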

For the writing part: yes, your code is a bit redundant. You can use Python's CSV writer instead; see https://docs.python.org/2/library/csv.html
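
A minimal sketch of what the CSV writer could look like here. The column names, the sample rows, and the write_rows helper are hypothetical, and an in-memory buffer stands in for the real output file; in the actual script each row would be built from one DICOM dataset.

```python
import csv
import io

# Hypothetical column names and rows, made up for the illustration.
HEADER = ["PatientName", "SeriesDescription", "RepetitionTime"]
ROWS = [
    ["Doe^John", "T1 AX", 500.0],
    ["Doe^John", "T2, SAG", 4000.0],
]


def write_rows(out, header, rows, delimiter=","):
    """Write a header line and the data rows to an open text file object."""
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


# StringIO stands in for a file opened with open("out.csv", "w", newline="").
buffer = io.StringIO()
write_rows(buffer, HEADER, ROWS)
text = buffer.getvalue()
```

The writer converts numbers to text and quotes any value that contains the delimiter, so the repeated f.write() calls are no longer needed. Excel can open the resulting .csv file directly; passing delimiter="\t" keeps the tab-separated layout instead.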

This might need some adjustments, as I don't have any .dcm files to test with, but you should get the idea:

import xlsxwriter
import os
import dicom


dcm_files = []
for root, dirs, files in os.walk(path):
    for names in files:
        if names.endswith(".dcm"):
            dcm_files.append(os.path.join(root, names))

for dcm_file in dcm_files:
    ds = dicom.read_file(dcm_file)
    workbook = xlsxwriter.Workbook(os.path.basename(dcm_file) + '.xlsx')
    worksheet = workbook.add_worksheet()

    data = (
            ["RepetitionTime", ds.get("RepetitionTime", "None")],
            ["EchoTime", ds.get("EchoTime", "None")],
            # ... add the remaining fields in the same way ...
            )

    row = 0
    col = 0

    for name, value in (data):
        worksheet.write(row, col,     name)
        worksheet.write(row, col + 1, value)
        row += 1

    workbook.close()
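
The code above creates one workbook per .dcm file. If a single spreadsheet with one row per file is preferred, a variation along the following lines might work. It is only a sketch under assumptions: the FIELDS list, the build_table and write_workbook helpers, and the output file name are invented here, and SimpleNamespace objects stand in for real datasets.

```python
from types import SimpleNamespace

# Hypothetical list of fields; one spreadsheet column per entry.
FIELDS = ["PatientName", "SeriesDescription", "RepetitionTime", "EchoTime"]


def build_table(datasets, fields=FIELDS):
    """Return a header row followed by one row of strings per dataset."""
    table = [list(fields)]
    for ds in datasets:
        # A missing attribute is written as the text "None".
        table.append([str(getattr(ds, name, "None")) for name in fields])
    return table


def write_workbook(filename, table):
    """Write the table to a single worksheet, one list per spreadsheet row."""
    # Imported here so build_table can be used without xlsxwriter installed.
    import xlsxwriter

    workbook = xlsxwriter.Workbook(filename)
    worksheet = workbook.add_worksheet()
    for row_index, row in enumerate(table):
        worksheet.write_row(row_index, 0, row)
    workbook.close()


# Stand-ins for datasets returned by dicom.read_file(); the values are made up.
fake = [SimpleNamespace(PatientName="Doe^John", SeriesDescription="T1 AX",
                        RepetitionTime=500.0, EchoTime=15.0),
        SimpleNamespace(PatientName="Doe^John", SeriesDescription="T2 SAG")]
table = build_table(fake)
# write_workbook("series.xlsx", table)  # uncomment to create the file
```

Every value is converted with str(), so numbers arrive in Excel as text; drop the conversion for fields that are known to hold plain numbers if numeric cells are needed.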
