How to find parenthesis-bound strings in Python

I'm learning Python and wanted to automate one of my assignments in a cybersecurity class. I'm trying to figure out how to extract the parts of a file that are enclosed in a pair of parentheses. The contents of the (.txt) file look like:

cow.jpg : jphide[v5](asdfl;kj88876)
fish.jpg : jphide[v5](65498ghjk;0-)
snake.jpg : jphide[v5](poi098*/8!@#)
test_practice_0707.jpg : jphide[v5](sJ*=tT@&Ve!2)
test_practice_0101.jpg : jphide[v5](nKFdFX+C!:V9)
test_practice_0808.jpg : jphide[v5](!~rFX3FXszx6)
test_practice_0202.jpg : jphide[v5](X&aC$|mg!wC2)
test_practice_0505.jpg : jphide[v5](pe8f%yC$V6Z3)
dog.jpg : negative

And here is my code so far:

import sys, os, subprocess, glob, shutil

# Finding the .jpg files that will be copied.
sourcepath = os.getcwd() + '\\imgs\\'
destpath = 'stegdetect'
rawjpg = glob.glob(sourcepath + '*.jpg')

# Copying the said .jpg files into the destpath variable
for filename in rawjpg:
    shutil.copy(filename, destpath)

# Asks user for what password file they want to use.
passwords = raw_input("Enter your password file with the .txt extension:")
shutil.copy(passwords, 'stegdetect')

# Navigating to stegdetect. Feel like this could be abstracted.
os.chdir('stegdetect')

# Preparing the arguments then using subprocess to run
args = "stegbreak.exe -r rules.ini -f " + passwords + " -t p *.jpg"

# Open the output file and write the results of stegbreak to it.
with open('cracks.txt', 'w') as f:  # opens cracks.txt for writing
    subprocess.call(args, stdout=f)

# Processing what's in the new file.
f = open('cracks.txt')

If the text just needs to be bound by ( and ), you can use the following regex. It requires an opening ( and a closing ), with letters, digits and spaces between them. You can add any other symbols you want to allow to the character class.

[\(][a-z A-Z 0-9]*[\)]

[\(] - starts the bracket
[a-z A-Z 0-9]* - all text inside bracket
[\)] - closes the bracket

So for the input sdfsdfdsf(sdfdsfsdf)sdfsdfsdf, the output will be (sdfdsfsdf). You can test this regex here: https://regex101.com/
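As a rough sketch of how this pattern behaves in Python's re module: the sample passwords in the question contain symbols such as ; * ! and @, so the letters-and-digits class does not match them as written. The second pattern below, a negated class [^)]*, is an assumption that goes beyond this answer; it accepts anything up to the closing parenthesis. The names ALNUM and ANY are only illustrative.

```python
import re

# Character class from the answer: letters, digits and spaces only.
ALNUM = re.compile(r'[\(][a-z A-Z 0-9]*[\)]')
# Alternative (not from the answer): anything except a closing parenthesis.
ANY = re.compile(r'\(([^)]*)\)')

sample = "sdfsdfdsf(sdfdsfsdf)sdfsdfsdf"
line = "snake.jpg : jphide[v5](poi098*/8!@#)"

print(ALNUM.search(sample).group(0))  # (sdfdsfsdf)
print(ALNUM.search(line))             # None, because '*' is not in the class
print(ANY.search(line).group(1))      # poi098*/8!@#
```

If the passwords themselves can contain a ) character, the negated class would stop too early, so it is only a starting point.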

I'm learning Python

If you are learning you should consider alternative implementations, not only regexps.

To iterate over a text file line by line, you just open the file and loop over the file handle with for:

with open('file.txt') as f:
    for line in f:
        do_something(line)

Each line is a string with the line's contents, including the end-of-line character '\n'. To find the start index of a specific substring in a string, you can use find:

>>> A = "hello (world)"
>>> A.find('(')
6
>>> A.find(')')
12

To get a substring from the string you can use the slice notation in the form:

>>> A[6:12]
'(world'
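Note that a slice excludes its end index, so A[6:12] keeps the ( and drops the ). Here is a minimal sketch that combines find and slicing to return only the text inside the parentheses. The helper name between_parens is hypothetical, and using rfind for the closing parenthesis is my own choice, so that a password containing ) is not cut short.

```python
def between_parens(line):
    """Return the text between the first '(' and the last ')' in line, or None."""
    start = line.find('(')
    end = line.rfind(')')
    # find/rfind return -1 when the character is missing.
    if start == -1 or end == -1 or end < start:
        return None
    # Start one past '(' and stop before ')' (the end index is excluded).
    return line[start + 1:end]

print(between_parens("hello (world)"))                          # world
print(between_parens("cow.jpg : jphide[v5](asdfl;kj88876)\n"))  # asdfl;kj88876
print(between_parens("dog.jpg : negative\n"))                   # None
```

Inside the for loop shown earlier, do_something(line) could simply be a call to this helper.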

You should use regular expressions, which are implemented in Python's re module.

A simple regex like \(.*\) would match your parenthesized string, but a capturing group, \((.*)\), is better because it returns only the content inside the parentheses.

import re

test_string = """cow.jpg : jphide[v5](asdfl;kj88876)
fish.jpg : jphide[v5](65498ghjk;0-)
snake.jpg : jphide[v5](poi098*/8!@#)
test_practice_0707.jpg : jphide[v5](sJ*=tT@&Ve!2)
test_practice_0101.jpg : jphide[v5](nKFdFX+C!:V9)
test_practice_0808.jpg : jphide[v5](!~rFX3FXszx6)
test_practice_0202.jpg : jphide[v5](X&aC$|mg!wC2)
test_practice_0505.jpg : jphide[v5](pe8f%yC$V6Z3)
dog.jpg : negative"""

REGEX = re.compile(r'\((.*)\)', re.MULTILINE)

print(REGEX.findall(test_string))
# ['asdfl;kj88876', '65498ghjk;0-', 'poi098*/8!@#', 'sJ*=tT@&Ve!2', 'nKFdFX+C!:V9', '!~rFX3FXszx6', 'X&aC$|mg!wC2', 'pe8f%yC$V6Z3']
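To tie this back to the question's cracks.txt, here is a sketch that reads the output line by line and keeps track of which password belongs to which image. It assumes every line has the "name : jphide[v5](password)" layout shown in the question and that file names contain no spaces. The names LINE_RE and parse_cracks are hypothetical.

```python
import re

# "<name> : <anything>(<password>)", capturing up to the last ')' on the line.
LINE_RE = re.compile(r'^(\S+) : .*?\((.*)\)\s*$')

def parse_cracks(lines):
    """Map each file name to its cracked password; skip lines without one."""
    found = {}
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            found[match.group(1)] = match.group(2)
    return found

sample = [
    "cow.jpg : jphide[v5](asdfl;kj88876)\n",
    "dog.jpg : negative\n",
]
print(parse_cracks(sample))  # {'cow.jpg': 'asdfl;kj88876'}
```

Because an open file yields its lines when iterated, the same function should work as parse_cracks(f) inside a with open('cracks.txt') as f: block.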
