
How can I read Perl data structures from Python?

I've often seen people use Perl data structures in lieu of configuration files, i.e., a lone file containing only:

%config = (
    'color' => 'red',
    'numbers' => [5, 8],
    qr/^spam/ => 'eggs'
);

What's the best way to convert the contents of these files into Python-equivalent data structures, using pure Python? For the time being we can assume that there are no real expressions to evaluate, only structured data.

Is using pure Python a requirement? If not, you can load it in Perl and convert it to YAML or JSON. Then use PyYAML or something similar to load them in Python.

I'd just turn the Perl data structure into something else. Since I haven't seen the actual file, there might be some extra work that my solution doesn't do.

If the only thing in the file is the one variable declaration (so, no 1; at the end, and so on), it can be really simple to turn your %config into YAML:

perl -MYAML -le 'print YAML::Dump( { do shift } )' filename 

The do returns the last thing it evaluated, so in this one-liner it returns the list of hash key-value pairs. Functions such as YAML::Dump prefer to work with references, which give them a hint about the top-level structure, so I turn that list into a hash reference by wrapping the do in curly braces. For your example, I'd get this YAML output:

---
(?-xism:^spam): eggs
color: red
numbers:
  - 5
  - 8

I don't know how Python will like that stringified regex, though. Do you really have a key that is a regex? I'd be curious to know how that's being used as part of the configuration.
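
For what it's worth, once that YAML (or the JSON equivalent) has been loaded in Python, the regex key arrives as a plain string such as (?-xism:^spam); Perl 5.14 and later print (?^:^spam) instead. Here is a minimal sketch of turning such keys back into compiled patterns. The names revive_regex_keys, _QR_WRAPPER and _FLAG_MAP are made up for this example, and it assumes the regex body only uses syntax that Python's re module also accepts, which is not true of every Perl regex.

```python
import re

# Matches Perl's stringified qr// forms: "(?-xism:BODY)" from older Perls
# and "(?^:BODY)" or "(?^i:BODY)" from Perl 5.14 and later.
_QR_WRAPPER = re.compile(r"^\(\?\^?([a-z]*)(?:-[a-z]*)?:(.*)\)$", re.DOTALL)

# Perl modifiers with a direct Python equivalent; any other modifier
# is silently ignored by this sketch (an assumption, not a rule).
_FLAG_MAP = {
    "i": re.IGNORECASE,
    "m": re.MULTILINE,
    "s": re.DOTALL,
    "x": re.VERBOSE,
}


def revive_regex_keys(config):
    """Return a copy of config with stringified-qr keys compiled to patterns."""
    revived = {}
    for key, value in config.items():
        match = _QR_WRAPPER.match(key) if isinstance(key, str) else None
        if match:
            flags = 0
            for letter in match.group(1):
                flags |= _FLAG_MAP.get(letter, 0)
            key = re.compile(match.group(2), flags)
        revived[key] = value
    return revived


loaded = {"(?-xism:^spam)": "eggs", "color": "red", "numbers": [5, 8]}
config = revive_regex_keys(loaded)
```

Ordinary keys such as color pass through untouched; only keys that look like a stringified qr// are replaced by compiled patterns.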


If there's extra stuff in the file, life is a bit tougher. There's probably a really clever way to get around that, but I used the same idea and just hard-coded the name of the variable that I wanted.

I tried this on the Perl data structure that the CPAN.pm module uses, and it looks like it came out fine. The only ugliness is that you have to know the variable name in advance. Now that you've seen the error of configuration in Perl code, avoid making the same mistake with Python code. :)

YAML:

 perl -MYAML -le 'do shift; print YAML::Dump( $CPAN::Config )' MyConfig.pm

JSON:

 perl -MJSON::Any -le 'do shift; my $j = JSON::Any->new; print $j->objToJson( $CPAN::Config )' MyConfig.pm

or

# suggested by JF Sebastian
perl -MJSON -le 'do shift; print to_json( $CPAN::Config )' MyConfig.pm
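
On the Python side, the JSON variant can be driven with nothing but the standard library. The following is only a sketch: perl_dump_command and load_perl_config are hypothetical helper names, and it assumes that perl and the JSON module are installed and that the file sets a variable holding a reference, like $CPAN::Config above. The path is made absolute because Perl 5.26 and later no longer search the current directory when do is given a bare relative filename.

```python
import json
import os
import subprocess


def perl_dump_command(variable="CPAN::Config"):
    """Build the perl one-liner as an argument list, so no shell quoting is needed."""
    script = "do shift; print to_json( $%s )" % variable
    return ["perl", "-MJSON", "-le", script]


def load_perl_config(path, variable="CPAN::Config"):
    """Run perl on the file at path and decode the JSON it prints."""
    # Absolute path: newer Perls do not look in "." for a relative name.
    command = perl_dump_command(variable) + [os.path.abspath(path)]
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)
```

Calling load_perl_config("MyConfig.pm") would then return ordinary dicts, lists, strings and numbers, with any regex keys arriving as strings.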

XML::Simple doesn't work out so well because it treats everything as an attribute, but maybe someone can improve on this:

perl -MXML::Simple -le 'do shift; print XMLout( $CPAN::Config )' MyConfig.pm

Not sure what the use case is. Here's my assumption: you're going to do a one-time conversion from Perl to Python.

Perl has this

%config = (
    'color' => 'red',
    'numbers' => [5, 8],
    qr/^spam/ => 'eggs'
);

In Python, it would be

import re

config = {
    'color' : 'red',
    'numbers' : [5, 8],
    re.compile( r"^spam" ) : 'eggs'
}

So, I'm guessing it's a bunch of regex substitutions that replace

  • %variable = ( with variable = {
  • ); with }
  • variable => value with variable : value
  • qr/.../ => with re.compile( r"..." ) :
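
Those four replacements can be sketched with re.sub. This is a rough illustration rather than a Perl parser: perl_hash_to_python is a made-up name, and it assumes the file looks like the example above, that no qr/.../ body contains a double quote, and that no string value contains => (which would be rewritten too).

```python
import re


def perl_hash_to_python(source):
    """Apply the four textual replacements to the source of a Perl hash."""
    # qr/.../ =>   becomes   re.compile(r"...") :
    source = re.sub(r'qr/((?:[^/\\]|\\.)*)/\s*=>',
                    r're.compile(r"\1") :', source)
    # %variable = (   becomes   variable = {
    source = re.sub(r'^\s*%(\w+)\s*=\s*\(', r'\1 = {',
                    source, flags=re.MULTILINE)
    # );   becomes   }
    source = re.sub(r'^\s*\);\s*$', '}', source, flags=re.MULTILINE)
    # key => value   becomes   key : value
    return source.replace('=>', ':')


perl_source = """%config = (
    'color' => 'red',
    'numbers' => [5, 8],
    qr/^spam/ => 'eggs'
);
"""
python_source = perl_hash_to_python(perl_source)
```

The result is Python source text. For a one-time conversion you would paste it into a module that imports re, instead of evaluating it at run time.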

However, Python's built-in dict doesn't do anything unusual with a regex as a hash key. For that, you'd have to write your own subclass of dict and override __getitem__ to check the regex keys separately.

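import re

class PerlLikeDict( dict ):
    pattern_type = type(re.compile(""))  # type of a compiled pattern
    def __getitem__( self, key ):
        if key in self:  # exact key: behave like a plain dict
            return super( PerlLikeDict, self ).__getitem__( key )
        for k in self:  # otherwise try each regex key against a string key
            if isinstance(k, self.pattern_type) and isinstance(key, str):
                if k.match(key):
                    return super( PerlLikeDict, self ).__getitem__( k )
        raise KeyError( "key %r not found" % ( key, ) )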

Here's the example of using a Perl-like dict.

>>> pat= re.compile( "hi" )
>>> a = { pat : 'eggs' } # native dict, no features.
>>> x=PerlLikeDict( a )
>>> x['b']= 'c'
>>> x
{<_sre.SRE_Pattern object at 0x75250>: 'eggs', 'b': 'c'}
>>> x['b']
'c'
>>> x['ji']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 10, in __getitem__
KeyError: "key 'ji' not found"
>>> x['hi']
'eggs'

I've also found PyPerl, but it doesn't seem to be maintained. I guess something like this is what I was looking for: a module that does some basic interpretation of Perl and passes the result back as a Python object. A Perl interpreter that died on anything too complex would be fine. :-)

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint one, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.
