简体   繁体   English

自动将类型签名添加到顶级功能

[英]Automatically add type signatures to top-level functions

I was lazy and wrote a Haskell module (using the excellent EclipseFP IDE) without giving type signatures to my top-level functions. 我很懒,写了一个Haskell模块(使用优秀的EclipseFP IDE)而没有给我的顶级函数提供类型签名。

EclipseFP uses HLint to automatically label every offending function, and I can fix each one with 4 mouse clicks. EclipseFP使用HLint自动标记每个违规函数,我可以通过4次鼠标点击来修复每个函数。 Effective, but tedious. 有效,但乏味。

Is there a utility program that will scan a .hs file, and emit a modified version that adds type signatures to each top-level function? 是否有一个实用程序将扫描.hs文件,并发出一个修改版本,为每个顶级函数添加类型签名?

Example: 例:

./addTypeSignatures Foo.hs 

would read a file Foo.hs : 会读取文件Foo.hs

foo x = foo + a

and emit 并发出

foo :: Num a => a -> a
foo x = x + 1

Bonus points if the tool automatically edits Foo.hs in place and saves a backup Foo.bak.hs 如果工具自动编辑Foo.hs并保存备份Foo.bak.hs则奖励积分

There's haskell-mode for emacs that has a shortcut to insert type signature of a function: Cu, Cc, Ct. emacs的haskell-mode具有插入函数类型签名的快捷方式:Cu,Cc,Ct。 It is not automatic, you have to do it for each function. 它不是自动的,您必须为每个功能执行此操作。 But if you have only one module, it will probably take you a few minutes to go through it. 但是如果你只有一个模块,可能需要几分钟的时间来完成它。

Here's a variation of the above script, that uses ":browse" instead of ":type", per ehird's comment. 以下是上述脚本的变体,根据ehird的评论使用“:browse”而不是“:type”。

One major problem with this solution is that ":browse" displays fully qualified type names, whereas ":type" uses the imported (abbreviated) type names. 此解决方案的一个主要问题是“:browse”显示完全限定的类型名称,而“:type”使用导入的(缩写的)类型名称。 This, if your module uses unqualified imported types (a common case), the output of this script will fail compilation. 这样,如果您的模块使用非限定导入类型(常见情况),则此脚本的输出将无法编译。

That shortoming is fixable (using some parsing of imports), but this rabbit hole is getting deep. 这种短缺是可以解决的(使用一些进口解析),但这个兔子洞正在变得越来越深。

#!/usr/bin/env perl

use warnings;
use strict;

sub trim {
   my $string = shift;
   $string =~ s/^\s+|\s+$//g;
   return $string;
}


my $sig=0;
my $file;

my %funcs_seen = ();
my %keywords = ();
for my $kw qw(type newtype data class) { $keywords{$kw} = 1;}

foreach $file (@ARGV) 
{
  if ($file =~ /\.lhs$/) 
  {
    print STDERR "$file: .lhs is not supported. Skipping.\n";
    next;
  }

  if ($file !~ /\.hs$/) 
  {
    print STDERR "$file is not a .hs file. Skipping.\n";
    next;
  }

  my $module = $file;
  $module =~ s/\.hs$//;

  my $browseInfo = `echo :browse | ghci $file`;
  if ($browseInfo =~ /Failed, modules loaded:/)
  {
   print STDERR "$browseInfo\n";
   print STDERR "$file is not valid Haskell source file. Skipping.\n";
   next;
  }

  my @browseLines = split("\n", $browseInfo);
  my $browseLine;
  my $func = undef;
  my %dict = ();
  for $browseLine  (@browseLines) { 
   chomp $browseLine;
   if ($browseLine =~ /::/) {
    my ($data, $type) = split ("::", $browseLine);
    $func = trim($data);
    $dict{$func} = $type;
    print STDERR "$func :: $type\n";
   } elsif ($func && $browseLine =~ /^  /) { # indent on continutation
    $dict{$func} .= " " . trim($browseLine);
    print STDERR "$func ... $browseLine\n";
   } else {
    $func = undef;
   }
  }



  my $backup = "$file.bak";
  my $new = "$module.New.hs";
  -e $backup and die "Backup $backup file exists. Refusing to overwrite. Quitting";
  open OLD, $file;
  open NEW, ">$new"; 

  print STDERR "Functions in $file:\n";
  my $block_comment = 0;
  while (<OLD>) 
  {
    my $original_line = $_;
    my $line = $_;
    my $skip = 0;
    $line =~ s/--.*//;
    if ($line =~ /{-/) { $block_comment = 1;} # start block comment
    $line =~ s/{-.*//;
    if ($block_comment and $line =~ /-}/) { $block_comment=0; $skip=1} # end block comment

    if ($line =~ /^ *$/) { $skip=1; } # comment/blank
    if ($block_comment) { $skip = 1};
    if (!$skip) 
    {
      if (/^(('|\w)+)( +(('|\w)+))* *=/ ) 
      { 
        my $object = $1;
        if ((! $keywords{$object}) and !($funcs_seen{$object})) 
        {
          $funcs_seen{$object} = 1;
          print STDERR "$object\n";
          my $type = $dict{$1};

          unless ($sig) 
          {
            if ($type) {
              print NEW "$1 :: $type\n";
              print STDERR "$1 :: $type\n";
            } else {
              print STDERR "no type for $1\n";
            }
          }
        }
      }

    $sig = /^(('|\w)+) *::/; 
    }
    print NEW $original_line;
  }
  close OLD;
  close NEW;

  my $ghciPostTest = `echo 1 | ghci $new`;
  if ($ghciPostTest !~ /Ok, modules loaded: /)
  {
   print $ghciPostTest;
   print STDERR "$new is not valid Haskell source file. Will not replace original (but you might find it useful)";
   next;
  } else {
    rename ($file, $backup) or die "Could not make backup of $file -> $backup";
    rename ($new, $file) or die "Could not make new file $new";
  }
}

For the Atom Editor its possible to automatically insert the type signature per function with the package haskell-ghc-mod which provides: 对于Atom编辑器,可以使用包haskell-ghc-mod自动为每个函数插入类型签名,该包提供:

 'ctrl-alt-T': 'haskell-ghc-mod:insert-type'

https://atom.io/packages/haskell-ghc-mod#keybindings https://atom.io/packages/haskell-ghc-mod#keybindings

Here's another hacky attempt based on parsing GHC -Wmissing-signatures warnings, so the script doesn't have to parse Haskell. 这是基于解析GHC -Wmissing-signatures警告的另一个hacky尝试,因此脚本不必解析Haskell。 It transforms the warnings into a sed script that does the insertions and prints its result to stdout, or modifies the file inplace if -i is given. 它将警告转换为sed脚本,该脚本执行插入并将其结果打印到stdout,或者如果给出-i则将文件修改为原位。

Requires a Stack project as configured below, but you can change the buildCmd . 需要下面配置的Stack项目,但您可以更改buildCmd

Works on the few files I tried it on with GHC 8.2.2 and 8.4.3, but same warnings as in @misterbee's first answer apply :) Also, it will obviously break with older or newer GHCs if they produce differently formatted warnings (but for me, the more sophisticated tooling seem to break all the time too, so...). 使用GHC 8.2.2和8.4.3尝试使用的几个文件,但是@misterbee的第一个答案中的警告同样适用:)此外,如果它们产生不同格式的警告,它显然会破坏旧的或更新的GHC(但是对我来说,更复杂的工具似乎也一直在打破,所以...)。

#!/bin/zsh

set -eu
setopt rematchpcre

help="Usage: ${0:t} [-d] [-i | -ii] HASKELL_FILE

Options:
  -d   Debug
  -i   Edit target file inplace instead of printing to stdout
           (Warning: Trying to emulate this option by piping from 
            and to the same file probably won't work!)
  -ii  Like -i, but no backup
"


### CONFIG ###

buildCmd() {
    touch $inputFile
    stack build --force-dirty --ghc-options='-fno-diagnostics-show-caret -Wmissing-signatures'
}

# First group must be the filename, second group the line number
warningRegexL1='^(.*):([0-9]+):[0-9]+(-[0-9]+)?:.*-Wmissing-signatures'

# First group must be the possible same-line type signature (can be empty)
warningRegexL2='Top-level binding with no type signature:\s*(.*)'

# Assumption: The message is terminated by a blank line or an unindented line
messageEndRegex='^(\S|\s*$)'


### END OF CONFIG ###


zparseopts -D -E d=debug i+=inplace ii=inplaceNoBackup h=helpFlag

[[ -z $helpFlag ]] || { printf '%s' $help; exit 0 }

# Make -ii equivalent to -i -i
[[ -z $inplaceNoBackup ]] || inplace=(-i -i)

inputFile=${1:P} # :P takes the realpath

[[ -e $inputFile ]] || { echo "Input file does not exist: $inputFile" >&2; exit 2 }

topStderr=${${:-/dev/stderr}:P}

debugMessage()
{
    [[ -z $debug ]] || printf '[DBG] %s\n' "$*" > $topStderr
}

debugMessage "inputFile = $inputFile"

makeSedScript() 
{
    local line

    readline() {
        IFS= read -r line || return 1
        printf '[build] %s\n' $line >&2
    }

    while readline; do
        [[ $line =~ $warningRegexL1 ]] || { debugMessage "^ Line doesn't match warningRegexL1"; continue }
        file=${match[1]}
        lineNumber=${match[2]}

        [[ ${file:P} = $inputFile ]] || { debugMessage "^ Not our file: $file"; continue }

        # Begin sed insert command
        printf '%d i ' $lineNumber

        readline

        [[ $line =~ $warningRegexL2 ]] ||\
            { printf 'WARNING: Line after line matching warningRegexL1 did not match warningRegexL2:\n %s\n' $line >&2
              continue }

        inlineSig=${match[1]}

        debugMessage "^ OK, inlineSig = $inlineSig"

        printf '%s' $inlineSig

        readline


        if [[ ! ($line =~ $messageEndRegex) ]]; then

            [[ $line =~ '^(\s*)(.*)$' ]]

            indentation=${match[1]}

            [[ -z $inlineSig ]] || printf '\\n'

            printf ${match[2]}

            while readline && [[ ! ($line =~ $messageEndRegex) ]]; do
                printf '\\n%s' ${line#$indentation}
            done
        fi

        debugMessage "^ OK, Type signature ended above this line"

        # End sed insert command
        printf '\n'

    done
}

prepend() {
    while IFS= read -r line; do printf '%s%s\n' $1 $line; done
}

sedScript="$(buildCmd |& makeSedScript)"

if [[ -z $sedScript ]]; then
    echo "No type-signature warnings for the given input file were detected (try -d option to debug)" >&2
    exit 1
fi

printf "\nWill apply the following sed script:\n" >&2
printf '%s\n' $sedScript | prepend "[sed] " >&2

sedOptions=()

if [[ $#inplace -ge 1 ]]; then 
    sedOptions+=(--in-place)
    [[ $#inplace -ge 2 ]] || cp -p --backup=numbered $inputFile ${inputFile}.bak
fi


sed $sedOptions -f <(printf '%s\n' $sedScript) $inputFile

This perl script does a hack job at it, making some assumptions about source file structure. 这个perl脚本在它上面做了一个黑客工作,对源文件结构做了一些假设。 (Such as: .hs file (not .lhs ), signatures are on the line immediately preceding definitions, definitions are flush on the left margin, etc) (例如: .hs文件(不是.lhs ),签名紧接在定义之前的行,定义在左边距处是齐平的,等等)

It tries to handle (skip over) comments, equation-style definitions (with repeated left-hand-sides), and types that generate multi-line output in ghci . 它试图处理(跳过)注释,方程式定义(带有重复的左侧),以及在ghci中生成多行输出的类型。

No doubt, many interesting valid cases are not handled properly. 毫无疑问,许多有趣的有效案件处理不当。 The script isn't close to respecting the actual syntax of Haskell. 该脚本并不接近尊重Haskell的实际语法。

It is incredibly slow, as it launches a ghci session for each function that needs a signature. 它非常慢,因为它为每个需要签名的函数启动了一个ghci会话。 It makes a backup file File.hs.bak , prints the functions it finds to stderr, as well as signatures for functions missing signatures, and writes the upgraded source code to File.hs . 它创建一个备份文件File.hs.bak ,将它找到的函数打印到stderr,以及缺少签名的函数的签名,并将升级后的源代码写入File.hs It uses an intermediate file File.hs.new , and has a few safety checks to avoid overwriting your content with garbage. 它使用中间文件File.hs.new ,并进行一些安全检查,以避免使用垃圾覆盖您的内容。

USE AT YOUR OWN RISK. 自行承担使用风险。

This script might format your hard drive, burn your house down, unsafePerformIO, and have other impure side effects. 此脚本可能会格式化您的硬盘驱动器,烧毁您的房子,unsafePerformIO,并有其他不纯的副作用。 In fact, it probably will. 事实上,它可能会。

I feel so dirty. 我觉得很脏

Tested on Mac OS X 10.6 Snow Leopard with a couple of my own .hs source files. 在Mac OS X 10.6 Snow Leopard上测试了几个我自己的.hs源文件。

#!/usr/bin/env perl

use warnings;
use strict;

my $sig=0;
my $file;

my %funcs_seen = ();
my %keywords = ();
for my $kw qw(type newtype data class) { $keywords{$kw} = 1;}

foreach $file (@ARGV) 
{
  if ($file =~ /\.lhs$/) 
  {
    print STDERR "$file: .lhs is not supported. Skipping.";
    next;
  }

  if ($file !~ /\.hs$/) 
  {
    print STDERR "$file is not a .hs file. Skipping.";
    next;
  }

  my $ghciPreTest = `echo 1 | ghci $file`;
  if ($ghciPreTest !~ /Ok, modules loaded: /)
  {
   print STDERR $ghciPreTest;
   print STDERR "$file is not valid Haskell source file. Skipping.";
   next;
  }

  my $module = $file;
  $module =~ s/\.hs$//;

  my $backup = "$file.bak";
  my $new = "$module.New.hs";
  -e $backup and die "Backup $backup file exists. Refusing to overwrite. Quitting";
  open OLD, $file;
  open NEW, ">$new"; 

  print STDERR "Functions in $file:\n";
  my $block_comment = 0;
  while (<OLD>) 
  {
    my $original_line = $_;
    my $line = $_;
    my $skip = 0;
    $line =~ s/--.*//;
    if ($line =~ /{-/) { $block_comment = 1;} # start block comment
    $line =~ s/{-.*//;
    if ($block_comment and $line =~ /-}/) { $block_comment=0; $skip=1} # end block comment

    if ($line =~ /^ *$/) { $skip=1; } # comment/blank
    if ($block_comment) { $skip = 1};
    if (!$skip) 
    {
      if (/^(('|\w)+)( +(('|\w)+))* *=/ ) 
      { 
        my $object = $1;
        if ((! $keywords{$object}) and !($funcs_seen{$object})) 
        {
          $funcs_seen{$object} = 1;
          print STDERR "$object\n";
          my $dec=`echo ":t $1" | ghci $file  | grep -A100 "^[^>]*$module>" | grep -v "Leaving GHCi\." | sed -e "s/^[^>]*$module> //"`;

          unless ($sig) 
          {
            print NEW $dec;
            print STDERR $dec;
          }
        }
      }

    $sig = /^(('|\w)+) *::/; 
    }
    print NEW $original_line;
  }
  close OLD;
  close NEW;

  my $ghciPostTest = `echo 1 | ghci $new`;
  if ($ghciPostTest !~ /Ok, modules loaded: /)
  {
   print $ghciPostTest;
   print STDERR "$new is not valid Haskell source file. Will not replace original (but you might find it useful)";
   next;
  } else {
    rename ($file, $backup) or die "Could not make backup of $file -> $backup";
    rename ($new, $file) or die "Could not make new file $new";
  }
}

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM