Infix operators in Scala and Jython

I'm evaluating languages for a computation-oriented app that needs an easy embedded scripting language for end users. I have been thinking of using Scala as the main underlying language and Jython for the scripting interface. An appeal of Scala is that I can define methods such as :* for elementwise multiplication of a matrix object and use it with infix syntax: a :* b. But :* is not a valid method name in Python. How does Jython deal with this?
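To make the naming issue concrete, here is a minimal sketch. The Mat class, its times alias, and the InfixDemo object are hypothetical names invented for illustration. The Scala compiler encodes symbolic method names for the JVM, so :* shows up as $colon$times to Java reflection and to other JVM languages. One possible workaround, assuming you control the Scala side, is to also expose a plain alphanumeric alias that a Python-side wrapper could map onto an operator such as * through __mul__.

```scala
// Hypothetical minimal matrix type for illustration; not from any library.
final case class Mat(rows: Vector[Vector[Double]]) {
  // Elementwise product, callable infix as `a :* b`.
  def :*(that: Mat): Mat = {
    require(
      rows.map(_.length) == that.rows.map(_.length),
      "shape mismatch"
    )
    Mat(rows.zip(that.rows).map { case (r, s) =>
      r.zip(s).map { case (x, y) => x * y }
    })
  }

  // Alphanumeric alias, easier to call from a language without symbolic names.
  def times(that: Mat): Mat = this :* that
}

object InfixDemo {
  val a = Mat(Vector(Vector(1.0, 2.0), Vector(3.0, 4.0)))
  val b = Mat(Vector(Vector(10.0, 20.0), Vector(30.0, 40.0)))
  val product: Mat = a :* b

  // Method names as the JVM (and so any other JVM language) sees them.
  val jvmNames: Set[String] = classOf[Mat].getMethods.map(_.getName).toSet
}
```

InfixDemo.jvmNames contains both "$colon$times" and "times"; the second is the name a scripting layer would most comfortably call.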

I would consider using Scala as the scripting language, due to its flexibility. But even with type inference, all the val and var keywords and required type definitions are too much for lay users used to a dynamic language like MATLAB. By comparison, Boo has the -ducky option, which might work, but I'd like to stay on the JVM rather than .NET. I assume there is no -ducky for Scala.

More generally, consider the following DSL (from http://www.cs.utah.edu/~hal/HBC/ ) to model a Latent Dirichlet Allocation:

model {
      alpha     ~ Gam(0.1,1)
      eta       ~ Gam(0.1,1)
      beta_{k}  ~ DirSym(eta, V)           , k \in [1,K]
      theta_{d} ~ DirSym(alpha, K)         , d \in [1,D]
      z_{d,n}   ~ Mult(theta_{d})          , d \in [1,D] , n \in [1,N_{d}]
      w_{d,n}   ~ Mult(beta_{z_{d,n}})     , d \in [1,D] , n \in [1,N_{d}]
}

result = model.simulate(1000)

This syntax is terrific (compared to PyMCMC, for instance) for users familiar with hierarchical Bayesian modeling. Is there any language on the JVM that would make it easy to define such syntax, along with having access to a basic scripting language like Python?

Thoughts appreciated.

Personally, I think you overstate the overhead of Scala. For instance, this:

alpha     ~ Gam(10,10)
mu_{k}    ~ NorMV(vec(0.0,1,dim), 1, dim)     , k \in [1,K]
si2       ~ IG(10,10)
pi        ~ DirSym(alpha, K)
z_{n}     ~ Mult(pi)                          , n \in [1,N]
x_{n}     ~ NorMV(mu_{z_{n}}, si2, dim)       , n \in [1,N]

could be written as

def alpha =                   Gam(10, 10)
def mu    = 1 to 'K map (k => NorMV(Vec(0.0, 1, dim), 1, dim))
def si2   =                   IG(10, 10)
def pi    =                   DirSym(alpha, 'K)
def z     = 1 to 'N map (n => Mult(pi))
def x     = 1 to 'N map (n => NorMV(mu(z(n)), si2, dim))

In this particular case, almost nothing was done, except define Gam, Vec, NorMV, etc., and create an implicit conversion from Symbol to Int or Double that reads from a table where you'll store such definitions later on (such as with a loadM equivalent). Such implicit definitions would go like this:

import scala.reflect.Manifest
val unknowns = scala.collection.mutable.HashMap[Symbol,(Manifest[_], Any)]()
implicit def getInt(s: Symbol)(implicit m: Manifest[Int]): Int = unknowns.get(s) match {
  case Some((`m`, x)) => x.asInstanceOf[Int]
  case _ => sys.error("Undefined unknown "+s)
}
// similarly to getInt for any other desired type
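As a usage sketch of the idea above, here is a simplified, self-contained variant. It drops the Manifest check and matches on the stored value's runtime type instead, so it is not the same code as the snippet above. The names Unknowns, set, symbolToInt, and UnknownsDemo are hypothetical, and Symbol("K") is written out in place of the 'K literal so the sketch does not depend on symbol-literal syntax.

```scala
import scala.language.implicitConversions
import scala.collection.mutable

object Unknowns {
  // Values for symbols such as K or N, loaded before the model is built.
  private val table = mutable.HashMap[Symbol, Any]()

  def set(name: Symbol, value: Any): Unit = table.update(name, value)

  // Lets a Symbol stand in wherever an Int is expected.
  implicit def symbolToInt(s: Symbol): Int = table.get(s) match {
    case Some(i: Int) => i
    case _            => sys.error("Undefined Int unknown " + s)
  }
}

object UnknownsDemo {
  import Unknowns._

  Unknowns.set(Symbol("K"), 3)

  // The conversion is triggered by the expected type Int.
  val k: Int = Symbol("K")

  // The same conversion applies to the argument of `to`.
  val indices: List[Int] = (1 to Symbol("K")).toList
}
```

A lookup for a symbol that was never registered fails at the point of use with the "Undefined Int unknown" error, which is the behaviour the table-based approach implies.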

It could be written as such, too:

Model (
  'alpha    -> Gam(10, 10),
  'mu -> 'k -> NorMV(Vec(0.0, 1, dim), 1, dim)      With ('k in (1 to 'K)),
  'si2      -> IG(10, 10),
  'pi       -> DirSym('alpha, 'K),
  'z -> 'n  -> Mult('pi)                            With ('n in (1 to 'N)),
  'x -> 'n  -> NorMV('mu of ('z of 'n), 'si2, dim)  With ('n in (1 to 'N))
)

In which case Gam, Mult, etc. would need to be defined a bit differently, to handle the symbols being passed to them. The excess of "'" is definitely annoying, though.

It's not like HBC doesn't have its own idiosyncrasies, such as the occasional need for type declarations, underscores before indices, the occasional need to replace "~" with "\in", or even the backslash that needs to precede the latter. As long as there is a real benefit from using it instead of HBC, MATLAB, or whatever else the person is used to, they'll trouble themselves a bit.

EDIT:

After reading all the discussion, probably the best way to go is to define the grammar of your DSL and then parse it with Scala's built-in parsing utilities.

I'm not sure, though, what you are trying to achieve. Will your scripting language be more of a "what" or of a "how" type? The example you have given is a "what" type DSL: you describe what you are trying to achieve and don't care about the implementation. These languages are best used to describe a problem, and given the domain you are building the app for, I think it's the best way to go. The user just describes the problem in a syntax very familiar to the problem domain; the application parses this description and uses it as input to run the simulation. For this, building a grammar and parsing it with the Scala parsing utilities will probably be the best way to go (you only want to expose a small subset of features to the users).
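To give a feel for the "parse a small description language" route, here is a rough sketch. It deliberately uses plain regular expressions from the standard library rather than Scala's parser combinators, so it is a stand-in for the technique, not an example of those utilities. The names Stmt and TinyModelParser are hypothetical, and only the simplest statement form, name ~ Dist(args), is handled; indexed variables and range clauses such as ", k \in [1,K]" are out of scope.

```scala
// One parsed statement of the form `name ~ Dist(arg, ...)`.
final case class Stmt(name: String, dist: String, args: List[String])

object TinyModelParser {
  // Matches a whole line such as "alpha ~ Gam(0.1,1)".
  private val Line =
    """\s*([A-Za-z_]\w*)\s*~\s*([A-Za-z_]\w*)\s*\(([^()]*)\)\s*""".r

  def parseLine(line: String): Either[String, Stmt] = line match {
    case Line(name, dist, rawArgs) =>
      val args = rawArgs.split(",").map(_.trim).filter(_.nonEmpty).toList
      Right(Stmt(name, dist, args))
    case other =>
      Left("Cannot parse: " + other)
  }

  // Parses every non-blank line; returns the first error encountered, if any.
  def parse(src: String): Either[String, List[Stmt]] = {
    val lines = src.split("\n").toList.map(_.trim).filter(_.nonEmpty)
    lines.foldRight(Right(Nil): Either[String, List[Stmt]]) { (l, acc) =>
      for { s <- parseLine(l); rest <- acc } yield s :: rest
    }
  }
}
```

The resulting list of Stmt values is the kind of input the application could then hand to its simulation code, without exposing any general-purpose language to the user.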

If you need a "how" script, then using an already established scripting language is the way to go (unless you want to implement loops, basic data structures, etc. yourself).

In designing a system, there will always be trade-offs to be made. Here the trade-off is between the number of features you want to expose to the user and the terseness of your script. Myself, I'd go with exposing as few features as possible to get the job done, and get it done in a "what" way: the user doesn't need to know how you are going to simulate their problem if the simulation gives correct results and runs in reasonable time.

If you expose a full scripting language to the user, your DSL will just be a small API in that scripting language, and the user will have to learn a full language to be able to use its full power. And you may not want a user to use its full power (it may wreak havoc on your app!). Why would you expose, for example, TCP socket support when your application doesn't need to connect to the internet? That could be a security hole.

-- The following section discusses possible scripting languages. My answer above advises against using them, but I have left the discussion for completeness.

I have no experience with it, but have a look at Groovy. It is a dynamically typed scripting language for the JVM (with JVM support probably going to get better in JDK 7 due to invokedynamic). It also has good support for operator overloading and writing DSLs. Unfortunately, it doesn't have support for user-defined operators, at least not to my knowledge.

I would still go with Scala though (partially because I like static typing and I find its type inference good :). Its scripting support is quite good, and you can make almost anything look like native language support (for example, have a look at its actors library!). It also has very good support for functional programming, which can make scripts very short and concise. And as a benefit, you'll have all the power of the Java libraries at your disposal.

In order to use Scala as a scripting language, just put your script in a file ending with .scala and then run scala filename.scala. See "Scala as a scripting Language" for a discussion comparing Scala with JRuby.

None of the obvious suspects among JVM scripting languages -- JavaScript Rhino, JRuby, Jython, and Groovy -- have support for user-defined operators (which you'll probably need). Neither does Fan.

You might try using JRuby with the superators gem.
