
Java SystemClipboard contains additional bytes

I have the following setup: Ubuntu 12.04, Mathematica 9 and IntelliJ IDEA 12. Every time I copy some text from Mathematica and paste it into IDEA, there are a lot of additional bytes at the end of the pasted text. What first appeared to be a bug in IDEA now seems rather to be a bug in Java itself. I have appended a minimal Java example which shows the behavior.

When I type Plot inside Mathematica, select and copy it, and then run the example, I get the following output, where the first line is the printed form and the second line shows the bytes:

[Image: output of the example program]

As you can see, Plot is followed by a 0 byte and some other, not necessarily zero, bytes. Throughout all of my tests, I found that a valid workaround is to use the string only up to the first 0, but that does not solve the underlying problem. I really want to see this fixed, because I often copy code between Mathematica and IntelliJ IDEA, but first I need to know who to blame for this.

Question:

How can I find out whether Mathematica or Java is doing something wrong here? I can copy Mathematica content to different editors, browsers, etc. and I have never seen anything like this. On the other hand, I have never seen IntelliJ (Java) copy garbage either. What is a good way to find out whether Mathematica is using the clipboard incorrectly or Java has a bug?

Minimal example

Select some text in Mathematica, press Ctrl+C and run the following:

import java.awt.*;
import java.awt.datatransfer.Clipboard;
import java.awt.datatransfer.DataFlavor;

public class CopyPasteTest {

  public static void main(String[] args) {
    final String text;
    try {
      final Clipboard systemClipboard =
        Toolkit.getDefaultToolkit().getSystemClipboard();
      text = (String) systemClipboard.getData(DataFlavor.stringFlavor);
      System.out.println(text);
      for (byte a : text.getBytes()) {
        System.out.print(a + " ");
      }
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}

Further information requested in comments

Could you just take a look at the clipboard contents after the copy-from-Mathematica operation?

Sure. Unfortunately, it returns absolutely nothing. When I select and copy something from the browser, for instance "this here", I get:

patrick@lenerd:~$ xclip -out | hexdump -C
00000000  74 68 69 73 20 68 65 72  65                       |this here|
00000009

Edit

I tried the following things, always using the same "Plot" string copied from Mathematica. First of all, I tried the larger test class from David, as suggested in his comment. With both the Oracle JRE and the OpenJDK JRE that comes with Ubuntu, I got the following output:

===========
Plot[00][7f][00][00]
===========
Obtained transferrable of type sun.awt.datatransfer.ClipboardTransferable
Plot[00][7f][00][00]
===========
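
The bracketed hex notation in this output can be reproduced with a small helper. This is only a sketch: David's actual test class is not shown here, and the class name `ClipboardHexDump` and the method `escapeNonPrintable` are made up for illustration. It is fed a hard-coded string instead of the clipboard so that it runs anywhere.

```java
final class ClipboardHexDump {

    // Keeps printable ASCII as-is and renders every other char as [xx],
    // where xx is the hex value of the char.
    static String escapeNonPrintable(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x20 && c < 0x7f) {
                out.append(c);
            } else {
                out.append(String.format("[%02x]", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hard-coded stand-in for the string returned by the clipboard.
        String text = "Plot\u0000\u007f\u0000\u0000";
        System.out.println(escapeNonPrintable(text)); // Plot[00][7f][00][00]
    }
}
```

In the minimal example above, the same helper could be applied to the value returned by `getData(DataFlavor.stringFlavor)`.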

My short snippet from above gives the same result (although not in hex representation). Then I tried the different selections of xclip, and using the value clipboard brought up the following:

patrick@lenerd:~$ xclip -o -verbose -selection clipboard | hexdump -C
Connected to X server.
Using selection: XA_CLIPBOARD
Using UTF8_STRING.
00000000  50 6c 6f 74 00 00 00 00                           |Plot....|
00000008

It is important to note that when I don't use verbose output with xclip, I only see "Plot" in the terminal. Above, you can see that there are exactly 4 more bytes in the buffer, which are probably not shown because they start with a 00. Additionally, the extra four bytes are 00 00 00 00, or at least this is what is displayed. In Java we have a 7f (or 127) at the second position.

I guess this all suggests that the bug comes from Mathematica, since it copies additional stuff into the buffer, and Java is just a bit sloppy because it doesn't cut at the first 00.

These conclusions look sound.

I found the following references about the behaviour of the X clipboard:

X11R6 Inter-Client Communication Conventions Manual, in particular "Peer-to-Peer Communication by Means of Selections", and also a more compressed explanation (with Python test tools) at "Developer's corner: copy-paste in Linux".

Thus, the data "Plot[00][7f][00][00]" or maybe "Plot[00][00][00][00]" is the data that is actually provided by Mathematica on request to the application that "reads" the clipboard. I can only imagine that Mathematica says "here is the string with eight bytes" and the reading application tries to deal with this, reading past the end of the actual character array.

It could also be a bug in X (but Ubuntu 12.04 doesn't use Mir yet, so probably not).

Note that in Java, Strings are not NUL-terminated, and "Plot[00][7f][00][00]" is indeed a valid string.
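
To illustrate the point, here is a minimal sketch of what happens when a reader trusts an advertised size of eight bytes. The byte values are taken from the Java output in the question; the class name `EmbeddedNulDemo` and the method `decode` are hypothetical, and this is not how the JDK's clipboard code is actually written.

```java
import java.nio.charset.StandardCharsets;

final class EmbeddedNulDemo {

    // Decodes exactly `length` bytes, the way a reader would if it
    // trusted the size reported by the clipboard owner.
    static String decode(byte[] data, int length) {
        return new String(data, 0, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] payload = {'P', 'l', 'o', 't', 0x00, 0x7f, 0x00, 0x00};
        String text = decode(payload, payload.length);

        // The NUL does not end the string: all eight chars are kept.
        System.out.println(text.length());          // 8
        System.out.println(text.indexOf('\u0000')); // 4
    }
}
```

A C program that treated the same buffer as a NUL-terminated string would stop after "Plot", which would fit the observation that other applications show no garbage.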

A quick glance at the source of xclip (obtained with yumdownloader --source xclip on my Fedora) seems to reveal that it just calls XFetchBuffer or memcpy (not fully sure) to obtain the bytes, then calls fwrite on those, so it will happily write the NULs to the output.

It looks like an issue with the string end character (I had similar issues with data modified by a C++ DLL and sent through an external system). I don't know how to fix the problem, but I think you can use a simple workaround to remove the invalid chars: simply call the trim() method on the text.

text = (String) systemClipboard.getData(DataFlavor.stringFlavor);
text = text.trim();
System.out.println(text);
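
One caveat with this workaround: String.trim() only strips leading and trailing characters whose code is at most U+0020. Trailing NULs are removed, but the 7f byte seen in the Java output is not. The sketch below uses hard-coded strings in place of the clipboard; the class name `TrimDemo` and the method `trimmedLength` are made up for illustration.

```java
final class TrimDemo {

    // Length of the string after String.trim().
    static int trimmedLength(String s) {
        return s.trim().length();
    }

    public static void main(String[] args) {
        String onlyZeros = "Plot\u0000\u0000\u0000\u0000"; // as shown by xclip
        String withDel = "Plot\u0000\u007f\u0000\u0000";   // as shown by Java

        System.out.println(trimmedLength(onlyZeros)); // 4
        System.out.println(trimmedLength(withDel));   // 6
    }
}
```

In the second case the result is still "Plot" followed by a NUL and a 7f character, so trim() alone would not be enough for the data reported in the question.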

I guess it's a zero-terminated "C-style" string and there is some misunderstanding about it between Mathematica and Java. I'd ask on a Linux forum how the clipboard is supposed to work.

As a workaround, I'd suggest

text = text.replaceFirst("\u0000(?s:.*)", "");
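
The same cut at the first NUL can also be written without a regular expression. This is a sketch with a hypothetical helper name (`untilFirstNul`), not part of any library. Like the regex, it treats the symptom only and leaves the question of who writes the extra bytes open.

```java
final class NulTruncation {

    // Returns the part of s before the first NUL character,
    // or s unchanged if it contains none.
    static String untilFirstNul(String s) {
        int nul = s.indexOf('\u0000');
        return nul < 0 ? s : s.substring(0, nul);
    }

    public static void main(String[] args) {
        // Hard-coded stand-in for the string returned by the clipboard.
        String pasted = "Plot\u0000\u007f\u0000\u0000";
        System.out.println(untilFirstNul(pasted)); // Plot
    }
}
```

Both variants drop everything after the first NUL, including any further non-zero bytes such as the 7f.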
