
Output index of ELKI

I am using ELKI to cluster data from a CSV file.

I use

-resulthandler ResultWriter
-out folder/

to save the output data.

But the output contains some strange indexes:

ID=2138 0.1799 0.2761
ID=2137 0.1797 0.2778
ID=2136 0.1796 0.2787
ID=2109 0.1161 0.2072
ID=2007 0.1139 0.2047

The IDs are above 2000 even though I have fewer than 100 training samples.

DBIDs are internal; the documentation clearly says that you shouldn't make too many assumptions about them, because their implementation may change. The only reason they are written to the output at all is that some methods (such as OPTICS) may require cross-referencing objects by this unique ID.

Because they are meant to be unique identifiers, they are usually incremented continuously. The next time you click "Run" in the MiniGUI, you get the next n IDs, so you evidently clicked "Run" more than once.
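If you post-process the output files yourself, it is safer to treat the DBID as an opaque key than as an array index. Here is a minimal Python sketch, assuming each data line looks like the sample in the question (`ID=<number>` followed only by numeric columns) and that lines starting with `#` are comments; the function names are made up for illustration and are not part of ELKI:

```python
def parse_elki_line(line):
    """Split a line such as 'ID=2138 0.1799 0.2761' into (2138, [0.1799, 0.2761])."""
    id_token, *value_tokens = line.split()
    if not id_token.startswith("ID="):
        raise ValueError(f"unexpected line: {line!r}")
    return int(id_token[3:]), [float(v) for v in value_tokens]


def index_by_dbid(lines):
    """Build a dict keyed by DBID; blank lines and '#' comment lines are skipped."""
    records = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        dbid, values = parse_elki_line(line)
        records[dbid] = values
    return records


if __name__ == "__main__":
    sample = ["ID=2138 0.1799 0.2761", "ID=2007 0.1139 0.2047"]
    print(index_by_dbid(sample))
```

Looking records up by key this way keeps working no matter where the ID range happens to start on a given run.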

The "Tips & Tricks" in the ELKI DBID documentation probably answer your underlying question: how to map DBIDs to line numbers of your input file. If you want object identifiers, the best way is to assign them yourself by using an identifier column (and configuring it to be an external identifier).
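One way to prepare such an identifier column is to append it to the CSV before loading it into ELKI. The sketch below is only an illustration: the function name and the `row` prefix are arbitrary choices, and it assumes a simple CSV with one record per line (no header, no multi-line quoted fields). How the column is then configured as an external identifier on the ELKI side is described in the DBID documentation and is not shown here.

```python
import csv
import io


def add_identifier_column(src, dst, prefix="row"):
    """Copy CSV rows from src to dst, appending '<prefix><input line number>' to each.

    Returns the number of rows written. Empty lines are skipped, so the
    identifier always refers to the line number in the original file.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    written = 0
    for row in reader:
        if not row:
            continue
        writer.writerow(row + [f"{prefix}{reader.line_num}"])
        written += 1
    return written


if __name__ == "__main__":
    source = io.StringIO("0.1799,0.2761\n0.1797,0.2778\n")
    target = io.StringIO()
    add_identifier_column(source, target)
    print(target.getvalue())
```

With real files, pass handles opened with `open(path, newline="")` instead of the `io.StringIO` objects. The identifier is non-numeric on purpose, so that it is not mistaken for another data dimension.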

For further information, see the documentation: https://elki-project.github.io/dev/dbids
