I have a university project in which I need to do some simple analysis on a large dataset of my choosing, and we are to run it on Hadoop. I have chosen to use Hive: I have essentially no experience with databases, but I like Hive.
Anyway, I have got a chess dataset, and I have been able to extract some columns of interest, such as the names of the opening moves, and find how often they occur. Things like that.
I would like to take a look at the first few moves of each game, and that brings me to my problem. The notation for all moves is stored in a column called moves, and looks like this:
This column is in a .csv file called chess_game.
How would I go about extracting, say, the first 4 moves into a new table called something like opening_moves?
Thanks in advance for any advice.
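You can split the moves string into an array using the split function and then pick elements by index, like this:
select rating,
       moves[0] as first,   -- array indexes start at 0
       moves[1] as second,
       moves[2] as third,
       moves[3] as fourth
from
(
  -- split the moves string on spaces into an array
  select rating, split(moves, ' ') as moves from your_table
) s
;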
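To get the result into a new table, as the question asks, one option is a create-table-as-select. The following is only a sketch under a few assumptions: the CSV has already been loaded into a Hive table named chess_game (a hypothetical table name taken from the file name), the moves are separated by single spaces, and the output column names move_1 to move_4 are made up for illustration.

```sql
-- Sketch: materialize the first four moves of every game into a new table.
-- Assumes a Hive table chess_game with a string column moves (space-separated).
create table opening_moves as
select m[0] as move_1,
       m[1] as move_2,
       m[2] as move_3,
       m[3] as move_4
from
(
  -- split() turns the moves string into an array of strings
  select split(moves, ' ') as m
  from chess_game
) s;
```

As far as I know, indexing past the end of an array in Hive returns NULL rather than failing, so games shorter than four moves should simply get NULLs in the later columns. If the notation includes move numbers (such as "1." before each pair of moves), those would come out as array elements too, and the indexes would need adjusting.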