I have data that looks like this:
original_data =
[['not',
'ahead',
'um let me think',
'thats not very encouraging if they had a cast of thousands on the other end'],
['okay civil liberties tell me your position',
'probably would go ahead'],
['oh',
'it up so i dont know where you really go',
'well most of my problem with this latest task',
'its some i kind of dont want to put in the time to do it',
'right so im saying ive got a lot of other things to do']]
However, I am doing some preprocessing before modeling, and the structure needs to look like this:
new_data =
[[['not'],
['ahead'],
['um let me think'],
['thats not very encouraging if they had a cast of thousands on the other end']],
[['okay civil liberties tell me your position'],
['probably would go ahead']],
[['oh'],
['it up so i dont know where you really go'],
['well most of my problem with this latest task'],
['its some i kind of dont want to put in the time to do it'],
['right so im saying ive got a lot of other things to do']]]
How can I go about doing this? Thanks.
Try a nested list comprehension:
new_data = [[[j] for j in i] for i in original_data]
The [j] wraps each string in its own single-element list.
The two for clauses just loop over the outer list and then over each inner list.
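If the one-liner is hard to read, it may help to see it unrolled into explicit loops. The sketch below uses a shortened sample with the same nesting as original_data; the names group, sentence, and wrapped_group are only illustrative and do not come from the question.

```python
# Shortened sample with the same nesting as original_data (illustrative subset).
original_data = [
    ['not', 'ahead'],
    ['okay civil liberties tell me your position', 'probably would go ahead'],
]

# Explicit-loop equivalent of [[[j] for j in i] for i in original_data].
new_data = []
for group in original_data:            # each inner list of strings
    wrapped_group = []
    for sentence in group:             # each string in that inner list
        wrapped_group.append([sentence])  # wrap the string in its own list
    new_data.append(wrapped_group)

print(new_data)
```

Both forms build a new nested list and leave original_data unchanged.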
This nested list comprehension will return what you want:
new_data = [[[l] for l in lst] for lst in original_data]
Output:
[[['not'], ['ahead'], ['um let me think'], ['thats not very encouraging if they had a cast of thousands on the other end']], [['okay civil liberties tell me your position'], ['probably would go ahead']], [['oh'], ['it up so i dont know where you really go'], ['well most of my problem with this latest task'], ['its some i kind of dont want to put in the time to do it'], ['right so im saying ive got a lot of other things to do']]]