I have a list in PySpark that looks like this:
result = ['2016-12-11T04:12:58.797', '2016-12-11T03:50:28.253', '2016-12-11T03:49:52.613', '2016-12-11T03:37:49.857']
I need to extract only the year from each element of the list. What I tried is:
resultYear = result[0:4]
But I know this is not the solution. I am very new to Python and PySpark, so I need help. Thanks.
To answer the question in the title, you simply have to iterate through the list and take the first 4 characters of each element:
for element in result:
    year = element[:4]
    # do what you want with this, e.g. print it
    print(year)
>>>2016
>>>2016
...
But a more concise way to do it is a list comprehension:
r = [element[:4] for element in result]
# returns a list of years
print(r)
>>> ['2016', '2016',...]
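For context on why the attempt in the question did not work: slicing the list itself selects list elements, while slicing each string selects characters. Here is a minimal sketch of the difference; the names `first_four_items`, `years` and `years_as_int` are made up for this illustration.

```python
result = ['2016-12-11T04:12:58.797', '2016-12-11T03:50:28.253',
          '2016-12-11T03:49:52.613', '2016-12-11T03:37:49.857']

# Slicing the list returns (up to) its first four elements, not the year.
first_four_items = result[0:4]

# Slicing each string returns that string's first four characters.
years = [element[:4] for element in result]

# Convert to integers if numeric years are needed.
years_as_int = [int(y) for y in years]

print(first_four_items == result)  # True here, because the list has exactly four items
print(years)
print(years_as_int)
```

The integer conversion is optional; it only matters if the years are later compared or sorted numerically.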
Use the string split function: split each string on '-' and take the first part, which is the year.
INPUT
result = ['2016-12-11T04:12:58.797', '2016-12-11T03:50:28.253', '2016-12-11T03:49:52.613', '2016-12-11T03:37:49.857']
result = [r.split('-')[0] for r in result]
OUTPUT
['2016', '2016', '2016', '2016']
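If the strings are always timestamps in this format, another option is to parse them with the standard library's `datetime` module and read the `year` attribute. This gives integers and raises an error on malformed input. The sketch below assumes every element matches the format string shown; `TIMESTAMP_FORMAT` and `years` are names chosen for this example, not part of the original answers.

```python
from datetime import datetime

result = ['2016-12-11T04:12:58.797', '2016-12-11T03:50:28.253',
          '2016-12-11T03:49:52.613', '2016-12-11T03:37:49.857']

# Assumed format: date, literal 'T', time with fractional seconds.
TIMESTAMP_FORMAT = '%Y-%m-%dT%H:%M:%S.%f'

# strptime raises ValueError if a string does not match the format.
years = [datetime.strptime(r, TIMESTAMP_FORMAT).year for r in result]

print(years)  # [2016, 2016, 2016, 2016]
```

Slicing or splitting is shorter when only the year is needed. Parsing is probably the better fit if the month, day or time will be used later as well.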