I started building a TikTok tool for data analysis, but I cannot extract hashtags from a saved .txt file. Here is what I did:
from tiktok_bot import TikTokBot  # TikTok API
import csv
import datetime  # needed by getData() below
import os
import sys
import re  # attempted to use findall, but it didn't work
try:
    os.mkdir("./data")  # creating the data folder
except OSError as e:
    print("Directory exists")
def getData():  # date in file name
    return datetime.datetime.now().strftime("%Y-%m-%d")

def buildFileName(type):  # building .csv name
    return "./data/" + getData() + type + ".csv"

def buildText(type):  # building .txt name
    return "./data/" + getData() + type + ".txt"
with open(buildFileName("_shares"), mode='a') as csv_file:  # writing .csv file
    fieldnames = ['User ID', 'URL', 'Description', 'Comments', 'Likes']
    writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
    writer.writeheader()
    for post in most_shared_posts:
        print(str(post.author_user_id), str(post.share_url), str(post.desc), post.statistics.comment_count, post.statistics.digg_count)
        writer.writerow({
            'User ID': str(post.author_user_id),
            'URL': str(post.share_url),
            'Description': str(post.desc),
            'Comments': post.statistics.comment_count,
            'Likes': post.statistics.digg_count,
        })
with open(buildFileName("_shares"), mode='r') as csv_file:
    csv_reader = csv.DictReader(csv_file, delimiter=',')
    for lines in csv_reader:
        print(lines['Description'])  # show each description from the .csv
        sys.stdout = open(buildText("_shares"), "w")  # redirect output to the .txt file
        print(lines['Description'])
What can I do now to extract hashtags from the descriptions printed in the .txt file? Note: each description is made up of text and hashtags, so I think it is basically a string.
You can use re.findall with a capture group, which returns each hashtag without the leading #:
import re
m = re.findall(r'#(\w+)', lines['Description'])
print(m)
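To make the behaviour concrete, here is a minimal sketch of the same pattern applied to a made-up description. The function name extract_hashtags and the sample string are assumptions for illustration, not part of the original code; in the question's loop the input would be lines['Description'].

```python
import re

def extract_hashtags(description):
    """Return the hashtags in a description, without the leading '#'."""
    # The parentheses form a capture group, so findall returns only
    # the part after '#'. \w matches letters, digits and underscores.
    return re.findall(r"#(\w+)", description)

# Hypothetical description, standing in for lines['Description'].
sample = "Trying the new dance #fyp #dance_challenge #2021"
print(extract_hashtags(sample))  # ['fyp', 'dance_challenge', '2021']
```

A description with no hashtags simply gives an empty list, so the call is safe to run on every row.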
I'm not sure I fully understand your question, but am I correct in assuming that you want to get the hashtags from the description string? If so, you can use re to find every hashtag word in the string.
hashtags = re.findall(r"#\w*", description)
This should return a list of the hashtags you're looking for, each with its leading #.
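One detail worth knowing about this pattern: because \w* also matches zero characters, a lone # in the text is returned as a hashtag. The sketch below uses \w+ instead and, as one possible next step for the analysis, counts hashtags across several description lines. The name count_hashtags, the lowercasing, and the sample lines are assumptions for illustration only.

```python
import re
from collections import Counter

# Like r"#\w*", but requires at least one character after '#'.
HASHTAG = re.compile(r"#\w+")

def count_hashtags(lines):
    """Count hashtags (case-insensitively) across an iterable of text lines."""
    counts = Counter()
    for line in lines:
        counts.update(tag.lower() for tag in HASHTAG.findall(line))
    return counts

# Hypothetical lines, standing in for the contents of the saved .txt file.
sample_lines = [
    "Morning routine #fyp #Coffee\n",
    "No tags here, just a lone # sign\n",
    "Second cup #coffee\n",
]
counts = count_hashtags(sample_lines)
print(counts.most_common())  # [('#coffee', 2), ('#fyp', 1)]
```

Since a file object is itself an iterable of lines, the same function could be given the file opened with open(buildText("_shares"), encoding="utf-8"), assuming the descriptions were written one per line.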