Strange output for topic modeling in python using PAM LDA

Question

I am trying to do topic modeling on a column of my dataframe that contains only English text (you can substitute any text):

dfi['clean_text']
Out[154]: 
0        thank you for calling my name is gabrielle and...
1        your available my first name is was there you ...
2                                                    good 
3                                           go head sorry 
4        no go head i mean how do you want to pull my r...
                       
14676                              just the email is fine 
14677    okay great so then everything is process here ...
14678                         no thats it i appreciate it 
14679    yes and thank you very much we appreciated hav...
14680                                   thank you bye bye

My model -

#Pachinko Allocation Model
import tomotopy as tp
from pprint import pprint

model = tp.LDAModel(k=2, seed=1)  #k is the number of topics

for texts in dfi['clean_text']:
    model.add_doc(texts)

model.train(iter=100)

#Extracting the word distribution of a topic
for k in range(model.k):
    print(f"Topic {k}")
    pprint(model.get_topic_words(k, top_n=5))

Output:

Topic 0
[(' ', 0.2129271924495697),
 ('e', 0.08137548714876175),
 ('o', 0.0749373733997345),
 ('a', 0.07390690594911575),
 ('t', 0.06929121911525726)]
Topic 1
[(' ', 0.19975200295448303),
 ('e', 0.09751541167497635),
 ('t', 0.06939278542995453),
 ('i', 0.06373799592256546),
 ('o', 0.06239694356918335)]

But as you can see, the output contains no words for either topic, only single characters (including the space character). I'm new to Python and may be missing something here.

Answer 1
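
One likely cause, offered as a sketch rather than a definitive diagnosis: tomotopy's add_doc expects an iterable of word tokens, and a Python string is itself an iterable of its characters. Passing each row of dfi['clean_text'] directly therefore turns every letter and space into a "word", which matches the output in the question. Splitting each string into words first should avoid this. In the example below, the docs list and the tokenize helper are hypothetical stand-ins (for a few rows of the dataframe and for whatever tokenizer suits the data), and the tomotopy part only runs if the library is installed.

```python
# Hypothetical stand-in for a few rows of dfi['clean_text'].
docs = [
    "thank you for calling my name is gabrielle",
    "go head sorry ",
    "just the email is fine ",
]


def tokenize(text):
    """Split a cleaned string into word tokens on whitespace (illustrative helper)."""
    return text.split()


# Iterating over a str yields single characters; this is what add_doc received.
as_chars = list(docs[1])
# Splitting first yields whole words.
as_words = tokenize(docs[1])

print(as_chars[:4])  # ['g', 'o', ' ', 'h']
print(as_words)      # ['go', 'head', 'sorry']

try:
    import tomotopy as tp
except ImportError:  # keep the sketch runnable without tomotopy
    tp = None

if tp is not None:
    model = tp.LDAModel(k=2, seed=1)  # k is the number of topics
    for text in docs:
        words = tokenize(text)
        if words:  # skip rows that are empty after tokenizing
            model.add_doc(words)
    model.train(iter=100)
    for k in range(model.k):
        print(f"Topic {k}", model.get_topic_words(k, top_n=5))
```

With the real data, the loop would iterate over dfi['clean_text'] instead of docs. Note also that the code in the question builds tp.LDAModel, which is plain LDA; if the Pachinko Allocation Model is what is actually wanted, tomotopy exposes it separately as tp.PAModel.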


1 answers

solution1
0 2021-04-25 19:30:22
