This week I struggled a little with modifying the interior elements of an object inside an lapply loop.
What helped was this article on the Stack Overflow forum.
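The post does not say exactly what went wrong, so the following is only a minimal sketch of the usual pitfall, assuming the problem was that assignments made inside the function passed to lapply act on a local copy. The `docs` list and its fields are hypothetical illustration data, not part of tmCorpus.

```r
# Hypothetical data: a list of small "document" records.
docs <- list(
  a = list(text = "Hello World", lang = "en"),
  b = list(text = "Foo Bar", lang = "en")
)

# Pitfall: the assignment changes only the local copy `d`.
# lapply returns the assigned values; `docs` itself is untouched.
ignored <- lapply(docs, function(d) {
  d$text <- tolower(d$text)
})

# Fix 1: return the modified element and assign the result back.
docs_lower <- lapply(docs, function(d) {
  d$text <- tolower(d$text)
  d
})

# Fix 2: modify in place by index with a plain for loop.
docs_loop <- docs
for (i in seq_along(docs_loop)) {
  docs_loop[[i]]$text <- tolower(docs_loop[[i]]$text)
}
```

Both fixes give the same result. The first keeps the functional style of lapply, while the second is often easier to read when only one field changes.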
Over the last week I focused on integrating some of the existing methods from packages such as NLP and tm for the tmCorpus object.
Among others, I integrated methods such as content(x), which for a tmCorpus returns a list of documents.
I also prepared a version of the tm_map function that can be applied to a tmCorpus. Finally, it is now possible to create a Term Document Matrix and a Document Term Matrix directly from a tmCorpus object. The last commit fixed a bug that forced the use of English stop-words in every case. The stop-word handling is now more general.
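For readers unfamiliar with these functions, here is a rough sketch of the same workflow using the tm package's own VCorpus class. It is not the tmCorpus implementation. The helper `build_matrices` and its `language` argument are hypothetical names, used only to show how the stop-word language can be passed in rather than fixed to English.

```r
library(tm)

# Hypothetical helper: builds both matrices from raw texts.
# `language` is forwarded to tm::stopwords(), so it is not
# hard-coded to English.
build_matrices <- function(texts, language = "english") {
  corpus <- VCorpus(VectorSource(texts))
  corpus <- tm_map(corpus, content_transformer(tolower))
  corpus <- tm_map(corpus, removePunctuation)
  corpus <- tm_map(corpus, removeWords, stopwords(language))
  list(
    corpus = corpus,
    tdm = TermDocumentMatrix(corpus),  # terms in rows, documents in columns
    dtm = DocumentTermMatrix(corpus)   # documents in rows, terms in columns
  )
}

res <- build_matrices(c("The cat sat on the mat.", "A dog chased the cat."))

# content() on a VCorpus returns the list of documents.
documents <- content(res$corpus)
```

Calling `build_matrices(texts, language = "german")` would use tm's German stop-word list instead.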
For this week I plan to:
July 4 – July 12
Start working on the “Integration of TreeTagger for topic modelling within specific part of speech.”
Test the koRpus package for this purpose.
Design an API to include TreeTagger without using the koRpus package.
Also, I plan to invest some time into sentiment analysis and other packages for LDA and visualizations.