Hi there!

This week I am starting my coding project for Google Summer of Code! Over the last few weeks I have been digging deeper into visualization and analysis tools for text mining. This included sentiment analysis with the syuzhet package, visualization of topic models using LDA and LDAvis, and dfr-browser, the topic browser made by Andrew Goldstone.
I also studied the methodology of Test-Driven Development (TDD). This software development process relies on repeating a very short development cycle: write a failing test, write just enough code to make it pass, then refactor. Kent Beck, who is credited with developing the technique, remarked that TDD encourages simple designs and inspires confidence.
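
To make the short development cycle concrete, here is a minimal sketch in Python. It is not code from the project: `tokenize` is a hypothetical helper invented for this illustration. In a TDD workflow the two test functions would be written first and would fail until the function is implemented.

```python
def tokenize(text):
    """Hypothetical helper: split text into lowercase word tokens."""
    # Written only after the tests below existed and were failing.
    return text.lower().split()


# Tests come first in TDD; they describe the behaviour we want.
def test_lowercases_and_splits_on_whitespace():
    assert tokenize("Topic Models") == ["topic", "models"]


def test_empty_string_gives_no_tokens():
    assert tokenize("") == []
```

A test runner such as pytest would collect the `test_` functions automatically; once they pass, the implementation can be refactored with the tests as a safety net.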

Plans for this week:

May 23 – May 29

  • Start work on the “Interface for the mallet package”.
    A solution to this problem has been partially implemented by Andrew Goldstone. Some interface functions are also available in my GitHub repository.
  • Design the API and prepare the skeleton for the most important functions.
    The layout of the basic data structures is already under discussion with my mentors.

References:

  • https://github.com/tedunderwood/pmla-scripts/blob/master/ConvertPMLAtoNetwork.R
  • https://github.com/agoldst/dfr-analysis
  • http://andrewgoldstone.com/
  • https://begriffs.com/posts/2015-02-25-text-mining-in-r.html
  • http://www.r-bloggers.com/a-link-between-topicmodels-lda-and-ldavis/