**DSTK - Data Science Toolkit 3** is a set of data and text mining software tools that follows the CRISP-DM model. DSTK offers **data understanding** through statistical and text analysis, **data preparation** through normalization and text processing, and **modeling and evaluation** for machine learning and statistical learning algorithms. It is based on the older version of DSTK at https://sourceforge.net/projects/dstk2/

**The new version is much smaller (~150 MB), so there is no more downloading gigabytes of files**. In the future, DSTK 3 will offer features such as Deep Neural Network (Deep Learning), Text Link Analysis with Visualizations, and KMeans Clustering. Some of these features may be present in the older version, but the algorithms are being rewritten to reduce reliance on external libraries like Weka and so shrink the file size, and we need more time to develop them.
DSTK Engine is still in beta, so there may be some bugs and inaccuracies.

**DSTK 3 consists** of **DSTK Engine**, **DSTK ScriptWriter**, **DSTK Studio**, and **DSTK Text Explorer**. DSTK Engine is a simplified R, focused on data mining. DSTK ScriptWriter offers a GUI for writing scripts for DSTK Engine. DSTK Studio offers an SPSS Statistics-like GUI for data mining, and DSTK Text Explorer offers a GUI for text mining.

**DSTK Engine and DSTK ScriptWriter are free of charge** and have been uploaded to Sourceforge.net. They are under the GNU GPL license. For a __commercial license__, please contact us.

**DSTK Studio and DSTK Text Explorer**, however, **require a small fee of $59 USD** to help support us. A demo version of DSTK Studio and DSTK Text Explorer is included in the DSTK 3 package, but you can only use them 10 times.

DSTK 3 is written in C# and Java.

**License:** *DSTK Engine uses WordNet, MIT JWI, GATE's Gazetteers, Stanford NLP POS Tagger, Harvard University Inquirer Sentiments Data, Porter Stemmer C# library, RPortable, Math.Net Numerics, and others. Each has its own license, and these are included in the DSTK Engine distribution. DSTK Studio, DSTK Text Explorer, and DSTK ScriptWriter are standalone applications that provide an easy-to-use GUI for writing scripts for DSTK Engine. DSTK 3 has no warranty, but we welcome feedback.*

**Purchase DSTK Studio and DSTK Text Explorer:**

1. Descriptives (mean, median, variance, standard deviation, ...)

2. Inferential (T-Test, Chi-Square, ...)

3. Regression (Simple Linear, Multiple Linear, ...)

... And interface with R and Python, ...

1. Histogram

2. Scatter Plots

3. Box Plots

... And more...

1. Log Transform

2. Feature Scaling

3. Standard Score

4. Remove Missing Values

... And more...
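As a rough illustration of what the four data preparation steps listed above compute, here is a minimal Python sketch. It is not DSTK Engine script syntax; the function names are hypothetical, and it assumes that "Feature Scaling" means min-max rescaling to the range 0 to 1 and that "Standard Score" uses the sample standard deviation.

```python
import math
import statistics


def remove_missing(values):
    """Drop entries that are None or NaN."""
    return [
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]


def log_transform(values):
    """Natural logarithm of each value; every input must be positive."""
    return [math.log(v) for v in values]


def feature_scale(values):
    """Min-max rescaling to [0, 1] (assumed meaning of 'Feature Scaling')."""
    lo, hi = min(values), max(values)
    span = hi - lo
    # A constant column has no spread, so map it to 0.0 instead of dividing by zero.
    return [(v - lo) / span if span else 0.0 for v in values]


def standard_score(values):
    """z-score: (value - mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


raw = [1.0, None, 10.0, float("nan"), 100.0]
clean = remove_missing(raw)

print(clean)
print(log_transform(clean))
print(feature_scale(clean))
print(standard_score(clean))
```

Missing values are removed first because the other three transforms cannot handle `None` or NaN entries.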

1. Neural Network (in the future, Deep Neural Network)

2. Naive Bayes

3. KNN

4. Linear Regression

5. Multiple Linear Regression

6. Bag of Words
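To give a feel for one of the models in the list above, the following is a minimal Python sketch of KNN (k-nearest neighbors) classification. It is an illustration only, not DSTK's implementation: the `knn_predict` function, the toy data, and the choice of Euclidean distance with majority voting are all assumptions.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows.

    `train` is a list of (features, label) pairs; `query` is a feature tuple.
    Distance is Euclidean (an assumption for this sketch).
    """
    by_distance = sorted(train, key=lambda row: math.dist(row[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups of 2-D points.
train = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((2.0, 1.5), "low"),
    ((6.0, 6.5), "high"),
    ((7.0, 6.0), "high"),
    ((6.5, 7.0), "high"),
]

print(knn_predict(train, (1.2, 1.4)))  # low
print(knn_predict(train, (6.8, 6.4)))  # high
```

An odd `k` avoids tied votes when there are two classes.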

1. Text Preprocessing (stopwords, porter stemmer, regular expressions, ...)

2. POS Tagging, Named Entity Recognition, WordNet

3. Sentiment Analysis

4. Text Classification (Naive Bayes, NN, ...)

... And more with gazetteers from GATE...
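The text preprocessing items listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not DSTK Engine script syntax: the stopword set is a tiny hypothetical sample, tokenization uses a simple regular expression, stemming is left out, and the result is counted into a bag-of-words representation.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption); real lists are much longer.
STOPWORDS = {"the", "a", "an", "is", "and", "of", "to", "in"}


def preprocess(text):
    """Lowercase, tokenize with a regular expression, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def bag_of_words(text):
    """Count how often each remaining token occurs."""
    return Counter(preprocess(text))


doc = "The engine is small, and the engine is fast."

print(preprocess(doc))    # ['engine', 'small', 'engine', 'fast']
print(bag_of_words(doc))  # Counter({'engine': 2, 'small': 1, 'fast': 1})
```

The resulting counts are the kind of feature vector a text classifier such as Naive Bayes can be trained on.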

1. Expand features with R Scripts...

2. Included plugins for Big Data Analysis using Microsoft Azure...


DSTK - DataScience ToolKit is free software for statistical analysis, data visualization, text analysis, and predictive analytics. It is designed to be straightforward and easy to use, and familiar to SPSS users. The application is built with R, Python, NLTK, scikit-learn, and other libraries. The product is currently available as freeware.

JAOSS - Just Another Online Statistical System. A simple statistical system with fairly sophisticated features such as descriptive statistics, inferential statistics with ANOVA and T-Test, and predictive analytics with Neural Network. The application is written in R and Shiny and aims to provide online access to a simple statistical system before you proceed to advanced software such as SPSS or DSTK. This product is currently available as freeware.

JATAS - Just Another Text Analysis System. A simple text analytics system with fairly sophisticated features such as text preprocessing (stemmer, stopwords, ...), visualizations, and predictive analytics with SVM. The application is written in R and Shiny and aims to provide online access to a simple text analysis system before you proceed to advanced software such as SPSS Modeler or DSTK. This product is currently available as freeware.

JATI is just another interface to the Tesseract OCR engine, providing a GUI to convert an image to text. It can do batch conversion and can convert only a portion of an image into text. This product is currently available as freeware.

JAVT, or Just Another Voice Transformer (formerly called Just Another Video Transcriber), is speech recognition software that also supports text-to-speech and simple media conversion. JAVT lets you convert video files to audio WAV files using ffmpeg, and then transcribe the audio to text using either Microsoft SAPI or CMU Sphinx. This product is currently available as freeware.

JAWS, or Just Another Web Scraper, is part of the data scraping software developed by SVbook, alongside JATI (Image to Text) and JAVT (Video to Text). JAWS offers an easy interface to scrape data from websites using regular expressions, text preprocessing, or HTML Agility Pack. This product is currently available as freeware. Enjoy.

We have developed our own data and text mining software at DSTK.Tech. This technical book aims to equip the reader with data and text mining fundamentals in a fast and practical way using our DSTK - Data Science ToolKit 3 software. It contains many examples and explanations that are straight to the point.

**Contents**

1. Introduction

2. Getting Started

3. DSTK ScriptWriter Essentials

4. DSTK Studio Essentials

5. DSTK Text Explorer Essentials

6. Conclusion

**Now Free.**

Have you ever wanted to learn data and text mining? Data science is a very hot field right now. This **FREE** course will equip you with the fundamentals of data and text mining, using our own DSTK - Data Science Toolkit 3.

DSTK Tech is part of SVBook. Our main goal is to create useful data science technology that lets practitioners in both academia and business reach conclusions quickly before moving on to deeper tools like SPSS Statistics. DSTK was designed with the user in mind, using an SPSS- and Excel-like interface to reduce the learning curve. DSTK Engine and DSTK ScriptWriter are free of charge and have been uploaded to Sourceforge.net. DSTK Studio and DSTK Text Explorer require a small fee of $59 USD to support us.

Question?

Singapore