DSTK - Data Science Toolkit


DSTK - Data Science Toolkit 3 is a set of data and text mining tools that follows the CRISP-DM model. DSTK offers data understanding through statistical and text analysis, data preparation through normalization and text processing, and modeling and evaluation for machine learning and statistical learning algorithms. It is based on the older version of DSTK at https://sourceforge.net/projects/dstk2/

The new version is much smaller (about 150 MB), so there is no more downloading gigabytes of files. DSTK 3 will offer features such as Deep Neural Networks (Deep Learning), Text Link Analysis with Visualizations, and KMeans Clustering in the future. Some of these features may have been present in the older version, but the algorithms are being rewritten to reduce reliance on external libraries such as Weka and keep the file size down, so we need more time to develop them. DSTK Engine is still in beta, so there may be some bugs and inaccuracies.

DSTK 3 consists of DSTK Engine, DSTK ScriptWriter, DSTK Studio, and DSTK Text Explorer. DSTK Engine is like a simplified R, focused on data mining. DSTK ScriptWriter offers a GUI for writing scripts for DSTK Engine. DSTK Studio offers an SPSS Statistics-like GUI for data mining, and DSTK Text Explorer offers a GUI for text mining.

DSTK Engine and DSTK ScriptWriter are free of charge and have been uploaded to Sourceforge.net. They are licensed under the GNU GPL. For a commercial license, please Contact Us.

DSTK Studio and DSTK Text Explorer, however, require a small fee of 59 USD to help support us. A demo version of DSTK Studio and DSTK Text Explorer is included in the DSTK 3 package, but you can only use them 10 times.

DSTK 3 is written in C# and Java. You need the Microsoft .NET Framework and a Java runtime to run the software.



License: DSTK Engine uses WordNet, MIT JWI, GATE's Gazetteers, the Stanford NLP POS Tagger, Harvard University Inquirer Sentiments Data, a Porter Stemmer C# library, RPortable, Math.NET Numerics, and others. Each has its own license, and these licenses are included in the DSTK Engine distribution. DSTK Studio, DSTK Text Explorer, and DSTK ScriptWriter are standalone programs that provide an easy-to-use GUI for writing scripts for DSTK Engine. DSTK 3 comes with no warranty, but we welcome feedback.


Download Data Science Toolkit 3




Purchase DSTK Studio and DSTK Text Explorer:

FEATURES



Data Understanding using Statistical Analysis

1. Descriptives (mean, median, variance, standard deviations, ...)
2. Inferential (T-Test, Chi Square ...)
3. Regression (Simple Linear, Multiple Linear...)
... And interface with R and Python, ...
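As a rough illustration of what the descriptive statistics above compute, here is a minimal Python sketch using only the standard library. It is not DSTK Engine script and does not reflect how DSTK implements these measures; the function name `describe` and the sample data are made up for this example.

```python
import statistics

def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "variance": statistics.variance(values),  # sample variance (divides by n - 1)
        "std_dev": statistics.stdev(values),      # sample standard deviation
    }

# Hypothetical sample data, for illustration only.
summary = describe([2, 4, 4, 4, 5, 5, 7, 9])
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Note that `statistics.variance` and `statistics.stdev` use the sample (n - 1) formulas; a tool may report population values instead.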

Data Understanding using Data Visualizations

1. Histogram
2. Scatter Plots
3. Box Plots
... And more...

Data Preparation

1. Log Transform
2. Feature Scaling
3. Standard Score
4. Remove Missing Values
... And more...
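The preparation steps listed above can be sketched in plain Python as follows. This is an illustration of the general techniques, not DSTK's own code: it assumes "Feature Scaling" means min-max scaling to the range 0 to 1 and that "Standard Score" means the z-score computed with the population standard deviation. All function names are hypothetical.

```python
import math

def remove_missing(values):
    """Drop None and NaN entries."""
    return [
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]

def log_transform(values):
    """Natural logarithm of each value; every input must be positive."""
    return [math.log(v) for v in values]

def min_max_scale(values):
    """Rescale values linearly to the range [0, 1] (assumed meaning of feature scaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]

def standard_score(values):
    """Z-scores using the population standard deviation (an assumption)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

# Hypothetical column with missing entries.
raw = [1.0, None, 10.0, float("nan"), 100.0]
clean = remove_missing(raw)
logged = log_transform(clean)
scaled = min_max_scale(logged)
z = standard_score(clean)
print(clean, scaled, z)
```

Removing missing values first matters here, because the later steps would fail or produce NaN on incomplete rows.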

Modeling and Evaluation

1. Neural Network (in future, Deep Neural Network)
2. Naive Bayes
3. KNN
4. Linear Regression
5. Multiple Linear Regression
6. Bag of Words
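To give a feel for one of the models listed above, here is a minimal sketch of a KNN (k-nearest neighbors) classifier in plain Python. It shows the general technique only and is not taken from DSTK Engine; the function name `knn_predict`, the Euclidean distance choice, and the toy data are assumptions for this example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical two-cluster toy data.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (2.0, 2.0)))
print(knn_predict(points, labels, (8.0, 9.0)))
```

Evaluation would normally hold out part of the data and compare predicted labels against the true ones.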

Text Mining and Analysis

1. Text Preprocessing (stopwords, porter stemmer, regular expressions, ...)
2. POS Tagging, Named Entity Recognition, WordNet
3. Sentiment Analysis
4. Text Classification (Naive Bayes, NN, ...)
... And more with gazetteers from GATE...
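The text preprocessing step above (regular expressions and stopword removal) can be illustrated with a short Python sketch that also builds a simple bag of words. This is not DSTK's implementation: the stopword list is a tiny made-up sample, the function names are hypothetical, and no stemming is performed (a real pipeline might apply the Porter stemmer afterwards).

```python
import re
from collections import Counter

# Tiny illustrative stopword list; real lists are much longer.
STOPWORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "it"}

def preprocess(text):
    """Lowercase the text, extract alphabetic tokens with a regex, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token not in STOPWORDS]

def bag_of_words(text):
    """Count how often each remaining token occurs."""
    return Counter(preprocess(text))

sentence = "The engine is small, and the engine is fast."
tokens = preprocess(sentence)
counts = bag_of_words(sentence)
print(tokens)
print(counts)
```

Counts like these are the usual input to text classifiers such as Naive Bayes.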

Plugins

1. Expand features with R Scripts...
2. Included plugins for Big Data Analysis using Microsoft Azure...



Screenshots






Other Data Science Technology...



DSTK Old Version


DSTK - DataScience ToolKit is free software for statistical analysis, data visualization, text analysis, and predictive analytics. It is designed to be straightforward and easy to use, and familiar to SPSS users. The application is built with R, Python, NLTK, scikit-learn, and others. The product is currently available as FREEware.

Download DSTK - DataScience ToolKit

JAOSS - Online Statistical System


JAOSS - Just Another Online Statistical System. A simple statistical system with fairly sophisticated features such as descriptive statistics, inferential statistics with ANOVA and T-Test, and predictive analytics with a Neural Network. The application is written in R and Shiny, with the aim of providing online access to a simple statistical system before you proceed to advanced software such as SPSS or DSTK. This product is currently available as FREEware.

View details »

Demo »

JATAS - Online Text Analysis


JATAS - Just Another Text Analysis System. A simple text analytics system with fairly sophisticated features such as text preprocessing (stemmer, stopwords...), visualizations, and predictive analytics with SVM. The application is written in R and Shiny, with the aim of providing online access to a simple text analysis system before you proceed to advanced software such as SPSS Modeler or DSTK. This product is currently available as FREEware.

View details »

Demo »



JATI - Just Another Tesseract Interface


JATI is just another interface to the Tesseract OCR engine, providing a GUI to convert an image to text. It supports batch conversion and can also convert only a portion of an image into text. This product is currently available as FREEware.

Download JATI - Just Another Tesseract Interface

JAVT - Just Another Voice Transformer


JAVT, or Just Another Voice Transformer (formerly called Just Another Video Transcriber), is speech recognition software that also supports text-to-speech and simple media conversion. JAVT allows you to convert video files to WAV audio files using ffmpeg, and then transcribe the audio to text using either Microsoft SAPI or CMU Sphinx. This product is currently available as FREEware.

Download JAVT - Just Another Voice Transformer

JAWS - Just Another WebScraper


JAWS, or Just Another Web Scraper, is part of the data scraping software developed by SVbook, alongside JATI (Image to Text) and JAVT (Video to Text). JAWS offers an easy interface for scraping data from websites using regular expressions, text preprocessing, or HTML Agility Pack. This product is currently available as FREEware. Enjoy.

Download JAWS - Just Another Web Scraper

Free DSTK Books and Courses with Certifications


DSTK 3 Book


We have developed our own Data and Text Mining software at DSTK.Tech. This technical book aims to equip the reader with Data and Text Mining fundamentals in a fast and practical way using our DSTK - Data Science ToolKit 3 software. There will be many examples and explanations that are straight to the point.

Contents
1. Introduction
2. Getting Started
3. DSTK ScriptWriter Essentials
4. DSTK Studio Essentials
5. DSTK Text Explorer Essentials
6. Conclusion

Now Free.

Get Now for Free »



Introduction to Data and Text Mining with DSTK 3 Course


Have you ever wanted to learn data and text mining? Data Science is a very hot trend now. This FREE course will equip you with the fundamentals of data and text mining knowledge, with the use of our own DSTK - Data Science Toolkit 3.

View Course »

About Us


DSTK Tech is part of SVBook. Our main goal is to create useful data science technology that lets practitioners in both academia and business reach conclusions quickly before moving on to more in-depth tools like SPSS Statistics. DSTK was designed with the user in mind, using an SPSS- and Excel-like interface to reduce the learning curve. DSTK Engine and DSTK ScriptWriter are free of charge and have been uploaded to Sourceforge.net. DSTK Studio and DSTK Text Explorer require a small fee of 59 USD to support us.

Contact

Question?

Singapore