The right tools can make a world of difference. If you work with data, here are three tools to add to your toolbox.
1. Data Preprocessing
KNIME is an open-source data analytics and integration platform. The interface lets you assemble workflows from nodes for data preprocessing (ETL) and data analysis. Modeling and data visualization nodes are also available, but I use other tools for those.
Need to create a monster pivot table? The Pivoting node can handle very large files with ease. I used a dataset in comma-separated values (CSV) format and a simple KNIME workflow to create a pivot table with over 100,000 columns.
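If you have not built a pivot table before, here is a rough sketch of what the operation does, written in plain Python. This is not how KNIME implements its Pivoting node. It only illustrates the idea of turning the values of one column into new column headers and summing a measure for each cell. The sample data and the column names (customer, product, amount) are made up for the example.

```python
import csv
import io
from collections import defaultdict


def pivot(rows, index, columns, values):
    """Sum the `values` field for every (index, columns) pair."""
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        table[row[index]][row[columns]] += float(row[values])
    return table


# Hypothetical sample data standing in for a real CSV file.
SAMPLE_CSV = """customer,product,amount
alice,apples,3
alice,pears,2
bob,apples,5
alice,apples,1
"""

rows = csv.DictReader(io.StringIO(SAMPLE_CSV))
table = pivot(rows, index="customer", columns="product", values="amount")

# Every distinct product becomes one column of the pivot table.
column_names = sorted({name for cells in table.values() for name in cells})

print("customer", *column_names, sep=",")
for key in sorted(table):
    # Missing combinations are shown as 0.0.
    print(key, *(table[key].get(name, 0.0) for name in column_names), sep=",")
```

Each distinct value in the pivoted column becomes its own output column, which is how a long, narrow file can end up with 100,000 columns or more.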
Download KNIME at knime.org.
2. Data Mining
Weka is a collection of machine learning algorithms that help you complete data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data preprocessing, classification, regression, clustering, association rules, and visualization.
Weka has a large online community and lots of support. The interface is easy to use.
Weka also provides some great visualizations of your dataset.
Download Weka here.
Here are a few sample datasets to get you started.
3. Data Visualization
Tableau Public is a free tool for creating interactive data stories on the web. It’s available as a service, so you can be up and running as soon as you download it. Connect, create, and publish interactive data visualizations directly to your website. No coding required!
Tableau even provides How-to Videos and sample datasets.