Possible Mentors
- Jamie Bullock
Description
Clustering is the assignment of objects into groups (called clusters) so that objects in the same cluster are more similar to each other than to objects in other clusters. Similarity is usually assessed with a distance measure. Clustering is a common technique for statistical data analysis and is used in many fields, including machine learning, data mining, pattern recognition, image analysis and bioinformatics. The purpose of this project is to develop a set of tools, in the form of Pd patches and abstractions, that make various clustering techniques available in Pd. These could include, but are not limited to, k-means clustering, Principal Component Analysis and multidimensional scaling.
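To make the idea of distance-based clustering concrete, here is a minimal k-means sketch in plain Python. It is only an illustration and not part of any existing Pd library: the function names `euclidean` and `kmeans` are hypothetical, Euclidean distance is assumed as the distance measure, and the first k points are assumed as initial centroids so that the result is deterministic.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, max_iterations=100):
    """Group points into k clusters; returns (labels, centroids).

    Assumption for this sketch: the first k points are used as the
    initial centroids. Real implementations usually pick them randomly.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        # Stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return labels, centroids


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centroids = kmeans(points, k=2)
print(labels)     # [0, 0, 1, 1]
print(centroids)  # [(0.0, 0.5), (10.0, 10.5)]
```

A reference implementation like this could serve as a check on the output of a Pd abstraction that implements the same assignment and update steps.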
Related projects
- Pd DataViz: a library of Gem, Pd and Pd data structures abstractions for data visualization
- ArtificialNeuralNetworksLibrary: extending Pd's neural network externals with new algorithms and recurrent networks
Resources to start:
- Wikipedia entry on data clustering: http://en.wikipedia.org/wiki/Data_clustering
Required Skills
- knowledge of and interest in data clustering approaches
- decent mathematical skills (geometry)
- reasonable Pd patching skills
- possibly C, Python or Lua skills
Possible Breakdown of Steps
- Investigate existing collections of abstractions/patches that fit the requirements
- Identify the set of abstractions/patches that will form the basis of the library
- Create new abstractions/patches as necessary
- Package up the abstractions for distribution