
kaizhang / clustering

fast clustering algorithms

Home Page: https://hackage.haskell.org/package/clustering

License: MIT License

Language: Haskell 100.00%

Topics: algorithm, clustering, haskell

clustering's People

Contributors

kaizhang, maksbotan

Forkers

silky agrafix

clustering's Issues

Build failure on GHC 7.10

Unpacking to clustering-0.2.0/
Resolving dependencies...
Configuring clustering-0.2.0...
Building clustering-0.2.0...
Preprocessing library clustering-0.2.0...
[1 of 6] Compiling AI.Clustering.KMeans.Internal ( src/AI/Clustering/KMeans/Internal.hs, dist/build/AI/Clustering/KMeans/Internal.o )

src/AI/Clustering/KMeans/Internal.hs:60:5:
    Non type-variable argument
      in the constraint: Data.Matrix.Generic.Matrix m U.Vector Double
    (Use FlexibleContexts to permit this)
    When checking that ‘loop’ has the inferred type
      loop :: forall (m :: (* -> *) -> * -> *).
              Data.Matrix.Generic.Matrix m U.Vector Double =>
              [Int] -> Int -> m0 (m U.Vector Double)
    In an equation for ‘kmeansPP’:
        kmeansPP g k dat fn
          | k > n = error "k is larger than sample size"
          | otherwise
          = do { c1 <- uniformR (0, n - 1) g;
                 loop [c1] 1 }
          where
              loop centers !k'
                | k' == k
                = return
                  $ MU.fromRows $ map (\ i -> fn $ dat `G.unsafeIndex` i) centers
                | otherwise
                = do { c' <- chooseWithProb g
                             $ U.map (shortestDist centers) rowIndices;
                       .... }
              n = G.length dat
              rowIndices = U.enumFromN 0 n
              shortestDist centers x
                = minimum
                  $ map
                      (\ i
                         -> sumSquares
                              (fn $ dat `G.unsafeIndex` x) (fn $ dat `G.unsafeIndex` i))

Parallelism in K-means

Hi,

Could you provide a parallel implementation of K-means? I often deal with problems that have a large number of data points. The step that assigns each point to a cluster can be parallelized.
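To make the request concrete, here is a minimal sketch of a parallel assignment step. It does not use this package's API: the names `Point`, `nearest`, `parChunkMap`, and `assignAll` are hypothetical, points are plain lists of `Double`, and the chunk size of 1024 is an arbitrary choice. It relies only on `par` and `pseq` from `GHC.Conc` in base; the `parallel` package's strategies would be the more usual tool in real code.

```haskell
import GHC.Conc (par, pseq)

-- Hypothetical point representation; the package itself uses matrices.
type Point = [Double]

-- Squared Euclidean distance between two points of equal dimension.
sqDist :: Point -> Point -> Double
sqDist a b = sum (zipWith (\x y -> (x - y) * (x - y)) a b)

-- Index of the closest center (ties go to the lower index).
-- Assumes the list of centers is non-empty.
nearest :: [Point] -> Point -> Int
nearest centers p = snd (minimum (zip (map (sqDist p) centers) [0 ..]))

-- Split a list into chunks of at most n elements (n is clamped to >= 1).
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = h : chunksOf n t
  where (h, t) = splitAt (max 1 n) xs

-- Map f over the list, sparking one parallel evaluation per chunk.
parChunkMap :: Int -> (a -> b) -> [a] -> [b]
parChunkMap n f = concat . go . map (map f) . chunksOf n
  where
    go [] = []
    go (c : cs) = forceAll c `par` (rest `pseq` (c : rest))
      where rest = go cs
    -- Evaluate every element of a chunk to weak head normal form.
    forceAll ys = foldr seq () ys

-- Assignment step: label every point with the index of its nearest center.
assignAll :: [Point] -> [Point] -> [Int]
assignAll centers = parChunkMap 1024 (nearest centers)

main :: IO ()
main = print (assignAll [[0, 0], [10, 10]] [[1, 1], [9, 9], [0, 2], [11, 10]])
-- prints [0,1,0,1]
```

The sparks only run on multiple cores when the program is built with `-threaded` and started with `+RTS -N`; otherwise the result is the same but computed sequentially.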

Benchmark build failure

Preprocessing benchmark 'bench' for clustering-0.3.0...
[1 of 4] Compiling Bench.Utils      ( benchmarks/Bench/Utils.hs, dist/build/bench/bench-tmp/Bench/Utils.o )
[2 of 4] Compiling Bench.KMeans     ( benchmarks/Bench/KMeans.hs, dist/build/bench/bench-tmp/Bench/KMeans.o )

benchmarks/Bench/KMeans.hs:31:27: error:
    • Found hole: _clusters :: a0 -> U.Vector Int
      Where: ‘a0’ is an ambiguous type variable
      Or perhaps ‘_clusters’ is mis-spelled, or not in scope
    • In the first argument of ‘fmap’, namely ‘_clusters’
      In the first argument of ‘(.)’, namely ‘fmap _clusters’
      In the expression: fmap _clusters . kmeans g method k
    • Relevant bindings include
        k :: Int (bound at benchmarks/Bench/KMeans.hs:31:18)
        method :: Method (bound at benchmarks/Bench/KMeans.hs:31:11)
        g :: GenIO (bound at benchmarks/Bench/KMeans.hs:31:9)
        kmeans' :: GenIO
                   -> Method -> Int -> MU.Matrix Double -> IO (U.Vector Int)
          (bound at benchmarks/Bench/KMeans.hs:31:1)

benchmarks/Bench/KMeans.hs:31:39: error:
    • Couldn't match expected type ‘MU.Matrix Double -> IO a0’
                  with actual type ‘KMeans (U.Vector Double)’
    • Possible cause: ‘kmeans’ is applied to too many arguments
      In the second argument of ‘(.)’, namely ‘kmeans g method k’
      In the expression: fmap _clusters . kmeans g method k
      In an equation for ‘kmeans'’:
          kmeans' g method k = fmap _clusters . kmeans g method k

Missing example for kmeans

Could you add a simple example that clusters some points? The iris dataset, for example, would be sufficient.
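As an illustration of the kind of example being asked for, the sketch below runs plain Lloyd-style k-means on six made-up 2D points. It is not written against this package's API: `Point`, `step`, and `lloyd` are hypothetical names, the initial centers are chosen by hand rather than by k-means++, and the data is a toy set, not iris.

```haskell
import Data.List (transpose)

-- Hypothetical point representation; the package itself uses matrices.
type Point = [Double]

-- Squared Euclidean distance between two points of equal dimension.
sqDist :: Point -> Point -> Double
sqDist a b = sum (zipWith (\x y -> (x - y) * (x - y)) a b)

-- Index of the closest center; assumes at least one center.
nearest :: [Point] -> Point -> Int
nearest centers p = snd (minimum (zip (map (sqDist p) centers) [0 ..]))

-- Coordinate-wise mean of a non-empty list of points.
centroid :: [Point] -> Point
centroid ps = map (/ fromIntegral (length ps)) (map sum (transpose ps))

-- One iteration: assign points to centers, then recompute each center.
-- A center that attracts no points keeps its previous position.
step :: [Point] -> [Point] -> [Point]
step points centers =
  [ if null members then c else centroid members
  | (i, c) <- zip [0 ..] centers
  , let members = [p | (p, j) <- zip points labels, j == i]
  ]
  where labels = map (nearest centers) points

-- Iterate until the centers stop changing or maxIter steps have run.
-- Returns the final centers and the cluster label of every point.
lloyd :: Int -> [Point] -> [Point] -> ([Point], [Int])
lloyd maxIter points = go maxIter
  where
    go n centers
      | n <= 0 || next == centers = (centers, map (nearest centers) points)
      | otherwise = go (n - 1) next
      where next = step points centers

main :: IO ()
main = do
  let points = [[1, 1], [1, 2], [2, 1], [8, 8], [9, 8], [8, 9]]
      (_, labels) = lloyd 100 points [[1, 1], [8, 8]]
  print labels
-- prints [0,0,0,1,1,1]
```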

Test suite fails to build

Preprocessing test suite 'test' for clustering-0.3.1...
[1 of 4] Compiling Test.Utils       ( tests/Test/Utils.hs, .stack-work/dist/x86_64-osx/Cabal-1.22.5.0/build/test/test-tmp/Test/Utils.o )
[2 of 4] Compiling Test.KMeans      ( tests/Test/KMeans.hs, .stack-work/dist/x86_64-osx/Cabal-1.22.5.0/build/test/test-tmp/Test/KMeans.o )

..../clustering/tests/Test/KMeans.hs:51:34: Not in scope: ‘decode’

..../clustering/tests/Test/KMeans.hs:52:18: Not in scope: ‘kmeansWith’

..../clustering/tests/Test/KMeans.hs:53:34: Not in scope: ‘decode’

..../clustering/tests/Test/KMeans.hs:53:48:
    ‘_clusters’ is not a (visible) constructor field name
Progress: 1/2
--  While building package clustering-0.3.1 using:
     ..../x86_64-osx/Cabal-simple_mPHDZzAJ_1.22.5.0_ghc-7.10.3 --builddir=.stack-work/dist/x86_64-osx/Cabal-1.22.5.0 build lib:clustering test:test bench:bench --ghc-options " -ddump-hi -ddump-to-file"
    Process exited with code: ExitFailure 1

Some symbols are simply not exported from AI.Clustering.KMeans, but kmeansWith was replaced in 9d0fa5f, and I assume the tests haven't been run since.
